iDrac6 Recovery Through TFTP and Serial

The History:
This week the iDrac on a Dell PowerEdge R510 completely died on me. I attempted repairs with several utilities that Dell offers on their site, and all of them ended in failure. I thought the cause might be that I had upgraded the iDrac from an old version to the latest without also upgrading the components it communicates with, such as the BIOS and NIC. After upgrading everything, the iDrac still was not working. After a few days of messing with it, and piecing together information from several sites, I found out how to force the iDrac into recovery mode and do a TFTP repair, which writes a new image to it.

The symptoms:
The system was updated with the Windows iDrac Updater, which stated the update had completed successfully. I then remotely told the system to reboot; it shut down and never came back up. When I physically went to the server, it was at the BIOS start screen stating “Error Communicating with iDrac. Press F1 to continue, or F2 for System Setup.” On restarting the server I found that “System Services” were disabled. The system would go through its normal boot sequence, but when it tried to communicate with the iDrac it would fail and restart the server. After that restart it would allow a full boot, but would still give the same “Press F1 to continue, or F2 for System Setup” message. Thus the server would not boot without physical intervention at the machine.

This is a Dell PowerEdge R510; I had attempted to upgrade the iDrac from 1.3.* to 1.6.5.

The Fix:
We need to get to the iDrac’s serial recovery mode, and then we can recover the system.

  1. Reboot the system. After it resets itself for not being able to reach the iDrac, press F2 to go into “System Setup”
  2. Arrow down to “Serial Communication” and enter that menu
  3. Set the following settings:
    • Serial System Setup Settings
    • Serial Communication : On With Console Redirection via COM2
    • Serial Port Address : Serial Device 1=COM1, Serial Device2=COM2
    • External Serial Connector : Serial Device 1
      • This could be Remote Access Device, but that gave me problems (I may have had a bad serial cable)
    • Failsafe Baud Rate : 115200
      • For the 11G servers this is the default baud rate
    • Remote Terminal Type : VT100/VT220
    • Redirect After Boot : Enable
  4. Reboot the system. I got Windows to start by manually hitting F1
  5. At this point, go to support.dell.com and look up the downloads for your system; under “Embedded Server Management” there is “iDRAC6 Monolithic Release 1.97” (or whatever version is newest)
  6. There are several versions; for my system I got “iDRAC6_1.97_A00_FW_IMG.exe (50 MB)”
  7. After downloading, running this file will extract “firmimg.d6” and a readme file.
    • The readme has no useful information in it; it just tells you to search for the user guide
  8. The “firmimg.d6” file needs to be placed on a TFTP server that the iDrac can reach
  9. Using PuTTY in Windows, I connected to COM2 at 115200 baud; this is the iDrac console being redirected. Connect to your system’s COM2 however you can
    • Note that all of this is done on the server itself and nothing is done on another machine; I had TFTP running on this same Windows system
  10. Hitting enter should show a recovery menu
    • Unfortunately I did not save pictures of the recovery screen, so some of the following menu options may not match the exact wording
  11. I had DHCP on the network my iDrac was sitting on, so I hit 9 to get an IP address; this can also be set manually
  12. Hit 7 to change the TFTP server IP address
  13. Now hit the option that says “Firmware Upgrade”. This will go to the TFTP server specified, download the firmware, and reinstall all pieces of the iDrac from that file. It takes about 5 minutes.
  14. Keep in mind that you stay in your OS (Windows, for me) while the iDrac upgrades and reboots
  15. After it reboots successfully, the recovery console stops getting data. I was next to the server, and when the iDrac reboots the fans go to full speed and then calm back down; that’s how I was able to tell it had restarted
  16. Now you can use the RACADM commands if OpenManage/iDrac tools are installed, or reboot and you should see “System Services” back online; then you can change the IP of the iDrac like normal
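
Before telling the iDrac to pull the firmware in step 13, it can be worth confirming that the TFTP server really hands out “firmimg.d6”. The following is a minimal Python sketch of such a check and is not part of the procedure I actually ran. It sends a standard TFTP read request (RFC 1350) and looks for data block 1 in the reply. The function names are my own, and the server address in the usage note is a placeholder.

```python
import socket
import struct
from typing import Tuple

TFTP_PORT = 69
OP_RRQ, OP_DATA = 1, 3


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode 1, filename, NUL, mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def parse_reply(packet: bytes) -> Tuple[int, int, bytes]:
    """Split a TFTP reply into (opcode, block number or error code, payload)."""
    opcode, number = struct.unpack("!HH", packet[:4])
    return opcode, number, packet[4:]


def tftp_file_available(server: str, filename: str, timeout: float = 3.0) -> bool:
    """Return True if the server answers the read request with data block 1."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, TFTP_PORT))
        try:
            packet, _ = sock.recvfrom(4 + 512)
        except socket.timeout:
            return False
    if len(packet) < 4:
        return False
    opcode, number, _ = parse_reply(packet)
    return opcode == OP_DATA and number == 1
```

Calling tftp_file_available("192.0.2.10", "firmimg.d6") with your own server address in place of the placeholder should return True when the file is being served. The sketch never acknowledges the block, so the server simply gives up on the transfer after its own timeout, which is harmless.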

Everything should work now and the world is happy!

Updated Windows Sudo

Recently I updated my Windows sudo program and added a command for Super Conduit, which is what I call some tweaks that you can make to a Windows Vista+ system. You can copy sudo.exe to a system’s system32 folder; then, after running “sudo cmd”, you can run “sudo /write” to add ls, ifconfig, and superc as options on the command line.

Superc has the options enable, disable, and show, making it easy to run. :)

Newest build is always here https://github.com/daberkow/win_sudo/raw/master/sudo/sudo/bin/Release/sudo.exe

Super Conduit

Due to the high latency of the lines between my work’s offices, file transfers can be slow. There are settings in Windows Vista+ systems that allow the TCP window to grow, which allows much higher utilization on these lines. I call it Super Conduit. This may be possible on *nix systems, but the way this tweak works is that it tells the other side it will be doing this tweak. That means both sides have to run at least the Windows Vista kernel (Server 2008 works); it also means that Linux file servers will not work, because they are Linux machines serving SMB. This should be done over wired connections, because the packet loss on wireless hurts these connections more than anything else.

With the “autotuninglevel” change, I have seen transfers go from 1 megabit per second to 150-200 megabits per second.

WARNING: The Windows Vista/7 IP stack cannot handle this setting alongside normal connections; once it is changed, the internet usually stops working until the setting is reversed. Windows 8+ seems to have no problems with this setting and the internet; it just makes Win 8/8.1 more awesome than it already is, which is pretty awesome.

  1. Log in to the Windows machine under an administrator account
  2. Open ‘cmd’ as an administrator
    1. Title bar should be “Administrator: C:\Windows\System32\cmd.exe”
  3. “netsh interface tcp show global” will show the current settings of your machine
  4. “netsh interface tcp set global autotuninglevel=experimental” enables the majority of what you need for faster transfers; all you will get back in response is “Ok.”
  5. Another setting I have used in the past is “netsh interface tcp set global ecncapability=enabled”; this adds a flag to the packets that tells routers “I don’t care if I get slowed down, please don’t drop me completely”. The problem you run into with large TCP windows is that one dropped packet lowers the TCP window size a lot and slows the connection, making it a lot more spiky. This command doesn’t always help, but setting it hasn’t hurt in the past.
  6. The “rss” (receive-side scaling) state should be set to enabled, which should be the default. This allows the receiver to do these types of connections.
  7. When you are done with your transfer, just run “netsh interface tcp set global autotuninglevel=normal”
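
The enable, show, and disable steps above can be wrapped in a small script. This is a hedged Python sketch, not how sudo.exe implements superc. The function names are made up for illustration, and only the netsh commands quoted in the steps are used. run_superc has to be run on Windows from an elevated prompt.

```python
import subprocess

NETSH_TCP = ["netsh", "interface", "tcp"]


def superc_command(action: str) -> list:
    """Return the netsh argument list for 'enable', 'disable', or 'show'."""
    if action == "show":
        return NETSH_TCP + ["show", "global"]
    levels = {"enable": "experimental", "disable": "normal"}
    if action not in levels:
        raise ValueError(f"unknown action: {action!r}")
    return NETSH_TCP + ["set", "global", f"autotuninglevel={levels[action]}"]


def run_superc(action: str) -> str:
    """Run the command (Windows only, elevated prompt) and return its output."""
    result = subprocess.run(superc_command(action),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Building the argument list separately from running it makes it easy to print the exact command first, which is handy when you want to double check what is about to change on a machine.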

 

Troubleshooting Notes:

Windows 7 seems to act oddly when it first starts using this setting, so I would enable it and then restart the machine. I believe that sessions already in progress do not pick up the new setting.

 

YAY MATH:

http://bradhedlund.com/2008/12/19/how-to-calculate-tcp-throughput-for-long-distance-links/

Default window size: 65536 bytes * 8 = 524288 bits

With 73 ms of latency between the cross-country offices: 524288 bits / 0.073 seconds = 7,182,027 bits per second of theoretical throughput, or 897,753 bytes per second at most.

This setting increases that window size to something larger, much larger, and thus gives better speeds. The only interesting downside is that since the TCP window is big, if a packet is then lost, TCP resizes the window to a much smaller setting, forcing the window to climb again.

That is a 1GB link going across the country.
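
The arithmetic above is easy to redo for other links. Here is a small Python sketch of the same calculation, plus the reverse question of how large the window has to be to fill a given link. The function names are my own, and treating the “1GB link” as 1 gigabit per second is an assumption.

```python
def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput: one full window per round trip, in bits/s."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Window size (bandwidth-delay product) needed to keep the link full."""
    return link_bps * rtt_seconds / 8


default_window = 65536  # bytes, the default window size used above
latency = 0.073         # seconds, the cross-country latency used above

bits_per_second = max_throughput_bps(default_window, latency)
print(f"{bits_per_second:,.0f} bits/s, {bits_per_second / 8:,.0f} bytes/s")

# Assumption: the link is 1 gigabit per second.
print(f"{window_needed_bytes(1e9, latency):,.0f} bytes of window to fill the link")
```

Under that assumption the window has to grow to roughly 9.1 million bytes before the link can be saturated, which is why the default 65536-byte window leaves so much of the line idle.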

VM Experimentation

I am the type of programmer/IT person who enjoys doing all my system experimentation inside a virtual machine. That way, if I break something, I can easily roll back the virtual machine or just delete it. As seen in my last post, I recently built a new NAS. The original plan was to turn my old server into a Proxmox or ESXi box. I found the downside to that plan quickly: the old box used DDR2, and at this point DDR2 memory is quite expensive. Because of that, along with my worry about power usage on the old box, I decided to give another solution a try.

After researching around I found that my local Fry’s Electronics had the Intel NUC in stock. This is a tiny, tiny PC that can take up to 16GB of RAM, has an Intel Core i5, and only uses 17 watts. The box also has Intel vPro. What is vPro, you ask? vPro allows you to remotely manage the system, so I can remote into it without buying a fancy management card; I can also remotely power the box on and off, or mount a virtual CD. Not bad for a ~$300 box. The model I got, the DC53427, has a last-gen i5, so it was a little cheaper, at the cost of having only one USB 3.0 port. It came with a VESA mount so the NUC can be attached to the back of a monitor, which was a nice feature. I got a USB 3.0 enclosure for two older 500GB hard drives and used those as my storage. I installed Proxmox on the system, since my work has been using that software more and more and this was a chance for me to learn it.

A quick note about Proxmox for those who have not used it: I come from a VMware background, so my work was my first experience with Proxmox. It is a free system; the company offers paid subscriptions for patches and such, and without one the web page nags you once when you log in and you just dismiss the message. The software is a wrapper around KVM and some other Linux virtualization technologies. It can handle Windows and Linux systems without a problem. The interface is completely web based, with a Java virtual console; if you don’t update to the latest patches, the Java console can break with Java 7 Update 51. The software works well enough. There are still some areas that need improvement: in VMware, if you want to make a separate virtual network you can use their interface, while on Proxmox that’s when you go to the Linux console and start creating virtual bridges. But once I got everything working, it seemed to work well. I don’t know how long I will keep it without trying another system, but for now it is nice. Since the system relies on KVM, it can do features like dynamic memory allocation: if a VM is allocated 6GB of RAM but is only using 1GB, it will only take 1GB at that time. KVM can also deduplicate memory, so if two VMs are running the same OS, those files are only stored in memory once, freeing up more memory.

I ran into one problem during the install of Proxmox: the NUC is so fast that it would start to boot before the USB 3.0 hard drives had been mounted. After searching around everywhere I found a fix on http://forum.proxmox.com/threads/12922-Proxmox-Install-on-USB-Device; adding a delay in the GRUB boot loader allows enough time for the system to mount the LVM disks correctly and then start. At first I just went to the GRUB boot menu, hit “e”, then added “rootdelay=10” to the kernel line, giving “linux /vmlinuz-2.6.32-17-pve root=/dev/mapper/pve-root ro rootdelay=10 quiet”. After the system loaded I went into /boot and added the same entry to the real GRUB menu. Now I had an Intel NUC with 1TB of storage and 16GB of RAM. I could have used the NAS with iSCSI, but that was a lot of config I didn’t want to do; I was also setting up some databases on the system and didn’t want the overhead of using the NAS’s RAIDZ2 at this time.
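
If you have more than one boot entry to fix, the edit can be scripted. This is a minimal Python sketch that appends the rootdelay option to every “linux” kernel line in GRUB menu text that does not already have one. The function name is made up for illustration, and it only transforms text; reading and writing the real file is left to you. Keep in mind that on Debian-based systems the generated menu is normally rebuilt by update-grub, so the option may also need to go into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub to survive a kernel update.

```python
def add_rootdelay(grub_cfg: str, seconds: int = 10) -> str:
    """Append rootdelay=<seconds> to every 'linux' kernel line that lacks one."""
    out = []
    for line in grub_cfg.splitlines():
        words = line.split()
        is_kernel_line = bool(words) and words[0] == "linux"
        has_delay = any(word.startswith("rootdelay=") for word in words)
        if is_kernel_line and not has_delay:
            line = f"{line.rstrip()} rootdelay={seconds}"
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if grub_cfg.endswith("\n") else "")


sample = (
    "menuentry 'Proxmox' {\n"
    "\tlinux /vmlinuz-2.6.32-17-pve root=/dev/mapper/pve-root ro quiet\n"
    "}\n"
)
print(add_rootdelay(sample))
```

Running the function a second time on its own output changes nothing, so it is safe to re-apply after the menu has been regenerated.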

I have been using it for a few weeks, and it’s a nice little box. It never makes an audible level of noise (although it does sit next to its louder brother, the NAS). Down the road, if I want more power I can always get another NUC and put Proxmox into a clustered mode. These boxes keep going down in price and up in power, so this can grow with my needs.

NAS Migrations 2013

For years I used Windows Server for my home files; having TechNet, I ran Windows Server 2008 and then later 2008 R2. While this was nice, it was using software RAID and a random assortment of drives that were cloning (RAID 1 style) between themselves. I originally went with this for the ease that Windows brings to things, but in the end, since it was mainly a file server, it just sat there idle.

Fast-forward to this November, with space running out, I decided it was time to get a new system and replace the aging AMD Windows Server.

I wanted a RAID 5 or 6, so that I was not losing as much space as with the RAID 10s that I had been using. I also wanted the system to be less maintenance than a Windows Server that needs to be patched every month. Recently I had heard good things about FreeNAS (freenas.org) from reddit.com/r/homelab; after seeing all the features of ZFS, I decided on a RAID 6 with ZFS. This is also known as RAIDZ-2.

At first I looked at HP Microservers, http://www8.hp.com/us/en/products/proliant-servers/product-detail.html?oid=5379860 – !tab=features, yet after looking at what you got for the price, I decided I wanted to build the new system myself.

The first challenge was finding a small case that could hold the number of hard drives I wanted, at least 5, without having a large footprint. After some searching I came across the LIAN LI PC-Q25B, http://www.newegg.com/Product/Product.aspx?Item=N82E16811112339; while not a cheap case, it offered a 5 hard drive tray and at the same time was not that large. This suited my purposes nicely.

Next I had to find which CPU I wanted; since I was hoping to keep the cost of the system down, I looked at the AMD processors available. I was disappointed to see cheap Intel processors beating or matching far more expensive AMD chips. AMD would throw items in to sweeten the deal, such as a decent GPU on the chip. However, this was a NAS; I did not need all that extra stuff that would just sit there using power.

My final selection was an Intel Pentium G3220, http://www.newegg.com/Product/Product.aspx?Item=N82E16819116950; this part offers decent performance and is from the latest Haswell generation. This would allow me to upgrade the system down the road if need be. The part also uses the latest socket, meaning that the board could handle the larger memory sizes available, while I could still use the MicroATX board the case required.

I threw in 16GB of RAM (if you haven’t looked, ZFS eats memory; you need about 1GB of memory per TB just to idle) and five 3TB hard drives. I got the hard drives from different batches, so if something similar to Seagate’s 7200.11 drive failure happened again (http://www.theinquirer.net/inquirer/news/1050374/seagate-barracuda-7200-drives-failing) I would be protected.
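
For anyone sizing a similar box, the numbers are quick to work out. This Python sketch estimates RAIDZ capacity by simply subtracting the parity drives, and applies the rough 1GB-per-TB memory rule mentioned above. It ignores ZFS metadata and filesystem overhead, the function names are hypothetical, and whether the rule is applied to raw or usable space is a judgment call.

```python
def raidz_usable_tb(drives: int, drive_tb: float, parity: int = 2) -> float:
    """Approximate usable space: total drives minus parity drives, times drive size."""
    if drives <= parity:
        raise ValueError("need more drives than parity drives")
    return (drives - parity) * drive_tb


def ram_rule_of_thumb_gb(storage_tb: float, gb_per_tb: float = 1.0) -> float:
    """Rough memory estimate from the 1GB-per-TB rule of thumb."""
    return storage_tb * gb_per_tb


drives, drive_tb = 5, 3
print(raidz_usable_tb(drives, drive_tb), "TB usable with RAIDZ2")
print(ram_rule_of_thumb_gb(drives * drive_tb), "GB of RAM suggested for the raw capacity")
```

For five 3TB drives this gives 9TB usable and 15GB of RAM against the raw 15TB, which lines up with the 16GB I put in the box.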

Now that you know the hardware, I will talk a little about the experience I have had with FreeNAS. The system is easy to install and has a nice interface. ZFS and its terminology take a little getting used to, but the wiki can clear up a lot about what the different options do. I started the box on 9.1.0 and have updated to the latest 9.2.1; you can do updates through the web interface, and in that short time they have fixed a lot of little bugs, cleaned up the interface, and added new features. A nice new feature is the ability to make “Jails” of any Linux variety. These are lightweight, OS-level containers rather than full hypervisor VMs, and they run on the system at little cost. I tend not to use them, because when I use a VM to develop I tend to need a decent amount of memory, and my FreeNAS with ZFS uses 12GB of the 16GB doing nothing. But it is a nice feature nonetheless. FreeNAS also has some plugins that are a few clicks away; I installed Plex so I could stream media easily over the home network. FreeNAS uses Jails to run its plugins, creating a separate jail for each; this provides security between your host’s data and your plugins.

In the end, I am very happy with the box and its performance; my roommate and I have been able to sustain 100MB/s writes to the box.

A quick side note, Plex is also a fantastic piece of software. You load it on a PC or NAS, point it at your media and sit back. It scans through all your media and gets all the metadata automatically. Then you can stream with the web interface, or through a DLNA device in your network. There are also iPhone and Android apps that let you stream without setting up weird port forwarding: just a very slick and well working product.

Java Windows Shortcut Library (Parsing and Creating!)

Recently I have been working on a project that involves extracting a bunch of files from zips. The problem I faced was that all the shortcuts within the zips were hard coded to locations, making it impossible for me to move the extracted zip data wherever I wanted. I wanted a native library that could read and modify Windows shortcuts so I could drop my zip data anywhere; my project is in Java, and its instant cross compatibility was needed. I know all my clients have Java installed, so that dependency is not an issue. I looked around on the internet and found several options, including the popular https://github.com/jimmc/jshortcut. The downside to this popular jShortcut library is that you need a DLL; why you need a DLL to write a binary file, I am not sure. More specifically, you need a DLL for your PC’s instruction set, ick! After searching the far reaches of GitHub and getting to the end of my rope, I found https://github.com/kactech/jshortcut. It was written 5 years ago and is not really popular on GitHub, but I thought I would give it a try. IT’S AMAZING! With no dependencies, and just a single include, you can write, modify, and create new Windows shortcuts! There is example code included, and it couldn’t be easier to use. I just wanted to make sure anyone who has had the same problem knows about this great library.

How To Remove Branding From a Dell OEM Server

NOTE: This is for Dell OEM systems only, run at your own risk.

Recently I have RMAed motherboards for non-branded Dell servers. The problem I ran into was that I was getting branded system boards back when I had originally had non-branded ones. The non-branded BIOS would just be blank with a progress bar instead of showing the Dell logo. I ended up spending more time and energy talking to Dell again, trying to get boards to my specifications. I was told by several Dell engineers that unfortunately there was no way to fix this other than having the factory set the board up.

Well, they were wrong, and because I didn’t find this anywhere online I am going to detail the instructions. Note: this is ONLY for people who need to un-brand systems from Dell; I have done this with 12th Generation servers and nothing else.

  1. Remove the old motherboard, and install the new motherboard into the chassis
  2. Now the first thing Dell training says is to set the service tag on the system now, DO NOT DO THIS YET
    • If you set the service tag, the unbranding tool will not work. If you have already set the service tag, more than likely by booting to DOS and using ftp://ftp.dell.com/utility/asset_a209.com, then you can still fix this. Boot back to DOS and use the tool again, except with “asset_~1 /s /d”. This is an undocumented feature that will remove the service tag of the box.
  3. Boot any version of Windows that is at least Windows Vista. I used Windows 8 because you can get a 90 day evaluation for free, and that is enough for me to do the work I need done on the box before handing it over.
  4. Go to Support.Dell.com, and look up the box by the service tag to get to the OEM support site. If you don’t have the service tag, look up the generic version and get the URL; currently for an R620 it looks like this: http://www.dell.com/support/troubleshooting/us/en/04/Product/poweredge-r620. Now if you replace “poweredge” with “oth” you get the OEM version, so http://www.dell.com/support/troubleshooting/us/en/04/Product/oth-r620.
  5. Go to Drivers and Downloads, and find the download for “Identity Module”, I had to switch the OS selector to “Windows Server 2008 x64” to find it. Then hit “Download File”
  6. Now it will offer ~3 different files, one will be similar to “R620_Identity-Module_Application_WCPFW_WN32_1.01_A00.EXE”, stating “Identity-Module_Application”, download this file.
  7. Run this in Windows; it will ask if you are sure, just say yes. It can take up to 5 minutes, MAKE SURE NOTHING INTERRUPTS THE SERVER IN THIS TIME.
  8. Reboot the server. It will come up with the branding again, then once it gets past POST it will give a special message similar to “modifying branding”
  9. The system will reboot again, and the branding is gone
  10. Now go into the DOS bootable drive, USB works well, and set the service tag for the system.

Now your OEM box that was impossible to unbrand has been unbranded.
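
If you have several models to look up, the URL swap from step 4 can be done in a couple of lines. This is a small Python sketch under the assumption that Dell’s product URLs keep the “poweredge-” prefix on the last path segment, as in the example above; the function name is my own.

```python
def oem_support_url(url: str) -> str:
    """Swap the 'poweredge-' product prefix for 'oth-' in a Dell support URL."""
    head, sep, product = url.rpartition("/")
    prefix = "poweredge-"
    if not product.startswith(prefix):
        raise ValueError("expected a product name starting with 'poweredge-'")
    return head + sep + "oth-" + product[len(prefix):]


generic = "http://www.dell.com/support/troubleshooting/us/en/04/Product/poweredge-r620"
print(oem_support_url(generic))
```

Only the last path segment is touched, so a “poweredge” appearing anywhere else in the address is left alone.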