HowTo: Workaround for HP hosts stuck on reboot and showing lost connectivity after an iLO upgrade
This short blog shows a method to overcome the issues that an iLO 4 firmware upgrade causes on HP DL380G9 servers. The method was found and tested by Cris van den Dungen, an IT admin I often work with, so all credit goes to him for finding this workaround.
After upgrading iLO from 2.40 to 2.54, all ESXi 6.0 hosts running specific builds report the following errors:
“Lost connectivity to the device mpx.vmhba32:C0:T0:L0 backing the boot filesystem /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0”
“Error loading /conrep.v00
Compressed MD5: (md5hash)
Decompressed MD5: (only 0s)
Fatal error: 6 (Buffer too small)”
Steps to resolve it
Of course, steps were taken to resolve it: the iLO version was downgraded to 2.40, but that didn’t solve it.
We noted that ESXi 6.0 hosts with builds 3825889, 4510822 and 3620759 seem to be affected, but the host with build 4600944 is not. It has not been rebooted, but it doesn’t show the error. We couldn’t find any help in knowledge base articles, and calls with VMware did not solve it. From the error it looked as if the SD cards were damaged somehow. The advice from the KB articles was to reinstall ESXi.
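If you want a quick way to see which of your hosts are on one of the builds we saw affected, a small Python sketch like the one below could help. It assumes you feed it the version line of each host in the form "VMware ESXi 6.0.0 build-3620759" (what `vmware -v` typically prints on the host). The function names are my own, and the build lists are only what we observed in this environment, not an official list.

```python
import re

# Builds observed in this environment only (ESXi 6.0); not an official list.
AFFECTED_BUILDS = {3825889, 4510822, 3620759}
NOT_AFFECTED_SO_FAR = {4600944}  # shows no error, but has not been rebooted


def parse_build(version_line):
    """Return the build number from a line like
    'VMware ESXi 6.0.0 build-3620759', or None if no build is found."""
    match = re.search(r"build-(\d+)", version_line)
    return int(match.group(1)) if match else None


def classify(version_line):
    """Classify a host by its build: 'affected', 'not affected so far'
    or 'unknown' (build not seen in this environment, or not parseable)."""
    build = parse_build(version_line)
    if build in AFFECTED_BUILDS:
        return "affected"
    if build in NOT_AFFECTED_SO_FAR:
        return "not affected so far"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical sample input; replace with the lines from your own hosts.
    sample_lines = [
        "VMware ESXi 6.0.0 build-3620759",
        "VMware ESXi 6.0.0 build-4600944",
    ]
    for line in sample_lines:
        print(line, "->", classify(line))
```

Anything that comes back as "unknown" simply means we did not see that build here, so treat it with care before touching the iLO firmware.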
The customer has been deep diving into this, as a reinstall would be a huge task and in the meantime they could not afford a power issue: it would cripple the whole environment instantly. Cris, the IT admin, found a workaround I think you should know about. Based on his testing, this is how to get your hosts back online without disrupting production:
- Downgrade iLO to the last stable release (2.40)
- Power down the host (not gracefully) via iLO
- Unplug both power cords
- Eject and re-insert the SD card
- Take a long coffee break (5-10 minutes)
- Re-insert the power cords
- Enjoy your ESXi host
Just thought I'd share this; hope it helps.