Debugging VMware ESXi purple screen
Another day another challenge, today I sat down to troubleshoot a purple screen for a customer. He called me last night that the Host crashed while he did a recompose.
The customer has a VMware Horizon View 6 environment and VMware ESXi 5.5 hosts.
The environment is pretty simple and freshly installed a few months ago.
The purple screen the customer experienced is shown below.
VMware has a knowledge base article about this issue, read about it here
The cause for the issue seems to be a new AIO Library feature to improve SESparse performance. SESparse has been introduced with VMware View 5.1 and is now available in ESXi 5.5 by default.
Well there seems to be no resolution yet, so I had to go to each ESX host and check the settings for this features. After enabling SSH access to the hosts I go in with Putty and enter the commands to see if the setting is enabled.
The first command gives a 1 back which means the feature is enabled. The second command gives a 0 back. A 0 means the feature is disabled.
The workaround from VMware says I should disable the feature, here it means I only have to disable the writes option. I have to check this with all hosts, not that many but still some to go.
The command shown below disables the feature, it’s a live action that can be ran on a live host.
There is one more command that could be used, here I didn’t need it for it was disabled already…
The command is: esxcfg-advcfg -s 0 /FDS/FDSEnableCoalesceReads