Upgrade to vSphere 5 Update 1 with MSCS
ESXi 5.0 uses a different technique to determine whether Raw Device Mapped (RDM) LUNs are used as MSCS cluster devices. During the boot of an ESXi system, the storage mid-layer attempts to discover all presented devices in the device claiming phase. MSCS LUNs that hold a permanent SCSI reservation cause the boot process to take much longer, because the ESXi host cannot interrogate those LUNs.
This happened after I upgraded one ESXi node in a 13-node cluster: it took approximately 30-45 minutes until the host was back online. I ran into the same problem a year ago when I upgraded a vSphere 4.1 cluster; the behavior was identical.
The workaround for the vSphere 4.1 cluster was to set the advanced option Scsi.CRTimeoutDuringBoot to 1, but in ESXi 5.0 this option no longer exists.
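For reference, on the 4.1 hosts that option could also be set from the shell instead of the vSphere Client; a minimal sketch, assuming the option is exposed under the usual /Scsi path of esxcfg-advcfg:

# ESX(i) 4.1 only - set the boot-time reservation timeout option to 1
esxcfg-advcfg -s 1 /Scsi/CRTimeoutDuringBoot
# Verify the current value
esxcfg-advcfg -g /Scsi/CRTimeoutDuringBoot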
After a little research I found this VMware KB article.
Before you upgrade to vSphere 5.0 or 5.0 Update 1:
- Determine which RDM LUNs are part of the MSCS cluster and take note of their naa IDs (see the sketch after this list for one way to find them from the ESXi shell)
- Unpresent the RDM LUNs and upgrade to 5.0 or 5.0 Update 1
- After the reboot, use this esxcli command to mark the device as perennially reserved (this also works while the LUN is unpresented, but only on ESXi 5.0 hosts):
esxcli storage core device setconfig -d naa.id --perennially-reserved=true
- Re-present the RDM LUNs to the ESXi host and rescan
- To verify that the LUN is perennially reserved, use this command:
esxcli storage core device list -d naa.id (look for the entry “Is Perennially Reserved: true”)
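To illustrate the procedure end to end, here is what it could look like in the ESXi shell. The RDM pointer path and the naa IDs below are made-up placeholders, so substitute your own values:

# Hypothetical example - find the vml ID behind the RDM pointer vmdk of a clustered VM
vmkfstools -q /vmfs/volumes/datastore1/mscs-node1/quorum-rdm.vmdk
# Map the reported vml ID to its naa ID (the symlink points at the naa device)
ls -l /vmfs/devices/disks/ | grep vml.0200
# Mark the MSCS device as perennially reserved (note: two dashes before the option)
esxcli storage core device setconfig -d naa.60a98000572d54724a34655733506751 --perennially-reserved=true
# Verify the flag
esxcli storage core device list -d naa.60a98000572d54724a34655733506751 | grep -i "perennially reserved"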
After this procedure you can reboot the ESXi host, and it will boot as fast as it did before the upgrade.
Remember that this is a per-host setting, so you have to apply it on every host in your cluster on which MSCS resides (the loop sketch below can save some typing).
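If you have more than a couple of MSCS RDMs, a small loop in the ESXi shell helps; a sketch assuming you collect the naa IDs yourself (the IDs below are placeholders) and run it on every affected host:

# Placeholder naa IDs - replace with the RDM LUNs of your MSCS cluster
for DEVICE in naa.60a98000572d54724a34655733506751 naa.60a98000572d54724a34655733506752
do
   esxcli storage core device setconfig -d $DEVICE --perennially-reserved=true
   esxcli storage core device list -d $DEVICE | grep -i "perennially reserved"
done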
What does it mean by “unpresent the RDM LUNs and upgrade to 5.0 or 5.0 Update 1”? How do you do this through the vSphere console?
It means that you have to unzone the RDM LUNs on the storage side from the ESXi host, so that the host no longer sees them.