Cisco UCS Boot From iSCSI CHAP Password Issue

First, let’s get the jokes out of the way: yes, it’s a technical blog post. No, I haven’t forgotten I had a blog. I know, I know.

One of the projects that I’ve been working on at SolidFire for the last few months is putting together the VMware and Cisco UCS reference architecture, and helping the team put together a UCS best practices document.

Part of that design is to take advantage of the stateless nature of the UCS blade platform by booting the ESXi hosts from the SolidFire array. UCS has supported this feature for a while, but the process certainly has its share of quirks. I’m not going to walk through the entire process for setting up iSCSI boot-from-SAN, although if there’s any interest in that part I can probably put together a Camtasia video walkthrough.

What I did want to show was an “undocumented feature” we ran into when trying to use CHAP authentication to the SolidFire array as part of the booting process.

Here’s a boot volume we’ve created on the array called “BOOT-AICV-PVH8”:
image

You can see that it’s associated with a SolidFire account called “BOOT-AI”, and if we look at that account, you can see that it has random target and initiator passwords that the system generated for it.
image

Using this account information we create an iSCSI Authentication Profile in UCSM:
image

Maybe, at this point, you are thinking “Gee, it doesn’t make ANY SENSE AT ALL that there would be a password complexity setting on the profile, since it’s just passing that back to whatever storage array is being used.” That would be a logical thought, but you’d be wrong. Luckily, there’s a pretty good help function that gives you the information you are looking for.
image

Here’s the problem: this is a total lie. It’s plain not true. Watch what happens when we use the standard, random password that was generated with the account.

First, we set the iSCSI Boot Parameters in the Boot Order screen in UCSM for the appropriate service profile, using the Volume IQN from the boot volume and the authentication profile we created earlier:
image

Then we associate the service profile with a server and let it boot. It goes from hardware test:
image

…to splash screen:
image

…to initialization screen:
image

…to BIOS. Do not pass go, do not collect $200:
image

When you get to this point, you try to do some basic troubleshooting from the FI, and when you try to pull the iSCSI configuration for this blade, you get no data returned, indicating that the iSCSI stack never initialized at all:
image

In fact, if we look at the VIF Paths on the blade, and then even check the ARP table on the FIs, we see that the network stack never tried to initialize. If you were to boot from a live CD at this point, you’d see that there are no network adapters connected to the blade at all. All of this because you followed directions and put a password with 12 to 16 characters into a field that returned no errors.
image

Yes, this took me a little more than two days to troubleshoot. I wonder if Cisco has an Internet of Things script that can get that back for me…

So, the fix is pretty simple. We go into the SolidFire array and change the account password to something that uses only alphanumeric characters plus the - (hyphen), _ (underscore), : (colon), and . (period).
image
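If you’d rather catch a bad secret before burning a reboot cycle on it, a quick check like the one below can help. This is only a sketch based on what worked for us: the allowed character set comes from the fix above, the 12 to 16 length range comes from the UCSM help text, and I’m assuming “alphanumeric” means plain ASCII letters and digits. The function names are made up for this post; none of this is a Cisco or SolidFire API.

```python
import secrets
import string

# Assumption: "alphanumeric" means ASCII letters and digits. The four extra
# characters are the ones that worked for us: hyphen, underscore, colon, period.
ALLOWED_CHARS = frozenset(string.ascii_letters + string.digits + "-_:.")

# Length range quoted in the UCSM help text for the authentication profile.
MIN_LEN, MAX_LEN = 12, 16


def offending_chars(password):
    """Return a sorted list of the characters that fall outside the allowed set."""
    return sorted(set(password) - ALLOWED_CHARS)


def is_ucs_safe(password):
    """True if the length is in range and every character is in the allowed set."""
    return MIN_LEN <= len(password) <= MAX_LEN and not offending_chars(password)


def generate_secret(length=MAX_LEN):
    """Build a random secret using only the allowed characters (hypothetical helper)."""
    if not MIN_LEN <= length <= MAX_LEN:
        raise ValueError(f"length must be between {MIN_LEN} and {MAX_LEN}")
    alphabet = sorted(ALLOWED_CHARS)
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Run `offending_chars()` against the secret the array generated to see exactly which characters are likely to trip things up, or use `generate_secret()` to produce a replacement to set on the account by hand.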

Then we update the iSCSI Authentication Profile and reboot the blade. Now, on the initialization screen, we see the Cisco VIC iSCSI adapters initialize and connect to the boot volume properly.
image

And when we look at the FI, we get the entire iSCSI config coming through fine:
image

Now, the host boots just fine. Hopefully this helps someone else out there having issues.