Upgrading to VCSA 6 fails

I began an upgrade of the VMware vCenter Server Appliance from 5.5 to 6 for a small (in VMware’s own terminology ‘Tiny’) vSphere environment of 3 hosts and about 30 VMs. I certainly didn’t anticipate any trouble beyond the usual hassles associated with upgrading an infrastructure-level service like vCenter.

Unfortunately, despite carefully following VMware’s documented procedures and the best advice some of my favorite blogs had to offer (see: VMware KB 2109772, http://www.vladan.fr/how-to-upgrade-from-vcsa-5-5-to-6-0/, and http://www.virtuallyghetto.com/2015/09/how-to-upgrade-from-vcsa-5-x-6-x-to-vcsa-6-0-update-1.html ), the upgrade still failed.

When the installer failed, the only message the web-installer had to offer was the well-known: Firstboot script execution error.

Firstboot script execution error

Furthermore, the log files failed to download, leaving me with fewer diagnostic resources than I might otherwise find on my Windows desktop.

Normally, the Firstboot script execution error is the result of incorrectly configured DNS, so the first thing I tested was forward and reverse DNS; both were working perfectly.

I decided to delve further into the issue and found that the SSH daemon on the VM had started, so I connected with PuTTY for a look around.

Side note: upgrading from VCSA 5.5 to VCSA 6 requires the creation of a brand-new VM, followed by the supposedly automatic migration of data. This leaves your original VCSA powered off while the new VM is intended to take its place.

As long as I was in with SSH (PuTTY), I did some more poking around, and finally ran: df -h on the VM that was supposed to be my new ‘upgraded’ VCSA 6.

The problem was immediately apparent: the /storage/seat partition (virtual disk) was completely full! In VCSA 6, /storage/seat holds the Stats, Events, Alarms and Tasks (SEAT) data for the Postgres database.

Side note: VCSA 6 puts all of its primary partitions on separate virtual disks, 11 in total. This is a great advantage for long-term scalability, but somewhat of a disadvantage as compared to one disk where every partition can grow to the capacity of the disk. To learn more about what all of the different partitions/disks do, look at this excellent write-up on virtuallyGhetto: Multiple VMDKs in VCSA 6.0?

What I (and VMware) had failed to take into account in sizing the VCSA was the potential for an extraordinary number of Tasks and Events. While this may be a ‘Tiny’ deployment by VMware’s standards, a Horizon View environment plus Veeam Backup and Replication running on a sub-1-hour RPO generates a number of Tasks and Events more like what VMware seems to expect from a ‘Medium’ deployment.

One potential solution may be to reclaim or purge data from the Postgres database on the VCSA 5.5 before trying the upgrade, but the owner decided in favor of preserving all of the data if possible.

The Solution

In the end, the solution was simply to select a larger deployment size while going through the web-installer wizard. As it turned out, the /storage/seat disk for a ‘Tiny’ deployment was only 10GB, while it was 50GB for a ‘Medium’ deployment.

During the upgrade (which took over 2 hours), I connected via SSH as soon as the daemon had started and ran df -h a number of times (I should’ve used: watch). I saw the /storage/seat volume grow slowly, eventually reaching over 17GB of used space, before settling back to 16GB on the successful upgrade.
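For anyone who would rather log the growth than retype the command, here is a minimal sketch of the same polling idea in Python. It assumes a Python 3 interpreter is available on the machine where the volume is mounted, which I have not verified on the appliance itself; the function names, the 30-second interval, and the 90% warning threshold are my own choices for illustration. On the appliance, watch -n 30 df -h /storage/seat does the same job with no dependencies.

```python
import shutil
import time


def percent_used(used_bytes, total_bytes):
    """Return used space as a percentage of total, rounded to one decimal."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return round(100.0 * used_bytes / total_bytes, 1)


def watch_mount(path="/storage/seat", interval_s=30, samples=10, warn_at=90.0):
    """Print the usage of `path` every `interval_s` seconds, `samples` times."""
    for _ in range(samples):
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        pct = percent_used(usage.used, usage.total)
        flag = "  <-- nearly full" if pct >= warn_at else ""
        print(f"{path}: {usage.used / 2**30:.1f} GiB used ({pct}%){flag}")
        time.sleep(interval_s)


if __name__ == "__main__":
    watch_mount()
```

The percentage here is simply used divided by total, so it can differ slightly from the Use% column that df reports.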

The only drawback I can think of to having specified a ‘Medium’ deployment for a relatively small environment is that the VM is now allocated far more than it needs: 8 vCPU and 24GB RAM. At my earliest opportunity, I plan to shut the VCSA down and scale it back to around 4 vCPU and 16GB RAM, to better suit the environment.

.NET 3.5 in Windows 10 while offline

I was recently working on a Windows 10 Desktop with an isolated network, when the need to install the VMware vSphere Client for Windows arose. Of course, the vSphere Client requires .NET Framework 3.5, and Windows 10 presents special challenges to those of us who are forced to work without a connection to the Internet.

Here’s how to accomplish the installation offline, provided you have the installation media, or a copy of the SxS folder from the media.

I copied the x64\sources\sxs\ folder from the media (actually a USB drive) to C:\sxs on the VM before I ran the command, but there is no reason these steps wouldn’t apply to any Windows 10 system, using any type of media.

Once I had the sxs folder on the root of C:\, I ran the command:

dism /online /enable-feature /featurename:NetFx3 /all /source:C:\sxs

.NET Framework offline installation in Windows 10

and the whole installation took about 30 seconds!
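If you need to repeat this on several isolated machines, the command can be wrapped in a small script. The following is only a sketch, assuming Python 3 is installed on the target system and that the script is run from an elevated (Administrator) prompt, which DISM requires for /online operations. The function names are mine, and the optional /limitaccess switch (a standard DISM option that stops it from trying Windows Update) was not part of the command I actually ran.

```python
import subprocess
from pathlib import Path


def netfx3_dism_command(source=r"C:\sxs", limit_access=False):
    """Build the DISM argument list that enables .NET 3.5 from a local SxS folder."""
    cmd = [
        "dism", "/online", "/enable-feature",
        "/featurename:NetFx3", "/all", f"/source:{source}",
    ]
    if limit_access:
        cmd.append("/limitaccess")  # optional: never contact Windows Update
    return cmd


def install_netfx3(source=r"C:\sxs"):
    """Run DISM after checking that the SxS folder is really there."""
    if not Path(source).is_dir():
        raise FileNotFoundError(f"SxS folder not found: {source}")
    # check=True raises CalledProcessError if DISM exits with an error code
    return subprocess.run(netfx3_dism_command(source), check=True)


if __name__ == "__main__":
    install_netfx3()
```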

iSCSI with Jumbo Frames and Port Binding

10Gb iSCSI

Immediately after installing an ESXi Server, you may or may not have any storage at all. Most ESXi servers today are diskless, with the ESXi installation living on some sort of flash-based storage. In this case a fresh installation of ESXi will present with no persistent storage whatsoever, as you can see in the example below:

Diskless ESXi Installation

Some servers, however, still have traditional disks, as in the screenshot below, where you can clearly see the HP Serial Attached SCSI Disk, a.k.a. “Local Storage” or “Directly Attached Storage,” but no other storage is listed.

ESXi Installed with disks

In either case, it should be noted that it is a Best Practice, when installing ESXi to disk or media, to use a RAID 1 configuration for the location of the ESXi installation.

iSCSI Storage Network Examples

Our first task will be creating a network for iSCSI storage traffic. This should be a completely separate network from Production or Management networks, but how you create the separation is entirely up to you.

Most VMware admins will prefer a physically separate network, but in the days of 10Gb NICs and a relatively smaller number of interfaces[1], VLAN separation will work as well.

Example of a valid iSCSI configuration using VLANs to separate networks

Example of a valid iSCSI configuration on dedicated NICs

Configuring an iSCSI Network on dedicated NICs

Creating a network for iSCSI Storage

Let’s begin by adding an entirely new vSwitch for our iSCSI Network. Click on: Add Networking in the upper-right corner of the screen

Select: VMkernel as the network type

Choose to create a vSphere Standard switch, using all of the interfaces which are connected to your iSCSI Storage network. In this example, vmnic4 and vmnic5 are connected to the iSCSI Storage network

For the Network Label of your first VMkernel connection, choose a name that can be remembered sequentially. I always create my VMkernel connections for iSCSI in the following order:

  • VMkernel-iSCSI01
  • VMkernel-iSCSI02
  • VMkernel-iSCSI03
  • VMkernel-iSCSI04

If I have 2 physical uplinks (NICs), I will create 2 VMkernel connections for iSCSI. If I have 4 uplinks, I will create 4 VMkernel connections for iSCSI. Following this standard for your iSCSI configuration will conform with VMware requirements for Port Binding[2] and assist you in establishing the order in which you bind the VMkernel connections.

Set a VLAN ID, if that is appropriate for your environment

Now choose an IP and Subnet Mask that will be unique to your ESXi host, on the iSCSI network

iSCSI IP Plan

I like to set up my iSCSI networks with an orderly IP schema. If you were to use a bunch of sequential IP addresses for VMkernel connections, you would leave no room for orderly expansion.

In order to allow for the orderly expansion of the number of ESXi hosts and/or the number of iSCSI VMkernel connections, I choose to increment my IP addresses by a given amount. For example, to anticipate a maximum of 20 ESXi hosts in a given environment, I would increment all of my VMkernel IP addresses by 20 like this:

          VMkernel-iSCSI01   VMkernel-iSCSI02   VMkernel-iSCSI03
ESXi #1   10.0.0.101         10.0.0.121         10.0.0.141
ESXi #2   10.0.0.102         10.0.0.122         10.0.0.142
ESXi #3   10.0.0.103         10.0.0.123         10.0.0.143
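The increment-by-20 scheme is easy to generate rather than type by hand. Here is a minimal sketch in Python; the function name, the 10.0.0.100 base address, and the defaults are assumptions for illustration, and it does not check that the results stay inside your subnet.

```python
import ipaddress


def iscsi_ip_plan(base="10.0.0.100", hosts=3, vmkernels=3, step=20):
    """Return {host_number: [ip, ...]} with one address per iSCSI VMkernel.

    Host N gets base+N for its first VMkernel, and each further VMkernel
    adds `step`, so `step` is also the maximum number of hosts.
    """
    if hosts > step:
        raise ValueError("step must be at least the number of hosts")
    base_ip = ipaddress.IPv4Address(base)
    plan = {}
    for host in range(1, hosts + 1):
        plan[host] = [str(base_ip + host + vmk * step) for vmk in range(vmkernels)]
    return plan


if __name__ == "__main__":
    for host, addresses in iscsi_ip_plan().items():
        print(f"ESXi #{host}", *addresses)
```

Adding a fourth VMkernel later is then just a matter of raising vmkernels to 4; the existing addresses do not move.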

Click on: Finish

After you click: Finish, you will see the new vSwitch (in this case, vSwitch1)

On vSwitch1, click: Properties (be careful, there are separate “Properties” dialogs for each vSwitch and for the overall Networking as well!)

Click: Add

Choose: VMkernel

For the network label, choose a name that follows (sequentially) the VMkernel connection you created earlier.

Set a VLAN ID, if that is appropriate for your environment

Set an IP that follows the convention you established earlier. In my case, I am going to increment each VMkernel by 20.

Click: Finish

Repeat the previous steps for any additional iSCSI VMkernel connections you may be creating. You may only bind one iSCSI VMkernel per available uplink (vmnic)

You will now find yourself on the Properties dialog for the vSwitch you created.

Highlight the vSwitch itself, and click: Edit

If you have chosen to use Jumbo Frames, set the MTU to 9000

Jumbo Frames MUST be configured on the vSwitch prior to setting the VMkernel MTU above 1500

Click: OK

Now select the first (lowest numbered) iSCSI VMkernel and click: Edit

If you have chosen to use Jumbo Frames, set the MTU to 9000 and then go to the NIC Teaming Tab

Our goal in this dialog is to remove all aspects of load-balancing and failover from the vSwitch in order to enable Port Binding. Port Binding will allow the vSphere Path Selection Policy (PSP) to more effectively balance iSCSI loads and implement failover in the event it is required.

In order to implement Port Binding, we must leave only one active NIC

VMkernel network adapter

Choose: Override switch failover order

Since this is the lowest-numbered iSCSI VMkernel, we are going to give it the lowest-numbered vmnic, in this case vmnic4.

Highlight all other vmnics and click the Move Down button until they are all listed in Unused Adapters. Don’t be tempted to leave any NICs in Standby; Port Binding will not work that way, per VMware policy!

Click: OK

Now choose the next iSCSI VMkernel and choose: Edit

If you have chosen to use Jumbo Frames, set the MTU to 9000 and then go to the NIC Teaming Tab

Since this is the next iSCSI VMkernel, we are going to give it the next vmnic, in this case vmnic5.

Highlight all other vmnics and click the Move Down button until they are all listed in Unused Adapters. Don’t be tempted to leave any NICs in Standby; Port Binding will not work that way, per VMware policy!

Click: OK

You may (it is a good idea to) check your settings by highlighting the vSwitch and each VMkernel in order and viewing the settings in the right-side column
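When checking the settings, it helps to have the target layout written down first. The sketch below is a hypothetical helper of my own, not a VMware tool or API: it pairs each iSCSI VMkernel with exactly one active uplink, in the order given, and lists every other uplink as unused with nothing in standby.

```python
def port_binding_teaming(vmkernels, uplinks):
    """Map each VMkernel to one active uplink; all other uplinks are unused.

    Both lists are taken in the order given (lowest-numbered first), and
    they must be the same length: one VMkernel per uplink.
    """
    if len(vmkernels) != len(uplinks):
        raise ValueError("need exactly one iSCSI VMkernel per uplink")
    plan = {}
    for vmk, active in zip(vmkernels, uplinks):
        plan[vmk] = {
            "active": active,
            "standby": [],  # must stay empty for Port Binding
            "unused": [nic for nic in uplinks if nic != active],
        }
    return plan


if __name__ == "__main__":
    layout = port_binding_teaming(
        ["VMkernel-iSCSI01", "VMkernel-iSCSI02"], ["vmnic4", "vmnic5"]
    )
    for name, teaming in layout.items():
        print(name, teaming)
```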

Repeat the previous steps for any additional iSCSI VMkernel connections you may have created and then click: Close

This is what the finished Standard vSwitch networking configuration should look like
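If you chose Jumbo Frames, a common sanity check once the vSwitch and VMkernel MTUs are set is a don't-fragment ping with the largest payload that fits in a single frame. The payload is the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header. The sketch below just does that arithmetic and builds the matching vmkping command line for the ESXi shell; the target address and vmk name are placeholders, and the physical switches and SAN ports must be set for Jumbo Frames as well for the test to pass.

```python
IPV4_HEADER_BYTES = 20  # minimum IPv4 header, no options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload <= 0:
        raise ValueError("MTU too small for an ICMP echo")
    return payload


def vmkping_command(target_ip, mtu=9000, vmk=None):
    """Build a vmkping line: -d sets don't-fragment, -s sets the payload size."""
    parts = ["vmkping", "-d", "-s", str(max_ping_payload(mtu))]
    if vmk:
        parts += ["-I", vmk]  # send from a specific VMkernel interface
    parts.append(target_ip)
    return " ".join(parts)


if __name__ == "__main__":
    # 10.0.0.10 and vmk1 are placeholders for your SAN portal and VMkernel port
    print(vmkping_command("10.0.0.10", mtu=9000, vmk="vmk1"))
```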

iSCSI Software Adapter

Choose: Storage Adapters and then click: Add

Choose: Add Software iSCSI Adapter

Click: OK

Right-click on the iSCSI Software Adapter and choose: Properties

Select the tab: Network Configuration

Click: Add

Choose one of the iSCSI VMkernel connections from the list and click: OK

Now click: Add

Choose the other iSCSI VMkernel connection and click: OK

Repeat the previous process until all VMkernel iSCSI Connections are bound. DO NOT add Management Network connections, if they are available

Choose the tab: Dynamic Discovery

Click: Add

Enter the discovery IP of your SAN.

Enter just one address, one time.

Click: OK

The address will appear after some seconds.

Click the Static Discovery tab and take note of how many paths, or targets, your SAN presents (the more, the better!)

Click: Close

Click: OK

In a few seconds, you should see a listing of all of the devices (LUNs) available on your SAN

Creating a VMFS 5 Volume

Click on the option: Storage (middle column, in blue) and then click: Add Storage

Choose: Disk/LUN and then: Next

Choose from the available devices (LUNs) and click: Next.

Click: Next

Name your Datastore and click: Next

Choose: Maximum available space and click: Next

In reality, there is very little reason for choosing anything other than “Maximum available space”

Click: Next

Click: Finish

And your new VMFS 5 volume will be created!

  1. vSphere 6 Configuration Maximums
  2. Multipathing Configuration for Software iSCSI Using Port Binding

 

Using VMware Paravirtual devices

VMware Paravirtual

One of the most common oversights in vSphere deployments is a failure to use the Paravirtual drivers that VMware has provided us for networking and storage.

On a physical platform, one chooses supported device(s) for networking and storage, and then installs the correct driver(s) to support those devices. For example, on a physical system, you might specify LSI SAS for storage and Intel E1000 NICs for network. That particular combination is, in fact, so common that Operating Systems like Windows have the drivers for those devices pre-installed so they will be recognized both during and after installation. The ‘during’ part is particularly important too, because if the storage driver is not present at the time of install, the hard disk will not be recognized, and the installation fails!

On a virtual platform, it’s a completely different story. Even if the host ESXi server actually has LSI SAS storage adapters and Intel E1000E NICs, there is no correlation to the network and storage devices for Virtual Machines. In fact, if you choose LSI or Intel (they are the default choices for Windows Server VM builds), the only potential benefit will be that Windows includes those drivers by default. You will, in fact, be emulating the corresponding physical devices by LSI and Intel, with a resulting loss of performance!

The only true native storage and network devices for vSphere VMs are the VMware Paravirtual SCSI ( pvscsi ) and Network ( vmxnet3 ) device types and corresponding drivers. The problem is that, while nearly all Linux distros include support for Paravirtual devices by default, Microsoft is not so magnanimous. Windows users choosing to use either (or both) of the VMware Paravirtual device types will have to install the corresponding drivers.

In most cases, VMware Paravirtual devices are supported for installation in Windows Family 5 (Server 2003, XP) and later, and natively supported by most Linux OS.

Benefits of using VMware Paravirtual SCSI and Network devices include:

  • Better data integrity[1] as compared to Intel E1000e
  • Reduced CPU Usage within the Guest
  • Increased Throughput
  • Less Overhead
  • Better overall performance

I have created an example Windows Server 2012 R2 VM using only the default E1000e and LSI SAS device types, and I am going to show you how easy it is to convert from the default (emulated physical) to VMware Paravirtual drivers. For the following steps to work, VMware Tools must be installed in the VM which is being updated.

Upgrading a VM to vmxnet3 Paravirtual Network Adapter

During the following procedures, it is important to use the Virtual Machine Remote Console (as opposed to RDP) because we will be causing a momentary disconnection from the network.

The biggest challenge is that the static IP address, if assigned, is associated with the device and not with the VM. Therefore, when you upgrade to the vmxnet3 adapter, your challenge will be un-installing and eliminating any trace of the “old” NIC to avoid seeing the dreaded message: “The IP address XXX.XXX.XXX.XXX you have entered for this network adapter is already assigned to another adapter”[2].

Using the VMRC, log in to your Windows VM and run the device manager with: devmgmt.msc

You will see that the Network adapter is clearly listed as an Intel

Now go to the Network and Sharing Center and click on any (all) of the active Networks to observe their settings

You will notice that the speed is clearly 1.0 Gbps

Click on: Properties

Choose TCP/IPv4 and then click: Properties

Take note of the IP Address, Subnet Mask, Gateway, and DNS

Go to: VM > Edit Settings

Remove the Network Adapter(s) from the VM and click OK. In truth, you could both remove the old adapter and add the new vmxnet3 adapter simultaneously, but we will do it in separate steps for clarity.

Notice, the active networks list is empty

Although we have removed the device from the VM, we have not removed its configuration from the system. Therefore, the IP address we saw earlier is still assigned to the E1000e Virtual NIC we just removed. In order to cleanly install a Paravirtual NIC, we need to remove the Intel NIC completely.

Open a command window and run the following commands. Device Manager must be started from this same window, after the variable is set, so that it inherits the setting:

set devmgr_show_nonpresent_devices=1

start devmgmt.msc

After the device Manager window is open, select: View > Show Hidden Devices

Many admins falsely believe that it is enough simply to show hidden devices, but this is not true. It is absolutely necessary to set “devmgr_show_nonpresent_devices” at the command line first!

You should now be able to find the (now removed) Intel NIC listed in lighter text than the devices which remain present.

Right-click and select: Uninstall

OK

And it’s gone!

Go to: VM > Edit Settings

Edit Settings

 

Click: Add

Choose: Ethernet Adapter

Set the Type to: VMXNET 3 and then choose the appropriate Network Connection (usually VM Network), then click: Next

Click: Finish

Now click: OK

You will see the vmxnet3 Ethernet Adapter added to the Device Manager

Now click the active network, in this case “Ethernet”

Notice the speed listed as 10 Gbps. This does not mean that there are 10 Gbps NICs in the ESXi host, merely that the observed speed of the network for this VM is 10 Gbps.

Click on: Properties

Now choose: TCP/IPv4 and select: Properties

Re-assign all of the IP addresses and subnet mask you observed earlier

And you have upgraded to the VMware Paravirtual device VMXNET 3

Upgrading a VM to pvscsi VMware Paravirtual SCSI Adapter

The trick in switching to the VMware Paravirtual SCSI adapter is in adding a dummy disk to the VM, which will force Windows to install the pvscsi driver, included with the VMware Tools package you have installed as part of a separate process.

Start the device manager with devmgmt.msc

Observe the LSI Adapter listed under Storage Controllers

Go to: VM > Edit Settings

Edit Settings

Choose: Add

Select: Hard Disk and then: Next

Choose: Create a new virtual disk and then: Next

The disk you create can be almost any size and provisioning type. We chose 10 GB. Click: Next

In this step, it is critical that you place the new disk on a unique SCSI node. That is to say, if the existing disk is on 0:0, then place the new disk on 1:0 (you must not combine it with any LSI nodes, such as 0:1, or the process will not work)
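The rule for picking the node can be stated precisely: the first number of a SCSI node is the controller (bus) and the second is the unit, so the dummy disk needs a bus number that no existing disk uses. Here is a small illustrative sketch of that rule in Python. The helper is hypothetical, and the limit of four SCSI controllers per VM (buses 0 through 3) is stated from my own knowledge rather than from anything shown in the dialogs above.

```python
MAX_SCSI_CONTROLLERS = 4  # assumed per-VM limit: buses 0 through 3


def next_free_scsi_node(used_nodes):
    """Return 'bus:0' for the first SCSI bus that no existing disk is using.

    `used_nodes` is a list of 'bus:unit' strings such as '0:0' or '0:1'.
    """
    used_buses = {int(node.split(":")[0]) for node in used_nodes}
    for bus in range(MAX_SCSI_CONTROLLERS):
        if bus not in used_buses:
            return f"{bus}:0"
    raise ValueError("every SCSI controller already has a disk attached")


if __name__ == "__main__":
    # A VM with two disks on the original LSI controller
    print(next_free_scsi_node(["0:0", "0:1"]))
```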

Now click: Finish

Notice, you have added, not just a disk, but also a New SCSI Controller.

Now click: Change Type

Select: VMware Paravirtual

Now click: OK

Once the disk is added, look again in the Windows Device Manager and make sure that you can see the VMware PVSCSI Controller. If you can, that means the PVSCSI drivers have successfully loaded, and you can proceed.

Now we have to shut down the VM.

Shut Down Guest

Once the VM is off, Go to: VM > Edit Settings

Edit Settings

Choose the dummy disk (whichever one it was, BE CAREFUL HERE!) and click: Remove

Although I failed to do so in creating this demo, you probably want to choose “Remove from virtual machine and delete files from disk,” to avoid leaving orphan files around.

Now select the SCSI controller(s) which are not already Paravirtual and choose: Change Type

Select: VMware Paravirtual

Now click: OK

Power your VM back on and observe that only the VMware Paravirtual device remains!

It should be noted that, just as with the Intel NIC, the LSI device remains as a “nonpresent” device. If you feel like going the extra mile, repeat the steps to show nonpresent devices and uninstall the LSI device!

  1. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2058692
  2. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1179