Category: VMware ESXi

VMware ESXi is a Type 1 (“bare metal”) hypervisor, offered for free as the VMware vSphere Hypervisor or as part of the vSphere platform under a wide variety of licensing arrangements.

Changing existing LUNs to Round Robin on ESXi

In the following steps, I am going to show you how to set all of the VMFS Volumes (LUNs) on an ESXi Host to use the PSP known as Round Robin, using only the ESXi Shell and/or SSH. This is clearly the simplest and most direct method of changing the PSP for existing volumes, and it is available from all ESXi Hosts in every environment.

There are other ways of changing the PSP, including using the vSphere Client and setting each VMFS Volume individually or using the VMware vSphere PowerCLI and setting the PSP for all of the VMFS Volumes at once, but either of these methods may be undesirable or unusable in any given situation:

  • Using the vSphere client and setting VMFS Volume PSP for each LUN individually would be extremely time-consuming if there were more than just a few ESXi Hosts or volumes.
  • Some environments may not have a vCenter Server, or the vSphere PowerCLI may not be available at the time you need to change the PSP.

Determining the default PSP for an ESXi Host

In the following example, we have an ESXi Host on which all of the VMFS Volumes have been created using the Most Recently Used (MRU) PSP, which is not the optimal choice for our SAN.

MRU

To begin, let’s check what the default PSP is for ALUA arrays (VMW_SATP_ALUA) for new VMFS volumes on this ESXi Host.

Run the command:

esxcli storage nmp satp list

In the first line of the output, we can see that the default PSP for ALUA Arrays is Most Recently Used (VMW_PSP_MRU), which is not correct or desirable for our SAN.

Change the default PSP for new VMFS Volumes to Round Robin.

Run the command:

esxcli storage nmp satp set --default-psp=VMW_PSP_RR --satp=VMW_SATP_ALUA

And check your success by running the command:

esxcli storage nmp satp list

Notice, the association for VMW_SATP_ALUA is now VMW_PSP_RR; or put in simpler terms, we have changed the default PSP from Most Recently Used to Round Robin for ALUA Arrays. Unfortunately, even though we changed the default PSP for the ESXi Host, all of the existing VMFS Volumes retain their former PSP.
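Before changing anything, you can see which PSP each existing device is currently using; the output of ‘esxcli storage nmp device list’ includes a “Path Selection Policy” line for every device. A quick check (a rough sketch only; it lists local devices as well as SAN LUNs):

esxcli storage nmp device list | grep "Path Selection Policy:"

On our example host, the lines for the SAN LUNs still show VMW_PSP_MRU.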

Changing existing VMFS volumes to use Round Robin

Existing VMFS Volumes may be changed to Round Robin one at a time, or, by using a scriptlet, we can find all of the VMFS Volumes on a host and change them all to use Round Robin at once!

First, list all of the LUNs by running the command:

ls /vmfs/devices/disks | grep naa.600

You will see two lines for each LUN, one is the device (first 36 characters) and the other is the first partition (:1).

Because we only need to set the PSP for the device and not the partition, we will cut the first 36 characters from each line of our grep output into the loop variable ‘i’, and pass each value to the command ‘esxcli storage nmp device set --device’, inserting ‘$i’ in place of the device name, like this:

Run the command:

for i in `ls /vmfs/devices/disks/ | grep naa.600 | cut -b 1-36`; do esxcli storage nmp device set --device $i --psp VMW_PSP_RR; done

When complete, you will find that all of the VMFS Volumes on this ESXi Host have been switched to using the PSP Round Robin!
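If you want a quick sanity check that nothing was missed, you can count how many devices still report MRU (a rough indicator only, since it would also count any non-SAN devices that happen to use MRU):

esxcli storage nmp device list | grep -c VMW_PSP_MRU

If every VMFS Volume was switched, the count should be 0.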

Timekeeping on ESXi

Timekeeping on ESXi Hosts is a particularly important, yet often overlooked or misunderstood topic among vSphere Administrators.

I recall a recent situation where I created an anti-affinity DRS rule (separate virtual machines) for a customer’s domain controllers. Although ESXi time was correctly configured, the firewall had been recently changed and no longer allowed NTP. As it happened, the entire domain was running fine and time was correct before the anti-affinity rule took effect. Unfortunately, as soon as the DC migrated (based on the rule I created), its time was synchronized with the ESXi host it was moved to, which was approximately 7 minutes slow! The net result was that users immediately experienced log-in issues.

Unfortunately, when you configure time on your ESXi Host, there is no affirmative confirmation that the NTP servers you specified are reachable or valid! It doesn’t matter whether you add correct NTP servers or completely bogus addresses to the Time Configuration; either way, the ESXi host will report that the NTP client is running and seemingly in good health! Moreover, there is no warning or alarm when NTP cannot sync with the specified server.

Let’s create an example where we add three bogus NTP servers:

In this example, you can see the three bogus NTP servers, yet the vSphere Client reports that the NTP Client is running and there were no errors!

The only way to tell if your NTP servers are valid and/or functioning is to access the shell of your ESXi host (SSH or Console) and run the command: ntpq -p 127.0.0.1

image004

The result from ntpq -p demonstrates that *.mary.little.lamb is not an NTP server.

Now, let’s try using three valid NTP servers:

In this example, I have used us.pool.ntp.org to point to three valid NTP servers outside my network, and the result (as seen from the vSphere Client) is exactly the same as when we used three bogus servers!

image008

The result from ntpq -p demonstrates that there are three valid NTP servers resolvable by DNS (we used pool.ntp.org), but that the ESXi host has not been able to poll them. This is what you see when the firewall is blocking traffic on port 123!

Additionally, when firewall rules change, preventing access to NTP, the ‘when’ column will show a value (sometimes in days!) much larger than the poll interval!
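If you want to rule out the ESXi host’s own firewall (as opposed to an external firewall), you can check the NTP client ruleset and the daemon from the shell. A quick sketch, assuming the default ruleset name ntpClient:

esxcli network firewall ruleset list | grep ntpClient
/etc/init.d/ntpd status

The ruleset should show as enabled (true) and ntpd should be running; if both look good and ntpq -p still shows no successful polls, the blockage is almost certainly upstream.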

When an ESXi host is correctly configured with valid NTP servers and it is actually getting time from those servers, the result from ntpq -p will look like this:

image010

Here you see the following values:

remote: Hostname or IP of the NTP server this ESXi host is actually using.
refid: Identification of the time source.

  • INIT means the ESXi host has not yet received a response.
  • CDMA means that the time stream is coming from a cellular network.

st: Stratum.
t: Type of time source (u, for unicast, is typical); note that NTP always uses UDP, never TCP.
when: Time (in seconds) since the NTP server last responded. This is the important value: when the ‘when’ value is larger than the ‘poll’ field, NTP is not working!
poll: Poll interval (in seconds).
reach: An 8-bit shift register displayed in octal (base 8), with each bit representing success (1) or failure (0) in contacting the configured NTP server. A value of 377 is ideal, representing success in the last 8 attempts to query the NTP server.
delay: Round trip (in milliseconds) to the NTP server.
offset: Difference (in milliseconds) between the actual time on the ESXi host and the time reported by the NTP server.
jitter: The observed variance in responses from the NTP server. Lower values are better.

The NIST publishes a list of valid NTP IP addresses and hostnames, but I prefer to use pool.ntp.org in all situations where the ESXi Host can be permitted access to an NTP server on UDP port 123. The advantage of pool.ntp.org is that it changes dynamically with the availability and usability of NTP servers. Theoretically, pool.ntp.org is a set-and-forget kind of thing!
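For completeness, NTP can also be configured entirely from the ESXi Shell. The sketch below is one possible approach rather than the only one: it appends pool servers to /etc/ntp.conf, opens the ntpClient firewall ruleset, and restarts the daemon. The server names are simply the public pool addresses; substitute your own internal time sources if you have them.

echo "server 0.us.pool.ntp.org" >> /etc/ntp.conf
echo "server 1.us.pool.ntp.org" >> /etc/ntp.conf
esxcli network firewall ruleset set --ruleset-id ntpClient --enabled true
/etc/init.d/ntpd restart

Afterwards, verify with ntpq -p 127.0.0.1 as shown above.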

ESXi Time Best Practices

  • Do not use a VM (such as a Domain Controller) that could potentially be hosted by this ESXi as a time source.
  • Use only Stratum 1 or Stratum 2 NTP servers.
  • Verify NTP functionality with: ntpq -p 127.0.0.1
  • VMs which are already timeservers (such as Domain Controllers) should use either native time services such as w32time or VMware Tools time synchronization, not both! See: VMware KB 1318

iSCSI with Jumbo Frames and Port Binding

10Gb iSCSI

Immediately after installing an ESXi Server, you may or may not have any storage at all. Most ESXi servers today are diskless, with the ESXi installation living on some sort of flash-based storage. In this case a fresh installation of ESXi will present with no persistent storage whatsoever, as you can see in the example below:

Diskless ESXi Installation

Some servers, however, still have traditional disks, as in the screenshot below, where we can clearly see the HP Serial Attached SCSI Disk, a.k.a. “Local Storage” or “Directly Attached Storage,” but no other storage is listed.

ESXi Installed with disks

In either case, it should be noted that it is a Best Practice, when installing ESXi to disk or media, to use a RAID 1 configuration for the location of the ESXi installation.

iSCSI Storage Network Examples

Our first task will be creating a network for iSCSI storage traffic. This should be a completely separate network from Production or Management networks, but how you create the separation is entirely up to you.

Most VMware admins will prefer a physically separate network, but in the days of 10Gb NICs and a relatively small number of interfaces[1], VLAN separation works just as well.

Example of a valid iSCSI configuration using VLANs to separate networks

 

Example of a valid iSCSI configuration on dedicated NICs

Configuring an iSCSI Network on dedicated NICs

Creating a network for iSCSI Storage

Let’s begin by adding an entirely new vSwitch for our iSCSI Network. Click on: Add Networking in the upper-right corner of the screen

Select: VMkernel as the network type

Choose to create a vSphere Standard switch, using all of the interfaces which are connected to your iSCSI Storage network. In this example, vmnic4 and vmnic5 are connected to the iSCSI Storage network

For the Network Label of your first VMkernel connection, choose a name that can be remembered sequentially. I always create my VMkernel connections for iSCSI in the following order:

  • VMkernel-iSCSI01
  • VMkernel-iSCSI02
  • VMkernel-iSCSI03
  • VMkernel-iSCSI04

If I have 2 physical uplinks (NICs), I will create 2 VMkernel connections for iSCSI. If I have 4 uplinks, I will create 4 VMkernel connections for iSCSI. Following this standard for your iSCSI configuration will conform with VMware requirements for Port Binding[2] and assist you in establishing the order in which you bind the VMkernel connections.

Set a VLAN ID, if that is appropriate for your environment

Now choose an IP and Subnet Mask that will be unique to your ESXi host, on the iSCSI network

iSCSI IP Plan

I like to set up my iSCSI networks with an orderly IP schema. If you were to use strictly sequential IP addresses for the VMkernel connections, you would leave no room for orderly expansion.

In order to allow for the orderly expansion of either the number of ESXi Hosts and/or the number of iSCSI VMkernel connections, I choose to increment my IP addresses by a given amount. For example, to anticipate a maximum of 20 ESXi hosts in a given environment, I would increment all of my VMkernel IP address by 20 like this:

         VMkernel-iSCSI01   VMkernel-iSCSI02   VMkernel-iSCSI03
ESXi #1  10.0.0.101         10.0.0.121         10.0.0.141
ESXi #2  10.0.0.102         10.0.0.122         10.0.0.142
ESXi #3  10.0.0.103         10.0.0.123         10.0.0.143

Click on: Finish

After you click: Finish, you will see the new vSwitch (in this case, vSwitch 1)

On vSwitch1, click: Properties (be careful, there are separate “Properties” dialogs for each vSwitch and for the overall Networking as well!)

Click: Add

Choose: VMkernel

For the network label, choose a name that follows (sequentially) the VMkernel connection you created earlier.

Set a VLAN ID, if that is appropriate for your environment

Set an IP that follows the convention you established earlier. In my case, I am going to increment each VMkernel by 20.

Click: Finish

Repeat the previous steps for any additional iSCSI VMkernel connections you may be creating. You may only bind one iSCSI VMkernel per available uplink (vmnic)

You will now find yourself on the Properties dialog for the vSwitch you created.

Highlight the vSwitch itself, and click: Edit

If you have chosen to use Jumbo Frames, set the MTU to 9000

Jumbo Frames MUST be configured on the vSwitch prior to setting the VMkernel MTU above 1500

Click: OK

Now select the first (lowest numbered) iSCSI VMkernel and click: Edit

If you have chosen to use Jumbo Frames, set the MTU to 9000 and then go to the NIC Teaming Tab

Our goal in this dialog is to remove all aspects of load-balancing and failover from the vSwitch in order to enable Port Binding. Port Binding will allow the vSphere Path Selection Policy (PSP) to more effectively balance iSCSI loads and implement failover in the event it is required.

In order to implement Port Binding, we must leave only one active NIC.

VMkernel network adapter

Choose: Override switch failover order

Since this is the lowest-numbered iSCSI VMkernel, we are going to give it the lowest-numbered vmnic, in this case vmnic4.

Highlight all other vmnics and click the Move Down button until they are all listed under Unused Adapters. Don’t be tempted to leave any NICs in Standby; Port Binding requires exactly one active adapter and no standby adapters, so it will not work!

Click: OK

Now choose the next iSCSI VMkernel and choose: Edit

If you have chosen to use Jumbo Frames, set the MTU to 9000 and then go to the NIC Teaming Tab

Since this is the next iSCSI VMkernel, we are going to give it the next vmnic, in this case vmnic5.

Highlight all other vmnics and click the Move Down button until they are all listed under Unused Adapters. Again, don’t be tempted to leave any NICs in Standby; Port Binding requires exactly one active adapter and no standby adapters.

Click: OK

It is a good idea to check your settings by highlighting the vSwitch and each VMkernel in order, and viewing the settings in the right-side column.

Repeat the previous steps for any additional iSCSI VMkernel Connections you may have created, and then click: Close

This is what the finished Standard vSwitch networking configuration should look like
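If you prefer to verify (or apply) the MTU settings from the ESXi Shell, and to confirm that Jumbo Frames actually pass end-to-end, the commands below are a rough sketch. They assume the iSCSI vSwitch is vSwitch1, the iSCSI VMkernel interfaces are vmk1 and vmk2, and 10.0.0.50 is a portal on your SAN; substitute your own names and addresses:

esxcli network vswitch standard set -v vSwitch1 -m 9000
esxcli network ip interface set -i vmk1 -m 9000
esxcli network ip interface set -i vmk2 -m 9000
vmkping -I vmk1 -d -s 8972 10.0.0.50

The -d flag sets “don’t fragment” and 8972 bytes is the largest ICMP payload that fits in a 9000-byte frame; if the vmkping fails, Jumbo Frames are not working end-to-end (vSwitch, VMkernel, physical switch, and SAN must all agree).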

iSCSI Software Adapter

Choose: Storage Adapters and then click: Add

Choose: Add Software iSCSI Adapter

Click: OK

Right-click on the iSCSI Software Adapter and choose: Properties

Select the tab: Network Configuration

Click: Add

Choose one of the iSCSI VMkernel connections from the list and click: OK

Now click: Add

Choose the other iSCSI VMkernel connection and click: OK

Repeat the previous process until all VMkernel iSCSI Connections are bound. DO NOT add Management Network connections, even if they appear in the list.
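Port Binding can also be performed (or verified) from the ESXi Shell. A rough sketch, assuming the iSCSI Software Adapter came up as vmhba33 and your iSCSI VMkernel interfaces are vmk1 and vmk2:

esxcli iscsi networkportal add --adapter vmhba33 --nic vmk1
esxcli iscsi networkportal add --adapter vmhba33 --nic vmk2
esxcli iscsi networkportal list --adapter vmhba33

The list command should show one entry per bound VMkernel interface.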

Choose the tab: Dynamic Discovery

Click: Add

Enter the discovery IP of your SAN.

Enter just one address, one time.

Click: OK

The address will appear after some seconds.

Click the Static Discovery tab and take note of how many paths (targets) your SAN presents (the more, the better!)

Click: Close

Click: OK

In a few seconds, you should see a listing of all of the devices (LUNs) available on your SAN
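From the shell, you can confirm that iSCSI sessions were established and see the state of each path; a quick check, again assuming the iSCSI Software Adapter is vmhba33:

esxcli iscsi session list --adapter vmhba33
esxcli storage core path list | grep -i "group state"

With multiple bound VMkernel ports, you should see several paths per LUN, typically in the active or active unoptimized group state.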

Creating a VMFS 5 Volume

Click on the option: Storage (middle column, in blue) and then click: Add Storage

Choose: Disk/LUN and then: Next

Choose from the available devices (LUNs) and click: Next.

Click: Next

Name your Datastore and click: Next

Choose: Maximum available space and click: Next

In reality, there is very little reason for choosing anything other than “Maximum available space”

Click: Next

Click: Finish

And your new VMFS 5 volume will be created!
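If you want to confirm the new Datastore from the shell, ‘esxcli storage filesystem list’ shows every mounted filesystem along with its type, size, and free space:

esxcli storage filesystem list

The new Datastore should appear with a Type of VMFS-5.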

  1. vSphere 6 Configuration Maximums
  2. Multipathing Configuration for Software iSCSI Using Port Binding

 

Using VMware Paravirtual devices

VMware Paravirtual

One of the most common oversights in vSphere deployments is a failure to use the Paravirtual drivers that VMware has provided us for networking and storage.

On a physical platform, one chooses supported device(s) for networking and storage, and then installs the correct driver(s) to support those devices. For example, on a physical system, you might specify LSI SAS for storage and Intel E1000 NICs for networking. That particular combination is, in fact, so common that Operating Systems like Windows have the drivers for those devices pre-installed, so they will be recognized both during and after installation. The ‘during’ part is particularly important too, because if the storage driver is not present at the time of install, the hard disk will not be recognized and the installation fails!

On a virtual platform, it’s a completely different story. Even if the host ESXi server actually has LSI SAS storage adapters and Intel E1000E NICs, there is no correlation between the physical hardware and the network and storage devices presented to Virtual Machines. In fact, if you choose LSI or Intel (they are the default choices for Windows Server VM builds), the only potential benefit is that Windows includes those drivers by default. You will, in fact, be emulating the corresponding physical devices from LSI and Intel, with a resulting loss of performance!

The only true native storage and network devices for vSphere VMs are the VMware Paravirtual SCSI (pvscsi) and Network (vmxnet3) device types and their corresponding drivers. The problem is that while most Linux distros include support for Paravirtual devices by default, Microsoft is not so magnanimous. Users choosing either (or both) of the VMware Paravirtual device types will have to install the corresponding drivers.

In most cases, VMware Paravirtual devices are supported for installation in Windows Family 5 (Server 2003, XP) and later, and natively supported by most Linux OS.
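If you are ever unsure which device types a VM is actually configured with, the virtual hardware is spelled out in the VM’s .vmx file. A quick illustration from the ESXi Shell (the datastore and VM names here are hypothetical; use your own path):

grep -i virtualDev /vmfs/volumes/datastore1/MyVM/MyVM.vmx

Entries such as scsi0.virtualDev = "lsisas1068" or ethernet0.virtualDev = "e1000e" indicate emulated devices, while "pvscsi" and "vmxnet3" indicate the Paravirtual types.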

Benefits of using VMware Paravirtual SCSI and Network devices include:

  • Better data integrity[1] as compared to Intel E1000e
  • Reduced CPU Usage within the Guest
  • Increased Throughput
  • Less Overhead
  • Better overall performance

I have created an example Windows Server 2012 R2 VM using only the default E1000E and LSI SAS device types, and I am going to show you how easy it is to convert from the default (emulated physical) devices to the VMware Paravirtual drivers. For the following steps to work, the VMware Tools must be installed in the VM which is being updated.

Upgrading a VM to vmxnet3 Paravirtual Network Adapter

During the following procedures, it is important to use the Virtual Machine Remote Console (as opposed to RDP) because we will be causing a momentary disconnection from the network.

The biggest challenge is that the static IP address, if assigned, is associated with the device and not with the VM. Therefore, when you upgrade to the vmxnet3 adapter, your challenge will be uninstalling and eliminating any trace of the “old” NIC to avoid seeing the dreaded message: “The IP address XXX.XXX.XXX.XXX you have entered for this network adapter is already assigned to another adapter.”[2]

Using the VMRC, log in to your Windows VM and run the device manager with: devmgmt.msc

You will see that the Network adapter is clearly listed as an Intel

Now go to the Network and Sharing Center and click on any (all) of the active Networks to observe their settings

You will notice that the speed is clearly 1.0 Gbps

Click on: Properties

Choose TCP/IPv4 and then click: Properties

Take note of the IP Address, Subnet Mask, Gateway, and DNS

Go to: VM > Edit Settings

image068

Remove the Network Adapter(s) from the VM and click OK. In truth, you could both remove the old adapter and add the new vmxnet3 adapter simultaneously, but we will do it in separate steps for clarity.

Notice, the active networks list is empty

Although we have removed the device from the VM, we have not removed its configuration from the system. Therefore, the IP address we saw earlier is still assigned to the E1000e Virtual NIC we just removed. In order to cleanly install a Paravirtual NIC, we need to remove the Intel NIC completely.

Open a command window and run the following commands (both must be run from the same command window, in this order):

set devmgr_show_nonpresent_devices=1

start devmgmt.msc

After the device Manager window is open, select: View > Show Hidden Devices

Many admins falsely believe that it is enough to simply show hidden devices, but this is not true. It is absolutely necessary to set “devmgr_show_nonpresent_devices” at the command line first!

You should now be able to find the (now removed) Intel NIC listed in lighter text than the devices which remain present.

Right-click and select: Uninstall

image023

OK

And it’s gone!

Go to: VM > Edit Settings

Edit Settings

 

Click: Add

Choose: Ethernet Adapter

Set the Type to: VMXNET 3 and then choose the appropriate Network Connection (usually VM Network), then click: Next

Click: Finish

Now click: OK

You will see the vmxnet3 Ethernet Adapter added to the Device Manager

Now click the active network, in this case “Ethernet”

Notice the speed listed as 10 Gbps. This does not mean that there are 10 Gbps NICs in the ESXi host; merely that the link speed presented to this VM is 10 Gbps.

Click on: Properties

Now choose: TCP/IPv4 and select: Properties

Re-assign all of the IP addresses and subnet mask you observed earlier

And you have upgraded to the VMware Paravirtual device VMXNET 3

Upgrading a VM to pvscsi VMware Paravirtual SCSI Adapter

The trick in switching to the VMware Paravirtual SCSI adapter is adding a dummy disk to the VM, which will force Windows to install the pvscsi driver included with the VMware Tools package you installed earlier as a separate process.

Start the device manager with devmgmt.msc

Observe the LSI Adapter listed under Storage Controllers

Go to: VM > Edit Settings

Edit Settings

Choose: Add

Select: Hard Disk and then: Next

Choose: Create a new virtual disk and then: Next

The disk you create can be almost any size and provisioning type. We chose 10 GB. Click: Next

In this step, it is critical that you place the new disk on a unique SCSI node. That is to say, if the existing disk is on 0:0, then place the new disk on 1:0 (you must not combine it with any existing LSI nodes, such as 0:1, or the process will not work).

Now click: Finish

Notice that you have added not just a disk, but also a new SCSI Controller.

Now click: Change Type

Select: VMware Paravirtual

Now click: OK

Once the disk is added, look again in the Windows Device Manager and make sure that you can see the VMware PVSCSI Controller. If you can, that means the PVSCSI drivers have successfully loaded, and you can proceed.

Now we have to shut down the VM.

Shut Down Guest

Once the VM is off, Go to: VM > Edit Settings

Edit Settings

Choose the dummy disk (whichever one it was; BE CAREFUL HERE!) and click: Remove

Although I failed to do so in creating this demo, you probably want to choose “Remove from virtual machine and delete files from disk,” to avoid leaving orphan files around.

Now select the SCSI controller(s) which are not already Paravirtual and choose: Change Type

Select: VMware Paravirtual

Now click: OK

Power your VM back on and observe that only the VMware Paravirtual device remains!

It should be noted that, just as with the Intel NIC, the LSI device remains as a “nonpresent” device. If you feel like going the extra mile, repeat the steps to show nonpresent devices and uninstall the LSI device!

  1. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2058692
  2. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1179

Install ESXi 6 to a physical server with IPMI

ESXi 6 on a HP Blade Server with iLO

We are going to install ESXi 6 on a physical server, using HP’s IPMI interface, known as iLO, to perform the installation. iLO is considered best-in-class among IPMI consoles, but it can still take some getting used to. IPMI out-of-band interfaces collectively have the advantage of allowing users to:

  • Power servers on and off
  • Connect to ISO and FLP media
  • Input commands and view the console interface, including blue, purple and red screens that would not be visible with an in-band console

image002

First we are going to choose Image (by the picture of the CD/DVD)

image004

Then connect to the HP customized ESXi image that we just downloaded. Always use the vendor customized ESXi image for physical installs, when one is available

image006

Now click on the power icon and choose: Momentary Press

iLO Momentary Press

Wait a good long time to even see this

image010

And another good long time before the CD/DVD starts to load. When installing ESXi 6, the DVD will load to RAM (which is what you see happening below) and then the hypervisor will start.

image012

When the hypervisor has started, the screen will become yellow and grey, like below. The process speeds up from here.

image014

[Enter]

image016

[F11]

image018

Wait, just a few seconds (usually).

image020

[Enter]

image022

Select your choice here. The default is “Upgrade ESXi, preserve…” but we want a fresh install, so we choose “overwrite” [Enter]

image024

Choose your keyboard [Enter]

image026

Set a password [Enter]

image028

This next step may actually take a few minutes

image030

[F11]

image032

Wait, but since the binaries are all loaded by this point, this goes quickly.

image034

Disconnect the ISO from IPMI and then press [Enter]

image036

ESXi 6 on a HP Blade Server with iLO

Initial Configuration of an ESXi Host with the vSphere Client

There are certain basic settings you will want/need to configure before your ESXi host is suitable for use in production, or even in a lab environment. At the very least, you will need to give your ESXi a hostname and IP address.

Now, I could press [F2] here and configure my ESXi host using the Direct Console User Interface (DCUI), but the DCUI provides a limited set of options, and using an IPMI interface such as iLO (even though iLO is one of the best of its kind), is not always a user-friendly procedure. Besides, we covered using the DCUI in: http://www.johnborhek.com/vmware-vsphere/building-a-vsphere-home-or-learning-lab-2/

Instead, I will show you how to make these initial configurations using the VMware vSphere Client for Windows (sometimes called the vSphere Desktop Client or the vSphere C# Client), which is the only viable client for a standalone host.

Open the vSphere Client for Windows and enter the DHCP IP address you saw on the previous screen. You will use the User name: root and the password you assigned during the install.

Just click Ignore here. Installing this certificate would be useless, as we are going to change the IP.

You may have to click Home to see this screen

Now choose the tab: Configuration and choose the option: Networking

Click on Properties of vSwitch0. Be careful as there are two “Properties” links on this screen. You want the one right by the vSwitch

Highlight the: Management Network and choose: Edit

We probably won’t need to change any of the settings here.

Choose the tab: IP Settings

Now select: Use the following IP settings and don’t forget to click: No IPv6 Settings

Now remember, as soon as you apply this, your client session will become invalid because the IP is now different.

Enter the new IP you assigned, along with the username: root and your password

vSphere Client for Windows

This time, check “Install this certificate….” before clicking: Ignore

Click on the tab: Configuration

Choose the option: DNS and routing and then: Properties

This is (probably) not the correct information, as it is supplied by DHCP.

Enter the correct hostname and domain, as well as the search domains (“Look for hosts in the following domains”)

You may see this if you left IPv6 enabled.

You are now finished with initial configuration of your ESXi Host and may proceed to set up storage, networking and everything else.
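Incidentally, all of these initial settings can also be made from the ESXi Shell or an SSH session, which is handy when no Windows machine is available for the vSphere Client. The commands below are a rough sketch; the hostname, addresses, and domain are examples only, and changing the vmk0 address from an SSH session will, of course, drop that session:

esxcli system hostname set --host=esxi01 --domain=lab.local
esxcli network ip interface ipv4 set -i vmk0 -I 192.168.1.50 -N 255.255.255.0 -t static
esxcli network ip dns server add --server=192.168.1.1
esxcli network ip dns search add --domain=lab.local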