Verify new Spectre mitigation patches using PowerCLI and vDocumentation

Intel recently announced that it has released microcode updates for 100 percent of its products launched in the past five years that require protection against Spectre/Meltdown. For virtualization admins using Intel Xeon chips, this means that servers with a Sandy Bridge or newer generation CPU now have a BIOS update available. Microcode updates for Westmere and Nehalem are in beta status as of this writing.

We have published an update of vDocumentation that will assist you in validating your current environment for Spectre. The update (v2.4.1) includes enhanced versions of:

  • Get-ESXSpeculativeExecution
  • Get-VMSpeculativeExecution

UPDATE (3/28/2018) VMware recently updated security advisory VMSA-2018-0004 with the list of new patches, and Intel has also released production MCU for Westmere and Nehalem.

We have released v2.4.2 of vDocumentation to help validate your environment for these new patches. More details at the end of this post.

Get-ESXSpeculativeExecution

The changes are easiest to explain by walking through the results gathered in a few examples. First, though, the script has some dependencies; they are listed here and covered in more detail below.

The script relies on two CSV files:

  • BIOSUpdates.csv – This file is used to validate and provide guidance on the currently installed BIOS version. If you do not have internet access when running the script, or wish to use a customized version of the file, download it and use the -inputBiosFile parameter to specify the offline version path
  • Intel_MCU.csv – This file is used to report on the microcode status provided by Intel. Use the -inputMcuFile parameter to specify the offline version path

Both files are up to date, and I will keep them current as Intel provides new updates. The BIOS update file contains information for Dell and HP servers, which is what I have in the environment I support. If you have different hardware and would like to share the pertinent information with the community, please reach out and I will update the file with your input.

SSH dependency: VMware KB52345 has all the details about the initial MCU patches released by Intel, which caused reboot problems. To validate whether the microcode revision running on an ESXi host is affected, you need to run a shell command; from an automation standpoint, this can currently only be done through SSH. When running the script, use the -UseSSH parameter. It relies on Posh-SSH as the SSH client, which you can install from the PowerShell Gallery (Install-Module -Name Posh-SSH). You will be prompted for credentials through Get-Credential, and the resulting PSCredential object stores and relays the password as a secure string.

Spec0

Running the script

Make sure to run the Get-Help command (example: Get-Help Get-ESXSpeculativeExecution -ShowWindow) to understand all the parameters and switches. This blog post focuses on what the script gathers.

Syntax

Get-ESXSpeculativeExecution [[-esxi] <Object>] [[-cluster] <Object>] [[-datacenter] <Object>] [[-inputBiosFile] <Object>] [[-inputMcuFile] <Object>] [-UseSSH ] [-ReportOnVMs ] [-ExportCSV ] [-ExportExcel ] [-PassThru ] [[-folderPath] <Object>] [<CommonParameters>]
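
As a rough illustration of the syntax above, a run against a single cluster might look like the following. The vCenter name, cluster name, and output folder are placeholders, and the sketch assumes that VMware PowerCLI, vDocumentation, and Posh-SSH are already installed and that -folderPath sets the export destination, as its name suggests.

```powershell
# Placeholders: replace the server, cluster, and folder with your own values.
Connect-VIServer -Server 'vcenter.example.local'

# Gather Spectre status for every host in one cluster, including the
# SSH-only microcode fields, and export the results to CSV.
Get-ESXSpeculativeExecution -cluster 'Production' -UseSSH -ExportCSV -folderPath 'C:\Reports'
```

Check Get-Help for the exact behavior of each switch before relying on this in a scheduled job.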

Looking at an ESXi host that has been patched for Spectre

Spec1

BIOS Guidance

BIOS Guidance relies on the BIOSUpdates.csv file, against which the currently installed BIOS version is compared. These are the possible values:

  • Blank (null) – a CSV file was not used or is unreachable so this could not be validated
  • Proper BIOS Installed – The ESXi host has the correct BIOS installed that contains the NEW microcode update from Intel
  • BIOS update available – install vxx – The ESXi host needs a BIOS update. The required version is provided here; see the later example of a system needing an update
  • Unknown – Check with manufacturer – The hardware model is not present in the CSV file, or there has not been a microcode release for the current system model. See the later example of a Nehalem/Westmere system
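
The decision behind these four values can be sketched in a few lines. This is not the vDocumentation code: the column names (Model, RequiredBios) and the helper name are hypothetical, the real BIOSUpdates.csv layout may differ, and the comparison is a simple equality check.

```powershell
# Hypothetical CSV layout; the real BIOSUpdates.csv columns may differ.
$biosTable = @'
Model,RequiredBios
PowerEdge R730,2.7.1
'@ | ConvertFrom-Csv

# Hypothetical helper that mirrors the four BIOS Guidance values.
function Get-BiosGuidance {
    param(
        [string]   $Model,
        [string]   $InstalledBios,
        [object[]] $BiosTable
    )
    # No CSV data (file not used or unreachable): leave the field blank.
    if (-not $BiosTable) { return $null }

    $row = $BiosTable | Where-Object { $_.Model -eq $Model } | Select-Object -First 1
    if (-not $row) { return 'Unknown - Check with manufacturer' }

    if ($row.RequiredBios -eq $InstalledBios) { return 'Proper BIOS Installed' }
    return "BIOS update available - install v$($row.RequiredBios)"
}

Get-BiosGuidance -Model 'PowerEdge R730' -InstalledBios '2.7.0' -BiosTable $biosTable
```

Because the sketch only tests for equality, a BIOS newer than the one listed would also be flagged; keeping the comparison data in a CSV means that can be corrected without touching the code.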

MCU CPUID, PCID/INVPCID

A patched ESXi host will show the new CPU instructions provided by the microcode update (IBPB, IBRS, and STIBP). This is another confirmation that the hypervisor has been properly updated. KB52085 outlines steps to confirm that a system has been updated.

The PCID/INVPCID CPU instructions, which are present on Haswell and newer Intel product families, can reduce the performance impact of the Meltdown mitigation (CVE-2017-5754). Please note that this is a performance optimization; a CPU that doesn’t provide it is not less secure. Check with your VM Guest OS vendor to determine if it’s supported, and note that VM hardware version 11 or higher is needed.

In the example above, you can see that the Intel product family is Broadwell and PCID/INVPCID is True; for an unsupported system it will show up as False.

ESXi applied MCU, MCU BIOS rev, MCU boot rev

KB52345 covers the details of the initial microcode patches (ESXi650-201801402-BG, ESXi600-201801402-BG, and ESXi550-201801401-BG) that had to be pulled due to the Intel reboot issue. To check the current microcode revision and whether ESXi booted with the microcode provided by the patch (or by future patches, if VMware bundles the new versions), use the -UseSSH parameter.

ESXi applied MCU

This has 2 possible values:

  • Active – The microcode revision provided by the VMware patch is being used.
  • Inactive – The microcode revision provided by the system BIOS is being used. Either there is no VMware patch providing a microcode update, or the BIOS-provided microcode is the same or newer. In the example above, the value is Inactive, and comparing MCU BIOS rev with MCU boot rev shows that both versions are the same.

MCU BIOS revision

Shows the microcode revision provided by the system BIOS. If the BIOS-provided microcode revision is the same as or newer than the one provided by a VMware patch, the BIOS revision will be used.

MCU boot rev

MCU boot rev is the currently active microcode revision. If ESXi applied MCU = Active, then MCU boot rev will show a newer revision than MCU BIOS rev. This happens on a system whose BIOS contains an older microcode than the one provided by the VMware patch. ESXi will never override a newer microcode revision provided by the BIOS, nor will a BIOS with an older revision prevent ESXi from applying the newer one. I will show you an example of this later.
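
The relationship between the three fields can be expressed as a small comparison. The helper below is a hypothetical sketch, not the script's own code; it assumes both revisions are available as hexadecimal strings such as the ones shown in the examples.

```powershell
# Hypothetical helper: derives the "ESXi applied MCU" value from the
# BIOS-provided and currently booted microcode revisions.
function Get-EsxiAppliedMcuState {
    param(
        [Parameter(Mandatory)] [string] $McuBiosRev,
        [Parameter(Mandatory)] [string] $McuBootRev
    )
    # Convert.ToUInt32 with base 16 accepts an optional "0x" prefix.
    $bios = [Convert]::ToUInt32($McuBiosRev, 16)
    $boot = [Convert]::ToUInt32($McuBootRev, 16)

    # A booted revision newer than the BIOS one means the patch supplied it.
    if ($boot -gt $bios) { 'Active' } else { 'Inactive' }
}

Get-EsxiAppliedMcuState -McuBiosRev '0x0200002c' -McuBootRev '0x0200003a'   # Active
Get-EsxiAppliedMcuState -McuBiosRev '0x0000003b' -McuBootRev '0x0000003b'   # Inactive
```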

Intel product, Intel MCU Status, Intel MCU(s) at risk, Intel Production MCU

These fields come from the Intel_MCU.csv file, which provides the microcode production status, any earlier affected microcode revision(s), and the new production version. You can compare this against the results obtained previously (MCU BIOS rev/MCU boot rev) to see if you were affected. The CSV file is a dump of the server CPUs listed in Intel’s MCU guidance.

MCU boot rev at risk, VMware MCU workaround applied

MCU boot rev. at risk

This has 2 possible values:

  • True – the current active microcode revision (MCU boot rev.) is equal to Intel MCU(s) at risk. This means that the hypervisor is currently running a microcode version that has been identified by Intel to cause reboot issues
  • False – the current active microcode revision (MCU boot rev.) has not been identified by Intel to cause reboot issues.
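
A minimal sketch of that check follows. It assumes the Intel MCU(s) at risk field is a comma-separated string of hexadecimal revisions, which is an assumption about the CSV rather than something the post confirms; the helper name is hypothetical.

```powershell
# Hypothetical helper: is the booted microcode revision in the at-risk list?
function Test-McuBootRevAtRisk {
    param(
        [Parameter(Mandatory)] [string] $McuBootRev,
        [string] $AtRiskRevs = ''
    )
    if ([string]::IsNullOrWhiteSpace($AtRiskRevs)) { return $false }

    $boot = [Convert]::ToUInt32($McuBootRev, 16)
    foreach ($entry in ($AtRiskRevs -split ',')) {
        $candidate = $entry.Trim()
        # Compare numerically so '0x3b' and '0x0000003b' are treated alike.
        if ($candidate -and ([Convert]::ToUInt32($candidate, 16) -eq $boot)) {
            return $true
        }
    }
    return $false
}

# The list passed here is illustrative only.
Test-McuBootRevAtRisk -McuBootRev '0x0000003b' -AtRiskRevs '0x3a, 0x3b'   # True
```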

VMware MCU workaround applied

KB52345, mentioned earlier, outlines the workaround provided by VMware to disable/hide the CPU instructions from the VMs so they are not affected by the reboot issue. This consists of adding the following line to the /etc/vmware/config file on the ESXi host: cpuid.7.edx = "----:00--:----:----:----:----:----:----"

VMware MCU workaround applied has 2 possible values:

  • True – /etc/vmware/config file contains line: cpuid.7.edx = "----:00--:----:----:----:----:----:----"
  • False – /etc/vmware/config file does NOT contain line: cpuid.7.edx = "----:00--:----:----:----:----:----:----"
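
To make the check concrete, here is a small sketch that looks for the workaround line in the contents of the config file. How the file contents are retrieved (for example over SSH) is left out, the helper name is hypothetical, and the sample lines are illustrative only.

```powershell
# Hypothetical helper: returns True when the KB52345 workaround line is present.
function Test-McuWorkaroundApplied {
    param([string[]] $ConfigLines = @())

    # Tolerate surrounding whitespace and spacing around the equals sign.
    $pattern = '^\s*cpuid\.7\.edx\s*=\s*"----:00--:----:----:----:----:----:----"\s*$'
    @($ConfigLines -match $pattern).Count -gt 0
}

# Illustrative file contents, not taken from a real host.
$sample = @(
    'libdir = "/usr/lib/vmware"',
    'cpuid.7.edx = "----:00--:----:----:----:----:----:----"'
)
Test-McuWorkaroundApplied -ConfigLines $sample   # True
```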

Looking at an ESXi host that needs to be patched

Spec2

Note the following on this example:

  • Dell PowerEdge R730 running ESXi 5.5.0 (build 7504623)
  • The ESXi host is using a BIOS microcode that has been identified by Intel to cause reboot issues. You can easily identify it here, as MCU boot rev. at risk = True.
  • The VMware workaround is in place, as VMware MCU workaround applied = True
  • ESXi applied MCU = Inactive; Dell BIOS v2.7.0 is providing the currently active microcode revision (0x0000003b)
  • There is a new production microcode revision available for this Intel product family (Intel product, Intel MCU status, and Intel production MCU fields)
  • BIOS guidance shows that I need to install BIOS v2.7.1 to have this system patched for Spectre
  • Intel product shows that this server has a Haswell product family CPU, so it should provide the PCID/INVPCID performance optimization, right? Why is PCID/INVPCID = False? Remember that a VM requires hardware version 11 or higher to support this, and the hypervisor version running here is ESXi 5.5, which only supports:
    • Up to Hardware Version 10
    • A max EVC Mode of Ivy Bridge (Max EVC mode field)

This then is yet another reason why you should consider upgrading ESXi 5.5 sooner rather than later to get the performance benefit against Spectre. I will show you another example next of a system with the same Intel CPU Family but running ESXi 6.0.

Spec3

Looking at an ESXi host, where Intel has not released a Microcode update

Spec4

Note the following on this example:

  • An Intel Westmere product Family CPU
  • Intel MCU production status is Beta
  • A microcode update was never provided for this CPU family, so there are no CPUID instructions present for Spectre mitigation (MCU CPUID is blank/null). There is no microcode rev at risk, and no VMware workaround applied
  • BIOS guidance = Unknown – Check with manufacturer. Using the CSV file gives you the flexibility of updating the status of this field as Intel makes progress and a BIOS update becomes available without touching the code.

Looking at an ESXi host, with an AMD processor

Spec5

Note the following on this example:

  • This script is more focused on Intel powered systems; but it does provide some information when executed against an AMD system
  • I found no plans or date for Dell to provide a BIOS update for this PowerEdge R715 server, but if one is provided I will be able to leverage the BIOSUpdates.csv file to validate this system again.
  • No CPU instructions present as part of Microcode update relevant to Spectre mitigation, even though the VMware patch provided microcode update is active.

ReportOnVMs/Get-VMSpeculativeExecution

-ReportOnVMs and Get-VMSpeculativeExecution produce the same output. -ReportOnVMs is just an additional switch to use with Get-ESXSpeculativeExecution to validate the VMs hosted on the hypervisor as well.

Get-VMSpeculativeExecution

Make sure to run the Get-Help command (example: Get-Help Get-VMSpeculativeExecution -ShowWindow) to understand all the parameters and switches. This blog post focuses on what the script gathers.

Syntax

Get-VMSpeculativeExecution [-VM] <VirtualMachine[]> [<CommonParameters>]

Validating a VM

This script shows/validates that the new CPU instructions introduced by the Spectre microcode update are present and available to the VM hardware. It is not an end-to-end validation that includes the Guest OS. Make sure you’re familiar with KB52085 and follow your VM Guest OS vendor’s steps for enabling and validating Spectre protections. Microsoft has provided a PowerShell script to validate/enable the protections on their Guest OS.

Last PoweredOn

This is based on VMware.Vim.VmPoweredOnEvent events, which are limited by the vCenter database retention settings; on VCSA 6.x the default is 30 days. Don’t expect all your VMs to show a timestamp for Last PoweredOn. The intention of including it is to let you compare this timestamp against the hypervisor uptime to determine whether or not a VM has been power cycled (cold boot), which is required for the new CPUID instructions/features to get picked up by the Guest OS.
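
The comparison described here can be sketched as follows. The helper name and parameters are hypothetical, and the dates in the usage line are made up for illustration; a missing timestamp is reported as unknown rather than as "not power cycled", since the event may simply have aged out of the database.

```powershell
# Hypothetical helper: was the VM powered on after the host last booted?
function Test-VMPowerCycledSinceHostBoot {
    param(
        [Nullable[datetime]] $LastPoweredOn,
        [Parameter(Mandatory)] [datetime] $HostBootTime
    )
    # No power-on event within the retention window: cannot tell either way.
    if ($null -eq $LastPoweredOn) { return $null }

    return ($LastPoweredOn -gt $HostBootTime)
}

# Illustrative dates only.
Test-VMPowerCycledSinceHostBoot -LastPoweredOn ([datetime]'2018-03-12') -HostBootTime ([datetime]'2018-03-10')
```

A True result only says the power-on happened after the host booted; a VM that arrived by vMotion needs to be judged against the host it was running on at the time.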

Spec6

ESXi MCU CPUID, ESXi PCID/INVPCID

Shows that an ESXi host has the new Spectre mitigations CPUID instructions present and if the ESXi is capable of providing PCID optimization to the VM Guest OS.

VM MCU CPUID, VM PCID/INVPCID

Shows whether the Spectre mitigation CPUID(s) have been picked up by the VM hardware, and whether it is PCID optimized. We will see a few examples of these later, but for now please note that having these present does not necessarily mean that the VM has been power cycled, because vMotioning a VM to another host is another way for them to show up. For example, if you’re patching a cluster and evacuating the next host to patch, VMs placed on the recently patched host will show these new instructions.

Hypervisor-Assisted Guest mitigation, PCID optimization

Both of these show an ESXi/VM relation of the setting. Only powered on VMs are validated.

Hypervisor-Assisted Guest mitigation

Has the following possible values:

  • Supported/Enabled – Both the ESXi host and the VM show the Spectre mitigation CPUID(s) (STIBP, IBRS, IBPB)
  • Supported/Disabled – ESXi host has Spectre mitigation CPUID(s) present but the VM Hardware has not yet picked them up.
    • It will show as enabled after power-cycle (cold boot)
    • It will show as enabled if the VM is vMotioned to another “Supported” host
  • NotSupported/Disabled – ESXi host does not support Spectre mitigation, so the VM has no support either. This is most likely because the ESXi host has not yet been updated to be Spectre compliant.
  • Supported/Upgrade VM Hardware – ESXi host has Spectre mitigation CPUID(s) present but the VM needs to have its hardware upgraded before it can recognize the new instructions.
    • VM hardware version 9 is the minimum requirement for Hypervisor-Assisted Guest Mitigation for branch target injection (CVE-2017-5715).
  • UnknownSinceOff – The script only validates powered-on VMs. All powered-off VMs will show up as UnknownSinceOff
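
The values above amount to a short decision sequence. The function below is a hypothetical sketch of that sequence, not the script's implementation; the order in which the conditions are checked is an assumption.

```powershell
# Hypothetical helper mirroring the Hypervisor-Assisted Guest mitigation values.
function Get-GuestMitigationStatus {
    param(
        [bool] $PoweredOn,
        [bool] $HostHasCpuid,     # host exposes IBRS/IBPB/STIBP
        [bool] $VmHasCpuid,       # VM hardware has picked them up
        [int]  $HardwareVersion
    )
    if (-not $PoweredOn)        { return 'UnknownSinceOff' }
    if (-not $HostHasCpuid)     { return 'NotSupported/Disabled' }
    if ($VmHasCpuid)            { return 'Supported/Enabled' }
    # Hardware version 9 is the minimum for CVE-2017-5715 guest mitigation.
    if ($HardwareVersion -lt 9) { return 'Supported/Upgrade VM Hardware' }
    return 'Supported/Disabled'
}

Get-GuestMitigationStatus -PoweredOn $true -HostHasCpuid $true -VmHasCpuid $false -HardwareVersion 8
```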

PCID optimization

Has the following values:

  • Supported/Enabled – The CPU on the ESXi host is Haswell or newer supporting the PCID, INVPCID instructions. The VM hardware shows these instructions as well, so the Guest OS may be PCID optimized. Check with your VM Guest OS vendor for PCID Optimization support (For Microsoft, that would be Windows 2012 and newer)
  • Supported/Disabled – The CPU on the ESXi host is Haswell or newer, supporting the PCID, INVPCID instructions. The VM does not show support for these instructions, possibly because:
    • The VM is hosted on a cluster running in an EVC mode set to a level lower than Haswell, so the instructions have been hidden from the virtual hardware; you will see an example of this later.
    • VM advanced settings are in place
  • Supported/Upgrade VM Hardware – The CPU on the ESXi host is Haswell or newer, supporting the PCID, INVPCID instructions. The VM does not show support for these instructions because it does not meet the minimum hardware version.
    • VM hardware version 11 or newer is needed to support PCID/INVPCID optimization
  • UnknownSinceOff – The script only validates powered-on VMs. All powered-off VMs will show up as UnknownSinceOff
  • NotSupported/NA – The CPU on the ESXi host is older than Haswell which does not support the INVPCID instruction. VM will show up as NA, you will see an example of this later.
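
The PCID values follow the same pattern, with hardware version 11 as the threshold. Again, this is a hypothetical sketch with an assumed evaluation order, not the script's own code.

```powershell
# Hypothetical helper mirroring the PCID optimization values.
function Get-PcidOptimizationStatus {
    param(
        [bool] $PoweredOn,
        [bool] $HostHasPcid,      # Haswell or newer host CPU
        [bool] $VmHasPcid,        # VM hardware exposes PCID/INVPCID
        [int]  $HardwareVersion
    )
    if (-not $PoweredOn)         { return 'UnknownSinceOff' }
    if (-not $HostHasPcid)       { return 'NotSupported/NA' }
    if ($VmHasPcid)              { return 'Supported/Enabled' }
    # Hardware version 11 or newer is needed for PCID/INVPCID.
    if ($HardwareVersion -lt 11) { return 'Supported/Upgrade VM Hardware' }
    # Host and hardware version qualify, so EVC mode or VM advanced
    # settings are the likely reason the instructions are hidden.
    return 'Supported/Disabled'
}

Get-PcidOptimizationStatus -PoweredOn $true -HostHasPcid $true -VmHasPcid $false -HardwareVersion 13
```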

Looking at a Hypervisor-Assisted Guest mitigation enabled VM

Spec7a, Spec7b

Note the following on this example:

  • VM hardware v13 – ESXi host is v6.5.0 (comparing with the results of Get-ESXSpeculativeExecution)
  • PCID Optimized
  • Spectre mitigation CPUID present on both host and VM, and there is a recent VM Last PoweredOn event, so most likely the Guest OS is now picking up the needed CPUID(s), provided all steps given by the Guest OS vendor were followed.

Looking at a Hypervisor-Assisted Guest mitigation disabled VM

Spec8

Note the following on this example:

  • Spectre mitigation CPUID present on the ESXi host but not on the VM.
  • Hypervisor-Assisted Guest mitigation = Supported/Upgrade VM hardware. Current VM hardware (v8) does not meet minimum requirement for Spectre mitigation.
  • Same is true for PCID optimization. To get VM to be “compliant” I will need to upgrade VM hardware to v11 or newer.

Looking at a Hypervisor-Assisted Guest mitigation disabled and PCID optimization disabled VM

Spec9a

Note the following on this example:

  • The ESXi host is running the first release of the problematic Intel microcode that is known to cause reboot issues and I have not applied the recommended VMware workaround for it!
  • A good example where results gathered through SSH are important for this analysis. We can deduce that Dell BIOS version 1.2.11 provides microcode version 0x0200002c and that the VMware installed Patch is providing the current active microcode update which is version 0x0200003a. ESXi applied MCU = Active
  • The Intel MCU(s) at risk field shows that the currently active microcode version is known to cause reboot issues.
  • There is also a new production MCU available through Dell BIOS update v1.3.7 that I can apply

So why have I not done anything about fixing this potential problem? Let’s look at a VM to better understand!

Spec9b

The answer is my cluster’s EVC mode. Go back to KB52085; the section on vMotion and EVC information states the following: “In order to maintain this compatibility the new features are hidden from guests within the cluster until all hosts in the cluster are properly updated. At that time, the cluster will automatically upgrade its capabilities to expose the new features. Unpatched ESXi hosts will no longer be admitted into the EVC cluster.”

  • I have a hardware v13 VM running on that host; even though it shows as Spectre mitigation capable, the VM hardware is not picking up the new features due to the EVC mode.
  • I don’t need to worry about updating this host until a microcode version is made available for the Nehalem product family, since all hosts in my cluster need to be patched before the new features are exposed.
  • The CPU on the ESXi host is Skylake, which supports the PCID, INVPCID instructions; however, the VM is not picking them up. The EVC mode in place is hiding those instructions from the virtual hardware.

Update 3/28/2018

The following Patches have been made available from VMware:

ESXi   Guest OS Framework Patch   MCU v2 Patch
6.5    ESXi650-201803401-BG       ESXi650-201803402-BG
6.0    ESXi600-201803401-BG       ESXi600-201803402-BG
5.5    ESXi550-201803401-BG       ESXi550-201803402-BG

Important knowledge-base article updates:

KB52085 – Note that the Guest OS framework patch is what allows the VM Guest OS to use the new CPU instructions (IBPB, IBRS, STIBP). The v2 MCU patch covers only the Intel CPU generations listed in this KB, which do not include the Westmere and Nehalem product families. You will see examples below of patching a Westmere system.

KB52345 – If you currently have the v1 MCU patches that caused reboot issues (ESXi650-201801402-BG, ESXi600-201801402-BG, and/or ESXi550-201801401-BG), which I covered previously, and have the workaround applied, note that the Guest OS framework patch will remove the workaround line from /etc/vmware/config when applied.

v2.4.2 of vDocumentation includes the following enhancement to the Get-ESXSpeculativeExecution script.

ESXi Guidance

This will validate if you have installed Guest OS Framework Patch and MCU v2 Patch. These are the possible values:

  • Framework Missing – The Framework patch has not been installed (ESXi650-201803401-BG, ESXi600-201803401-BG, or ESXi550-201803401-BG)
  • Framework Installed – The Framework patch has been installed
  • v2 MCU Missing – The v2 MCU patch has not been installed (ESXi650-201803402-BG, ESXi600-201803402-BG, or ESXi550-201803402-BG)
  • v2 MCU Installed – The v2 MCU patch has been installed
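
As a sketch of how such a check could work, the function below looks for the bulletin IDs named above in a list of installed patches. How that list is obtained from the host is outside the sketch, the helper name is hypothetical, and later cumulative patches that supersede these bulletins are not accounted for.

```powershell
# Hypothetical helper: reports Framework and v2 MCU patch status
# from a list of installed bulletin IDs.
function Get-EsxiPatchGuidance {
    param([string[]] $InstalledBulletins = @())

    # 201803401 = Guest OS Framework patch, 201803402 = v2 MCU patch.
    $hasFramework = @($InstalledBulletins -match '^ESXi(650|600|550)-201803401-BG$').Count -gt 0
    $hasMcu       = @($InstalledBulletins -match '^ESXi(650|600|550)-201803402-BG$').Count -gt 0

    $frameworkText = 'Framework Missing'
    if ($hasFramework) { $frameworkText = 'Framework Installed' }
    $mcuText = 'v2 MCU Missing'
    if ($hasMcu) { $mcuText = 'v2 MCU Installed' }

    [pscustomobject]@{
        Framework = $frameworkText
        McuV2     = $mcuText
    }
}

Get-EsxiPatchGuidance -InstalledBulletins @('ESXi650-201803401-BG')
```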

Let’s look at a few examples to demonstrate this.

Installing the new VMware patches on an ESXi host

Spec10

Note the following on this example:

  • I applied the VMware workaround for previous v1 MCU patches to demonstrate what happens when installing the Framework patch. VMware MCU workaround applied = True
  • Looking at ESXi Build ID, you can see that it has not been patched (7526125)
  • BIOS guidance shows that I need to install BIOS v2.6.1 to have this system patched for Spectre
  • ESXi guidance shows both new patches are missing

Next I’ll install the Framework patch only:

Spec11

  • ESXi Build ID is now updated, showing that I have patched this system
  • ESXi guidance shows that I have only applied the framework patch.
  • MCU CPUID still blank since I have not applied the BIOS update nor the ESXi v2 MCU patch
  • ESXi applied MCU = Inactive
  • VMware MCU workaround applied = False; installing the Framework patch removed the workaround line (cpuid.7.edx = "----:00--:----:----:----:----:----:----") from /etc/vmware/config

Next I’ll install the v2 MCU patch:

Spec12

  • BIOS guidance still shows that I need a BIOS update, and my current BIOS version is still v2.5.4
  • ESXi guidance shows that I have applied both patches
  • MCU CPUID shows the spectre mitigation instructions are present
  • ESXi applied MCU = Active. BIOS v2.5.4 is providing microcode revision 0x710; however, the v2 ESXi patch is providing a newer revision: 0x713. The ESXi host is therefore using the microcode provided by the patch

Next I’ll install the BIOS update:

Spec13

  • BIOS guidance shows that proper BIOS is installed. Current BIOS version is v2.6.1
  • ESXi applied MCU = Inactive; both microcode revisions are the same, so the BIOS-provided revision is used.

Looking at an ESXi host with Westmere/Nehalem Intel CPU

Spec14

Note the following on this example:

  • ESXi guidance shows that both patches have been installed. The Build (7967664) is current as well
  • This is a Westmere Intel CPU, so the patches do not contain the v2 microcode for this product family
  • BIOS guidance shows that there is a BIOS update available: vP65 (2/22/2018)
  • MCU CPUID is blank, no spectre mitigation CPU instructions present

Next, I’ll show you a Westmere system that has the BIOS update present:

Spec15

  • BIOS guidance shows that proper BIOS is installed. Current BIOS version is P65 (2/22/2018)
  • MCU CPUID shows the spectre mitigation instructions are present

Happy patching!!

Edgar Sanchez