Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1567072.1
Update Date: 2016-09-08
Keywords:

Solution Type  Problem Resolution Sure

Solution  1567072.1 :   Workaround Options for Guest Domains Running Solaris 11.1.7.5.0 (or greater) on T5 and M5 Systems and Live Migration  


Related Items
  • SPARC M5-32
  • Netra SPARC T5-1B Server Module
  • SPARC T5-8
  • Solaris Operating System
  • SPARC T5-2
  • SPARC T5-4
  • SPARC T5-1B
Related Categories
  • PLA-Support>Sun Systems>SAND>Operating System>SN-SND: Sun OS Virtualization LDOM




In this Document
Symptoms
Changes
Cause
Solution
 Modification History:
References


Applies to:

Netra SPARC T5-1B Server Module - Version All Versions and later
SPARC T5-8 - Version All Versions and later
SPARC T5-1B - Version All Versions and later
Solaris SPARC Operating System - Version 11.1 and later
SPARC M5-32 - Version All Versions and later
Sun SPARC Sun OS
__________________________________________



Sun Alert
__________________________________________

Symptoms

When guest domains running Solaris 11.1.7.5.0 (or greater) on T5 and M5 systems are live migrated, those guest domains may experience data corruption or hangs, or the guest domain's operating system may panic.

Note: Please see Sun Alert <Document:1567076.1>, which is also tracking this issue. Attached to this document is the script used in the Workaround 'A' section.

Changes

Guest domains running Solaris 11.1.7.5.0 (or greater) on T5 and M5 systems are being live migrated.

Cause

This is due to an issue in the Hypervisor firmware which makes older system firmware incompatible with the newer Solaris releases 11.1.7.5.0 and greater.

This issue can occur on the following systems with Solaris 11.1.7.5.0 or greater:

  • Guest domains on Sun SPARC T5 Servers with System Firmware versions 9.0.0.d, 9.0.0.h and 9.0.0.i
  • Guest domains on Sun SPARC M5-32 Server with System Firmware versions 9.0.1.f and 9.0.1.g

Notes:

1. This issue only impacts SPARC T5 and M5 platforms running Oracle VM Server for SPARC (previously called Sun Logical Domains, or LDoms) versions 3.0.0.3 or earlier.

2. Primary domains are not impacted by this issue.

3. Sun System Firmware versions 9.0.0.d, 9.0.0.h and 9.0.0.i for T5 have a hypervisor version of 1.12.0.d, 1.12.0.f and 1.12.0.g respectively. Versions 9.0.1.f and 9.0.1.g for M5 have a hypervisor version of 1.12.1.c and 1.12.1.d respectively.  All of these versions are vulnerable to this issue.
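
The version pairs in note 3 can be kept as a small lookup. The following is a minimal sketch only: the function name affected_hypervisor_version is hypothetical, and how the System Firmware version string is obtained from the platform is not shown here.

```shell
#!/bin/sh
# Sketch: map an affected System Firmware version to its hypervisor version.
# The version pairs are taken from note 3; the function name is hypothetical.
affected_hypervisor_version() {
    case "$1" in
        9.0.0.d) echo "1.12.0.d" ;;   # SPARC T5
        9.0.0.h) echo "1.12.0.f" ;;   # SPARC T5
        9.0.0.i) echo "1.12.0.g" ;;   # SPARC T5
        9.0.1.f) echo "1.12.1.c" ;;   # SPARC M5-32
        9.0.1.g) echo "1.12.1.d" ;;   # SPARC M5-32
        *) return 1 ;;                # not one of the versions named above
    esac
}
```

A non-zero return only means the version is not one of those named in this document.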

Solution

There are 2 separate workarounds available to avoid this issue until a final resolution is available.  The first workaround is preferred since it only temporarily impacts a guest domain and does not require any additional steps to remove the workaround after migration. The second workaround may also be safely used but may impact performance until the workaround is explicitly removed by the user.

Steps for applying Workaround A: (for use with the attached script)

The disable_mmu_group_demap script (also referenced in Sun Alert <Document:1567076.1>) is attached below.  Download and save this script for execution from the command line.  As per Sun Alert <Document:1567076.1>, you must be the 'root' user to execute this script.

The first workaround needs to be performed every time a guest domain is to be live migrated.  It requires that DRM policies for the guest domain be temporarily disabled, and then the attached disable_mmu_group_demap script be executed on the guest domain prior to initiating a live migration operation.  This script will temporarily disable a performance feature of the hardware, which in turn allows a successful live migration of the guest domain.  Once live migration has completed, DRM policies will need to be re-enabled.

      1. If DRM policies are present, they must all be disabled on the primary domain using the 'ldm set-policy' command.  For example:

        # ldm set-policy enable=no name=<POLICY-NAME> <GUEST-NAME>

where <POLICY-NAME> is replaced by an actual policy name, and <GUEST-NAME> is replaced by the actual guest domain name.  Although only one example is shown, this command must be run for all policies for the guest domain that is to be migrated.  A list of all policies may be obtained by using the following command:

        # ldm ls -o resmgmt <GUEST-NAME>

      2. On the guest domain, as the 'root' user, run the attached disable_mmu_group_demap script and note the output:

        # disable_mmu_group_demap
        MMU group demap disabled successfully.
        It is safe to do Live Migration.

      3. Migrate the guest domain.

      4. Once the guest is migrated, re-enable all of the policies that were disabled in step 1.  For example:

        # ldm set-policy enable=yes name=<POLICY-NAME> <GUEST-NAME>

Note: Once the guest domain has been successfully migrated and the policies re-enabled, no other steps are required in order to remove this workaround or to re-enable the performance feature of the hardware.
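
Because steps 1 and 4 repeat the same 'ldm set-policy' command for every policy, the loop can be sketched as a small helper run on the primary domain. This is a sketch under stated assumptions: set_drm_policies and LDM_CMD are hypothetical names, and the policy names are passed in by hand (taken from the 'ldm ls -o resmgmt' output) rather than parsed automatically.

```shell
#!/bin/sh
# Sketch: enable or disable every named DRM policy of one guest domain.
# set_drm_policies and LDM_CMD are hypothetical names used for illustration.
# usage: set_drm_policies yes|no <GUEST-NAME> <POLICY-NAME>...
LDM_CMD=${LDM_CMD:-ldm}    # set LDM_CMD="echo ldm" to preview the commands

set_drm_policies() {
    if [ "$#" -lt 2 ]; then
        echo "usage: set_drm_policies yes|no <GUEST-NAME> <POLICY-NAME>..." >&2
        return 2
    fi
    state=$1
    guest=$2
    shift 2
    case "$state" in
        yes|no) ;;
        *) echo "state must be yes or no" >&2; return 2 ;;
    esac
    for policy in "$@"; do
        # Same command as in steps 1 and 4, issued once per policy.
        $LDM_CMD set-policy enable="$state" name="$policy" "$guest" || return 1
    done
}
```

Call it with 'no' before running disable_mmu_group_demap and migrating, and with 'yes' and the same policy names once the migration has completed.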

Steps for applying Workaround B:

The second workaround requires that the guest domain's /etc/system file be edited and the guest domain be rebooted prior to live migrating the guest domain.  Unlike the first workaround, this workaround keeps a performance feature of the hardware disabled until the line is removed from the guest domain's /etc/system file and the guest domain is rebooted again.

      1. On the guest domain, append the following line to the /etc/system file:

        set sfmmu_demap_xcall_optimization=2

      2. Reboot the guest domain.

      3. Migrate the guest domain.

Note: this workaround remains in effect until it is removed.  If performance is acceptable, and the guest domain may need to be migrated in the future, it is suggested that this workaround be left in place pending the final resolution.
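
Step 1 of Workaround B can be sketched so that the line is appended only once. This is an illustration rather than a supported tool: apply_workaround_b is a hypothetical helper, the backup file name is an assumption, and POSIX grep options are assumed (on Solaris, /usr/xpg4/bin/grep).

```shell
#!/bin/sh
# Sketch: add the Workaround B tunable to an /etc/system-style file once.
# apply_workaround_b is a hypothetical helper; the file path is a parameter
# so the logic can be tried on a copy before touching the real file.
# Assumes the file already ends with a newline.
WORKAROUND_LINE='set sfmmu_demap_xcall_optimization=2'

apply_workaround_b() {
    file=${1:-/etc/system}
    # Nothing to do if the exact line is already present.
    if grep -Fx "$WORKAROUND_LINE" "$file" >/dev/null 2>&1; then
        return 0
    fi
    cp -p "$file" "$file.pre-workaround" || return 1   # keep a backup copy
    printf '%s\n' "$WORKAROUND_LINE" >> "$file"
}
```

The guest domain must still be rebooted afterwards, as in step 2, before it is migrated.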

Steps for removing Workaround B:

      1. On the guest domain, remove the following line from the /etc/system file:

        set sfmmu_demap_xcall_optimization=2

      2. Reboot the guest domain.
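
The removal in step 1 can be sketched in the same way. Again, remove_workaround_b is a hypothetical helper name and POSIX grep options are assumed (on Solaris, /usr/xpg4/bin/grep); only a line that matches the tunable exactly is dropped.

```shell
#!/bin/sh
# Sketch: remove the Workaround B tunable from an /etc/system-style file.
# remove_workaround_b is a hypothetical helper name.
WORKAROUND_LINE='set sfmmu_demap_xcall_optimization=2'

remove_workaround_b() {
    file=${1:-/etc/system}
    tmp="$file.tmp.$$"
    status=0
    # grep -v exits 1 when no lines remain; only a status above 1 is an error.
    grep -Fxv "$WORKAROUND_LINE" "$file" > "$tmp" || status=$?
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return 1
    fi
    # Overwrite in place so the original file's owner and mode are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

As with the manual steps, the change takes effect only after the guest domain is rebooted.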

This issue is addressed in the following firmware patches:

  • 17019067 for NETRA SPARC T5-1B SUN SYSTEM FIRMWARE 9.0.2.G
  • 17019069 for SPARC T5-4+T5-8 SUN SYSTEM FIRMWARE 9.0.2.G
  • 17019075 for SPARC T5-1B SUN SYSTEM FIRMWARE 9.0.2.G
  • 17019079 for SPARC T5-2 SUN SYSTEM FIRMWARE 9.0.2.G
  • 17019082 for SPARC M5-32 SUN SYSTEM FIRMWARE 9.0.2.E

Modification History:

05-Jul-2013: Document released
21-Aug-2013: Firmware patches available, issue is Resolved

This regression was triggered by the putback for Solaris bug 15765451 first released
in Solaris 11.1.7.5.0. Prior to this, the Hypervisor code that causes the issue was
not exercised.

A mitigation resolution is pending completion for Oracle VM Server for SPARC version 3.0.0.4,
and will revert to using warm migration if Sun System Firmware 9.0.2 (or later) is not present.  
This fix is expected to be released near the end of July.

It is a common practice to use live migration to evacuate platforms prior to performing system
maintenance, including firmware upgrades.  Therefore, either the workarounds listed above, or
the pending mitigation resolution in Oracle VM Server for SPARC 3.0.0.4 should be used, prior
to live migrating guest domains when upgrading the firmware.

Questions regarding this document should be emailed to the Contributors and Engineers below:

Internal Contributor/Submitter: madhavan.venkataraman@oracle.com, Justin.Frank@oracle.com
Internal Eng Responsible Engineer: madhavan.venkataraman@oracle.com, Justin.Frank@oracle.com
Internal Eng Business Unit Group: Systems RPE

References


Sun Alert






Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.