Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-77-1567076.1
Update Date:2013-08-21
Keywords:

Solution Type  Sun Alert Sure

Solution  1567076.1 :   Guest Domains Running Solaris 11.1.7.5.0 (or greater) on T5 and M5 Systems may Experience Data Corruption, Hangs or Panics when Live Migrated  


Related Items
  • SPARC M5-32
  • SPARC T5-1B
  • Sun Software - Generic
  • Solaris Operating System
  • Netra SPARC T5-1B Server Module
  • SPARC T5-2
  • SPARC T5-4
  • Sun Hardware - Generic
  • SPARC T5-8
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert
  • _Old GCS Categories>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • _Old GCS Categories>Sun Microsystems>Sun Alert>Release Phase>Resolved




In this Document
Description
Occurrence
Symptoms
Workaround
Patches
History
References


Applies to:

SPARC T5-1B
Netra SPARC T5-1B Server Module
Sun Hardware - Generic
Sun Software - Generic
SPARC M5-32
Sun SPARC Sun OS
__________________________________________



Date of Workaround Release: 05-Jul-2013

Date of Resolved Release: 21-Aug-2013
__________________________________________

Description

When guest domains running Solaris 11.1.7.5.0 (or greater) on T5 and M5 systems are live migrated, those guest domains may experience data corruption or hangs, or the guest domain's operating system may panic.

This is due to an issue in the Hypervisor firmware which makes older system firmware incompatible with the newer Solaris releases 11.1.7.5.0 and greater.

Note: This issue is also being tracked in Problem <Document:1567072.1>.

Occurrence

This issue can occur on the following systems with Solaris 11.1.7.5.0 or greater:

SPARC Platform

  • Guest domains on Sun SPARC T5 Servers with System Firmware versions 9.0.0.d, 9.0.0.h and 9.0.0.i
  • Guest domains on Sun SPARC M5-32 Server with System Firmware versions 9.0.1.f and 9.0.1.g

Notes:

    1. This issue only impacts SPARC T5 and M5 platforms running Oracle VM Server for SPARC versions 3.0.0.3 or earlier.

    2. Primary domains are not impacted by this issue.

To determine the Solaris version of the guest domain, use the following command on the guest domain:

    $ pkg info entire
              Name: entire
           Summary: entire incorporation including Support Repository Update (Oracle Solaris 11.1.7.5.0).
       Description: This package constrains system package versions to the same
                    build.  WARNING: Proper system update and correct package
                    selection depend on the presence of this incorporation.
                    Removing this package will result in an unsupported system.  For
                    more information see https://support.oracle.com/CSP/main/article
                    ?cmd=show&type=NOT&doctype=REFERENCE&id=1501435.1.
          Category: Meta Packages/Incorporations
             State: Not installed
         Publisher: solaris
           Version: 0.5.11 (Oracle Solaris 11.1.7.5.0)
     Build Release: 5.11
            Branch: 0.175.1.7.0.5.0
    Packaging Date: Sat May 04 02:41:45 2013
              Size: 5.46 kB
              FMRI: pkg://solaris/entire@0.5.11,5.11-0.175.1.7.0.5.0:20130504T024145Z

In the output above, the "Version" text shows that Solaris 11.1.7.5.0 is in use:

    Version: 0.5.11 (Oracle Solaris 11.1.7.5.0)
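
The version check described above can also be scripted. The following is a minimal sketch, assuming a POSIX shell and that the "Version:" line always has the form shown in the sample output. The helper names 'solaris_release' and 'version_ge' are illustrative and are not part of any Oracle tool.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any Oracle tool.

# Read `pkg info entire` output on stdin and print the release string taken
# from the "Version:" line, for example "11.1.7.5.0".
solaris_release() {
    sed -n 's/^[[:space:]]*Version:.*(Oracle Solaris \([0-9.]*\)).*$/\1/p' | head -n 1
}

# Return 0 when dotted version $1 is greater than or equal to $2.
# Fields are compared numerically, left to right; missing fields count as 0.
version_ge() {
    vg_a=$1
    vg_b=$2
    while [ -n "$vg_a" ] || [ -n "$vg_b" ]; do
        vg_x=${vg_a%%.*}
        vg_y=${vg_b%%.*}
        if [ "${vg_x:-0}" -gt "${vg_y:-0}" ]; then return 0; fi
        if [ "${vg_x:-0}" -lt "${vg_y:-0}" ]; then return 1; fi
        # Drop the field just compared, or empty the string if it was the last.
        case $vg_a in *.*) vg_a=${vg_a#*.} ;; *) vg_a= ;; esac
        case $vg_b in *.*) vg_b=${vg_b#*.} ;; *) vg_b= ;; esac
    done
    return 0
}

# Example (run on the guest domain):
#   rel=$(pkg info entire | solaris_release)
#   if version_ge "$rel" 11.1.7.5.0; then
#       echo "Solaris $rel is 11.1.7.5.0 or greater"
#   fi
```

A result of "11.1.7.5.0 or greater" only means the Solaris condition is met; the system firmware and Oracle VM Server for SPARC versions still have to be checked on the primary domain.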

To determine the version of Oracle VM Server for SPARC and the version of Sun System Firmware, use the following command on the primary domain:

    # ldm -V
    Logical Domains Manager (v 3.0.0.3)
            Hypervisor control protocol v 1.11
            Using Hypervisor MD v 1.4

    System PROM:
           Hostconfig      v. 1.3.0.h      @(#)Hostconfig 1.3.0.h 2013/05/16 16:58
           Hypervisor      v. 1.12.0.g     @(#)Hypervisor 1.12.0.g 2013/05/16 16:40
           OpenBoot        v. 4.35.0.a     @(#)OpenBoot 4.35.0.a 2013/03/01 14:53

Sun System Firmware versions 9.0.0.d, 9.0.0.h and 9.0.0.i for T5 include Hypervisor versions 1.12.0.d, 1.12.0.f and 1.12.0.g respectively. Versions 9.0.1.f and 9.0.1.g for M5 include Hypervisor versions 1.12.1.c and 1.12.1.d respectively.  All of these versions are vulnerable to this issue. In the example output above, Hypervisor 1.12.0.g corresponds to T5 System Firmware 9.0.0.i.
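
The Hypervisor check can be sketched as a small filter over the 'ldm -V' output. This is an illustration only, assuming a POSIX shell and the output layout shown in the example above. The function names are hypothetical, and the version list is copied from this document.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of the ldm CLI.

# Read `ldm -V` output on stdin and print the Hypervisor version from the
# "System PROM" section, for example "1.12.0.g".
hypervisor_version() {
    sed -n 's/^[[:space:]]*Hypervisor[[:space:]]*v\.[[:space:]]*\([^[:space:]]*\).*$/\1/p' | head -n 1
}

# Read `ldm -V` output on stdin and print the Logical Domains Manager
# version, for example "3.0.0.3".
ldm_manager_version() {
    sed -n 's/^[[:space:]]*Logical Domains Manager (v \([0-9.]*\)).*$/\1/p' | head -n 1
}

# Return 0 when $1 is one of the Hypervisor versions this document lists
# as vulnerable.
is_listed_hypervisor() {
    case $1 in
        1.12.0.d|1.12.0.f|1.12.0.g) return 0 ;;  # T5 firmware 9.0.0.d, 9.0.0.h, 9.0.0.i
        1.12.1.c|1.12.1.d)          return 0 ;;  # M5 firmware 9.0.1.f, 9.0.1.g
        *)                          return 1 ;;
    esac
}

# Example (run on the primary domain):
#   hv=$(ldm -V | hypervisor_version)
#   if is_listed_hypervisor "$hv"; then
#       echo "Hypervisor $hv is listed as vulnerable"
#   fi
```

A version that is not in the list is simply not one of those named here; this sketch does not establish that any other firmware release is unaffected.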

Symptoms

If the issue results in data corruption, abnormal application behavior (such as application crashes) may be seen on the guest domain.

If the issue results in a guest domain hang, the guest domain and its applications become unresponsive.

If the issue results in a guest domain panic, a panic message may appear on the guest domain console after the guest is migrated.  Because the problem is non-deterministic, the panic output is not reliably reproducible and therefore cannot be documented here.

Workaround

Two separate workarounds are available to avoid this issue.  Workaround A is preferred, since it affects the guest domain only temporarily and requires no additional steps to remove after migration. Workaround B may also be used safely, but it may impact performance until the user explicitly removes it.

Workaround A:

Workaround A must be performed every time a guest domain is to be live migrated.  DRM policies for the guest domain must be temporarily disabled, and the 'disable_mmu_group_demap' script from Problem <Document:1567072.1> must then be executed on the guest domain before the live migration is initiated.  The script temporarily disables a performance feature of the hardware, which in turn allows the guest domain to be live migrated successfully.  Once live migration has completed, the DRM policies must be re-enabled.

Please see Problem <Document:1567072.1> for the 'disable_mmu_group_demap' script and complete details for Workaround A.

Workaround B:

Workaround B requires that the guest domain's '/etc/system' file be edited and the guest domain be rebooted before the guest domain is live migrated.  Unlike Workaround A, this workaround disables a performance feature of the hardware persistently: the feature remains disabled until the guest domain's '/etc/system' file is restored to its original state and the guest domain is rebooted.

    1. On the guest domain, append the following line to the '/etc/system' file:

       set sfmmu_demap_xcall_optimization=2

    2. Reboot the guest domain.

    3. Migrate the guest domain.

Note: This workaround remains in effect until it is removed.  If performance is acceptable, and the guest domain may need to be migrated in the future, it is suggested that this workaround be left in place pending the final resolution.

Steps for removing Workaround B:

    1. On the guest domain, remove the following line from the /etc/system file:

       set sfmmu_demap_xcall_optimization=2

    2. Reboot the guest domain.
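
The '/etc/system' edits for applying and removing Workaround B can be sketched as two idempotent helpers. This is an illustration under stated assumptions, not an Oracle-supplied procedure: the function names and the '.bak' backup suffix are hypothetical, the file is assumed to end with a newline, and the reboot and migration steps are left to the administrator.

```shell
#!/bin/sh
# Sketch only: function names and the ".bak" suffix are hypothetical.

WB_LINE='set sfmmu_demap_xcall_optimization=2'

# Return 0 when file $1 already contains the workaround line exactly.
has_workaround_b() {
    while IFS= read -r wb_cur || [ -n "$wb_cur" ]; do
        if [ "$wb_cur" = "$WB_LINE" ]; then return 0; fi
    done < "$1"
    return 1
}

# Append the workaround line to the file (default /etc/system) unless it is
# already present. A copy of the previous contents is kept as <file>.bak.
add_workaround_b() {
    wb_file=${1:-/etc/system}
    if has_workaround_b "$wb_file"; then return 0; fi
    cp -p "$wb_file" "$wb_file.bak" || return 1
    printf '%s\n' "$WB_LINE" >> "$wb_file"
}

# Remove every line that exactly matches the workaround line, leaving all
# other lines unchanged. A copy of the previous contents is kept as <file>.bak.
remove_workaround_b() {
    wb_file=${1:-/etc/system}
    cp -p "$wb_file" "$wb_file.bak" || return 1
    while IFS= read -r wb_cur || [ -n "$wb_cur" ]; do
        [ "$wb_cur" = "$WB_LINE" ] || printf '%s\n' "$wb_cur"
    done < "$wb_file.bak" > "$wb_file"
}

# Example (run as root on the guest domain, then reboot the guest domain):
#   add_workaround_b /etc/system
#   remove_workaround_b /etc/system
```

Passing the file name as an argument allows the helpers to be tried on a copy of '/etc/system' before they are used on the real file.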


This issue is addressed in the following firmware patches:

  • 17019067 for NETRA SPARC T5-1B SUN SYSTEM FIRMWARE 9.0.2.G
  • 17019069 for SPARC T5-4+T5-8 SUN SYSTEM FIRMWARE 9.0.2.G
  • 17019075 for SPARC T5-1B SUN SYSTEM FIRMWARE 9.0.2.G
  • 17019079 for SPARC T5-2 SUN SYSTEM FIRMWARE 9.0.2.G
  • 17019082 for SPARC M5-32 SUN SYSTEM FIRMWARE 9.0.2.E

Patches

<SUNPATCH:17019067>
<SUNPATCH:17019069>
<SUNPATCH:17019075>
<SUNPATCH:17019079>
<SUNPATCH:17019082>

History

05-Jul-2013: Document released; state Workaround
21-Aug-2013: Firmware patches available, issue is Resolved

This regression was triggered by the putback for Solaris bug 15765451, first released
in Solaris 11.1.7.5.0. Before that release, the Hypervisor code that causes the issue was not exercised.

A mitigation is pending completion for Oracle VM Server for SPARC version 3.0.0.4,
which will revert to using warm migration if Sun System Firmware 9.0.2 (or later) is not present.
This fix is expected to be released near the end of July.

It is common practice to use live migration to evacuate platforms prior to performing system maintenance,
including firmware upgrades.  Therefore, either the workarounds listed above or the pending mitigation
in Oracle VM Server for SPARC 3.0.0.4 should be used before live migrating guest domains
when upgrading the firmware.

Questions regarding this document should be addressed to
sunalertpublication_us_grp@oracle.com, with a copy to the
responsible engineer listed below.

Internal Contributor/Submitter: Justin.Frank@oracle.com
Internal Eng Responsible Engineer: Justin.Frank@oracle.com
Internal Knowledge Analyst: david.mariotto@oracle.com
Internal Eng Business Unit Group: Systems RPE

References









  Copyright © 2018 Oracle, Inc.  All rights reserved.