Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-77-2350944.1
Update Date: 2018-04-13
Keywords:

Solution Type  Sun Alert Sure

Solution  2350944.1 :   T7, S7, M7, T8 and M8 Servers May Experience Degraded Performance with Cores Misprogrammed in a Reduced Power Mode  


Related Items
  • Netra SPARC S7-2
  • SPARC T8-1
  • SPARC T8-4
  • SPARC M8-8
  • SPARC T7-4
  • SPARC M7-8
  • SPARC S7-2L
  • SPARC M7-4
  • SPARC S7-2
  • SPARC T8-2
  • SPARC T7-2
  • SPARC M7-16
  • SPARC T7-1
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert




In this Document
Description
Occurrence
Symptoms
Workaround
Resolution
History
References


Applies to:

SPARC M8-8
Netra SPARC S7-2
SPARC T7-4
SPARC S7-2
SPARC T7-2
SPARC
SPARC T8-1
SPARC M7-16
SPARC T7-1
SPARC T8-2
SPARC T8-4
___________________________________________



Date of Workaround Release: 23-Jan-2018

Date of Resolved Release: 23-Feb-2018
___________________________________________

Description

SPARC T7/S7/M7/T8 and M8 servers may experience degraded performance when a subset of SPARC core clusters (SCC) in M7 and S7 CPUs are misprogrammed in a reduced power mode.

Occurrence

This issue can occur on the following platforms:

SPARC Platform

  • T7/S7/M7/T8 and M8 Servers with Oracle VM Server version 3.5.0.0 or earlier

To determine the Oracle VM Server version installed on the server, use the following Solaris command in the primary domain:

    # ldm -V | grep Logical
    Logical Domains Manager (v 3.5.0.0.31)
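
The version check above can also be scripted. The following is a minimal sketch and not part of the original alert: it parses a line in the format shown above and compares the first four version components against 3.5.0.0. The helper names `parse_ldm_version` and `is_affected_version` are hypothetical, and ignoring the trailing build number (31 in the example) in the comparison is an assumption.

```python
import re

# Threshold taken from the Occurrence section: version 3.5.0.0 or earlier.
AFFECTED_MAX = (3, 5, 0, 0)


def parse_ldm_version(line):
    """Return the version found in an `ldm -V` line as a tuple of ints, or None."""
    match = re.search(r"Logical Domains Manager \(v\s*([0-9.]+)\)", line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected_version(version):
    """Assumption: only the first four components matter; the build number is ignored."""
    return version[:4] <= AFFECTED_MAX


sample = "Logical Domains Manager (v 3.5.0.0.31)"
version = parse_ldm_version(sample)
print(version, is_affected_version(version))
```

On a live system the input line would come from `ldm -V | grep Logical` run in the primary domain, as shown above.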

Note: Oracle SuperCluster M8, Oracle SuperCluster M7, and Oracle MiniCluster S7-2 servers disable power management during installation, but the HOST stop/start that the workaround requires may not have occurred. The clockrate.py script described in the Symptoms section below can be used to detect reduced power CPUs.

Symptoms

The 'clockrate.py' Python script described in <Document: 1610270.1> can be run to check for reduced power CPUs. The script must be run in every domain, that is, the primary domain and all guest domains, to verify that no reduced power CPUs are present.

A core cluster is a group of four cores. The M7 CPU has eight core clusters (32 cores) and the S7 CPU has two core clusters (eight cores). An entire cluster of four cores may be misprogrammed in a reduced power mode.

Any CPUs with Mhz(%) listed in the 20-25% range may be misprogrammed for reduced power. The example below shows cores 44-47 and 56-59 running in reduced power:

    # ./clockrate.py --core -c 1
    2018-01-17 11:12:38 PSET CHIP CORE CPU Mhz(r) Mhz(m) Mhz(%) t=100
    ...
    2018-01-17 11:12:38 -1 1 40 320 4133 4143 100%
    2018-01-17 11:12:38 -1 1 41 328 4133 4144 100%
    2018-01-17 11:12:38 -1 1 42 336 4133 4144 100%
    2018-01-17 11:12:38 -1 1 43 344 4133 4145 100%
    2018-01-17 11:12:38 -1 1 44 352 4133 1033 25%
    2018-01-17 11:12:38 -1 1 45 360 4133 1034 25%
    2018-01-17 11:12:38 -1 1 46 368 4133 1034 25%
    2018-01-17 11:12:38 -1 1 47 376 4133 1034 25%
    2018-01-17 11:12:39 -1 1 52 416 4133 4151 100%
    2018-01-17 11:12:39 -1 1 53 424 4133 4152 100%
    2018-01-17 11:12:39 -1 1 54 432 4133 4152 100%
    2018-01-17 11:12:39 -1 1 55 440 4133 4152 100%
    2018-01-17 11:12:39 -1 1 56 448 4133 1035 25%
    2018-01-17 11:12:39 -1 1 57 456 4133 1036 25%
    2018-01-17 11:12:39 -1 1 58 464 4133 1036 25%
    2018-01-17 11:12:39 -1 1 59 472 4133 1036 25%
    ...

In the above example, all four cores of the two reduced power core clusters are assigned to the domain where clockrate is run. Depending on your CPU assignments, core clusters may be split and the cores may be assigned to multiple domains. Reduced power would be evident in the clockrate output of the domains assigned the cores. Thus, fewer than four cores may manifest the reduced power in the output.
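
Scanning long clockrate output by eye is error-prone, so the check can be automated. The following is a minimal sketch, not part of the clockrate.py script itself: it reads rows in the format shown above, flags cores whose Mhz(%) falls in the 20-25% range, and groups them by core cluster. The function name `reduced_power_clusters` is hypothetical. The field positions are inferred from the example output, and the grouping assumes clusters are aligned on core IDs that are multiples of four, which matches the example (44-47, 56-59) but is not stated in the source.

```python
from collections import defaultdict

CORES_PER_CLUSTER = 4  # from the text: a core cluster is a group of four cores


def reduced_power_clusters(output, low=20, high=25):
    """Map (chip, cluster index) to the sorted core IDs reported in [low, high] percent."""
    clusters = defaultdict(list)
    for line in output.splitlines():
        fields = line.split()
        # Data rows have nine fields ending in a percentage; skip header and "..." rows.
        if len(fields) != 9 or not fields[8].endswith("%"):
            continue
        if not fields[8][:-1].isdigit():
            continue
        chip, core = int(fields[3]), int(fields[4])
        percent = int(fields[8][:-1])
        if low <= percent <= high:
            # Assumption: cluster index is the core ID divided by four, rounded down.
            clusters[(chip, core // CORES_PER_CLUSTER)].append(core)
    return {key: sorted(cores) for key, cores in clusters.items()}


sample = """\
2018-01-17 11:12:38 PSET CHIP CORE CPU Mhz(r) Mhz(m) Mhz(%) t=100
2018-01-17 11:12:38 -1 1 43 344 4133 4145 100%
2018-01-17 11:12:38 -1 1 44 352 4133 1033 25%
2018-01-17 11:12:38 -1 1 45 360 4133 1034 25%
2018-01-17 11:12:39 -1 1 56 448 4133 1035 25%
"""
print(reduced_power_clusters(sample))
```

A cluster reported with fewer than four cores is not necessarily healthy elsewhere: as noted above, its remaining cores may be assigned to another domain, so the output of every domain needs to be checked.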

Note: If the clockrate script prints "WARNING - poweradm(1m) is currently ENABLED" and the reported Mhz(r) in column six does not match either 4133 or 5067 OR no cores are listed, run 'poweradm set active_control/administrative-authority=none' as root and re-run the clockrate script.
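
The condition in the note above combines three observations, and it can help to write it out explicitly. The sketch below is one reading of that note, not an authoritative one: it assumes the warning must be present and that either an unexpected Mhz(r) value or an empty core list then triggers the re-run. The function name `should_rerun_after_poweradm` and its parameters are hypothetical.

```python
EXPECTED_MHZ_R = {4133, 5067}  # the two Mhz(r) values named in the note


def should_rerun_after_poweradm(warning_printed, mhz_r_values):
    """Assumed reading: warning AND (unexpected Mhz(r) OR no cores listed)."""
    no_cores_listed = len(mhz_r_values) == 0
    unexpected_rate = any(value not in EXPECTED_MHZ_R for value in mhz_r_values)
    return warning_printed and (unexpected_rate or no_cores_listed)


# Warning shown and an Mhz(r) value that is neither 4133 nor 5067.
print(should_rerun_after_poweradm(True, [4133, 3600]))
```

When the function returns True, the note's instruction applies: run 'poweradm set active_control/administrative-authority=none' as root and re-run the clockrate script.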

Workaround

To work around this issue, apply the following steps:

1. Disable Service Processor power management (aka platform policy) by using the following ILOM command as an administrative user:

    --> set /SP/powermgmt/ policy=disabled

2. Disable power management on the control domain (aka primary) by using the following command as root:

    # poweradm set active_control/administrative-authority=none

Power management *must* be disabled as shown in both step 1 and step 2 above. If either configuration property is modified, the HOST *must* be stopped (powered off) and restarted for the workaround to take effect and produce the proper power configuration. Repeat the workaround for each HOST on multi-domain servers. HOST is synonymous with the term PDomain.
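
Because the workaround has three parts (two settings plus a HOST restart), a per-HOST checklist can help when several HOSTs must be handled. The following is a minimal sketch under stated assumptions: the `HostPowerState` record and `workaround_gaps` function are hypothetical, and the values are expected to be collected by hand from ILOM and from the control domain. The sketch does not query or change any system setting.

```python
from dataclasses import dataclass


@dataclass
class HostPowerState:
    sp_policy: str                # current /SP/powermgmt policy value
    admin_authority: str          # current active_control/administrative-authority value
    restarted_since_change: bool  # HOST powered off and restarted after the last change


def workaround_gaps(state):
    """Return the workaround steps still outstanding for one HOST."""
    gaps = []
    if state.sp_policy != "disabled":
        gaps.append("set /SP/powermgmt/ policy=disabled")
    if state.admin_authority != "none":
        gaps.append("poweradm set active_control/administrative-authority=none")
    # Changing either property, or a change not yet followed by a restart,
    # means the HOST still has to be stopped and restarted.
    if gaps or not state.restarted_since_change:
        gaps.append("stop (power off) and restart the HOST")
    return gaps


print(workaround_gaps(HostPowerState("disabled", "platform", True)))
```

An empty list means both settings are in place and the HOST has been restarted since they were applied; the clockrate script can then be used to confirm the result.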

The Python script (shown in the Symptoms section above) can be used to confirm all CPUs are properly programmed and providing full performance.

For additional information, reference the following documents:

  • How to Save Power on SPARC T5, M5, M6, T7, M7 and S7 Servers <Document: 1610270.1>
  • Oracle VM Server for SPARC (LDoms) Document Index <Document: 1367098.1>

Resolution

This issue is addressed in the following release:

SPARC Platform

  • Solaris 11.3.29.5.0 or later

Note: In addition to installing Solaris 11.3.29.5.0 or later, the fix requires the following:

1. Disable Service Processor power management (aka platform policy) by using the following ILOM command as an administrative user:

    --> set /SP/powermgmt/ policy=disabled

2. Set the power management administrative control to platform on the control domain (aka primary) by using the following command as root:

    # poweradm set active_control/administrative-authority=platform

If either configuration property is modified, the HOST *must* be stopped (powered off) and restarted for the fix to take effect and produce the proper power configuration. Repeat the fix steps above for each HOST on multi-domain servers. HOST is synonymous with the term PDomain.

Note: The Solaris update need only be installed on the control domain (primary). It does not need to be installed on all domains.
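
The resolution differs from the workaround in two ways: it requires a minimum Solaris release on the control domain, and the administrative authority is set to platform rather than none. The sketch below illustrates both checks. It is only an illustration: the function names and the dictionary keys are hypothetical labels, the release is assumed to be supplied as a dotted string in the form this document uses, and nothing here reads values from a live system.

```python
FIXED_RELEASE = (11, 3, 29, 5, 0)  # Solaris 11.3.29.5.0, from the Resolution section

# Settings the Resolution section requires alongside the Solaris update.
# The keys are hypothetical labels for this sketch, not command syntax.
REQUIRED_SETTINGS = {
    "sp_powermgmt_policy": "disabled",
    "administrative_authority": "platform",
}


def has_fixed_release(release):
    """True if a dotted release string such as '11.3.29.5.0' is at or above the fix."""
    parts = tuple(int(part) for part in release.strip().split("."))
    return parts >= FIXED_RELEASE


def resolution_complete(release, settings, restarted_since_change):
    """Check the three conditions from the text: release, both settings, HOST restart."""
    settings_ok = all(settings.get(key) == value for key, value in REQUIRED_SETTINGS.items())
    return has_fixed_release(release) and settings_ok and restarted_since_change


current = {"sp_powermgmt_policy": "disabled", "administrative_authority": "none"}
print(resolution_complete("11.3.29.5.0", current, True))
```

In this usage example the authority is still set to none, as left by the workaround, so the check fails until it is changed to platform and the HOST is restarted.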

History

23-Jan-2018: Document released, status is Workaround
23-Feb-2018: Resolution updated for Solaris release, issue Resolved
13-Apr-2018: Updated the Resolution section

This is a software issue. Solaris fails to allow sufficient time to properly program the core cluster cycleskip setting. As a result, the operation times out and leaves the cycleskip programmed in a reduced power mode.

Questions regarding this document should be addressed to
sunalertpublication_us_grp@oracle.com, with a copy to the
submitter/responsible engineer listed below:

Internal Contributor/Submitter: david.lafko@oracle.com
Internal Eng Responsible Engineer: david.finberg@oracle.com
Oracle Knowledge Analyst: jeff.folla@oracle.com
Internal Eng Business Unit Group: Systems Server OS
Internal Associated SRs: 3-16428085581
Internal Resolution Patches:

References

<BUG:27248004> - DISABLE POWER MANAGEMENT IN LDMD BY DEFAULT

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.