Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2227574.1
Update Date:2018-05-01

Solution Type  Technical Instruction Sure

Solution  2227574.1 :   How to diagnose missing resources in SPARC CMT systems  


Related Items
  • SPARC T3-1
  • SPARC T3-4
  • SPARC S7-2
  • SPARC T7-1
  • Sun SPARC Enterprise T5220 Server
  • SPARC T4-2
  • Sun SPARC Enterprise T5240 Server
  • Sun SPARC Enterprise T1000 Server
  • Sun SPARC Enterprise T5140 Server
  • Sun SPARC Enterprise T2000 Server
  • SPARC T7-2
  • SPARC T7-4
  • SPARC T3-2
  • SPARC T4-1
  • SPARC T5-4
  • SPARC T5-2
  • Sun SPARC Enterprise T5120 Server
  • Sun SPARC Enterprise T5440 Server
  • SPARC T4-4
  • SPARC S7-2L
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: Tx000




In this Document
Goal
Solution
References


Applies to:

Sun SPARC Enterprise T2000 Server - Version All Versions and later
Sun SPARC Enterprise T1000 Server - Version Not Applicable and later
Sun SPARC Enterprise T5120 Server - Version All Versions and later
Sun SPARC Enterprise T5140 Server - Version All Versions and later
Sun SPARC Enterprise T5440 Server - Version All Versions and later
Information in this document applies to any platform.

Goal

Description

In certain circumstances Solaris may not report all system resources as available. Possible causes are a fault (KM 1483194.1), components manually disabled by an end user (KM 1643464.1), or resource constraints applied by Oracle VM/LDoms. This document discusses the last of these.

NOTE: SPARC T7 introduces memory DIMM sparing, which is enabled by default on fully populated systems. Please refer to KM 2037793.1 for more information.

Symptoms

OBP reports less memory than expected. First confirm from ILOM that all DIMMs are present and that none are faulted;

-> show -d properties -level all -t /SYS type==DIMM fru_name fault_state
Target | Property | Value
-----------------------------+-----------------------------------+---------------------------------------------------
/SYS/MB/CM/CMP/BOB01/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB01/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/BOB01/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB01/CH1/DIMM | fault_state | OK
/SYS/MB/CM/CMP/BOB11/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB11/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/BOB11/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB11/CH1/DIMM | fault_state | OK
/SYS/MB/CM/CMP/BOB21/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB21/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/BOB21/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB21/CH1/DIMM | fault_state | OK
/SYS/MB/CM/CMP/BOB31/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB31/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/BOB31/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB31/CH1/DIMM | fault_state | OK
/SYS/MB/CM/CMP/MR0/BOB20/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/MR0/BOB20/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/MR0/BOB20/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/MR0/BOB20/CH1/DIMM | fault_state | OK
/SYS/MB/CM/CMP/MR0/BOB30/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/MR0/BOB30/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/MR0/BOB30/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/MR0/BOB30/CH1/DIMM | fault_state | OK
/SYS/MB/CM/CMP/MR1/BOB00/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/MR1/BOB00/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/MR1/BOB00/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/MR1/BOB00/CH1/DIMM | fault_state | OK
/SYS/MB/CM/CMP/MR1/BOB10/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/MR1/BOB10/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/MR1/BOB10/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/MR1/BOB10/CH1/DIMM | fault_state | OK

->

Alternatively, check the output of 'show components' to confirm whether any devices are currently disabled;

-> show components
Target | Property | Value
-----------------------------+-----------------------------------+---------------------------------------------------
/SYS/MB/CM/CMP | current_config_state | Enabled
/SYS/MB/CM/CMP/BOB01 | current_config_state | Enabled
/SYS/MB/CM/CMP/BOB01/CH0 | current_config_state | Enabled
/SYS/MB/CM/CMP/BOB01/CH0/DIMM | current_config_state | Enabled
.
.
/SYS/MB/USB_CTRL | current_config_state | Enabled
/SYS/MB/XGBE0 | current_config_state | Enabled
/SYS/MB/XGBE1 | current_config_state | Enabled
/SYS/RIO/VIDEO | current_config_state | Enabled

->

In this example, the 16 DIMMs of 16384MB each should provide 262144MB (256GB) of main memory, as no modules are disabled or faulted. However, OBP reports just 8GB;

SPARC T7-1, No Keyboard
Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.40.1, 8.0000 GB memory installed, Serial #108074902.
Ethernet address 0:10:e0:71:17:96, Host ID: 86711796.
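
The arithmetic behind the expected figure can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original procedure: the function names expected_memory_mb and obp_memory_mb are hypothetical, and it assumes that each fru_name value begins with the DIMM size in MB and that the OBP banner contains the text "GB memory installed", as in the output shown above.

```python
import re

# Rows for two of the DIMMs in the ILOM listing above (target paths written on one line).
SAMPLE_ILOM = """\
/SYS/MB/CM/CMP/BOB01/CH0/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB01/CH0/DIMM | fault_state | OK
/SYS/MB/CM/CMP/BOB01/CH1/DIMM | fru_name | 16384MB DDR4 SDRAM DIMM
/SYS/MB/CM/CMP/BOB01/CH1/DIMM | fault_state | OK
"""

SAMPLE_BANNER = "OpenBoot 4.40.1, 8.0000 GB memory installed, Serial #108074902."


def expected_memory_mb(ilom_output: str) -> int:
    """Sum the sizes of all DIMMs listed with a fru_name such as '16384MB DDR4 SDRAM DIMM'."""
    sizes = re.findall(r"fru_name\s*\|\s*(\d+)MB\b", ilom_output)
    return sum(int(size) for size in sizes)


def obp_memory_mb(banner: str) -> int:
    """Return the 'N GB memory installed' figure from an OBP banner, converted to MB."""
    match = re.search(r"([\d.]+)\s*GB memory installed", banner)
    if match is None:
        raise ValueError("no memory figure found in banner")
    return round(float(match.group(1)) * 1024)


installed = expected_memory_mb(SAMPLE_ILOM)
reported = obp_memory_mb(SAMPLE_BANNER)
print(f"DIMMs listed: {installed} MB, OBP reports: {reported} MB")
if reported < installed:
    print("OBP reports less memory than the DIMM inventory suggests")
```

With the full listing of 16 DIMMs pasted in place of the two-DIMM sample, the first function would return 16 x 16384 = 262144.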

If the system is already booted to Solaris, various tools may report less memory, fewer CPUs or less I/O than expected;

# psrinfo
0 on-line since 01/25/2017 15:18:21
1 on-line since 01/25/2017 15:18:23
2 on-line since 01/25/2017 15:18:23
3 on-line since 01/25/2017 15:18:23
4 on-line since 01/25/2017 15:18:23
5 on-line since 01/25/2017 15:18:23
6 on-line since 01/25/2017 15:18:23
7 on-line since 01/25/2017 15:18:23
#

# prtdiag | grep "^Memory size"
Memory size: 8192 Megabytes
#
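
The same check can be made against the Solaris output. The following Python sketch is illustrative only: the function names are hypothetical, it counts only lines in the "N on-line since ..." form shown above, and it assumes prtdiag reports the size in Megabytes as in this example (other units are not handled).

```python
import re

# Output copied from the psrinfo and prtdiag examples above.
SAMPLE_PSRINFO = """\
0 on-line since 01/25/2017 15:18:21
1 on-line since 01/25/2017 15:18:23
2 on-line since 01/25/2017 15:18:23
3 on-line since 01/25/2017 15:18:23
4 on-line since 01/25/2017 15:18:23
5 on-line since 01/25/2017 15:18:23
6 on-line since 01/25/2017 15:18:23
7 on-line since 01/25/2017 15:18:23
"""

SAMPLE_PRTDIAG = "Memory size: 8192 Megabytes"


def online_cpu_count(psrinfo_output: str) -> int:
    """Count the virtual processors that psrinfo reports as on-line."""
    return sum(
        1
        for line in psrinfo_output.splitlines()
        if re.match(r"\d+\s+on-line\b", line.strip())
    )


def prtdiag_memory_mb(prtdiag_output: str) -> int:
    """Return the value of the 'Memory size: N Megabytes' line from prtdiag."""
    match = re.search(r"^Memory size:\s*(\d+)\s*Megabytes", prtdiag_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Memory size' line in Megabytes found")
    return int(match.group(1))


print(f"CPUs on-line: {online_cpu_count(SAMPLE_PSRINFO)}")
print(f"Memory: {prtdiag_memory_mb(SAMPLE_PRTDIAG)} MB")
```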

Verify whether the system is booting from the 'factory-default' configuration or from a custom Oracle VM/LDoms configuration;

-> show /HOST/bootmode/ config

/HOST/bootmode
Properties:
config = someldomconfig <<<

->

This can also be verified from the POST logs during platform initialisation;

2017-01-25 15:12:26 0:00:0> NOTICE: Booting config = someldomconfig

And from within Solaris;

# ldm list-spconfig
factory-default
someldomconfig [current]
#
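
Where this check needs to be repeated across several hosts, the 'ldm list-spconfig' output can be inspected programmatically. The Python sketch below is an illustration under stated assumptions: the function name current_spconfig is hypothetical, and it relies only on the "[current]" marker appearing at the end of a line, as in the output above.

```python
from typing import Optional

# Output copied from the 'ldm list-spconfig' example above.
SAMPLE_SPCONFIG = """\
factory-default
someldomconfig [current]
"""


def current_spconfig(output: str) -> Optional[str]:
    """Return the name of the configuration marked [current], or None if none is marked."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[-1] == "[current]":
            return parts[0]
    return None


config = current_spconfig(SAMPLE_SPCONFIG)
if config is not None and config != "factory-default":
    print(f"Host is running custom configuration '{config}'")
else:
    print("Host is running factory-default or no current configuration is marked")
```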

NOTE: If the system is missing resources when booted from the 'factory-default' configuration, please contact Oracle Support for further assistance.

Use ldm to verify the domain resource allocation and ensure that it matches what the host reports as available. In the same example as above, 8 virtual CPUs, 8GB of memory and all I/O are assigned to the primary domain;

# ldm list
NAME      STATE    FLAGS    CONS   VCPU   MEMORY   UTIL   NORM   UPTIME
primary   active   -n-cv-   UART   8      8G       0.3%   0.3%   21m
#

# ldm list-io
NAME                      TYPE   BUS     DOMAIN    STATUS
----                      ----   ---     ------    ------
pci_0                     BUS    pci_0   primary   IOV
pci_1                     BUS    pci_1   primary   IOV
pci_2                     BUS    pci_2   primary   IOV
pci_3                     BUS    pci_3   primary   IOV
pci_4                     BUS    pci_4   primary   IOV
/SYS/MB/PCIE6             PCIE   pci_0   primary   EMP
/SYS/MB/SASHBA            PCIE   pci_0   primary   OCC
/SYS/MB/PCIE4             PCIE   pci_1   primary   OCC
/SYS/MB/PCIE5             PCIE   pci_1   primary   EMP
/SYS/MB/NET0              PCIE   pci_2   primary   OCC
/SYS/MB/NET2              PCIE   pci_2   primary   OCC
/SYS/MB/PCIE2             PCIE   pci_3   primary   EMP
/SYS/MB/PCIE3             PCIE   pci_3   primary   OCC
/SYS/MB/PCIE1             PCIE   pci_4   primary   EMP
/SYS/MB/NET0/IOVNET.PF0   PF     pci_2   primary
/SYS/MB/NET0/IOVNET.PF1   PF     pci_2   primary
/SYS/MB/NET2/IOVNET.PF0   PF     pci_2   primary
/SYS/MB/NET2/IOVNET.PF1   PF     pci_2   primary
#
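
To compare the allocation with the hardware inventory, the totals across all domains can be added up from the 'ldm list' table. The following Python sketch is an illustration only: the function names are hypothetical, it assumes the human-readable layout shown above with every column populated (rows with empty columns are rejected rather than guessed at), and it assumes memory values carry an M, G or T suffix.

```python
# Output copied from the 'ldm list' example above.
SAMPLE_LDM_LIST = """\
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 8 8G 0.3% 0.3% 21m
"""

# Assumed unit suffixes for the MEMORY column, expressed in MB.
UNITS_MB = {"M": 1, "G": 1024, "T": 1024 * 1024}


def memory_to_mb(value: str) -> int:
    """Convert a value such as '8G' to MB."""
    number, unit = value[:-1], value[-1].upper()
    if unit not in UNITS_MB:
        raise ValueError(f"unrecognised memory unit in {value!r}")
    return int(float(number) * UNITS_MB[unit])


def parse_ldm_list(output: str) -> list:
    """Split the table into one dict per domain, keyed by the header names."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split()
    domains = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError(f"cannot align columns for row: {line!r}")
        domains.append(dict(zip(header, fields)))
    return domains


domains = parse_ldm_list(SAMPLE_LDM_LIST)
total_vcpus = sum(int(d["VCPU"]) for d in domains)
total_memory_mb = sum(memory_to_mb(d["MEMORY"]) for d in domains)
print(f"Allocated to domains: {total_vcpus} VCPUs, {total_memory_mb} MB")
```

If these totals are lower than the hardware inventory, the remainder is either unallocated or unavailable.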

Use 'ldm list-devices' to confirm which specific resources are not currently allocated to any domain;

# ldm list-devices cpu
# ldm list-devices memory
# ldm list-devices io

 

Solution

For details on Oracle VM/LDoms setup and configuration, please refer to the Logical Domains (LDoms) Administration Guides, available in the Oracle VM for SPARC Documentation.

For details on the hardware configurations of the relevant sun4v platforms, please refer to the system specific documentation.

If all resources should be assigned to the primary domain and there are no guest domains configured on the system, please use the following steps.

From within Solaris;

# ldm set-spconfig factory-default
# ldm list-spconfig
factory-default [next poweron]
test [current]
# init 0

Alternatively from within ILOM;

-> set /HOST/bootmode config=factory-default
Set 'config' to 'factory-default'

->

Power cycle /SYS to reinitialise the host;

-> stop /SYS
-> start /SYS

If there are guest domains configured on the system, resources should instead be reallocated manually with the ldm command as required.

 

References

<NOTE:2167775.1> - SPARC T5/T7: After clearing ILOM fault and replacing faulted hardware some system resources remains disabled or degraded
<NOTE:1643464.1> - [SPARC T3/T4/T5 and T7] OBP reports "One or more resources have been retired, please run 'show faulty' on the SP" on console
<NOTE:1483194.1> - Commands to run to fully clear ILOM/SP, faultmgmt shell, and FMA faults on the T3-x and T4-x Servers

  Copyright © 2018 Oracle, Inc.  All rights reserved.