Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2348450.1
Update Date:2018-01-15
Keywords:

Solution Type  Problem Resolution Sure

Solution  2348450.1 :   SPARC T4-2 server does not recognize its Memory Risers and DIMMs


Related Items
  • SPARC T4-2
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T4


A SPARC T4-2 system with four Memory Risers went down. POST could not configure any MCU because no Memory Riser or DIMM was accessible, so the system could not pass POST.

When listing the FRU information from ILOM, the system did not display any Memory Riser in the component list, either in the ILOM Snapshot or in the output of the ILOM command:
-> show -o table -level all /SYS fru_part_number

In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-16637006971>

Applies to:

SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Other Facts:
The FMA message ILOM-8000-2V (FRU ID corrupt) was found on ILOM after reseating all of the Memory DIMMs and Risers.

Symptoms

Customer reported: the system powered off without manual intervention.

From POST output:
Serial console started. To stop, type #.
2018-01-12 16:42:17 0:0:0> NOTICE: Initializing TPM with:
tpm_enable = false
tpm_activate = false
tpm_forceclear = false
2018-01-12 16:42:17 0:0:0> NOTICE: TPM found: Ver 1.2, Rev 1.2, SpecLevel 2, errataRev 0, VendorId 'IFX'
2018-01-12 16:42:19 0:0:0> NOTICE: TPM initialized successfully. Current state is: disabled
2018-01-12 16:42:19 0:0:0> NOTICE: Serial#: 000000000000002a.0159ccc07d224154
2018-01-12 16:42:19 1:0:0> NOTICE: Serial#: 000000000000002a.0159ccc07d2241a6
2018-01-12 16:42:19 0:0:0> NOTICE: Version: 003e003012030607
2018-01-12 16:42:19 1:0:0> NOTICE: Version: 003e003012030607
2018-01-12 16:42:19 0:0:0> NOTICE: T4 Revision: 1.2
2018-01-12 16:42:19 1:0:0> NOTICE: T4 Revision: 1.2
2018-01-12 16:42:19 0:0:0> ERROR: Can't read BoB device type from FRUID, disabling MCUs
2018-01-12 16:42:19 0:0:0> ERROR: Can't read BoB device type from FRUID, disabling MCUs
2018-01-12 16:42:19 0:0:0> ERROR: Can't read BoB device type from FRUID, disabling MCUs
2018-01-12 16:42:20 0:0:0> ERROR: Can't read BoB device type from FRUID, disabling MCUs
2018-01-12 16:42:23 1:0:0> NOTICE: /SYS/MB/CMP1/MCU0 is disabled
2018-01-12 16:42:23 0:0:0> NOTICE: /SYS/MB/CMP0/MCU0 is disabled
2018-01-12 16:42:23 1:0:0> NOTICE: /SYS/MB/CMP1/MCU1 is disabled
2018-01-12 16:42:23 0:0:0> NOTICE: /SYS/MB/CMP0/MCU1 is disabled
2018-01-12 16:42:23 1:0:0> ERROR: Not all MCUs enabled. Unsupported Config.
2018-01-12 16:42:23 0:0:0> ERROR: Please refer to the service documentation for supported memory configurations.
2018-01-12 16:42:23 1:0:0> NOTICE: /SYS/MB/CMP1/MCU0 is disabled
2018-01-12 16:42:23 0:0:0> ERROR: Not all MCUs enabled. Unsupported Config.
2018-01-12 16:42:23 1:0:0> NOTICE: /SYS/MB/CMP1/MCU1 is disabled
2018-01-12 16:42:23 0:0:0> NOTICE: /SYS/MB/CMP0/MCU0 is disabled
2018-01-12 16:42:23 0:0:0> NOTICE: /SYS/MB/
===================
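
The POST output above can be triaged with a short script. The following is a minimal sketch, assuming the console log has been saved as plain text in the same line format shown above; the function name `summarize_post` and the sample variable are illustrative and not part of any Oracle tool. It counts the FRUID read errors and collects the MCUs that POST reports as disabled.

```python
import re

# Matches POST console lines such as:
# 2018-01-12 16:42:19 0:0:0> ERROR: Can't read BoB device type from FRUID, disabling MCUs
POST_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<strand>\d+:\d+:\d+)> (?P<level>[A-Z]+): (?P<text>.*)$"
)
# Matches the message body of a "disabled MCU" notice.
DISABLED_MCU = re.compile(r"^(/SYS/MB/CMP\d+/MCU\d+) is disabled$")
FRUID_ERROR = "Can't read BoB device type from FRUID"


def summarize_post(log_text):
    """Return (number of FRUID read errors, sorted list of disabled MCUs)."""
    fruid_errors = 0
    disabled = set()
    for line in log_text.splitlines():
        match = POST_LINE.match(line.strip())
        if not match:
            continue  # skip banners, blank lines, and truncated lines
        text = match.group("text")
        if match.group("level") == "ERROR" and FRUID_ERROR in text:
            fruid_errors += 1
        mcu = DISABLED_MCU.match(text)
        if mcu:
            disabled.add(mcu.group(1))  # the set removes repeated notices
    return fruid_errors, sorted(disabled)


# Sample lines copied from the POST output in this document.
SAMPLE_POST = """\
2018-01-12 16:42:19 0:0:0> NOTICE: T4 Revision: 1.2
2018-01-12 16:42:19 0:0:0> ERROR: Can't read BoB device type from FRUID, disabling MCUs
2018-01-12 16:42:19 0:0:0> ERROR: Can't read BoB device type from FRUID, disabling MCUs
2018-01-12 16:42:23 1:0:0> NOTICE: /SYS/MB/CMP1/MCU0 is disabled
2018-01-12 16:42:23 0:0:0> NOTICE: /SYS/MB/CMP0/MCU0 is disabled
2018-01-12 16:42:23 1:0:0> ERROR: Not all MCUs enabled. Unsupported Config.
"""

if __name__ == "__main__":
    errors, mcus = summarize_post(SAMPLE_POST)
    print("FRUID read errors:", errors)
    print("Disabled MCUs:", ", ".join(mcus))
```

A nonzero FRUID error count together with every MCU listed as disabled matches the symptom described in this document.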

When the system was checked, no Memory Riser was listed, and no FRU ID information was reported for any riser or DIMM:

-> show -o table -level all /SYS fru_part_number
Target | Property | Value
------------------------------------------+--------------------------------------------------+-------------------------------------------------------------------------
/SYS/FANBD | fru_part_number | 7051522
/SYS/MB | fru_part_number | 7049060
/SYS/MB/SP | fru_part_number | 7054434
/SYS/MB_ENV | fru_part_number | 7024515
/SYS/PS0 | fru_part_number | 7048278
/SYS/PS1 | fru_part_number | 7048278
/SYS/SASBP | fru_part_number | 511-1246-04

Changes

 No changes were made to the system; this was an unexpected outage.

Cause

Troubleshooting showed that one failing Memory Riser prevented all the other Memory Risers and DIMMs from appearing in the system configuration.
 
The following output was collected after installing all memory riser boards except the suspect one:

-> show -o table -level all /SYS fru_part_number
Target | Property | Value
------------------------------------------+--------------------------------------------------+-------------------------------------------------------------------------
/SYS/FANBD | fru_part_number | 7051522
/SYS/MB | fru_part_number | 7049060
/SYS/MB/CMP0/MR0 | fru_part_number | 7051516
/SYS/MB/CMP0/MR0/BOB0/CH0/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR0/BOB0/CH1/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR0/BOB1/CH0/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR0/BOB1/CH1/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR1 | fru_part_number | 7051516
/SYS/MB/CMP0/MR1/BOB0/CH0/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR1/BOB0/CH1/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR1/BOB1/CH0/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR1/BOB1/CH1/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP1/MR0 | fru_part_number | 7051516
/SYS/MB/CMP1/MR0/BOB0/CH0/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP1/MR0/BOB0/CH1/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP1/MR0/BOB1/CH0/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP1/MR0/BOB1/CH1/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/SP | fru_part_number | 7054434
/SYS/MB_ENV | fru_part_number | 7024515
/SYS/PS0 | fru_part_number | 7048278
/SYS/PS1 | fru_part_number | 7048278
/SYS/SASBP | fru_part_number | 511-1246-04
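
Comparing listings like the one above by eye is error-prone, so the check can be scripted. The following is a minimal sketch, assuming the `show -o table` output has been saved as text with the three pipe-separated columns shown above. The helper names are illustrative, and the list of expected riser slots is an assumption: the document states that the system has four risers, and `/SYS/MB/CMP1/MR1` is inferred from the naming pattern of the other three.

```python
import re

# DIMM targets look like /SYS/MB/CMP0/MR0/BOB0/CH0/D0; group 1 is the parent riser.
DIMM = re.compile(r"^(/SYS/MB/CMP\d+/MR\d+)/BOB\d+/CH\d+/D\d+$")

# Assumption: four riser slots, with CMP1/MR1 inferred from the naming pattern.
EXPECTED_RISERS = [
    "/SYS/MB/CMP0/MR0",
    "/SYS/MB/CMP0/MR1",
    "/SYS/MB/CMP1/MR0",
    "/SYS/MB/CMP1/MR1",
]


def parse_fru_table(output):
    """Map each ILOM target to its fru_part_number value."""
    parts = {}
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # The header row and the dashed separator row fail this test.
        if len(fields) == 3 and fields[1] == "fru_part_number":
            parts[fields[0]] = fields[2]
    return parts


def missing_risers(parts, expected=EXPECTED_RISERS):
    """Return the expected riser targets that are absent from the listing."""
    return [slot for slot in expected if slot not in parts]


def dimms_per_riser(parts):
    """Count the DIMM targets listed under each riser."""
    counts = {}
    for target in parts:
        match = DIMM.match(target)
        if match:
            counts[match.group(1)] = counts.get(match.group(1), 0) + 1
    return counts


# Abbreviated sample in the same layout as the ILOM output in this document.
SAMPLE = """\
Target | Property | Value
-------+----------+------
/SYS/MB | fru_part_number | 7049060
/SYS/MB/CMP0/MR0 | fru_part_number | 7051516
/SYS/MB/CMP0/MR0/BOB0/CH0/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR0/BOB0/CH1/D0 | fru_part_number | 7014642,M393B5273CH0-YH9
/SYS/MB/CMP0/MR1 | fru_part_number | 7051516
/SYS/MB/CMP1/MR0 | fru_part_number | 7051516
/SYS/PS0 | fru_part_number | 7048278
"""

if __name__ == "__main__":
    parts = parse_fru_table(SAMPLE)
    print("Missing risers:", missing_risers(parts))
    print("DIMMs per riser:", dimms_per_riser(parts))
```

Applied to the listing taken with the suspect riser installed, the same check would report every expected riser as missing, which is the failure signature described in this document.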


The following output was collected after installing the suspect memory riser board in slot 0 and good memory riser boards in the remaining slots. No riser or DIMM is listed:

-> show -o table -level all /SYS fru_part_number
Target | Property | Value
------------------------------------------+--------------------------------------------------+-------------------------------------------------------------------------
/SYS/FANBD | fru_part_number | 7051522
/SYS/MB | fru_part_number | 7049060
/SYS/MB/SP | fru_part_number | 7054434
/SYS/MB_ENV | fru_part_number | 7024515
/SYS/PS0 | fru_part_number | 7048278
/SYS/PS1 | fru_part_number | 7048278
/SYS/SASBP | fru_part_number | 511-1246-04
 

Solution

Troubleshooting action plan:

1. Remove all power cords.
2. Install the CMP0/MR0 riser only.
3. Reconnect the power cords.
4. Gather the following output from ILOM:
show -o table -level all /SYS fru_part_number
5. Verify that the Memory Riser and its DIMMs are on the list.

If the DIMMs are displayed, repeat the steps with:
CMP0/MR0 + CMP0/MR1
CMP0/MR0 + CMP0/MR1 + CMP1/MR0

Continue until all good risers are installed and all of their DIMMs are seen.
NOTE: If the Riser and its DIMMs do not appear, set the bad Riser aside and test the next Riser in the same configuration.
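
The incremental procedure above can be summarized as a short loop. The following is a minimal sketch, not an Oracle tool: `isolate_bad_risers` and the `fru_list_ok` callback are hypothetical names. In practice the callback stands for the manual steps (remove the power cords, install the risers under test, reconnect power, run the `show` command, and confirm that every installed riser and its DIMMs are listed). Here it is simulated with riser labels that are also made up.

```python
def isolate_bad_risers(risers, fru_list_ok):
    """Add risers one at a time and separate good boards from bad ones.

    risers: riser boards in the order they will be tested.
    fru_list_ok: hypothetical callback; takes the list of installed risers
        and returns True when all of them (and their DIMMs) appear in the
        ILOM FRU listing.
    """
    good = []
    bad = []
    for riser in risers:
        trial = good + [riser]  # known-good boards plus one new candidate
        if fru_list_ok(trial):
            good.append(riser)  # keep it installed for the next round
        else:
            bad.append(riser)   # set it aside and try the next board
    return good, bad


if __name__ == "__main__":
    # Simulation only: board "C" is assumed faulty and, as in this document,
    # a faulty riser hides every riser and DIMM from the FRU listing.
    boards = ["A", "B", "C", "D"]
    good, bad = isolate_bad_risers(boards, lambda installed: "C" not in installed)
    print("Good risers:", good)
    print("Bad risers:", bad)
```

Because each round adds only one new board to a set already known to work, a failed listing points at that board alone.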

====================

This troubleshooting action plan identified the defective memory riser, which was then replaced.


References

<NOTE:1415583.1> - How to Remove and Replace a SPARC T4-2 / Netra T4-2 Memory Risers and DIMMS

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.