Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2197232.1
Update Date:2018-01-08

Solution Type  Technical Instruction Sure

Solution  2197232.1 :   Pillar Axiom: Recovery of corrupt Slammer CU EEPROM  


Related Items
  • Pillar Axiom 600 Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Axiom>SN-DK: Ax600
  • Tools>Type>Guide




Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Customers are not allowed to work on midplane programming
Created from <SR 3-13446680654>

Applies to:

Pillar Axiom 600 Storage System - Version All Versions and later
Information in this document applies to any platform.

Goal

This document explains how to recover a Slammer CU whose midplane EEPROM is corrupt.

Even after the standard Slammer CU recovery steps have been followed for a Slammer CU failure, or after the Slammer motherboard has been replaced, the Slammer CU may still fail to boot. There can be many reasons for this, but one of them is easy to identify by examining the CONSOLE logs, either in the log bundle or by capturing the Slammer CU boot logs over a console cable.

The example messages below show midplane EEPROM corruption in the Slammer CU console logs:

MCCAGENT- 10/06/2016-10:37:33 fofb_check_active() AM_ACTIVE (3, 4) active_candidate=-1
dms: size of cmp rdy msg = 104
##mccAgent: Component Managers Initialized Successfully
fp: INFO: main: slab_flag = 0
WS: cmps rdy: u(0.00), s(0.00), e(7.39)
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
dms: midplane eeprom rd fail
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
dms: mobo rev is 16

id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
eel.key=0xbeef
eel.versiox2
eel.moboData=0x0
eel.pciinfo=0x0
eel.bootData=0xfc040000
dms: bootData=fc040000
WARN: PO due to buddy hard reset
MCCAGENT: NODE_NOT_READY
MCCAGENT- 10/06/2016-10:37:56 node-153 is NODE_NOT_READY
scrub start: 2016-10-06 10:37:56 UTC
dms: sysinfo_version is 5 (0x5)
** Record 0 (address 0x112DD104) **
next 00000000 length 14
type 2 version 0x12
who 1 (bootblock)
chksum 0x0C
-----------------------
Reset data
cpld_stat1 0xD0 cpld_stat2 0x6F
cpld_stat3 0x91 clamp_status 0x01
eTime out: HW Dog process is exiting id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800

The messages clearly point to a mismatch between the CRC value that was expected and the value that was read (Exp=0x5556681a, read=0x55566800). This is a clear indication of EEPROM corruption on the Slammer midplane. The EEPROM is located on the midplane, not on the Slammer motherboard.
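The check described above can be run over a captured console log. The following is a minimal sketch, not an Oracle tool: the function names are hypothetical, and it only assumes that the log lines have the same form as the `id_eeprom_rd:CRC:` and `dms: midplane eeprom rd fail` messages shown in the example.

```python
import re
from collections import Counter

# Matches messages such as: id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800
# The pattern is searched anywhere in the text because console output can be
# interleaved with other messages on the same line.
CRC_LINE = re.compile(
    r"id_eeprom_rd:CRC:\s*Exp=(0x[0-9a-fA-F]+),\s*read=(0x[0-9a-fA-F]+)"
)


def find_crc_mismatches(log_text):
    """Count (expected, read) CRC pairs in the log whose values differ."""
    mismatches = Counter()
    for match in CRC_LINE.finditer(log_text):
        expected, read = (int(value, 16) for value in match.groups())
        if expected != read:
            mismatches[(expected, read)] += 1
    return mismatches


def midplane_read_failed(log_text):
    """Return True if the log reports a midplane EEPROM read failure."""
    return "dms: midplane eeprom rd fail" in log_text


if __name__ == "__main__":
    # Sample lines copied from the console log excerpt in this document.
    sample = (
        "id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800\n"
        "dms: midplane eeprom rd fail\n"
        "id_eeprom_rd:CRC: Exp=0x5556681a, read=0x55566800\n"
    )
    for (expected, read), count in find_crc_mismatches(sample).items():
        print(f"CRC mismatch x{count}: expected {expected:#010x}, read {read:#010x}")
    print("midplane read failure reported:", midplane_read_failed(sample))
```

Repeated mismatches together with the `dms: midplane eeprom rd fail` message match the corruption signature described in this document. A mismatch alone should still be reviewed against the full boot log.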

Solution

To fix the problem, the EEPROM must be re-programmed. An easy way to do this is to set the Length field, which forces the Slammer to recalculate the CRC value and rewrite it to the EEPROM.

Setting the Length field in the EEPROM to 0x100 forces this recalculation of the CRC.
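The idea that writing one field causes the CRC to be regenerated can be illustrated with a generic sketch. This document does not describe the real EEPROM layout or the CRC algorithm the Slammer uses, so everything here is an assumption for illustration only: the record layout (32-bit CRC, 32-bit length, payload), the use of `zlib.crc32`, and all function names are hypothetical.

```python
import struct
import zlib

# Hypothetical record layout: 32-bit CRC, 32-bit length, then the payload.
HEADER = struct.Struct(">II")


def build_record(payload, length):
    """Serialize a record, computing the CRC over the length field and payload."""
    body = struct.pack(">I", length) + payload
    return struct.pack(">I", zlib.crc32(body)) + body


def check_record(record):
    """Return (expected_crc, stored_crc); the two differ for a corrupt record."""
    stored_crc, _length = HEADER.unpack_from(record)
    expected_crc = zlib.crc32(record[4:])
    return expected_crc, stored_crc


def write_length(record, new_length):
    """Rewrite the length field; the write path recomputes and stores the CRC."""
    payload = record[HEADER.size:]
    return build_record(payload, new_length)


if __name__ == "__main__":
    good = build_record(b"illustrative payload", 0x100)

    # Simulate corruption by flipping the low byte of the stored CRC.
    corrupt = bytearray(good)
    corrupt[3] ^= 0xFF
    expected, stored = check_record(bytes(corrupt))
    print(f"before write: Exp={expected:#x}, read={stored:#x}")

    # Writing the length field regenerates the CRC as a side effect.
    repaired = write_length(bytes(corrupt), 0x100)
    expected, stored = check_record(repaired)
    print(f"after write:  Exp={expected:#x}, read={stored:#x}")
```

In this sketch the payload is untouched. Only the stored CRC is brought back in line with the data, which is the effect the Length write is intended to have.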

Follow the steps below to re-program the EEPROM of the Slammer:

1. Attach the serial null modem cable to the serial port on the PIM of the CU.

2. Set the Serial Port on your workstation to:

Speed(Baud) 115,200 bits per second
Data Bits 8
Stop Bits 1
Parity NONE
Flow Control NONE

The required console cable and communication setup details are explained in <Document 1394234.1> Pillar Axiom: Brick & Slammer Serial Console Cables

3. Start your Serial Terminal software and set it to capture all printable characters.

4. Disconnect the AC power cord, wait 30 seconds, then reattach the AC power cord to the power supplies of the Slammer CU, CU1.

5. Watch the serial console output. As the CU begins the boot process, press Enter to get a shell prompt.

6. When you see "microdms: fans and temps OK after 30 seconds in netboot" press Enter.

7. Type "slay microdms" to stop the microdms service so EEPROM can be used to configure the new chassis.

8. Type "eeprom" to start the EEPROM utility and follow the menu. Select item 9 to work on the Midplane. Then select 2 to be able to write to the EEPROM. Then select 5 to set the Length field.

pbash-2.05a# eeprom
ID EEPROM UTILITY MENU, for eeprom format version 3

Select a FRU:
---------------------------------------------
1 - Motherboard
2 - NIM (e.g. SAN/Gige)
3 - PIM (e.g. FCIM)
4 - Battery
5 - Fan 1 (left)
6 - Fan 2 (right)
7 - Power Supply 1 (top)
8 - Power Supply 2 (bottom)
9 - Midplane
10 - Display the eeprom format version this program supports
11 - EXIT program
---------------------------------------------
Enter selection: 9 (Enter "9" for Midplane)

FRU: Midplane
Select what to do
---------------------------------------------
1 - Read ID EEPROM
2 - Write ID EEPROM
---------------------------------------------
Enter selection: 2

FRU: Midplane

Select a field
---------------------------------------------
4 - Format
5 - Length
6 - Revision
7 - Assembly Number
8 - Serial Number
9 - Description
10 - Unit Part Number
11 - Unit Serial Number
12 - WWN/MAC Base Address
13 - Configuration Bytes
14 - Vendor Unique Field
15 - System Serial Number
99 - All Fields
---------------------------------------------
Enter selection: 5
Enter data for the Length field (hex): 0x100
id_eeprom_rd:CRC: Exp=0xe622c2d3, read=0x0

serEEPROM will be written with the following data:
(Zeroes usually mean the field will not be written)
If converting formats the format, length, and crc
fields will also be updated.

ID PROM contents:
CRC: 0xffffffff
Format: 0xffffffff
Length: 0x100
:

9. After the write completes successfully, verify that the EEPROM has been corrected by reading it back. Type "eeprom" to start the EEPROM utility and follow the menu. Select item 9 to work on the Midplane. Then select 1 to read from the EEPROM.
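If the serial session is being captured to a file, the read-back from step 9 can be checked with a short script. This is a minimal sketch under stated assumptions: it assumes the read-back prints fields in the same `Field: value` form as the "ID PROM contents" listing in step 8, and that the Length field should read 0x100 after the write. The function names are hypothetical and are not part of the eeprom utility.

```python
import re

# A field line looks like "Length: 0x100" in the "ID PROM contents" listing.
FIELD_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z /]*?):\s*(0x[0-9a-fA-F]+)\s*$")
# A CRC check message looks like "id_eeprom_rd:CRC: Exp=0x..., read=0x...".
CRC_CHECK = re.compile(
    r"id_eeprom_rd:CRC:\s*Exp=(0x[0-9a-fA-F]+),\s*read=(0x[0-9a-fA-F]+)"
)


def parse_fields(text):
    """Collect 'Name: 0xVALUE' lines into a dict of integer values."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_LINE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def readback_problems(text, expected_length=0x100):
    """Return a list of human-readable problems found in the read-back text."""
    problems = []
    for expected, read in CRC_CHECK.findall(text):
        if int(expected, 16) != int(read, 16):
            problems.append(f"CRC mismatch: expected {expected}, read {read}")
    length = parse_fields(text).get("Length")
    if length != expected_length:
        problems.append(f"Length field is {length!r}, expected {expected_length:#x}")
    return problems
```

An empty list from `readback_problems(captured_text)` suggests the read-back shows no CRC mismatch and the expected Length value. Any entries in the list should be reviewed against the console output before continuing.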

Then follow the standard Slammer CU recovery steps: clear the failure history of the CU and power-cycle it so that it boots correctly.

If you need more information about the eeprom utility, please refer to <Document 1389619.1> Pillar Axiom: Slammer WWN Cloning Explained
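For engineers who prefer to script the console capture from steps 2 and 3 instead of using a terminal program, the serial settings can be expressed as follows. This is a sketch only: it assumes the third-party pyserial package is installed, and the port name and log file name in the usage comment are hypothetical examples.

```python
# Serial settings from step 2, expressed as keyword arguments for
# pyserial's serial.Serial: 115,200 baud, 8 data bits, 1 stop bit,
# no parity, no flow control.
SERIAL_SETTINGS = {
    "baudrate": 115200,
    "bytesize": 8,
    "parity": "N",
    "stopbits": 1,
    "xonxoff": False,  # no software flow control
    "rtscts": False,   # no hardware (RTS/CTS) flow control
    "dsrdtr": False,   # no hardware (DSR/DTR) flow control
}


def capture_console(port, log_path, settings=SERIAL_SETTINGS):
    """Append everything received on the serial port to log_path until Ctrl-C."""
    import serial  # third-party pyserial package (assumed to be installed)

    with serial.Serial(port, timeout=1, **settings) as console, \
            open(log_path, "ab") as log:
        try:
            while True:
                data = console.read(4096)
                if data:
                    log.write(data)
                    log.flush()
        except KeyboardInterrupt:
            pass  # stop capturing when the operator presses Ctrl-C


# Example usage (hypothetical port and file names):
#   capture_console("/dev/ttyUSB0", "slammer_cu_console.log")
```

This sketch only records output. The interactive steps (pressing Enter, running "slay microdms" and "eeprom") still need a terminal session attached to the same console.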

References

<BUG:24813410> - ASSIST TSC- SLAMMER1.CU1 EEPROM RD FAIL
<NOTE:1389619.1> - Pillar Axiom: Slammer WWN Cloning Explained
<NOTE:1394234.1> - Pillar Axiom: Brick & Slammer Serial Console Cables

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.