Asset ID: 1-79-1546217.1
Update Date: 2017-05-09
Solution Type: Predictive Self-Healing Sure Solution
1546217.1: Netra CT900 component hot-swap state reference
Related Categories:
- PLA-Support>Sun Systems>SPARC>Usx/Blade/Netra>SN-SPARC: Netra Cxxxx
Applies to:
Sun Netra CT900 Server - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Purpose
Hot-swap states are defined by the ATCA specification. This document explains some of the hot-swap messages that appear in the SEL or in the ShMM syslog.
Details
The CT900 FRU Hot-swap states are defined as:
State | Summary | Explanation
M0 | Not Installed | FRU IPMC is not reachable. All power to FRU is off. Blue LED is off.
M1 | FRU Inactive | FRU is installed and its IPMC is in communication with ShMM. Blue LED is on solid. FRU is not powered up, and none of its connectivity is active. Next state is either M0 or M2.
M2 | FRU Activation Request | FRU IPMC is waiting for activation permission from ShMM. Blue LED has a long blink. FRU removal is not safe. Next state is M3.
M3 | Activation in Progress | FRU's IPMC requests power allocation from ShMM. Blue LED is off. FRU changes to state M4 when activation is complete.
M4 | FRU Active | This is the normal FRU operational state. FRU is powered on and cannot be removed safely. Blue LED is off. Next state is either M5 or M6.
M5 | FRU Deactivation Request | FRU's IPMC is requesting deactivation permission from ShMM. Blue LED shows a short blink. Next state is M6.
M6 | FRU Deactivation in Progress | FRU is shutting down and its I/O connections are being deactivated. Blue LED continues its short blink. Next state is M1.
M7 | Communication Lost | ShMM has lost contact with board IPMC, or board IPMC has lost contact with its own FRUs. This is an abnormal state. Board should return to its previous state when IPMC communication is reestablished.
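The state table above can be captured as a small lookup for checking transitions reported in the SEL. This is only a sketch: the names `HOTSWAP_STATES` and `classify_transition` are hypothetical, the next-state lists are transcribed from the Explanation column, and the M0 to M1 entry is an assumption taken from the insertion sequence in the SEL example in this document rather than from the table.

```python
# Hot-swap states: summary plus the next states named in the table.
# Assumption: M0 -> M1 is not stated in the table; it is taken from the
# SEL insertion example in this document. M7 lists no successor because
# the FRU is expected to return to whatever state it was in before.
HOTSWAP_STATES = {
    "M0": ("Not Installed", ("M1",)),
    "M1": ("FRU Inactive", ("M0", "M2")),
    "M2": ("FRU Activation Request", ("M3",)),
    "M3": ("Activation in Progress", ("M4",)),
    "M4": ("FRU Active", ("M5", "M6")),
    "M5": ("FRU Deactivation Request", ("M6",)),
    "M6": ("FRU Deactivation in Progress", ("M1",)),
    "M7": ("Communication Lost", ()),
}


def classify_transition(old: str, new: str) -> str:
    """Classify a reported transition against the table.

    Returns "communication" when M7 is involved (contact lost or restored),
    "listed" when the table names `new` as a next state of `old`,
    and "unlisted" otherwise.
    """
    if old not in HOTSWAP_STATES or new not in HOTSWAP_STATES:
        raise ValueError(f"unknown hot-swap state: {old!r} -> {new!r}")
    if "M7" in (old, new):
        return "communication"
    if new in HOTSWAP_STATES[old][1]:
        return "listed"
    return "unlisted"
```

An "unlisted" result does not by itself indicate a fault; it only means the transition is not one of the next states named in the table.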
In the SEL (System Event Log), transitions such as the following are seen:
0x00E5: Event: at Apr 16 15:17:14 2013; from:(0x8e,0,0); sensor:(0xf0,1); event:0x6f(asserted): HotSwap: FRU 1 M0->M1, Cause=0x0
0x00E7: Event: at Apr 16 15:17:16 2013; from:(0x8e,0,0); sensor:(0xf0,1); event:0x6f(asserted): HotSwap: FRU 1 M1->M2, Cause=0x2
0x00E9: Event: at Apr 16 15:17:18 2013; from:(0x8e,0,0); sensor:(0xf0,1); event:0x6f(asserted): HotSwap: FRU 1 M2->M3, Cause=0x1
0x00EB: Event: at Apr 16 15:17:19 2013; from:(0x8e,0,0); sensor:(0xf0,1); event:0x6f(asserted): HotSwap: FRU 1 M3->M4, Cause=0x0
This shows a blade (0x8e, slot 4) coming up from the power-off state (M0) to the power-on state (M4). These are normal transitions. It is also possible for the state to go directly from M4 to M7:
0x0223: Event: at Apr 16 15:40:15 2013; from:(0x10,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M4->M7, Cause=0x4
This means the component (0x10, shm1, the top ShMM) did not respond to polling by the active ShMM, and the poll timed out. It recovers once the IPMB is less busy and the component starts responding to polling again:
0x028C: Event: at Apr 16 15:40:33 2013; from:(0x10,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M7->M4, Cause=0x4
NOTE: To be meaningful, any analysis of the SEL needs to group the entries that come from the same component (the same "from" field).
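The grouping described in the note can be sketched in Python as follows. The regular expression is an assumption derived only from the six SEL lines quoted in this document, so other firmware versions may format entries differently; the names `SEL_HOTSWAP` and `group_hotswap_events` are hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the SEL lines quoted in this document (assumption).
SEL_HOTSWAP = re.compile(
    r"^(?P<record>0x[0-9A-Fa-f]+): Event: at (?P<time>[^;]+); "
    r"from:\((?P<source>0x[0-9A-Fa-f]+),[^)]*\); .*?"
    r"HotSwap: FRU (?P<fru>\d+) (?P<old>M[0-7])->(?P<new>M[0-7]), "
    r"Cause=(?P<cause>0x[0-9A-Fa-f]+)"
)


def group_hotswap_events(lines):
    """Group hot-swap events by the address in the "from" field.

    Returns {source_address: [(time, fru, old_state, new_state, cause), ...]}
    with events kept in the order they appear in the log.
    """
    grouped = defaultdict(list)
    for line in lines:
        match = SEL_HOTSWAP.match(line.strip())
        if match is None:
            continue  # not a hot-swap entry in the format shown above
        source = int(match["source"], 16)
        grouped[source].append((
            match["time"],
            int(match["fru"]),
            match["old"],
            match["new"],
            int(match["cause"], 16),
        ))
    return dict(grouped)


# The SEL entries quoted in this document.
SAMPLE_SEL = """\
0x00E5: Event: at Apr 16 15:17:14 2013; from:(0x8e,0,0); sensor:(0xf0,1); event:0x6f(asserted): HotSwap: FRU 1 M0->M1, Cause=0x0
0x00E7: Event: at Apr 16 15:17:16 2013; from:(0x8e,0,0); sensor:(0xf0,1); event:0x6f(asserted): HotSwap: FRU 1 M1->M2, Cause=0x2
0x00E9: Event: at Apr 16 15:17:18 2013; from:(0x8e,0,0); sensor:(0xf0,1); event:0x6f(asserted): HotSwap: FRU 1 M2->M3, Cause=0x1
0x00EB: Event: at Apr 16 15:17:19 2013; from:(0x8e,0,0); sensor:(0xf0,1); event:0x6f(asserted): HotSwap: FRU 1 M3->M4, Cause=0x0
0x0223: Event: at Apr 16 15:40:15 2013; from:(0x10,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M4->M7, Cause=0x4
0x028C: Event: at Apr 16 15:40:33 2013; from:(0x10,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M7->M4, Cause=0x4
"""

events = group_hotswap_events(SAMPLE_SEL.splitlines())
```

With the sample above, `events` has two keys: `0x8e` holds the four-step M0 to M4 activation sequence, and `0x10` holds the M4 to M7 excursion and its recovery.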
Physical Slot | Shm1 | Shm2 | Shelf | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14
Logical Slot | ----- | ----- | ----- | 13 | 11 | 9 | 7 | 5 | 3 | 1 | 2 | 4 | 6 | 8 | 10 | 12 | 14
SW Blade: Base | ----- | ----- | ----- | 0/13 | 0/11 | 0/9 | 0/7 | 0/5 | 0/3 | --- | --- | 0/4 | 0/6 | 0/8 | 0/10 | 0/12 | 0/14
SW Blade: Extended | ----- | ----- | ----- | 0/12 | 0/10 | 0/8 | 0/6 | 0/4 | 0/2 | --- | --- | 0/3 | 0/5 | 0/7 | 0/9 | 0/11 | 0/13
IPMB Address | 10 | 12 | 20 | 9a | 96 | 92 | 8e | 8a | 86 | 82 | 84 | 88 | 8c | 90 | 94 | 98 | 9c
HW Address | 08 | 09 | 10 | 4d | 4b | 49 | 47 | 45 | 43 | 41 | 42 | 44 | 46 | 48 | 4a | 4c | 4e
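The address rows in the table follow a regular pattern that can be sketched in code: every IPMB address shown is the hardware address shifted left by one bit, and each blade's hardware address is 0x40 plus its logical slot number. The helper names below are hypothetical, and the physical-to-logical map is transcribed from the table rather than taken from any CT900 API.

```python
# Physical slot -> logical slot, transcribed from the table above.
PHYSICAL_TO_LOGICAL = {
    1: 13, 2: 11, 3: 9, 4: 7, 5: 5, 6: 3, 7: 1,
    8: 2, 9: 4, 10: 6, 11: 8, 12: 10, 13: 12, 14: 14,
}


def hw_to_ipmb(hw_address: int) -> int:
    """IPMB address is the 7-bit hardware address shifted left one bit."""
    return (hw_address << 1) & 0xFF


def physical_slot_to_ipmb(physical_slot: int) -> int:
    """IPMB address of the blade in a physical slot (1 to 14)."""
    logical_slot = PHYSICAL_TO_LOGICAL[physical_slot]
    # Pattern observed in the table: blade HW address = 0x40 + logical slot.
    return hw_to_ipmb(0x40 + logical_slot)


def ipmb_to_physical_slot(ipmb_address: int) -> int:
    """Reverse lookup: physical slot for a blade IPMB address."""
    for physical_slot in PHYSICAL_TO_LOGICAL:
        if physical_slot_to_ipmb(physical_slot) == ipmb_address:
            return physical_slot
    raise KeyError(f"no blade slot with IPMB address {ipmb_address:#04x}")
```

For example, `ipmb_to_physical_slot(0x8e)` gives physical slot 4, which matches the blade in the SEL example, and `hw_to_ipmb(0x08)` gives 0x10, the address of shm1.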
References
<NOTE:1546216.1> - Netra CT900 IPMB address references