Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type Predictive Self-Healing Sure Solution 1437249.1 : CT900 Troubleshooting & Data Collection Cheat Sheet
Oracle Confidential PARTNER - Available to partners (SUN).
Applies to:
Sun Netra CT900 Server - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.
Purpose
Troubleshooting Cheat Sheet for Sun Netra CT-900 platform issues
Scope
Troubleshooting basics and how to collect data for additional problem analysis
Details
Acronyms:
IPMB Address Table
This table converts Physical Slot numbers to ShMM IPMB Address and SW status LED
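As a rough illustration of how such a conversion can be used, the Python sketch below keeps only the slot-to-address pairs that appear in the example command outputs of this document and returns the address prefix that clia prints for a blade slot. The dictionary and function names are hypothetical, and the mapping is deliberately incomplete: slots that are not shown anywhere in this document are left out rather than guessed.

```python
# Slot -> IPMB address pairs observed in the example outputs of this document.
# This is NOT the complete table; unlisted slots are intentionally missing.
OBSERVED_BOARD_ADDR = {1: 0x9A, 3: 0x92, 4: 0x8E, 6: 0x86, 7: 0x82, 14: 0x9C}

# Non-blade addresses mentioned in this document.
OBSERVED_OTHER_ADDR = {"shm 1": 0x10, "shm 2": 0x12, "chassis": 0x20}


def board_prefix(slot):
    """Return the address prefix clia prints for a blade slot, or None if unknown."""
    addr = OBSERVED_BOARD_ADDR.get(slot)
    return None if addr is None else format(addr, "02x")


print(board_prefix(7))   # prefix seen in "clia sensor board 7" output
print(board_prefix(2))   # slot not covered by the examples
```

A lookup like this is mainly useful when reading a debug.log offline, where entries are labelled by address rather than by slot number.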
Where to obtain Data?
Data Collection:
First thing to try when a problem occurs with a ShMM, blade or RTM/ARTM: re-seat the component, or move it (blade or RTM/ARTM) to a different available slot, within a 1 to 3 minute interval. In the case of the ShMM, fail over to the other ShMM and retry. The customer should also provide a full description of all LED activity when the problem occurs. For example: "As a blade comes up, when power is applied, all LEDs light up at least once, then the BLUE (hot-swap) LED blinks at a slow rate, and then the Green (OK) LED lights up solid as the Blue LED goes off, and then the blade is at OBP"
Notes:
In addition to the above outputs/files, it would also be desirable if the customer provided the following information:
Various ShMM Commands to help debug an issue:
FRU Information
From the ShMM console: Use the [clia] fruinfo command to obtain the part number and serial number of a component.

# clia help fruinfo
Display the FRU Info of the dedicated FRU in the readable format
instead of <addr> <fru_id> user may use:
    power_supply <N> (valid in 2.x systems only)
    fan_tray <N>
    board <N>
    shm <N>
to access the FRU on the specified board
    fruinfo board 14
    fruinfo power_supply 4
    fruinfo <add> <fru_id>

Examples:

# clia fruinfo shm 1
10: FRU # 0, FRU Info
Common Header: Format Version = 1
Internal Use Area:
    Version = 1
Board Info Area:
    Version = 1
    Language Code = 25
    Mfg Date/Time = Oct 15 13:00:00 2007 (6199980 minutes since 1996)
    Board Manufacturer = Schroff GmbH
    Board Product Name = ACBIV Rad Split USB
    Board Serial Number = 1480701210
    Board Part Number = 21596-247
    FRU Programmer File ID = 21596247ABBIN.bin
Product Info Area:
    Version = 1
    Language Code = 25
    Manufacturer Name = Sun Microsystems, Inc.
    Product Name = NETRA,CT900,SHELF_MGR
    Product Part / Model# = 371-3037-01
    Product Version = 50
    Product Serial Number = 1005SCH-0748AX1224
    Asset Tag = 0000000000000001
    FRU Programmer File ID = 21596247ABBIN.bin
Multi Record Area:
    PICMG Board Point-to-Point Connectivity Record (ID=0x14)
        Version = 0

# clia fruinfo 20 3 <---- clia fruinfo fan_tray 1
20: FRU # 3, FRU Info
Common Header: Format Version = 1
Board Info Area:
    Version = 1
    Language Code = 25
    Mfg Date/Time = Dec 21 00:00:00 2005 (5244480 minutes since 1996)
    Board Manufacturer = Schroff
    Board Product Name = Fan Tray Controller
    Board Serial Number = 0000001
    Board Part Number = 23098-533
    FRU Programmer File ID =
Product Info Area:
    Version = 1
    Language Code = 25
    Manufacturer Name = Schroff
    Product Name = Fan Tray
    Product Part / Model# = 21594-189
    Product Version = Rev. 1.00
    Product Serial Number = 0000001
    Asset Tag =
    FRU Programmer File ID = /var/nvdata/Schroff_21594189_AA.inf

# clia fruinfo board 3 <
Pigeon Point Shelf Manager Command Line Interpreter
92: FRU # 0, FRU Info
Common Header: Format Version = 1
Board Info Area:
    Version = 1
    Language Code = 25
    Mfg Date/Time = Nov 14 11:59:00 2002 (3613679 minutes since 1996)
    Board Manufacturer = Sun Microsystems, Inc.
    Board Product Name = Netra CP3060
    Board Serial Number = WJ009D
    Board Part Number = 50176570152
    FRU Programmer File ID = 520-3967.fru-info.inf
Product Info Area:
    Version = 1
    Language Code = 25
    Manufacturer Name = Sun Microsystems, Inc.
    Product Name = Netra CP3060
    Product Part / Model# = 50176570152 <---- 501-7657-01, REV 52
    Product Version = 2007.05.03.v1.1
    Product Serial Number = WJ009D <
    Asset Tag =
    FRU Programmer File ID = 520-3967.fru-info.inf
Multi Record Area:
    PICMG Board Point-to-Point Connectivity Record (ID=0x14)
        Version = 0
    AMC Carrier Information Table Record (ID=0x1a)
        Version = 0
    AMC Carrier Activation and Current Management Record (ID=0x17)
        Version = 0
    AMC Carrier Point-to-Point Connectivity Record (ID=0x18)
        Version = 0
    AMC Point-to-Point Connectivity Record (ID=0x19)
        Version = 0
    AMC Point-to-Point Connectivity Record (ID=0x19)
        Version = 0

LED Status
ShMM Console: Use the [clia] getfruledstate command to obtain the LED status of all chassis components.

# clia help getfruledstate
Returns the state of the FRU's LED(s)
instead of <addr> <fru_id> user may use:
    board <N>
    shm <N>
If the <LedId> parameter is not specified, all the LEDs related to the specified FRU are queried;
Otherwise that specified LED is queried only
If -v option is specified, additional information about LED(s) properties will be printed
Examples:
    getfruledstate 20 4 1
    getfruledstate 20 4
    getfruledstate 20
    getfruledstate [-v] [<addr> [<fru_id> [<LedId>|ALL]]]
Note: 20 (0x20) is IPMB Address of chassis itself
Note: Use [clia] fru 20 to obtain FRU list of chassis

# clia getfruledstate board 1 <--- For blades: getfruledstate board <slot #>
9a: FRU # 0, Led # 0 ("BLUE LED"):
    Local Control LED State: LED OFF
9a: FRU # 0, Led # 1 ("LED 1"):
    Local Control LED State: LED OFF
9a: FRU # 0, Led # 2 ("LED 2"):
    Local Control LED State: LED ON, color: GREEN

# clia getfruledstate 20 3 <--- For FT: getfruledstate 20 <3|4|5>
20: FRU # 3, Led # 0 ("BLUE LED"):
    Local Control LED State: LED OFF
20: FRU # 3, Led # 1 ("LED 1"):
    Local Control LED State: LED OFF
20: FRU # 3, Led # 2 ("LED 2"):
    Local Control LED State: LED ON, color: GREEN

Sensor Information (Sensor Number, Voltage/Temperature Reading, etc.)
ShMM Console: Use the [clia] sensor command to obtain the list of sensors of a particular chassis component. Note that each firmware release (R version) has its own sensor number to sensor name association, so the following example sensor lists (taken from R3U2-RR) are not fixed. If the customer uses sensor numbers in scripts, the recommendation is to use sensor names instead.

# clia help sensor
Shows sensor information
instead of <addr> user may use:
    board <N>
    shm <N>
to access the sensor on the specified board
    sensor board 21 "IPMB LINK"
    sensor 20 8
    sensor [ <addr> [ [ lun: ]<sensor id> | <sensor name> ] ]

Example list of chassis sensors:
# clia sensor 20 | grep Sensor <---- 20 (0x20) is the IPMB Address of chassis itself
20: LUN: 0, Sensor # 0 ("FRU 0 HOT_SWAP") 20: LUN: 0, Sensor # 2 ("FRU 1 HOT_SWAP") 20: LUN: 0, Sensor # 3 ("FRU 2 HOT_SWAP") 20: LUN: 0, Sensor # 4 ("FRU 8 HOT_SWAP") 20: LUN: 0, Sensor # 5 ("FRU 3 HOT_SWAP") 20: LUN: 0, Sensor # 6 ("FRU 4 HOT_SWAP") 20: LUN: 0, Sensor # 7 ("FRU 5 HOT_SWAP") 20: LUN: 0, Sensor # 8 ("FRU 6 HOT_SWAP") 20: LUN: 0, Sensor # 9 ("FRU 7 HOT_SWAP") 20: LUN: 0, Sensor # 10 ("IPMB LINK 1") 20: LUN: 0, Sensor # 11 ("IPMB LINK 2") 20: LUN: 0, Sensor # 12 ("Fan Tray 0") 20: LUN: 0, Sensor # 13 ("Fan Tray 1") 20: LUN: 0, Sensor # 14 ("Fan Tray 2") 20: LUN: 0, Sensor # 15 ("IPMB LINK 3") 20: LUN: 0, Sensor # 16 ("IPMB LINK 4") 20: LUN: 0, Sensor # 17 ("IPMB LINK 5") 20: LUN: 0, Sensor # 18 ("IPMB LINK 6") 20: LUN: 0, Sensor # 19 ("IPMB LINK 7") 20: LUN: 0, Sensor # 20 ("IPMB LINK 8") 20: LUN: 0, Sensor # 21 ("IPMB LINK 9") 20: LUN: 0, Sensor # 22 ("IPMB LINK 10") 20: LUN: 0, Sensor # 23 ("IPMB LINK 11") 20: LUN: 0, Sensor # 24 ("IPMB LINK 12") 20: LUN: 0, Sensor # 25 ("IPMB LINK 13") 20: LUN: 0, Sensor # 26 ("IPMB LINK 14") 20: LUN: 0, Sensor # 27 ("IPMB LINK 15") 20: LUN: 0, Sensor # 120 ("Center Exhaust") 20: LUN: 0, Sensor # 121 ("Left Exhaust") 20: LUN: 0, Sensor # 122 ("Right Exhaust") 20: LUN: 0, Sensor # 123 ("SAP Temp") 20: LUN: 0, Sensor # 124 ("Temp_In Left") 20: LUN: 0, Sensor # 125 ("Temp_In Center") 20: LUN: 0, Sensor # 126 ("Temp_In Right") 20: LUN: 0, Sensor # 131 ("TELCO Alarms") 20: LUN: 0, Sensor # 132 ("BMC Watchdog") 20: LUN: 0, Sensor # 133 ("SYSTEM EVENT") 20: LUN: 0, Sensor # 150 ("Air Filter") 20: LUN: 0, Sensor # 152 ("SAP") 20: LUN: 0, Sensor # 162 ("PEM A In 2") 20: LUN: 0, Sensor # 163 ("PEM A In 2 Fused") 20: LUN: 0, Sensor # 164 ("PEM A In 1") 20: LUN: 0, Sensor # 165 ("PEM A In 1 Fused") 20: LUN: 0, Sensor # 166 ("PEM A In 4") 20: LUN: 0, Sensor # 167 ("PEM A In 4 Fused") 20: LUN: 0, Sensor # 168 ("PEM A In 3") 20: LUN: 0, Sensor # 169 ("PEM A In 3 Fused") 20: LUN: 0, Sensor # 174 ("PEM B In 2") 20: LUN: 0, Sensor 
# 175 ("PEM B In 2 Fused") 20: LUN: 0, Sensor # 176 ("PEM B In 1") 20: LUN: 0, Sensor # 177 ("PEM B In 1 Fused") 20: LUN: 0, Sensor # 178 ("PEM B In 4") 20: LUN: 0, Sensor # 179 ("PEM B In 4 Fused") 20: LUN: 0, Sensor # 180 ("PEM B In 3") 20: LUN: 0, Sensor # 181 ("PEM B In 3 Fused") 20: LUN: 0, Sensor # 192 ("PEM A") 20: LUN: 0, Sensor # 193 ("PEM B") 20: LUN: 0, Sensor # 194 ("Shelf EEPROM 1") 20: LUN: 0, Sensor # 195 ("Shelf EEPROM 2") 20: LUN: 0, Sensor # 200 ("PEM A Temp") 20: LUN: 0, Sensor # 201 ("PEM B Temp") 20: LUN: 0, Sensor # 208 ("24V FT 0") 20: LUN: 0, Sensor # 209 ("-48A bus FT 0") 20: LUN: 0, Sensor # 210 ("-48A FT 0") 20: LUN: 0, Sensor # 211 ("-48B bus FT 0") 20: LUN: 0, Sensor # 212 ("-48B FT 0") 20: LUN: 0, Sensor # 213 ("-48A FT 0 Fuse") 20: LUN: 0, Sensor # 214 ("-48B FT 0 Fuse") 20: LUN: 0, Sensor # 215 ("24V FT 1") 20: LUN: 0, Sensor # 216 ("-48A bus FT 1") 20: LUN: 0, Sensor # 217 ("-48A FT 1") 20: LUN: 0, Sensor # 218 ("-48B bus FT 1") 20: LUN: 0, Sensor # 219 ("-48B FT 1") 20: LUN: 0, Sensor # 220 ("-48A FT 1 Fuse") 20: LUN: 0, Sensor # 221 ("-48B FT 1 Fuse") 20: LUN: 0, Sensor # 222 ("24V FT 2") 20: LUN: 0, Sensor # 223 ("-48A bus FT 2") 20: LUN: 0, Sensor # 224 ("-48A FT 2") 20: LUN: 0, Sensor # 225 ("-48B bus FT 2") 20: LUN: 0, Sensor # 226 ("-48B FT 2") 20: LUN: 0, Sensor # 227 ("-48A FT 2 Fuse") 20: LUN: 0, Sensor # 228 ("-48B FT 2 Fuse") 20: LUN: 0, Sensor # 244 ("3V3_RAD") Example list of ShMM sensors: # clia sensor shm 1 | grep Sensor 10: LUN: 0, Sensor # 0 ("FRU 0 HOT_SWAP") 10: LUN: 0, Sensor # 1 ("IPMB LINK") 10: LUN: 0, Sensor # 2 ("Local Temp") 10: LUN: 0, Sensor # 3 ("3V3_local") 10: LUN: 0, Sensor # 4 ("I2C_PWR_A") 10: LUN: 0, Sensor # 5 ("I2C_PWR_B") 10: LUN: 0, Sensor # 6 ("VBAT") 10: LUN: 0, Sensor # 7 ("Fan Tach. 0") 10: LUN: 0, Sensor # 8 ("Fan Tach. 1") 10: LUN: 0, Sensor # 10 ("Fan Tach. 2") 10: LUN: 0, Sensor # 11 ("Fan Tach. 3") 10: LUN: 0, Sensor # 13 ("Fan Tach. 4") 10: LUN: 0, Sensor # 14 ("Fan Tach. 
5") 10: LUN: 0, Sensor # 15 ("-48A Bus voltage") 10: LUN: 0, Sensor # 16 ("-48B Bus voltage") 10: LUN: 0, Sensor # 17 ("-48A ACB voltage") 10: LUN: 0, Sensor # 18 ("-48B ACB voltage") 10: LUN: 0, Sensor # 19 ("-48A ACB Fuse") 10: LUN: 0, Sensor # 20 ("-48B ACB Fuse") 10: LUN: 0, Sensor # 128 ("CPLD State") Example list of PEM sensors: # clia sensor 20 | grep PEM 20: LUN: 0, Sensor # 162 ("PEM A In 2") 20: LUN: 0, Sensor # 163 ("PEM A In 2 Fused") 20: LUN: 0, Sensor # 164 ("PEM A In 1") 20: LUN: 0, Sensor # 165 ("PEM A In 1 Fused") 20: LUN: 0, Sensor # 166 ("PEM A In 4") 20: LUN: 0, Sensor # 167 ("PEM A In 4 Fused") 20: LUN: 0, Sensor # 168 ("PEM A In 3") 20: LUN: 0, Sensor # 169 ("PEM A In 3 Fused") 20: LUN: 0, Sensor # 174 ("PEM B In 2") 20: LUN: 0, Sensor # 175 ("PEM B In 2 Fused") 20: LUN: 0, Sensor # 176 ("PEM B In 1") 20: LUN: 0, Sensor # 177 ("PEM B In 1 Fused") 20: LUN: 0, Sensor # 178 ("PEM B In 4") 20: LUN: 0, Sensor # 179 ("PEM B In 4 Fused") 20: LUN: 0, Sensor # 180 ("PEM B In 3") 20: LUN: 0, Sensor # 181 ("PEM B In 3 Fused") 20: LUN: 0, Sensor # 192 ("PEM A") 20: LUN: 0, Sensor # 193 ("PEM B") 20: LUN: 0, Sensor # 200 ("PEM A Temp") 20: LUN: 0, Sensor # 201 ("PEM B Temp") Example list of Switch (CP3140) sensors: # clia sensor board 7 | grep Sensor <---- Switch blades are in slots 7 & 8 82: LUN: 0, Sensor # 0 ("FRU 0 HOT_SWAP") 82: LUN: 0, Sensor # 1 ("IPMB LINK") 82: LUN: 0, Sensor # 2 ("-48V ALARM") 82: LUN: 0, Sensor # 3 ("RTM Present") 82: LUN: 0, Sensor # 4 ("OOS LED") 82: LUN: 0, Sensor # 5 ("ACTIVE LED") 82: LUN: 0, Sensor # 6 ("5V") 82: LUN: 0, Sensor # 7 ("3.3V") 82: LUN: 0, Sensor # 8 ("2.5V") 82: LUN: 0, Sensor # 9 ("1.5V") 82: LUN: 0, Sensor # 10 ("1.25V") 82: LUN: 0, Sensor # 11 ("Board Temp1") 82: LUN: 0, Sensor # 12 ("Board Temp2") 82: LUN: 0, Sensor # 13 ("BMC Watchdog") Example list of Switch (CP3240) sensors: # clia sensor board 7 | grep Sensor <---- Switch blade is location at slot 7 & 8 82: LUN: 0, Sensor # 0 ("Hot Swap") 82: LUN: 0, 
Sensor # 2 ("Hot Swap AMC #1") 82: LUN: 0, Sensor # 3 ("Hot Swap AMC #2") 82: LUN: 0, Sensor # 4 ("Hot Swap AMC #3") 82: LUN: 0, Sensor # 5 ("+12.0V") 82: LUN: 0, Sensor # 6 ("+3.3V") 82: LUN: 0, Sensor # 7 ("+2.5V") 82: LUN: 0, Sensor # 8 ("+1.25V") 82: LUN: 0, Sensor # 9 ("IPMB Physical") 82: LUN: 0, Sensor # 10 ("Base CPU Temp") 82: LUN: 0, Sensor # 12 ("RTM Presence") 82: LUN: 0, Sensor # 13 ("Base Early") 82: LUN: 0, Sensor # 14 ("Base Full") 82: LUN: 0, Sensor # 15 ("Base Good") 82: LUN: 0, Sensor # 16 ("Fabric Early") 82: LUN: 0, Sensor # 17 ("Fabric Full") 82: LUN: 0, Sensor # 18 ("Fabric Good") 82: LUN: 0, Sensor # 19 ("BMC Watchdog") 82: LUN: 0, Sensor # 20 ("Fabric CPU Temp") 82: LUN: 0, Sensor # 21 ("+1.5V") 82: LUN: 0, Sensor # 22 ("+1.8V") 82: LUN: 0, Sensor # 23 ("+1.0V") 82: LUN: 0, Sensor # 24 ("+1.2V") 82: LUN: 0, Sensor # 25 ("Site 1 PWR cur") 82: LUN: 0, Sensor # 26 ("Site 1 PWR") 82: LUN: 0, Sensor # 27 ("Site 1 MP") 82: LUN: 0, Sensor # 28 ("Site 2 PWR cur") 82: LUN: 0, Sensor # 29 ("Site 2 PWR") 82: LUN: 0, Sensor # 30 ("Site 2 MP") 82: LUN: 0, Sensor # 31 ("Site 3 PWR cur") 82: LUN: 0, Sensor # 32 ("Site 3 PWR") 82: LUN: 0, Sensor # 33 ("Site 3 MP") 82: LUN: 0, Sensor # 34 ("+3.3V STBY") 82: LUN: 0, Sensor # 35 ("+12V") 82: LUN: 0, Sensor # 36 ("DS75 Temp") 82: LUN: 0, Sensor # 37 ("AD7417 Temp") 82: LUN: 0, Sensor # 38 ("+1.2V") 82: LUN: 0, Sensor # 39 ("+1.8V") 82: LUN: 0, Sensor # 40 ("+3.3V") 82: LUN: 0, Sensor # 41 ("+5V") 82: LUN: 0, Sensor # 42 ("+3.3V STBY") 82: LUN: 0, Sensor # 43 ("+12V") 82: LUN: 0, Sensor # 44 ("DS75 Temp") 82: LUN: 0, Sensor # 45 ("AD7417 Temp") 82: LUN: 0, Sensor # 46 ("+1.2V") 82: LUN: 0, Sensor # 47 ("+1.8V") 82: LUN: 0, Sensor # 48 ("+3.3V") 82: LUN: 0, Sensor # 49 ("+5V") 82: LUN: 0, Sensor # 50 ("+3.3V STBY") 82: LUN: 0, Sensor # 51 ("+12V") 82: LUN: 0, Sensor # 52 ("DS75 Temp") 82: LUN: 0, Sensor # 53 ("AD7417 Temp") 82: LUN: 0, Sensor # 54 ("+1.2V") 82: LUN: 0, Sensor # 55 ("+2.5V") 82: LUN: 0, Sensor # 
56 ("+3.3V") NOTE: Use [clia] board <slot #> to identify the blade in that slot Example of CP3060 Sensors: # clia sensor board 3 | grep Sensor 92: LUN: 0, Sensor # 0 ("FRU 0 Hot Swap") 92: LUN: 0, Sensor # 1 ("RTM Hot Swap") 92: LUN: 0, Sensor # 2 ("HotSwap AMC 0") 92: LUN: 0, Sensor # 3 ("IPMB Physical") 92: LUN: 0, Sensor # 4 ("BMC Watchdog") 92: LUN: 0, Sensor # 5 ("CPU Temp1") 92: LUN: 0, Sensor # 6 ("CPU Temp2") 92: LUN: 0, Sensor # 7 ("Board Temp") 92: LUN: 0, Sensor # 8 ("12.0V") 92: LUN: 0, Sensor # 9 ("5.0V") 92: LUN: 0, Sensor # 10 ("3.3V") 92: LUN: 0, Sensor # 11 ("3.3V STBY") 92: LUN: 0, Sensor # 12 ("2.5V STBY") 92: LUN: 0, Sensor # 13 ("1.0V") 92: LUN: 0, Sensor # 14 ("1.2V CPU") 92: LUN: 0, Sensor # 15 ("1.2V") 92: LUN: 0, Sensor # 16 ("1.5V") 92: LUN: 0, Sensor # 17 ("0.9V VTTL") 92: LUN: 0, Sensor # 18 ("0.9V VTTR") 92: LUN: 0, Sensor # 19 ("1.8V DDR2L") 92: LUN: 0, Sensor # 20 ("1.8V DDR2R") 92: LUN: 0, Sensor # 21 ("2.5V") 92: LUN: 0, Sensor # 22 ("1.2V STBY") 92: LUN: 0, Sensor # 23 ("AMC 12V") 92: LUN: 0, Sensor # 24 ("AMC 3.3V") 92: LUN: 0, Sensor # 25 ("RTM Presence") 92: LUN: 0, Sensor # 26 ("Version change") 92: LUN: 0, Sensor # 27 ("+3.3V") 92: LUN: 0, Sensor # 28 ("+5V") 92: LUN: 0, Sensor # 29 ("+12V") 92: LUN: 0, Sensor # 30 ("LM60 Temp") 92: LUN: 0, Sensor # 31 ("DS75 Temp") 92: LUN: 0, Sensor # 32 ("BMC Watchdog") Example of CP3250 Sensors: # clia sensor board 6 | grep Sensor 86: LUN: 0, Sensor # 0 ("FRU 0 Hot Swap") 86: LUN: 0, Sensor # 3 ("IPMB Physical") 86: LUN: 0, Sensor # 4 ("BMC Watchdog") 86: LUN: 0, Sensor # 5 ("12.0V") 86: LUN: 0, Sensor # 6 ("5.0V") 86: LUN: 0, Sensor # 7 ("3.3V") 86: LUN: 0, Sensor # 8 ("3.3V STBY") 86: LUN: 0, Sensor # 9 ("SuperCAP voltage") 86: LUN: 0, Sensor # 10 ("1.2V NTune") 86: LUN: 0, Sensor # 11 ("CPU VTT") 86: LUN: 0, Sensor # 12 ("1.5 V") 86: LUN: 0, Sensor # 13 ("1.8 V") 86: LUN: 0, Sensor # 14 ("DDR2 VTT") 86: LUN: 0, Sensor # 15 ("1.05 V Core") 86: LUN: 0, Sensor # 16 ("1.5 V NTune") 86: LUN: 
0, Sensor # 17 ("VCC CPU1") 86: LUN: 0, Sensor # 18 ("VCC CPU0") 86: LUN: 0, Sensor # 19 ("Inlet 1 Temp Sen") 86: LUN: 0, Sensor # 20 ("Inlet 3 Temp Sen") 86: LUN: 0, Sensor # 21 ("Inlet 2 Temp Sen") 86: LUN: 0, Sensor # 22 ("MCH Temp Sensor") 86: LUN: 0, Sensor # 23 ("CPU_TEMP_SK0D0") 86: LUN: 0, Sensor # 24 ("CPU_TEMP_SK0D1") 86: LUN: 0, Sensor # 25 ("CPU_TEMP_SK1DO") 86: LUN: 0, Sensor # 26 ("CPU_TEMP_SK1D1") 86: LUN: 0, Sensor # 27 ("Version change") 86: LUN: 0, Sensor # 28 ("System Event") 86: LUN: 0, Sensor # 29 ("CPU 0 presence") 86: LUN: 0, Sensor # 30 ("CPU 1 presence") 86: LUN: 0, Sensor # 31 ("P48V Alarm") 86: LUN: 0, Sensor # 32 ("Sys fw progress") 86: LUN: 0, Sensor # 33 ("Graceful reboot")

# clia sensor board 4 | grep Sensor (this is for a CP3270)
8e: LUN: 0, Sensor # 0 ("FRU 0 Hot Swap")

Once the sensor is identified, use the clia sensordata command to obtain the details of that particular sensor.

# clia help sensordata
Shows sensor data
instead of <addr> user may use:
    board <N>
    shm <N>
to access the sensor on the specified board
(only sensors with thresholds crossed if -t is given)
    sensordata board 21 "IPMB LINK"
    sensordata 20 8
    sensordata [-t] [ <addr> [ [ lun: ]<sensor id> | <sensor name> ] ]

To check the thresholds of a sensor, use the [clia] getthreshold command; to change them, use the [clia] setthreshold command.

# clia help getthreshold
Shows the threshold of the specified sensor
instead of <addr> user may use:
    board <N>
    shm <N>
to access the sensor on the specified board
    getthreshold board 21 "IPMB LINK"
    getthreshold 20 8
    getthreshold [ <addr> [ [ lun: ]<sensor id> | <sensor name> ] ]

# clia help setthreshold
Set the specified threshold of the dedicated sensor
    unc - Upper Non Critical
    uc - Upper Critical
    unr - Upper Non Recoverable
    lnc - Lower Non Critical
    lc - Lower Critical
    lnr - Lower Non Recoverable
instead of <addr> user may use:
    board <N>
    shm <N>
to access the sensor on the specified board
"-r <value>" considers <value> as unsigned byte
just "<value>" considers as the floating point number
    setthreshold board 21 "IPMB LINK" unc -r 34
    setthreshold 20 8 lc -45.67
    setthreshold <addr> [ lun: ]<sensor_id> | <sensor name> unc | uc | unr | lnc | lc | lnr [-r] value

This is how the threshold levels are defined:
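To make the six threshold names from the setthreshold help more concrete, the following Python sketch classifies a processed reading against whichever thresholds a sensor defines. The function and variable names are hypothetical. The sample values are the CPU Temp1 thresholds (80, 90 and 102 degrees C) and reading (54 degrees C) from the temperature example in this document. The sketch assumes the usual ordering lnr <= lc <= lnc < unc <= uc <= unr and treats a reading equal to a threshold as having crossed it; that comparison rule is an assumption, not documented ShMM behavior.

```python
def classify(reading, thresholds):
    """Return the most severe threshold crossed, or 'ok' if none is crossed.

    thresholds maps any of unc/uc/unr/lnc/lc/lnr to a processed value;
    thresholds a sensor does not define are simply omitted.
    """
    # Upper thresholds, most severe first, so the worst crossed level wins.
    for name in ("unr", "uc", "unc"):
        limit = thresholds.get(name)
        if limit is not None and reading >= limit:
            return name
    # Lower thresholds, most severe first.
    for name in ("lnr", "lc", "lnc"):
        limit = thresholds.get(name)
        if limit is not None and reading <= limit:
            return name
    return "ok"


# Thresholds reported by "clia getthreshold board 3 5" in this document.
cpu_temp1 = {"unc": 80.0, "uc": 90.0, "unr": 102.0}

print(classify(54.0, cpu_temp1))  # reading from the sensordata example
print(classify(95.0, cpu_temp1))  # a made-up reading above Upper Critical
```

Checking the most severe level first keeps the function short and means a sensor that only defines upper thresholds, as CPU Temp1 does, needs no special handling.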
Temperature Sensor Example:

# clia sensordata board 3 5
92: LUN: 0, Sensor # 5 ("CPU Temp1")
    Type: Threshold (0x01), "Temperature" (0x01)
    Status: 0xc0
        All event messages enabled from this sensor
        Sensor scanning enabled
        Initial update completed
    Raw data: 54 (0x36)
    Processed data: 54.000000 degrees C <---- This is the sensor reading
    Status: 0x00

# clia getthreshold board 3 5
92: LUN: 0, Sensor # 5 ("CPU Temp1")
    Type: Threshold (0x01), "Temperature" (0x01)
    Upper Non-Critical Threshold, Raw Data: 0x50
        Processed data: 80.000000 degrees C
    Upper Critical Threshold, Raw Data: 0x5a
        Processed data: 90.000000 degrees C
    Upper Non-Recoverable Threshold, Raw Data: 0x66
        Processed data: 102.000000 degrees C

Voltage Example:

# clia sensordata board 3 14
92: LUN: 0, Sensor # 14 ("1.2V CPU")
    Type: Threshold (0x01), "Voltage" (0x02)
    Status: 0xc0
        All event messages enabled from this sensor
        Sensor scanning enabled
        Initial update completed
    Raw data: 124 (0x7c)
    Processed data: 1.215200 Volts <---- This is the sensor reading
    Status: 0x00

# clia getthreshold board 3 14
92: LUN: 0, Sensor # 14 ("1.2V CPU")
    Type: Threshold (0x01), "Voltage" (0x02)
    Lower Non-Critical Threshold, Raw Data: 0x75
        Processed data: 1.146600 Volts
    Lower Critical Threshold, Raw Data: 0x72
        Processed data: 1.117200 Volts
    Lower Non-Recoverable Threshold, Raw Data: 0x6e
        Processed data: 1.078000 Volts
    Upper Non-Critical Threshold, Raw Data: 0x81
        Processed data: 1.264200 Volts
    Upper Critical Threshold, Raw Data: 0x84
        Processed data: 1.293600 Volts
    Upper Non-Recoverable Threshold, Raw Data: 0x88
        Processed data: 1.332800 Volts

Component Absent/Present Example:

# clia sensor 20 | grep Air
20: LUN: 0, Sensor # 150 ("Air Filter")

# clia sensordata 20 150
20: LUN: 0, Sensor # 150 ("Air Filter")
    Type: Discrete (0x6f), "Entity Presence" (0x25)
    Status: 0xc0
        All event messages enabled from this sensor
        Sensor scanning enabled
        Initial update completed
    Sensor reading: 0x00
    Current State Mask 0x0001
        Entity Present <---- The Air Filter is in place

Switch Commands:
Switch Console: use the following commands to report the configuration of the switch. If additional information is needed, refer to the switch manual for more details on the show commands.
FASTPATH is a mode-based command line interface. The commands in one mode are not available until the operator switches to that particular mode. Entering ? at the CLI prompt displays a list of available commands and descriptions. Pressing TAB at the CLI prompt completes the command, unless what has been typed so far is not yet unique. Exit the current level using the exit command. Many different modes are available; see the chart at the end of the document to determine which mode may be needed. FASTPATH supports multiple users with different security levels. By default, there is one admin user with no password. In the CLI, privileged mode is password-protected separately from the default mode, but it also has no password by default. Below are the most often used modes.
The show command has more arguments/options; use show ? to view the entire list with brief explanations.

CP3240 Examples (CP3140 output is different):

(CP3240H-BEX-Z Base) # show hardware
Switch: 1
System Description ............................. CP3240H-BEX-Z Base
Machine Type ................................... CP3240H-BEX-Z
Machine Model .................................. CP3240H-BEX-Z
Serial Number .................................. 1544DTI-0742330074
FRU Number ..................................... 375-3523-01
Part Number .................................... 375-3523-01
Maintenance Level .............................. A
Manufacturer .................................. 0x34b7
Burned In MAC Address .......................... 00:20:13:F1:0E:6D
Software Version ............................... 1.3.3.0
Operating System ............................... Linux 2.4.20_mvl31
Network Processing Device ...................... BCM56504 REV 1
Additional Packages ............................ FASTPATH QOS
                                                 FASTPATH Multicast
                                                 FASTPATH IPv6

(CP3240H-BEX-Z Base) # show port all
Intf   Type  Admin Mode  Physical Mode  Physical Status  Link Status  Link Trap  LACP Mode  Actor Timeout
------ ----- ----------- -------------- ---------------- ------------ ---------- ---------- -------------
0/1          Enable      Auto           100 Full         U-Up         Enable     Enable     short
0/2          Enable      Auto                            D-Down       Enable     Enable     short
0/3          Enable      Auto           1000 Full        U-Up         Enable     Enable     short
0/4          Enable      Auto                            U-Down       Enable     Enable     short
0/5          Enable      Auto                            D-Down       Enable     Enable     short
0/6          Enable      Auto                            D-Down       Enable     Enable     short
0/7          Enable      Auto                            D-Down       Enable     Enable     short
0/8          Enable      Auto           100 Full         U-Up         Enable     Enable     short
0/9          Enable      Auto           1000 Full        U-Up         Enable     Enable     short
0/10         Enable      Auto           100 Full         U-Up         Enable     Enable     short
0/11         Enable      Auto           1000 Full        U-Up         Enable     Enable     short
0/12         Enable      Auto           1000 Full        U-Up         Enable     Enable     short
0/13         Enable      Auto           1000 Full        U-Up         Enable     Enable     short
0/14         Enable      Auto           1000 Full        U-Up         Enable     Enable     short
0/15         Enable      Auto                            D-Down       Enable     Enable     short
0/16         Enable      Auto                            D-Down       Enable     Enable     short
0/17         Enable      Auto                            Down         Enable     Enable     short
0/18         Enable      Auto           1000 Full        Up           Enable     Enable     short
0/19         Enable      Auto                            Down         Enable     Enable     short
0/20         Enable      Auto                            Down         Enable     Enable     short
0/21         Enable      Auto                            Down         Enable     Enable     short
0/22         Enable      Auto                            Down         Enable     Enable     short
0/23         Enable      Auto                            Down         Enable     Enable     short
0/24         Enable      Auto                            Down         Enable     Enable     short
0/25         Enable      Auto                            Down         Enable     Enable     short
0/26         Enable      10G Full                        Down         Enable     Enable     short
0/27         Enable      10G Full                        Down         Enable     Enable     short

(CP3240H-BEX-Z Base) # show interface 0/18
Packets Received Without Error ................. 74945744
Packets Received With Error .................... 0
Broadcast Packets Received ..................... 72354102
Packets Transmitted Without Errors............. 339215
Transmit Packet Errors ......................... 0
Collision Frames ............................... 0
Time Since Counters Last Cleared ............... 12 day 21 hr 34 min 57 sec

(CP3240H-BEX-Z Base) # ping 10.5.56.1
Send count=3, Receive count=3 from 10.5.56.1

Troubleshooting Tips:
ShMM Console is not responding
ShMM is in a booting cycle
/tmp/debug.log file
M0 : FRU not installed
M1 : FRU is Inactive
M2 : FRU activation request
M3 : FRU activation in progress
M4 : FRU active
M5 : FRU deactivation request
M6 : FRU deactivation in progress
M7 : Communication Lost - abnormal state; ShMM unable to communicate with the FRU

Cause Table:
0x0: Normal State change
0x1: Change commanded by shelf manager
0x2: State change due to operator changing handle switch (latch/delatch)
0x3: State change due to programmatic action
0x4: Communication Lost or Regained
0x5: Communication Lost or Regained - locally detected
0x6: Surprise State change due to extraction
0x7: State Change due to provided information (valid for M7 to M0 transition)
0x8: Invalid Hardware Address detected
0x9: Unexpected Deactivation (valid for M4 to M6 transition)
0xA: Surprise State change due to power failure
0xF: State Change - Cause Unknown - No cause could be determined

The /tmp/debug.log file contains booting information (the /var/log/messages file). Because the ShMM file system resides on flash, booting information is erased with each reboot. Therefore, if the problem is related to ShMM booting, ask the customer to generate a fresh /tmp/debug.log file. The following Q&A shows how to read the /tmp/debug.log file.

Q: How to find the ShMM firmware version?
A: Find the >>>Shelfman Version section;
>>> Shelfman version
Pigeon Point Shelf Manager ver. 2.4.9-R3U2-RR
Pigeon Point is a trademark of Pigeon Point Systems.
Copyright (c) 2002-2007 Pigeon Point Systems
All rights reserved
Build date/time: Mar 27 2009 08:33:42
Carrier: ACB; Subtype: 3; Subversion: 1

Q: How to identify whether the file is from shm1 or shm2?
A: Find the >>>ShMC IPMB Address section;
>>> ShMC IPMB Address
Local IPMB Address = 0x10
0x10 is shm1 (upper), and 0x12 is shm2 (bottom).

Q: How to identify the blades installed in chassis?
A: Check the >>>Board Information section;
>>> Board Information
Physical Slot # 1
9a: Entity: (0xa0, 0x60) Maximum FRU device ID: 0x01
    PICMG Version 2.2
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
9a: FRU # 0
    Entity: (0xa0, 0x60)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "Netra CP3260"
9a: FRU # 1
    Entity: (0xc1, 0x6f)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "CP32X0-RTM-HDD"
...
Physical Slot # 7
82: Entity: (0xa0, 0x60) Maximum FRU device ID: 0x04
    PICMG Version 2.2
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
82: FRU # 0
    Entity: (0xa0, 0x60)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "CP3240H-BEX-Z"
82: FRU # 2
    Entity: (0xc1, 0x65)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "AMC-XFP"
82: FRU # 3
    Entity: (0xc1, 0x66)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "AMC-XFP"
82: FRU # 4
    Entity: (0xc1, 0x67)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "AMC10G-CX4"
...
Physical Slot # 14
9c: Entity: (0xa0, 0x60) Maximum FRU device ID: 0x02
    PICMG Version 2.2
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
9c: FRU # 0
    Entity: (0xa0, 0x60)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "NetraCP-3250"

Q: How to identify chassis components?
A: Check the first part (entries starting with 20:) of the >>>Detailed FRU Information section;
>>> Detailed FRU Information
10: FRU # 0
    Entity: (0xf0, 0x60)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "ShMM-500"
    Site Type: 0x03, Site Number: 01
    Current Power Level: 0x01, Maximum Power Level: 0x01, Current Power Allocation: 20.0 Watts
...
20: FRU # 3
    Entity: (0x1e, 0x60)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device Type: "FRU Inventory Device behind management controller" (0x10), Modifier 0x0
    Device ID String: "Fan Tray 0"
    Site Type: 0x04, Site Number: 01
    Current Power Level: 0x01, Maximum Power Level: 0x01, Current Power Allocation: 200.0 Watts
20: FRU # 4
    Entity: (0x1e, 0x61)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device Type: "FRU Inventory Device behind management controller" (0x10), Modifier 0x0
    Device ID String: "Fan Tray 1"
    Site Type: 0x04, Site Number: 02
    Current Power Level: 0x01, Maximum Power Level: 0x01, Current Power Allocation: 200.0 Watts
20: FRU # 5
    Entity: (0x1e, 0x62)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device Type: "FRU Inventory Device behind management controller" (0x10), Modifier 0x0
    Device ID String: "Fan Tray 2"
    Site Type: 0x04, Site Number: 03
    Current Power Level: 0x01, Maximum Power Level: 0x01, Current Power Allocation: 200.0 Watts
20: FRU # 6
    Entity: (0x15, 0x60)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device Type: "FRU Inventory Device behind management controller" (0x10), Modifier 0x0
    Device ID String: "PEM A"
    Site Type: 0x01, Site Number: 01
    Current Power Level: 0x01, Maximum Power Level: 0x01, Current Power Allocation: 20.0 Watts
20: FRU # 7
    Entity: (0x15, 0x61)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device Type: "FRU Inventory Device behind management controller" (0x10), Modifier 0x0
    Device ID String: "PEM B"
    Site Type: 0x01, Site Number: 02
    Current Power Level: 0x01, Maximum Power Level: 0x01, Current Power Allocation: 20.0 Watts
20: FRU # 8
    Entity: (0xf3, 0x6f)
    Hot Swap State: M4 (Active), Previous: M3 (Activation In Process), Last State Change Cause: Normal State Change (0x0)
    Device Type: "FRU Inventory Device behind management controller" (0x10), Modifier 0x0
    Device ID String: "SAP Board"
    Site Type: 0x06, Site Number: 01
    Current Power Level: 0x01, Maximum Power Level: 0x01, Current Power Allocation: 2.0 Watts
Entries starting with 10: and 12: are shm1 and shm2; the entries starting with 20: are chassis FRUs.

Q: How to find the IP address of ShMM?
A: Find eth0 of the >>>Network Interfaces section;
>>> Network Interfaces
...
eth0 Link encap:Ethernet HWaddr 00:50:C2:3F:D1:30
    inet addr:10.5.58.110 Bcast:10.255.255.255 Mask:255.255.248.0
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:12685231 errors:1909 dropped:1909 overruns:0 frame:0
    TX packets:6569 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:865220012 (825.1 MiB) TX bytes:542114 (529.4 KiB)
    Interrupt:27
eth1 Link encap:Ethernet HWaddr 00:50:C2:3F:D1:31
    inet addr:192.168.2.1 Bcast:192.168.7.255 Mask:255.255.248.0
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
    Interrupt:28
...
vlan55 Link encap:Ethernet HWaddr 00:50:C2:3F:D1:30
    inet addr:192.168.13.109 Bcast:192.168.13.255 Mask:255.255.255.224
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:301 errors:0 dropped:0 overruns:0 frame:0
    TX packets:266 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:17940 (17.5 KiB) TX bytes:17791 (17.3 KiB)
Please also note that the IP address of vlan55 is FIXED and defined in the /var/netcons.ip file.
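When several debug.log files have to be checked, the interface addresses can be pulled out of the Network Interfaces section with a short script. The Python sketch below is only an illustration: the function name is hypothetical, and it assumes the section keeps the ifconfig-style layout shown above (an interface name followed by "Link encap:", then an "inet addr:" field). It scans tokens rather than lines, so it also copes with text in which the line breaks have been lost.

```python
import re

# Either the start of an interface block or an IPv4 address field.
TOKEN = re.compile(r"(\S+)\s+Link encap:|inet addr:(\d+(?:\.\d+){3})")


def interface_addresses(text):
    """Map interface name -> IPv4 address from ifconfig-style text."""
    addrs = {}
    current = None
    for match in TOKEN.finditer(text):
        if match.group(1):       # a new interface block begins
            current = match.group(1)
        elif current:            # address belongs to the current block
            addrs[current] = match.group(2)
    return addrs


# Sample taken from the Network Interfaces excerpt in this document.
sample = (
    "eth0 Link encap:Ethernet HWaddr 00:50:C2:3F:D1:30 "
    "inet addr:10.5.58.110 Bcast:10.255.255.255 Mask:255.255.248.0 "
    "eth1 Link encap:Ethernet HWaddr 00:50:C2:3F:D1:31 "
    "inet addr:192.168.2.1 Bcast:192.168.7.255 Mask:255.255.248.0 "
    "vlan55 Link encap:Ethernet HWaddr 00:50:C2:3F:D1:30 "
    "inet addr:192.168.13.109 Bcast:192.168.13.255 Mask:255.255.255.224"
)
print(interface_addresses(sample))
```

Tracking the current interface name, instead of matching name and address in one pattern, avoids attributing an address to the wrong interface when a block has no "inet addr:" field.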
More network interface information can also be found in the >>>Shell Environment Variables section:

>>> Shell Environment Variables
SCHROFF_VARIANT=ACB-III
CARRIER=ACB
RMCPADDR=10.5.63.110
OPENHPI_THREADED=YES
OPENHPI_DEBUG=NO
OPENHPI_UID_MAP=/tmp/uid_map
OPENHPI_CONF=/etc/openhpi.conf
GATEWAY=10.5.56.1
NETMASK=255.255.248.0
IP1DEVICE=eth1
IP1ADDR=192.168.1.2
IPDEVICE=eth0
IPADDR=192.168.0.4
TZ=UTC
LOGNAME=root
USER=root
MAIL=/var/mail/root
TERM=vt100
SHELL=/bin/sh
PATH=/var/bin:/sbin:/bin:/usr/sbin:/usr/bin
HOME=/home/root
IFS=
PS1=
PS2=
FILE=/tmp/debug.log

Or in the >>>ShMC LAN Configuration section:

>>> ShMC LAN Configuration
Authentication Type Support: 0x15 ( None MD5 Straight Password/Key )
Authentication Type Enables:
    Callback level: 0x00
    User level: 0x15 ( "None" "MD5" "Straight Password/Key" )
    Operator level: 0x15 ( "None" "MD5" "Straight Password/Key" )
    Administrator level: 0x15 ( "None" "MD5" "Straight Password/Key" )
    OEM level: 0x00
IP Address: 10.5.58.110
IP Address Source: Static Address (Manually Configured) (0x01)
MAC Address: 00:50:c2:3f:d1:30
Subnet Mask: 255.255.248.0
IPv4 Header Parameters: 0x40:0x40:0x10
Primary RMCP Port Number: 0x026f
Secondary RMCP Port Number: 0x0298
BMC-generated ARP Control: 0x02
    Enable BMC-generated ARP Response
Gratuitous ARP Interval: 2.0 seconds
Default Gateway Address: 10.5.56.1
Default Gateway MAC Address: 00:04:96:1e:29:30
Backup Gateway Address: 0.0.0.0
Backup Gateway MAC Address: N/A
Community String: "public"
Number of Destinations: 16
Destination Type: N/A
Destination Address: N/A

Q: What should I look for in the >>>U-Boot Environment Variables section?
A: Check the following parameters to prevent configuration problems:

>>> U-Boot Environment Variables
...
ipdevice=eth0
netmask=255.255.248.0
ip1addr=192.168.1.2
ip1device=eth1
rc2=/etc/rc.acb3
rmcpaddr=10.5.63.110
gateway=10.5.56.1
rc_ifconfig=y
ipaddr=192.168.0.4
baudrate=9600
console=ttyS0
addmisc=setenv bootargs $(bootargs) $(quiet) console=$(console),$(baudrate) reliable_upgrade=$(reliable_upgrade)

The baudrate, console, and addmisc parameters must be set as shown here; the ShMM console baud rate then becomes 9600 (instead of the default 115200).

Q: How do I identify a temperature problem?
A: Look at the >>>Fan List, >>>Cooling State, and >>>Fan State sections. If everything is normal, they look like this:

>>> Fan List
20: FRU # 3
    Current Level: 5
    Minimum Speed Level: 0, Maximum Speed Level: 15
20: FRU # 4
    Current Level: 5
    Minimum Speed Level: 0, Maximum Speed Level: 15
20: FRU # 5
    Current Level: 5
    Minimum Speed Level: 0, Maximum Speed Level: 15

>>> Cooling State
Cooling state: "Normal"
Sensor(s) at this state:
    (0x82,20,0) (0x82,36,0) (0x82,37,0) (0x82,44,0) (0x82,45,0) (0x82,52,0) (0x82,53,0) (0x82,10,0)
    (0x92,31,0) (0x20,126,0) (0x20,125,0) (0x86,19,0) (0x86,20,0) (0x86,21,0) (0x86,22,0) (0x86,23,0)
    (0x86,24,0) (0x86,25,0) (0x86,26,0) (0x92,6,0) (0x92,7,0) (0x92,30,0) (0x9c,4,0) (0x9c,5,0)
    (0x92,5,0) (0x98,4,0) (0x98,5,0) (0x9c,3,0) (0x94,7,0) (0x94,8,0) (0x94,23,0) (0x94,24,0)
    (0x94,25,0) (0x94,40,0) (0x94,41,0) (0x94,42,0) (0x98,3,0) (0x88,7,0) (0x88,8,0) (0x88,23,0)
    (0x88,24,0) (0x88,25,0) (0x94,6,0) (0x9a,5,0) (0x9a,6,0) (0x9a,29,0) (0x9a,30,0) (0x9a,31,0)
    (0x88,6,0) (0x9a,4,0) (0x96,5,0) (0x96,6,0) (0x96,29,0) (0x96,30,0) (0x96,31,0) (0x90,6,0)
    (0x90,7,0) (0x90,8,0) (0x90,23,0) (0x90,24,0) (0x90,25,0) (0x90,40,0) (0x90,41,0) (0x90,42,0)
    (0x96,4,0) (0x20,120,0) (0x20,121,0) (0x20,122,0) (0x20,123,0) (0x20,124,0) (0x20,200,0) (0x20,201,0)
    (0x10,2,0)

>>> Fan State
Fans state: "Normal"
Sensor(s) at this state:
    (0x10,8,0) (0x10,10,0) (0x10,11,0) (0x10,13,0) (0x10,14,0) (0x10,7,0)

If there is a problem, the output looks similar to this:

>>> Fan List
20: FRU # 3
    Current Level: 15
    Minimum Speed Level: 0, Maximum Speed Level: 15
20: FRU # 4
    Current Level: 15
    Minimum Speed Level: 0, Maximum Speed Level: 15
20: FRU # 5
    Current Level: 15
    Minimum Speed Level: 0, Maximum Speed Level: 15

>>> Cooling State
Cooling state: "Minor Alert"
Sensor(s) at this state:
    (0x98,31,0) (0x9c,30,0) (0x9c,31,0) (0x86,30,0) (0x86,31,0)

>>> Fan State
Fans state: "Normal"
Sensor(s) at this state:
    (0x10,8,0) (0x10,10,0) (0x10,11,0) (0x10,13,0) (0x10,14,0) (0x10,7,0)

In this output, the Cooling State shows that some of the temperature sensors are at Minor Alert instead of Normal. Each sensor is written as (IPMB Address, Sensor Number, LUN); for example, (0x98, 31, 0) is sensor 31 on the blade in slot 13 (0x98). Check what these sensors are and what their readings are before deciding whether there is a temperature problem in the chassis. Also check for a dirty air filter.

The >>>Fan List shows that all FTs are running at full speed. This may or may not be caused by a temperature problem, because only blades (0x86, 0x98 and 0x9c) are showing a cooling alert, not the chassis (0x20). There might be other problems, for example a power-related one.

Chassis temperature sensors are:
20: LUN: 0, Sensor # 120 ("Center Exhaust")
20: LUN: 0, Sensor # 121 ("Left Exhaust")
20: LUN: 0, Sensor # 122 ("Right Exhaust")
20: LUN: 0, Sensor # 124 ("Temp_In Left")
20: LUN: 0, Sensor # 125 ("Temp_In Center")
20: LUN: 0, Sensor # 126 ("Temp_In Right")

Q: Is there any power-related information?
A: Look in the >>>Shelf FRU Info section and find the following record:

PICMG Shelf Power Distribution Record (ID=0x11)
    Version = 0
    Feed count: 4
    Feed:
        Maximum External Available Current: 28.0 Amps
        Maximum Internal Current: 27.6 Amps
        Minimum Expected Operating Voltage: -40.5 Volts
        Feed-to-FRU Mapping entries count: 3
            FRU Addr: 45, FRU ID: 0xfe <---- Slot 5
            FRU Addr: 49, FRU ID: 0xfe <---- Slot 3
            FRU Addr: 4d, FRU ID: 0xfe <---- Slot 1
    Feed:
        Maximum External Available Current: 28.0 Amps
        Maximum Internal Current: 27.6 Amps
        Minimum Expected Operating Voltage: -40.5 Volts
        Feed-to-FRU Mapping entries count: 6
            FRU Addr: 41, FRU ID: 0xfe <---- Slot 7
            FRU Addr: 43, FRU ID: 0xfe <---- Slot 6
            FRU Addr: 47, FRU ID: 0xfe <---- Slot 4
            FRU Addr: 4b, FRU ID: 0xfe <---- Slot 2
            FRU Addr: 08, FRU ID: 0xfe <---- shm1
            FRU Addr: 10, FRU ID: 0x03 <---- FT 1
    Feed:
        Maximum External Available Current: 28.0 Amps
        Maximum Internal Current: 27.6 Amps
        Minimum Expected Operating Voltage: -40.5 Volts
        Feed-to-FRU Mapping entries count: 6
            FRU Addr: 42, FRU ID: 0xfe <---- Slot 8
            FRU Addr: 44, FRU ID: 0xfe <---- Slot 9
            FRU Addr: 48, FRU ID: 0xfe <---- Slot 11
            FRU Addr: 4c, FRU ID: 0xfe <---- Slot 13
            FRU Addr: 09, FRU ID: 0xfe <---- shm2
            FRU Addr: 10, FRU ID: 0x04 <---- FT 2
    Feed:
        Maximum External Available Current: 28.0 Amps
        Maximum Internal Current: 27.6 Amps
        Minimum Expected Operating Voltage: -40.5 Volts
        Feed-to-FRU Mapping entries count: 4
            FRU Addr: 46, FRU ID: 0xfe <---- Slot 10
            FRU Addr: 4a, FRU ID: 0xfe <---- Slot 12
            FRU Addr: 4e, FRU ID: 0xfe <---- Slot 14
            FRU Addr: 10, FRU ID: 0x05 <---- FT 3

This record shows which chassis components each PEM feed connects to.

Q: What is the >>>System Event Log section?
A: The >>>System Event Log section records all chassis-related events.
0x0001: <D&T>; from:(0x98,0,0); sensor:(0x02,20); event:0x1(deasserted): "Lower Critical", 0x02 0xFF 0xFF
0x0002: <D&T>; from:(0x98,0,0); sensor:(0x02,20); event:0x1(deasserted): "Lower Non-Critical", 0x00 0xFF 0xFF
0x0003: <D&T>; from:(0x98,0,0); sensor:(0x02,21); event:0x1(asserted): "Lower Non-Critical", 0x00 0xFF 0xFF
0x0004: <D&T>; from:(0x98,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M2->M3, Cause=0x1
0x0005: <D&T>; from:(0x88,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M0->M1, Cause=0x0
0x0006: <D&T>; from:(0x88,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M1->M2, Cause=0x2
0x0007: <D&T>; from:(0x98,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M3->M4, Cause=0x0
0x0008: <D&T>; from:(0x88,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M2->M3, Cause=0x1
0x0009: <D&T>; from:(0x88,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M3->M4, Cause=0x0
0x000A: <D&T>; from:(0x10,0,0); sensor:(0xde,128); event:0x6f(asserted): 0x7C 0x21 0x32
0x000B: <D&T>; from:(0x12,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M7->M1, Cause=0x4
0x000C: <D&T>; from:(0x12,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M1->M2, Cause=0x2
0x000D: <D&T>; from:(0x84,0,0); sensor:(0xf1,1); event:0x6f(asserted): 0xA3 0x00 0x88
0x000E: <D&T>; from:(0x84,0,0); sensor:(0x08,2); event:0x3(asserted): 0x00 0xFF 0xFF
0x000F: <D&T>; from:(0x84,0,0); sensor:(0x15,3); event:0x8(asserted): 0x00 0xFF 0xFF
0x0010: <D&T>; from:(0x84,0,0); sensor:(0x07,4); event:0x3(asserted): 0x00 0xFF 0xFF
0x0011: <D&T>; from:(0x12,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M2->M3, Cause=0x1
0x0012: <D&T>; from:(0x12,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M3->M4, Cause=0x0
0x0013: <D&T>; from:(0x10,0,0); sensor:(0xde,128); event:0x6f(asserted): 0x78 0x2C 0x32
0x0014: <D&T>; from:(0x10,0,0); sensor:(0xde,128); event:0x6f(asserted): 0x78 0x2C 0x32
0x0015: <D&T>; from:(0x10,0,0); sensor:(0xde,128); event:0x6f(deasserted): 0x78 0x2C 0x32
0x0016: <D&T>; from:(0x88,0,0); sensor:(0x02,14); event:0x1(deasserted): "Lower Non-Critical", 0x00 0xFF 0xFF
0x0017: <D&T>; from:(0x98,0,0); sensor:(0x02,14); event:0x1(deasserted): "Lower Non-Critical", 0x00 0xFF 0xFF
0x0018: <D&T>; from:(0x98,0,0); sensor:(0xf1,3); event:0x6f(asserted): 0xA3 0x00 0x88
0x0019: <D&T>; from:(0x98,0,0); sensor:(0x02,9); event:0x1(asserted): "Lower Non-Critical", 0x00 0xFF 0xFF
0x001A: <D&T>; from:(0x98,0,0); sensor:(0x02,9); event:0x1(asserted): "Lower Critical", 0x02 0xFF 0xFF
0x001B: <D&T>; from:(0x98,0,0); sensor:(0x02,21); event:0x1(asserted): "Upper Non-Critical", 0x07 0xFF 0xFF
0x001C: <D&T>; from:(0x98,0,0); sensor:(0x25,25); event:0x6f(asserted): 0x01 0xFF 0xFF
0x001D: <D&T>; from:(0x84,0,0); sensor:(0x07,4); event:0x3(deasserted): 0x61 0xF1 0x80
0x001E: <D&T>; from:(0x84,0,0); sensor:(0x07,5); event:0x3(asserted): 0x61 0xF0 0x00
0x001F: <D&T>; from:(0x10,0,0); sensor:(0xde,128); event:0x6f(asserted): 0x79 0x28 0x30
0x0020: <D&T>; from:(0x20,0,0); sensor:(0x25,168); event:0x6f(asserted): 0x01 0xFF 0xFF
0x0021: <D&T>; from:(0x20,0,0); sensor:(0x25,169); event:0x6f(asserted): 0x01 0xFF 0xFF
0x0022: <D&T>; from:(0x10,0,0); sensor:(0xde,128); event:0x6f(asserted): 0x71 0x09 0x30
0x0023: <D&T>; from:(0x10,0,0); sensor:(0xde,128); event:0x6f(deasserted): 0x71 0x09 0x30
0x0024: <D&T>; from:(0x10,0,0); sensor:(0xde,128); event:0x6f(deasserted): 0x71 0x09 0x30
0x0025: <D&T>; from:(0x20,0,0); sensor:(0x25,168); event:0x6f(asserted): 0x00 0xFF 0xFF
0x0026: <D&T>; from:(0x20,0,0); sensor:(0x25,169); event:0x6f(asserted): 0x00 0xFF 0xFF
0x0027: <D&T>; from:(0x88,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M4->M0, Cause=0x6
0x0028: <D&T>; from:(0x88,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M4->M0, Cause=0x6
0x0029: <D&T>; from:(0x98,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M4->M0, Cause=0x6
0x002A: <D&T>; from:(0x98,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M4->M0, Cause=0x6
0x002B: <D&T>; from:(0x12,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M4->M7, Cause=0x4
0x002C: <D&T>; from:(0x88,0,0); sensor:(0xf1,3); event:0x6f(asserted): 0xA3 0x00 0x88
0x002D: <D&T>; from:(0x98,0,0); sensor:(0xf0,2); event:0x6f(asserted): HotSwap: FRU 2 M0->M0, Cause=0x0
0x002E: <D&T>; from:(0x84,0,0); sensor:(0xf0,0); event:0x6f(asserted): HotSwap: FRU 0 M4->M7, Cause=0x4
0x002F: <D&T>; from:(0x88,0,0); sensor:(0x02,8); event:0x1(asserted): "Lower Non-Critical", 0x00 0xFF 0xFF
0x0030: <D&T>; from:(0x98,0,0); sensor:(0x02,8); event:0x1(asserted): "Lower Non-Critical", 0x00 0xFF 0xFF
0x0031: <D&T>; from:(0x88,0,0); sensor:(0x02,8); event:0x1(asserted): "Lower Critical", 0x02 0xFF 0xFF
0x0032: <D&T>; from:(0x98,0,0); sensor:(0x02,8); event:0x1(asserted): "Lower Critical", 0x02 0xFF 0xFF

NOTE: <D&T> is <Date and Time>.

Each message has the format:
<ID>: Event at <D&T>; from:(IPMB, FRU, LUN); sensor:(<type>, <#>); event:<event type>: <Details>

Thus, from:(0x84,0,0); sensor:(0x07,4); event:0x3(asserted) shows that the event comes from slot 8 (0x84, switch blade), FRU 0 (the board itself), and that sensor 4 is asserted. Use the data collection commands to find out what sensor 4 of slot 8 is, then use the sel -v command to see the exact details of the event.

Because the log contains events from every component in the chassis, start from the symptom reported by the customer and group the events that come from the same component (a blade, or a chassis component such as a ShMM, FT, or PEM) before judging what the root cause (RC) might be.

Q: Where is ShMM boot-related information located?
A: It is in the last section: >>>Shelfman Output to syslog
This section is the collection of /var/log/messages files. Some events in the SEL are also logged here.

Attachments
This solution has no attachment
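The grouping of SEL events by component described above can be sketched in a few lines of Python. This is an illustration only, not an Oracle or ShMM tool: the names SEL_RE, IPMB_TO_NAME, and group_by_source are hypothetical, the slot table is copied from the FRU Addr values in the Shelf Power Distribution Record example, and the rule "IPMB address = hardware address x 2" is an assumption that matches the examples in this document (0x4c gives 0x98 for slot 13, 0x42 gives 0x84 for slot 8).

```python
import re
from collections import defaultdict

# Hardware address -> physical slot, copied from the Feed-to-FRU mapping
# in the Shelf Power Distribution Record example.
HW_ADDR_TO_SLOT = {
    0x4D: 1, 0x4B: 2, 0x49: 3, 0x47: 4, 0x45: 5, 0x43: 6, 0x41: 7,
    0x42: 8, 0x44: 9, 0x46: 10, 0x48: 11, 0x4A: 12, 0x4C: 13, 0x4E: 14,
}

# Assumption: IPMB address = hardware address * 2 (0x4c -> 0x98 -> slot 13).
IPMB_TO_NAME = {hw * 2: f"slot {slot}" for hw, slot in HW_ADDR_TO_SLOT.items()}
# 10: and 12: are shm1 and shm2, 20: is the chassis (per this document).
IPMB_TO_NAME.update({0x10: "shm1", 0x12: "shm2", 0x20: "chassis"})

# <ID>: [Event at] <D&T>; from:(IPMB, FRU, LUN); sensor:(<type>, <#>);
# event:<event type>(asserted|deasserted): <Details>
SEL_RE = re.compile(
    r"(?P<id>0x[0-9A-Fa-f]{4}):\s*(?:Event at\s+)?(?P<when>.*?);\s*"
    r"from:\((?P<ipmb>0x[0-9A-Fa-f]+),(?P<fru>\d+),(?P<lun>\d+)\);\s*"
    r"sensor:\((?P<stype>0x[0-9A-Fa-f]+),(?P<snum>\d+)\);\s*"
    r"event:(?P<etype>0x[0-9A-Fa-f]+)\((?P<state>asserted|deasserted)\):\s*"
    r"(?P<details>.*)"
)


def group_by_source(lines):
    """Group parsed SEL entries by the component that generated them."""
    groups = defaultdict(list)
    for line in lines:
        match = SEL_RE.match(line.strip())
        if match is None:
            continue  # not a SEL entry, skip it
        ipmb = int(match["ipmb"], 16)
        source = IPMB_TO_NAME.get(ipmb, f"IPMB {ipmb:#04x}")
        groups[source].append(
            (match["id"], int(match["snum"]), match["state"], match["details"])
        )
    return dict(groups)


# Three entries taken from the System Event Log example.
SAMPLE = [
    '0x0010: <D&T>; from:(0x84,0,0); sensor:(0x07,4); event:0x3(asserted): 0x00 0xFF 0xFF',
    '0x001A: <D&T>; from:(0x98,0,0); sensor:(0x02,9); event:0x1(asserted): "Lower Critical", 0x02 0xFF 0xFF',
    '0x001D: <D&T>; from:(0x84,0,0); sensor:(0x07,4); event:0x3(deasserted): 0x61 0xF1 0x80',
]

if __name__ == "__main__":
    for source, events in group_by_source(SAMPLE).items():
        print(source)
        for event in events:
            print("   ", event)
```

Grouping by source keeps the per-component timeline together, which makes sequences such as an assert followed by a deassert on the same sensor easier to spot than in the interleaved log.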