Asset ID: | 1-77-1019883.1 |
Update Date: | 2016-07-29 |
Keywords: | |
Solution Type: Sun Alert Sure Solution
1019883.1: Sun Blade 6000 Chassis Monitoring Module (CMM) or Sun Blade T6300 Server Module may Exhibit Erroneous "Failed", "Hot Insertion", and "Removal" Messages
Related Items |
- Sun Blade T6300 Server Module
|
Related Categories |
- PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert
- _Old GCS Categories>Sun Microsystems>Sun Alert>Release Phase>Resolved
|
PreviouslyPublishedAs
248186
Bug Id
<BUG: 15491244>, <BUG: 15476847>, <BUG: 15517227>
Product
Sun Blade 6000 Modular System
Sun Blade T6300 Server Module
Date of Resolved Release
07-Jan-2009
Sun Blade 6000 Chassis Monitoring Module (CMM) or Sun Blade T6300 Server Module
may exhibit erroneous "Failed", "Hot Insertion", and "Removal" messages:
1. Impact
Sun Blade 6000 Chassis Monitoring Module (CMM) and Sun Blade T6300
Server Module may exhibit erroneous "Failed", "Hot Insertion", and
"Removal" Messages.
2. Contributing Factors
This issue can occur on the following platforms:
- Sun Blade 6000 Chassis Monitoring Module (CMM)
- Sun Blade T6300 Server Module
Note 1: This issue only occurs with the following T6300 Blade Module part numbers:
- 541-2317-05 (or below)
- 541-2318-05 (or below)
- 541-2319-05 (or below)
- 541-2320-04 (or below)
To determine the T6300 Blade Module part number, visually inspect the motherboard.
If the blades cannot be inspected visually, the following procedure can be used to identify affected blade(s):
1. Access the SP of the target blade.
2. Type "shownetwork" and note the SP MAC address.
3. Set the SP to "ssqa" mode by typing the following:
sc> setsc sc_ssqamode true "xyz"
where "xyz" is formed from the last nibble (hexadecimal digit) of each of the last three bytes of the MAC address.
4. Get the "Four Eyes" firmware version:
sc> i2cp 0x20 3 2 0 0 1
If the command returns 12, the blade is affected by this issue.
If the command returns 14, the blade is not affected.
5. After gathering the data above, set the sc_ssqamode parameter back to false:
sc> setsc sc_ssqamode false
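The two manual steps in the procedure above (deriving the "xyz" string and interpreting the returned value) can be sketched in Python. This is an illustrative helper only, not a Sun-supplied tool. The function names are hypothetical, and the reading of "last nibble from the three last bytes" as the final hex digit of each of the last three MAC bytes is an assumption that should be confirmed against a known blade before relying on it.

```python
import re


def ssqa_key_from_mac(mac: str) -> str:
    """Return the last hex digit of each of the last three MAC bytes.

    Assumption: this is what the procedure means by "the last nibble
    from the three last bytes". The case of the input digits is preserved.
    """
    octets = re.split(r"[:\-]", mac.strip())
    valid = len(octets) == 6 and all(
        re.fullmatch(r"[0-9A-Fa-f]{1,2}", octet) for octet in octets
    )
    if not valid:
        raise ValueError(f"not a MAC address: {mac!r}")
    return "".join(octet[-1] for octet in octets[-3:])


def classify_four_eyes(value: int) -> str:
    """Map the value returned by the i2cp command to the alert's verdict."""
    if value == 12:
        return "affected"
    if value == 14:
        return "not affected"
    # The alert documents only the values 12 and 14.
    return "unknown"


if __name__ == "__main__":
    # Made-up MAC address, for illustration only.
    print(ssqa_key_from_mac("00:14:4f:a1:b2:c3"))
    print(classify_four_eyes(12))
```

Any value other than 12 or 14 is reported as "unknown" because the alert does not describe other firmware levels.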
Note 2: The error messages mentioned above are harmless and do not affect the system's performance, functionality, or reliability.
Note 3: This issue can occur in Sun Blade 6000 configurations with any population of Sun Blade T6300 server modules.
The messages are most apparent when Sun Blade T6300 server modules are plugged into slots 6, 7, 8, or 9.
The issue rarely occurs in configurations with fewer Sun Blade T6300 server modules when they are plugged into the lower slots.
Note 4: This issue can occur on the systems listed above irrespective of the OS/ALOM versions in use.
3. Symptoms
If the described issue occurs, messages similar to the following will
be seen when the T6300 Service Processor (SP) is powered on:
Example of T6300 SP messages:
SC Alert: PSU at MP/PS1 has been removed.
SC Alert: PSU at MP/PS1 has been inserted.
SC Alert: NEM at MP/NEM0 has been removed.
SC Alert: PSU at MP/PS0 has been removed.
SC Alert: PSU at MP/PS0 has been inserted.
SC Alert: SYS_FAN at MP/FM3/FIN has FAILED.
SC Alert: SYS_FAN at MP/FM3/FOUT has FAILED.
SC Alert: SYS_FAN at MP/FM6/FIN has FAILED.
SC Alert: SYS_FAN at MP/FM6/FOUT has FAILED.
SC Alert: SYS_FAN at MP/FM7/FIN has FAILED.
SC Alert: SYS_FAN at MP/FM7/FOUT has FAILED.
SC Alert: NEM at MP/NEM1 has been removed.
SC Alert: NEM at MP/NEM1 has been inserted.
SC Alert: PSU at MP/PS0 has FAILED.
SC Alert: PSU at MP/PS1 has FAILED.
Example of chassis CMM events:
1319 Thu Jan 1 00:01:56 1970 Chassis Action major Hot insertion of /CH/NEM1
1318 Thu Jan 1 00:01:56 1970 Chassis Action major Hot insertion of /CH/NEM0
1317 Thu Jan 1 00:01:56 1970 Chassis Action major Hot insertion of /CH/PS1
1316 Thu Jan 1 00:01:56 1970 Chassis Action major Hot insertion of /CH/PS0
1315 Thu Jan 1 00:01:56 1970 Chassis Action major Hot insertion of /CH/BL6
1314 Thu Jan 1 00:01:56 1970 Chassis Action major Hot insertion of /CH/BL5
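When reviewing a captured SP console log or CMM event log, it can help to pick out lines that have the same shape as the examples above. The following Python sketch does that with two regular expressions derived only from the sample messages in this alert. The function name is hypothetical. A match shows only that a line resembles these examples, not that the event is erroneous, so genuine hardware faults still need to be ruled out separately.

```python
import re

# Patterns derived from the sample messages in this alert only.
SC_ALERT = re.compile(
    r"SC Alert: (?P<kind>PSU|NEM|SYS_FAN) at (?P<path>MP/\S+) has "
    r"(?P<event>been removed|been inserted|FAILED)\."
)
CMM_EVENT = re.compile(r"Hot insertion of (?P<path>/CH/\S+)")


def match_known_pattern(line: str):
    """Return (source, path, event) if the line resembles a sample message."""
    sp_match = SC_ALERT.search(line)
    if sp_match:
        return ("sp", sp_match.group("path"), sp_match.group("event"))
    cmm_match = CMM_EVENT.search(line)
    if cmm_match:
        return ("cmm", cmm_match.group("path"), "hot insertion")
    return None


if __name__ == "__main__":
    sample = [
        "SC Alert: SYS_FAN at MP/FM3/FIN has FAILED.",
        "1319 Thu Jan 1 00:01:56 1970 Chassis Action major Hot insertion of /CH/NEM1",
    ]
    for line in sample:
        print(match_known_pattern(line))
```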
4. Workaround
All T6300 blades should be upgraded to the latest firmware version 6.7.2 (delivered in patch 139438-02), as this firmware may reduce the likelihood of the described issue occurring.
In Sun Blade 6000 configurations populated with 2 or 3 Sun
Blade T6300s, move the T6300 blades to the lower chassis slots and
move the other blades (e.g., T6320) to the higher slots. Also, CMM ILOM
firmware should be at 2.0.3.2 or later. This has been known to
reduce the likelihood of the described issue occurring.
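The slot rearrangement described above can be sketched as a small planning helper in Python. This is illustrative only. The function name is hypothetical, and the default of 10 slots numbered from 0 is an assumption about the chassis rather than something stated in this alert.

```python
def plan_slots(blades, slot_count=10):
    """Place T6300 blades in the lowest slots and all others in the highest.

    blades: list of model names, for example ["T6300", "T6320"].
    slot_count: number of chassis slots (assumed 10, numbered from 0).
    Returns a dict mapping slot number to model name.
    """
    if len(blades) > slot_count:
        raise ValueError("more blades than chassis slots")
    t6300_blades = [model for model in blades if model == "T6300"]
    other_blades = [model for model in blades if model != "T6300"]

    layout = {}
    # T6300 blades fill slots upward from slot 0.
    for slot, model in enumerate(t6300_blades):
        layout[slot] = model
    # Remaining blades fill the highest-numbered slots.
    first_high_slot = slot_count - len(other_blades)
    for offset, model in enumerate(other_blades):
        layout[first_high_slot + offset] = model
    return layout


if __name__ == "__main__":
    print(plan_slots(["T6320", "T6300", "T6320", "T6300"]))
```

The helper only proposes a layout. Whether a given blade can be moved still depends on site constraints that the alert does not cover.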
Alternatively, setting "sys_eventlevel" to 0 or 1 in the T6300 ALOM
prevents the messages from being logged to "/var/adm/messages". Level 1 logs critical messages only, and level 0
logs no messages to the "/var/adm/messages" file. Setting
"sys_eventlevel" to 1 is recommended so that critical messages are still
logged.
Note: This will not stop the
messages from being logged on the CMM or the T6300 SPs, as there is no
way to suppress them there.
5. Resolution
This issue is addressed in the following platforms:
- Sun Blade 6000 Chassis Monitoring Module (CMM) with motherboard level 371-1447-06 (or above) and CMM ILOM firmware 2.0.3.2 (or later).
- Sun Blade T6300 Server Module with I2C bridge chip firmware v1.4 (or later),
as delivered in the following T6300 Blade Module part numbers:
- 541-2317-07 (or above)
- 541-2318-07 (or above)
- 541-2319-07 (or above)
- 541-2320-06 (or above)
Note:
In addition to the required motherboard levels above, the
System Firmware must also be upgraded to sysfw 6.7.2 (or later), which
contains ALOM code to enable the dynamic arbitration/priority feature
within the bridge chip.
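The part-number thresholds listed under Contributing Factors and Resolution can be combined into one lookup, sketched below in Python. The function and table names are hypothetical. Dash levels that fall between the two lists (for example 541-2317-06) are not described in this alert, so the sketch reports them as "unknown" rather than guessing.

```python
import re

# Highest affected dash level per base part number (Contributing Factors).
AFFECTED_MAX = {"541-2317": 5, "541-2318": 5, "541-2319": 5, "541-2320": 4}
# Lowest dash level carrying the fix per base part number (Resolution).
FIXED_MIN = {"541-2317": 7, "541-2318": 7, "541-2319": 7, "541-2320": 6}


def classify_part_number(part_number: str) -> str:
    """Classify a T6300 Blade Module part number such as '541-2317-05'."""
    match = re.fullmatch(r"(\d{3}-\d{4})-(\d{2})", part_number.strip())
    if not match or match.group(1) not in AFFECTED_MAX:
        return "unknown"
    base, level = match.group(1), int(match.group(2))
    if level <= AFFECTED_MAX[base]:
        return "affected"
    if level >= FIXED_MIN[base]:
        # Hardware level only; sysfw 6.7.2 (or later) is also required.
        return "fixed hardware"
    return "unknown"


if __name__ == "__main__":
    for pn in ("541-2317-05", "541-2320-06", "541-2317-06"):
        print(pn, classify_part_number(pn))
```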
Change History
03-Nov-2011 - Corrected the document state from "workaround" to "resolved"
19-Nov-2012 - Added step to Contributing Factors
Internal Comments
The ultimate fix will be a two-part fix. The first part is T6300 blade replacement to update the motherboard FourEyes I2C bus bridge chip firmware to v1.4,
which provides the I2C bus Dynamic Arbitration and Dynamic Priority features. All blades must use this polling mechanism to ensure fair access to
the I2C bus and to prevent the timeouts that produce the false messages.
In addition, the system firmware must be upgraded to v6.7.1, which
contains ALOM code to enable the dynamic arbitration/priority feature
within the bridge chip. These fixes will be released in an upcoming FAB
once final testing is complete.
Please send technical questions to the following email:
sunalert-tech-questions@sun.com
and CC the following persons:
Internal Contributor/Submitter
Internal Eng Responsible Engineer
Internal Contributor/submitter
John.Respicio@sun.com
Internal Eng Responsible Engineer
John.Respicio@sun.com
Internal Services Knowledge Engineer
jeff.folla@sun.com
Internal Eng Business Unit Group
SSG WGS (Workgroup Systems)
Internal Escalation ID
1-24303970, 1-24865240, 1-25122354, 1-24110501, 1-24146902, 1-23946501, 1-24171305, 1-24864989, 1-25084069, 1-24435495
Internal Sun Alert & FAB Admin Info
WF 23-Mar-2009, jfolla: Updated with requested changes and sent to Nitin for review.
WF 07-Jan-2009, jfolla: This Sun Alert was placed on hold per request. Received release approval. Sent for release.
WF 18-Dec-2008, jfolla: sent for 24 hr review
WF 18-Dec-2008, jfolla: sent to submitter review along with a few questions regarding the submitted draft
WF 18-Dec-2008, jfolla: created
References
<BUG:15476847> - SUNBT6695705 ST PAUL IS REPORTING SYS_FAN FAILURES THAT APPEAR TO BE ERRONEOUS
<BUG:15491244> - SUNBT6720809 SC ALERT: PSU AT MP/PS1 HAS BEEN REMOVED. PERSISTENT MESSAGES