Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-75-1324444.1
Update Date:2017-10-05
Keywords:

Solution Type  Troubleshooting Sure

Solution  1324444.1 :   Sun Storage 7000 Unified Storage System: Fan Troubleshooting  


Related Items
  • Sun ZFS Storage 7420
  • Sun Storage 7110 Unified Storage System
  • Sun Storage 7210 Unified Storage System
  • Sun Storage 7410 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun ZFS Storage 7320
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS
  • _Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Purpose
Troubleshooting Steps
References


Applies to:

Sun Storage 7210 Unified Storage System - Version All Versions and later
Sun Storage 7410 Unified Storage System - Version All Versions and later
Sun ZFS Storage 7120 - Version All Versions and later
Sun ZFS Storage 7420 - Version All Versions and later
Sun ZFS Storage 7320 - Version All Versions and later
7000 Appliance OS (Fishworks)
NAS head revision : [not dependent]
BIOS revision : [not dependent]
ILOM revision : [2.0.2.15|2.0.2.16]
JBODs Model : [not dependent]
CLUSTER related : [not dependent]

Purpose

This document assists in troubleshooting fan-related alerts on the 7xx0 Unified Storage Appliances.

Troubleshooting Steps

Symptom: The appliance has an amber LED lit for a Fan Module or Fan Power Board, and/or the appliance has sent an alert indicating a fan-related failure.

Note: Multiple fan failures can cause node boot issues. (In one reported case, a 7120 with two failed fans would not boot until one fan was replaced, bringing the fan failure count down to one.)



Resolution:

   1.  Confirm whether one of the following faults appears in fmadm.out, in the "fm" directory of the support bundle:
          SENSOR-8000-26: External sensors indicate that the fan 'xxxxxxxx/FB X FM X' is no longer operating correctly.
          SENSOR-8000-26: External sensors indicate that the fan 'xxxxxxxx/FT X' is no longer operating correctly.


   2.  Check the fan to see whether it has lost power or there is a physical obstruction.
          If the fan has power and there is no physical obstruction, but the fault remains, proceed to Step 5.
          If the fan is currently operating as expected, proceed to Step 3.


   3.  Verify the SP version via the info/info.dump file in the support bundle.
          If SP <= 2.0.2.5, the firmware contains the SP memory leak bug, and the customer is very likely hitting bug 6869041 (see Doc ID 1267544.1 for description and resolution). SP version 2.0.2.16 fixes the SP memory leak issue.

          Once the SP firmware has been updated, verify whether the fan fault persists.

          Temporary relief may be achieved by resetting the SP.

          If SP >= 2.0.2.16, proceed to Step 5.

   4.  Review fm/fmadm.out in the bundle: is this a recurring failure at this particular fan location, or is the customer reporting a recurring issue?
          Yes. Review the SR history and consult with the customer to determine the maintenance history for this fan.
          If the part has never been replaced, proceed to Step 5.
          If the Fan Module has already been replaced on a 7310, 7120, 7320, or 7420 (where the Fan Board and the Fan Module are separate CRUs) and the problem persists, replace the Fan Board.
          If the part(s) have been replaced more than once and the problem persists, engage the x64 server team via IM in the x64-all chat room, or via collaboration in MOS by selecting the proper product using the Orion PLA Manager Product Search: https://apex.oraclecorp.com/pls/apex/f?p=18194:1:143849738380509::NO
          No. Monitor the appliance to see whether the fan errors persist. If they do, proceed to Step 5.

   5.  Replace the Fan Board or Fan Module called out in the failure alert.
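
The fault check in Step 1 and the recurrence review in Step 4 can be sketched as a small script. This is only an illustrative sketch: it assumes each fault appears in fmadm.out on a single line with the message text quoted in Step 1, which may not match the real file layout. The function names and the sample fan locations ('chassis-a/...') are hypothetical.

```python
import re
from collections import Counter

# Assumption: the fault message appears on one line, worded as quoted in Step 1.
# The actual fmadm.out layout may differ and the pattern may need adjusting.
FAN_FAULT = re.compile(
    r"SENSOR-8000-26:.*?the fan '([^']+)' is no longer operating correctly"
)


def fan_fault_counts(text):
    """Count SENSOR-8000-26 fan faults per fan location string."""
    counts = Counter()
    for line in text.splitlines():
        match = FAN_FAULT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def recurring_fans(counts, threshold=2):
    """Return fan locations reported at least `threshold` times, sorted."""
    return sorted(fan for fan, n in counts.items() if n >= threshold)


# Hypothetical sample text standing in for the contents of fm/fmadm.out.
sample = """\
SENSOR-8000-26: External sensors indicate that the fan 'chassis-a/FB 0 FM 1' is no longer operating correctly.
SENSOR-8000-26: External sensors indicate that the fan 'chassis-a/FT 2' is no longer operating correctly.
SENSOR-8000-26: External sensors indicate that the fan 'chassis-a/FB 0 FM 1' is no longer operating correctly.
"""

counts = fan_fault_counts(sample)
for fan, n in sorted(counts.items()):
    print(f"{fan}: {n} fault(s)")
print("Recurring locations:", recurring_fans(counts))
```

A location that shows up repeatedly is a candidate for the "Yes" branch of Step 4. More than one distinct location in the output is also worth noting, given the boot issue described for multiple fan failures.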

 

Note: For Sun ZFS Storage 7x20 systems running a 3.x ILOM version, be aware of the following:

            CR 7035044 : Exadata x2-2:: S0/G0 ACPI state changes to working for no apparent reason.
              ( Current rev   : 3.0.9.25 )

            This bug is also known to lead to 'false' reporting of PSU/FAN failures.
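
The SP firmware comparison in Step 3, and the 3.x ILOM check in the note above, can be sketched as follows. This is a minimal sketch, assuming the version string has already been read from info/info.dump (the parsing of that file is not shown, and its format is not assumed here). The function names are hypothetical.

```python
def parse_version(version):
    """Turn a dotted version such as '2.0.2.16' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


# Fix level for the SP memory leak, as stated in Step 3.
LEAK_FIX = parse_version("2.0.2.16")


def has_leak_fix(version):
    """True if the SP version is at or above the 2.0.2.16 fix level."""
    return parse_version(version) >= LEAK_FIX


def is_ilom_3x(version):
    """True if the version belongs to the 3.x ILOM line mentioned in the note."""
    return parse_version(version)[0] == 3


for v in ("2.0.2.5", "2.0.2.15", "2.0.2.16", "3.0.9.25"):
    print(v, "leak fix:", has_leak_fix(v), "ILOM 3.x:", is_ilom_3x(v))
```

Comparing tuples of integers avoids the trap of comparing the raw strings, where "2.0.2.5" would sort after "2.0.2.16".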

References

<BUG:6869041> - INCONSISTENCY IN PRODUCTS SELECTED AND PRODUCTS INSTALLED
<NOTE:1267544.1> - Older versions of the Service Processor firmware on Sun Storage 7110, 7210, 7310 and 7410 can leak memory.
<NOTE:1386616.1> - Sun Storage 7000 Unified Storage System: Thermal Events and Ongoing Fan issues on 7410/7110 storage arrays
<NOTE:1416406.1> - Sun ZFS Storage Appliances Troubleshooting Resource Center
http://www.oracle.com/technetwork/documentation/oracle-unified-ss-193371.html
<NOTE:1377473.1> - How to replace a fan module in Sun ZFS Unified Storage Appliance [Video]
<NOTE:1496283.1> - Sun 7000 Unified Storage System: ASR Fan Alarm Verification
<BUG:15621864> - SUNBT6925325 ELWOOD CHASSIS FAN BOARD INTERMITTENTS CAUSING FANS TO "DISSAP
https://bluegill.us.oracle.com:215/ak8-dev/index.php/Main_Page

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.