Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2045362.1
Update Date:2017-11-21
Keywords:

Solution Type  Technical Instruction Sure

Solution  2045362.1 :   FS System: Work Instructions for FCO 354  


Related Items
  • Oracle FS1-2 Flash Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Flash Storage>SN-EStor: FSx




In this Document
Goal
Solution
 Confirm the Identity of the Affected Parts
 Implement the FCO
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: ESM is a FRU and instructions are for TSC.

Applies to:

Oracle FS1-2 Flash Storage System - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal

This document provides the proper procedure to implement FCO 354 as described in Document 2045728.1 FCO A0354-1: Proactive: High failure rate of Energy Storage Modules (ESMs) in FS1-2 Storage System Controllers. ESM part number 7080891 must be replaced with part number 7308302.

Solution

Implementation of FCO 354 is proactive and can be done online or offline.  Either way, you must remove both power cords from the Controller(s) being worked on.  If the online option is chosen, it is recommended that this activity be done during a low IO period to reduce the impact on system performance.  If one of the Controllers is already in a failed state when this FCO is implemented, that Controller must be the first to have its ESMs replaced, along with any other actions needed to restore it to working order.

NOTE: These instructions are for the implementation of this FCO only.  If an SR is opened for a failed ESM and it is noticed that this FCO applies, the TSE should only order the parts necessary to restore the system to working order.  Then contact the Regional Distribution Manager (see FCO 354 for contacts) to get the FCO implemented for the rest of the ESMs.

If the Flash Storage System is in a normal/green state, it is recommended that it be upgraded to the latest version of FS1-2 software prior to implementing the FCO.  See Document 1967797.1 FS System: How to Download Software and Firmware Updates for the FS1-2.

Confirm the Identity of the Affected Parts

For Flash Storage Systems shipped prior to March 15, 2015, confirm whether they still contain the affected ESM part number (7080891).  There are several ways to do this, depending on the current status of the Controllers and what information is available.

NOTE: Basic and Basic+ versions of the FS1-2 Storage System will have ESMs only in locations 1 and 3. Performance FS1-2 Storage Systems will have ESMs in locations 0-3. Please reference the Service Label located on the top of the Controller for the proper physical locations. The same information can also be found on this FS1-2 page in the Oracle System Handbook.
  • The FCO provides steps using the CLI or GUI that the customer can perform to obtain the ESM part numbers.  If this information has already been gathered, skip to the Implement the FCO section.
  • For systems that already have a failed Controller, look for older log bundles on ISDE/Cores using the System Serial Number and use the *chsh.xml or StorageConfiguration.txt files as indicated below.  If no log bundles are available for a system with a failed Controller, a physical inspection of the system is required.
  • If current or recent log bundles are available, use one of these two methods:
    • Edit the *chsh.xml file and search for SuperCap:

      <SuperCap>

      <Fru>

      <HardwareComponentStatus>NORMAL</HardwareComponentStatus>
      <IdString>Energy Storage Module</IdString>
      <PartNumber>7080891</PartNumber>  <========================
      <SerialNumber>465765M-1504ESM00X</SerialNumber>
      <Location>ESM0/PRSNT</Location>

      </Fru>
      <CapacitorStatus>CHARGED</CapacitorStatus>
      <DimmSlot>0</DimmSlot>
      <Slot>0</Slot>
      <ChargeCapacityPercentage>0</ChargeCapacityPercentage>

      </SuperCap>

        

    • Or grep for SuperCap in the StorageConfiguration.txt file.
      % grep ControllerInformation_SuperCap_Fru_PartNumber SystemConfiguration.txt
      ControllerInformation_SuperCap_Fru_PartNumber                      = 7080891
      ControllerInformation_SuperCap_Fru_PartNumber                      = 7080891

 

  • If direct access to the system itself is available, identification can be done by querying the ESMs from the Controller using ipmitool. The command must be run against each ESM and from each Controller's ILOM.
    1. ssh into the active Pilot; see Document 2029847.1 FS System: How to Enable SSH to the Pilot.
    2. If Controller 0 is operational, ssh into it (172.30.80.128). Otherwise, use Controller 1 (172.30.80.129).
    3. Execute the ipmitool command on this Controller's ILOM:
      # ipmitool -H 169.254.2.5 -U root -P changeme sunoem cli "show /SYS/DBP/ESM<ESM_Number>"
        
    4. Repeat for all ESMs in this Controller.
    5. Execute the ipmitool command against the other Controller's ILOM and repeat for all of its ESMs:
      # ipmitool -H 169.254.2.9 -U root -P changeme sunoem cli "show /SYS/DBP/ESM<ESM_Number>"

      Example using ESM3:
      # ipmitool -H 169.254.2.5 -U root -P changeme sunoem cli "show /SYS/DBP/ESM3"
      Connected. Use ^D to exit.
      -> show /SYS/DBP/ESM3

      /SYS/DBP/ESM3
           Targets:
                  PRSNT
                  STATE
                  SERVICE
                  OK2RM

           Properties:
                  type = Energy Storage Module
                  ipmi_name = ESM3
                  fru_description = PCA,SCM,TYPEA,S6
                  fru_part_number = 7080891  <======================
                  fru_rev_level = 01
                  fru_serial_number = 1327ESM01X
                  fault_state = OK
                  clear_fault_action = (none)

           Commands:
                  cd
                  set
                  show

      -> Session closed
      Disconnected
        
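The identification methods above can be scripted once the text has been collected. The following is a minimal Python sketch, not an Oracle-supplied tool: the function names are hypothetical, the ILOM addresses, credentials, and ESM locations are copied from this document, and the XML helper assumes the log file is well formed and uses the un-namespaced SuperCap/Fru/PartNumber layout shown in the excerpt above. It only builds command strings and parses text that has already been captured; it does not connect to the system.

```python
import re
import xml.etree.ElementTree as ET

AFFECTED_PART = "7080891"  # ESM part number targeted by FCO 354 (from this document)

# ESM locations per model, taken from the NOTE in this section.
ESM_LOCATIONS = {
    "basic": (1, 3),
    "basic+": (1, 3),
    "performance": (0, 1, 2, 3),
}

# ILOM addresses exactly as they appear in the ipmitool commands above.
ILOM_ADDRESSES = ("169.254.2.5", "169.254.2.9")


def ipmitool_commands(model):
    """Build one ipmitool command line per ESM and per ILOM address."""
    locations = ESM_LOCATIONS[model.lower()]
    return [
        f'ipmitool -H {addr} -U root -P changeme sunoem cli "show /SYS/DBP/ESM{n}"'
        for addr in ILOM_ADDRESSES
        for n in locations
    ]


# Matches "fru_part_number = ..." (ipmitool output) and
# "ControllerInformation_SuperCap_Fru_PartNumber = ..." (configuration text).
_PART_RE = re.compile(r"(?:fru_part_number|Fru_PartNumber)\s*=\s*(\S+)")


def part_numbers_from_text(text):
    """Return every ESM part number found in captured command or grep output."""
    return _PART_RE.findall(text)


def part_numbers_from_xml(xml_text):
    """Return the Fru/PartNumber of every SuperCap element (assumed layout)."""
    root = ET.fromstring(xml_text)
    found = (cap.findtext("Fru/PartNumber") for cap in root.iter("SuperCap"))
    return [part.strip() for part in found if part]


def affected_count(parts):
    """Count how many of the given part numbers are the affected part."""
    return sum(1 for part in parts if part == AFFECTED_PART)
```

For a Performance system, ipmitool_commands("performance") yields eight command lines (four ESMs on each of the two ILOM addresses); the captured output can then be passed to part_numbers_from_text and affected_count to decide how many ESMs still need replacing.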

Implement the FCO

  1. Order the appropriate quantity of ESM replacements based on the part number information above.

    NOTE: If Solaris 10 or 11 hosts have LUNs on the FS1, please reference Document 1986424.1 Certain Solaris 11 SRUs and Solaris 10 Patches may Cause IO Error on Failover before attempting to implement this FCO online.
     
  2. Since the replacement of the ESMs requires Controllers to be powered off, the customer should be advised that this could impact overall system performance.  The customer also has the option to schedule a maintenance window, but it is not required.
    • Online method:
      1. Follow the ESM replacement Canned Action Plan, Document 1984509.1 FS System: How to Remove and Replace an Energy Storage Module (ESM) in an FS1-2 Controller, but replace all the ESMs with part number 7080891 in one Controller at once before moving on to the second Controller.
      2. If there is a Controller that is already down, replace those ESMs first. If both Controllers are active, they must be taken offline (and power removed) one at a time to do the ESM replacements.
      3. Please allow sufficient time for the first Controller to boot before attempting to fail over the second Controller.

    • Offline method:
      1. Shut down the FS1-2:
        • GUI: In the Menu Bar, click on Oracle FS and then Shut Down.  Confirm the shut down under System Information, Status.
        • CLI:
          C:\>fscli system -shutdown

          Command Succeeded

          C:\>fscli system -list -status

          System
          Name : <name_of_FS1-2>
          SystemStatus : SHUTDOWN

          C:\>
           
      2. Remove both power cords from both Controllers.
      3. Replace any ESMs that have part number 7080891 with the newer 7308302 part number.
      4. Reconnect the power cords.
      5. Restart the FS1-2:

        NOTE: An FS1-2 restart will reboot the Pilots and the connection will be dropped.  Once the Pilots have rebooted, users must log in again.
         
        • GUI: In the Menu Bar, click on Oracle FS and then Restart.
        • CLI:
          C:\> fscli system -restart
            
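In the offline method, the power cords should only be removed after the system reports that it has shut down. As a rough illustration, the Python sketch below parses text captured from the "fscli system -list -status" command shown in step 1 and checks for the SHUTDOWN status. The function names are hypothetical, and the sketch assumes the "Key : Value" output layout shown in this document; it does not run fscli itself.

```python
def system_status(status_output):
    """Return the SystemStatus value from captured fscli status output, or None."""
    for line in status_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SystemStatus":
            return value.strip()
    return None


def safe_to_remove_power(status_output):
    """True only when the captured output reports SystemStatus : SHUTDOWN."""
    return system_status(status_output) == "SHUTDOWN"
```

If the status line is missing or shows any other value, safe_to_remove_power returns False, which errs on the side of not pulling power.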

 
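The ordering rules in the implementation steps (order one replacement per affected ESM, service an already-failed Controller first, and replace all affected ESMs in one Controller before moving to the next) can be expressed as a small planning helper. This is a hedged Python sketch under those stated rules only; the Controller class and function names are hypothetical, and the part numbers per ESM location are assumed to have been gathered with one of the identification methods described earlier.

```python
from dataclasses import dataclass

AFFECTED_PART = "7080891"  # part number to be replaced under FCO 354


@dataclass
class Controller:
    name: str
    failed: bool      # True if the Controller is already down
    esm_parts: dict   # ESM location -> installed part number


def affected_esms(controller):
    """Return the sorted ESM locations that still hold the affected part."""
    return sorted(
        loc for loc, part in controller.esm_parts.items() if part == AFFECTED_PART
    )


def order_quantity(controllers):
    """Number of replacement ESMs to order across all Controllers."""
    return sum(len(affected_esms(c)) for c in controllers)


def online_plan(controllers):
    """List (Controller name, ESM locations) in service order, failed first."""
    # sorted() is stable, so Controllers in the same state keep their input order.
    ordered = sorted(controllers, key=lambda c: not c.failed)
    return [(c.name, affected_esms(c)) for c in ordered if affected_esms(c)]
```

Each tuple in the plan represents one Controller power-down during which all of its listed ESMs are swapped; Controllers with no affected ESMs are omitted from the plan.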

References

<NOTE:2045728.1> - FCO A0354-1: Proactive: High failure rate of Energy Storage Modules (ESMs) in FS1-2 Storage System Controllers.

  Copyright © 2018 Oracle, Inc.  All rights reserved.