Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1316543.1
Update Date:2017-07-17
Keywords:

Solution Type  Technical Instruction Sure

Solution  1316543.1 :   How to Replace a Flash Module, ESM Module and/or an Aura Flash Accelerator PCIe Card (F20 SAS HBA) Sun ZFS Unified Storage Appliance:ATR:1316543.1:1


Related Items
  • Sun ZFS Storage 7120
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: DISK-CAP VCAP




In this Document
Goal
Solution
 To report errors or request improvements on this procedure, please add a comment
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: FRU CAP

Applies to:

Sun ZFS Storage 7120 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal

How to replace a Flash Module, ESM Module and/or an Aura Flash Accelerator PCIe card (F20 SAS HBA) in a 7120 Sun ZFS Unified Storage Appliance (Doc 1316543.1)

 

Solution

 

 DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE ENGINEER NEED:(IS A SITE ENGINEER AVAILABLE?)
Training and experience with 7120 NAS hardware.

TASK COMPLEXITY: 1

TIME ESTIMATE: 60 minutes

FIELD ENGINEER INSTRUCTIONS

 

PROBLEM OVERVIEW
What: Replace a Sun ZFS Storage 7120 Aura ESM or the entire Aura FMod card.
Where: Chassis ID:  PCIE SLOT ID:  (To be inserted by TSC)
Why: Faulty/expired Aura ESM or faulted Aura FMod(s)



WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?:



Note: Failure to follow this process could result in customer data loss and/or the card not being recognized by the 7120 after replacement.


In the event that the appliance reports a problem affecting one or more flash modules on a Sun Flash Accelerator F20 card, the entire card should be replaced. If the flash modules have been configured as log devices in a ZFS pool, all four flash modules on that card must be offlined prior to replacing the card using the instructions below:

 

Preparation for replacing the Aura Flash Accelerator PCIe card or an FMod (Flash Module)

 

NOTE: When replacing only an ESM, the FMods do not need to be offlined, because the cached data is flushed to disk during a proper shutdown.



1. Select the chassis, then the disk component type, and list the disks.

7x20:> maintenance hardware select chassis-000 select disk       
7x20:maintenance chassis-000 disk> list
          LABEL          STATE    MANUFACTURER  MODEL             SERIAL
disk-000  HDD 0          ok       SEAGATE       ST31000SSSUN1.0T  0935550PEX 9QJ50PEX
...
disk-014  PCIe 0/FMod 0  ok       MARVELL       SD88SA024SA0      0919M00M18
disk-015  PCIe 0/FMod 1  ok       MARVELL       SD88SA024SA0      0919M00M1D
disk-016  PCIe 0/FMod 2  faulted  MARVELL       SD88SA024SA0      0919M00M0U
disk-017  PCIe 0/FMod 3  ok       MARVELL       SD88SA024SA0      0919M00M0V
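
The rule stated earlier (a problem on any one FMod means every FMod on that card is offlined) can be sketched in Python. This is only an illustration, not an appliance tool: it assumes the listing has been saved as whitespace-separated text like the output above, it assumes any state other than "ok" marks a card for replacement, and the names parse_fmods and fmods_to_offline are hypothetical.

```python
import re

# Sample rows copied from the listing above; HDD rows are ignored.
LISTING = """\
disk-000 HDD 0 ok SEAGATE ST31000SSSUN1.0T 0935550PEX 9QJ50PEX
disk-014 PCIe 0/FMod 0 ok MARVELL SD88SA024SA0 0919M00M18
disk-015 PCIe 0/FMod 1 ok MARVELL SD88SA024SA0 0919M00M1D
disk-016 PCIe 0/FMod 2 faulted MARVELL SD88SA024SA0 0919M00M0U
disk-017 PCIe 0/FMod 3 ok MARVELL SD88SA024SA0 0919M00M0V
"""

# Assumed row shape: <name> PCIe <slot>/FMod <index> <state> ...
FMOD_ROW = re.compile(r"^(disk-\d+)\s+PCIe (\d+)/FMod (\d+)\s+(\S+)")


def parse_fmods(listing):
    """Return (disk_name, pcie_slot, fmod_index, state) for each FMod row."""
    rows = []
    for line in listing.splitlines():
        match = FMOD_ROW.match(line.strip())
        if match:
            name, slot, index, state = match.groups()
            rows.append((name, int(slot), int(index), state))
    return rows


def fmods_to_offline(listing):
    """List every FMod on any card that has at least one non-ok module."""
    rows = parse_fmods(listing)
    bad_slots = {slot for _, slot, _, state in rows if state != "ok"}
    return [name for name, slot, _, _ in rows if slot in bad_slots]


print(fmods_to_offline(LISTING))
# prints ['disk-014', 'disk-015', 'disk-016', 'disk-017']
```

Only disk-016 is faulted in the sample, but all four modules on PCIe 0 are returned, which matches the requirement to offline the whole card.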

 


2. Select the flash module (FMod) and offline it.

7x20:maintenance chassis-000 disk> select disk-016
7x20:maintenance chassis-000 disk-016> set offline=true
offline = true (uncommitted)
7x20:maintenance chassis-000 disk-016> commit
7x20:maintenance chassis-000 disk-016> done

 
3. Repeat the above for the remaining three flash modules.
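
Because steps 2 and 3 repeat the same four CLI lines per module, the full sequence can be written out ahead of time. The Python sketch below only builds the text of the commands shown in step 2; it does not connect to the appliance. The function name offline_commands is hypothetical, and the disk names are the four FMods from the example listing, so substitute the names reported on the actual system.

```python
def offline_commands(disk_names, offline=True):
    """Build the step 2 CLI lines for each disk, in order.

    The lines are meant to be entered from the
    'maintenance chassis-000 disk>' context used in the transcript.
    """
    value = "true" if offline else "false"
    lines = []
    for name in disk_names:
        lines += [
            f"select {name}",
            f"set offline={value}",
            "commit",
            "done",
        ]
    return lines


# FMods on PCIe 0 in the example listing (an assumption for other systems).
fmods = ["disk-014", "disk-015", "disk-016", "disk-017"]
for line in offline_commands(fmods):
    print(line)
```

Calling offline_commands(fmods, offline=False) gives the matching sequence for bringing the modules back online after the replacement.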

4. Shut down the affected appliance and disconnect the power cords.

WHAT ACTION DOES THE ENGINEER NEED TO TAKE:


1. Take ESD precautions. Replace the failed component identified in the alert.

2.1 Replacing the Aura Flash Accelerator PCIe card

  • All four FMods on the card must be offlined first (see the preparation steps above).

2.2 Servicing the Energy Storage Module (ESM)

  • NOTE: See 1509674.1 - Sun ZFS Storage Appliance: A sensor indicates that an energy storage module on the card 'PCIe x' has exceeded its lifespan

  • When replacing only an ESM, the FMods do not need to be offlined, because the cached data is flushed to disk during a proper shutdown.

 

 


3. Install Aura Flash Accelerator PCIe card
    Note: This is a new card when the whole FRU is replaced, or the original card when only the ESM is replaced.

4. Return system to operational state

5. Once power is restored and the SP has booted, clear the fault warnings and reset the F20 card UPTIME value to 0 using ILOM.

   For example:

 

-> show /SYS/MB/RISER0/PCIE0/F20CARD/UPTIME

 /SYS/MB/RISER0/PCIE0/F20CARD/UPTIME
    Targets:

    Properties:
        type = Power Unit
        ipmi_name = PCIE0/F20C/UT
        class = Threshold Sensor
        value = 17405.051 Hours
        upper_nonrecov_threshold = 17500.000 Hours
        upper_critical_threshold = 17200.000 Hours
        upper_noncritical_threshold = 16800.000 Hours
        lower_noncritical_threshold = N/A
        lower_critical_threshold = N/A
        lower_nonrecov_threshold = N/A
        alarm_status = major

    Commands:
        cd
        show


-> set /SYS/MB/RISER0/PCIE0/F20CARD clear_fault_action=true
Are you sure you want to clear /SYS/MB/RISER0/PCIE0/F20CARD (y/n)? y
Set 'clear_fault_action' to 'true'


-> show /SYS/MB/RISER0/PCIE0/F20CARD/UPTIME

 /SYS/MB/RISER0/PCIE0/F20CARD/UPTIME
    Targets:

    Properties:
        type = Power Unit
        ipmi_name = PCIE0/F20C/UT
        class = Threshold Sensor
        value = 0.000 Hours
        upper_nonrecov_threshold = 17500.000 Hours
        upper_critical_threshold = 17200.000 Hours
        upper_noncritical_threshold = 16800.000 Hours
        lower_noncritical_threshold = N/A
        lower_critical_threshold = N/A
        lower_nonrecov_threshold = N/A
        alarm_status = cleared

    Commands:
        cd
        show
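
The UPTIME readings above can be compared against their thresholds with a short Python sketch. It assumes the `show` output has been captured as text; the function names are hypothetical, and the use of "greater than or equal" for the comparison is an assumption rather than documented ILOM behavior. The sample lines are copied from the first output above.

```python
SHOW_OUTPUT = """\
value = 17405.051 Hours
upper_nonrecov_threshold = 17500.000 Hours
upper_critical_threshold = 17200.000 Hours
upper_noncritical_threshold = 16800.000 Hours
lower_noncritical_threshold = N/A
alarm_status = major
"""


def parse_properties(text):
    """Turn 'key = value' lines into a dict of raw strings."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # lines without ' = ' (for example 'Targets:') are skipped
            props[key.strip()] = value.strip()
    return props


def hours(raw):
    """Convert '17405.051 Hours' to a float, or None for 'N/A'."""
    return None if raw == "N/A" else float(raw.split()[0])


def highest_exceeded(props):
    """Name the most severe upper threshold the value has reached."""
    value = hours(props["value"])
    for name in ("upper_nonrecov_threshold",
                 "upper_critical_threshold",
                 "upper_noncritical_threshold"):
        limit = hours(props[name])
        if limit is not None and value >= limit:
            return name
    return None


props = parse_properties(SHOW_OUTPUT)
print(highest_exceeded(props))
# prints upper_critical_threshold
```

With the post-reset value of 0.000 Hours the same function returns None, which agrees with the cleared alarm_status in the second output.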

 

Bring the offlined FMods back online.

7x20:> maintenance hardware select chassis-000 select disk select disk-016
7x20:maintenance chassis-000 disk-016> set offline=false
                       offline = false (uncommitted)
7x20:maintenance chassis-000 disk-016> commit
7x20:maintenance chassis-000 disk-016>

Repeat the above for the remaining three flash modules.

 

6. Check the BUI or CLI
All the Flash drives should report online.
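
The check in step 6 can also be run against a saved CLI disk listing. The Python sketch below is a minimal example under stated assumptions: the listing is whitespace-separated text as in step 1, "ok" is the state a healthy online module reports, each F20 card carries four FMods, and the function name fmod_problems is hypothetical. The sample is the pre-replacement listing from step 1, so it reports the faulted module.

```python
import re
from collections import Counter

# Pre-replacement FMod rows from step 1, used here as sample input.
LISTING = """\
disk-014 PCIe 0/FMod 0 ok MARVELL SD88SA024SA0 0919M00M18
disk-015 PCIe 0/FMod 1 ok MARVELL SD88SA024SA0 0919M00M1D
disk-016 PCIe 0/FMod 2 faulted MARVELL SD88SA024SA0 0919M00M0U
disk-017 PCIe 0/FMod 3 ok MARVELL SD88SA024SA0 0919M00M0V
"""

# Assumed row shape: <name> PCIe <slot>/FMod <index> <state> ...
FMOD_ROW = re.compile(r"^(disk-\d+)\s+PCIe (\d+)/FMod \d+\s+(\S+)")


def fmod_problems(listing, expected_per_card=4):
    """Return a list of human-readable problems; empty means all clear."""
    problems = []
    per_card = Counter()
    for line in listing.splitlines():
        match = FMOD_ROW.match(line.strip())
        if not match:
            continue
        name, slot, state = match.groups()
        per_card[slot] += 1
        if state != "ok":
            problems.append(f"{name}: state is {state}")
    for slot, count in sorted(per_card.items()):
        if count != expected_per_card:
            problems.append(
                f"PCIe {slot}: {count} FMods listed, expected {expected_per_card}"
            )
    return problems


print(fmod_problems(LISTING))
# prints ['disk-016: state is faulted']
```

After a successful replacement, running the same check on a fresh listing should return an empty list.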

OBTAIN CUSTOMER ACCEPTANCE

Confirm the appliance is in a normal state.

WHAT ACTION DOES THE CUSTOMER NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:

Confirm all faults cleared in the appliance BUI.

PARTS NOTE: 541-3731 OR 7075796 - 96GB PCI Express Flash Accelerator F20 SAS HBA

PARTS NOTE: 371-4650 [F] Energy Storage Module (ESM)

PARTS NOTE: 511-1500 PCI Express Flash Board

REFERENCES:

Sun Fire X4270 M2 Server - Service Manual - Section 4.5 Servicing PCIe Cards
https://support.us.oracle.com/handbook_internal/data/821/821-0488/pdf/821-0488-11.pdf

Sun Flash Accelerator F20 PCIe
http://docs.oracle.com/cd/E19682-01/


Sun ZFS Storage 7x20 Appliance Field Service Guide
http://fgw.sfbay.sun.com/twiki-bin/viewfile/FishPublic/WebHome?filename=FRU.pdf

FAB 1433369.1: Sun Storage 7120 BIOS and Firmware Upgrade required to support ILOM for Aura 1.0 ESM monitoring.
https://support.us.oracle.com/oip/faces/secure/km/DocumentDisplay.jspx?id=1433369.1

 

To report errors or request improvements on this procedure, please add a comment

References

<NOTE:1559277.1> - Sun ZFS Storage Appliance: Replacement of Aura Flash Accelerator PCIe Card (F20 SAS HBA) results in zpool degraded status (7120 only)
<NOTE:1452452.1> - Sun Storage 7000 Unified Storage System: How to add a Logzilla to an existing pool
<NOTE:1509674.1> - Sun ZFS Storage Appliance: A sensor indicates that an energy storage module on the card 'PCIe x' has exceeded its lifespan
<NOTE:1433369.1> - Sun Storage 7120 BIOS and Firmware upgrade required to support ILOM for Aura 1.0 ESM monitoring.
<NOTE:1379117.1> - Sun Storage 7000 Unified Storage System: How To Shutdown ZFSSA Cluster

  Copyright © 2018 Oracle, Inc.  All rights reserved.