Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1496018.1
Update Date: 2017-08-24

Solution Type  Technical Instruction Sure

Solution  1496018.1 :   Pillar Axiom: How to replace a Brick Disk Drive using Guided Maintenance [VCAP]  


Related Items
  • Pillar Axiom 300 Storage System
  • Pillar Axiom 500 Storage System
  • Pillar Axiom 600 Storage System
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: DISK-CAP VCAP
  • Microlearning>Video>ML-VID-VCAP




In this Document
Goal
Solution
References


Applies to:

Pillar Axiom 500 Storage System - Version All Versions to All Versions [Release All Releases]
Pillar Axiom 300 Storage System - Version All Versions to All Versions [Release All Releases]
Pillar Axiom 600 Storage System - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal

This document outlines the steps required to replace a disk drive using Guided Maintenance (GM) on a Pillar Axiom Storage System.

 

Solution

DISPATCH INSTRUCTIONS
   WHAT SKILLS DOES THE FIELD ENGINEER/ADMINISTRATOR NEED: An understanding of Guided Maintenance in R4 and R5
   TIME ESTIMATE: 30 minutes
   TASK COMPLEXITY: 0

FIELD ENGINEER/ADMINISTRATOR INSTRUCTIONS:
   PROBLEM OVERVIEW: A disk has failed in an Axiom.
   WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?:  The CRU (Customer Replaceable Unit) can be in Failed/Warning status.
   WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE:

NOTE: Please review the Knowledge Document: <Document 1535352.1> Pillar Axiom: How to Disable Call Home to Prevent Automatic Service Request ASR Generation before proceeding with the procedure below. The steps contained therein are provided to allow an Administrator to de-activate a particular ASR enabled array while performing maintenance or troubleshooting. This will prevent any additional Service Requests from being created unnecessarily.

Read reference docs before proceeding:

<Document 1447672.1> Pillar Axiom: How to use Guided Maintenance in R4 and How to identify a CRU

<Document 1447965.1> Pillar Axiom: How to use Guided Maintenance in R5 and How to identify a CRU

PLEASE READ THE FOLLOWING NOTICES BEFORE DRIVE REPLACEMENT

  • The capacity of the replacement drive must be equal to or greater than that of the other drives in the Brick enclosure.  Please see <Document 1524090.1> Pillar Axiom: Replacing Brick Hard Disk Drives with Larger Capacity Drives.
  • Do not move drives from their original positions. If you move a drive, all data on that drive will be lost. If multiple drives are moved, you will lose data.
  • Immediately replace the component to maintain proper airflow and cooling. Over-temperature conditions will occur if the replacement CRU is not installed into the chassis. Over-temperature conditions can damage other components.
  • If a drive fails, use a sealed spare drive from the Technical Support Center. Do not use a drive of unknown status.  These drives have a unique identifier. The process of writing this identifier to the physical drive is called branding. If the drive is unbranded, the Pillar Axiom system rejects it.
  • Do not attempt to replace a failed drive with one from another Brick or from another Pillar Axiom system.
  • If a new drive is placed in a Fibre Channel (FC) RAID or Expansion Brick, that new drive should have a green LED indicator and should be in Normal state within a few minutes.
  • If a new drive is placed in a SATA (version 1) or SATA (version 2) Brick, the drive remains in Warning status until the task to copy back data from the spare drive is complete.
  • If testing Drive Pull, wait a few seconds after removing the drive before reinserting it. Be sure to check for Administrator Actions to accept the drive. Drive pulls should only be tested on Bricks that have no data or on a Brick that has a spare available, because pulling the drive will cause a rebuild to start. Important! Contact the Technical Support Center before pulling a drive.
  • If a drive fails to be accepted into a Brick and the drive is set to Rejected status, do not attempt to use that drive. Contact the Technical Support Center for another drive and for assistance.
  • If an Administrative Action asking you to accept the drive is generated, be sure to select the Accept Drive option, which initiates a copyback operation for all but FC drives. If an Administrative Action to Accept a Drive is ever answered negatively, do not attempt to use that drive again. Contact Support for another drive.
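
The notices above amount to a short checklist that can be applied to a candidate drive before it is inserted. The following Python sketch is illustrative only: the Drive record, its field names, and check_replacement are hypothetical and are not part of Guided Maintenance or any Pillar Axiom tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Drive:
    """Hypothetical record describing a candidate replacement drive."""
    capacity_gb: int
    branded: bool                      # unique identifier has been written to the drive
    sealed_spare: bool                 # sealed spare from the Technical Support Center
    previously_rejected: bool = False  # an Accept Drive action was answered negatively


def check_replacement(candidate: Drive, brick_capacities_gb: List[int]) -> List[str]:
    """Return the reasons the candidate should not be used (empty list if none found)."""
    problems = []
    # Capacity must be equal to or greater than the other drives in the Brick.
    if brick_capacities_gb and candidate.capacity_gb < max(brick_capacities_gb):
        problems.append("capacity is smaller than the other drives in the Brick")
    # An unbranded drive is rejected by the system.
    if not candidate.branded:
        problems.append("drive is unbranded")
    # Drives of unknown status (for example, from another Brick) must not be used.
    if not candidate.sealed_spare:
        problems.append("drive is not a sealed spare")
    # A drive that was rejected once must not be tried again.
    if candidate.previously_rejected:
        problems.append("drive was previously rejected")
    return problems
```

An empty result means only that none of the listed conditions were detected; it does not replace the acceptance checks that the system itself performs.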

 

Spare Drive Replacement

Axiom bricks have 2 types of hot spare drives: dedicated and floating.

Dedicated hot spare drives are only found in SATA and SSD (Solid State Disk) bricks.  The dedicated hot spare is the 13th drive, located in the rear of the brick.  If one of the 12 data drives fails, the data on the failed drive is reconstructed to the hot spare drive.  Once the failed drive is replaced and the reconstruction completes, the data is copied back to the replacement drive and the hot spare drive resumes its role as a spare.

Important!  Do not remove a dedicated hot spare drive if either the ACT0 or the ACT1 LED is blinking.  A blinking LED indicates that the hot spare drive is in use and has not failed.

Floating hot spare drives are only found in FC (Fibre Channel) bricks.  The floating hot spare is one of the 12 drives located in the front of the brick.  If one of the 11 data drives fails, the data on the failed drive is reconstructed to the hot spare drive.  Once the failed drive is replaced, the replacement becomes the new hot spare drive and the previous hot spare drive becomes a permanent data drive.

Note: When Guided Maintenance beacons the spare drive to identify it, it turns off the ACT0, RDY and ACT1 LEDs and turns on the FLT LED solid amber.
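
The difference between the two spare types can be summarized in a small lookup. This is a minimal sketch for illustration; the function name spare_layout and the dictionary keys are hypothetical, and the values simply restate the description above.

```python
def spare_layout(brick_type: str) -> dict:
    """Return the hot spare arrangement described above for a given Brick type."""
    kind = brick_type.upper()
    if kind in ("SATA", "SSD"):
        return {
            "spare": "dedicated",
            "data_drives": 12,
            "spare_location": "rear of the brick (13th drive)",
            # After copyback, the dedicated spare resumes its role as a spare.
            "replacement_becomes": "data drive",
        }
    if kind == "FC":
        return {
            "spare": "floating",
            "data_drives": 11,
            "spare_location": "front of the brick (one of 12 drives)",
            # The previous spare stays in place as a permanent data drive.
            "replacement_becomes": "hot spare",
        }
    raise ValueError("unknown Brick type: " + brick_type)


# Example: an FC Brick keeps 11 data drives plus one floating spare.
print(spare_layout("FC")["spare"])
```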

SATA and SSD Brick Spare Drive Replacement

To help you identify the target Brick that has the CRU that needs to be replaced, Guided Maintenance beacons the bezel LEDs on the target Brick. If you click Reverse Identify in the GUI, Guided Maintenance beacons the LEDs on all Bricks except for the target Brick.
After you click Prepare System, Guided Maintenance continues the replacement process only if the spare drive is not in use. If the spare drive is in use, Guided Maintenance reports this fact. You can try again or exit Guided Maintenance.

Important! Removal of the spare drive can occur only when it is not in use. A spare drive is in use when an array drive has failed or is being rebuilt. To replace the spare drive, first replace the failed drive in the array or wait until the drive rebuild process is complete.  After the system is prepared, Guided Maintenance displays a completion message and enables Next.
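
The rule that the spare can be removed only when it is not in use can be expressed as a simple check. The sketch below is an assumption-laden illustration: the state strings and the function spare_removable are hypothetical and do not reflect how Guided Maintenance is implemented.

```python
# Hypothetical state names; a spare is in use when an array drive
# has failed or is being rebuilt.
IN_USE_STATES = {"failed", "rebuilding"}


def spare_removable(array_drive_states):
    """Return True only when no array drive is failed or being rebuilt."""
    return not any(state.lower() in IN_USE_STATES for state in array_drive_states)
```

For example, spare_removable(["Normal"] * 12) is True, while a list that contains "Failed" or "Rebuilding" returns False, which corresponds to replacing the failed array drive first or waiting for the rebuild to complete.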

Removing a Spare SATA or SSD Drive

1.  When Guided Maintenance prompts you to remove the spare drive, unscrew the two screws that secure the locking tabs to the spare drive casing.  Springs retain the screws in the locking tabs.

2.  Push the two locking tabs down.  The spare drive disengages from the Brick's midplane.

3.  Slide the spare drive out of the chassis and set it aside.

Important! If Guided Maintenance encounters a problem at this stage, you must contact the Technical Support Center to continue Guided Maintenance for this CRU.

 

Inserting a Spare SATA or SSD Drive

1.  Slide the replacement spare drive into the Brick chassis and push the drive into place.

2.  Lift up the locking tabs to engage the spare drive with the Brick midplane.

NOTE: The RDY LED should begin flashing green when the drive is inserted. This LED should stop flashing and light steady green within one minute.  A burst of flashing should then be seen on the CU 1 and CU 0 LEDs. If the RDY LED continues to flash or the FLT LED lights, contact the Oracle Support Center.

3.  Tighten the two screws that are located on either side of the component into the back of the chassis until they are firmly secured. Do not overtighten.

4.  In Guided Maintenance, click Next.

5.  Choose one of these options as appropriate:

  • If prompted to acknowledge the successful discovery of the spare drive replacement, click OK.
  • If the replacement is not new, Guided Maintenance opens a dialog box and displays the prompt "Are you sure you want to do this?"

Choose one of these options:

  • Click OK to accept the replacement. Acceptance binds this drive to this Brick and destroys any data that may have existed on the drive.
  • Click Cancel to reject the replacement. Rejection terminates this procedure and retains any previous data that might have existed on the drive.

Important! If you reject the replacement spare drive, it cannot be used again in this system.

6.  Review the status of the replacement CRU to ensure that it is Normal.

When the CRU replacement process is complete, the Pillar Axiom system reports the status of the CRU.  After Guided Maintenance successfully validates the drive replacement, the drive is bound to that Brick.

 

Data Drive Replacement (SATA, SSD and FC)

1.  Within Guided Maintenance, click Next in the Prepare System page.

2.  When Guided Maintenance prompts you to remove the drive, press the cam latch button on the face of the drive carrier to release the cam latch.

3.  Open the cam latch fully.  The drive disengages from the Brick's midplane. The system then begins rebuilding the data that was on the drive from parity data to the spare drive. This process can take several hours.

4.  Slide the drive out of the chassis and set it aside.

If you observe an Administrator Action to accept the foreign drive, be sure to click Accept. If the drive came from a spares kit, the Accept Foreign Drive task should begin automatically within a few minutes.

Note: If a Copyback or Rebuild operation to this drive occurs, the Accept Foreign Drive task will not complete until that operation completes. After you insert this CRU into a Brick control unit (CU), use Guided Maintenance to complete the replacement procedure.

Important! If Guided Maintenance encounters a problem at this stage, you must contact the Technical Support Center to continue Guided Maintenance for this CRU.

Inserting a drive improperly can cause errors and faults. Follow these instructions to ensure success.

5.  Fully open the cam latch on the replacement drive and slide the drive into the Brick chassis until it snaps into place.

    If a drive is not fully seated, either or both of the following will be true:

    ● The metal portion of the carrier will be visible.

    ● The front of the drive carrier will not be flush with the other carriers.

    Important! Do not unlatch and re-latch a drive carrier unnecessarily. Doing so can cause problems later.

6.  Close the cam latch until it snaps shut to engage the drive with the Brick midplane.   The center LED should flash green for up to one minute.

7.  In Guided Maintenance, click Next.

While the system checks the drive for acceptance, the drive status displays as Foreign. Also, you should see brief bursts of activity on the top and bottom LEDs as each RAID controller checks the drive. After a short while, the center LED should light steady green. Important! If the center LED lights amber, the system has rejected the drive or the drive failed to spin up properly. Contact the Customer Support Center.

 

If an Administrative Action asking you to accept the drive is generated, be sure to select the Accept Drive option, which initiates a copyback operation. If an Administrative Action to Accept a Drive is ever answered negatively, do not attempt to use that drive again. Contact Support for another drive.  Accepting a drive into the system (foreign or otherwise) will erase that disk.

8.  Choose one of the following options:

    ● If prompted to acknowledge the successful discovery of the drive replacement, click OK to accept the drive.

    ● If the replacement is not new, Guided Maintenance displays a dialog box that contains the prompt “Are you sure you want to do this?”

       Choose one:

  • Click OK to accept the replacement.

    Acceptance binds this drive to this Brick and destroys any data that may have existed on the drive.

Note: When you click OK, the system copies the data from the spare drive back to the array drive for all drive types except FC.  FC drives do not reconstruct and the replaced drive becomes the new spare drive.  The status of this drive is Copying Back and the spare drive remains in use during this period. Under some circumstances, if there are two failed drives in the Brick, the new drive may go to a Rebuild status indicating that the array is being rebuilt from parity.

  • Click Cancel to reject the replacement. Rejection terminates this procedure and retains any previous data that might have existed on the drive. Important! If you reject the drive, you cannot use it in this system again.

9.  When the copyback process completes, review the status of the replacement CRU to ensure that:

    ● The status of the replacement CRU is Normal.

    ● The task to accept the drive has completed successfully.
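
The outcome described in the note under step 8 depends on the drive type and on how many drives have failed in the Brick. The following sketch restates that behavior; the function status_after_accept is hypothetical, and the "Spare" label for FC drives is an illustrative name rather than a status string confirmed by this document.

```python
def status_after_accept(brick_type: str, failed_drives_in_brick: int = 1) -> str:
    """Return the status a newly accepted drive is expected to show."""
    # With two failed drives in the Brick, the new drive may instead be
    # rebuilt from parity (the document says this happens "under some circumstances").
    if failed_drives_in_brick >= 2:
        return "Rebuild"
    # FC drives do not copy back; the replacement becomes the new spare.
    if brick_type.upper() == "FC":
        return "Spare"
    # SATA and SSD drives receive the data held on the spare drive.
    return "Copying Back"


# Example: a single failed drive in a SATA Brick.
print(status_after_accept("SATA"))
```

Whatever the intermediate status, step 9 still applies: the replacement is complete only when the CRU status is Normal and the accept task has finished.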

 

OBTAIN CUSTOMER ACCEPTANCE
   WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:

Complete Guided Maintenance and re-activate ASR <Document 1508403.1> ASR Deactivation/Reactivation via My Oracle Support

PARTS NOTE: Return the parts if requested by Support/Engineering via the CPAS system, otherwise return the replaced parts as normal.

REFERENCE INFORMATION:

 

References

<NOTE:1447965.1> - Pillar Axiom: How to use Guided Maintenance in R5 and How to identify a FRU
<NOTE:1447672.1> - Pillar Axiom: How to use Guided Maintenance in R4 and How to identify a FRU

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.