Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1311776.1
Update Date:2018-01-16

Solution Type  Technical Instruction Sure

Solution  1311776.1 :   CAM - How to Remove and Replace Midrange Disk Impending Disk Failure


Related Items
  • Sun Storage 6580 Array
  • Sun Storage Flexline 240 Array
  • Sun Storage 6180 Array
  • Sun Storage Flexline 280 Array
  • Sun Storage 2540-M2 Array
  • Sun Storage 2510 Array
  • Sun Storage 2540 Array
  • Sun Storage 6780 Array
  • Sun Storage 6140 Array
  • Sun Storage Flexline 380 Array
  • Sun Storage 2530 Array
  • Sun Storage 2530-M2 Array
  • Sun Storage 6540 Array
  • Sun Storage 6130 Array
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: DISK-CAP VCAP

In this Document
Goal
Solution
References


Applies to:

Sun Storage 6130 Array - Version Not Applicable and later
Sun Storage 2530 Array - Version Not Applicable and later
Sun Storage 6580 Array - Version Not Applicable and later
Sun Storage Flexline 240 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 2540 Array - Version Not Applicable and later
Information in this document applies to any platform.

Goal

How to replace a disk that is reporting an Impending Failure in CAM

Solution

DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE ADMINISTRATOR/ENGINEER NEED:(IS A SITE ENGINEER AVAILABLE?)

The replacement instructions are well documented in the Common Array Manager service advisor.

TASK COMPLEXITY: 0

TIME ESTIMATE: 20 minutes

FIELD ENGINEER INSTRUCTIONS

PROBLEM OVERVIEW:

An Impending Failure fault on the array indicates that an HDD is about to fail.

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?: N/A

WHAT ACTION DOES THE ENGINEER NEED TO TAKE:

NOTE: If ASR is enabled, it should be deactivated temporarily before servicing the equipment so that additional unnecessary Service Requests are not created.

DEACTIVATING ASR WITH CAM 6.10: ASR is automatically deactivated when you select "Reserve the tray for maintenance" in Service Advisor, and it is reactivated when you select "Release the tray from maintenance" in Service Advisor.

DEACTIVATING ASR WITH CAM 6.9 OR EARLIER: Before proceeding with the part replacement:

1. Log into CAM.
2. Go to "Storage Systems" -> [your arrayname] -> "Administration" -> "Array Health Monitoring".
3. Uncheck the box "Enable ASR for this array" under the section "Monitoring for This Array". If the box is already unchecked, no action is required. Note the original state of the box, however, as it will matter in step 3 of REACTIVATING ASR WITH CAM 6.9 OR EARLIER below.
4. Click on "Save" if you have unchecked the box in step 3 above.

1. Verify HDD status.

a) If the HDD is a HOT SPARE, the administrator will need to UNASSIGN it before proceeding. Failure to do so may result in an alarm of "Missing Hot Spare Drive". Please consult DOCUMENT 1450121.1 if this occurs.

b) If the HDD is in a single-disk RAID 0 then delete the volume and vDisk before the disk is replaced. Please consult DOCUMENT 1345746.1 for issues with missing volumes if they are not deleted before the disk replacement. 

If the HDD is unassigned, continue to Step 3.

2. Use CAM to verify the alarms on the array. If there is already a Degraded Volume and/or Hot Spare in Use fault for the HDD, continue to Step 3.

WARNING: FOR RAID 0, if the fault is "Impending Failure Risk High", replacing the HDD will cause data loss. The volumes impacted by this fault should be removed from server access before the replacement.

The HDD must be manually failed prior to replacement. In CAM, go to the array -> "Physical Devices" -> "Disks", click the disk, and then click the "Fail" button.

3. Use the Service Advisor (SA) for the array in question to review the HDD replacement directions. This will also show you how to toggle the HDD location indicator for replacement. Use the indicator to locate the HDD in question.

- If the HDD is an 'unassigned drive' it can be safely replaced. 

- If the HDD is an 'assigned drive', it should be failed/faulted in the SA. If an 'assigned drive' is not failed/faulted, contact the TSC or verify the reason for the HDD replacement.

4. The specified HDD location should be indicated by a white location LED, depending on the tray. Additionally, failed/faulted drives will have an amber fault LED lit for the tray and slot.

5. Remove the HDD and wait 2 minutes so that the array controllers can register the removal. Then verify that the replacement HDD matches the original in:

a) type: SAS/SATA/FC/SSD

b) size

c) RPM

NOTE: the HDD make and model do not have to be the same, only the type, size and RPM.

6. Replace the HDD according to the instructions in the SA.

NOTE: If the HDD firmware needs updating, the customer will have to schedule this for a later date, because the copy back typically starts immediately.
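
The matching rule in step 5 (same type, size and RPM; make and model may differ) can be sketched as a small check. This is only an illustration: the `Drive` record, its field names, the gigabyte unit and the use of 0 RPM for SSDs are assumptions made for the example and are not part of CAM.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Drive:
    """Hypothetical record of the attributes that step 5 compares."""
    drive_type: str   # "SAS", "SATA", "FC" or "SSD"
    size_gb: int      # nominal capacity; the unit is an assumption
    rpm: int          # rotational speed; 0 is used here for SSDs
    model: str = ""   # informational only, never compared


def mismatches(original: Drive, replacement: Drive) -> List[str]:
    """Return the step 5 attributes in which the two drives differ."""
    problems = []
    if original.drive_type.upper() != replacement.drive_type.upper():
        problems.append("type")
    if original.size_gb != replacement.size_gb:
        problems.append("size")
    if original.rpm != replacement.rpm:
        problems.append("RPM")
    return problems


old = Drive("FC", 300, 15000, model="vendor-A")
new = Drive("fc", 300, 15000, model="vendor-B")
print(mismatches(old, new))                       # [] : make/model may differ
print(mismatches(old, Drive("SATA", 300, 7200)))  # ['type', 'RPM']
```

An empty result means the replacement satisfies the rule in step 5; any listed attribute should be resolved before the drive is inserted.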

OBTAIN CUSTOMER ACCEPTANCE

WHAT ACTION DOES THE CUSTOMER NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:
1. Verify with the customer that the HDD is in an OK or Optimal state.

2. Verify with the customer that the VDisk is reconstructing (if RAID 1, 3, 5 or 6). If it is not, you may need to manually start the reconstruction.

3. Verify that all alarms regarding bypassed HDDs, degraded HDD channels, and impending failures have been cleared from the system.

4. If the VDisk is a RAID 0, the customer will have to re-create the vDisk and volumes and then restore the data from backup.
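
The RAID-dependent follow-up in steps 2 and 4 above can be summarised as a lookup. The sketch below is illustrative only: the function name and the returned wording are made up for this example, and the only RAID levels it knows are the ones named in this document.

```python
REDUNDANT_LEVELS = {1, 3, 5, 6}   # levels for which the VDisk reconstructs


def post_replacement_action(raid_level: int, reconstructing: bool) -> str:
    """Map the VDisk RAID level and state to the follow-up described above."""
    if raid_level == 0:
        # RAID 0 has no redundancy, so there is nothing to reconstruct from.
        return "re-create the vDisk and volumes, then restore data from backup"
    if raid_level in REDUNDANT_LEVELS:
        if reconstructing:
            return "no action: reconstruction is in progress"
        return "reconstruction may need to be started manually"
    raise ValueError(f"RAID level {raid_level} is not covered by this procedure")


print(post_replacement_action(5, reconstructing=False))
print(post_replacement_action(0, reconstructing=False))
```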

REACTIVATING ASR WITH CAM 6.10: ASR is automatically deactivated when you select "Reserve the tray for maintenance" in Service Advisor, and it is reactivated when you select "Release the tray from maintenance" in Service Advisor. If ASR was not enabled before beginning this procedure, selecting "Release the tray from maintenance" will not activate it.

REACTIVATING ASR WITH CAM 6.9 OR EARLIER: After completing the part replacement:

1. Log into CAM.
2. Go to "Storage Systems" -> [your arrayname] -> "Administration" -> "Array Health Monitoring".
3. Check the box "Enable ASR for this array" under the section "Monitoring for This Array". Do this ONLY if you unchecked the box in step 3 of DEACTIVATING ASR WITH CAM 6.9 OR EARLIER. If you found the box was unchecked, leave it unchecked now.
4. Click on "Save" if you have checked the box to reactivate ASR.
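
For CAM 6.9 or earlier, the rule is that ASR is re-enabled only if the engineer was the one who disabled it. A minimal sketch of that bookkeeping follows; the function names are hypothetical and nothing here talks to CAM, it only models the state of the "Enable ASR for this array" check box.

```python
from typing import Tuple


def deactivate_for_service(box_checked: bool) -> Tuple[bool, bool]:
    """Model the deactivation steps.

    Returns (box state after the steps, whether we unchecked the box).
    """
    if box_checked:
        return False, True    # uncheck the box and click "Save"
    return False, False       # already unchecked: no action, nothing to save


def reactivate_after_service(box_checked: bool, we_unchecked_it: bool) -> bool:
    """Model the reactivation steps; returns the final box state."""
    if we_unchecked_it:
        return True           # check the box and click "Save"
    return box_checked        # leave the box exactly as it was found


# ASR was enabled before service, so it is enabled again afterwards.
state, flag = deactivate_for_service(True)
print(reactivate_after_service(state, flag))   # True

# ASR was already disabled, so it stays disabled.
state, flag = deactivate_for_service(False)
print(reactivate_after_service(state, flag))   # False
```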

 

If the CAM BUI is not available, it should be possible to fail the disk using the CAM Command Line Interface (CLI).

The paths to the commands are:

   Solaris: /opt/SUNWsefms/bin
   Linux: /opt/sun/cam/private/fms/bin
   Windows: c:\program files\Sun\Common Array Manager\Component\fms\bin
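
When scripting against the CLI, the directory can be chosen from the list above based on the host operating system. The following is a sketch under assumptions: the helper name `cam_cli_dir` is hypothetical, the paths are copied from the list above, and the keys are the strings that Python's `platform.system()` reports on each platform ("SunOS" for Solaris).

```python
import platform

# Install locations as listed above, keyed by platform.system() value.
CLI_DIRS = {
    "SunOS": "/opt/SUNWsefms/bin",
    "Linux": "/opt/sun/cam/private/fms/bin",
    "Windows": r"c:\program files\Sun\Common Array Manager\Component\fms\bin",
}


def cam_cli_dir(system: str = "") -> str:
    """Return the CAM CLI directory for the given (or current) OS."""
    system = system or platform.system()
    try:
        return CLI_DIRS[system]
    except KeyError:
        raise ValueError(f"no CAM CLI path listed for {system!r}") from None


print(cam_cli_dir("SunOS"))   # /opt/SUNWsefms/bin
```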

Solaris Example

     /opt/SUNWsefms/bin/lsscs list -a st6140-tvp540-a disk            # List the disks in the array and check status

     /opt/SUNWsefms/bin/lsscs fail -a st6140-tvp540-a disk t0d03   # Fail the disk
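
The two commands in the Solaris example can be assembled programmatically so that the array and disk names are filled in consistently. The sketch below only builds and prints the command lines; it does not execute anything. The argument order is copied from the example above, the helper names are hypothetical, and the array and disk names are the sample values from this document.

```python
import shlex
from typing import List

LSSCS = "/opt/SUNWsefms/bin/lsscs"   # Solaris path from the list above


def list_disks_cmd(array: str) -> List[str]:
    """Command that lists the disks in the array so their status can be checked."""
    return [LSSCS, "list", "-a", array, "disk"]


def fail_disk_cmd(array: str, disk: str) -> List[str]:
    """Command that fails one disk, in the same form as the example above."""
    return [LSSCS, "fail", "-a", array, "disk", disk]


array, disk = "st6140-tvp540-a", "t0d03"
print(shlex.join(list_disks_cmd(array)))
print(shlex.join(fail_disk_cmd(array, disk)))
```

Reviewing the output of the list command before running the fail command helps confirm that the intended disk is targeted.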

 

 

References

<NOTE:1450121.1> - How to Resolve a Missing Hot Spare Drive
<NOTE:1345746.1> - Sun Storage 2500, 2500-M2 and 6000 Arrays : Missing Volumes Reported After Disk Replacement in Single-disk RAID-0 Vdisk

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.