Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1500079.1
Update Date:2018-04-20
Keywords:

Solution Type  Technical Instruction Sure

Solution  1500079.1 :   Sun Storage 3000 Arrays: How to Proactively Fail and Remove a Failing Disk  


Related Items
  • Sun Storage 3511 SATA Array
  • Sun Storage 3510 FC Array
  • Sun Storage 3310 Array
  • Sun Storage 3320 SCSI Array
Related Categories
  • PLA-Support>Sun Systems>DISK>Arrays>SN-DK: SE31xx_33xx_35xx




In this Document
Goal
Solution
References


Applies to:

Sun Storage 3511 SATA Array - Version Not Applicable and later
Sun Storage 3510 FC Array - Version Not Applicable and later
Sun Storage 3320 SCSI Array - Version Not Applicable and later
Sun Storage 3310 Array - Version Not Applicable and later
Information in this document applies to any platform.

Goal

The Sun Storage 3000 Arrays use RAID and SMART (Self-Monitoring, Analysis, and Reporting Technology) to recover from media errors on a disk. These features allow a disk drive to incur a small number of corrected I/O faults and continue as an active member of the logical drive. In some cases the disk remains active even though it has exceeded any reasonable error threshold. This document explains how to relocate the data off such a drive before the disk is removed from the array and replaced.

Solution

The problem can be identified with the sccli utility. In the example below, the disk remains ONLINE even though multiple error corrections are occurring.

 

sccli> show LD
LD    LD-ID        Size  Assigned  Type   Disks Spare  Failed Status
------------------------------------------------------------------------
ld2   32B69784 204.72GB  Primary   RAID1  2     2      0      Good
                         Write-Policy Default          StripeSize 128KB

sccli> show disks
Ch     Id      Size   Speed  LD     Status     IDs                      Rev
----------------------------------------------------------------------------
 2(3)   8  136.73GB   200MB  ld2    ONLINE     SEAGATE ST314680FSUN146G 0207
                                                  S/N 3HY0D33400007327
                                                  WWNN 20000004CFD8E07C
                                                  Mirror (2.9)
 2(3)   9  136.73GB   200MB  ld2    ONLINE     SEAGATE ST314680FSUN146G 0207
                                                  S/N 3HY0DFFJ00007327
                                                  WWNN 20000004CFD8E1A2
                                                  Mirror (2.8)


sccli> show events
   Sat Dec 22 14:42:01 2012
   [Primary]     Notification
   NOTICE: SMART-CH:2 ID:8 Predictable Failure Detected

   Sun Dec 23 10:36:41 2012
   [Primary]     Notification
   NOTICE: SMART-CH:2 ID:8 Predictable Failure Detected

   etc....
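
As a rough illustration of spotting such a disk, the sketch below counts SMART predictable-failure notices per disk in captured `show events` text. The regular expression is modeled only on the NOTICE lines shown above; the function name `count_smart_notices` is hypothetical, and other firmware revisions may word the event differently.

```python
import re
from collections import Counter

# Pattern modeled on the NOTICE lines in the example above (assumption:
# other firmware revisions may format the event text differently).
SMART_RE = re.compile(r"SMART-CH:(\d+)\s+ID:(\d+)\s+Predictable Failure Detected")


def count_smart_notices(events_text):
    """Return a Counter mapping 'channel.id' to the number of SMART notices."""
    counts = Counter()
    for match in SMART_RE.finditer(events_text):
        channel, disk_id = match.groups()
        counts[f"{channel}.{disk_id}"] += 1
    return counts


# Sample text copied from the event log excerpt above.
sample_events = """\
   Sat Dec 22 14:42:01 2012
   [Primary]     Notification
   NOTICE: SMART-CH:2 ID:8 Predictable Failure Detected

   Sun Dec 23 10:36:41 2012
   [Primary]     Notification
   NOTICE: SMART-CH:2 ID:8 Predictable Failure Detected
"""

if __name__ == "__main__":
    for disk, n in count_smart_notices(sample_events).items():
        print(f"disk {disk}: {n} SMART predictable-failure notice(s)")
```

Repeated notices against the same channel and ID, as for disk 2.8 here, suggest the disk is a candidate for proactive replacement.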

 

 There are many other disk drive events that may warrant proactively failing the disk. See Appendix E of the Sun StorEdge 3000 Family RAID Firmware 4.2x User's Guide for the complete set of warnings and alerts.

 

 The disk has now been identified for replacement. However, there is no command to proactively fail the disk. Instead, we will use the Clone feature of the array to move the data off of the drive.

 This may be done with the sccli utility, or via the console interface.

 

 Before cloning a disk, identify an unused disk, or install a new disk in a spare slot. In the above example, the logical drive has two spare disks, one of which (disk 2.3) will be used as the target drive to which the failing disk's data is copied.

 

If a spare disk is not already installed in the array, determine whether the array has an unused slot in which to install the spare or replacement disk. Verify that the empty slot is connected to the same data channel as the existing logical drive. Refer to the array documentation and, if in doubt, seek assistance from Oracle.

 

Using the example above:

From the telnet/tip interface:

View and edit Drives ->
   Select the disk.
   Chl  ID  Size(MB) Speed    LG_DRV  Status    Vendor       Product ID
   2(3)  8  140009   200MB      2     ON-LINE   SEAGATE  ST314680FSUN146G
Clone Failing drive ->
Replace After Clone ->
Clone and Replace Drive -> Yes

To check the progress of the clone:
View and edit Drives ->
   Select the destination disk which is cloning.
   Chl  ID  Size(MB) Speed    LG_DRV  Status     Vendor     Product ID
   2(3)  3  140009   200MB      2     CLONING   SEAGATE  ST314680FSUN146G
Clone Failing drive ->
View clone progress -> XX% Completed

 

From sccli:

 sccli> clone 2.8 2.3
 sccli: start clone 2.8 to 2.3
 sccli>
 sccli> show clone
         Ch  ID  Status
         -------------------
          2  3  2% complete
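
The progress figure can be read programmatically from captured `show clone` text. The following is a minimal sketch that assumes the row layout shown above (channel, ID, then "N% complete"); the function name `parse_clone_progress` is hypothetical, and the layout may differ on other sccli releases.

```python
import re

# Matches data rows such as "  2  3  2% complete" from the sample output above.
# The column layout on other firmware/sccli releases is an assumption.
CLONE_ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)%\s+complete\s*$")


def parse_clone_progress(output):
    """Return {'channel.id': percent} for every clone target listed."""
    progress = {}
    for line in output.splitlines():
        match = CLONE_ROW_RE.match(line)
        if match:
            channel, disk_id, percent = match.groups()
            progress[f"{channel}.{disk_id}"] = int(percent)
    return progress


# Sample text copied from the "show clone" output above.
sample_clone = """\
         Ch  ID  Status
         -------------------
          2  3  2% complete
"""

if __name__ == "__main__":
    for disk, percent in parse_clone_progress(sample_clone).items():
        print(f"clone target {disk}: {percent}% complete")
```

The header and separator lines do not match the pattern, so only the data rows are returned.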

See Chapter 9 of the Sun StorEdge 3000 Family RAID Firmware 4.2x User's Guide for complete details on cloning and fault protection. Note that you cannot clone a disk in an NRAID logical drive. Cloning is supported only for RAID 0, 1, 3 and 5.

In the best case, the clone completes and the disk may be safely removed. When this occurs, show disks identifies the failing disk as USED or FRMT:

sccli> show disks
Ch     Id      Size   Speed  LD     Status     IDs                      Rev
----------------------------------------------------------------------------
 2(3)   3  136.73GB   200MB  ld2    ONLINE     SEAGATE ST314680FSUN146G 0207
                                                  S/N 3HY0DYZT00007327
                                                  WWNN 20000004CFD8DF4A
                                                  Mirror (2.9)
 2(3)   8  136.73GB   200MB  NONE   FRMT       SEAGATE ST314680FSUN146G 0207
                                                  S/N 3HY0D33400007327
                                                  WWNN 20000004CFD8E07C
 2(3)   9  136.73GB   200MB  ld2    ONLINE     SEAGATE ST314680FSUN146G 0207
                                                  S/N 3HY0DFFJ00007327
                                                  WWNN 20000004CFD8E1A2
                                                  Mirror (2.3)
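
The check described above can be sketched in code: parse captured `show disks` text and confirm that the cloned-from disk reports USED or FRMT before pulling it. This is only an illustration based on the row layout in the output above; the names `parse_show_disks` and `safe_to_remove` are hypothetical, and the continuation lines (S/N, WWNN, Mirror) are deliberately ignored.

```python
import re

# Matches the first line of each disk entry, e.g.
# " 2(3)   8  136.73GB   200MB  NONE   FRMT   SEAGATE ..."
# Column order is taken from the sample output above (assumption for other releases).
DISK_ROW_RE = re.compile(
    r"^\s*(\d+)\(\d+\)\s+(\d+)\s+\S+\s+\S+\s+(\S+)\s+(\S+)"
)

# Statuses the article names for a disk whose data was cloned away.
REMOVABLE_STATUSES = {"USED", "FRMT"}


def parse_show_disks(output):
    """Return {'channel.id': (logical_drive, status)} for each disk row."""
    disks = {}
    for line in output.splitlines():
        match = DISK_ROW_RE.match(line)
        if match:
            channel, disk_id, ld, status = match.groups()
            disks[f"{channel}.{disk_id}"] = (ld, status)
    return disks


def safe_to_remove(disks, disk):
    """True if the given disk reports USED or FRMT."""
    return disks.get(disk, (None, None))[1] in REMOVABLE_STATUSES


# Sample rows copied from the post-clone "show disks" output above.
sample_disks = """\
Ch     Id      Size   Speed  LD     Status     IDs                      Rev
----------------------------------------------------------------------------
 2(3)   3  136.73GB   200MB  ld2    ONLINE     SEAGATE ST314680FSUN146G 0207
                                                  S/N 3HY0DYZT00007327
 2(3)   8  136.73GB   200MB  NONE   FRMT       SEAGATE ST314680FSUN146G 0207
                                                  S/N 3HY0D33400007327
 2(3)   9  136.73GB   200MB  ld2    ONLINE     SEAGATE ST314680FSUN146G 0207
                                                  S/N 3HY0DFFJ00007327
"""

if __name__ == "__main__":
    disks = parse_show_disks(sample_disks)
    print("2.8 safe to remove:", safe_to_remove(disks, "2.8"))
```

A disk that is missing from the output is treated as not safe to remove, which keeps the check conservative.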

 

If the clone fails, the drive state may change to BAD or remain unchanged. In either case, continue with the replacement procedure in <Document 1010064.1> Sun Storage 3000 Arrays: How to Replace a Hard Drive. Section 2.2 of the Sun StorEdge 3000 Family FRU Installation Guide also covers disk replacement.

 


 


  Copyright © 2018 Oracle, Inc.  All rights reserved.