Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1164893.1
Update Date:2017-07-17
Keywords:

Solution Type  Problem Resolution Sure

Solution  1164893.1 :   Copy Back not Starting After Replacing a Faulty Drive in a Sun Storage 2500, 2500-M2, 6140, 6540, 6580, 6780 and Flexline 380  


Related Items
  • Sun Storage Flexline 380 Array
  • Sun Storage 6580 Array
  • Sun Storage 2540-M2 Array
  • Sun Storage 6540 Array
  • Sun Storage 2540 Array
  • Sun Storage 6180 Array
  • Sun Storage 2510 Array
  • Sun Storage 6780 Array
  • Sun Storage 2530 Array
  • Sun Storage 2530-M2 Array
  • Sun Storage 6140 Array
Related Categories
  • PLA-Support>Sun Systems>DISK>Arrays>SN-DK: FLX300_65xx_6780
  • _Old GCS Categories>Sun Microsystems>Storage - Disk>Modular Disk - 6xxx Arrays


When a Global Hot Spare (GHS) is used due to a disk drive failure, the array copies the data back to a new, replacement drive after it is inserted into the storage system. Certain conditions prevent this copy back from starting. This document describes these conditions.

In this Document
Symptoms
Cause
Solution


Applies to:

Sun Storage 6780 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage Flexline 380 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 2530 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 6540 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 2540 Array - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Symptoms

Use case 1

  1. Drive failed by SYSTEM or USER.
  2. Reconstruction completes to GHS successfully.
  3. Failed drive is replaced in the enclosure.

Results:

  • For firmware 6.xx.xx.xx:
    The copy back operation will start, provided that no more than two (2) reconstruction or copy back operations, in any combination, are in progress on the system. If more are in progress, it will be queued.
  • For firmware 7.10.xx.xx (all revisions) through 7.35.xx.xx (all revisions):
    The copy back operation will start, provided that no more than two (2) reconstruction or copy back operations, in any combination, are in progress on the system. If more are in progress, user intervention is required to trigger the copy back. (See the Solution below)
  • For firmware 7.50.xx.xx (all revisions) and higher:
    The copy back operation will start, provided that no more than two (2) reconstruction or copy back operations, in any combination, are in progress on the system. If more are in progress, it will be queued.

Use case 2

  1. Drive failed by SYSTEM or USER.
  2. Reconstruction to GHS starts.
  3. Failed drive is removed and replaced before the reconstruction completes.
  4. Reconstruction completes successfully.

Results:

  • For firmware 6.xx.xx.xx:
    The copy back operation will start, provided that no more than two (2) reconstruction or copy back operations, in any combination, are in progress on the system. If more are in progress, it will be queued.
  • For firmware 7.10.xx.xx (all revisions) through 7.35.xx.xx (all revisions):
    The copy back operation will neither be queued nor start automatically. User intervention is required to trigger the copy back. (See the Solution below)
  • For firmware 7.50.xx.xx (all revisions) and higher:
    The copy back operation will start, provided that no more than two (2) reconstruction or copy back operations, in any combination, are in progress on the system. If more are in progress, it will be queued.

Use case 3

  1. Drive is pulled from system.
  2. Reconstruction to GHS starts and completes.
  3. Failed drive is replaced.

Results:

  • For firmware 6.xx.xx.xx:
    The copy back operation will start, provided that no more than two (2) reconstruction or copy back operations, in any combination, are in progress on the system. If more are in progress, it will be queued.
  • For firmware 7.xx.xx.xx:
    The copy back operation will neither be queued nor start automatically. User intervention is required to trigger the copy back. (See the Solution below)

Use case 4

  1. Drive is bypassed by the array firmware due to a hardware issue.
  2. Reconstruction to GHS starts upon the next write failure and it completes later.
  3. Bypassed drive is removed and replaced.

Results:

  • For firmware 07.60.53.10 and later in the 6000 series:
    The copy back operation will not start automatically. This requires user intervention to trigger the copy back. (See the Solution below)
  • For firmware 07.35.67.10 and later in the 2500 series:
    The copy back operation will not start automatically. This requires user intervention to trigger the copy back. (See the Solution below)
  • For firmware 07.77.13.11 and later in the 2500-M2 series:
    The copy back operation will not start automatically. This requires user intervention to trigger the copy back. (See the Solution below)


Cause

This copy back function has changed slightly between firmware revisions. Depending on the firmware and the circumstances, a copy from GHS to the replacement drive may not happen without manual intervention.
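
The firmware rules listed under Symptoms for use cases 1 to 3 can be condensed into a small lookup. The following Python sketch is illustrative only: the function name copy_back_behaviour, the busy flag (meaning more than two reconstruction or copy back operations are already in progress) and the returned labels are assumptions made for this example and are not part of CAM, SANtricity or the array firmware. Firmware levels that this document does not mention are reported as "unspecified" rather than guessed. Use case 4 is left out because it depends on the array series as well as the firmware level.

```python
def parse_major_minor(firmware):
    """Return (major, minor) from a firmware string such as '07.35.67.10'."""
    parts = firmware.split(".")
    return int(parts[0]), int(parts[1])


def copy_back_behaviour(use_case, firmware, busy=False):
    """Summarize the Symptoms section for use cases 1 to 3.

    busy=True is this sketch's shorthand for "more than two reconstruction
    or copy back operations are already in progress on the system".
    Returns 'automatic', 'queued', 'manual' or 'unspecified'.
    """
    if use_case not in (1, 2, 3):
        raise ValueError("only use cases 1 to 3 are covered by this sketch")
    major, minor = parse_major_minor(firmware)

    if major == 6:
        # 6.xx.xx.xx: starts on its own, or waits in the queue.
        return "queued" if busy else "automatic"
    if major != 7:
        return "unspecified"

    if use_case == 3:
        # Drive pulled from the system: any 7.xx.xx.xx needs intervention.
        return "manual"
    if 10 <= minor <= 35:
        if use_case == 1:
            return "manual" if busy else "automatic"
        return "manual"  # use case 2
    if minor >= 50:
        return "queued" if busy else "automatic"
    # Levels between 7.35 and 7.50 are not described in this document.
    return "unspecified"
```

For example, copy_back_behaviour(2, "07.15.07.10") returns "manual", which corresponds to the cases where the Solution section below applies.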

In the situation where a drive is bypassed by the array firmware ("Use case 4" above), a replacement drive inserted into the same enclosure and slot is seen as unassigned. Because drives are tracked by their World Wide Name (WWN) and the original drive was never failed, the controller firmware is still waiting for the original drive, which it still regards as optimal but not present, to be reinserted before it copies back from the GHS. This is normal firmware behavior.
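
The effect of tracking drives by WWN rather than by slot can be pictured with a toy model. This is only a sketch of the reasoning in the paragraph above, not the controller's actual logic: the function name, its parameters and the returned labels are all hypothetical, and the "replacement candidate" branch is a simplification of use cases 1 to 3, where the outcome also depends on the firmware level.

```python
def slot_state_after_insert(tracked_wwn, tracked_drive_failed, inserted_wwn):
    """Toy model of how a newly inserted drive is classified.

    tracked_wwn          -- WWN of the drive the controller expects in the slot
    tracked_drive_failed -- True if that drive was marked failed
    inserted_wwn         -- WWN of the drive that was just inserted
    """
    if inserted_wwn == tracked_wwn:
        # The drive the controller was waiting for has come back.
        return "original drive returned"
    if tracked_drive_failed:
        # A failed drive was swapped out: the new drive can take its place.
        return "replacement candidate"
    # Use case 4: the original drive was bypassed, never failed, so the
    # controller keeps waiting for it and treats the new drive as unrelated.
    return "unassigned"
```

In this model a bypassed drive (tracked_drive_failed=False) that is swapped for a new one always yields "unassigned", which is why the manual replace operation described in the Solution is needed.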

Solution

Sun Storage Common Array Manager (CAM)

Note: If you have the firmware level 6.xx.xx.xx on your array, no action is needed after the drive replacement. The copy back operation will start, provided that no more than two (2) reconstruction or copy back operations, in any combination, are in progress on the system. If more are in progress, it will be queued.


Using the Browser User Interface (BUI):

  1. Select the array in CAM.
  2. Click on "Service Advisor".
  3. Expand "Portable Virtual Disk Management" on the left pane.
  4. Select "Replace a Disk Drive" then follow the instructions.


Using the Command Line (CLI):

  1. Use Sun Storage Common Array Manager (CAM) to confirm that reconstruction jobs are completed before moving forward to the next step.
  2. Use the following CAM command line to list the drives needing replacement:

    service -d <array-name> -c replace -q list

    Location of the 'service' command:

    Solaris: /opt/SUNWsefms/bin/
    Linux:  /opt/sun/cam/private/fms/bin/
    Windows: C:\Program Files\Sun\Common Array Manager\Component\fms\bin\

    Example:
    /opt/SUNWsefms/bin/service -d st6140c -c replace -q list
    
    Executing the replace command on st6140c
    
    Drives needing replacment:
    
    Tray.85.Drive.02
    
    In use hot spares:
    
    Tray.85.Drive.16
    
    Unassigned drives available for replacment:
    
    Tray.85.Drive.10
    
    Tray.85.Drive.05
    
    Tray.85.Drive.11
    
    Tray.85.Drive.06
    
    Tray.85.Drive.04
    
    Tray.85.Drive.03
    
    Tray.85.Drive.08
    
    Tray.85.Drive.12
    
    Tray.85.Drive.07
    
    Tray.85.Drive.09
    
    Tray.85.Drive.13
    
    Tray.85.Drive.14
    
    Tray.85.Drive.02

    The above example shows that the drive 85,02 needs to be replaced. This drive has already been replaced but the copy back did not start.
  3. Use the following CAM command line to manually trigger the copy back:

    service -d <arrayname> -c replace -t <drive_needing_replacement> -q <drive_to_be_used_for_the_replacement>

    Example:
    /opt/SUNWsefms/bin/service -d st6140c -c replace -t t85d02 -q t85d02
    
    Executing the replace command on st6140c
    
    Completion Status: Success

    In the above example, we manually trigger the copy back by replacing the drive 85,02 with itself. This drive has already been physically replaced.
  4. Use CAM to confirm that the copy back from the in use GHS started.
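
Steps 2 and 3 above can be scripted. The Python sketch below parses output in the format of the 'service ... -q list' example and builds, without running it, the argument list for the replace command. All function names are hypothetical. The mapping from Tray.85.Drive.02 to t85d02 is inferred from the examples in this document. Treating a drive that appears under both "Drives needing" and "Unassigned drives" as already physically replaced is an assumption based on the same example. The default path is the Solaris location given above.

```python
import re

SAMPLE = """\
Executing the replace command on st6140c

Drives needing replacment:

Tray.85.Drive.02

In use hot spares:

Tray.85.Drive.16

Unassigned drives available for replacment:

Tray.85.Drive.10

Tray.85.Drive.05

Tray.85.Drive.02
"""

DRIVE_RE = re.compile(r"Tray\.(\d+)\.Drive\.(\d+)")


def parse_replace_list(output):
    """Group the drive names in the listing by the heading they follow."""
    sections = {"needing_replacement": [], "hot_spares": [], "unassigned": []}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        lower = line.lower()
        # Match on the start of each heading so the tool's own spelling
        # of the rest of the line does not matter.
        if lower.startswith("drives needing"):
            current = "needing_replacement"
        elif lower.startswith("in use hot spares"):
            current = "hot_spares"
        elif lower.startswith("unassigned drives"):
            current = "unassigned"
        elif current and DRIVE_RE.fullmatch(line):
            sections[current].append(line)
    return sections


def short_name(drive):
    """Convert 'Tray.85.Drive.02' to 't85d02' as used in the example."""
    match = DRIVE_RE.fullmatch(drive)
    if match is None:
        raise ValueError("unexpected drive name: " + drive)
    return "t{}d{}".format(match.group(1), match.group(2))


def self_replace_candidates(sections):
    """Drives listed both as needing replacement and as unassigned."""
    return [d for d in sections["needing_replacement"]
            if d in sections["unassigned"]]


def build_replace_argv(array, drive, service="/opt/SUNWsefms/bin/service"):
    """Argument list that replaces a drive with itself (not executed here)."""
    name = short_name(drive)
    return [service, "-d", array, "-c", "replace", "-t", name, "-q", name]
```

The resulting list could be passed to subprocess.run once the reconstruction jobs are confirmed complete (step 1). Building an argument list instead of a single string avoids shell quoting issues.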

Sun StorageTek SANtricity Storage Manager

Note: If you have the firmware level 6.xx.xx.xx on your array, no action is needed after the drive replacement. The copy back operation will start, provided that no more than two (2) reconstruction or copy back operations, in any combination, are in progress on the system. If more are in progress, it will be queued.


Using the Graphical User Interface (GUI):

  1. In the Array Management Window, select the Volume Group which contains the replacement drive.
  2. Select Volume Group -> Replace Drives.
  3. Select the replacement drive and replace it with itself.

Using the Command Line (CLI):
 
  1. Use the following command line to manually trigger the copy back:

    SMcli -n <arrayname> -S -p <password> -c "replace drive[<drive_needing_replacement>] replacementDrive=<drive_to_be_used_for_the_replacement>;"
     
    Example:

    SMcli -n arrayname -S -p <password> -c "replace drive[85,6] replacementDrive=85,6;"
     
    In the above example, we manually trigger the copy back by replacing the drive 85,6 with itself. This drive has already been physically replaced.
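
The SMcli invocation above can be assembled programmatically so that the same tray and slot are always used on both sides of the replace statement. This Python sketch only builds the argument list and does not run SMcli. The function name and its parameters are hypothetical, and the options are exactly those shown in the command above.

```python
def build_smcli_replace_argv(array_name, password, tray, slot, smcli="SMcli"):
    """Build the SMcli arguments that replace a drive with itself."""
    drive = "{},{}".format(tray, slot)
    # The script statement passed with -c, in the form shown in the example.
    script = "replace drive[{0}] replacementDrive={0};".format(drive)
    return [smcli, "-n", array_name, "-S", "-p", password, "-c", script]
```

Passing the list directly to a process launcher, rather than joining it into one string, keeps the quoted script statement intact as a single argument.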
     

  Copyright © 2018 Oracle, Inc.  All rights reserved.