Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1008108.1
Update Date: 2013-08-12
Keywords:

Solution Type  Problem Resolution Sure

Solution  1008108.1 :   Sun StorEdge 3310, 3510 and 3511: Avoiding double drive failure conditions on 3.x firmware  


Related Items
  • Sun Storage 3511 SATA Array
  • Sun Storage 3310 Array
  • Sun Storage 3510 FC Array
Related Categories
  • PLA-Support>Sun Systems>DISK>Arrays>SN-DK: SE31xx_33xx_35xx
  • _Old GCS Categories>Sun Microsystems>Storage - Disk>Modular Disk - 3xxx Arrays

Previously Published As
211152


Applies to:

Sun Storage 3310 Array - Version Not Applicable and later
Sun Storage 3510 FC Array - Version Not Applicable and later
Sun Storage 3511 SATA Array - Version Not Applicable and later
All Platforms

Symptoms

When a disk fails on a Sun StorEdge 33x0, 3510 or 3511 array and a bad block is encountered on another disk of the same Logical Drive during the rebuild operation, the rebuild fails as follows:

Tue Jan 31 15:38:30 2006
[1113] #5: StorEdge Array SN#8040967 CH2 ID7: SCSI Drive ALERT: bad block encountered (02h, 03h,11/00)
Tue Jan 31 15:38:30 2006
[2103] #6: LD-ID 6CC584FE on StorEdge Array SN#8040967: ALERT: rebuild failed

This is known as a "double disk error", and when this happens, data
loss occurs and a restore from backup is required.
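
The event pair shown above can be picked out of a saved event log automatically. The following is a minimal sketch, assuming the log has been saved as plain text and that the messages are worded exactly like the two sample events; the function name and patterns are illustrative and not part of any Sun tool.

```python
import re

# Patterns are modelled only on the two sample events in this document;
# other firmware messages may be worded differently (assumption).
BAD_BLOCK = re.compile(
    r"SN#(\d+) CH(\d+) ID(\d+): SCSI Drive ALERT: bad block encountered"
)
REBUILD_FAILED = re.compile(
    r"LD-ID (\w+) on StorEdge Array SN#(\d+): ALERT: rebuild failed"
)


def find_double_drive_errors(log_text):
    """Return (serial, logical_drive_id, channel, target_id) for each rebuild
    failure that follows a bad-block alert on the same array."""
    last_bad_block = {}  # array serial -> (channel, target id) of latest bad block
    hits = []
    for line in log_text.splitlines():
        m = BAD_BLOCK.search(line)
        if m:
            serial, channel, target = m.groups()
            last_bad_block[serial] = (int(channel), int(target))
            continue
        m = REBUILD_FAILED.search(line)
        if m:
            ld_id, serial = m.groups()
            if serial in last_bad_block:
                channel, target = last_bad_block[serial]
                hits.append((serial, ld_id, channel, target))
    return hits


SAMPLE = """\
Tue Jan 31 15:38:30 2006
[1113] #5: StorEdge Array SN#8040967 CH2 ID7: SCSI Drive ALERT: bad block encountered (02h, 03h,11/00)
Tue Jan 31 15:38:30 2006
[2103] #6: LD-ID 6CC584FE on StorEdge Array SN#8040967: ALERT: rebuild failed
"""

for serial, ld_id, channel, target in find_double_drive_errors(SAMPLE):
    print(f"Array {serial}: rebuild of LD {ld_id} failed "
          f"after bad block on CH{channel} ID{target}")
```

The sketch only correlates events by array serial number; it does not check timestamps or confirm that the bad block belongs to the logical drive being rebuilt.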

Cause

This problem is caused by latent disk errors: bad blocks that go unnoticed until they are next read, which may not happen until a rebuild. It is common to all arrays.
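
To illustrate why an unread bad block turns a single disk failure into data loss, here is a toy model of a RAID-5 rebuild for one stripe. It is a sketch of the general XOR-parity principle, not the array firmware's implementation; the block sizes, names and exception are all hypothetical.

```python
from functools import reduce


class UnreadableBlock(Exception):
    """Raised when a surviving disk cannot return a block (latent media error)."""


def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(surviving_blocks):
    """Reconstruct the failed disk's block for one stripe.

    surviving_blocks holds the blocks from every remaining disk (data and
    parity); None marks a block that can no longer be read.
    """
    if any(block is None for block in surviving_blocks):
        raise UnreadableBlock("second error in stripe: block cannot be rebuilt")
    return xor_blocks(surviving_blocks)


# Toy 4-disk stripe: three data blocks plus their XOR parity.
d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
parity = xor_blocks([d0, d1, d2])

# The disk holding d1 has failed; every other block is readable, so d1 is recovered.
print(rebuild_block([d0, d2, parity]) == d1)

# Same failure, but d2 sits on a bad block that had not been read recently.
try:
    rebuild_block([d0, None, parity])
except UnreadableBlock as err:
    print("rebuild failed:", err)
```

Reading every block periodically is what surfaces such errors while the logical drive still has full redundancy.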

Solution

Firmware 4.x for the Sun StorEdge 3310, 3510 and 3511 arrays introduced disk scrubbing, which performs a media scan automatically.
To use this functionality, upgrade to 4.x firmware, which reduces the chance of a double drive failure and makes that risk manageable.

Before upgrading to 4.x firmware, it is advisable to run the procedure in the Relief/Workaround section once while still on 3.x, to avoid seeing too many drive failures (see Doc ID 1000626.1 for more details).

Examples of drive-related Sun Alerts that apply when using 4.x firmware:

Doc ID 1000369.1 Synopsis: Insufficient Information for Recovery From Double Drive Failure for Sun StorEdge 33x0/35xx Arrays

Doc ID 1000856.1 Synopsis: Disks May be Marked as Bad Without Explanation After "Drive Failure," "Media Scan Failed" or "Clone Failed" Events

Doc ID 1000626.1 Synopsis: Sun StorEdge 33x0/3510 Arrays May Report a Higher Incidence of Drive Failures With Firmware 4.1x SMART Feature Enabled



Relief/Workaround

Read this document in conjunction with the notes in the array documentation, 817-3711-18.pdf, page 139.

Run the Parity Regenerate operation at least once a month. It reads the data and compares it with the parity for all disk blocks.

This can be done in two ways:

1. Telnet/Serial interface
2. sscs GUI interface

Each method is described in detail below.

1. Telnet/Serial interface

From the telnet/serial access to the array, select the RAID-5 logical drive. To run parity regenerate, select the last-but-one menu option, which is:

  x reGenerate parity         x

This presents two options:

        Execute Regenerate Logical Drive Parity
        Overwrite Inconsistent Parity - Enabled 

By default, "Overwrite Inconsistent Parity" is enabled. Disable it, because when enabled it overwrites the parity whenever the data and parity do not match. Since either the parity or the data could be the cause of the mismatch, the decision to update the parity cannot be made at this point.
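
The reasoning behind disabling "Overwrite Inconsistent Parity" can be shown with a small model. This is a sketch of generic XOR parity checking, assuming a two-data-block stripe; it does not reproduce the array's on-disk layout, and all names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def check_stripe(data_blocks, parity_block):
    """Read-only check: True if the stored parity matches the data blocks."""
    return xor_blocks(data_blocks) == parity_block


good_data = [b"\x01\x02", b"\x03\x04"]
good_parity = xor_blocks(good_data)

# Case A: the parity block is corrupt, the data is intact.
case_a = (good_data, b"\xff\xff")
# Case B: a data block is corrupt, the parity is intact.
case_b = ([b"\x01\x02", b"\x00\x00"], good_parity)

# Both cases report the same symptom, so the check alone cannot say
# which side is wrong.
for name, (data, parity) in (("A", case_a), ("B", case_b)):
    print(name, "consistent:", check_stripe(data, parity))

# Overwriting the parity in case B makes the stripe consistent again,
# but the new parity now describes the corrupt data.
data_b, _ = case_b
overwritten_parity = xor_blocks(data_b)
print(check_stripe(data_b, overwritten_parity), data_b == good_data)
```

In case A an overwrite would be the correct repair; in case B it would hide the corruption, which is why the result needs a manual decision.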

After disabling "Overwrite Inconsistent Parity", select "Execute Regenerate Logical Drive Parity" to start the parity check. Its progress can be tracked while it runs.

Notes:

a. Only run one parity regenerate program at a time.
b. Do not schedule this to run automatically. It has to be manually run each time as the results require a manual judgment for the next action.

2. sscs GUI interface

a. Launch the sscs GUI with /usr/sbin/ssconsole.
b. Select a RAID5 logical drive from the device tree.
c. From the "Array Administration" menu, select "Schedule parity check".

For more details, refer to the Sun StorEdge 3000 Family Configuration Service User's Guide.



Product
Sun StorageTek 3511 SATA Array
Sun StorageTek 3510 FC Array
Sun StorageTek 3320 SCSI Array
Sun StorageTek 3310 SCSI Array

 

parity, scrubbing, latent, disk, 3310, 3510, 3511, 3.x, 4.x, 3.27, 3.25, 4.11, 4.13, regen, rebuild, drive-failure
Previously Published As
84456

 

Change History
Date: 2010-01-11
User Name: sue.copeland@sun.com
Action: Currency & Update
Date: 2006-03-15
User Name: 7058
Action: Approved
Comment: Trademarked where appropriate.
Reworded sentences throughout document for reader clarity.
Made grammar and punctuation fixes as needed.
Enabled STM to bold section headers and offset preformatted text for clarity.


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.