Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1495308.1
Update Date:2018-01-05

Solution Type  Predictive Self-Healing Sure

Solution  1495308.1 :   Sun 7000 Unified Storage System: ASR Disk Alarm Verification  


Related Items
  • Sun ZFS Storage 7420
  • Oracle ZFS Storage ZS3-2
  • Sun Storage 7110 Unified Storage System
  • Oracle ZFS Storage ZS4-4
  • Sun Storage 7210 Unified Storage System
  • Sun Storage 7410 Unified Storage System
  • Sun ZFS Storage 7120
  • Oracle ZFS Storage ZS3-4
  • Sun Storage 7310 Unified Storage System
  • Oracle ZFS Storage Appliance Racked System ZS4-4
  • Sun ZFS Storage 7320
  • Oracle ZFS Storage ZS3-BA
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS




In this Document
Purpose
Scope
Details
 1. Verify the Problems list from the Appliance
 2.  Are the problems all ZFS-8000-GH?
 3.  Run a scrub to attempt to correct the errors. 
 4.  Verify Part is in the Correct Enclosure
 5. Please clear the event from the BUI and monitor
References


Applies to:

Sun Storage 7110 Unified Storage System - Version All Versions and later
Sun ZFS Storage 7420 - Version All Versions and later
Sun ZFS Storage 7320 - Version All Versions and later
Sun ZFS Storage 7120 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
7000 Appliance OS (Fishworks)

Purpose

This article describes the steps a System Administrator must take to verify whether a Disk event is transient (ignorable) or actionable. If the event is actionable, instructions are provided on how to proceed.

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - Disk Storage ZFS Storage Appliance Community

Scope

 This document is intended for System Administrators and support personnel.

Details

Auto Service Request (ASR) provides automatic failure detection and SR creation for Oracle hardware systems.  See http://www.oracle.com/us/asr/index.html  for more information on ASR.

This particular ASR event has created a Service Request.   This requires manual verification in order to determine whether further action by Oracle Service is required. 

 

1. Verify the Problems list from the Appliance

From CLI -> maintenance problems show 

From BUI -> Click Maintenance -> Click System -> Click Problems

The list below gives the error codes and descriptive text that may be found on the system, as they may appear in the customer report. Compare the list shown by the appliance against the one below:

Error Code     Description
ZFS-8000-GH    The number of checksum errors associated with the device has exceeded acceptable levels.
ZFS-8000-D3    The device has failed or could not be opened.
AK-8000-F0     The disk 'XXX' uses an interface (SAS) that is incompatible with the enclosure.
ZFS-8000-CS    The pool is no longer available.
ZFS-8000-K4    The intent log(s) cannot be replayed.
ZFS-8000-8A    A file or directory could not be read due to corrupt data.
ZFS-8000-LR    ZFS device failed to open.


If there are no problems in the list, the drives in the Appliance are ok.  No further work is required.

If any of these were due to a maintenance activity, they can be cleared in the BUI and no further action is required.

  • If there is more than one of the above problems in the list, go to Step 2.
  • If the problem is for ZFS-8000-GH, go to Step 3.
  • If the problem is for AK-8000-F0, go to Step 4.
  • If the problem is for ZFS-8000-D3, ZFS-8000-CS, ZFS-8000-8A, ZFS-8000-LR or ZFS-8000-K4, go to Step 5.

2.  Are the problems all ZFS-8000-GH?

If so, go to Step 3.

If not, proceed with the steps "Engaging a Support Engineer" (see bottom of document)
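
The routing in Steps 1 and 2 can be summarized as a small decision function. This is only an illustrative sketch: the function name `next_action`, the `due_to_maintenance` flag, and the returned strings are assumptions made for the example and are not part of the appliance software. The error codes are the ones listed in the table above.

```python
from typing import Iterable

CHECKSUM_ERRORS = "ZFS-8000-GH"
INCOMPATIBLE_INTERFACE = "AK-8000-F0"
CLEAR_AND_MONITOR = frozenset(
    {"ZFS-8000-D3", "ZFS-8000-CS", "ZFS-8000-8A", "ZFS-8000-LR", "ZFS-8000-K4"}
)


def next_action(codes: Iterable[str], due_to_maintenance: bool = False) -> str:
    """Return the next step for the error codes currently in the Problems list.

    Hypothetical helper: the name, signature and return strings are
    illustrative only.
    """
    found = [code.strip().upper() for code in codes]
    if not found:
        return "no action: drives are ok"
    if due_to_maintenance:
        return "clear in the BUI: no further action"
    if len(found) > 1:
        # Step 2: several problems are handled here only if all are ZFS-8000-GH.
        if all(code == CHECKSUM_ERRORS for code in found):
            return "step 3: run a scrub"
        return "engage a support engineer"
    code = found[0]
    if code == CHECKSUM_ERRORS:
        return "step 3: run a scrub"
    if code == INCOMPATIBLE_INTERFACE:
        return "step 4: verify the enclosure"
    if code in CLEAR_AND_MONITOR:
        return "step 5: clear the event and monitor"
    # Codes outside the table are not covered by this article.
    return "not covered: code is not in the table"
```

For example, `next_action(["ZFS-8000-GH", "ZFS-8000-D3"])` routes to a support engineer, because the problems are not all ZFS-8000-GH.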

 

3.  Run a scrub to attempt to correct the errors. 

The ZFS subsystem can generate erroneous checksum errors on the system as part of a disk replacement or normal day-to-day activity.  This does not require a disk replacement unless the checksum errors become unrecoverable (this does not cause data loss).  The scrub may flag additional drives as failed due to checksum errors, but will never fail enough drives to offline the storage pool.

  1. Run the scrub to completion
    - BUI: Configuration -> Storage -> Click on the pool -> Click Scrub
    - CLI: configuration storage scrub start
  2. Mark the checksum fault as cleared in the problems log
    - BUI: Maintenance -> Problems -> Click on Problem -> Click Mark Repaired
    - CLI: maintenance problems select <problem-id> markrepaired
  3. Repeat this process until there are no checksum errors or there is an Unrecoverable Problem logged by the system.

If there are unrecoverable errors generated during this process, proceed with the steps "Engaging a Support Engineer" (see bottom of document)
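
The repeat-until loop in Step 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not an appliance API: `Problem`, `scrub_until_clean`, the three callables, the `unrecoverable` flag, and the `max_rounds` safety cap are all hypothetical names introduced here. The callables stand in for whatever the administrator does by hand through the BUI or CLI. The sketch also assumes that "no checksum errors" means a scrub completes without leaving a ZFS-8000-GH problem outstanding.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Problem:
    problem_id: str              # hypothetical id, as passed to markrepaired
    code: str                    # error code, e.g. "ZFS-8000-GH"
    unrecoverable: bool = False  # hypothetical flag supplied by the caller


def scrub_until_clean(
    run_scrub: Callable[[], None],
    list_problems: Callable[[], List[Problem]],
    mark_repaired: Callable[[str], None],
    max_rounds: int = 5,  # assumed cap so the sketch cannot loop forever
) -> str:
    """Scrub, mark checksum faults repaired, and repeat as in Step 3."""
    for _ in range(max_rounds):
        run_scrub()  # expected to return only once the scrub has completed
        problems = list_problems()
        if any(p.unrecoverable for p in problems):
            return "engage a support engineer"
        checksum_faults = [p for p in problems if p.code == "ZFS-8000-GH"]
        if not checksum_faults:
            return "clean"
        for fault in checksum_faults:
            mark_repaired(fault.problem_id)
    return "round limit reached"
```

Passing the three actions in as callables keeps the sketch independent of how the scrub is actually started and how problems are read back, which this article leaves to the BUI and CLI paths listed above.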

 

4.  Verify Part is in the Correct Enclosure

AK-8000-F0 indicates that the disk is not compatible with the enclosure, e.g. a SAS-1 drive in a SAS-2 enclosure. Unfortunately, the error is often spurious and can disappear after the drive is reseated. This fault is seen almost exclusively on 7410 and 7420 systems.

Please reseat the drive in question and mark the problem as cleared.  If the problem returns, please review your configuration.  There may be a SAS-1 or SAS-2 drive in the wrong enclosure.

 

5. Please clear the event from the BUI and monitor

- BUI: Maintenance -> Problems -> Click on Problem -> Click Mark Repaired
- CLI: maintenance problems select <problem-id> markrepaired

 

Engaging a Support Engineer

Please update your SR if the proposed solution did not fix the issue. A support engineer will then be assigned to assist.

 

    When engaging support via a Service Request, please collect a (full) supportbundle to assist the support engineer - please see

        Document ID 1019887.1 - Sun Storage 7000 Unified Storage System: How to collect a supportbundle using the BUI or CLI

 

NOTE: Without further updates we will close the SR in 14 days.


  Copyright © 2018 Oracle, Inc.  All rights reserved.