Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2291549.1
Update Date:2018-02-07
Keywords:

Solution Type  Problem Resolution Sure

Solution  2291549.1 :   Oracle ZFS Storage Appliance: Handling DISK-8000-0X FMA events with 'asc=0x5d' or 'asc=0xb'  


Related Items
  • Sun ZFS Storage 7420
  • Oracle ZFS Storage ZS5-2
  • Oracle ZFS Storage ZS3-2
  • Oracle ZFS Storage ZS4-4
  • Oracle ZFS Storage ZS5-4
  • Oracle ZFS Storage ZS3-4
  • Sun ZFS Storage 7120
  • Oracle ZFS Storage Appliance Racked System ZS4-4
  • Sun ZFS Storage 7320
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: ZS




In this Document
Symptoms
Cause
Solution


Applies to:

Oracle ZFS Storage ZS5-2 - Version All Versions and later
Oracle ZFS Storage ZS4-4 - Version All Versions and later
Oracle ZFS Storage Appliance Racked System ZS4-4 - Version All Versions and later
Oracle ZFS Storage ZS3-4 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
7000 Appliance OS (Fishworks)

Symptoms

The system reports DISK-8000-0X FMA events in conjunction with an 'asc=0x5d' or 'asc=0xb' code.

 

To identify the asc/ascq codes in an uploaded bundle:

        $  cd <SUPPORTBUNDLE>/fm

        $  fmdump -V errlog > errlog-V.txt

        $  grep asc errlog-V.txt

 

Example 'fmdump -V errlog' output:

    fault-list-sz = 0x1
    __case_state = 0x1
    topo-uuid = cb935c24-9059-c979-b5e5-e23961896ef2
    topo-time = 0x58d90371
    fault-list = (array of embedded nvlists)
    (start fault-list[0])
    nvlist version: 0
            version = 0x0
            class = fault.io.disk.predictive-failure
            certainty = 0x64
            resource = (embedded nvlist)
            nvlist version: 0
                    version = 0x1
                    scheme = hc
                    hc-root =
                    fru-serial = 001651GWATML--------0EGWATML
                    devid = id1,sd@n5000cca08031ba28
                    fru-part = HGST-H101812SFSUN1.2T
                    fru-revision = A990
                    authority = (embedded nvlist)
                    nvlist version: 0
                            chassis-mfg = Oracle-Corporation
                            chassis-name = ORACLE-DE2-24P
                            chassis-part = 34753164+30+1
                            chassis-serial = 1702NM4002
                    (end authority)

                    hc-list-sz = 0x3
                    hc-list = (array of embedded nvlists)
                    (start hc-list[0])
                    nvlist version: 0
                            hc-name = ses-enclosure
                            hc-id = 0
                    (end hc-list[0])
                    (start hc-list[1])
                    nvlist version: 0
                            hc-name = bay
                            hc-id = 19
                    (end hc-list[1])
                    (start hc-list[2])
                    nvlist version: 0
                            hc-name = disk
                            hc-id = 0
                    (end hc-list[2])

                    hc-specific = (embedded nvlist)
                    nvlist version: 0
                            ascq = 0x97
                            asc = 0xb
                            ena = 0x8c6559db30e06401
                    (end hc-specific)
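
A plain 'grep asc' matches both the asc and the ascq lines and does not pair them per event. The following is a minimal sketch, not an Oracle-supplied tool, that counts asc/ascq pairs in the text produced by 'fmdump -V errlog'. The function name summarize_asc_codes is hypothetical, and the sketch assumes every event carries its codes inside an hc-specific block laid out as in the example above.

```shell
# Hypothetical helper: count asc/ascq pairs found in 'fmdump -V' text output.
# Assumption: each event lists "asc = ..." and "ascq = ..." inside an
# hc-specific block that is closed by a "(end hc-specific)" line.
summarize_asc_codes() {
    awk '
        $1 == "ascq" && $2 == "=" { ascq = $3 }
        $1 == "asc"  && $2 == "=" { asc  = $3 }
        /\(end hc-specific\)/ {
            # Count the pair only if this block actually carried an asc code
            if (asc != "") count[asc "/" ascq]++
            asc = ""; ascq = ""
        }
        END { for (k in count) print count[k], "asc/ascq=" k }
    ' "$1" | sort
}
```

Run it as 'summarize_asc_codes errlog-V.txt'. Each output line shows a count followed by the asc/ascq pair.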

 

Cause

For Oracle part number 7301592, H7280A520SUN8.0T (8TB/7200 RPM/SAS-3 disk), the disk firmware introduced a new alert, 0x0b (a warning for slow response), which FMA incorrectly diagnosed as a predictive failure.

Firmware P9E2 for the 8TB disk, provided on 11th July 2016, included the new event 0x0b.  The association of 0x0b with a PFA event (DISK-8000-0X) occurred with the release of AK 8.6.12 on 19th Jan 2017.

This combination did not start to appear until around the April/May timeframe, when the number of DISK-8000-0X events increased.

The fix for the false PFA events was first released in AK 8.7.4 and later backported to AK Minor 8.6.15.  The asc/ascq codes help to distinguish the true PFA fault (asc = 0x5d) from the false fault (asc = 0xb).
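
The release information above can be sketched as a simple version check. This is an illustration under stated assumptions, not an Oracle utility: the function name ak_has_pfa_fix is hypothetical, the release is assumed to be supplied in the short 'major.minor.micro' form used in this document (for example 8.6.15), and it is assumed that later 8.6.x micro releases and all releases after 8.7.4 also carry the fix.

```shell
# Hypothetical helper: succeed (exit 0) if the given AK release is assumed to
# contain the fix for the false PFA events, per the releases named above.
# Assumption: input looks like "8.6.15"; other version formats are not handled.
ak_has_pfa_fix() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split "major.minor.micro" on dots
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}
    micro=${3:-0}

    if [ "$major" -gt 8 ]; then return 0; fi
    if [ "$major" -lt 8 ]; then return 1; fi
    if [ "$minor" -gt 7 ]; then return 0; fi
    # Fix first released in AK 8.7.4
    if [ "$minor" -eq 7 ] && [ "$micro" -ge 4 ]; then return 0; fi
    # Fix backported to AK 8.6.15
    if [ "$minor" -eq 6 ] && [ "$micro" -ge 15 ]; then return 0; fi
    return 1
}
```

For example, 'ak_has_pfa_fix 8.6.12 || echo "upgrade recommended"' prints the message, because 8.6.12 predates the backport.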

 

Solution

For 'asc = 0x5d/ascq = 0x90' disk events :

These are 'real' PFA (predictive failure) events  =>  Replace the disk

 

For 'asc = 0xb/ascq = 0x97' disk events :

These are 'false' PFA (predictive failure) events  =>  Recommend that the customer upgrade to the latest Appliance Firmware (AK) Release
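
The two cases above can be expressed as a small lookup. This is a minimal sketch, not part of the appliance software: the function name classify_disk_event is hypothetical, and it normalizes the hex values so that '0x0b' and '0xb' are treated alike. Any pair not listed in this document is reported as not covered rather than guessed at.

```shell
# Hypothetical helper: map an asc/ascq pair to the action given in this document.
# Usage: classify_disk_event <asc> <ascq>   e.g. classify_disk_event 0xb 0x97
classify_disk_event() {
    # printf '%x' normalizes hex input, so 0x0b and 0xb both become "b"
    asc=$(printf '%x' "$1") || return 2
    ascq=$(printf '%x' "$2") || return 2
    case "$asc/$ascq" in
        5d/90) echo "real PFA: replace the disk" ;;
        b/97)  echo "false PFA: upgrade to the latest AK release" ;;
        *)     echo "not covered by this document" ;;
    esac
}
```

For the example event shown in the Symptoms section, 'classify_disk_event 0xb 0x97' reports a false PFA.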

 

 

To help mitigate this issue, run the attached workflow to determine whether the system is affected.

Instructions:

1.  Upload the 'DISK-8000-0X_False_Positive.akwf' workflow to the appliance.

2.  Execute the 'DISK-8000-0X_False_Positive.akwf' workflow.

The workflow returns an output message that indicates whether or not the system is affected by this issue.

 

Reference:  Oracle ZFS Storage Appliance Administration Guide - Uploading and Executing Workflows Using the BUI

 


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.