Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1021607.1
Update Date: 2018-03-27
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution 1021607.1: ZFS-8000-8A - A file or directory could not be read due to corrupt data


Related Items
  • SPARC T8-1
  • SPARC T3-1B
  • SPARC SuperCluster T4-4 Full Rack
  • SPARC T8-4
  • SPARC T3-4
  • SPARC M7-8
  • SPARC M8-8
  • SPARC T7-4
  • Oracle SuperCluster M7 Hardware
  • SPARC T4-2
  • SPARC SuperCluster T4-4
  • SPARC T8-2
  • Solaris Operating System
  • Oracle SuperCluster M8 Hardware
  • SPARC T7-2
  • SPARC T3-2
  • SPARC T4-1
  • SPARC T4-1B
  • SPARC M7-16
  • SPARC T7-1
  • SPARC T3-1
  • SPARC T4-4
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun PSH

Previously Published As
ZFS-8000-8A


Applies to:

SPARC M7-8
SPARC M7-16
Solaris Operating System - Version 10 3/05 and later
SPARC T8-1
SPARC T8-2
All Platforms

Purpose

Provide additional information for message ID: ZFS-8000-8A

Details

Predictive Self-Healing Article
ZFS-8000-8A - Corrupted data


Type

Fault
  fault.fs.zfs.object.corrupt_data

Severity

Critical

Description

A file or directory could not be read due to corrupt data.

Automated Response

No automated response will be taken.

Impact

The file or directory is unavailable.

Suggested Action for System Administrator

Run 'zpool status -x' to determine which pool is damaged:

# zpool status -x
  pool: test
 state: ONLINE
status: One or more devices has experienced an error and no valid replicas
        are available.  Some filesystem data is corrupt, and applications
        may have been affected.
action: Destroy the pool and restore from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                  STATE     READ WRITE CKSUM
        test                  ONLINE       0     0     2
          c0t0d0              ONLINE       0     0     2
          c0t0d1              ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list

The checksum errors reported above can occur anywhere in the I/O data path once ZFS has submitted an I/O. This includes Solaris target drivers and HBA driver bugs, DMA transfers, HBA firmware, the I/O link, the SAN and its components (if involved, such as SAN switches), and the end target device (including the storage array controller, its firmware, and the devices that make up the LUNs presented as vdevs to ZFS zpools). ZFS has no control over any of these, but it is the only conventional filesystem with the ability (through its built-in checksums) to detect and report these errors. Note that associated I/O errors (such as SCSI errors) are not required for ZFS to report checksum errors.
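Because checksum errors can appear without any accompanying I/O errors, it can be useful to check the system's own error telemetry for correlated events. A minimal sketch using the standard Solaris fmdump(1M) and iostat(1M) utilities (the helper function name is illustrative):

```shell
# Sketch: gather I/O error telemetry that may (or may not) correlate
# with the ZFS checksum errors. fmdump(1M) and iostat(1M) are standard
# Solaris utilities; the function name is illustrative.
show_io_error_telemetry() {
    # Raw FMA error reports (driver, transport, and device ereports).
    fmdump -eV 2>/dev/null | head -50
    # Per-device soft/hard/transport error counters.
    iostat -En 2>/dev/null
}
```

If the telemetry shows no driver or transport errors, the corruption may have been introduced silently somewhere along the data path, which is exactly the case ZFS checksums are designed to catch.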

 

Unfortunately, if the data cannot be repaired, the only option is to restore the affected data from backup. Applications attempting to access the corrupted data will receive an error (EIO), and data may be permanently lost.

On recent versions of Solaris, the list of affected files can be retrieved by using the '-v' option to 'zpool status':

# zpool status -xv
  pool: test
 state: ONLINE
status: One or more devices has experienced an error and no valid replicas
        are available.  Some filesystem data is corrupt, and applications
        may have been affected.
action: Destroy the pool and restore from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                  STATE     READ WRITE CKSUM
        test                  ONLINE       0     0     2
          c0t0d0              ONLINE       0     0     2
          c0t0d1              ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /export/example/foo

Whether a damaged file can be removed depends on the type of corruption. If the corruption is within the plain file data, the file should be removable. If the corruption is in the file metadata, the file cannot be removed, though it can be moved to an alternate location. In either case, the data should be restored from a backup source. It is also possible for the corruption to be within pool-wide metadata, resulting in entire datasets being unavailable. If so, the only option is to destroy the pool and re-create the datasets from backup.
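The remove-or-move decision above can be sketched as a small shell helper. This is illustrative only: the function name is an assumption, and the backup location and restore step depend entirely on the local backup scheme.

```shell
# Sketch: dispose of a file reported under "Permanent errors" by
# 'zpool status -v'. The function name is illustrative.
remove_or_sidestep() {
    damaged=$1
    # Plain-data corruption: the file should be removable.
    if rm -- "$damaged" 2>/dev/null; then
        echo "removed $damaged"
    else
        # Metadata corruption: removal fails, but the file can be moved
        # aside so a good copy can be restored to the original path.
        mv -- "$damaged" "${damaged}.corrupt" && echo "moved $damaged aside"
    fi
    # In either case, restore the path from a backup source afterwards.
}
```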

The zpool configuration (replicated, i.e. mirror or raidz, vs. non-replicated, i.e. a simple stripe as above) determines whether running 'zpool scrub' can help via the "self-heal" feature of ZFS.

All pool metadata is replicated (even in non-replicated zpool configurations) and thus has a good chance of recovery via self-healing if a scrub is performed. Similarly, in replicated zpool configurations, a scrub may be able to repair the checksum errors if a good copy is available.

So it is a good idea to run:

# zpool clear test

followed by:

# zpool scrub test

then wait for the scrub to finish and run:

# zpool status -v test

to see whether the scrub was able to self-heal any of the corrupted data. If the errors remain "Permanent" even after the scrub, take the action described in the 'zpool status -v' output: remove and/or restore the files in question to return the zpool to a healthy state.
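The clear/scrub/verify sequence can be collected into a small shell helper. This is a sketch: the function name and the 30-second polling interval are illustrative, and on a real system the pool name should be the one reported by 'zpool status -x' rather than 'test'.

```shell
# Sketch: clear error counters, scrub the pool, wait for the scrub to
# finish, then show whether any permanent errors remain. The function
# name and polling interval are illustrative.
scrub_and_verify() {
    pool=$1
    zpool clear "$pool"      # reset READ/WRITE/CKSUM counters
    zpool scrub "$pool"      # start a full pool scrub
    # Poll until 'zpool status' no longer reports the scrub in progress.
    while zpool status "$pool" | grep -q 'scrub in progress'; do
        sleep 30
    done
    # Any remaining "Permanent errors" must be removed/restored by hand.
    zpool status -v "$pool"
}
```

A scrub touches every allocated block, so on a large pool this can run for hours; polling (or simply re-checking 'zpool status' later) is the portable way to wait on older Solaris releases.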

Details

The message ID ZFS-8000-8A indicates that corrupted data exists in the current pool.



Product
Solaris Operating System

Product_uuid
596ffcfa-63d5-11d7-9886-ac816a682f92


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.