Asset ID: 1021607.1
Update Date: 2018-03-27
Keywords:
Solution Type: Predictive Self-Healing Sure Solution
1021607.1: ZFS-8000-8A - A file or directory could not be read due to corrupt data
Related Items
- SPARC T8-1
- SPARC T3-1B
- SPARC SuperCluster T4-4 Full Rack
- SPARC T8-4
- SPARC T3-4
- SPARC M7-8
- SPARC M8-8
- SPARC T7-4
- Oracle SuperCluster M7 Hardware
- SPARC T4-2
- SPARC SuperCluster T4-4
- SPARC T8-2
- Solaris Operating System
- Oracle SuperCluster M8 Hardware
- SPARC T7-2
- SPARC T3-2
- SPARC T4-1
- SPARC T4-1B
- SPARC M7-16
- SPARC T7-1
- SPARC T3-1
- SPARC T4-4
Related Categories
- PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun PSH
Previously Published As: ZFS-8000-8A
Applies to:
SPARC M7-8
SPARC M7-16
Solaris Operating System - Version 10 3/05 and later
SPARC T8-1
SPARC T8-2
All Platforms
Purpose
Provide additional information for message ID: ZFS-8000-8A
Details
Predictive Self-Healing Article
ZFS-8000-8A - Corrupted data
Corrupted data
Type
- Fault
- fault.fs.zfs.object.corrupt_data
Severity
- Critical
Description
- A file or directory could not be read due to corrupt data.
Automated Response
- No automated response will be taken.
Impact
- The file or directory is unavailable.
Suggested Action for System Administrator
- Run 'zpool status -x' to determine which pool is damaged:
# zpool status -x
pool: test
state: ONLINE
status: One or more devices has experienced an error and no valid replicas
are available. Some filesystem data is corrupt, and applications
may have been affected.
action: Destroy the pool and restore from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
test ONLINE 0 0 2
c0t0d0 ONLINE 0 0 2
c0t0d1 ONLINE 0 0 0
errors: 1 data errors, use '-v' for a list
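When monitoring many systems, the 'zpool status -x' summary can also be checked from a script: when every pool is healthy the command prints "all pools are healthy". A minimal sketch, where the helper function name is illustrative and not a standard command:

```shell
# pools_healthy: reads 'zpool status -x' output on stdin and succeeds
# only when no pool reports a problem.
pools_healthy() {
  grep -q 'all pools are healthy'
}

# Usage sketch:
# zpool status -x | pools_healthy && echo "OK" || echo "check pools"
```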
- The checksum errors reported above can occur anywhere in the I/O data path once ZFS has submitted an I/O: Solaris target driver and HBA driver bugs, DMA transfers, HBA firmware, the I/O link, the SAN and its components (if involved, such as SAN switches), and the end target device (including the storage array controller, its firmware, and the devices that make up the LUNs presented as vdevs to ZFS pools). ZFS has no control over any of these, but it is the only conventional file system with the ability (thanks to built-in checksums) to detect and report such errors. Note that ZFS can report checksum errors even when there are no associated I/O errors (such as SCSI errors).
- Unfortunately, if the data cannot be repaired, the only way to recover it is to restore the pool from backup. Applications attempting to access the corrupted data will receive an error (EIO), and data may be permanently lost.
On recent versions of Solaris, the list of affected files can be retrieved by using the '-v' option to 'zpool status':
# zpool status -xv
pool: test
state: ONLINE
status: One or more devices has experienced an error and no valid replicas
are available. Some filesystem data is corrupt, and applications
may have been affected.
action: Destroy the pool and restore from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
test ONLINE 0 0 2
c0t0d0 ONLINE 0 0 2
c0t0d1 ONLINE 0 0 0
errors: Permanent errors have been detected in the following files:
/export/example/foo
Damaged files may or may not be removable, depending on the type of corruption. If the corruption is within the plain file data, the file should be removable. If the corruption is in the file metadata, the file cannot be removed, though it can be moved to an alternate location. In either case, the data should be restored from a backup source. It is also possible for the corruption to be within pool-wide metadata, resulting in entire datasets being unavailable. If this is the case, the only option is to destroy the pool and re-create the datasets from backup.
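The move-aside-and-restore step for a single affected file can be sketched as a small shell function. The function name and paths are hypothetical: the damaged path would come from 'zpool status -xv', and a known-good backup copy is assumed to exist.

```shell
# restore_file: move a damaged file aside, then restore a good copy.
#   $1 = damaged file (as listed by 'zpool status -xv')
#   $2 = known-good backup copy
restore_file() {
  damaged=$1
  backup=$2
  # Moving the file aside works even when metadata corruption
  # prevents removing it outright.
  mv "$damaged" "$damaged.corrupt" &&
  cp "$backup" "$damaged"
}

# Usage sketch (hypothetical paths):
# restore_file /export/example/foo /backup/export/example/foo
```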
Whether running "zpool scrub" can help via the ZFS "self-heal" feature depends on the pool configuration: replicated (mirror or raidz) versus non-replicated (a simple stripe, as above). All pool metadata is replicated even in non-replicated pool configurations, so it has a good chance of being recovered by self-healing when a scrub is performed. Similarly, on replicated pool configurations a scrub may be able to repair the checksum errors if a good copy is available.
It is therefore a good idea to run:
# zpool clear test
followed by:
# zpool scrub test
and then, once the scrub has finished:
# zpool status -v test
to see whether the scrub was able to "self-heal" any of the corrupted data.
If the errors remain "Permanent" even after the scrub, follow the action described in the 'zpool status -v' output: remove and/or restore the files in question to return the pool to a healthy state.
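The final check of that sequence can be scripted. The helper below uses an illustrative name (not a standard command); it reads 'zpool status -v' output on stdin and reports whether permanent errors survived the scrub:

```shell
# check_errors: reads 'zpool status -v' output on stdin; reports
# whether permanent errors remain after a scrub.
check_errors() {
  if grep -q 'Permanent errors'; then
    echo "errors remain: restore the listed files from backup"
  else
    echo "pool is healthy"
  fi
}

# Usage sketch (requires a real pool named 'test'):
# zpool clear test
# zpool scrub test
# ...wait for the scrub to finish...
# zpool status -v test | check_errors
```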
Details
- The message ID ZFS-8000-8A indicates that corrupted data exists in the current pool.
Product
Solaris Operating System
Product_uuid
596ffcfa-63d5-11d7-9886-ac816a682f92
Attachments
This solution has no attachment