Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Predictive Self-Healing Sure Solution

1021621.1 : ZFS-8000-GH - Too many checksum errors on ZFS device
PreviouslyPublishedAs
ZFS-8000-GH

Applies to:
Oracle SuperCluster M8 Hardware
SPARC T7-4
SPARC T7-2
SPARC T7-1
SPARC T8-4
All Platforms

Purpose
This document provides additional information for message ID: ZFS-8000-GH

Details
Predictive Self-Healing Article
Too many checksum errors on ZFS device

Type
Severity
Description
Automated Response
Impact
Suggested Action for System Administrator

    # zpool status -x
      pool: pool1
     state: DEGRADED
    status: One or more devices has experienced an unrecoverable error.  An
            attempt was made to correct the error.  Applications are unaffected.
    action: Determine if the device needs to be replaced, and clear the errors
            using 'zpool clear' or replace the device with 'zpool replace'.
       see: http://www.sun.com/msg/ZFS-8000-9P
     scrub: resilver in progress, 44.83% done, 0h0m to go
    config:

            NAME          STATE     READ WRITE CKSUM
            pool1         DEGRADED     0     0     0
              mirror      DEGRADED     0     0     0
                spare     DEGRADED     0     0     0
                  disk1   DEGRADED     0     0   162  too many errors
                  spare1  ONLINE       0     0     0
                disk2     ONLINE       0     0     0
            spares
              spare1      INUSE     currently in use

    errors: No known data errors.

We can use FMA to get additional information:

    # fmadm faulty
    --------------- ------------------------------------  -------------- ---------
    TIME            EVENT-ID                              MSG-ID         SEVERITY
    --------------- ------------------------------------  -------------- ---------
    Feb 18 09:56:24 d82d1716-c920-6243-e899-b7ddd386902e  ZFS-8000-GH    Major

    Fault class : fault.fs.zfs.vdev.checksum

    Description : The number of checksum errors associated with a ZFS device
                  exceeded acceptable levels.  Refer to
                  http://sun.com/msg/ZFS-8000-GH for more information.

    Response    : The device has been marked as degraded.  An attempt
                  will be made to activate a hot spare if available.

    Impact      : Fault tolerance of the pool may be compromised.

    Action      : Run 'zpool status -x' and replace the bad device.

This tells us everything we need to know. The device disk1 accumulated so many checksum errors (162 in the output above) that it was automatically replaced by the hot spare spare1. At the time of the output the spare was still resilvering, and a full complement of data replicas would be available once the resilver completed. The entire process was automatic and completely observable.
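On a pool with many devices it can help to pull the suspect device out of the 'zpool status' output mechanically. The following is a minimal sketch, not part of the original article: the function name cksum_suspects is hypothetical, and it assumes the config table has the five-column layout shown above (NAME STATE READ WRITE CKSUM) with plain integer counters. Abbreviated counters such as 1.2K are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): print "<device> <count>" for every
# row of the 'zpool status' config table whose CKSUM counter is non-zero.
# It reads status text on stdin, so it can be tried on saved output.
cksum_suspects() {
    awk '
        # The per-device table starts after the header row that ends in CKSUM.
        $1 == "NAME" && $5 == "CKSUM" { in_table = 1; next }
        # The "spares" section and the "errors:" line end the per-device rows.
        $1 == "spares" || $1 == "errors:" { in_table = 0 }
        # Device rows carry name, state, read, write, cksum in fields 1 to 5.
        in_table && NF >= 5 && $5 ~ /^[0-9]+$/ && $5 > 0 { print $1, $5 }
    '
}

# Typical use on a live system (assumes the pool name pool1 from the example):
#   zpool status pool1 | cksum_suspects
```

Once the suspect device is known, the action text in the status output names the two follow-ups: 'zpool clear' to reset the counters if the errors are judged transient, or 'zpool replace' to swap in a new device. Which one applies depends on the condition of the hardware.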
References
HTTPS://BLOGS.ORACLE.COM/BOBN/ENTRY/ZFS_AND_FMA_TWO_GREAT

Attachments
This solution has no attachment