Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
1545687.1: In certain conditions, the LSI 1068E firmware may hang
Applies to:
Sun Fire X4540 Server - Version Not Applicable to Not Applicable [Release N/A]
Oracle Solaris on x86-64 (64-bit)

Symptoms
I/O appears to hang with the following entries in the messages file:

    Oct 26 09:09:33 xxxx scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@f/pci1000,1000@0/sd@3,0 (sd29):
    Oct 26 09:09:33 xxxx Error for Command: read(10) Error Level: Retryable
    Oct 26 09:09:33 xxxx scsi: [ID 107833 kern.notice] Requested Block: 330359991 Error Block: 330360071
    Oct 26 09:09:33 xxxx scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number: 9SF13ATP
    Oct 26 09:09:33 xxxx scsi: [ID 107833 kern.notice] Sense Key: Media Error
    Oct 26 09:09:33 xxxx scsi: [ID 107833 kern.notice] ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0
    Oct 26 09:09:46 xxxx scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@f/pci1000,1000@0/sd@3,0 (sd29):
    Oct 26 09:09:46 xxxx Error for Command: read(10) Error Level: Retryable
    Oct 26 09:09:46 xxxx scsi: [ID 107833 kern.notice] Requested Block: 327290721 Error Block: 327290862
    Oct 26 09:09:46 xxxx scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number: 9SF13ATP
    Oct 26 09:09:46 xxxx scsi: [ID 107833 kern.notice] Sense Key: Media Error
    Oct 26 09:09:46 xxxx scsi: [ID 107833 kern.notice] ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0
    Oct 26 09:09:50 xxxx scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@f/pci1000,1000@0/sd@3,0 (sd29):
    Oct 26 09:09:50 xxxx Error for Command: read(10) Error Level: Retryable
    Oct 26 09:09:50 xxxx scsi: [ID 107833 kern.notice] Requested Block: 327290721 Error Block: 327290862
    Oct 26 09:09:50 xxxx scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number: 9SF13ATP
    Oct 26 09:09:50 xxxx scsi: [ID 107833 kern.notice] Sense Key: Media Error
    Oct 26 09:09:50 xxxx scsi: [ID 107833 kern.notice] ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0
    :
    Oct 26 09:47:22 xxxx scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci10de,376@f/pci1000,1000@0 (mpt2):
    Oct 26 09:47:22 xxxx SAS Discovery Error on port 3. DiscoveryStatus is DiscoveryStatus is |Unaddressable device found|
    Oct 26 09:48:26 xxxx scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@f/pci1000,1000@0 (mpt2):
    Oct 26 09:48:26 xxxx Disconnected command timeout for Target 3
    :
    Oct 26 12:04:34 xxxx SOURCE: zfs-diagnosis, REV: 1.0
    Oct 26 12:04:34 xxxx EVENT-ID: 3b0bf893-a701-e1e7-80f9-8b04fe02d8bb
    Oct 26 12:04:34 xxxx DESC: The number of I/O errors associated with a ZFS device exceeded
    Oct 26 12:04:34 xxxx acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information.
    Oct 26 12:04:34 xxxx AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt
    Oct 26 12:04:34 xxxx will be made to activate a hot spare if available.
    Oct 26 12:04:34 xxxx IMPACT: Fault tolerance of the pool may be compromised.

    Sep 19 05:09:14 yyyy scsi: [ID 243001 kern.info] /pci@3c,0/pci10de,376@f/pci1000,1000@0 (mpt5):
    Sep 19 05:09:14 yyyy mpt_check_scsi_io: IOCStatus=0x4b IOCLogInfo=0x31123000
    Sep 19 05:09:35 yyyy scsi: [ID 107833 kern.warning] WARNING: /pci@3c,0/pci10de,376@f/pci1000,1000@0/sd@6,0 (sd32):
    Sep 19 05:09:35 yyyy Error for Command: write(10) Error Level: Retryable
    Sep 19 05:09:35 yyyy scsi: [ID 107833 kern.notice] Requested Block: 328770879 Error Block: 328770879
    Sep 19 05:09:35 yyyy scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number: 9SF15RB5
    Sep 19 05:09:35 yyyy scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
    Sep 19 05:09:35 yyyy scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
    Sep 19 05:13:28 yyyy scsi: [ID 243001 kern.info] /pci@3c,0/pci10de,376@f/pci1000,1000@0 (mpt5):
    Sep 19 05:13:28 yyyy mpt_handle_event_sync: IOCLogInfo=0x31123000
    Sep 19 05:13:28 yyyy scsi: [ID 243001 kern.info] /pci@3c,0/pci10de,376@f/pci1000,1000@0 (mpt5):
    Sep 19 05:13:28 yyyy mpt_handle_event: IOCLogInfo=0x31123000
    :
    Sep 19 05:13:37 yyyy scsi: [ID 107833 kern.warning] WARNING: /pci@3c,0/pci10de,376@f/pci1000,1000@0/sd@6,0 (sd32):
    Sep 19 05:13:37 yyyy Error for Command: write(10) Error Level: Retryable
    Sep 19 05:13:37 yyyy scsi: [ID 107833 kern.notice] Requested Block: 329686969 Error Block: 329686969
    Sep 19 05:13:37 yyyy scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number: 9SF15RB5
    Sep 19 05:13:37 yyyy scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
    Sep 19 05:13:37 yyyy scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
    Sep 19 05:20:26 yyyy scsi: [ID 107833 kern.warning] WARNING: /pci@3c,0/pci10de,376@f/pci1000,1000@0 (mpt5):
    Sep 19 05:20:26 yyyy Disconnected command timeout for Target 6
    Sep 19 05:20:28 yyyy scsi: [ID 243001 kern.info] /pci@3c,0/pci10de,376@f/pci1000,1000@0 (mpt5):
    Sep 19 05:20:28 yyyy mpt_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31140000
    Sep 19 05:20:28 yyyy scsi: [ID 107833 kern.warning] WARNING: /pci@3c,0/pci10de,376@f/pci1000,1000@0/sd@6,0 (sd32):
    Sep 19 05:20:28 yyyy SCSI transport failed: reason 'timeout': retrying command
    :
    Sep 19 05:46:35 yyyy scsi: [ID 243001 kern.warning] WARNING: /pci@3c,0/pci10de,376@f/pci1000,1000@0 (mpt5):
    Sep 19 05:46:35 yyyy SAS Discovery Error on port 6. DiscoveryStatus is DiscoveryStatus is |Unaddressable device found|
    Sep 19 05:47:41 yyyy scsi: [ID 107833 kern.warning] WARNING: /pci@3c,0/pci10de,376@f/pci1000,1000@0 (mpt5):
    Sep 19 05:47:41 yyyy Disconnected command timeout for Target 6
    :
    Sep 19 06:10:26 yyyy fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
    Sep 19 06:10:26 yyyy EVENT-TIME: Wed Sep 19 06:10:26 EDT 2012
    Sep 19 06:10:26 yyyy PLATFORM: Sun-Fire-X4540, CSN: 0000000000, HOSTNAME: yyyy
    Sep 19 06:10:26 yyyy SOURCE: zfs-diagnosis, REV: 1.0
    Sep 19 06:10:26 yyyy EVENT-ID: 0836161b-3b9e-6d50-da39-9783f831c4bb
    Sep 19 06:10:26 yyyy DESC: The number of I/O errors associated with a ZFS device exceeded
    Sep 19 06:10:26 yyyy acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information.
    Sep 19 06:10:26 yyyy AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt
    Sep 19 06:10:26 yyyy will be made to activate a hot spare if available.
    Sep 19 06:10:26 yyyy IMPACT: Fault tolerance of the pool may be compromised.
    Sep 19 06:10:26 yyyy REC-ACTION: Run 'zpool status -x' and replace the bad device.
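
When triaging a system for this issue, it can help to count how often the message fragments from the Symptoms log excerpts occur in the messages file. The following is a minimal Python sketch of that idea, not an Oracle-supplied tool. The signature labels, the function names `scan_messages` and `looks_like_mpt_hang`, and the rule that a timeout plus a discovery error is worth flagging are all assumptions made for illustration; only the matched message text comes from the excerpts.

```python
import re
from collections import Counter

# Message fragments copied from the log excerpts in this document.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "media_error": re.compile(r"Sense Key: Media Error"),
    "discovery_error": re.compile(r"SAS Discovery Error on port (\d+)"),
    "command_timeout": re.compile(r"Disconnected command timeout for Target (\d+)"),
    "ioc_loginfo": re.compile(r"IOCLogInfo=(0x[0-9a-fA-F]+)"),
    "zfs_fault": re.compile(r"SUNW-MSG-ID: ZFS-8000-FD"),
}


def scan_messages(lines):
    """Count how many lines match each signature in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def looks_like_mpt_hang(counts):
    """Heuristic assumed for this sketch: at least one command timeout
    together with at least one SAS discovery error."""
    return counts["command_timeout"] > 0 and counts["discovery_error"] > 0
```

To use it, pass any iterable of lines, for example an open file handle on a copy of the messages file (on Solaris this is typically `/var/adm/messages`). A positive result only indicates that the log resembles the excerpts; it does not confirm the root cause.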
Cause
There are issues in the LSI firmware as well as in the mpt driver.

Solution
Upgrade the LSI firmware to 1.27.92 (011b5c00):
<Patch: 16044285> X4540 SW 2.3.2 - HIA 2.4.10.5

Apply the following patch or later:
<Patch: 150401-09>/<Patch: 150400-09>

To address the following bugs:
<Bug: 15706409> SUNBT7032847 MPT SHOULD HANDLE FAILING DISKS MORE INTELLIGENTLY
<Bug: 15875298> MPT DRIVER DOES NOT RECOVER FROM "DISCONNECTED COMMAND TIMEOUT" WITH FAILING DISK

Attachments
This solution has no attachment.
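
The Solution asks for patch revision -09 "or later", which amounts to comparing the revision suffix of an installed patch against a minimum. The sketch below illustrates that comparison in Python. The function names `parse_patch` and `meets_minimum` are hypothetical, and the code assumes patch IDs in the six-digit base plus two-digit revision form used in this document. How the list of installed patch IDs is collected (for example from `showrev -p` output on Solaris 10) is outside the sketch, and it does not account for a patch being obsoleted by one with a different base number.

```python
import re

# Assumed ID shape: six-digit base, hyphen, two-digit revision (e.g. 150401-09).
PATCH_RE = re.compile(r"^(\d{6})-(\d{2})$")


def parse_patch(patch_id):
    """Split an ID such as '150401-09' into (base, revision) integers."""
    match = PATCH_RE.match(patch_id.strip())
    if match is None:
        raise ValueError(f"unrecognised patch ID: {patch_id!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(installed_ids, required_id):
    """Return True if any installed patch shares the base number of
    required_id and has an equal or higher revision."""
    req_base, req_rev = parse_patch(required_id)
    for patch_id in installed_ids:
        base, rev = parse_patch(patch_id)
        if base == req_base and rev >= req_rev:
            return True
    return False
```

Call `meets_minimum` with whichever of the two patch IDs named in the Solution applies to the system being checked.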