Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2355238.1
Update Date:2018-02-13
Keywords:

Solution Type  Problem Resolution Sure

Solution  2355238.1 :   Oracle ZFS Storage Appliance: Storage Pool Status is Degraded, but No Faulty Disk Found  


Related Items
  • Sun ZFS Storage 7420
  • Oracle ZFS Storage ZS5-2
  • Oracle ZFS Storage ZS3-2
  • Oracle ZFS Storage ZS4-4
  • Oracle ZFS Storage ZS5-4
  • Sun ZFS Storage 7120
  • Oracle ZFS Storage ZS3-4
  • Sun ZFS Storage 7320
  • Oracle ZFS Storage ZS3-BA
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: ZS




In this Document
Symptoms
Changes
Cause
 Node-1
 Node-2
Solution
References


Created from <SR 3-16666083591>

Applies to:

Oracle ZFS Storage ZS5-4 - Version All Versions and later
Oracle ZFS Storage ZS5-2 - Version All Versions and later
Oracle ZFS Storage ZS3-4 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
Oracle ZFS Storage ZS3-BA - Version All Versions and later
7000 Appliance OS (Fishworks)

Symptoms

The storage pool status is degraded, but no faulty disk is reported.

Many drives in the pool are in the "Removed" state.

Hot spares were activated, but the system does not report any drive failure. The pool history shows the spare activations:

more pool05a.history |grep -i spare
2017-12-22.00:42:56 [internal vdev attach txg:12793555] spare in vdev=/dev/dsk/c0t5000CCA073070754d0s0 for vdev=/dev/dsk/c0t5000CCA0730626F0d0s0 [user root on nas05]
2017-12-22.00:43:22 [internal vdev attach txg:12793561] spare in vdev=/dev/dsk/c0t5000CCA073010E70d0s0 for vdev=/dev/dsk/c0t5000CCA07300C440d0s0 [user root on nas05]
2017-12-22.00:43:42 [internal vdev attach txg:12793567] spare in vdev=/dev/dsk/c0t5000CCA073018514d0s0 for vdev=/dev/dsk/c0t5000CCA05CDE3204d0s0 [user root on nas05]
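
Each history entry above pairs an activated spare with the device it replaced. The following is a minimal sketch for extracting those pairs from a saved history file. The function name spare_events is hypothetical, and the pattern assumes every entry follows the "spare in vdev=... for vdev=..." layout shown above; it is not part of the appliance tooling.

```python
import re

# Assumed layout, taken from the excerpt above:
# <timestamp> [internal vdev attach txg:N] spare in vdev=<spare> for vdev=<replaced> ...
SPARE_RE = re.compile(
    r"(\S+) \[internal vdev attach txg:\d+\] "
    r"spare in vdev=/dev/dsk/(\S+?)s0 for vdev=/dev/dsk/(\S+?)s0\b"
)

def spare_events(lines):
    """Return (timestamp, spare_device, replaced_device) tuples."""
    events = []
    for line in lines:
        match = SPARE_RE.search(line)
        if match:
            events.append(match.groups())
    return events

# One entry copied from the history excerpt above.
sample = [
    "2017-12-22.00:42:56 [internal vdev attach txg:12793555] "
    "spare in vdev=/dev/dsk/c0t5000CCA073070754d0s0 "
    "for vdev=/dev/dsk/c0t5000CCA0730626F0d0s0 [user root on nas05]",
]

for when, spare, replaced in spare_events(sample):
    print(when, spare, "replaced", replaced)
```

The replaced device names (slice suffix removed) can then be looked up in akdiskmap.txt to see which chassis they belong to.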

 

Changes

A DIMM replacement was performed about a month earlier. No other onsite activity was reported.

 

Cause

Comparing against an older support bundle revealed that an entire chassis (disk shelf) is missing from the current configuration.

Running "diff" on the akdiskmap.txt data from the old and current support bundles shows which disks are missing:

 

diff akdiskmap-old.txt akdiskmap.txt

< 1514NMT0HW/HDD 0 c0t5000CCA073067B34d0
< 1514NMT0HW/HDD 1 c0t5000CCA05CDAF480d0
< 1514NMT0HW/HDD 2 c0t5000CCA0730626F0d0
< 1514NMT0HW/HDD 3 c0t5000CCA073018180d0
< 1514NMT0HW/HDD 4 c0t5000CCA05CE13EB8d0
< 1514NMT0HW/HDD 5 c0t5000CCA0730691C0d0
< 1514NMT0HW/HDD 6 c0t5000CCA073063E1Cd0
< 1514NMT0HW/HDD 7 c0t5000CCA05CD5F40Cd0
< 1514NMT0HW/HDD 8 c0t5000CCA05CDE3204d0
< 1514NMT0HW/HDD 9 c0t5000CCA07300C440d0
< 1514NMT0HW/HDD 10 c0t5000CCA073009ECCd0
< 1514NMT0HW/HDD 11 c0t5000CCA0730601A4d0
< 1514NMT0HW/HDD 12 c0t5000CCA07300F2B8d0
< 1514NMT0HW/HDD 13 c0t5000CCA0730691A8d0
< 1514NMT0HW/HDD 14 c0t5000CCA05CE122E8d0
< 1514NMT0HW/HDD 15 c0t5000CCA073058D0Cd0
< 1514NMT0HW/HDD 16 c0t5000CCA05CE0C4B4d0
< 1514NMT0HW/HDD 17 c0t5000CCA073002478d0
< 1514NMT0HW/HDD 18 c0t5000CCA07300B0D0d0
< 1514NMT0HW/HDD 19 c0t5000CCA05CE1384Cd0
< 1514NMT0HW/HDD 20 c0t5000CCA04E1002B4d0
< 1514NMT0HW/HDD 21 c0t5000CCA04E100274d0
< 1514NMT0HW/HDD 22 c0t5000CCA04E0ED87Cd0
< 1514NMT0HW/HDD 23 c0t5000CCA04E10A800d0
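
All 24 missing disks belong to chassis 1514NMT0HW, and the three devices replaced by spares in the pool history are HDD 2, HDD 8 and HDD 9 of that chassis. The same check can be scripted. The sketch below assumes that each relevant akdiskmap.txt line has the form "<serial>/HDD <slot> <device>" as in the diff output above; the function names and the shortened sample data are hypothetical.

```python
from collections import defaultdict

def disks_by_chassis(lines):
    """Map chassis serial -> {slot: device} from akdiskmap-style lines.

    Assumes lines of the form '<serial>/HDD <slot> <device>', optionally
    prefixed by a diff marker ('<' or '>'). Other lines are ignored.
    """
    chassis = defaultdict(dict)
    for line in lines:
        fields = line.lstrip("<> ").split()
        if len(fields) == 3 and fields[0].endswith("/HDD"):
            serial = fields[0].split("/")[0]
            chassis[serial][int(fields[1])] = fields[2]
    return dict(chassis)

def missing_chassis(old_lines, new_lines):
    """Return chassis (with their disks) present in the old map but not the new."""
    old, new = disks_by_chassis(old_lines), disks_by_chassis(new_lines)
    return {serial: old[serial] for serial in old if serial not in new}

# Shortened sample: three real entries from the diff above plus one
# hypothetical entry for a chassis that is still present.
old_map = [
    "1514NMT0HW/HDD 2 c0t5000CCA0730626F0d0",
    "1514NMT0HW/HDD 8 c0t5000CCA05CDE3204d0",
    "1514NMT0HW/HDD 9 c0t5000CCA07300C440d0",
    "1515NMT089/HDD 0 c0tEXAMPLE0000000001d0",  # hypothetical device name
]
new_map = ["1515NMT089/HDD 0 c0tEXAMPLE0000000001d0"]

# Devices that the pool history shows as replaced by spares.
spared = {
    "c0t5000CCA0730626F0d0",
    "c0t5000CCA07300C440d0",
    "c0t5000CCA05CDE3204d0",
}

for serial, slots in missing_chassis(old_map, new_map).items():
    hit = sorted(slot for slot, dev in slots.items() if dev in spared)
    print(serial, "missing; slots replaced by spares:", hit)
```

If every spared-out device maps to the same missing chassis, a shelf-level problem is a more likely explanation than several simultaneous disk failures.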

The SAS (pmcs) port map from both controllers reports HBA port #3 on PCI9 as DEAD:

Node-1

Slot: PCIexp PCI9 HBA-Port #0 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f000 pmcs:19 pp01
-> Level1 JBOD DE2-24C : [1515NMT03D] via IOM1 FW: 0018 SAS-ADDR:/scsi/ses/c0t5080020001d780be pp01.24
Slot: PCIexp PCI9 HBA-Port #1 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f00 pmcs:18 pp05
-> Level1 JBOD DE2-24C : [1515NMT089] via IOM0 FW: 0018 SAS-ADDR:/scsi/ses/c0t5080020001d649fe pp05.24
Slot: PCIexp PCI9 HBA-Port #2 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f0 pmcs:17 pp09
-> Level1 JBOD DE2-24C : [1514NMT03L] via IOM1 FW: 0018 SAS-ADDR:/scsi/ses/c0t5080020001d6c7be pp09.24
Slot: PCIexp PCI9 HBA-Port #3 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f pmcs:16 pp0d
-> <<<< WARNING >>>> : DEAD port reported by ::pmcs -ptv for pp instance pp0d. Reseat Cable, Reseat IOM, or Reboot may help.

Node-2

Slot: PCIexp PCI9 HBA-Port #0 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f000 pmcs:18 pp01
-> Level1 JBOD DE2-24C : [1515NMT03D] via IOM1 FW: 0018 SAS-ADDR:/scsi/ses/c0t5080020001d780be pp01.24
Slot: PCIexp PCI9 HBA-Port #1 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f00 pmcs:19 pp05
-> Level1 JBOD DE2-24C : [1515NMT089] via IOM0 FW: 0018 SAS-ADDR:/scsi/ses/c0t5080020001d649fe pp05.24
Slot: PCIexp PCI9 HBA-Port #2 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f0 pmcs:17 pp09
-> Level1 JBOD DE2-24C : [1514NMT03L] via IOM1 FW: 0018 SAS-ADDR:/scsi/ses/c0t5080020001d6c7be pp09.24
Slot: PCIexp PCI9 HBA-Port #3 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f pmcs:16 pp0d
-> <<<< WARNING >>>> : DEAD port reported by ::pmcs -ptv for pp instance pp0d. Reseat Cable, Reseat IOM, or Reboot may help.

 

NOTE: In this configuration the HBA ports were connected directly to the chassis; the chassis were not daisy-chained.
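
The port map listings above use two-line records: a "Slot:" line for each HBA port, followed by a "->" line naming the attached chassis or flagging the port as DEAD. A minimal sketch for listing DEAD ports from such output follows. The regular expressions are inferred from the excerpt only, and port_report is a hypothetical helper name.

```python
import re

# Patterns inferred from the listing above (assumed layout).
SLOT_RE = re.compile(
    r"^Slot: (?P<slot>.+?) HBA-Port #(?P<port>\d+) .* (?P<pp>pp\w+)$"
)
JBOD_RE = re.compile(r"\[(?P<serial>\w+)\] via (?P<iom>IOM\d)")

def port_report(lines):
    """Return one dict per HBA port parsed from the two-line records."""
    ports = []
    for line in lines:
        line = line.strip()
        slot = SLOT_RE.match(line)
        if slot:
            ports.append({
                "slot": slot["slot"],
                "port": int(slot["port"]),
                "pp": slot["pp"],
                "chassis": None,
                "dead": False,
            })
        elif line.startswith("->") and ports:
            # Detail line for the most recent "Slot:" line.
            jbod = JBOD_RE.search(line)
            if jbod:
                ports[-1]["chassis"] = jbod["serial"]
            ports[-1]["dead"] = "DEAD port" in line
    return ports

# Two records copied from the Node-1 listing above.
sample = [
    "Slot: PCIexp PCI9 HBA-Port #2 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f0 pmcs:17 pp09",
    "-> Level1 JBOD DE2-24C : [1514NMT03L] via IOM1 FW: 0018 SAS-ADDR:/scsi/ses/c0t5080020001d6c7be pp09.24",
    "Slot: PCIexp PCI9 HBA-Port #3 Path: /pci@ae,0/pci8086,e06@2,2/pci11f8,8018@0/iport@f pmcs:16 pp0d",
    "-> <<<< WARNING >>>> : DEAD port reported by ::pmcs -ptv for pp instance pp0d. Reseat Cable, Reseat IOM, or Reboot may help.",
]

for p in port_report(sample):
    if p["dead"]:
        print("DEAD:", p["slot"], "port", p["port"], "instance", p["pp"])
```

Running this against the listing from each node makes it easy to confirm that the same port is DEAD on both controllers, as it is here.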

 

Comparing the chassis list in the current and old bundles confirms that chassis 1514NMT0HW (chassis-006 in the old bundle) is no longer present:

Current bundle (17-Jan)

more hw/hw.aksh |grep -i chassis
Chassis:
chassis-000  xxxxx01nas05    ok        Oracle                                             Oracle ZFS Storage ZS4-4                    1517NM900B                                         --     system
chassis-001  1514NMT0HN      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT0HN                                         7200   hdd
chassis-002  1515NMT089      ok        Oracle                                             Oracle Storage DE2-24C                      1515NMT089                                         7200   hdd
chassis-003  1514NMT03L      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT03L                                         7200   hdd
chassis-004  1515NMT03D      ok        Oracle                                             Oracle Storage DE2-24C                      1515NMT03D                                         7200   hdd
chassis-005  1515NMT09U      ok        Oracle                                             Oracle Storage DE2-24C                      1515NMT09U                                         7200   hdd
chassis-006  1514NMT0E6      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT0E6                                         7200   hdd
chassis-007  1514NMT0GH      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT0GH                                         7200   hdd


Old bundle (21-Oct)

more hw/hw.aksh |grep -i chassis
Chassis:
chassis-000  xxxxx01nas05    faulted   Oracle                                             Oracle ZFS Storage ZS4-4                    1517NM900B                                         --     system
chassis-001  1514NMT0HN      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT0HN                                         7200   hdd
chassis-002  1515NMT089      ok        Oracle                                             Oracle Storage DE2-24C                      1515NMT089                                         7200   hdd
chassis-003  1514NMT03L      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT03L                                         7200   hdd
chassis-004  1515NMT03D      ok        Oracle                                             Oracle Storage DE2-24C                      1515NMT03D                                         7200   hdd
chassis-005  1515NMT09U      ok        Oracle                                             Oracle Storage DE2-24C                      1515NMT09U                                         7200   hdd
chassis-006  1514NMT0HW      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT0HW                                         7200   hdd
chassis-007  1514NMT0E6      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT0E6                                         7200   hdd
chassis-008  1514NMT0GH      ok        Oracle                                             Oracle Storage DE2-24C                      1514NMT0GH                                         7200   hdd
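
The two chassis listings can also be compared programmatically. This is a sketch only: it assumes rows of the form "chassis-NNN <name> <state> ..." as in the hw/hw.aksh excerpts above, and the helper names are hypothetical. The sample rows are shortened copies of rows from the listings.

```python
def chassis_states(lines):
    """Map chassis name (the serial, for disk shelves) -> state.

    Only rows starting with 'chassis-' are considered (assumed layout).
    """
    states = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith("chassis-"):
            states[fields[1]] = fields[2]
    return states

def compare_chassis(old_lines, new_lines):
    """Return (missing, added) chassis names between two listings."""
    old, new = chassis_states(old_lines), chassis_states(new_lines)
    return sorted(set(old) - set(new)), sorted(set(new) - set(old))

# Shortened rows from the old (21-Oct) and current (17-Jan) bundles.
old_rows = [
    "Chassis:",
    "chassis-005  1515NMT09U      ok        Oracle   Oracle Storage DE2-24C   1515NMT09U   7200   hdd",
    "chassis-006  1514NMT0HW      ok        Oracle   Oracle Storage DE2-24C   1514NMT0HW   7200   hdd",
    "chassis-007  1514NMT0E6      ok        Oracle   Oracle Storage DE2-24C   1514NMT0E6   7200   hdd",
]
new_rows = [
    "Chassis:",
    "chassis-005  1515NMT09U      ok        Oracle   Oracle Storage DE2-24C   1515NMT09U   7200   hdd",
    "chassis-006  1514NMT0E6      ok        Oracle   Oracle Storage DE2-24C   1514NMT0E6   7200   hdd",
]

missing, added = compare_chassis(old_rows, new_rows)
print("missing from current bundle:", missing)
print("new in current bundle:", added)
```

Comparing by serial rather than by chassis-NNN label matters here, because the labels are renumbered once a chassis disappears (1514NMT0E6 moves from chassis-007 to chassis-006).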

 

Solution

Because the chassis are not daisy-chained, the fault most likely lies with the missing chassis itself or with the components associated with it.

Perform the following steps, then check whether the chassis is detected:

(1)  Check the cable connections
(2)  Check the power supply to chassis 1514NMT0HW
(3)  Re-seat both IOMs on chassis 1514NMT0HW
(4)  Replace the IOM if required

 

 

References

<NOTE:1380045.1> - Sun Storage 7000 Unified Storage System: Resilver did not start after replacing a failed disk
<NOTE:1581784.1> - Sun Storage 7000 Unified Storage System: How To Map a SAS-2 (pmcs) backend from a supportbundle

  Copyright © 2018 Oracle, Inc.  All rights reserved.