Asset ID: 1-72-1509059.1
Update Date: 2018-04-05
Solution Type: Problem Resolution Sure Solution
1509059.1: Sun SPARC[TM] Enterprise M3000, M4000, M5000, M8000, M9000 Identifying HDDx of internal disks under mpxio control and removing internal disks from mpxio control
Related Items
- Sun SPARC Enterprise M9000-32 Server
- Sun SPARC Enterprise M8000 Server
- Sun SPARC Enterprise M5000 Server
- Sun SPARC Enterprise M4000 Server
- Sun SPARC Enterprise M3000 Server
- Sun Storage Traffic Manager Software
Related Categories
- PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Mx000
Created from <SR 3-6431440551>
Applies to:
Sun SPARC Enterprise M5000 Server - Version Not Applicable and later
Sun SPARC Enterprise M8000 Server - Version Not Applicable and later
Sun SPARC Enterprise M9000-32 Server - Version Not Applicable and later
Sun Storage Traffic Manager Software - Version 3.0 and later
Sun SPARC Enterprise M4000 Server - Version Not Applicable and later
Oracle Solaris on SPARC (32-bit)
Oracle Solaris on SPARC (64-bit)
Symptoms
The issue of concern is a failed internal disk under multipath control.
Changes
Cause
Failed internal disk under mpxio control
Solution
The following example is from an M4000.
The format command shows an internal disk failure:
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c3t5000C50007E0A7B7d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /scsi_vhci/disk@g5000c50007e0a7b7
       1. c3t5000C50007E0B1A3d0 <drive not available>
          /scsi_vhci/disk@g5000c50007e0b1a3
However, this does not provide enough information to determine which physical internal disk has failed, because the internal disks are under multipath control.
Caution: Do not assume from the above that c3t5000C50007E0B1A3d0 is HDD1 just because it is the second disk listed.
iostat -En shows the errors, with the disks named the same way as in format:
c3t5000C50007E0B1A3d0 Soft Errors: 1 Hard Errors: 176 Transport Errors: 11019
Vendor: SEAGATE Product: ST973402SSUN72G Revision: 0603 Serial No: 074822ACY1
Size: 73.41GB <73407865856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1
Illegal Request: 0 Predictive Failure Analysis: 6
c3t5000C50007E0A7B7d0 Soft Errors: 0 Hard Errors: 115 Transport Errors: 1
Vendor: SEAGATE Product: ST973402SSUN72G Revision: 0603 Serial No: 0748229N2C
Size: 73.41GB <73407865856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 115 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
This confirms the failure but does not identify the physical internal disk.
Although the stmsboot -L information is not collected in the explorer, it provides the most straightforward way to determine which internal disk matches the multipath disk name:
# stmsboot -L
non-STMS device name                    STMS device name
------------------------------------------------------------------
/dev/rdsk/c0t0d0        /dev/rdsk/c3t5000C50007E0B1A3d0
/dev/rdsk/c0t1d0        /dev/rdsk/c3t5000C50007E0A7B7d0
This shows that the failed multipath disk c3t5000C50007E0B1A3d0 is c0t0d0. The ls_-lAR_@dev_@devices.out file from the disks directory in explorer shows:
lrwxrwxrwx 1 root root 64 Aug 3 2011 c0t0d0s0 -> ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:a,raw
lrwxrwxrwx 1 root root 64 Aug 3 2011 c0t0d0s1 -> ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:b,raw
lrwxrwxrwx 1 root root 64 Aug 3 2011 c0t0d0s2 -> ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:c,raw
lrwxrwxrwx 1 root root 64 Aug 3 2011 c0t0d0s3 -> ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:d,raw
lrwxrwxrwx 1 root root 64 Aug 3 2011 c0t0d0s4 -> ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:e,raw
lrwxrwxrwx 1 root root 64 Aug 3 2011 c0t0d0s5 -> ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:f,raw
lrwxrwxrwx 1 root root 64 Aug 3 2011 c0t0d0s6 -> ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:g,raw
lrwxrwxrwx 1 root root 64 Aug 3 2011 c0t0d0s7 -> ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:h,raw
Based on the path information, this is HDD0.
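The stmsboot -L lookup above can also be scripted. The following is a minimal sketch, assuming the stmsboot -L output has the two-column layout shown above; the function name stms_to_legacy and the variable STMSBOOT_L are hypothetical, and the sample data simply repeats the M4000 example.

```shell
#!/bin/sh
# Sketch: translate an STMS (multipath) device name back to its non-STMS name
# by reading `stmsboot -L` output. Function and variable names are illustrative.
# Note: on Solaris, use nawk or /usr/xpg4/bin/awk, since the legacy
# /usr/bin/awk does not support -v.

# Sample data copied from the example in this document.
STMSBOOT_L='non-STMS device name                    STMS device name
------------------------------------------------------------------
/dev/rdsk/c0t0d0        /dev/rdsk/c3t5000C50007E0B1A3d0
/dev/rdsk/c0t1d0        /dev/rdsk/c3t5000C50007E0A7B7d0'

# stms_to_legacy STMS_NAME
# Reads `stmsboot -L` output on stdin and prints the non-STMS device
# (column 1) whose STMS device (column 2) matches STMS_NAME.
stms_to_legacy() {
    awk -v want="/dev/rdsk/$1" '$2 == want { print $1 }'
}

# Look up the failed disk from the format output.
printf '%s\n' "$STMSBOOT_L" | stms_to_legacy c3t5000C50007E0B1A3d0
```

On a live system the sample variable would be replaced by piping the real command into the function, for example `stmsboot -L | stms_to_legacy c3t5000C50007E0B1A3d0`.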
Without the stmsboot information, the iostat -E and iostat -En output can be used to identify the disk.
iostat -E lists the disks by sd number but does not give the multipath name. Comparing the iostat -E output with the iostat -En output (for example, by matching the serial numbers) gives the sd number for each multipath name. The sd number is useful when reviewing messages. The iostat -E output from the same system:
sd3 Soft Errors: 1 Hard Errors: 176 Transport Errors: 11269
Vendor: SEAGATE Product: ST973402SSUN72G Revision: 0603 Serial No: 074822ACY1
Size: 73.41GB <73407865856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1
Illegal Request: 0 Predictive Failure Analysis: 6
sd4 Soft Errors: 0 Hard Errors: 115 Transport Errors: 1
Vendor: SEAGATE Product: ST973402SSUN72G Revision: 0603 Serial No: 0748229N2C
Size: 73.41GB <73407865856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 115 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
The comparison shows that sd3 is the failed disk.
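The comparison of the two iostat listings can be sketched as a small script. This is only an illustration, assuming the serial number is the matching key (the two samples in this document share serial numbers but not transport error counts, since they were taken at different times). The function names serials and sd_to_mpxio are hypothetical, and the sample data is trimmed to the lines the script needs.

```shell
#!/bin/sh
# Sketch: pair sd instance names (iostat -E) with multipath device names
# (iostat -En) by matching the "Serial No:" field. Names are illustrative.

# Trimmed samples from this document (device line plus Vendor line only).
IOSTAT_E='sd3 Soft Errors: 1 Hard Errors: 176 Transport Errors: 11269
Vendor: SEAGATE Product: ST973402SSUN72G Revision: 0603 Serial No: 074822ACY1
sd4 Soft Errors: 0 Hard Errors: 115 Transport Errors: 1
Vendor: SEAGATE Product: ST973402SSUN72G Revision: 0603 Serial No: 0748229N2C'

IOSTAT_EN='c3t5000C50007E0B1A3d0 Soft Errors: 1 Hard Errors: 176 Transport Errors: 11019
Vendor: SEAGATE Product: ST973402SSUN72G Revision: 0603 Serial No: 074822ACY1
c3t5000C50007E0A7B7d0 Soft Errors: 0 Hard Errors: 115 Transport Errors: 1
Vendor: SEAGATE Product: ST973402SSUN72G Revision: 0603 Serial No: 0748229N2C'

# serials: reads iostat -E style text on stdin, prints "SERIAL DEVICE" pairs.
serials() {
    awk '/Soft Errors:/ { dev = $1 }
         { for (i = 1; i <= NF - 2; i++)
               if ($i == "Serial" && $(i + 1) == "No:") print $(i + 2), dev }'
}

# sd_to_mpxio E_TEXT EN_TEXT: prints "sdN multipath-name" for matching serials.
sd_to_mpxio() {
    {
        printf '%s\n' "$1" | serials | sed 's/^/E /'   # tag iostat -E rows
        printf '%s\n' "$2" | serials | sed 's/^/N /'   # tag iostat -En rows
    } | awk '$1 == "E" { sd[$2] = $3 }
             $1 == "N" && ($2 in sd) { print sd[$2], $3 }'
}

sd_to_mpxio "$IOSTAT_E" "$IOSTAT_EN"
```

With the sample data, the first output line pairs sd3 with c3t5000C50007E0B1A3d0, which agrees with the comparison above.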
The output of the following command is included in the sysconfig directory of the explorer:
cfgadm -alv
c0::0,0 connected configured unknown Client Device: /dev/dsk/c3t5000C50007E0B1A3d0s0(sd3)
unavailable disk-path n /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::0,0
c0::1,0 connected configured unknown Client Device: /dev/dsk/c3t5000C50007E0A7B7d0s0(sd4)
unavailable disk-path n /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::1,0
This shows that sd3 (c3t5000C50007E0B1A3d0) is /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::0,0, which is HDD0.
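The same information can be pulled out of the cfgadm -alv listing with a short script. This is a sketch only, assuming each attachment point appears as a "Client Device:" line followed by a line ending in the /devices path, as in the excerpt above; the function name cfgadm_paths is hypothetical.

```shell
#!/bin/sh
# Sketch: list "sdN multipath-name physical-path" from cfgadm -alv output.
# Assumes the two-line-per-attachment-point layout shown in this document.

CFGADM_ALV='c0::0,0 connected configured unknown Client Device: /dev/dsk/c3t5000C50007E0B1A3d0s0(sd3)
unavailable disk-path n /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::0,0
c0::1,0 connected configured unknown Client Device: /dev/dsk/c3t5000C50007E0A7B7d0s0(sd4)
unavailable disk-path n /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::1,0'

# cfgadm_paths: reads cfgadm -alv text on stdin.
cfgadm_paths() {
    awk '
        /Client Device:/ {
            dev = $NF                          # /dev/dsk/<name>sN(sdN)
            sub(/^\/dev\/dsk\//, "", dev)      # drop the /dev/dsk/ prefix
            inst = dev
            sub(/^.*\(/, "", inst)             # keep the text after "("
            sub(/\).*$/, "", inst)             # and before ")", e.g. sd3
            sub(/s[0-9]+\(.*$/, "", dev)       # strip slice and "(sdN)"
            next
        }
        dev != "" && $NF ~ /^\/devices\// {
            print inst, dev, $NF               # physical path is the last field
            dev = ""
        }
    '
}

printf '%s\n' "$CFGADM_ALV" | cfgadm_paths
```

The trailing target and LUN in the physical path (0,0 or 1,0) is the part to compare against the device path notes listed under References.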
Removing Internal Disks From stms or mpxio
The next thing to consider is having the customer take the internal disks out of multipath control. Putting the internal
disks under multipath control does not make access to them any faster or more reliable, makes errors harder to
diagnose, and may prevent a disk that needs to be replaced from being quiesced. If the disk cannot be quiesced, the domain will
need to be shut down to replace the disk. In addition, there are bugs that cause issues with cfgadm if mpxio is enabled.
Note: Hot plug replacement of disk devices behind a mpt controller under Solaris I/O Multipathing (MPxIO/scsi_vhci) requires SCSAv3 released with Patch 144500-19.
See:
Bug 15525492 - SUNBT6776330 cfgadm scsi plugin needs to provide support for mpxio enabled contr
Bug 22463214 - cfgadm -c unconfigure for scsi_vhci controlled internal disk on M4000 fails
To take the two internal disks out of multipath control, use the following command:
stmsboot -d -D mpt
The above command will remove multipathing on all mpt based controllers on the system which is usually what is desired. To disable on a per port basis see the section titled "Enabling or Disabling Multipathing on a Per-Port Basis" for the Solaris SAN version running on the system.
To turn mpxio control off completely:
stmsboot -d
When the stmsboot command completes, the /kernel/drv/mpt.conf file can be checked to confirm that it has been modified.
The stmsboot command must be run in a maintenance window because a reconfiguration reboot is required. When the stmsboot
command is issued, you will be prompted to reboot.
Please see the stmsboot man page for additional information.
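The check of /kernel/drv/mpt.conf mentioned above could be done along these lines. This is a hedged sketch: it assumes multipathing for the mpt driver is controlled by a global mpxio-disable property in that file, with "yes" meaning the disks are out of multipath control. The sample file content and the function name mpxio_setting are hypothetical, so confirm the details against the stmsboot man page for the installed release.

```shell
#!/bin/sh
# Sketch: report the global mpxio-disable value from driver.conf-style text.
# Assumption: the property is written as  mpxio-disable="yes";  or "no".

# Hypothetical excerpt of /kernel/drv/mpt.conf after `stmsboot -d -D mpt`.
SAMPLE_CONF='# illustrative content only, not copied from a real system
#mpxio-disable="no";
mpxio-disable="yes";'

# mpxio_setting: reads the file text on stdin and prints the value of the
# last uncommented global mpxio-disable line, or "unset" if none is found.
# Per-port entries (lines starting with name=) are deliberately ignored.
mpxio_setting() {
    awk '
        /^[ \t]*#/ { next }                    # skip comment lines
        /^[ \t]*mpxio-disable[ \t]*=/ {
            val = $0
            sub(/^[^"]*"/, "", val)            # drop text up to the opening quote
            sub(/".*$/, "", val)               # drop the closing quote and the rest
            found = val
        }
        END { print (found == "" ? "unset" : found) }
    '
}

printf '%s\n' "$SAMPLE_CONF" | mpxio_setting
# On a live system:
#   mpxio_setting < /kernel/drv/mpt.conf
```

A result of yes would indicate, under the assumption above, that the mpt-attached internal disks will no longer be under mpxio control after the reconfiguration reboot.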
References
<NOTE:1019730.1> - Sun SPARC(R) Enterprise M3000 and M3000-E (SPARC VII+) Server Device Paths
<NOTE:1002807.1> - Sun SPARC[TM] Enterprise M4000 and M5000 Server Device Paths
<NOTE:1004116.1> - Sun SPARC(R) Enterprise M8000 and M9000 Device Paths
<BUG:15525492> - SUNBT6776330 CFGADM SCSI PLUGIN NEEDS TO PROVIDE SUPPORT FOR MPXIO ENABLED CONTR
<BUG:15403641> - SUNBT6569367 CFGADM DOESN'T GROK SAS-ATTACHED TARGETS/LUNS WHEN MPXIO IS ENABLED