Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1509059.1
Update Date: 2018-04-05
Keywords:

Solution Type  Problem Resolution Sure

Solution  1509059.1 :   Sun SPARC[TM] Enterprise M3000, M4000, M5000, M8000, M9000 Identifying HDDx of internal disks under mpxio control and removing internal disks from mpxio control  


Related Items
  • Sun SPARC Enterprise M9000-32 Server
  • Sun SPARC Enterprise M8000 Server
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M3000 Server
  • Sun Storage Traffic Manager Software
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Mx000




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-6431440551>

Applies to:

Sun SPARC Enterprise M5000 Server - Version Not Applicable and later
Sun SPARC Enterprise M8000 Server - Version Not Applicable and later
Sun SPARC Enterprise M9000-32 Server - Version Not Applicable and later
Sun Storage Traffic Manager Software - Version 3.0 and later
Sun SPARC Enterprise M4000 Server - Version Not Applicable and later
Oracle Solaris on SPARC (32-bit)
Oracle Solaris on SPARC (64-bit)

Symptoms

The issue of concern is a failed internal disk under multipath control.

Changes

 

Cause

Failed internal disk under mpxio control
 

Solution

The following example is from an M4000.

The format command shows an internal disk failure:

Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c3t5000C50007E0A7B7d0<SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /scsi_vhci/disk@g5000c50007e0a7b7
       1. c3t5000C50007E0B1A3d0<drive not available>
          /scsi_vhci/disk@g5000c50007e0b1a3

However, this output does not provide enough information to determine which physical internal disk has failed, because the internal disks are under multipath control.

Caution: Do not assume from the above that c3t5000C50007E0B1A3d0 is HDD1 just because it is the second disk listed.
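
As a minimal sketch, the unavailable multipath name can be pulled out of a saved copy of the format listing as follows. The function name unavailable_disks and the idea of saving the listing to a string are assumptions made for illustration. The result is only the multipath name; it says nothing about the physical slot.

```python
import re

# Sample taken from the format listing above.
FORMAT_OUTPUT = """\
AVAILABLE DISK SELECTIONS:
       0. c3t5000C50007E0A7B7d0<SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /scsi_vhci/disk@g5000c50007e0a7b7
       1. c3t5000C50007E0B1A3d0<drive not available>
          /scsi_vhci/disk@g5000c50007e0b1a3
"""

# Matches "<index>. <device name><label text>".
ENTRY = re.compile(r"^\s*(\d+)\.\s+(\S+?)\s*<(.*)>\s*$")


def unavailable_disks(text):
    """Return device names whose label reads 'drive not available'."""
    failed = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match and "drive not available" in match.group(3):
            failed.append(match.group(2))
    return failed


print(unavailable_disks(FORMAT_OUTPUT))
```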


iostat -En shows the errors, using the same disk names that format displays:

c3t5000C50007E0B1A3d0 Soft Errors: 1 Hard Errors: 176 Transport Errors: 11019
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 074822ACY1
Size: 73.41GB<73407865856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1
Illegal Request: 0 Predictive Failure Analysis: 6
c3t5000C50007E0A7B7d0 Soft Errors: 0 Hard Errors: 115 Transport Errors: 1
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 0748229N2C
Size: 73.41GB<73407865856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 115 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

This confirms the failure, but does not identify the internal disk.

Although the stmsboot -L information is not collected in the explorer, it provides the most straightforward way to determine which internal disk matches the multipath disk name:


# stmsboot -L
non-STMS device name                    STMS device name
------------------------------------------------------------------
/dev/rdsk/c0t0d0        /dev/rdsk/c3t5000C50007E0B1A3d0
/dev/rdsk/c0t1d0        /dev/rdsk/c3t5000C50007E0A7B7d0

This shows that the failed multipath disk c3t5000C50007E0B1A3d0 is c0t0d0. The ls_-lAR_@dev_@devices.out file from the disks directory in explorer lists the following links for c0t0d0:

lrwxrwxrwx   1 root     root          64 Aug  3  2011 c0t0d0s0 ->  ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:a,raw
lrwxrwxrwx   1 root     root          64 Aug  3  2011 c0t0d0s1 ->  ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:b,raw
lrwxrwxrwx   1 root     root          64 Aug  3  2011 c0t0d0s2 ->  ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:c,raw
lrwxrwxrwx   1 root     root          64 Aug  3  2011 c0t0d0s3 ->  ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:d,raw
lrwxrwxrwx   1 root     root          64 Aug  3  2011 c0t0d0s4 ->  ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:e,raw
lrwxrwxrwx   1 root     root          64 Aug  3  2011 c0t0d0s5 ->  ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:f,raw
lrwxrwxrwx   1 root     root          64 Aug  3  2011 c0t0d0s6 ->  ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:g,raw
lrwxrwxrwx   1 root     root          64 Aug  3  2011 c0t0d0s7 ->  ../../devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0:h,raw

The device path in these links shows that this is HDD0.
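
The lookup from multipath name to non-STMS name can be sketched as below, assuming the stmsboot -L output has been captured as text in the two-column layout shown above. The function name stms_to_non_stms is hypothetical.

```python
# Sample taken from the stmsboot -L output above.
STMSBOOT_L_OUTPUT = """\
non-STMS device name                    STMS device name
------------------------------------------------------------------
/dev/rdsk/c0t0d0        /dev/rdsk/c3t5000C50007E0B1A3d0
/dev/rdsk/c0t1d0        /dev/rdsk/c3t5000C50007E0A7B7d0
"""


def stms_to_non_stms(text):
    """Map each STMS (multipath) device name to its non-STMS name."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows hold exactly two /dev/rdsk paths; the header and the
        # dashed rule do not, so they are skipped.
        if len(fields) == 2 and all(f.startswith("/dev/rdsk/") for f in fields):
            non_stms, stms = (f.rsplit("/", 1)[-1] for f in fields)
            mapping[stms] = non_stms
    return mapping


mapping = stms_to_non_stms(STMSBOOT_L_OUTPUT)
print(mapping["c3t5000C50007E0B1A3d0"])  # c0t0d0
```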

Without the stmsboot information, the iostat -E and iostat -En output can be used to identify the disk.

iostat -E shows the disks by sd number, but does not give the multipath name. The sd number can be determined by comparing the iostat -E output with the iostat -En output (for example, by matching the serial numbers). The sd number is useful when reviewing messages.


sd3       Soft Errors: 1 Hard Errors: 176 Transport Errors: 11269
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 074822ACY1
Size: 73.41GB<73407865856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1
Illegal Request: 0 Predictive Failure Analysis: 6
sd4       Soft Errors: 0 Hard Errors: 115 Transport Errors: 1
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 0748229N2C
Size: 73.41GB<73407865856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 115 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

The comparison shows that sd3 is the failed disk.
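
The comparison can be sketched as a join on the serial number, assuming both outputs are available as text and that every disk reports a unique, non-empty serial number. The function names are hypothetical, and the samples below are trimmed to the two lines per disk that the join needs.

```python
import re

# Trimmed from the iostat -E output above.
IOSTAT_E = """\
sd3       Soft Errors: 1 Hard Errors: 176 Transport Errors: 11269
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 074822ACY1
sd4       Soft Errors: 0 Hard Errors: 115 Transport Errors: 1
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 0748229N2C
"""

# Trimmed from the iostat -En output above.
IOSTAT_EN = """\
c3t5000C50007E0B1A3d0 Soft Errors: 1 Hard Errors: 176 Transport Errors: 11019
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 074822ACY1
c3t5000C50007E0A7B7d0 Soft Errors: 0 Hard Errors: 115 Transport Errors: 1
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 0748229N2C
"""

HEADER = re.compile(r"^(\S+)\s+Soft Errors:")
SERIAL = re.compile(r"Serial No:\s*(\S+)")


def serial_by_device(text):
    """Return {device name: serial number} for one iostat listing."""
    result = {}
    device = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            device = header.group(1)
            continue
        serial = SERIAL.search(line)
        if serial and device is not None:
            result[device] = serial.group(1)
    return result


def sd_for_multipath_names(iostat_e, iostat_en):
    """Return {multipath name: sd instance}, joined on the serial number."""
    sd_by_serial = {s: d for d, s in serial_by_device(iostat_e).items()}
    return {
        name: sd_by_serial.get(serial)
        for name, serial in serial_by_device(iostat_en).items()
    }


print(sd_for_multipath_names(IOSTAT_E, IOSTAT_EN))
```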

The output of the following command is included in the explorer in the sysconfig directory:

cfgadm -alv

c0::0,0                        connected    configured   unknown    Client Device: /dev/dsk/c3t5000C50007E0B1A3d0s0(sd3)
unavailable  disk-path    n        /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::0,0
c0::1,0                        connected    configured   unknown    Client Device: /dev/dsk/c3t5000C50007E0A7B7d0s0(sd4)
unavailable  disk-path    n        /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::1,0

This shows that sd3, or c3t5000C50007E0B1A3d0, is /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::0,0, which is HDD0.
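
The same association can be extracted from a saved copy of the cfgadm output. This is a minimal sketch that assumes each attachment point is followed by one line ending in its physical path, as in the listing above. The function name client_devices and the dictionary keys are hypothetical.

```python
import re

# Sample taken from the cfgadm output above.
CFGADM_OUTPUT = """\
c0::0,0                        connected    configured   unknown    Client Device: /dev/dsk/c3t5000C50007E0B1A3d0s0(sd3)
unavailable  disk-path    n        /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::0,0
c0::1,0                        connected    configured   unknown    Client Device: /dev/dsk/c3t5000C50007E0A7B7d0s0(sd4)
unavailable  disk-path    n        /devices/pci@0,600000/pci@0/pci@8/pci@0/scsi@1:scsi::1,0
"""

# Captures the attachment point, the multipath disk name (slice suffix
# dropped) and the sd instance from the "Client Device:" field.
CLIENT = re.compile(
    r"^(\S+)\s.*Client Device:\s*/dev/dsk/(\S+?)s\d+\((sd\d+)\)"
)


def client_devices(text):
    """Return one dict per attachment point that has a client device."""
    entries = []
    current = None
    for line in text.splitlines():
        match = CLIENT.match(line)
        if match:
            current = {
                "ap_id": match.group(1),
                "disk": match.group(2),
                "sd": match.group(3),
                "path": None,
            }
            entries.append(current)
        elif current is not None and current["path"] is None:
            fields = line.split()
            if fields and fields[-1].startswith("/devices/"):
                current["path"] = fields[-1]
    return entries


for entry in client_devices(CFGADM_OUTPUT):
    print(entry["sd"], entry["disk"], entry["path"])
```

The physical path still has to be looked up in the device path notes listed under References to arrive at the HDD number.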

 

Removing Internal Disks From stms or mpxio 


The next thing to consider is having the customer take the internal disks out of multipath control. Putting an internal
disk under multipath control does not make access to the disk any faster or more reliable. It makes errors harder to diagnose
and may prevent a disk that needs replacing from being quiesced. If the disk cannot be quiesced, the domain will
need to be shut down to replace the disk. In addition, there are bugs that cause issues with cfgadm if mpxio is enabled.

Note: Hot-plug replacement of disk devices behind an mpt controller under Solaris I/O Multipathing (MPxIO/scsi_vhci) requires SCSAv3, released with Patch 144500-19.

See:

Bug 15525492 - SUNBT6776330 cfgadm scsi plugin needs to provide support for mpxio enabled contr

Bug 22463214 - cfgadm -c unconfigure for scsi_vhci controlled internal disk on M4000 fails

 



To take the two internal disks out of multipath control, use the following command:

             stmsboot -d -D mpt

The above command removes multipathing on all mpt-based controllers on the system, which is usually what is desired. To disable multipathing on a per-port basis, see the section titled "Enabling or Disabling Multipathing on a Per-Port Basis" for the Solaris SAN version running on the system.

To turn mpxio control off completely:

             stmsboot -d

On completion of the stmsboot command, the /kernel/drv/mpt.conf file can be checked to confirm that it has been modified.
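
One way to script that check is sketched below. It assumes the change appears as a global mpxio-disable="yes"; property in mpt.conf; this document only states that the file is modified, so verify the property name against the stmsboot man page. The function name and the sample excerpt are hypothetical and are not copied from a real system.

```python
import re

# Hypothetical excerpt for illustration only.
SAMPLE_CONF = """\
# mpxio-disable="no";
mpxio-disable="yes";
"""

# Only lines that begin with the property count as the global setting.
# Comment lines start with '#', and per-port entries start with name=,
# so neither matches.
GLOBAL_SETTING = re.compile(r'^\s*mpxio-disable\s*=\s*"(yes|no)"\s*;')


def mpxio_disabled(conf_text):
    """Return True or False for the last global setting, None if absent."""
    value = None
    for line in conf_text.splitlines():
        match = GLOBAL_SETTING.match(line)
        if match:
            value = match.group(1) == "yes"
    return value


print(mpxio_disabled(SAMPLE_CONF))
```

On a live system, the text would come from reading /kernel/drv/mpt.conf rather than from the sample string.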

The stmsboot command must be run in a maintenance window because a reconfiguration reboot is required. When the stmsboot
command is issued, you will be prompted to reboot.

Please see the stmsboot man page for additional information.

References

<NOTE:1019730.1> - Sun SPARC(R) Enterprise M3000 and M3000-E (SPARC VII+) Server Device Paths
<NOTE:1002807.1> - Sun SPARC[TM] Enterprise M4000 and M5000 Server Device Paths
<NOTE:1004116.1> - Sun SPARC(R) Enterprise M8000 and M9000 Device Paths
<BUG:15525492> - SUNBT6776330 CFGADM SCSI PLUGIN NEEDS TO PROVIDE SUPPORT FOR MPXIO ENABLED CONTR
<BUG:15403641> - SUNBT6569367 CFGADM DOESN'T GROK SAS-ATTACHED TARGETS/LUNS WHEN MPXIO IS ENABLED

  Copyright © 2018 Oracle, Inc.  All rights reserved.