
Asset ID: 1-72-2332191.1
Update Date: 2018-03-07
Keywords:

Solution Type: Problem Resolution Sure

Solution 2332191.1: Unable to Unconfigure EMC Storage on Solaris 11 Guest IO LDom - cfgadm: Library error: remove operation failed


Related Items
  • SPARC M6-32
  • Solaris Operating System
Related Categories
  • PLA-Support>Sun Systems>DISK>HBA>SN-DK: FC HBA




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-15526172281>

Applies to:

SPARC M6-32 - Version All Versions and later
Solaris Operating System - Version 10 3/05 and later
Information in this document applies to any platform.

Symptoms

This is a Solaris 11.3 SRU 12 guest IO LDom "guest01" with two VF FC HBAs accessing an EMC storage array. This is the connectivity as seen from the Solaris server:

c2 = qlc1 (fp2) -> /devices/pci@a40/pci@1/pci@0/pci@4/SUNW,qlc@0,1c/fp@0,0:devctl
================================================================================
Port_ID Port WWN Device Description Type
------- -------- ------------------ ----
b006c0 50060165xxxxxxxx -> EMC (Disk device)
b00720 5006016cxxxxxxxx -> EMC (Disk device)
b00226 100000144xxxxx7b -> Sun On-Board FC Ctlr (Unknown Type,Host Bus Adapter)

c1 = qlc0 (fp4) -> /devices/pci@740/pci@1/pci@0/pci@8/SUNW,qlc@0,c/fp@0,0:devctl
================================================================================
Port_ID Port WWN Device Description Type
------- -------- ------------------ ----
570640 50060164xxxxxxxx -> EMC (Disk device)
570680 5006016dxxxxxxxx -> EMC (Disk device)
570366 100000144xxxxx7a -> Sun On-Board FC Ctlr (Unknown Type,Host Bus Adapter)
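
For reference, an equivalent connectivity map can be gathered from the guest with the standard Solaris FC tools; a minimal sketch using the devctl paths shown above (the output layout differs slightly from the summary above):

# fcinfo hba-port --> lists the HBA ports, their WWNs and the /dev/cfg/cN controller each one maps to
# luxadm -e dump_map /devices/pci@740/pci@1/pci@0/pci@8/SUNW,qlc@0,c/fp@0,0:devctl
# luxadm -e dump_map /devices/pci@a40/pci@1/pci@0/pci@4/SUNW,qlc@0,1c/fp@0,0:devctl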

 

Disks from this EMC array are under MPxIO, BUT the behavior described in this document has also been observed with MPxIO disabled, so MPxIO is not a factor here.

The customer intends to remove this EMC storage from this server.

In this example there is a 10 GB LUN 0 mapped from the EMC array to this server; this is important here:

5. c0t6006016018F03A0006B01C32C780E711d0 <DGC-VRAID-0532-9.98GB>
/scsi_vhci/ssd@g6006016018f03a0006b01c32c780e711

Four paths under MPxIO, as seen with the command "mpathadm list LU":

/dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s2
Total Path Count: 4
Operational Path Count: 4

 

If we unmap this LUN 0 on the EMC storage, we then fail to remove this LUN / EMC storage from Solaris; it remains visible under cfgadm as failing.

The procedure should be:
cfgadm -c configure cX --> to mark LUN 0 as unusable
cfgadm -c unconfigure -o unusable_SCSI_LUN <c#::pwwn> --> to clear the unusable LUN
devfsadm -Cv --> to clear the device tree

 

Description of events:

After LUN 0 is unmapped, the disk is reported by format as "drive type unknown"; this is expected:

5. c0t6006016018F03A0006B01C32C780E711d0 <drive type unknown>
/scsi_vhci/ssd@g6006016018f03a0006b01c32c780e711
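
A quick, non-interactive way to capture that state is to pipe an empty line into format; a sketch (the grep pattern is just for this example):

# echo | format | grep -i "drive type unknown"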

Note. The EMC array will present a new LUNZ seen as LUN 0 (this LUNZ is only presented by EMC arrays)

# cfgadm -alo show_SCSI_LUN
Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             fc-fabric    connected    configured   unknown
c1::50060164xxxxxxxx,0         disk         connected    configured   unusable
c1::5006016dxxxxxxxx,0         disk         connected    configured   unusable
c2                             fc-fabric    connected    configured   unknown
c2::50060165xxxxxxxx,0         disk         connected    configured   unusable
c2::5006016cxxxxxxxx,0         disk         connected    configured   unusable

 

We run these four commands with no errors:

cfgadm -c unconfigure -o unusable_SCSI_LUN c1::50060164xxxxxxxx

cfgadm -c unconfigure -o unusable_SCSI_LUN c1::5006016dxxxxxxxx

cfgadm -c unconfigure -o unusable_SCSI_LUN c2::50060165xxxxxxxx

cfgadm -c unconfigure -o unusable_SCSI_LUN c2::5006016cxxxxxxxx

 

We also remove the stale entries from the device tree:

# devfsadm -Cvvv
devfsadm[25734]: verbose: removing file: /dev/dsk/c0t6006016018F03A0006B01C32C780E711d0
devfsadm[25734]: verbose: removing file: /dev/dsk/c0t6006016018F03A0006B01C32C780E711d0s0
devfsadm[25734]: verbose: removing file: /dev/dsk/c0t6006016018F03A0006B01C32C780E711d0s1
devfsadm[25734]: verbose: removing file: /dev/dsk/c0t6006016018F03A0006B01C32C780E711d0s2
devfsadm[25734]: verbose: removing file: /dev/dsk/c0t6006016018F03A0006B01C32C780E711d0s3
devfsadm[25734]: verbose: removing file: /dev/dsk/c0t6006016018F03A0006B01C32C780E711d0s4
devfsadm[25734]: verbose: removing file: /dev/dsk/c0t6006016018F03A0006B01C32C780E711d0s5
devfsadm[25734]: verbose: removing file: /dev/dsk/c0t6006016018F03A0006B01C32C780E711d0s6
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t50060164xxxxxxxxd0s0
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t50060164xxxxxxxxd0s1
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t50060164xxxxxxxxd0s2
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t50060164xxxxxxxxd0s3
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t50060164xxxxxxxxd0s4
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t50060164xxxxxxxxd0s5
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t50060164xxxxxxxxd0s6
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t50060164xxxxxxxxd0s7
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t5006016Dxxxxxxxxd0s0
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t5006016Dxxxxxxxxd0s1
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t5006016Dxxxxxxxxd0s2
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t5006016Dxxxxxxxxd0s3
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t5006016Dxxxxxxxxd0s4
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t5006016Dxxxxxxxxd0s5
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t5006016Dxxxxxxxxd0s6
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t5006016Dxxxxxxxxd0s7
devfsadm[25734]: verbose: removing file: /dev/dsk/c1t6006016018F03A0006B01C32C780E711d0
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t50060165xxxxxxxxd0s0
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t50060165xxxxxxxxd0s1
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t50060165xxxxxxxxd0s2
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t50060165xxxxxxxxd0s3
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t50060165xxxxxxxxd0s4
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t50060165xxxxxxxxd0s5
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t50060165xxxxxxxxd0s6
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t50060165xxxxxxxxd0s7
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t5006016Cxxxxxxxxd0s0
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t5006016Cxxxxxxxxd0s1
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t5006016Cxxxxxxxxd0s2
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t5006016Cxxxxxxxxd0s3
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t5006016Cxxxxxxxxd0s4
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t5006016Cxxxxxxxxd0s5
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t5006016Cxxxxxxxxd0s6
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t5006016Cxxxxxxxxd0s7
devfsadm[25734]: verbose: removing file: /dev/dsk/c2t6006016018F03A0006B01C32C780E711d0
devfsadm[25734]: verbose: removing file: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0
devfsadm[25734]: verbose: removing file: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s0
devfsadm[25734]: verbose: removing file: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s1
devfsadm[25734]: verbose: removing file: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s2
devfsadm[25734]: verbose: removing file: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s3
devfsadm[25734]: verbose: removing file: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s4
devfsadm[25734]: verbose: removing file: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s5
devfsadm[25734]: verbose: removing file: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s6
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t50060164xxxxxxxxd0s0
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t50060164xxxxxxxxd0s1
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t50060164xxxxxxxxd0s2
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t50060164xxxxxxxxd0s3
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t50060164xxxxxxxxd0s4
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t50060164xxxxxxxxd0s5
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t50060164xxxxxxxxd0s6
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t50060164xxxxxxxxd0s7
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t5006016Dxxxxxxxxd0s0
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t5006016Dxxxxxxxxd0s1
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t5006016Dxxxxxxxxd0s2
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t5006016Dxxxxxxxxd0s3
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t5006016Dxxxxxxxxd0s4
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t5006016Dxxxxxxxxd0s5
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t5006016Dxxxxxxxxd0s6
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t5006016Dxxxxxxxxd0s7
devfsadm[25734]: verbose: removing file: /dev/rdsk/c1t6006016018F03A0006B01C32C780E711d0
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t50060165xxxxxxxxd0s0
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t50060165xxxxxxxxd0s1
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t50060165xxxxxxxxd0s2
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t50060165xxxxxxxxd0s3
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t50060165xxxxxxxxd0s4
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t50060165xxxxxxxxd0s5
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t50060165xxxxxxxxd0s6
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t50060165xxxxxxxxd0s7
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t5006016Cxxxxxxxxd0s0
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t5006016Cxxxxxxxxd0s1
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t5006016Cxxxxxxxxd0s2
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t5006016Cxxxxxxxxd0s3
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t5006016Cxxxxxxxxd0s4
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t5006016Cxxxxxxxxd0s5
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t5006016Cxxxxxxxxd0s6
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t5006016Cxxxxxxxxd0s7
devfsadm[25734]: verbose: removing file: /dev/rdsk/c2t6006016018F03A0006B01C32C780E711d0

 

 

But cfgadm still presents LUN 0 as failing:

# cfgadm -alo show_SCSI_LUN
Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             fc-fabric    connected    configured   unknown
c1::50060164xxxxxxxx,0         disk         connected    configured   failing
c1::5006016dxxxxxxxx,0         disk         connected    configured   failing
c2                             fc-fabric    connected    configured   unknown
c2::50060165xxxxxxxx,0         disk         connected    configured   failing
c2::5006016cxxxxxxxx,0         disk         connected    configured   failing

 

And mpathadm still presents four paths:

# mpathadm list LU

...
/scsi_vhci/ssd@g6006016018f03a0006b01c32c780e711
Total Path Count: 4
Operational Path Count: 4

# mpathadm show lu /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s2
Logical Unit: /dev/rdsk/c0t6006016018F03A0006B01C32C780E711d0s2
mpath-support: libmpscsi_vhci.so
Vendor: DGC
Product: VRAID
Revision: 0533
Name Type: unknown type
Name: 6006016018f03a0006b01c32c780e711
Asymmetric: yes
Current Load Balance: round-robin
Logical Unit Group ID: NA
Auto Failback: on
Auto Probing: NA
Paths:
Initiator Port Name: 100000144xxxxx7d
Target Port Name: 50060165xxxxxxxx
Override Path: NA
Path State: OK
Disabled: no
Initiator Port Name: 100000144xxxxx7d
Target Port Name: 5006016cxxxxxxxx
Override Path: NA
Path State: OK
Disabled: no
Initiator Port Name: 100000144xxxxx7c
Target Port Name: 50060164xxxxxxxx
Override Path: NA
Path State: OK
Disabled: no
Initiator Port Name: 100000144xxxxx7c
Target Port Name: 5006016dxxxxxxxx
Override Path: NA
Path State: OK
Disabled: no
Target Port Groups:
ID: 1
Explicit Failover: yes
Access State: active not optimized
Target Ports:
Name: 50060165xxxxxxxx
Relative ID: 6
Name: 50060164xxxxxxxx
Relative ID: 5
ID: 2
Explicit Failover: yes
Access State: active optimized
Target Ports:
Name: 5006016cxxxxxxxx
Relative ID: 13
Name: 5006016dxxxxxxxx
Relative ID: 14

 

If we try to change the LUN 0 status to unusable, or attempt some other operation on it, we get an error:

# cfgadm -c configure c1::50060164xxxxxxxx
cfgadm: Insufficient condition

# cfgadm -c unconfigure c1::50060164xxxxxxxx
cfgadm: Library error: remove operation failed:
/devices/pci@740/pci@1/pci@0/pci@8/SUNW,qlc@0,c/fp@0,0/ssd@w50060164xxxxxxxx,0: No such device or address

And cfgadm still presents LUN 0 as failing, as before.
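
Optionally, you can check whether the stale ssd node named in the error message still exists under /devices; a diagnostic sketch only, using the path from the cfgadm error above:

# ls /devices/pci@740/pci@1/pci@0/pci@8/SUNW,qlc@0,c/fp@0,0/ | grep -i 50060164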

 

Cause

This bug could be related:
BUG:22045231 - CFGADM UNABLE TO MOVE FAILING LUN TO UNUSABLE STATE, HENCE CANT BE REMOVED
Described in this other document:
cfgadm unable to move failing LUN to unusable state, hence can't be removed (Doc ID 2150539.1)
--> Fix delivered in Oracle Solaris 11.3.7.6

But the customer already has the fix for that bug (Solaris 11.3 SRU 12), and the problem persists.
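
The installed SRU level on the guest can be confirmed from the 'entire' package; a minimal sketch (the Branch / FMRI version string encodes the SRU level, and here it should correspond to Solaris 11.3 SRU 12 or later):

# pkg info entire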
 

Solution

Workaround 1 (consider doing this during a maintenance window; this action has not been tested in all scenarios)

On the control domain, remove the VF associated with each of these controllers from the guest LDom and then add it back to the guest LDom:

On the control domain (cdom), find the VFs associated with this guest LDom "guest01":

root@cdom # ldm ls -l guest01 | grep VF
pci@740/pci@1/pci@0/pci@8/SUNW,qlc@0,c /SYS/IOU1/PCIE1/IOVFC.PF0.VF10
pci@a40/pci@1/pci@0/pci@4/SUNW,qlc@0,1c /SYS/IOU1/PCIE16/IOVFC.PF1.VF10
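
The same VF-to-slot mapping can be cross-checked from the control domain with ldm list-io, which also shows which domain currently owns each VF; a sketch using the VF names listed above:

root@cdom # ldm list-io | grep IOVFC.PF0.VF10
root@cdom # ldm list-io | grep IOVFC.PF1.VF10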

In this example we remove and re-add the VF associated with c1:

root@cdom # ldm remove-io /SYS/IOU1/PCIE1/IOVFC.PF0.VF10 guest01
root@cdom # ldm add-io /SYS/IOU1/PCIE1/IOVFC.PF0.VF10 guest01

Note. Removing a VF with the command "ldm remove-io" will cause a link-down event and an FMA fault event stating that a FRU "has been removed from the system." Access through that VF FC HBA will be lost until the VF is added back with the command "ldm add-io".
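
The resulting fault event can be reviewed with fmadm once the VF has been re-added; a sketch only, since which domain logs the event and whether it needs clearing depends on the fault class reported:

# fmadm faulty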

After that, from the guest LDom, the failing entries no longer appear on c1:

# cfgadm -alo show_SCSI_LUN
Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             fc-fabric    connected    configured   unknown
c2                             fc-fabric    connected    configured   unknown
c2::50060165xxxxxxxx,0         disk         connected    configured   failing
c2::5006016cxxxxxxxx,0         disk         connected    configured   failing

Now you can repeat the same action with the VF associated with c2:
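
A sketch of the same remove/add sequence for the VF associated with c2 (the pci@a40 path), using the VF name from the "ldm ls -l guest01" output above; verify the exact VF name on your system before running it:

root@cdom # ldm remove-io /SYS/IOU1/PCIE16/IOVFC.PF1.VF10 guest01
root@cdom # ldm add-io /SYS/IOU1/PCIE16/IOVFC.PF1.VF10 guest01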

 

Workaround 2: Disconnect the EMC storage from the SAN (i.e. remove the EMC controller port WWNs from the FC switch zoning for this server) while the EMC LUNs are still mapped to this server (guest LDom).

After that action, Solaris will detect that all EMC controller ports have disappeared and the LUNs will be placed offline by fcp and ssd. We should then be able to clear the missing EMC LUNs with the commands described earlier:

cfgadm -c configure cX --> to mark the LUNs as unusable
cfgadm -c unconfigure -o unusable_SCSI_LUN <c#::pwwn> --> to clear the unusable LUNs
devfsadm -Cv --> to clear the old LUN entries from the device tree
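
A concrete sketch of that cleanup, using the controllers and (redacted) target port WWNs from this example, to be run on the guest LDom once the zoning has been removed and the ports have gone offline:

# cfgadm -c configure c1
# cfgadm -c configure c2
# cfgadm -c unconfigure -o unusable_SCSI_LUN c1::50060164xxxxxxxx
# cfgadm -c unconfigure -o unusable_SCSI_LUN c1::5006016dxxxxxxxx
# cfgadm -c unconfigure -o unusable_SCSI_LUN c2::50060165xxxxxxxx
# cfgadm -c unconfigure -o unusable_SCSI_LUN c2::5006016cxxxxxxxx
# devfsadm -Cv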

 

References

<NOTE:1999091.1> - How to Create and Assign FC SR-IOV Virtual Functions (VFs) on an Emulex FC HBA
<NOTE:1325454.1> - Oracle VM Server for SPARC PCIe Direct I/O and SR-IOV Features

Attachments
This solution has no attachment