Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2029948.1 : Solaris 11 - After Expanding EMC LUN : Corrupt label - bad geometry - Label says 681561600 blocks; Drive says 419430400 blocks
Created from <SR 3-10876411501>

Applies to:
SPARC T5-2 - Version All Versions and later
Solaris Operating System - Version 11 11/11 and later
Information in this document applies to any platform.

Symptoms
A Solaris 11.1 server with an Oracle FC HBA is connected to a SAN to access an EMC VNX disk array. The EMC LUNs are under EMC PowerPath control, and each LUN has 4 paths.

Pseudo name=emcpower110a
VNX ID=CKM00112201084 [T5-2_server01]
Logical device ID=600601608C403400BEC7C3FEB30DE511 [LUN 968_SMA]
state=alive; policy=CLAROpt; queued-IOs=0
Owner: default=SP A, current=SP A       Array failover mode: 4
==============================================================================
--------------- Host ---------------   - Stor -  -- I/O Path --   -- Stats ---
###  HW Path                                          I/O Paths                  Interf.  Mode    State  Q-IOs Errors
==============================================================================
3083 pci@380/pci@1/pci@0/pci@5/SUNW,qlc@0/fp@0,0      c14t500601683EA01BB9d58s0  SP B0    active  alive  0     0
3083 pci@380/pci@1/pci@0/pci@5/SUNW,qlc@0/fp@0,0      c14t500601613EA01BB9d58s0  SP A1    active  alive  0     0
3080 pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0      c7t500601693EA01BB9d58s0   SP B1    active  alive  0     0
3080 pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0      c7t500601603EA01BB9d58s0   SP A0    active  alive  0     0

Jun 8 13:25:57 server01 scsi: [ID 243001 kern.info] /pci@380/pci@1/pci@0/pci@5/SUNW,qlc@0/fp@0,0 (fcp11):
Jun 8 13:25:57 server01 Lun=3a for target=11700 reappeared
Jun 8 13:25:57 server01 scsi: [ID 243001 kern.info] /pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0 (fcp8):
Jun 8 13:25:57 server01 Lun=3a for target=11700 reappeared
Jun 8 13:25:57 server01 scsi: [ID 243001 kern.info] /pci@380/pci@1/pci@0/pci@5/SUNW,qlc@0/fp@0,0 (fcp11):
Jun 8 13:25:57 server01 ndi_devi_online: failed for scsa,00.bfcp: target=11700 lun=3a ffffffff
Jun 8 13:25:57 server01 scsi: [ID 243001 kern.info] /pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0 (fcp8):
Jun 8 13:25:57 server01 ndi_devi_online: failed for scsa,00.bfcp: target=11700 lun=3a ffffffff
Jun 8 13:25:57 server01 scsi: [ID 243001 kern.info] /pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0 (fcp8):
Jun 8 13:25:57 server01 Lun=3a for target=10f00 reappeared
Jun 8 13:25:57 server01 scsi: [ID 243001 kern.info] /pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0 (fcp8):
Jun 8 13:25:57 server01 ndi_devi_online: failed for scsa,00.bfcp: target=10f00 lun=3a ffffffff
Jun 8 13:26:00 server01 scsi: [ID 243001 kern.info] /pci@380/pci@1/pci@0/pci@5/SUNW,qlc@0/fp@0,0 (fcp11):
Jun 8 13:26:00 server01 Lun=3a for target=10f00 reappeared
Jun 8 13:26:00 server01 scsi: [ID 243001 kern.info] /pci@380/pci@1/pci@0/pci@5/SUNW,qlc@0/fp@0,0 (fcp11):
Jun 8 13:26:00 server01 ndi_devi_online: failed for scsa,00.bfcp: target=10f00 lun=3a ffffffff
Jun 8 13:27:04 server01 cmlb: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0/ssd@w500601603ea01bb9,3a (ssd240):
Jun 8 13:27:04 server01 Corrupt label; wrong magic number
Jun 8 13:27:04 server01 cmlb: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0/ssd@w500601603ea01bb9,3a (ssd240):
Jun 8 13:27:04 server01 Corrupt label; wrong magic number
...
Jun 8 20:27:56 server01 cmlb: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@4/SUNW,qlc@0/fp@0,0/ssd@w500601693ea01bb9,3a (ssd249):
Jun 8 20:27:56 server01 Corrupt label - bad geometry
Jun 8 20:27:56 server01 cmlb: [ID 107833 kern.notice] Label says 681561600 blocks; Drive says 419430400 blocks

The Solaris partition table shows the correct information, and EMC has confirmed from the storage array that this LUN is now around 325 GB.

bash-3.2$ more c14t500601613EA01BB9d58s0
* /dev/rdsk/c14t500601613EA01BB9d58s0 partition map
*
* Dimensions:
*     512 bytes/sector
*      50 sectors/track
*     256 tracks/cylinder
*   12800 sectors/cylinder
*   53248 cylinders
*   53246 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector      Last
* Partition  Tag  Flags    Sector     Count      Sector  Mount Directory
       0      0    00          0  209715200  209715199
       1      0    00  209715200  157286400  367001599
       2      5    01          0  681548800  681548799   <<<-----------
       3      0    00  367001600  157286400  524287999
       4      0    00  524288000  157260800  681548799

From partition s2 (the whole disk) we see:
681548800 blocks / 2 = 340774400 KB / 1024 = 332787.5 MB / 1024 = 324.99 GB

This is very close to "Label says 681561600 blocks":
681561600 blocks / 2 = 340780800 KB / 1024 = 332793.75 MB / 1024 = 324.99 GB
which is the same size as reported by EMC.

So the "Label says 681561600 blocks" part of the message is correct: when format runs, the disk responds with the right information. The Solaris driver, however, still thinks the LUN has the smaller size: "Drive says 419430400 blocks".
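The block-count arithmetic above can be reproduced with a short script. This is a minimal sketch, assuming 512-byte blocks (the partition map states "512 bytes/sector"). The names blocks_to_gib and parse_label_notice, and the regular expression, are illustrative and not part of any Solaris tool.

```python
import re

BYTES_PER_BLOCK = 512  # from the partition map: "512 bytes/sector"

# Pattern written against this article's sample message (an assumption,
# not an official description of the cmlb message format).
LABEL_RE = re.compile(r"Label says (\d+) blocks; Drive says (\d+) blocks")


def blocks_to_gib(blocks: int) -> float:
    """Convert a count of 512-byte blocks to binary gigabytes (1024**3 bytes)."""
    return blocks * BYTES_PER_BLOCK / 1024**3


def parse_label_notice(line: str):
    """Return (label_blocks, drive_blocks) from a cmlb notice, or None."""
    match = LABEL_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = ("Jun 8 20:27:56 server01 cmlb: [ID 107833 kern.notice] "
          "Label says 681561600 blocks; Drive says 419430400 blocks")
label_blocks, drive_blocks = parse_label_notice(sample)
print(f"label: {blocks_to_gib(label_blocks):.2f} GB")  # label: 324.99 GB
print(f"drive: {blocks_to_gib(drive_blocks):.2f} GB")  # drive: 200.00 GB
```

Under these assumptions the label value works out to about 324.99 GB, matching the size EMC reported, while the value the driver still holds is exactly 200 GB.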
In addition, there are many transport errors against this LUN, observed on each ssd instance related to this LUN, for example:

ssd232 Soft Errors: 0 Hard Errors: 3 Transport Errors: 18664
Vendor: DGC Product: VRAID Revision: 0532 Device Id: id1,ssd@n600601609a402d00a872244180c2e011
Size: 214.75GB
Media Error: 0 Device Not Ready: 0 No Device: 3 Recoverable: 0
Illegal Request: 30 Predictive Failure Analysis: 0

Cause
You are most probably hitting Bug 18239194 - syslog shows errors after LUN expansion on Solaris 11.1

A new bug was opened and closed for a similar issue on a T5-4 server running Solaris 11.3 SRU 10.5.0. The customer initially mapped the wrong volume, with a size of 141419520 blocks, then unmapped that volume and mapped the correct volume (using the same LUN number), with a size of 176770560 blocks.

Feb 19 02:49:29 server03 cmlb: [ID 107833 kern.notice] Label says 176770560 blocks; Drive says 141419520 blocks

Bug 25584831 - syslog shows errors after LUN replacement 11.3

To troubleshoot this cosmetic error further, the bug engineer created a new DTrace script, "analyze_sense_1.d", to capture the sense data returned by the target.

Save the DTrace script to a file in /var/tmp/ named analyze_sense_1.d, then enable the perm
Then, as root, run this command:
Once you have reproduced the problem (the error shown above should appear in the messages file), stop the script with CTRL-C. Then collect a new Explorer from the system, and upload the DTrace output and the new Explorer to this SR.

Solution
A fix has been provided in:
Solaris 10 SPARC: Kernel patch 150400-31
Solaris 11.2 SRU 13.6.0 or greater

Oracle Solaris 11.2 Support Repository Updates (SRU) Index (Doc ID 1672221.1)
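When checking a system against the "SRU 13.6.0 or greater" requirement, the version components need to be compared numerically, not as strings. The following is a minimal sketch, assuming the installed Solaris 11.2 SRU level is already known as a dotted string. The helper names and the sample values are hypothetical, and the code does not query the system.

```python
MIN_FIXED_SRU = "13.6.0"  # Solaris 11.2 SRU containing the fix, per this document


def sru_tuple(version: str) -> tuple:
    """Split a dotted SRU string such as '13.6.0' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def has_fix(installed_sru: str, minimum: str = MIN_FIXED_SRU) -> bool:
    """True if the installed Solaris 11.2 SRU is at or above the minimum."""
    return sru_tuple(installed_sru) >= sru_tuple(minimum)


# Hypothetical sample values, chosen only to exercise the comparison.
for candidate in ("9.5.0", "13.6.0", "15.4.0"):
    print(candidate, has_fix(candidate))
```

Comparing integer tuples avoids the pitfall of plain string comparison, where "9.5.0" would sort after "13.6.0".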

References
<BUG:18239194> - SYSLOG SHOWS ERRORS AFTER LUN EXPANSION ON SOLARIS 11.1
<NOTE:1672221.1> - Oracle Solaris 11.2 Support Repository Updates (SRU) Index
<BUG:25584831> - SYSLOG SHOWS ERRORS AFTER LUN REPLACEMENT 11.3

Attachments
This solution has no attachment