Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1955838.1
Update Date:2015-08-28

Solution Type  Problem Resolution Sure

Solution  1955838.1 :   FS System: Selecting an FS1 Boot Drive After Running Format in Solaris Causes Apparent Hang  


Related Items
  • Oracle FS1-2 Flash Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Flash Storage>SN-EStor: FSx




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-9996911671>

Applies to:

Oracle FS1-2 Flash Storage System - Version All Versions to All Versions [Release All Releases]
SunOS

Symptoms

When running format in Solaris and selecting an FS1 drive that was configured as a boot LUN in the Oracle FS System Manager GUI, the format utility can appear to hang for several minutes. Solaris also logs SCSI resets on all paths to the LUN.

root@fs1-host:/var/adm# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0t5000C50007EC9B97d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /scsi_vhci/disk@g5000c50007ec9b97
       1. c0t5000C50007EC996Bd0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /scsi_vhci/disk@g5000c50007ec996b
       2. c0t6000B08414B303031323639333400000d0 <Oracle-Oracle FS1-2-6102 cyl 26106 alt 2 hd 255 sec 63>
          /scsi_vhci/ssd@g6000b08414b303031323639333400000
       3. c0t6000B08414B303031323639333400001d0 <Oracle-Oracle FS1-2-6103 cyl 13052 alt 2 hd 255 sec 63>
          /scsi_vhci/ssd@g6000b08414b303031323639333400001
Specify disk (enter its number): 3
selecting c0t6000B08414B303031323639333400001d0
[disk formatted] -------------------------------------------------------------> hangs at this point

  

NOTE: This will eventually finish with no intervention.

 

Reset messages similar to the following are logged on all paths to the LUN during the format operation:


Dec 12 11:33:19 hostxyz scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Dec 12 11:33:19 hostxyz scsi: [ID 243001 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@c/SUNW,qlc@0,1/fp@0,0 (fcp6):
Dec 12 11:33:19 hostxyz        FCP: WWN 0x2100000e1e1cd850   reset successfully
Dec 12 11:34:19 hostxyz scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Dec 12 11:34:19 hostxyz scsi: [ID 243001 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@c/SUNW,qlc@0,1/fp@0,0 (fcp6):
Dec 12 11:34:19 hostxyz        FCP: WWN 0x2100000e1e1cd850   reset successfully
Dec 12 11:35:19 hostxyz scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Dec 12 11:35:19 hostxyz scsi: [ID 243001 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@c/SUNW,qlc@0,1/fp@0,0 (fcp6):
Dec 12 11:35:19 hostxyz        FCP: WWN 0x2100000e1e1cd850   reset successfully
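To gauge how many resets were logged during a slow format, the warnings can be counted in the system log. The following is a minimal sketch, not part of the original solution: the function name count_vhci_resets is hypothetical, and the default log path /var/adm/messages is an assumption based on the directory shown in the format example above.

```shell
# Hypothetical helper: count vhci_scsi_reset warnings in a log file.
# The default path is an assumption; pass another file as the first argument.
count_vhci_resets() {
    logfile="${1:-/var/adm/messages}"
    # grep -c prints the number of matching lines (0 if none match)
    grep -c 'vhci_scsi_reset' "$logfile"
}
```

Running count_vhci_resets before and after selecting the disk gives a rough idea of how many resets the operation generated.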

  

Cause

The FS1 has integrated data protection that detects both block-level and location-level corruption.  Under normal circumstances, this feature operates invisibly.  However, in certain cases, such as the initial scan of a LUN, the host performs a Read of a region before writing it.  This requires special handling by the FS1, which requests that the host re-drive the Read.  On receiving this retry request, the Solaris host waits 5 seconds before issuing the Read again.  This can lead to slow formats.
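The cumulative effect of the 5-second wait can be estimated with simple arithmetic. This is a rough sketch only: the function name estimate_retry_wait and the sample count of 36 retried Reads are hypothetical, and the real number of retries depends on the LUN and is not stated in this article.

```shell
# Rough estimate: seconds spent waiting = retried Reads x 5-second retry delay.
# The function name and the sample count below are hypothetical.
estimate_retry_wait() {
    retried_reads="$1"
    retry_delay=5   # seconds, per the description above
    echo $(( retried_reads * retry_delay ))
}

estimate_retry_wait 36   # prints 180 (about 3 minutes of waiting)
```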

The FS1 System performs disk integrity checks on unwritten sectors when an initialization is invoked.  The problem occurs when Solaris is handed back SCSI Busy responses while this is occurring.  Solaris adds a 5-second delay after each Busy response, which extends the overall time needed to check the LUN.

The LUN is conditioned with zeros at creation, which renders the reference tag (ref tag) at the end of each 520-byte block invalid.  The ref tag should contain the low 4 bytes of the block LBA.  As a result, the FC chip fails the first Read with a ref tag failure.  Once the chip returns this failure, all the Fibre Channel Protocol (FCP) layer can do is send a Busy or Task Set Full (TSF) response back to the host, because the FCP states that it is illegal to attempt the send again.  The FCP layer therefore makes sure the data looks like a "Read before Write", notes the starting LBA, and returns Busy or TSF.  When the I/O comes back on the retry, the FS1 recognizes it and turns off ref tag checking so that the Read succeeds.  More information about reference tags can be found in the FS System Manager help topics or KM Article 1955116.1.
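The ref tag mismatch can be illustrated with a simplified model. The sketch below only reproduces the rule stated above (the expected ref tag is the low 4 bytes of the LBA) and compares it with the all-zero tag of a zero-conditioned block. The function name expected_ref_tag and the sample LBA are hypothetical, and the FS1 itself performs this check in hardware, not in a shell.

```shell
# Simplified model: the expected reference tag is the low 4 bytes of the LBA.
expected_ref_tag() {
    lba="$1"
    # Mask to 32 bits and print as 8 hexadecimal digits
    printf '%08x\n' $(( lba & 0xFFFFFFFF ))
}

# A zero-conditioned block carries a reference tag of all zeros.
stored_tag=00000000
expected=$(expected_ref_tag 1048576)   # hypothetical LBA (0x100000)
if [ "$stored_tag" != "$expected" ]; then
    echo "ref tag mismatch: stored $stored_tag, expected $expected"
fi
```

In this model the stored tag of a zeroed block differs from the expected tag for the sample LBA, which corresponds to the ref tag failure reported on the first Read.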

 

 

Solution

There are currently two workarounds.

1. Select the "Disable Reference Tag" box for the LUN in question.

[Screen shot of options]

or

2. From the Solaris host, set the following environment variable as user root before running format and selecting the drive. Unless it is added to the shell profile, the setting is lost after logout.

# NOINUSE_CHECK=1;export NOINUSE_CHECK

Now run format and select the FS1 disk.
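Before starting format, it can help to confirm that the variable is actually exported, since a variable that is set but not exported is not passed to child processes. A minimal sketch follows; the function name noinuse_check_exported is hypothetical.

```shell
# Hypothetical check: succeeds only if NOINUSE_CHECK=1 is visible
# to a child process started from the current shell.
noinuse_check_exported() {
    [ "$(sh -c 'printf %s "$NOINUSE_CHECK"')" = "1" ]
}

if noinuse_check_exported; then
    echo "NOINUSE_CHECK is exported"
else
    echo "NOINUSE_CHECK is not exported"
fi
```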

To make this setting permanent, add it to the user's shell profile so that it persists after logout.
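One way to add the setting to a profile without duplicating it is sketched below. This is an illustration under assumptions: the function name persist_noinuse_check is hypothetical, the profile file to edit is not specified in this article, and the sketch assumes a grep that supports the -F, -q, and -x options (on Solaris, for example, /usr/xpg4/bin/grep).

```shell
# Hypothetical helper: append the export line to a profile file only once.
persist_noinuse_check() {
    profile="$1"
    line='NOINUSE_CHECK=1; export NOINUSE_CHECK'
    # -F: fixed string, -x: whole-line match, -q: no output
    if ! grep -Fqx "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}

# Example (the profile path is an assumption):
#   persist_noinuse_check "$HOME/.profile"
```

Calling the helper a second time leaves the file unchanged, so it is safe to rerun.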

For more information, see KM Articles: <Document 1385747.1> and <Document 1005435.1>

 


  Copyright © 2018 Oracle, Inc.  All rights reserved.