Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-77-2120675.1
Update Date: 2017-11-27
Keywords:

Solution Type  Sun Alert Sure

Solution  2120675.1 :   Flash Storage System FS1-2 SAS Driver Failures May Lead to Loss of Data Access  


Related Items
  • Sun Software - Generic
  • Oracle FS1-2 Flash Storage System
  • Sun Hardware - Generic
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert

In this Document
Description
Occurrence
Symptoms
Workaround
Resolution
History
References


Applies to:

Oracle FS1-2 Flash Storage System
Sun Software - Generic
Sun Hardware - Generic
Information in this document applies to any platform.
__________________________________________




Date of Resolved Release: 28-Mar-2016
__________________________________________

Description

Problems with the FS1-2 Controller Serial-attached SCSI (SAS) HBA driver may result in multiple drives going offline due to an issue on a single drive or Input/Output Module (IOM). This may also cause one or both controllers to go offline. If both controllers go offline, access to data is lost.

Occurrence

This issue can occur on any Flash Storage System FS1-2 whose controllers are not running SAS firmware revision 2.19.69.0 (or above). This includes:

  • Systems running Release 6.1.x.
  • Systems undergoing a Non-Disruptive upgrade to Release 6.1.x.
  • Systems upgraded to Release 6.2.0 through 6.2.2 as a Non-Disruptive upgrade that have not had a system restart or a reboot of both controllers since.
  • Systems upgraded to Release 6.2.3 as a Non-Disruptive upgrade without following the instructions in the patch install 'Read Me'.

This issue occurs when there is an error, a disruption, or a change to the internal SAS fabric such as a drive error, replacing a drive, changing/replacing a SAS cable, or adding a drive enclosure. Additional drives, drive groups, or drive enclosures may also be declared offline.

The SAS firmware version can be confirmed using the following Flash System CLI commands:

      # fscli login -u administrator -oracleFs <ip_addr_of_fs1>

      # fscli controller -list -details > controller.out

Searching the resulting file for "SAS" takes you to the first instance of that string. The HBA entry around it will look like one of the two examples below:

    Hba
      Slot : 2
      Status : NORMAL
      Manufacturer : PMC-Sierra
      Model : PM8018
      SerialNumber : 4B20136B717
      PartNumber : 7067091
      FirmwareRevision : 2.19.69.0
      SupportedProtocol : SAS
      HasBeenProvisioned : true
      ProvisionedAs : sas

    Hba
      Slot : 3
      Status : NORMAL
      Manufacturer : PMC-Sierra
      Model : PM8018
      SerialNumber : 211611EF7C5
      PartNumber : 7045347
      MajorFirmwareVersion : 2
      MinorFirmwareVersion : 19
      SubMinorFirmwareVersion : 66
      VariantFirmwareVersion : 0
      SupportedProtocol : SAS
      HasBeenProvisioned : true
      ProvisionedAs : sas

The first example above has the proper firmware version (2.19.69.0). The second example shows an older version (2.19.66.0), reported in a slightly different format as separate major, minor, sub-minor, and variant fields. Be sure to repeat the search through the end of the file, as there may be as many as six SAS HBAs between the two FS1-2 controllers.
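
Because every SAS HBA entry has to be checked, and the firmware version can appear in either of the two formats shown above, the check can be scripted. The following Python sketch is illustrative only and is not an Oracle-supplied tool. It assumes that the saved output contains blocks introduced by a line reading "Hba" followed by "Name : Value" lines, exactly as in the two examples. The function names are hypothetical.

```python
import re

# Minimum SAS HBA firmware revision named in this document.
REQUIRED = (2, 19, 69, 0)

# Matches "Name : Value" lines such as "Slot : 2".
FIELD = re.compile(r"^\s*(\w+)\s*:\s*(.*?)\s*$")

# Field names used by the older, split-out version format.
PARTS = ("MajorFirmwareVersion", "MinorFirmwareVersion",
         "SubMinorFirmwareVersion", "VariantFirmwareVersion")


def hba_blocks(text):
    """Yield one dict of fields per 'Hba' block (assumed layout)."""
    block = None
    for line in text.splitlines():
        if line.strip() == "Hba":
            if block is not None:
                yield block
            block = {}
            continue
        match = FIELD.match(line)
        if block is not None and match:
            block[match.group(1)] = match.group(2)
        elif block is not None:
            # Any line that is not "Name : Value" ends the current block.
            yield block
            block = None
    if block is not None:
        yield block


def firmware_version(fields):
    """Return the firmware version as a tuple of ints, or None if unknown."""
    try:
        if "FirmwareRevision" in fields:
            return tuple(int(p) for p in fields["FirmwareRevision"].split("."))
        if all(name in fields for name in PARTS):
            return tuple(int(fields[name]) for name in PARTS)
    except ValueError:
        pass
    return None


def outdated_sas_hbas(text):
    """Return (slot, version) for each SAS HBA below REQUIRED or unreadable."""
    found = []
    for fields in hba_blocks(text):
        if fields.get("SupportedProtocol") != "SAS":
            continue
        version = firmware_version(fields)
        if version is None or version < REQUIRED:
            found.append((fields.get("Slot"), version))
    return found
```

For example, outdated_sas_hbas(open("controller.out").read()) would return an empty list when every SAS HBA reports 2.19.69.0 or above. The slot number alone does not identify the controller, so any reported entry should still be located in controller.out by hand.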

Symptoms

If a drive fails or is replaced, additional drives, drive groups, or drive enclosures may be declared offline. If an IOM experiences a fault, multiple drive groups or drive enclosures may be declared offline.

For FS1-2 systems running Release 6.1.x, controller warm starts may repeat, leading to a controller software failure and the controller being disabled.

Events such as the following may also be observed:

    INTERNAL_OPERATION_FAILED
    CONTROLLER_CONNECTION_LOST
    PCP_EVT_ALL_NODES_DEAD
    PCP_EVT_CONTROLLER_FAILED
    PSG_PI_EVENT_PATH_FAILURE
    ENCLOSURE_TOPOLOGY_STATE_CHANGE
    RAID_EVENT_PARTIAL_OFFLINE_STATE_CHANGE
    RAID_EVENT_MAINT_BLOCKED
    RAID_EVENT_ENCLOSURE_ARRAY_DEVICE_STATUS_CRITICAL
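
When reviewing an exported event log, it can help to count how often each of the events listed above appears. The Python sketch below is a rough aid, not part of any Oracle tooling. It assumes the log is plain text with the event name appearing verbatim somewhere on each relevant line. The actual export format is not described in this document, and the function name is hypothetical.

```python
from collections import Counter

# Event names taken from the list in this document.
SAS_ALERT_EVENTS = (
    "INTERNAL_OPERATION_FAILED",
    "CONTROLLER_CONNECTION_LOST",
    "PCP_EVT_ALL_NODES_DEAD",
    "PCP_EVT_CONTROLLER_FAILED",
    "PSG_PI_EVENT_PATH_FAILURE",
    "ENCLOSURE_TOPOLOGY_STATE_CHANGE",
    "RAID_EVENT_PARTIAL_OFFLINE_STATE_CHANGE",
    "RAID_EVENT_MAINT_BLOCKED",
    "RAID_EVENT_ENCLOSURE_ARRAY_DEVICE_STATUS_CRITICAL",
)


def count_alert_events(lines):
    """Count occurrences of the listed event names in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        for name in SAS_ALERT_EVENTS:
            if name in line:
                counts[name] += 1
    return counts
```

These events are not specific to this issue, so a non-empty result is only a prompt to check the SAS firmware version and to contact Oracle Support. It is not a diagnosis.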

Workaround

If any of the above issues are encountered, it is recommended that customers contact Oracle Support for assistance.

Resolution

Proactive resolution of this issue depends on the Release currently running on the FS1-2:

1. For systems running Release 6.1.x, upgrade to Release 6.2.3-280.01 (or higher). Be sure to follow the SAS HBA firmware instructions in the patch Read Me.

2. For systems running Release 6.2.x, verify the SAS firmware version as noted above or in the 6.2.3-280.01 (or higher) patch Read Me. If the SAS HBA firmware version is not 2.19.69.0 (or above), there are three options:

    a. Upgrade to 6.2.3-280.01 (or higher) and follow the SAS HBA firmware instructions in the patch Read Me.
    b. Power cycle each controller one at a time (see SAS HBA firmware instructions in the patch Read Me).
    c. Perform a restart of the system (outage required).
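
The decision above can be summarized in a few lines of Python. This is a sketch of the logic as stated in this document only. It assumes the release is given as a string such as "6.1.12" or "6.2.3-280.01" and that each SAS HBA firmware version is given as a tuple of four integers. The function and constant names are hypothetical.

```python
# Minimum SAS HBA firmware revision named in this document.
REQUIRED_SAS_FIRMWARE = (2, 19, 69, 0)

UPGRADE = ("Upgrade to 6.2.3-280.01 (or higher) and follow the SAS HBA "
           "firmware instructions in the patch Read Me")
POWER_CYCLE = ("Power cycle each controller one at a time (see SAS HBA "
               "firmware instructions in the patch Read Me)")
RESTART = "Perform a restart of the system (outage required)"


def proactive_options(release, sas_firmware_versions):
    """Return the proactive options this document lists for a system.

    An empty list means every SAS HBA is already at the required revision.
    """
    versions = list(sas_firmware_versions)
    if not versions:
        raise ValueError("no SAS HBA firmware versions supplied")

    # "6.2.3-280.01" -> (6, 2); only major and minor decide the branch.
    major_minor = tuple(int(p) for p in release.split("-")[0].split(".")[:2])

    if major_minor == (6, 1):
        return [UPGRADE]
    if major_minor == (6, 2):
        if all(v >= REQUIRED_SAS_FIRMWARE for v in versions):
            return []
        return [UPGRADE, POWER_CYCLE, RESTART]
    raise ValueError("release not covered by this document: " + release)
```

Releases other than 6.1.x and 6.2.x raise an error rather than guessing, because this document gives no guidance for them.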

Note: Oracle strongly recommends you upgrade to the most current release.

See <Document:1967797.1> - "FS System: How to Download Software and Firmware Updates for the FS1-2."

Prior to upgrading to the 6.2.3-280.01 firmware release, please open a Service Request with Oracle Support and attach a current log set for analysis. Oracle Support will evaluate the system prior to the upgrade to ensure the upgrade is successful.

For additional FS1-2 information, also see the following documents:

    <Document:1936501.2> - "Information Center: Oracle FS1-2 Flash Storage System"
    <Document:1968129.1> - "FS System: FS1-2 Upgrade R6.x to R6.x Procedures"

History

28-Mar-2016: Document release, status Resolved
01-Apr-2016: Formatting update, no change in content
12-Apr-2016: Minor update, change Note in Resolution, add 1967797.1 doc ref
15-Apr-2016: Minor update, no change in content
23-May-2016: Minor update to instructions in "Resolution" section, no major change in content

Internal Comments:

For customers running 6.1.x that have encountered this issue, gather a log bundle and escalate to
Engineering for a resolution. Oracle Internal customers - do not bypass Oracle Support.

For customers running 6.2.x that have encountered this issue, immediate relief requires
that the affected controller(s) be rebooted to activate the PMC-Sierra driver. See the
Read Me for patch 22926853, specifically the SAS HBA firmware instructions.

For Internal Support ONLY - Also see:
FCO A0365-1: Proactive: Flash Storage System FS1-2 SAS Driver Failures May Lead to Loss of Data Access <Document:2125578.1>

Questions regarding this document should be addressed to
sunalertpublication_us_grp@oracle.com, with a copy to the
responsible engineer/submitter listed below.

Internal Contributor/Submitter: bob.deguc@oracle.com
Internal Eng Responsible Engineer: Lon.Stowell@oracle.com
Oracle Knowledge Analyst: david.mariotto@oracle.com
Internal Eng Business Unit Group: Flash Storage
Internal Associated SRs: 3-11101399981, 3-11005630046, 3-11220894981, 3-11587948351

References

<NOTE:1968129.1> - FS System: FS1-2 Upgrade R6.x to R6.x Procedures
<BUG:21503042> - NDU FROM 6.1.11 TO 6.1.12 ON FS WAS DISRUPTIVE TO HOST
<BUG:22894116> - SYSTEM DOWN AFTER A DRIVE ISSUE
<NOTE:1936501.2> - Information Center: Oracle FS1-2 Flash Storage System
<BUG:21663457> - FAULTY SSD DRIVE NOT SHOWN IN GUI
<BUG:22070039> - COAXM101-R2F:NDU FAILED,UNEXPECTED FOFB EVENT WHILE DOING NDU
<BUG:22329624> - CAAXM093:SYSTEM CRITICAL WITH 2 DRIVE FAILURE
<BUG:22188962> - CAAXM088 1.2TB FIPS CAAXM088 COPYBACK RESTART DE-01; CONTROLLERS FAILED
<NOTE:1967797.1> - FS System: How to Download Software and Firmware Updates for the FS1-2

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.