Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-2366718.1
Update Date:2018-05-02
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  2366718.1 :   FS System: Procedure to Inhibit an ILOM Upgrade and Enhanced Allocation Migration During an FS1-2 Software Update  


Related Items
  • Oracle FS1-2 Flash Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Flash Storage>SN-EStor: FSx

In this Document
Purpose
Scope
Details
 Background
 Verify current system build type
 Inhibit Big Build COD conversion
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Internal Commands
Created from <SR 3-16643697611>

Applies to:

Oracle FS1-2 Flash Storage System - Version 6.2 to 6.2 [Release 6.2]
Information in this document applies to any platform.

Purpose

This document describes how to prevent a system from upgrading the Integrated Lights Out Manager (ILOM) version and/or enabling Enhanced Allocation, colloquially referred to as "Big Build", during a software update.

Scope

It may be necessary to perform a software update to obtain bug fixes without which Drive or Drive Group recovery would be difficult, if not impossible.  In those instances, there will typically be Pinned Data in Flash Backed Memory (FBM) that should be preserved if at all possible.  If a system is being updated as part of a Drive or Drive Group recovery and is a "Small Build" system (Enhanced Allocation or Big Build is NOT enabled), it is important to inhibit ILOM updates.  Any ILOM update will cause all data in FBM to be lost and inhibit Big Build from being enabled, which will result in a Cold Start failure.

Details

Background

Stripe handles are used internally to manage the striping of data in LUNs and are therefore one of the basic building blocks for LUNs in the FS1-2 system.  Prior to R6.2.11, the number of stripe handles was 131,072.  Systems with large numbers of SSD Drive Groups and Auto-Tiering can exceed this number and experience allocation failures.  See the R6.2.16-0545.01 Patch README for more details.  With Big Build, the number of stripe handles was increased to over 1 million.  This increase involves a reorganization of the system's Configuration On Disk (COD).  Big Build is enabled any time an FS1-2 has a Disruptive Upgrade (DU) to R6.2.11 or higher.  Once the conversion is completed, it is not possible to revert to a "Small Build" system without a complete backup and restore of data as well as recreating the FS1-2 configuration (LUNs, Hosts, etc.).

If performing an upgrade to help recover a "Small Build" system, the following items will help keep the system from attempting to install any ILOM updates or enable Big Build. For example, if attempting to recover Drives or Drive Groups on any release below 6.2.16, a Non-Disruptive Upgrade (NDU) to 6.2.16 will install fixes for many of the issues that will otherwise prevent recovery. An NDU avoids the clean shutdown that would attempt to install an ILOM update, which would erase all of memory and trigger data loss.

NOTE: Recent changes in PacMan for upgrades will prevent using an NDU if the Controllers are not booted far enough to respond to requests from PacMan for Controller information.

CAUTION: If a Disruptive Upgrade (DU) is performed by selecting EITHER or both of the following in the Update Software window of the GUI:

  • "Restart and update software (disrupts data access)" radio button
  • "Shutdown Controller" box under Software Update Options

or if doing the upgrade via fscli using EITHER or both of these options:

  • -disruptive
  • -forceControllerShutdown

on any FS1 with Small Build, that FS1 system will fail cold start in Boot State ConMan and require manual recovery.
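
As a rough illustration of the CAUTION above, a wrapper script could scan a proposed argument list for the two fscli options before anything is submitted. This is only a sketch: the function name has_disruptive_option is hypothetical, the sample arguments are placeholders rather than a real fscli invocation, and the check looks only at the two option strings quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of fscli): return 0 if the argument list
# contains either option named in the CAUTION above, 1 otherwise.
has_disruptive_option() {
    for arg in "$@"; do
        case "$arg" in
            -disruptive|-forceControllerShutdown) return 0 ;;
        esac
    done
    return 1
}

# Example with placeholder arguments (not a real fscli command line).
if has_disruptive_option -someOtherOption -disruptive; then
    echo "disruptive option present: confirm the build type before proceeding"
fi
```

A check like this does not replace verifying the build type; it only makes the presence of either option visible before the update is launched.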

Verify current system build type

There are several methods to determine whether or not to inhibit a COD conversion.  The methods that use shell access are preferred because they obtain the information from the live system.

  • If shell access is available to the Pilots:
    • Use the ver command to extract the Controller Firmware version:
      [root@pilot2 ~]# ver | grep 172.30
      172.30.80.128 :  2060-00004-060216-054551
      172.30.80.129 :  2060-00004-060216-054551
      [root@pilot2 ~]#
       
    • Use the rpm utility to extract the Controller Firmware version:
      [root@pilot2 ~]# rpm -qa | grep oraclefs-controller
      oraclefs-controller-060216-054551.x86_64
      [root@pilot2 ~]#
       
      If the two least significant digits are 50 or higher (example: 06.02.nn-nnnn.5n), it is not necessary to inhibit Big Build COD conversion.
      If the two least significant digits are 48 or lower (example: 06.02.nn-nnnn.0n), it will be necessary to inhibit installing the Big Build COD conversion.

      NOTE: If the ver and rpm methods do not agree, STOP and engage Oracle Support before proceeding.
       
  • From the Oracle FS System Manager GUI, select the System tab and then select System Information.  At the bottom, verify the Controller Firmware Version.  As indicated above, focus on the two least significant digits.
  • After a log bundle has been run through the scanlog utility, check for a BIG_BUILD file in the same directory as the scanlog_summary.txt file. If this file exists, it is not necessary to inhibit COD conversion.
  • After a log bundle has been run through the scanlog utility, use the COD dump:
    1. Locate a log bundle with the cod file and extract it as needed. The file name is an alphanumeric string ending in .cod, .cod.tar, or .cod.tar.gz:
      % tar xvf A136B0559494795F.cod.tar
      A136B0559494795F.cod
      %
       
    2. Use the dumpCod6211 utility to convert the cod file to text and grep for the 5 lines after "Master Block":
      % /cores_data/local/tools/pillar/dumpCod6211 A136B0559494795F.cod | grep -A 5 "Master Block"
      Master Block:
         signature: PDSCOD
         SSN:       AK00126934
         status:    1
         maj/min:   60/0
         gen:       0x59c9955a0000295f  (reset Mon Sep 25 23:46:34 2017 UTC)
      --
      Master Block:
         signature: HW_COD
         SSN:       AK00126934
         status:    1
         maj/min:   60/0
         gen:       0x59c9955500000406  (reset Mon Sep 25 23:46:34 2017 UTC)
      %
       
    3. If the major version on the maj/min line in the Master Block is 60 (as in the example above, 60/0), it is not necessary to inhibit Big Build COD conversion.
    4. If the major version on the maj/min line in the Master Block is 6 (as in the example below, 6/21), it will be necessary to inhibit installing the Big Build COD conversion.
      Master Block:
        signature: PDSCOD
        SSN:       AK00795540
        status:    1
        maj/min:   6/21
        gen:    0x5575ce7500002d62 (reset Mon Jun 8 17:18:45 2015 UTC)
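
The decision rules above (last two digits of the Controller firmware version, and the major number on the maj/min line) can be sketched as small shell helpers. This is a minimal sketch under stated assumptions: the function names are hypothetical and not part of any Oracle tool, the firmware token is assumed to have the six-digit, dash, six-digit form seen in the ver and rpm output above, and only the first maj/min line in the dump is examined. A suffix of 49, which the rules above do not cover, is reported as unknown.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative only.

# Extract the NNNNNN-NNNNNN firmware token from a line of ver or rpm output.
firmware_token() {
    printf '%s\n' "$1" | sed -n 's/.*\([0-9]\{6\}-[0-9]\{6\}\).*/\1/p'
}

# Classify by the last two digits of the token:
# 50 or higher -> big, 48 or lower -> small, anything else -> unknown.
build_from_firmware() {
    token=$(firmware_token "$1")
    [ -n "$token" ] || { echo unknown; return 0; }
    suffix=${token##*-????}   # keep only the last two digits
    suffix=${suffix#0}        # normalise e.g. "08" to "8"
    if [ "$suffix" -ge 50 ]; then
        echo big
    elif [ "$suffix" -le 48 ]; then
        echo small
    else
        echo unknown
    fi
}

# Classify dumpCod text read from stdin by the first maj/min line:
# major 60 -> big, major 6 -> small, anything else -> unknown.
build_from_cod() {
    major=$(awk '/maj\/min:/ { split($2, v, "/"); print v[1]; exit }')
    case "$major" in
        60) echo big ;;
        6)  echo small ;;
        *)  echo unknown ;;
    esac
}

# Samples taken from the output shown above.
build_from_firmware "172.30.80.128 :  2060-00004-060216-054551"
build_from_firmware "oraclefs-controller-060216-054551.x86_64"
printf '   maj/min:   6/21\n' | build_from_cod
```

Comparing the result of firmware_token on the ver output with its result on the rpm output is one way to detect the disagreement described in the NOTE above; a result of small from either classifier means the COD conversion needs to be inhibited.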
       

Inhibit Big Build COD conversion

  1. Stage the higher software release.  See KM Document 1967797.1 FS System: How to Download Software and Firmware Updates for the FS1-2 for the latest release.
  2. If the system is on Small Build:
    1. Enable SSH with fscli.  See KM Document 2029847.1 FS System: How to Enable SSH Access to the Pilot for details.
    2. ssh to the active Pilot (Pilot 2 in this example) using the shared IP address and create the flag file on both Pilots to prevent the installation of Big Build:
      [root@pilot2 ~]# ssh pilot1 touch /var/lib/pillar/PDS_NO_COD_MIGRATION
      [root@pilot2 ~]# ssh pilot1 ls /var/lib/pillar/PDS_NO_COD_MIGRATION
      /var/lib/pillar/PDS_NO_COD_MIGRATION
      [root@pilot2 ~]# touch /var/lib/pillar/PDS_NO_COD_MIGRATION
      [root@pilot2 ~]# ls /var/lib/pillar/PDS_NO_COD_MIGRATION
      /var/lib/pillar/PDS_NO_COD_MIGRATION
      [root@pilot2 ~]#
       
    3. Exit the ssh session.
       
  3. Launch the upgrade as a Disruptive Update.
    1. Select the "Restart and update software (disrupts data access)" radio button.
    2. Select the "Ignore hardware status" box.
    3. Select the "Ignore system alerts" box.
    4. Select the "Ignore current requests" box (this will halt any tasks that are running).
    5. If a prior software update has failed, select the "Override failed software update" box.
    6. DO NOT select the "Shutdown Controller" box, as this will result in data loss.
       
  4. After the upgrade completes, clear the flags set in step 2 above that prevent a COD migration:
    [root@pilot2 ~]# ssh pilot1 rm /var/lib/pillar/PDS_NO_COD_MIGRATION
    [root@pilot2 ~]# ssh pilot1 ls /var/lib/pillar/PDS_NO_COD_MIGRATION
    ls: cannot access /var/lib/pillar/PDS_NO_COD_MIGRATION: No such file or directory
    [root@pilot2 ~]# rm /var/lib/pillar/PDS_NO_COD_MIGRATION
    [root@pilot2 ~]# ls /var/lib/pillar/PDS_NO_COD_MIGRATION
    ls: cannot access /var/lib/pillar/PDS_NO_COD_MIGRATION: No such file or directory
    [root@pilot2 ~]#
     
  5. If necessary, continue with any Drive Group or Drive Recovery actions.
    NOTE: Do not attempt to restore a Drive Group With Data Loss if any new drives from logistics have been installed. Engineering advice on recovery will be required if a normal Drive Group Restore [without data loss] cannot be done.
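
The flag handling in steps 2 and 4 above could be wrapped in two small functions so that the flag is always created, verified, and removed on both Pilots together. This is a sketch, not an Oracle-supplied script: the function names and the FLAG and PEER variables are hypothetical, the peer is assumed to be reachable as "ssh pilot1" as in the transcript above, and rm -f is used in place of plain rm so that an already-missing flag is not treated as an error.

```shell
#!/bin/sh
# Hypothetical wrapper around the touch/ls/rm commands shown above.
# FLAG is the path used in this procedure; PEER is the command prefix
# used to reach the other Pilot. Both may be overridden by the caller.
FLAG=${FLAG-/var/lib/pillar/PDS_NO_COD_MIGRATION}
PEER=${PEER-ssh pilot1}

# Create the flag on the peer Pilot and locally, then confirm both exist.
set_cod_flag() {
    $PEER touch "$FLAG" && touch "$FLAG" &&
    $PEER ls "$FLAG" >/dev/null && ls "$FLAG" >/dev/null
}

# Remove the flag on both Pilots, then confirm that neither copy remains.
clear_cod_flag() {
    $PEER rm -f "$FLAG" && rm -f "$FLAG" || return 1
    if $PEER ls "$FLAG" >/dev/null 2>&1 || ls "$FLAG" >/dev/null 2>&1; then
        return 1
    fi
}

# Intended use on the active Pilot (not executed here):
#   set_cod_flag   || echo "flag not set on both Pilots"
#   ...run the software update...
#   clear_cod_flag || echo "flag still present on a Pilot"
```

Returning a non-zero status when either Pilot is out of step makes it harder to start the update with the flag present on only one Pilot.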

If any other issues are encountered, engage FS Engineering for additional assistance.  See also Document 2366777.1 FS System: Procedure to Restore an FS1-2 Back to Small Build After a Failed Upgrade.


  Copyright © 2018 Oracle, Inc.  All rights reserved.