Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1478800.1
Update Date: 2014-12-04
Keywords:

Solution Type: Technical Instruction

Solution 1478800.1: Pillar Axiom: How to Handle a Brick Missing Callhome Event


Related Items
  • Pillar Axiom 500 Storage System
  • Pillar Axiom 600 Storage System

Related Categories
  • PLA-Support>Sun Systems>DISK>Axiom>SN-DK: Ax600

In this Document
Goal
Solution
References


Created from <SR 3-6001267881>

Applies to:

Pillar Axiom 500 Storage System - All Versions [All Releases]
Pillar Axiom 600 Storage System - All Versions [All Releases]
Information in this document applies to any platform.

Goal

How to recover from a Brick Missing event (BrickMissing, CM_EVT_BRICK_MISSING, and CM_EVT_BRICK_REMOVED).

If you still have questions after reading this article, go to the My Oracle Support Community - Pillar Axiom Storage System.

Solution

These events are often spurious. Check the following items to confirm whether the event is valid:

  1. Check whether there has been a site power issue.
  2. Check whether the event was caused by a maintenance activity.
  3. Check whether the issue is already being worked under a different SR.

If the Brick has not recovered, the following steps can be attempted in most cases, usually over a WebEx session:

  1. Check that power is reaching the Brick's power supplies.
  2. Check the cabling against the Cabling Guide.
  3. Check the LED status of all cable ports and of the component LEDs (power supplies, ES Module).
  4. Console into the Brick RAID Controllers to determine their status (via the Brick Console cable).
  5. Evaluate the event logs to determine how the Brick may have placed itself into a Missing status.
  6. Gather and evaluate logs (especially the affected Brick's prior logs) to determine an Action Plan.
  7. Engage Engineering to perform additional triage.


Take care before deciding that any of the following actions is appropriate; they usually require consent from L2 or Engineering:

  1. Power cycling the whole Brick
  2. Reseating/replacing RAID Controllers
  3. Reseating/replacing Disk Drives
  4. Reseating/replacing the ES Module

 
Other, less likely causes of Brick Missing events:

  1. A rogue drive disrupting access to the entire Brick
  2. RAID Controllers stuck in the boot ready state
  3. Disabled ports between the Slammer and the Brick that prevent proper communication (these ports need to be enabled or replaced)

    NOTE: If this is the case, you have the option to reroute the entire Brick, one RAID Controller at a time, downstream of another Brick (or to an available Slammer port).

  4. A bad ES Module or a mis-dialed thumb wheel setting on the ES Module (the setting should be 0 for FC or SATA RAID Controller Bricks and 1 for FC-Expansion Bricks)
  5. Power supplies or PDU switched off or faulty
  6. A power outage at the customer's site
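
The thumb wheel rule in item 4 above can be written down as a small lookup for use in site notes or a checklist script. This is only an illustrative sketch: the Brick type labels ("FC", "SATA", "FC-Expansion") and the function name are assumptions made for this example, not identifiers from any Pillar Axiom tool, and the observed setting is expected to be read off the ES Module by hand.

```python
# Expected ES Module thumb wheel settings, taken from item 4 above.
# The dictionary keys are hypothetical labels chosen for this sketch.
EXPECTED_THUMB_WHEEL = {
    "FC": 0,            # FC RAID Controller Brick
    "SATA": 0,          # SATA RAID Controller Brick
    "FC-Expansion": 1,  # FC-Expansion Brick
}


def thumb_wheel_ok(brick_type, observed_setting):
    """Return True if the observed thumb wheel setting matches the
    expected value for the given Brick type."""
    if brick_type not in EXPECTED_THUMB_WHEEL:
        raise ValueError("Unknown Brick type: %r" % (brick_type,))
    return EXPECTED_THUMB_WHEEL[brick_type] == observed_setting


if __name__ == "__main__":
    # Example: an FC-Expansion Brick whose wheel was found set to 0.
    print(thumb_wheel_ok("FC-Expansion", 0))
```

A mismatch only flags the setting for review; correcting it is still subject to the cautions on ES Module work described earlier.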

Once recovery has taken place:

  1. Ensure that all provisioned Storage objects are online and accessible.
  2. Ensure that all components on the Brick (Drives, RAID Controllers) show Normal status.
  3. Follow any Drive rebuild that takes place through to completion.
  4. Confirm that hosts have access to the provisioned Storage objects.
  5. Follow up with the customer to confirm that there are no further performance issues (including marginal drives).
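
The post-recovery checks above can be tracked as a simple checklist before the SR is closed. The following is a minimal sketch, assuming the engineer records each observation by hand; the class, field names, and message strings are hypothetical and are not read from the Axiom or any Oracle tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BrickRecoveryState:
    """Manually recorded observations after a Brick recovery.
    All field names are hypothetical, chosen for this sketch."""
    storage_objects_online: bool
    component_status: Dict[str, str]  # component name -> reported status
    rebuilds_in_progress: int
    host_access_confirmed: bool
    customer_confirmed_no_perf_issues: bool


def outstanding_items(state: BrickRecoveryState) -> List[str]:
    """Return the post-recovery checks that are not yet satisfied."""
    items = []
    if not state.storage_objects_online:
        items.append("Provisioned Storage objects are not all online and accessible")
    not_normal = sorted(
        name for name, status in state.component_status.items()
        if status != "Normal"
    )
    if not_normal:
        items.append("Components not in Normal status: " + ", ".join(not_normal))
    if state.rebuilds_in_progress > 0:
        items.append("Drive rebuilds still in progress: %d" % state.rebuilds_in_progress)
    if not state.host_access_confirmed:
        items.append("Host access to provisioned Storage objects not confirmed")
    if not state.customer_confirmed_no_perf_issues:
        items.append("Customer follow-up on performance issues still pending")
    return items


if __name__ == "__main__":
    state = BrickRecoveryState(
        storage_objects_online=True,
        component_status={"RAID Controller 0": "Normal", "Drive 5": "Warning"},
        rebuilds_in_progress=1,
        host_access_confirmed=True,
        customer_confirmed_no_perf_issues=False,
    )
    for item in outstanding_items(state):
        print(item)
```

An empty list means all five checks have been recorded as satisfied; anything else is a reminder of what still needs follow-up.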

References

<NOTE:1389622.1> - Pillar Axiom: How to establish a serial connection to a Brick Console.

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.