Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1498053.1
Update Date:2016-12-19
Keywords:

Solution Type  Technical Instruction Sure

Solution  1498053.1 :   How to Decode Common Array Manager Alarms using the Event or Grid Code  


Related Items
  • Sun Storage 6580 Array
  • Sun Storage Common Array Manager (CAM)
  • Sun Storage 2540-M2 Array
  • Sun Storage 2510 Array
  • Sun Storage 6540 Array
  • Sun Storage 2540 Array
  • Sun Storage 6130 Array
  • Sun Storage 6180 Array
  • Sun Storage J4200 Array
  • Sun Storage J4400 Array
  • Sun Storage 6780 Array
  • Sun Storage 2530-M2 Array
  • Sun Storage 2530 Array
  • Sun Storage J4500 Array
  • Sun Storage 6140 Array

Related Categories
  • PLA-Support>Sun Systems>DISK>Disk Software>SN-DK: CAM


Provides details on decoding Event or Grid Codes.

Applies to:

Sun Storage 6140 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 2530-M2 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 2530 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 2510 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 6580 Array - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Goal

This document explains the format of the Event Codes reported for arrays managed by Sun Storage Common Array Manager (CAM) and how to extract more information from such a code.

Solution

CAM alarms contain useful information about problems that a given array may be seeing.  One of the components of the alarm is the "Grid Code" (shown as the Event Code in other forms of the alarm).  Here is a typical alarm; the Grid Code appears on the GridCode line:

Alarm ID   : alarm137
Description: Drive Tray.08.Drive.10 failed.
Severity   : Critical
Element    : t8drive10
GridCode   : 63.66.1023
Date       : 2012-10-11 18:38:04

 

The Grid or Event Code reveals additional information about a problem.  The code has three parts separated by periods; in the example above, those parts are 63, 66 and 1023.  The first part identifies the Source: the value 63 corresponds to a Sun StorageTek 6540 array (other values are listed in the first table below).  The second part is the Event Type: the value 66 corresponds to a "ProblemEvent" (other values are listed in the second table below).  The last part identifies the Description of the event.  The complete list of Description values is too large for this document, and the Description text shown in the alarm is self-evident.
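The three-part layout described above can be sketched in a few lines of Python. This is only an illustration and not part of CAM; the names GridCode and parse_grid_code are hypothetical.

```python
from typing import NamedTuple


class GridCode(NamedTuple):
    source: int       # first part: identifies the array model or management host
    event_type: int   # second part: identifies the kind of event
    description: int  # third part: identifies the specific event description


def parse_grid_code(code: str) -> GridCode:
    """Split a Grid Code such as '63.66.1023' into its three numeric parts."""
    parts = code.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a three-part numeric Grid Code: {code!r}")
    return GridCode(*(int(p) for p in parts))


# Grid Code taken from the example alarm above.
print(parse_grid_code("63.66.1023"))
# prints: GridCode(source=63, event_type=66, description=1023)
```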

 
Source Value    Source
7.yy.zzzz       Management Host
48.yy.zzzz      Sun StorageTek 6130
57.yy.zzzz      Sun StorageTek 6140
59.yy.zzzz      StorageTek Flexline 380
63.yy.zzzz      Sun StorageTek 6540
69.yy.zzzz      Sun StorageTek 2530
70.yy.zzzz      Sun StorageTek 2540
72.yy.zzzz      StorageTek Flexline 280
73.yy.zzzz      Sun StorageTek 2510
74.yy.zzzz      StorageTek Flexline 240
77.yy.zzzz      Sun Storage J4200
78.yy.zzzz      Sun Storage J4400
79.yy.zzzz      Sun StorageTek 6580
80.yy.zzzz      Sun StorageTek 6780
83.yy.zzzz      Sun Storage J4500
86.yy.zzzz      Sun StorageTek F5100
90.yy.zzzz      Sun StorageTek 6180
92.yy.zzzz      Sun StorageTek 2530M2
93.yy.zzzz      Sun StorageTek 2540M2
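For scripted triage, the source table can be carried as a simple lookup. The following Python sketch copies the values from the table above; the names SOURCES and source_name are hypothetical, and values missing from the table are reported as unknown rather than guessed.

```python
# Source values copied from the table above: first part of the Grid Code -> source.
SOURCES = {
    7: "Management Host",
    48: "Sun StorageTek 6130",
    57: "Sun StorageTek 6140",
    59: "StorageTek Flexline 380",
    63: "Sun StorageTek 6540",
    69: "Sun StorageTek 2530",
    70: "Sun StorageTek 2540",
    72: "StorageTek Flexline 280",
    73: "Sun StorageTek 2510",
    74: "StorageTek Flexline 240",
    77: "Sun Storage J4200",
    78: "Sun Storage J4400",
    79: "Sun StorageTek 6580",
    80: "Sun StorageTek 6780",
    83: "Sun Storage J4500",
    86: "Sun StorageTek F5100",
    90: "Sun StorageTek 6180",
    92: "Sun StorageTek 2530M2",
    93: "Sun StorageTek 2540M2",
}


def source_name(grid_code: str) -> str:
    """Return the source for a Grid Code, based on its first part."""
    source = int(grid_code.split(".")[0])
    return SOURCES.get(source, f"Unknown source {source}")


print(source_name("63.66.1023"))  # prints: Sun StorageTek 6540
```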

 

 

INTERNAL ONLY:

 

The following additional source values were defined for arrays that CAM was planned to support but never did.  They should never be seen:

 

 
Source Value    Source
84              B6000
85              NEM
94              6190
95              6590

 

xx.66.9999 is a dummy GridCode that CAM returns whenever it cannot match the Fault ID from SYMbol to the array model.  For an example, see <Document 1519083.1> Sun Storage Common Array Manager (CAM) Returns the Alarm with Event Code 93.66.9999 and Fault ID 434 for a Sun Storage 2500-M2 Array.
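A script that post-processes alarms may want to flag this dummy code so that it is not mistaken for a real Description value. A minimal Python sketch follows; the function name is_dummy_grid_code is hypothetical.

```python
def is_dummy_grid_code(grid_code: str) -> bool:
    """Return True for the xx.66.9999 dummy code, whatever the source value is."""
    parts = grid_code.strip().split(".")
    return len(parts) == 3 and parts[1:] == ["66", "9999"]


print(is_dummy_grid_code("93.66.9999"))  # prints: True
print(is_dummy_grid_code("63.66.1023"))  # prints: False
```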

 

 
Event Value   Event Type                         Notes
xx.4.zzzz     Value Change Event                 Resolved Reporting Change to Optimal
xx.5.zzzz     Value Change Event                 Problem Reporting Change to Non-Optimal
xx.10.zzzz    Audit Event                        Weekly Internal Audit
xx.11.zzzz    Communications Established Event   Management Communication
xx.12.zzzz    Communications Lost Event          Management Communication
xx.14.zzzz    Discovery Event                    Initial Array Discovery
xx.19.zzzz    Location Change Event              Changes to Customer Information Page
xx.20.zzzz    Log Event                          Typically a Diagnostic Test Issue
xx.22.zzzz    Quiesce End Event                  IO Successfully Quiesced
xx.23.zzzz    Quiesce Start Event                IO Quiescence Started
xx.25.zzzz    State Change Event                 Reporting Change to Optimal
xx.26.zzzz    State Change Event                 Reporting Change to Non-Optimal
xx.40.zzzz    Component Insert Event             Component Inserted
xx.41.zzzz    Component Remove Event             Component Removed
xx.64.zzzz    Problem Change Event               Previously Reported Problem Changed
xx.65.zzzz    Problem Clear Event                Previously Reported Problem Fixed
xx.66.zzzz    Problem Event                      Problem being Reported
xx.74.zzzz    Revision Baseline Event            Array is at Firmware Baseline
xx.75.zzzz    Revision Delta Event               Array is not at Firmware Baseline
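The event table can be used the same way as the source table. The Python sketch below copies its rows into a lookup keyed on the second part of the code; EVENT_TYPES and describe_event are hypothetical names, and the split between event type and note for values 4 and 5 is an assumption about how the table columns divide.

```python
# Event values from the table above: second part of the Grid Code -> (event type, note).
EVENT_TYPES = {
    4: ("Value Change Event", "Resolved Reporting Change to Optimal"),
    5: ("Value Change Event", "Problem Reporting Change to Non-Optimal"),
    10: ("Audit Event", "Weekly Internal Audit"),
    11: ("Communications Established Event", "Management Communication"),
    12: ("Communications Lost Event", "Management Communication"),
    14: ("Discovery Event", "Initial Array Discovery"),
    19: ("Location Change Event", "Changes to Customer Information Page"),
    20: ("Log Event", "Typically a Diagnostic Test Issue"),
    22: ("Quiesce End Event", "IO Successfully Quiesced"),
    23: ("Quiesce Start Event", "IO Quiescence Started"),
    25: ("State Change Event", "Reporting Change to Optimal"),
    26: ("State Change Event", "Reporting Change to Non-Optimal"),
    40: ("Component Insert Event", "Component Inserted"),
    41: ("Component Remove Event", "Component Removed"),
    64: ("Problem Change Event", "Previously Reported Problem Changed"),
    65: ("Problem Clear Event", "Previously Reported Problem Fixed"),
    66: ("Problem Event", "Problem being Reported"),
    74: ("Revision Baseline Event", "Array is at Firmware Baseline"),
    75: ("Revision Delta Event", "Array is not at Firmware Baseline"),
}


def describe_event(grid_code: str) -> str:
    """Return 'Event Type (Note)' for the second part of a Grid Code."""
    value = int(grid_code.split(".")[1])
    event_type, note = EVENT_TYPES.get(value, ("Unknown event type", ""))
    return f"{event_type} ({note})" if note else event_type


print(describe_event("63.66.1023"))  # prints: Problem Event (Problem being Reported)
```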

Events that are not errors do not generate alarms.  For example, a Revision Baseline Event indicates that the array is at the firmware baseline, so no further action is needed.

Using the command ras_admin and the Grid Code, it is possible to obtain additional information about the failure, including the array type and what to do to resolve the problem:

 

# ./ras_admin advisor -e 90.66.1023
Event Code         : 90.66.1023
Event Type         : 6180.ProblemEvent.REC_FAILED_DRIVE
Severity           : 0
Sample Description : Drive {0} failed.
Probable Cause     : A drive has failed.
Recommended Action : Replace the disk drive.
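When the advisor output needs to be consumed by a script, its "Field : Value" lines can be split into a dictionary. The Python sketch below works on the sample output shown above, pasted in as text; it does not invoke ras_admin itself, it assumes every field fits on one line, and parse_advisor_output is a hypothetical name.

```python
SAMPLE_OUTPUT = """\
# ./ras_admin advisor -e 90.66.1023
Event Code         : 90.66.1023
Event Type         : 6180.ProblemEvent.REC_FAILED_DRIVE
Severity           : 0
Sample Description : Drive {0} failed.
Probable Cause     : A drive has failed.
Recommended Action : Replace the disk drive.
"""


def parse_advisor_output(text: str) -> dict:
    """Turn 'Field : Value' lines into a dict, skipping the echoed command line."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # shell prompt plus the command that was typed
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


fields = parse_advisor_output(SAMPLE_OUTPUT)
print(fields["Event Type"])          # prints: 6180.ProblemEvent.REC_FAILED_DRIVE
print(fields["Recommended Action"])  # prints: Replace the disk drive.
```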

 

The ras_admin command can be found in the following locations:

  • Solaris: /opt/SUNWsefms/bin
  • Linux: /opt/sun/cam/private/fms/bin
  • Windows: <Drive>:\Program Files\Sun\Common Array Manager\Component\fms\bin

Because ras_admin processes only the Event Code and not an actual alarm, details such as location will not match the actual alarm.  In the example above, ras_admin cannot know which drive location failed, so it leaves a placeholder in braces ({0}).  If a description has more than one value, the subsequent placeholders are numbered in sequence ({1}, {2}, and so on).
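To see how a Sample Description relates to a real alarm, the numbered placeholders can be filled in by hand. The Python sketch below assumes the placeholders behave like positional fields in Python's str.format, which is an assumption about the template syntax and not documented CAM behaviour; fill_description is a hypothetical name.

```python
def fill_description(template: str, *values: str) -> str:
    """Substitute values for the {0}, {1}, ... placeholders in a Sample Description."""
    return template.format(*values)


# Template from the ras_admin output, location taken from the example alarm.
print(fill_description("Drive {0} failed.", "Tray.08.Drive.10"))
# prints: Drive Tray.08.Drive.10 failed.
```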

 

Do you still have questions?  You can use My Oracle Support Communities.  Communities put you in touch with industry professionals like yourself.  They are monitored by Oracle support engineers, so you can expect reliable and correct answers.  Ask questions and see what others are asking about in the Disk Storage 2000, 3000, 6000 RAID Arrays & JBODs Community.

 


  Copyright © 2018 Oracle, Inc.  All rights reserved.