Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2189092.1
Update Date:2018-01-22
Keywords:

Solution Type  Problem Resolution Sure

Solution  2189092.1 :   "ASR: HALRT-02013: HDD Disk Controller Battery Degraded." during Preventive Maintenance  


Related Items
  • Exadata X3-2 Hardware
  • Exadata X4-2 Hardware
  • Exadata X3-2 Half Rack
  • Exadata X4-2 Quarter Rack
  • Exadata X3-2 Full Rack
  • Exadata X4-2 Half Rack
  • Exadata X4-2 Full Rack
  • Exadata X3-2 Eighth Rack
  • Exadata X4-2 Eighth Rack
  • Exadata X3-2 Quarter Rack
Related Categories
  • PLA-Support>Sun Systems>x86>Engineered Systems HW>SN-x86: Exadata ASR




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-13421944093>

Applies to:

Exadata X3-2 Half Rack - Version All Versions and later
Exadata X3-2 Full Rack - Version All Versions and later
Exadata X4-2 Full Rack - Version All Versions and later
Exadata X4-2 Half Rack - Version All Versions to All Versions [Release All Releases]
Exadata X3-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Symptoms

During Engineered Systems Preventive Maintenance (PM), after a BBU is replaced, the fault "HALRT-02013: HDD Disk Controller Battery Degraded" may be reported by alerthistory and ASR.

Cause

Bug 23518700 : WE SHOULD NOT FAULT BATTERY UNTIL LEARN CYCLE HAS COMPLETED.
Bug 22992608 : BBU DROP FOR REPLACEMENT AND REENABLE ARENT CONSISTENT WITH BBU CHARGE BEHAVIOR

False ASRs (Automatic Service Requests) are frequently generated by assets that are still ACTIVE in the ASR Manager while service or maintenance is being performed. The Preventive Maintenance documentation (KM#1356473.1) does indicate that ASR should be "disabled" during PM service, but because of the expansive nature of the PM activities this step is often overlooked.

Due to a bug in the Exadata software, when a replacement BBU (these ship with a partial charge) is inserted, the software may prematurely fault the BBU as "degraded" because its initial charge level is lower than expected. This results in an alert being logged in alerthistory. If the asset is ACTIVE in ASR, an ASR is generated as well. Because BBUs are commonly replaced in quick succession during PM service, this can lead to multiple false ASRs being logged in a short period of time.

If this kind of fault occurs unexpectedly during normal operation, it may be a legitimate failure and should be diagnosed and addressed as appropriate.

Solution

The fault will be cleared from alerthistory automatically once the BBU finishes charging. Any ASRs that are generated must be investigated and closed manually.
Upon receiving such an ASR, the first thing the TSE (Technical Service Engineer) should do is check for an actual fault and for any existing SR, PM SR, ASR, or similar.


IF A PREVENTIVE MAINTENANCE OR OTHER SR/ASR/FE TASK IS FOUND/BEING PERFORMED: 
Set the assets into "Maintenance Mode" (Ops Center) or create a "Blackout" (OEM Cloud Control) before replacing the batteries. This prevents false ASRs from being generated. Leave the asset in "Maintenance Mode"/"Blackout" until BBU charging has completed and is reported as normal by alerthistory or by MegaCli64 -AdpBbuCmd -GetBbuStatus -a0
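
As a rough illustration of the "wait until charging is completed" check, the sketch below parses BBU status text of the kind printed by the MegaCli status command mentioned above. The sample text and the field names ("Charging Status", "Relative State of Charge") are assumptions modeled on typical output and may differ by firmware or software version, so treat this as a starting point, not as the supported procedure.

```python
# Illustrative text only: field names are assumptions modeled on typical
# MegaCli BBU status output and may differ on a given system.
SAMPLE_OUTPUT = """\
BBU status for Adapter: 0
BatteryType: iBBU08
  Charging Status              : Charging
  Learn Cycle Active           : No
  Relative State of Charge: 62 %
"""


def parse_bbu_status(text):
    """Collect every 'key : value' line of the status text into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def bbu_still_charging(fields):
    """True while the (assumed) 'Charging Status' field reports charging."""
    return fields.get("Charging Status", "").lower() == "charging"


fields = parse_bbu_status(SAMPLE_OUTPUT)
print(fields["Relative State of Charge"], bbu_still_charging(fields))
```

While the helper reports that the BBU is still charging, the asset would stay in Maintenance Mode or Blackout.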

Enterprise Ops Manager Feature Reference Guide - 9.10.1 "Using Maintenance Mode" 

To Place Assets in Maintenance Mode
1)Select an asset in the Navigation pane.
2)Click Place in Maintenance in the Actions pane.
3)Click Place to confirm the action.

To Remove Assets From Maintenance Mode
1)Select the asset in the Navigation pane.
2)Click Remove From Maintenance in the Actions pane.
3)Click Remove to confirm the action.

Oracle Enterprise Manager Cloud Control - Chap.5 "Using Blackouts"

To Create a "Blackout"
1)"Enterprise" Menu -> "Monitoring" -> "Blackouts and Notification Blackouts"
2)Click "Create"
3)Choose "Blackout" (stops ALL monitoring/reporting for maintenance/service purposes)
4)Continue to follow the wizard to complete creation of the Maintenance blackout.

To Purge a "Blackout" (after Maintenance is completed/ended)
1)"Enterprise" Menu -> "Monitoring" -> "Blackouts and Notification Blackouts"
2)Find or search for the Blackout you want to change/end
3)From the "Show" drop-down menu select "History"
4)In the table select the Blackout/Notification Blackout you want to remove/end and click "DELETE"
5)Click "DELETE" on the confirmation dialog.

To Enable/Disable assets via the ASR Manager:
(Also see: ASR - how to disable Service Processor assets to avoid Automatic Service Request generation during System Maintenance (Doc ID 2079929.1))
1)Log in to your ASR Manager and start the ASR prompt (V4.x: # /opt/SUNWswasr/bin/asr , V5.x: # /opt/asrmanager/bin/asr)
2)Get the list of managed assets: asr> list_asset
3)Disable the ASR asset or assets by IP address, hostname, or subnet (the subnet option is useful for disabling a group of assets in the same subnet, which suits Engineered Systems):
asr> disable_asset [-i | -h | -s ]
4)To re-enable assets, use: asr> enable_asset [-i | -h | -s ]
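
For engineers who script their PM checklists, the following sketch composes the disable/enable command lines for the asr prompt. It uses only the -i, -h and -s selectors named above. The helper name, the placeholder address and the exact form of each argument value are assumptions; confirm the syntax against Doc ID 2079929.1 before use.

```python
def asset_command(action, ip=None, hostname=None, subnet=None):
    """Build 'disable_asset' or 'enable_asset' with exactly one selector.

    Hypothetical helper: it only formats text to type at the asr> prompt
    and does not contact the ASR Manager.
    """
    if action not in ("disable", "enable"):
        raise ValueError("action must be 'disable' or 'enable'")
    selectors = [("-i", ip), ("-h", hostname), ("-s", subnet)]
    chosen = [(flag, value) for flag, value in selectors if value]
    if len(chosen) != 1:
        raise ValueError("give exactly one of ip, hostname or subnet")
    flag, value = chosen[0]
    return f"{action}_asset {flag} {value}"


# 192.0.2.10 is a documentation-only placeholder address.
print(asset_command("disable", ip="192.0.2.10"))
print(asset_command("enable", ip="192.0.2.10"))
```

Generating the enable command at the same time as the disable command makes it harder to forget re-enabling the asset once the BBU has finished charging.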

IF PREVENTIVE MAINTENANCE/OTHER SERVICE is ~NOT~ being performed: continue support as normal.

IF ADDITIONAL HELP IS NEEDED - If you already have an SR/ASR open, please ask the SRO (Service Request Owner) to collaborate with the appropriate team. Alternatively, create a new SR for the appropriate team (contact the Oracle HUB and they will assist you with creating a new SR).


FOR ORACLE TECHNICAL SERVICE ENGINEERS (TSEs)

Before you do anything else, first check whether a Preventive Maintenance (PM) SR or another SR/ASR is open.

To review SR history for the current asset:
On the SR, click the "Maximize" button.
Next to the Serial # field, click the "View Asset Details" icon (speech bubble with an "i" in it, to the right of the search glass button).
Review the "Related SR's" tab for other SRs opened for that particular asset/node. This shows SRs opened ONLY for the serial # you are currently viewing. It will NOT show SRs opened for any other related assets, i.e. the parent S/N or other child assets. You MUST search those separately.

To find PM SRs and review the service history of the parent asset serial # (i.e. the Rack Master Serial #):
Highlight the Rack Master Serial # ("Serial # (Parent)" Field)
Click "asset", then "Menu" Button (Far Right), then "Create New View"
Plug the Rack Master Serial # into the "Serial #" Field and click "Apply"
Click the "Instance" # then "Related SR's" tab. 
If a Preventive Maintenance SR is found, this is likely a false ASR. Link this Knowledge Article, the PM SR, and bugs 22992608 & 23518700 to the ASR. Advise the customer, the PM SR owner, and the Field Engineer to set the assets into maintenance mode (Ops Center), create a "Blackout" (Cloud Control), or disable ASR (via the ASR Manager) before replacing batteries, to prevent false ASRs. (Refer them to this article, and also to PM KM#1356473.1 step 2, which states to disable ASR before doing service.)

OPTIONAL - You can also search the SR/ASR history of the other child assets to see whether any other ASRs of the same type have been logged. It is quite common for several such ASRs to occur around the same time as PM service progresses. If you do this, please link any other BBU Degraded ASRs you find to the PM SR.
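
If the SR/ASR history of the child assets can be exported as records, the search for related ASRs could be sketched as below. The record layout, the ASR identifiers, the serial numbers and the PM time window are all hypothetical placeholders. Only the fault code HALRT-02013 comes from this article.

```python
from datetime import datetime

FAULT_CODE = "HALRT-02013"  # fault code quoted in this article


def asrs_to_link(records, pm_start, pm_end):
    """Return records with the BBU-degraded fault raised inside the PM window."""
    return [
        r for r in records
        if r["fault"] == FAULT_CODE and pm_start <= r["raised"] <= pm_end
    ]


# Hypothetical export: identifiers, serials and timestamps are placeholders.
records = [
    {"asr": "ASR-A", "serial": "CHILD-1", "fault": "HALRT-02013",
     "raised": datetime(2018, 1, 10, 10, 5)},
    {"asr": "ASR-B", "serial": "CHILD-2", "fault": "HALRT-02013",
     "raised": datetime(2018, 1, 10, 10, 40)},
    {"asr": "ASR-C", "serial": "CHILD-3", "fault": "OTHER-FAULT",
     "raised": datetime(2018, 1, 10, 10, 45)},
    {"asr": "ASR-D", "serial": "CHILD-1", "fault": "HALRT-02013",
     "raised": datetime(2017, 11, 2, 8, 0)},
]

window = (datetime(2018, 1, 10, 9, 0), datetime(2018, 1, 10, 17, 0))
for record in asrs_to_link(records, *window):
    print(record["asr"], record["serial"])
```

A BBU Degraded ASR raised well outside the PM window (ASR-D in the placeholder data) is left out, since it may be a genuine fault.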

If Preventive Maintenance is NOT occurring, the fault is likely valid and should be addressed as normal. Another SR or field service task MAY also already be open, so don't assume that every ASR you receive is a valid fault.
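
The triage logic described in this section can be summarized in a small sketch. The function name and the returned labels are illustrative assumptions, not part of any Oracle tool. The inputs are the answers to the checks described above.

```python
def triage_bbu_degraded_asr(pm_sr_found, other_service_open=False):
    """Map the outcome of the SR history checks to a next step.

    Hypothetical helper mirroring this article's guidance:
    - a PM SR on the asset or its parent means the ASR is likely false;
    - another open SR or field task should be reviewed before acting;
    - otherwise the fault is handled as a normal, probably valid, fault.
    """
    if pm_sr_found:
        return "likely-false"
    if other_service_open:
        return "check-existing-service"
    return "treat-as-valid"


print(triage_bbu_degraded_asr(pm_sr_found=True))
print(triage_bbu_degraded_asr(pm_sr_found=False, other_service_open=True))
print(triage_bbu_degraded_asr(pm_sr_found=False))
```

The PM SR check comes first because it is the most common explanation for this fault during battery replacement.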

 

References

<BUG:22992608> - BBU DROP FOR REPLACEMENT AND REENABLE ARENT CONSISTENT WITH BBU CHARGE BEHAVIOR
<BUG:23518700> - WE SHOULD NOT FAULT BATTERY UNTIL LEARN CYCLE HAS COMPLETED.
<NOTE:2079929.1> - ASR - how to disable Service Processor assets to avoid Automatic Service Request generation during System Maintenance
https://docs.oracle.com/cd/E37710_01/install.41/e18475/ch4_asr_enviro_admin.htm#ASRUD189
https://docs.oracle.com/cd/E63000_01/EMADM/blackouts.htm#EMADM15282
<NOTE:1356473.1> - How to Perform Exadata or SuperCluster Preventive Maintenance Service

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.