Asset ID: 1-72-1502575.1
Update Date: 2013-05-20
Solution Type: Problem Resolution Sure
Solution 1502575.1: Pillar Axiom: Brick Temp Degrade Events
Related Items
- Pillar Axiom 300 Storage System
- Pillar Axiom 500 Storage System
- Pillar Axiom 600 Storage System
Related Categories
- PLA-Support>Sun Systems>DISK>Axiom>SN-DK: Ax600
Created from <SR 3-6308205361>
Applies to:
Pillar Axiom 300 Storage System - Version All Versions to All Versions [Release All Releases]
Pillar Axiom 600 Storage System - Version All Versions to All Versions [Release All Releases]
Pillar Axiom 500 Storage System - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Symptoms
Western Digital drives report BRICK_TEMP_DEGRADE events such as the following:
<SystemEventInformation>
<EventType>BRICK_TEMP_DEGRADE</EventType>
<Severity>WARNING</Severity>
<Category>SYSTEM</Category>
<Time>2012-10-10T13:16:50.146</Time>
<ComponentIdentity>
<WuName>200C000B083A6FF1</WuName>
</ComponentIdentity>
<ComponentName>/Brick004</ComponentName>
<SourceNodeIdentity>
<Id>2009000B08046F6A</Id>
<Fqn>/Slammer1/1</Fqn>
</SourceNodeIdentity>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.length</ParameterName>
<ParameterValue>52</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.id</ParameterName>
<ParameterValue>BRIX_TMP_DEGRADE</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.reserved</ParameterName>
<ParameterValue>0</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.time</ParameterName>
<ParameterValue>35945144</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.wwn</ParameterName>
<ParameterValue>200C000B083A6FF1</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.raidControllerFirmwareVersion</ParameterName>
<ParameterValue>002114</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.controllerId</ParameterName>
<ParameterValue>0</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.reserved1</ParameterName>
<ParameterValue>0</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.BrickEventHeader.info</ParameterName>
<ParameterValue>BV_INFO</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.cruId</ParameterName>
<ParameterValue>CRU_DRIVE1</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.driveSeverity</ParameterName>
<ParameterValue>TMP_DEGRADE_DRIVE_SEV_LOW</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.entryExitResult</ParameterName>
<ParameterValue>BRIX_TMP_DEGRADE_EXIT_OK</ParameterValue>
</EventParameterList>
<EventGuid>001517D0AEA4000132C3FA933C1A113C</EventGuid>
</SystemEventInformation>
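The fields that matter for triage in the event above are the Brick name, the affected drive (cruId), and the entry/exit result. The following is a minimal Python sketch of how such a record could be read, assuming the event is available as an XML string in the same layout. The function name summarize_event and the trimmed sample are illustrative only and are not part of any Pillar Axiom tool.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the event above; only the fields used here are kept.
EVENT_XML = """<SystemEventInformation>
<EventType>BRICK_TEMP_DEGRADE</EventType>
<Severity>WARNING</Severity>
<Time>2012-10-10T13:16:50.146</Time>
<ComponentName>/Brick004</ComponentName>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.cruId</ParameterName>
<ParameterValue>CRU_DRIVE1</ParameterValue>
</EventParameterList>
<EventParameterList>
<ParameterName>BrickTempDegradeEvent.entryExitResult</ParameterName>
<ParameterValue>BRIX_TMP_DEGRADE_EXIT_OK</ParameterValue>
</EventParameterList>
</SystemEventInformation>"""


def summarize_event(xml_text):
    """Return the Brick, drive, and TDM result from one event record.

    Hypothetical helper for illustration; assumes the layout shown above.
    """
    root = ET.fromstring(xml_text)
    # Each EventParameterList element holds one name/value pair.
    params = {
        p.findtext("ParameterName"): p.findtext("ParameterValue")
        for p in root.iter("EventParameterList")
    }
    return {
        "event_type": root.findtext("EventType"),
        "time": root.findtext("Time"),
        "brick": root.findtext("ComponentName"),
        "drive": params.get("BrickTempDegradeEvent.cruId"),
        "result": params.get("BrickTempDegradeEvent.entryExitResult"),
    }


summary = summarize_event(EVENT_XML)
print(summary)
```

In this sample the summary identifies CRU_DRIVE1 in /Brick004 with an exit result of BRIX_TMP_DEGRADE_EXIT_OK.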
Cause
The BRICK_TEMP_DEGRADE event indicates that a drive entered TDM (Temporary Degraded Mode) and came back online.
A drive entering TDM does not necessarily indicate a failing drive, so the drive does not need to be replaced the first time this happens. The drive maintains a TDM counter. When the counter exceeds the TDM count limit (7), the drive is faulted and a rebuild to the target hot spare is triggered.
Solution
The BRICK_TEMP_DEGRADE event indicates that a drive entered TDM (Temporary Degraded Mode) and came back online.
TDM is a software workaround for a hardware issue in which the hard drive does not respond as quickly as expected. The workaround
briefly takes the drive offline and restarts it, which generates two BRICK_TEMP_DEGRADE events for each occurrence:
2013-04-10T03:00:30.163 WARNING BRICK_TEMP_DEGRADE /Slammer1/0 /Brick003 CRU_DRIVE4
2013-04-10T03:00:09.868 WARNING BRICK_TEMP_DEGRADE /Slammer1/0 /Brick003 CRU_DRIVE4
This workaround is limited to one drive per Brick to avoid loss of access to data. The drive maintains a TDM counter. When the
counter exceeds the TDM count limit (7), that is, seven of these double events, the drive is faulted and a rebuild to the target
hot spare is triggered. After the faulted drive is replaced with a new one, a copyback from the hot spare to the new drive starts.
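The counting rule above can be sketched in Python as follows. This is only an illustration, assuming event log lines in the six-field layout shown earlier (time, severity, event type, source, Brick, drive). The names tdm_occurrences and exceeds_limit are hypothetical. The drive's own TDM counter remains authoritative, since a log extract may not cover the drive's full history.

```python
from collections import Counter

TDM_COUNT_LIMIT = 7        # limit stated in this document
EVENTS_PER_OCCURRENCE = 2  # each TDM occurrence logs two events

LOG_LINES = [
    "2013-04-10T03:00:30.163 WARNING BRICK_TEMP_DEGRADE /Slammer1/0 /Brick003 CRU_DRIVE4",
    "2013-04-10T03:00:09.868 WARNING BRICK_TEMP_DEGRADE /Slammer1/0 /Brick003 CRU_DRIVE4",
]


def tdm_occurrences(lines):
    """Count TDM occurrences per (brick, drive) from event log lines."""
    events = Counter()
    for line in lines:
        fields = line.split()
        # Assumed layout: time, severity, event type, source, brick, drive.
        if len(fields) == 6 and fields[2] == "BRICK_TEMP_DEGRADE":
            events[(fields[4], fields[5])] += 1
    # Two logged events correspond to one TDM occurrence.
    return {key: n // EVENTS_PER_OCCURRENCE for key, n in events.items()}


def exceeds_limit(occurrences):
    """True once the occurrence count is above the stated limit."""
    return occurrences > TDM_COUNT_LIMIT


counts = tdm_occurrences(LOG_LINES)
for (brick, drive), n in counts.items():
    print(brick, drive, "occurrences:", n, "over limit:", exceeds_limit(n))
```

For the two sample lines, the pair of events on /Brick003 CRU_DRIVE4 counts as a single occurrence, well below the limit.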
There is no need to replace any drive proactively because of these BRICK_TEMP_DEGRADE messages.
Handling of this workaround was improved in FW R5.4.3 and later.