Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1442959.1
Update Date:2014-12-10
Keywords:

Solution Type  Problem Resolution Sure

Solution  1442959.1 :   Pillar Axiom: FileSystemBlockQuarantined Events  


Related Items
  • Pillar Axiom 300 Storage System
  • Pillar Axiom 600 Storage System
  • Pillar Axiom 500 Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Axiom>SN-DK: Ax600




Applies to:

Pillar Axiom 300 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Pillar Axiom 500 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Pillar Axiom 600 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Symptoms

 Your Axiom generated an event for FileSystemBlockQuarantined.

You may see an Administrator Action in the GUI (flashing yellow triangle in the lower left corner) for File System Block Quarantined, and one or more filesystems may be offline or in read-only mode.

If you still have questions after reading this article, go to the My Oracle Support Community - Pillar Axiom Storage System.

Cause

The most common cause of File System Block Quarantined events is an Axiom software bug present in an out-of-date (downrev) software version.

 

 

Solution

If this is a recurring issue that is being worked under a different SR, this new SR can be safely ignored.

Upgrading your Axiom software to the latest released version should prevent this from recurring, because the most common cause is a software bug found in downrev versions. You can find the latest released software version for your Axiom model in Oracle Support Document 1558848.1 (Pillar Axiom: Current Recommended Software Versions).

To recover the quarantined data, restore the affected data from backup or from a snapshot.

To repair the filesystem, make a clone of the affected filesystem, run FSCK with the fix option on the clone, then revert the source filesystem to the fixed clone.

If the quarantined blocks are in a snapshot, delete the snapshot, then re-create it.

 

 

Example of a FileSystemBlockQuarantined event in events.xml:

<Event>
<EventID>ID86f01a7c-d21d-b211-99a2-882bc22e04ad</EventID>
<EventType>FileSystemBlockQuarantined</EventType>
<Severity>Warning</Severity>
<Timestamp>2012-03-01T10:27:49-06:00</Timestamp>
<InternalEventCode>0x90020</InternalEventCode>
<ReportingControlUnit>
<SlammerName>Slammer1</SlammerName>
<SlammerID>ID80689e85-d21d-b211-b5a1-001b21974517</SlammerID>
<SlammerFQN>/Slammer1</SlammerFQN>
<ControlUnitNumber>0</ControlUnitNumber>
<ServiceType>NAS</ServiceType>
</ReportingControlUnit>
<InternalParameterList>
<Parameter>
<Name>ReportingControlUnitWWN</Name>
<Value>2008000b08047d92</Value>
</Parameter>
<Parameter>
<Name>RawEventContents</Name>
<Value>9C700D0002000000DBC82D00000000000500000003</Value>
</Parameter>
</InternalParameterList>
<ParameterList>
<Parameter>
<Name>FileSystemFQN</Name>
<Value/>
</Parameter>
<Parameter>
<Name>FileSystemID</Name>
<Value>ID00000000-0000-0000-0000-000000000000</Value>
</Parameter>
<Parameter>
<Name>FileSystemName</Name>
<Value/>
</Parameter>
<Parameter>
<Name>FileSystemIdentifier</Name>
<Value>880796</Value>
</Parameter>
<Parameter>
<Name>FileServerIdentifier</Name>
<Value>2</Value>
</Parameter>
<Parameter>
<Name>BlockNumber</Name>
<Value>3000539</Value>
</Parameter>
<Parameter>
<Name>QuarantinedReasonCode</Name>
<Value>5</Value>
</Parameter>
<Parameter>
<Name>QuarantinedReason</Name>
<Value>Btree block is corrupted</Value>
</Parameter>
<Parameter>
<Name>Entry</Name>
<Value>3</Value>
</Parameter>
<Parameter>
<Name>Pages</Name>
<Value>0</Value>
</Parameter>
</ParameterList>
</Event>
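
As a rough illustration, the parameters of an event like the one above can be pulled into a flat dictionary with Python's standard xml.etree.ElementTree module. The function name summarize_event and the abbreviated sample are assumptions made for this sketch and are not part of any Axiom tooling; the sample wraps the fields in an <Event> element so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example event; only some fields are reproduced.
SAMPLE_EVENT = """\
<Event>
<EventType>FileSystemBlockQuarantined</EventType>
<Timestamp>2012-03-01T10:27:49-06:00</Timestamp>
<ReportingControlUnit>
<SlammerFQN>/Slammer1</SlammerFQN>
<ControlUnitNumber>0</ControlUnitNumber>
</ReportingControlUnit>
<InternalParameterList>
<Parameter><Name>ReportingControlUnitWWN</Name><Value>2008000b08047d92</Value></Parameter>
</InternalParameterList>
<ParameterList>
<Parameter><Name>FileSystemFQN</Name><Value/></Parameter>
<Parameter><Name>FileSystemName</Name><Value/></Parameter>
<Parameter><Name>FileSystemIdentifier</Name><Value>880796</Value></Parameter>
<Parameter><Name>QuarantinedReasonCode</Name><Value>5</Value></Parameter>
<Parameter><Name>QuarantinedReason</Name><Value>Btree block is corrupted</Value></Parameter>
</ParameterList>
</Event>
"""


def summarize_event(event_xml):
    """Hypothetical helper: return the fields of one Event as a flat dict."""
    event = ET.fromstring(event_xml)
    summary = {
        "EventType": event.findtext("EventType"),
        "Timestamp": event.findtext("Timestamp"),
        "SlammerFQN": event.findtext("ReportingControlUnit/SlammerFQN"),
        "ControlUnitNumber": event.findtext("ReportingControlUnit/ControlUnitNumber"),
    }
    # InternalParameterList and ParameterList share the same Name/Value layout.
    for param in event.iter("Parameter"):
        name = param.findtext("Name")
        # An empty <Value/> element is stored as an empty string.
        summary[name] = param.findtext("Value") or ""
    return summary


if __name__ == "__main__":
    for key, value in summarize_event(SAMPLE_EVENT).items():
        print(f"{key}: {value!r}")
```

Empty FileSystemFQN and FileSystemName values come back as empty strings, which makes the "file system not defined" case easy to test for.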


1) The important items to review in the event above include the following:
EventType: FileSystemBlockQuarantined
Timestamp: 2012-03-01T10:27:49-06:00
SlammerFQN: /Slammer1
ControlUnitNumber: 0
2) Once these are obtained, the next most important information is the FileSystemFQN and FileSystemName. In this example neither is defined, so the affected object is most likely a snapshot.
3) In this instance there is a FileSystemIdentifier, 880796. Note this value, because it is the file system ID that will be searched for in the Slammer logs.
4) The next two important pieces of information are the following:
QuarantinedReasonCode = 5
QuarantinedReason = Btree block is corrupted
5) In this case, because the QuarantinedReason is "Btree block is corrupted", this is a severe issue and the service request should be escalated to development for investigation.
NOTE: To give development more information, the following steps trace the file system ID in the Slammer logs.
6) The Slammer Control Unit WWN is 2008000B08047D92 (the ReportingControlUnitWWN value in the event). Navigate to the directory of that name in the log bundle to search for the file system ID.
NOTE: Once in the Slammer Control Unit directory, for example:

/home/SR-defect-logs/defect_sr3-5390648009/mlog_031312/tracelog/callhome/slammer/2008000B08047D92/120225.061038-120313.182645.GMT.ws-live.OUTLOG

...execute: tracemfs
7) Running tracemfs cleans up the output-*.txt file for easier viewing.
8) Once tracemfs has completed, open the mfslog file it created (for example, with vi) and search for the file system ID: 880796
9) Below is an example of the output for this particular issue:

2012/03/01-09:57:21+823367200ns: "MFSMI:NAS_MFS_EVENT_FS_SPCMP_QRTN event:"
2012/03/01-09:57:21+823384267ns: "qrtn_fsid: 880796"
2012/03/01-10:25:33+079075806ns: "MFSMI:NAS_MFS_EVENT_FS_SPCMP_QRTN event:"
2012/03/01-10:25:33+079093182ns: "qrtn_fsid: 880796"

10) Putting all the information together for this instance:
A FileSystemBlockQuarantined event was raised for file system ID 880796 with reason code 5 (Btree block is corrupted) in a SPCMP (space [spc] map [mp]) area. Because the corruption is in a Btree block, this case should always be sent to L2, regardless of whether the FileSystemBlockQuarantined event was for the space map.
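
The search in steps 8 and 9 can also be sketched in Python. This is only an illustration: it assumes the mfslog lines have exactly the format shown in step 9 and that each qrtn_fsid line directly follows its event line. The function name find_quarantine_events is hypothetical, and the sketch does not replace tracemfs.

```python
import re

# Matches lines such as:
# 2012/03/01-09:57:21+823384267ns: "qrtn_fsid: 880796"
LINE_RE = re.compile(r'^(?P<ts>\S+ns): "(?P<msg>.*)"$')

# Sample lines copied from step 9.
SAMPLE_LOG = '''\
2012/03/01-09:57:21+823367200ns: "MFSMI:NAS_MFS_EVENT_FS_SPCMP_QRTN event:"
2012/03/01-09:57:21+823384267ns: "qrtn_fsid: 880796"
2012/03/01-10:25:33+079075806ns: "MFSMI:NAS_MFS_EVENT_FS_SPCMP_QRTN event:"
2012/03/01-10:25:33+079093182ns: "qrtn_fsid: 880796"
'''


def find_quarantine_events(lines, fsid):
    """Hypothetical helper: return (timestamp, event name) pairs for fsid."""
    hits = []
    last_event = None  # the event line seen immediately before, if any
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        msg = match.group("msg")
        if msg.endswith(" event:"):
            # "MFSMI:NAS_..._QRTN event:" -> "NAS_..._QRTN"
            name = msg[: -len(" event:")].split(":")[-1]
            last_event = (match.group("ts"), name)
            continue
        if msg == f"qrtn_fsid: {fsid}" and last_event is not None:
            hits.append(last_event)
        # Any other line breaks the event/fsid pairing.
        last_event = None
    return hits


if __name__ == "__main__":
    for ts, name in find_quarantine_events(SAMPLE_LOG.splitlines(), 880796):
        print(ts, name)
```

Against a real log, the lines could be read from the mfslog file that tracemfs produced instead of from SAMPLE_LOG.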

Goal

 

This document describes how to determine whether a FileSystemBlockQuarantined event in the events.xml file of a Callhome log bundle concerns space map or file system data. The example focuses on a space map event and how to handle that specific type of event.

This document will be extended to cover all FileSystemBlockQuarantined events.
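
The rules of thumb used in this article's example can be collected into a small Python sketch. The function name triage and the wording of its notes are assumptions made for illustration. The sketch only encodes the three checks the example uses (empty file system names, SPCMP in the Slammer log event name, and a Btree quarantine reason) and makes no claim about events that match none of them.

```python
def triage(fs_fqn, fs_name, reason, mfs_event_name):
    """Hypothetical helper: turn the article's checks into short notes."""
    notes = []
    # No FileSystemFQN or FileSystemName: more than likely a snapshot.
    if not fs_fqn and not fs_name:
        notes.append("file system not defined: most likely a snapshot")
    # SPCMP in the Slammer log event name stands for space map.
    if "SPCMP" in mfs_event_name:
        notes.append("space map quarantine")
    # Btree corruption is treated as severe whatever area is affected.
    if "btree" in reason.lower():
        notes.append("Btree block corrupted: escalate")
    return notes


if __name__ == "__main__":
    # Values taken from the example event and Slammer log output.
    for note in triage("", "", "Btree block is corrupted",
                       "NAS_MFS_EVENT_FS_SPCMP_QRTN"):
        print(note)
```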



Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.