Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2216385.1 : FS System: System Status READ ONLY or SYSTEM_FAILURE_CONSERVATIVE Due to Pilot Resource Failure
Oracle Confidential PARTNER - Available to partners (SUN).

Applies to:
Oracle FS1-2 Flash Storage System - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms
Overall System Status: Read Only / SYSTEM_FAILURE_CONSERVATIVE
Firmware is 6.2.9 or lower
Log bundle indicates the following observations:

server.log.*:
INFO DATE&TIME com.pillardata.server.systemstate.SystemStateMonitor.update(SystemStateMonitor.java:253) PMICommandProcessor - Transitioning system state from NORMAL to SYSTEM_FAILURE_CONSERVATIVE because of PCP_NOT_ACTIVE

server-jni.log.*:
DATE&TIME pilot1 java: 210122682 16505 MemAlloc::getConnection() 0x7f5b7807c010 0x7f5b7807c050 empty connection pool
The event log should NOT contain any indication that PERSISTENCE access was lost, but it WILL contain multiple OPERATION_FAILED events of the UNSATISFIED_REQUEST_PMI_COMMUNICATION_ERROR type, such as:

<EventType>OPERATION_FAILED</EventType>
<Severity>WARNING</Severity>
<Category>AUDIT</Category>
<Time>2016-12-19T15:11:38.225</Time>
<ComponentIdentity>
<Guid>414B303032363932A13F17232B4FA59C</Guid>
</ComponentIdentity>
<ComponentName>/InitiatorDiscoveryOperation/2021429/{{USER_2}}</ComponentName>
<EventParameterList>
<ParameterName>EventParameters.TaskFailed.csiError_1</ParameterName>
<ParameterValue>UNSATISFIED_REQUEST_PMI_COMMUNICATION_ERROR</ParameterValue>
Further review of the server.log.* files for the time of the condition will show the error "UNSPECIFIED_BLT_ERROR ErrorNumber: -8", for example:

INFO 2016-12-19 15:11:38,174 com.pillardata.pmi.net.InfoLogger.logError(InfoLogger.java:105) PMICommandProcessor - MessageHeader[
revision=0
messageID=0x2961962955
transactionID=0
sourceNodeId=2008000101000000
sourceComponent="PDS_COMP_PACMAN(0x1f)"
destNodeId=2008000101000001
destComponent="PDS_COMP_CM(0x1c)"
flags=392
type="PDS_MSG_TYPE_SUPER_CMD(0x5)"
command="CM_MSG_GET_SAN_INITIATOR(0x1c0d43)"
result="PMI_EOK(0x0)"
operationCount=1
operationSize=8
reserve2=0
time=Timeval[ seconds=1482160298 microseconds=170000 ]
bodySize=8
] ErrorCode: UNSPECIFIED_BLT_ERROR ErrorNumber: -8
If ALL these aspects match and persistence access was not lost, then it is likely that the cause of SYSTEM_FAILURE_CONSERVATIVE was the pilot running out of resources (Bug 24314003).

The most common cause of READ ONLY / SYSTEM_FAILURE_CONSERVATIVE is loss of access to persistence, so it is important to verify the underlying cause from log review. If access to persistence was lost, failing over the pilots per this KM doc will not resolve the condition.
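
The checklist above can be expressed as a small script for triaging an extracted log bundle. This is a minimal sketch, not an Oracle tool: the function name `matches_pilot_resource_failure`, its parameters, and the `persistence_markers` default are assumptions made for illustration. Only the searched strings and the 6.2.9 version bound come from this document. In particular, this document does not say how a loss of persistence appears in the event log, so the marker list should be supplied by whoever reviews the bundle.

```python
import re

# Signature strings quoted from this document.
SERVER_LOG_SIGNATURES = (
    "SYSTEM_FAILURE_CONSERVATIVE because of PCP_NOT_ACTIVE",
    "UNSPECIFIED_BLT_ERROR ErrorNumber: -8",
)
JNI_LOG_SIGNATURE = "empty connection pool"
EVENT_LOG_SIGNATURE = "UNSATISFIED_REQUEST_PMI_COMMUNICATION_ERROR"
LAST_AFFECTED_VERSION = (6, 2, 9)  # "Firmware is 6.2.9 or lower"


def parse_version(text):
    """Turn a dotted version such as '6.2.9' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def matches_pilot_resource_failure(firmware, server_log, jni_log, event_log,
                                   persistence_markers=("PERSISTENCE",)):
    """Return per-check results plus an overall 'all_match' flag.

    persistence_markers is a HYPOTHETICAL default: the exact wording of a
    persistence-loss event is not given in this document.
    """
    checks = {
        "firmware_affected": parse_version(firmware) <= LAST_AFFECTED_VERSION,
        "server_log": all(s in server_log for s in SERVER_LOG_SIGNATURES),
        "jni_log": JNI_LOG_SIGNATURE in jni_log,
        "event_log": EVENT_LOG_SIGNATURE in event_log,
        "persistence_not_lost": not any(m in event_log
                                        for m in persistence_markers),
    }
    # Every individual check must pass before the pilot failover applies.
    checks["all_match"] = all(checks.values())
    return checks
```

Each argument after `firmware` is the text of the corresponding log file read into a string. A result with `all_match` set to false means the log bundle needs manual review before any pilot failover is attempted.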
Cause
A defect in the pilot software causes PCP to fail, and the active pilot transitions to SYSTEM_FAILURE_CONSERVATIVE.

Solution
Ensure the standby pilot is in a normal status. One way to do this is to ssh to the pilot and confirm the status by reviewing /var/log/pcp.log.

Fail the active pilot over to the standby:
# fscli login -u pillar FS1-IP_ADDRESS
# fscli pilot -forceFailover
Due to the nature of the pilot resource exhaustion, the fscli pilot -forceFailover command may not result in a pilot failover. An alternative is to ssh to the pilot and issue "service pilotcfg restart" to perform the failover.
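
The recovery sequence above amounts to a short decision procedure. The sketch below only encodes that ordering and runs nothing against the system. The function name `next_recovery_step` and its boolean inputs are hypothetical and stand for what an operator has observed; the command strings are the ones quoted in this document.

```python
def next_recovery_step(standby_normal, force_failover_tried=False,
                       failover_happened=False):
    """Return the next action as text; inputs are operator observations."""
    if not standby_normal:
        # The standby pilot must be healthy before any failover is attempted.
        return "review /var/log/pcp.log on the standby pilot"
    if not force_failover_tried:
        # First choice: forced failover through the CLI.
        return "fscli pilot -forceFailover"
    if not failover_happened:
        # Fallback when resource exhaustion prevents the CLI failover.
        return "service pilotcfg restart"
    return "done"
```

For example, `next_recovery_step(True, force_failover_tried=True)` returns the fallback command, which is issued from an ssh session on the pilot.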
Defect 24314003 is resolved in 6.2.10 and higher.

References
<BUG:24314003> - COAXM099 BCD ALLOCATION ISSUES PREVENTING JNI REPLY TO STATUS, LEADING TO SXL_ST

Attachments
This solution has no attachment