Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type Problem Resolution Sure Solution 1448492.1 : Pillar Axiom: Pilot Healthcheck Serial Link Not Responding
Created from <SR 3-5573577151>
Applies to:
Pillar Axiom 500 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Pillar Axiom 600 Storage System - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Symptoms
Pilot Healthcheck Serial Link Not Responding
To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - Disk Storage Pillar Axiom System
Changes
Pilot replacement, Loss of power, Bad Pilot Hard drive, Axiom move, Cabling error
Cause
The active pilot did not receive the heartbeat signal from the standby pilot. This can be caused by an error on the standby pilot or by the serial cable connection. Data serving is not affected, because the pilots are used for management functions.
Solution
From Pilot:
1- Check the pilot status from within the GUI.
2- Check power to the pilot.
3- Check that the serial cable is connected correctly. The serial cable should be connected to the bottom serial ports, next to the WHITE video interface.
4- Connect a keyboard and monitor to the pilot in question. Check the condition of the pilot from the host console, then create an SR for investigation and report the condition.
Troubleshooting serial port:
Serial communication is used to monitor the pilot heartbeat as part of the pcp process. The pcp log reports the serial communication status. It is located at /var/log/pcp.log. In case of a serial link failure it will report:
pcp:warning Serial Link Timeout - 1611 1565 3
The status is also reported in the callhome logs if pilot logs are collected. It is located at ../log/pcp_runtime_info.log.<time_stamp>, for example pcp_runtime_info.log.130604171304.
...
-------------------------- FofbAdaptor settings --------------------------
m_ecWarmstartCount = 0
m_ecWarmstartTimer = 0
m_guid = 2008fffffffffff2
m_masterWritten = 1
m_fofbCodReset = 0
m_shuttingDown = 0
m_pilotState = 3
m_otherPilotState = 5
m_runPacman = 0
m_pilotHeartbeatCodReset = 0
m_lastSerialHeartbeat = 2439
m_pacmanHeartbeat = 0
m_smProviderHeartbeat = 0
m_excludedStateSet = 0
m_sameSwVerPrinted = 1
m_softwareUpdateInProgress = 1
m_buddyColdStartInProgress = 0
m_blockPacmanConman = 0
m_serialFailedEventSent = 1
System state = 8
...
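When several pcp_runtime_info files have been collected, it can help to pull individual fields out of the FofbAdaptor dump rather than reading each file by eye. The following is a minimal sketch, not an Oracle-supplied tool: the function name pcp_field is hypothetical, and it only assumes the "name = value" layout shown in the dump above. The meaning of each field is not documented here; the names m_lastSerialHeartbeat and m_serialFailedEventSent suggest they relate to the serial link, but treat that as an assumption.

```shell
# Hypothetical helper: print the value of one field from a
# pcp_runtime_info dump that uses the "name = value" layout.
# Usage: pcp_field <field-name> <file>
pcp_field() {
    awk -v name="$1" '
        {
            # Scan every token so that both one-setting-per-line and
            # flattened single-line dumps are handled.
            for (i = 1; i <= NF - 2; i++)
                if ($i == name && $(i + 1) == "=") { print $(i + 2); exit }
        }
    ' "$2"
}
```

For example, pcp_field m_serialFailedEventSent pcp_runtime_info.log.130604171304 would print 1 for the dump shown above. Nothing is printed if the field is not present.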
The port driver is pcp itself. It uses only the OS termios, fcntl, and system ioctls, and does not use any OS serial drivers other than termios.
If you see this string in /var/log/pcp.log, it means that pcp is unable to attach to the serial port:
pcp:warning Unable to open serial port /dev/ttyS0
If you see the string below in /var/log/pcp.log, it means that pcp has successfully attached to the serial port:
pcp:debug Successfully configured I/O parameters for serial port /dev/ttyS0
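The pcp.log messages quoted in this section can be checked with a short script instead of paging through the log. The sketch below is illustrative only: the function names serial_attach_status and count_serial_timeouts are hypothetical, and the matching relies solely on the three message strings quoted in this document.

```shell
# Hypothetical helper: report whether the most recent serial-port
# open/configure message in a pcp log shows pcp attached to the port.
# Prints "attached", "not-attached", or "unknown" (no such message found).
# Usage: serial_attach_status [logfile]   (default: /var/log/pcp.log)
serial_attach_status() {
    sas_log="${1:-/var/log/pcp.log}"
    sas_last=$(grep -E 'Unable to open serial port|Successfully configured I/O parameters for serial port' "$sas_log" | tail -n 1) || true
    case "$sas_last" in
        *'Successfully configured'*) echo attached ;;
        *'Unable to open'*)          echo not-attached ;;
        *)                           echo unknown ;;
    esac
}

# Hypothetical helper: count "Serial Link Timeout" warnings in a pcp log.
# Usage: count_serial_timeouts [logfile]   (default: /var/log/pcp.log)
count_serial_timeouts() {
    cst_log="${1:-/var/log/pcp.log}"
    # grep -c prints the number of matching lines (0 when there are none).
    grep -c 'Serial Link Timeout' "$cst_log"
}
```

Run on the pilot, serial_attach_status with no argument reads /var/log/pcp.log. A result of "attached" together with a non-zero timeout count points away from the port binding and toward the cable or the other pilot.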
Test serial communication between pilots:
You can test serial port communication by redirecting output to the port.
If cable is not working then you will not see any output.
Test pilot serial ports:
Another failure scenario is a failure of the serial port on the pilot CU. This may be the case if you replaced the cable with a known good one and the serial link is still not working. To identify the failing FRU you need another device capable of serial communication, such as a laptop with a serial port or a USB to serial adapter. You can use the USB to serial adapter from the brick console cable set.
To perform the test:
The serial cable between pilots can be tested by connecting to the Slammer console.
Example of working serial link:
When you look in the current /var/log/pcp.log you should see the fofb node matrix go by every few seconds. Toward the end of this, you should see the NODE_PASSIVE NOOP messages from the other pilot, which would tell you that the serial port is working.
Here is a serial port test, run from pilot2 by redirecting stdout to the tty:
[root@pilot2 root]# echo "THIS IS A SERIAL PORT TEST" > /dev/ttyS0
Here is what you should see every 5 seconds on pilot1:
[root@pilot1 dev]# cat /dev/ttyS0
<<<PILOT_TWO****NODE_PASSIVE*CMD_NOOP*****>>>
<<<PILOT_TWO****NODE_PASSIVE*CMD_NOOP*****>>>
THIS IS A SERIAL PORT TEST
<<<PILOT_TWO****NODE_PASSIVE*CMD_NOOP*****>>>
If you cannot get a response from any serial port, run lsof on /dev/ttyS0 to make sure pcp is bound to it.
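The heartbeat frames in the capture above can also be counted rather than watched. This is a minimal sketch under stated assumptions: the function name count_heartbeats is hypothetical, and it matches only the NODE_PASSIVE*CMD_NOOP text visible in the sample output, so frames with a different state or command are not counted.

```shell
# Hypothetical helper: count peer heartbeat frames in captured serial
# output read from stdin. -F treats the pattern as a fixed string, so
# the asterisks in the frame are matched literally.
count_heartbeats() {
    grep -cF 'NODE_PASSIVE*CMD_NOOP'
}
```

One possible use, assuming the coreutils timeout command is available on the pilot, is to capture for a fixed period and count the frames: timeout 30 cat /dev/ttyS0 | count_heartbeats. At one frame every 5 seconds, a working link should yield a handful of frames in that window, while a count of 0 matches the "no output" symptom of a failed cable. To confirm which process holds the port, lsof /dev/ttyS0 lists the processes that have it open.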
Attachments
This solution has no attachment