Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2004628.1 : How to Detect a Bad (Failed) Front Panel Due to Failed FPP (Front Panel Processor) / I2C MUX
Created from <SR 3-10650190941>

Applies to:
Oracle Fabric Interconnect F1-15 - Version All Versions to All Versions [Release All Releases]
Oracle Fabric Interconnect F1-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

When running 'show hardware', all Fabric Interconnect hardware information is missing (no Base WWN/MAC, no S/Ns, etc.). Running 'show iocards' also does not display any I/O cards or modules, even though I/O modules are inserted.

The applicable output shows the missing hardware information: no Model, S/N, Base WWN, Base MAC, etc.

#

These messages can also be found in /var/log/user.log. Note in particular the "[WARNING] xcm::XCManager !!!!! FB type: Can't open file /tmp/xsigo-fabricready fb_type=-2" errors:

Apr 26 13:54:07 iop-3 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10g-3 [procmon::shutdowntimeout1] shutdown timeout in state WAITING_SHUTDOWN
Apr 26 13:54:07 iop-7 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10g-7 [procmon::shutdowntimeout1] shutdown timeout in state WAITING_SHUTDOWN
Apr 26 13:54:07 iop-3 processmonitor[524]: [NOTICE] procmon::ProcessMonitor vn10g-3 shutdown_timer set to 20 secs
Apr 26 13:54:07 iop-7 processmonitor[524]: [NOTICE] procmon::ProcessMonitor vn10g-7 shutdown_timer set to 20 secs
Apr 26 13:54:07 iop-5 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10x1g-5 [procmon::shutdowntimeout1] shutdown timeout in state WAITING_SHUTDOWN
Apr 26 13:54:07 iop-5 processmonitor[524]: [NOTICE] procmon::ProcessMonitor vn10x1g-5 shutdown_timer set to 20 secs
Apr 26 13:54:07 iop-3 vn2_agent[563]: [WARNING] vn2Debug xvnd disconnected, err 1, arg 0
Apr 26 13:54:07 iop-3 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10g-3 [procmon::fsmerror] FSM got error: Failed to read register pc on process 537: No such process
Apr 26 13:54:07 iop-3 vn2_agent[563]: [ERR] VN2::VN2agent vn10g-3 XCM connection Down
Apr 26 13:54:07 iop-1 processmonitor[524]: [WARNING] procmon::ProcessMonitor vn10g-1 [procmon::procterminated] Process $XSIGOROOT/bin/vn2_agent(552) terminated with signal SIGSEGV (Segmentation violation)
Apr 26 13:54:07 iop-3 processmonitor[524]: [WARNING] procmon::ProcessMonitor vn10g-3 [procmon::procterminated] Process $XSIGOROOT/bin/vn2_agent(554) terminated with signal SIGSEGV (Segmentation violation)
Apr 26 13:54:07 iop-5 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10x1g-5 [procmon::fsmerror] FSM got error: Call to ptrace failed, errno: 3
Apr 26 13:54:07 iop-7 processmonitor[524]: [WARNING] procmon::ProcessMonitor vn10g-7 [procmon::procterminated] Process $XSIGOROOT/bin/vn2_agent(554) terminated with signal SIGSEGV (Segmentation violation)
Apr 26 13:54:07 iop-5 processmonitor[524]: [WARNING] procmon::ProcessMonitor vn10x1g-5 [procmon::procterminated] Process $XSIGOROOT/bin/vn2_agent(553) terminated with signal SIGSEGV (Segmentation violation)
Apr 26 13:54:33 ainf004 shutdown[2403]: shutting down for system reboot
Apr 26 13:54:33 ainf004 boot: [ALERT] Rebooting Base OS now
Apr 26 13:54:33 ainf004 shutdown[2409]: shutting down for system reboot
Apr 26 13:58:00 ainf004 boot: [ALERT] XSIGOS is starting up
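
When the log is long, it can help to see which I/O processor hosts reported vn2_agent crashes before the reboot. The following is a minimal sketch, not an Oracle-supplied tool: it assumes the line layout seen in the excerpts above ("Mon DD HH:MM:SS host process[pid]: [LEVEL] message"), and the function names parse_line and sigsegv_by_host are hypothetical.

```python
import re
from collections import Counter

# Layout inferred from the excerpts above (an assumption, not a documented format):
#   "Mon DD HH:MM:SS host process[pid]: [LEVEL] message"
# The pid and the [LEVEL] tag are optional, as in the "shutdown" and "boot" lines.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<proc>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?:\[(?P<level>[A-Z]+)\] )?"
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Split one log line into its fields; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def sigsegv_by_host(lines):
    """Count 'terminated with signal SIGSEGV' messages per reporting host."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and "terminated with signal SIGSEGV" in record["msg"]:
            counts[record["host"]] += 1
    return counts
```

For example, sigsegv_by_host(open("/var/log/user.log", errors="replace")) would return a per-host count. Crashes alone do not identify a failed FPP; they only help locate the relevant time window in the log.
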
Upon chassis reboot, Chassis Mgr fails to start, with the fabricready fb_type=-2 error:
Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm ERROR: /tmp/xgos_build/checkout.22448/include/xg-equipment-ImpChassisDataShell.cc(562):
Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm Mimm-Init: Failed to get a response from Chassis Mgr. Retrying ... 31
Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm
Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm ERROR: /tmp/xgos_build/checkout.22448/include/xg-equipment-ImpChassisDataShell.cc(1074):
Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm Mimm-Init: ChassisAddressResponse from Chassis Manager is invalid
Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm
Apr 26 14:14:18 ainf004 xc_manager[2726]: [WARNING] xcm::XCManager !!!!! FB type: Can't open file /tmp/xsigo-fabricready fb_type=-2
Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm ERROR: /tmp/xgos_build/checkout.22448/include/xg-equipment-ImpChassisDataShell.cc(562):
Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm Mimm-Init: Failed to get a response from Chassis Mgr. Retrying ... 32
Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm
Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm ERROR: /tmp/xgos_build/checkout.22448/include/xg-equipment-ImpChassisDataShell.cc(1074):
Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm Mimm-Init: ChassisAddressResponse from Chassis Manager is invalid
Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm
Errors similar to the following may also appear:

Feb 16 09:15:39 ainf004 xc_manager[2590]: [ERR] xcm::XCManager MIMM_XCM_IPC_FAIL
Feb 16 09:15:40 ainf004 imagemanager: [NOTICE] Checking for fabric upgrade
Feb 16 09:15:40 ainf004 imagemanager: [ERR] **** Warning: Failed to update switch fabric
Feb 16 09:15:40 ainf004 processmonitor[2534]: [CRIT] procmon::ProcessMonitor [procmon::procdisconnect] Disconnect from $XSIGOROOT/bin/scriptclient (attached), starting killer timer
Feb 16 09:15:40 ainf004 processmonitor[2534]: [WARNING] procmon::ProcessMonitor [procmon::procexit] Process $XSIGOROOT/bin/scriptclient(2909) exited with status 0
Feb 16 09:15:55 ainf004 mimm[2584]: [ERR] Ximm ERROR: /xsigo/xsigos/tags/3.8.2-XGOS-39386/management/ximm/xmlcore/src/xg-xmlcore-xml.cc(783)
Feb 16 09:15:55 ainf004 mimm[2584]: [ERR] Ximm ERROR: /xsigo/xsigos/tags/3.8.2-XGOS-39386/management/ximm/xmlcore/src/xg-xmlcore-xml.cc(783):
Feb 16 09:15:55 ainf004 mimm[2584]: [ERR] Ximm non fatal error encountered for Element(efabric:Fabric) ximm::txbegintimeout 0xb5fe80e0
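
The messages quoted in this section can be searched for in a saved copy of /var/log/user.log. The sketch below is illustrative only and is not an Oracle diagnostic: the substrings are copied from the excerpts above, while the signature names, the scan function, and the looks_like_fpp_failure heuristic are assumptions made for this example. A positive result is only a hint; confirmation still relies on the 'show hardware' output and the diagnostic log bundle.

```python
# Substrings copied from the log excerpts in this document.
# The dictionary keys are hypothetical labels chosen for this example.
SIGNATURES = {
    "fabricready": "Can't open file /tmp/xsigo-fabricready fb_type=-2",
    "chassis_mgr_no_response": "Failed to get a response from Chassis Mgr",
    "chassis_address_invalid": "ChassisAddressResponse from Chassis Manager is invalid",
    "mimm_xcm_ipc_fail": "MIMM_XCM_IPC_FAIL",
    "fabric_update_failed": "Failed to update switch fabric",
}


def scan(lines):
    """Count how many lines contain each known signature substring."""
    hits = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, needle in SIGNATURES.items():
            if needle in line:
                hits[name] += 1
    return hits


def looks_like_fpp_failure(hits):
    """Hypothetical heuristic: the fabricready warning together with at least
    one Chassis Mgr error. This is an assumption, not an official rule."""
    chassis_errors = hits["chassis_mgr_no_response"] + hits["chassis_address_invalid"]
    return hits["fabricready"] > 0 and chassis_errors > 0
```

A typical use would be hits = scan(open("/var/log/user.log", errors="replace")) followed by looks_like_fpp_failure(hits). Plain substring matching is used instead of regular expressions so that the signatures stay identical to the text of the log messages.
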
Changes

May be due to a power outage in the datacenter or a Fabric Interconnect XgOS upgrade, which power cycles the Fabric Interconnect as part of the upgrade. The most likely cause is a power cycle event.

Cause

System hardware information is lost or removed when the Fabric Interconnect FPP (Front Panel Processor) / I2C MUX has failed. This prevents the whole Fabric Interconnect from functioning, although the system will power on, the fans will spin, and the Fabric Interconnect CLI (Command Line Interface) remains accessible.

Oracle Support Engineers require the Fabric Interconnect diagnostic log bundle to prove that the problem is due to a failed FPP. However, running 'show hardware' from the 'admin' CLI may be sufficient if all the hardware information is missing, as noted in the CLI output included above.

Solution

Replace the Front Panel, which contains the FPP (Front Panel Processor). For the procedure, see the KB article on how to replace a Fabric Interconnect Gen2 Front Panel (Doc ID 1663431.1).
References

How to replace a Gen2 Front Panel on Oracle Fabric Interconnects (Xsigo) (Doc ID 1663431.1)
How to Detect a Bad (Failed) Front Panel HCA (Doc ID 1516022.1)
How to Create/Upload Diagnostic Log File Bundle for an Oracle Fabric Interconnect (Doc ID 1517366.1)

Attachments

This solution has no attachment