Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2004628.1
Update Date:2016-12-01
Keywords:

Solution Type  Problem Resolution Sure

Solution  2004628.1 :   How to Detect a Bad (Failed) Front Panel Due to Failed FPP (Front Panel Processor) / I2C MUX  


Related Items
  • Oracle Fabric Interconnect F1-15
  • Oracle Fabric Interconnect F1-4
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Oracle Virtual Networking




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-10650190941>

Applies to:

Oracle Fabric Interconnect F1-15 - Version All Versions to All Versions [Release All Releases]
Oracle Fabric Interconnect F1-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

When running 'show hardware', all Fabric Interconnect hardware information is missing (no Base WWN/MAC, no serial numbers, etc.). In addition, running 'show iocards' does not display any IO cards/modules even though IO Modules are inserted.

The following output shows the missing hardware information. Note that Model and Serial are blank and that Base WWN and Base MAC are all zeros.

#
# Xsigo System Hardware Status
# Model:
# Serial:
# Base MAC: 00:00:00:00:00:00
# Base WWN: 00:00:00:00:00:00:00:00
# Locator LED: off
#
# Date: Mon Apr 27 13:30:24 BST 2015
# User: admin
#



## Chassis Ethernet Interface ###################################################################################################

ethmgmt   Link encap:Ethernet  HWaddr 00:13:97:00:06:98
        inet addr:10.101.79.82  Bcast:10.101.255.255  Mask:255.255.0.0
        inet6 addr: fe80::213:97ff:fe00:698/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
        RX packets:70266 errors:0 dropped:25025 overruns:0 frame:0
        TX packets:957 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:1000
        RX bytes:37625091 (35.8 MiB)  TX bytes:172000 (167.9 KiB)


## NO IO CARDS FOUND ############################################################################################################



## Front Panel Version status ###################################################################################################

model            serial            xt-ver            primary-boot-ver            secondary-boot-ver            diag-ver
----------------------------------------------------------------------------------------------------------------------------------

1 record displayed


## Front Panel Environment status ###############################################################################################

state                                        temperatures                                voltages
----------------------------------------------------------------------------------------------------------------------------------

indeterminate
1 record displayed


## Fabric Card status ###########################################################################################################

name           model           serial           state                   speed           temperatures           voltages
----------------------------------------------------------------------------------------------------------------------------------
1                                               indeterminate           SDR
1 record displayed


## System Control Processor status ##############################################################################################

serial        cpu-usage        mem-usage        temperatures                                                    voltages
----------------------------------------------------------------------------------------------------------------------------------
            0                21.0988          hd_temp_current=27 hd_temp_maximum=27 hd_temp_minimum=27
1 record displayed


## Power supply status ##########################################################################################################

model              id              descr              state                         serial              vendor-model
----------------------------------------------------------------------------------------------------------------------------------
                 1                                  up/indeterminate
                 2                                  up/indeterminate
2 records displayed


## Fan controller status ########################################################################################################

model                  state                  serial-num                  actual-temp                  max-temp
----------------------------------------------------------------------------------------------------------------------------------
                     up
1 record displayed


## Fan status ###################################################################################################################

These messages can also be found in /var/log/user.log. In particular, note the "[WARNING] xcm::XCManager !!!!! FB type: Can't open file /tmp/xsigo-fabricready fb_type=-2" errors:

Apr 26 13:54:07 iop-3 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10g-3 [procmon::shutdowntimeout1] shutdown timeout in state WAITING_SHUTDOWN

Apr 26 13:54:07 iop-7 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10g-7 [procmon::shutdowntimeout1] shutdown timeout in state WAITING_SHUTDOWN

Apr 26 13:54:07 iop-3 processmonitor[524]: [NOTICE] procmon::ProcessMonitor vn10g-3 shutdown_timer set to 20 secs

Apr 26 13:54:07 iop-7 processmonitor[524]: [NOTICE] procmon::ProcessMonitor vn10g-7 shutdown_timer set to 20 secs

Apr 26 13:54:07 iop-5 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10x1g-5 [procmon::shutdowntimeout1] shutdown timeout in state WAITING_SHUTDOWN

Apr 26 13:54:07 iop-5 processmonitor[524]: [NOTICE] procmon::ProcessMonitor vn10x1g-5 shutdown_timer set to 20 secs

Apr 26 13:54:07 iop-3 vn2_agent[563]: [WARNING] vn2Debug xvnd disconnected, err 1, arg 0

Apr 26 13:54:07 iop-3 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10g-3 [procmon::fsmerror] FSM got error: Failed to read register pc on process 537: No such process

Apr 26 13:54:07 iop-3 vn2_agent[563]: [ERR] VN2::VN2agent vn10g-3 XCM connection Down

Apr 26 13:54:07 iop-1 processmonitor[524]: [WARNING] procmon::ProcessMonitor vn10g-1 [procmon::procterminated] Process $XSIGOROOT/bin/vn2_agent(552) terminated with signal SIGSEGV (Segmentation violation)

Apr 26 13:54:07 iop-3 processmonitor[524]: [WARNING] procmon::ProcessMonitor vn10g-3 [procmon::procterminated] Process $XSIGOROOT/bin/vn2_agent(554) terminated with signal SIGSEGV (Segmentation violation)

Apr 26 13:54:07 iop-5 processmonitor[524]: [ERR] procmon::ProcessMonitor vn10x1g-5 [procmon::fsmerror] FSM got error: Call to ptrace failed, errno: 3

Apr 26 13:54:07 iop-7 processmonitor[524]: [WARNING] procmon::ProcessMonitor vn10g-7 [procmon::procterminated] Process $XSIGOROOT/bin/vn2_agent(554) terminated with signal SIGSEGV (Segmentation violation)

Apr 26 13:54:07 iop-5 processmonitor[524]: [WARNING] procmon::ProcessMonitor vn10x1g-5 [procmon::procterminated] Process $XSIGOROOT/bin/vn2_agent(553) terminated with signal SIGSEGV (Segmentation violation)

 

Apr 26 13:54:33 ainf004 shutdown[2403]: shutting down for system reboot

Apr 26 13:54:33 ainf004 boot: [ALERT]  Rebooting Base OS now

Apr 26 13:54:33 ainf004 shutdown[2409]: shutting down for system reboot

Apr 26 13:58:00 ainf004 boot: [ALERT]  XSIGOS is starting up

 

After the chassis reboots, the Chassis Manager fails to start and the fabricready fb_type=-2 warning is logged:

 

Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm ERROR: /tmp/xgos_build/checkout.22448/include/xg-equipment-ImpChassisDataShell.cc(562):

Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm Mimm-Init: Failed to get a response from Chassis Mgr. Retrying ... 31

Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm

Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm ERROR: /tmp/xgos_build/checkout.22448/include/xg-equipment-ImpChassisDataShell.cc(1074):

Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm Mimm-Init: ChassisAddressResponse from Chassis Manager is invalid

Apr 26 14:14:07 ainf004 mimm[2720]: [ERR] Ximm

Apr 26 14:14:18 ainf004 xc_manager[2726]: [WARNING] xcm::XCManager !!!!! FB type: Can't open file /tmp/xsigo-fabricready fb_type=-2

Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm ERROR: /tmp/xgos_build/checkout.22448/include/xg-equipment-ImpChassisDataShell.cc(562):

Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm Mimm-Init: Failed to get a response from Chassis Mgr. Retrying ... 32

Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm

Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm ERROR: /tmp/xgos_build/checkout.22448/include/xg-equipment-ImpChassisDataShell.cc(1074):

Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm Mimm-Init: ChassisAddressResponse from Chassis Manager is invalid

Apr 26 14:14:37 ainf004 mimm[2720]: [ERR] Ximm

 

Errors similar to the following may also be seen:

Feb 16 09:15:39 ainf004 xc_manager[2590]: [ERR] xcm::XCManager MIMM_XCM_IPC_FAIL

Feb 16 09:15:40 ainf004 imagemanager: [NOTICE]  Checking for fabric upgrade

Feb 16 09:15:40 ainf004 imagemanager: [ERR]  **** Warning: Failed to update switch fabric

Feb 16 09:15:40 ainf004 processmonitor[2534]: [CRIT] procmon::ProcessMonitor [procmon::procdisconnect] Disconnect from $XSIGOROOT/bin/scriptclient (attached), starting killer timer

Feb 16 09:15:40 ainf004 processmonitor[2534]: [WARNING] procmon::ProcessMonitor [procmon::procexit] Process $XSIGOROOT/bin/scriptclient(2909) exited with status 0

Feb 16 09:15:55 ainf004 mimm[2584]: [ERR] Ximm ERROR: /xsigo/xsigos/tags/3.8.2-XGOS-39386/management/ximm/xmlcore/src/xg-xmlcore-xml.cc(783)

Feb 16 09:15:55 ainf004 mimm[2584]: [ERR] Ximm ERROR: /xsigo/xsigos/tags/3.8.2-XGOS-39386/management/ximm/xmlcore/src/xg-xmlcore-xml.cc(783):

Feb 16 09:15:55 ainf004 mimm[2584]: [ERR] Ximm non fatal error encountered for Element(efabric:Fabric) ximm::txbegintimeout     0xb5fe80e0
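The log signatures above can be searched for in a saved copy of /var/log/user.log. The following is a minimal Python sketch, not an Oracle-provided tool. The substrings are copied from the excerpts in this document. The function names and the rule for combining matches are assumptions made for illustration, and a match is only a hint that should be confirmed by Oracle Support from the diagnostic log bundle.

```python
# Substrings copied from the log excerpts in this document. The set is
# illustrative only and is not an official or exhaustive signature list.
SIGNATURES = {
    "fabricready": "Can't open file /tmp/xsigo-fabricready fb_type=-2",
    "chassis_mgr_timeout": "Failed to get a response from Chassis Mgr",
    "chassis_addr_invalid": "ChassisAddressResponse from Chassis Manager is invalid",
    "mimm_xcm_ipc_fail": "MIMM_XCM_IPC_FAIL",
}


def scan_user_log(lines):
    """Count how many lines contain each signature substring."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, needle in SIGNATURES.items():
            if needle in line:
                counts[name] += 1
    return counts


def looks_like_failed_fpp(counts):
    """Heuristic (an assumption, not an Oracle rule): the fabricready
    warning appears together with at least one Chassis Manager error."""
    chassis_errors = counts["chassis_mgr_timeout"] + counts["chassis_addr_invalid"]
    return counts["fabricready"] > 0 and chassis_errors > 0


def scan_file(path):
    """Scan a saved log file; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return scan_user_log(handle)
```

For example, `looks_like_failed_fpp(scan_file("user.log"))` would report whether a local copy of the log (the file name here is hypothetical) contains the combination shown in this document.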

 

 

Changes

The problem may follow a power outage in the datacenter or a Fabric Interconnect XgOS upgrade, which power cycles the Fabric Interconnect as part of the upgrade. A power cycle event is the most likely trigger.

Cause

System hardware information is lost when the Fabric Interconnect FPP (Front Panel Processor) / I2C MUX fails. The failure prevents the Fabric Interconnect as a whole from functioning, even though the system powers on, the fans spin, and the Fabric Interconnect CLI (Command Line Interface) is reachable. Oracle Support Engineers require the Fabric Interconnect diagnostic log bundle to prove that the cause is a failed FPP. However, running 'show hardware' from the 'admin' CLI may be sufficient if all the hardware information is missing, as in the CLI output included above.
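The 'show hardware' check described here can be approximated on a saved copy of the command output. The sketch below is in Python and assumes the header layout shown in the Symptoms section ("# Key: value" lines). The function names are hypothetical, and the rule that all four fields must be blank or zero is an assumption based on that single sample, not an Oracle-defined test.

```python
import re

# Matches an all-zero colon-separated address such as 00:00:00:00:00:00.
ZERO_ADDR = re.compile(r"^(?:00:)+00$")


def parse_header(text):
    """Extract 'Key: value' pairs from the single-'#' header lines."""
    fields = {}
    for raw in text.splitlines():
        line = raw.rstrip("\r").strip()
        # Skip non-header lines and the '## Section ###' banners.
        if not line.startswith("#") or line.startswith("##"):
            continue
        body = line.lstrip("#").strip()
        if ":" not in body:
            continue
        key, _, value = body.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def hardware_info_missing(fields):
    """True when Model and Serial are blank and both base addresses are
    all zeros, as in the sample output in this document (assumed rule)."""
    model_blank = not fields.get("Model")
    serial_blank = not fields.get("Serial")
    mac_zero = bool(ZERO_ADDR.match(fields.get("Base MAC", "")))
    wwn_zero = bool(ZERO_ADDR.match(fields.get("Base WWN", "")))
    return model_blank and serial_blank and mac_zero and wwn_zero
```

Passing the saved output through `hardware_info_missing(parse_header(text))` gives a quick first indication only. The diagnostic log bundle remains the basis for confirming a failed FPP.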

Solution

Replace the Front Panel, which contains the FPP (Front Panel Processor). For the replacement procedure, see How to replace a Gen2 Front Panel on Oracle Fabric Interconnects (Xsigo) (Doc ID 1663431.1).

 

 

 

References

How to replace a Gen2 Front Panel on Oracle Fabric Interconnects (Xsigo) (Doc ID 1663431.1)
How to Detect a Bad (Failed) Front Panel HCA (Doc ID 1516022.1)
How to Create/Upload Diagnostic Log File Bundle for an Oracle Fabric Interconnect (Doc ID 1517366.1)

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.