Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure
Solution 1952091.1: Fabric Interconnect 2x8 FC IO Module Repeatedly Crashing
Created from <SR 3-9956195031>

Applies to:
Oracle Fabric Interconnect F1-4 - Version All Versions to All Versions [Release All Releases]
Oracle Fabric Interconnect F1-15 - Version All Versions to All Versions [Release All Releases]
Oracle Virtual Compute Appliance X4-2 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms
The 2x8 FC IO Module crashes repeatedly. After running 'get-log-files -cores' on the Fabric Interconnect to collect all core files along with the normal diagnostic files, you will find something similar to the following. Note the diag and dmsg files for fccard_3:

Example:
diag_fccard_3_ts_1412259529
dmesg.1.gz
ib.log
opensm.log
syslog.log.1.gz
tech-support
wtmp.1.gz

Looking at the contents of dmsg_iocard-3_ts40_0 reveals something similar to:
<4>[42949388.000000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4020000) mask((0x377ffff)

-bash-3.2$ grep -i "IOCARD=3" user.log
Nov 28 19:27:45 iop-3 chassisAgt[547]: [NOTICE] chassisagt vhba2x8g-3 [IOCARD=3] (proc_equipmentStateMsg) component=fiberChannelChip, state=operStateUp stateQual=default
Nov 28 19:27:46 fpp chassisCtr[469]: [NOTICE] chassisctr fpp-1 [chassis::cardstatechange] (reportState) [IOCARD=3] Operational state changed. OldState=operStateInitializing, NewState=operStateUp, Qualifier=default
Nov 29 22:54:25 iop-3 chassisAgt[547]: [NOTICE] chassisagt vhba2x8g-3 [IOCARD=3] (proc_equipmentStateMsg) component=vhChip, state=operStateCriticalFailure stateQual=vhChip
Nov 29 22:54:27 iop-3 chassisAgt[547]: [NOTICE] chassisagt vhba2x8g-3 [IOCARD=3] (collectVHCardInfo) Saving diagnostics information. File = /var/log/coredumps/diag_fccard_3_ts_1417319666
Nov 29 22:54:31 fpp chassisCtr[469]: [ERR] chassisctr fpp-1 [chassis::cardfailure] [IOCARD=3] One or more HW components on IO boards are failed. Unable to Recover. Card turned off. OldState=operStateInitializing, NewState=operStateFailed, Qualifier=vhChip Reset counter=3
Nov 29 22:54:31 fpp chassisCtr[469]: [NOTICE] chassisctr fpp-1 [IOCARD=3] Power Down
Nov 29 22:54:31 fpp chassisCtr[469]: [NOTICE] chassisctr fpp-1 [chassis::cardstatechange] (reportState) [IOCARD=3] Operational state changed. OldState=operStateUp, NewState=operStateCriticalFailure, Qualifier=vhChip
Nov 29 22:54:32 fpp chassisCtr[469]: [NOTICE] chassisctr fpp-1 [chassis::cardstatechange] (reportState) [IOCARD=3] Operational state changed. OldState=operStateCriticalFailure, NewState=operStateFailed, Qualifier=vhChip
Nov 29 22:58:23 fpp chassisCtr[470]: [NOTICE] chassisctr fpp-1 [chassis::disconnect_iocard] [IOCARD=3] Chassis controller process received disconnect event from chassis agent.

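The card's path from operStateUp to operStateFailed can be pulled out of user.log programmatically. The following is a minimal sketch, not an Oracle-supplied tool: the function name `state_transitions` and the regular expression are assumptions based only on the log format in the excerpt above, and other software releases may format these entries differently.

```python
import re

# Pattern for the "Operational state changed" entries as they appear in the
# user.log excerpt above (an assumption; other releases may differ).
STATE_CHANGE = re.compile(
    r"\[IOCARD=(?P<card>\d+)\] Operational state changed\. "
    r"OldState=(?P<old>\w+), NewState=(?P<new>\w+), Qualifier=(?P<qual>\w+)"
)


def state_transitions(lines, card):
    """Return (old_state, new_state, qualifier) tuples for one IO card, in log order."""
    transitions = []
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match and int(match.group("card")) == card:
            transitions.append(
                (match.group("old"), match.group("new"), match.group("qual"))
            )
    return transitions


# Sample entries copied from the user.log excerpt above.
sample = [
    "Nov 28 19:27:46 fpp chassisCtr[469]: [NOTICE] chassisctr fpp-1 [chassis::cardstatechange] (reportState) [IOCARD=3] Operational state changed. OldState=operStateInitializing, NewState=operStateUp, Qualifier=default",
    "Nov 29 22:54:31 fpp chassisCtr[469]: [NOTICE] chassisctr fpp-1 [IOCARD=3] Power Down",
    "Nov 29 22:54:31 fpp chassisCtr[469]: [NOTICE] chassisctr fpp-1 [chassis::cardstatechange] (reportState) [IOCARD=3] Operational state changed. OldState=operStateUp, NewState=operStateCriticalFailure, Qualifier=vhChip",
    "Nov 29 22:54:32 fpp chassisCtr[469]: [NOTICE] chassisctr fpp-1 [chassis::cardstatechange] (reportState) [IOCARD=3] Operational state changed. OldState=operStateCriticalFailure, NewState=operStateFailed, Qualifier=vhChip",
]

for old, new, qual in state_transitions(sample, card=3):
    print(f"{old} -> {new} (qualifier: {qual})")
```

A final transition into operStateFailed with qualifier vhChip matches the failure signature described in this document.
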
-bash-3.2$ grep -i iop-3 syslog.log
Nov 29 22:22:13 iop-3 klogd: [96515.730000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:22:20 iop-3 klogd: [96522.600000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:22:45 iop-3 klogd: [96547.760000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:25:01 iop-3 klogd: [96683.010000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:25:32 iop-3 klogd: [96714.210000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:25:40 iop-3 klogd: [96722.020000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:27:56 iop-3 klogd: [96858.020000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:28:42 iop-3 -- MARK --
Nov 29 22:29:31 iop-3 klogd: [96952.840000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:31:14 iop-3 klogd: [97055.830000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:34:13 iop-3 klogd: [97233.970000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:35:16 iop-3 klogd: [97297.200000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:38:47 iop-3 klogd: [97508.500000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:39:45 iop-3 klogd: [97565.360000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:39:57 iop-3 klogd: [97577.200000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:42:24 iop-3 klogd: [97724.300000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:43:27 iop-3 klogd: [97786.980000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:46:12 iop-3 klogd: [97951.900000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:48:44 iop-3 -- MARK --
Nov 29 22:49:45 iop-3 klogd: [98164.840000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:49:54 iop-3 klogd: [98173.690000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:51:24 iop-3 klogd: [98263.420000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)
Nov 29 22:54:24 iop-3 klogd: [98443.460000] vh4_pcieif_block_intr_handler():FATAL: VH4_PCIEIF_INT_STATUS_LINK_DOWN link_status(0x7245) misc_int_status(0x100)
Nov 29 22:54:24 iop-3 klogd: [98443.470000] vh4_post_event(): vh_msg->event: 0xa5a5 (VH4_PCIEIF_INT_STATUS_LINK_DOWN)
Nov 29 22:54:24 iop-3 klogd: [98443.700000] vh4_pcieif_block_ql_handler():read_reg_config_dword,write_pcie_reg_config_dword,pcie_reg_config_dword not installed vh4_pcieif_block_nonfatal_handler(): Unsupported interrupt status(0x4000000)

Changes
No manual changes occurred in the environment.

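The pattern in the syslog.log excerpt under Symptoms, a run of spurious link-down warnings ending in a FATAL link-down, can be summarized with a short script. This is a rough sketch under stated assumptions: the function name `summarize_link_down` is hypothetical, the marker strings are copied from the excerpt, and each log entry is assumed to sit on a single line, as it does in the raw file.

```python
# Marker strings copied from the syslog.log excerpt in this document.
SPURIOUS = "Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN"
FATAL = "FATAL: VH4_PCIEIF_INT_STATUS_LINK_DOWN"


def summarize_link_down(lines, host="iop-3"):
    """Count spurious link-down warnings for `host` up to the first FATAL.

    Returns (spurious_count, fatal_line); fatal_line is None if no FATAL
    link-down entry was seen for that host.
    """
    spurious_count = 0
    for line in lines:
        if f" {host} " not in line:
            continue  # entry belongs to a different host
        if FATAL in line:
            return spurious_count, line
        if SPURIOUS in line:
            spurious_count += 1
    return spurious_count, None


# Sample entries copied from the syslog.log excerpt.
sample_syslog = [
    "Nov 29 22:49:54 iop-3 klogd: [98173.690000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)",
    "Nov 29 22:51:24 iop-3 klogd: [98263.420000] vh4_pcieif_block_intr_handler():Spurious VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4000000) mask((0x377ffff)",
    "Nov 29 22:54:24 iop-3 klogd: [98443.460000] vh4_pcieif_block_intr_handler():FATAL: VH4_PCIEIF_INT_STATUS_LINK_DOWN link_status(0x7245) misc_int_status(0x100)",
    "Nov 29 22:54:24 iop-3 klogd: [98443.470000] vh4_post_event(): vh_msg->event: 0xa5a5 (VH4_PCIEIF_INT_STATUS_LINK_DOWN)",
]

count, fatal = summarize_link_down(sample_syslog)
print(f"spurious link-down warnings before FATAL: {count}")
print(f"FATAL entry: {fatal}")
```

To run it against a real file, pass an open file handle, for example `summarize_link_down(open("syslog.log"))`. A nonzero count together with a FATAL entry corresponds to the behavior shown in the excerpt.
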
Cause
The following line indicates that the PCIe link between the VH4 (FPGA) and the QLogic (ASIC) is flapping:
VH4_PCIEIF_INT_STATUS_LINK_DOWN link status(0x7a44) status(0x4020000)
This is an indication of a HW issue.

Solution
RMA the 2x8 FC IO Module using this CAP (Canned Action Plan) document:

References
<NOTE:1518778.1> - How to Replace a Defective I/O Module on a Oracle Fabric Interconnect Chassis?

Attachments
This solution has no attachment