Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2174550.1
Update Date:2016-09-28
Keywords:

Solution Type  Problem Resolution Sure

Solution  2174550.1 :   Exalogic X4-2 & X5-2 Solaris Compute Nodes Kernel Panic Due To Old IB HCA FW Version  


Related Items
  • Oracle Exalogic Elastic Cloud Software
  • Exalogic Elastic Cloud X5-2 Hardware
  • Exalogic Elastic Cloud X4-2 Hardware
Related Categories
  • PLA-Support>Eng Systems>Exalogic/OVCA>Oracle Exalogic>MW: Exalogic Core




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-12733574021>

Applies to:

Exalogic Elastic Cloud X4-2 Hardware - Version X4 to X4 [Release X4]
Exalogic Elastic Cloud X5-2 Hardware - Version X5 to X5 [Release X5]
Oracle Exalogic Elastic Cloud Software - Version 2.0.6.2.0 to 2.0.6.2.160719
Oracle Solaris on x86-64 (64-bit)

Symptoms

On Exalogic X4-2 and X5-2 racks running Solaris Physical release 2.0.6.X, Solaris compute nodes experience kernel panics.

The kernel panic stack below comes from the core dump captured during the issue. The trap occurs in genunix:ddi_dma_unbind_handle, called from hermon_ci_unmap_mem_iov in the hermon IB HCA driver, on the eib transmit-completion path.

<trap>int genunix:ddi_dma_unbind_handle+0x14((ddi_dma_handle_t)0xffffc2691bf83600)
ibt_status_t hermon:hermon_ci_unmap_mem_iov+0x45((ibc_hca_hdl_t)0xffffc10361793000, (ibt_mi_hdl_t)0xffffc281c0019ee8)
ibt_status_t ibtl:ibt_unmap_mem_iov+0x28((ibt_hca_hdl_t)0xffffc28183132880, (ibt_mi_hdl_t)0xffffc281c0019ee8)
void eib:eib_data_tx_comp+0x7b((eib_t *)0xffffc105766ed000, (eib_wqe_t *)0xffffc105776e0180)
uint_t eib:eib_data_tx_comp_handler+0xb5((caddr_t)0xffffc28181dba530, (caddr_t)0)
void unix:av_dispatch_softvect+0x62((uint_t)1)
void apix:apix_dispatch_softint+0x30((uint_t)0, (uint_t)0)

CAT(vmcore.2/11X)> modinfo ibtl
ID flags modctl textaddr size cnt name
104 LI 0xffffc2818151fcf8 0xfffffffff7c17000 0x1f9c8 1 ibtl (IB Transport Layer)
CAT(vmcore.2/11X)> modinfo eib
ID flags modctl textaddr size cnt name
207 LI 0xffffc2818220fc50 0xfffffffff8008000 0x24358 1 eib (Ethernet Over InfiniBand)
CAT(vmcore.2/11X)> modinfo hermon
ID flags modctl textaddr size cnt name
120 LI 0xffffc281815323a8 0xfffffffff7d04000 0xa538 1 hermon (ConnectX IB Driver)
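
When triaging a new core dump, it can help to check mechanically whether its panic stack shows the same call chain as the one above. The following is a minimal sketch, not an Oracle tool: the function names extract_frames and matches_signature are hypothetical, and the sketch assumes the stack is available as plain text in the "module:function+0xNN(...)" layout shown in this note.

```python
import re

# Frames taken from the panic stack in this note, innermost first.
SIGNATURE = [
    "genunix:ddi_dma_unbind_handle",
    "hermon:hermon_ci_unmap_mem_iov",
    "ibtl:ibt_unmap_mem_iov",
    "eib:eib_data_tx_comp",
]

# Matches "module:function+0xNN(" and captures "module:function".
FRAME_RE = re.compile(r"(\w+:\w+)\+0x[0-9a-fA-F]+\(")


def extract_frames(stack_text):
    """Return the module:function names in the order they appear."""
    return FRAME_RE.findall(stack_text)


def matches_signature(stack_text, signature=SIGNATURE):
    """Return True if the signature frames appear consecutively in the stack."""
    frames = extract_frames(stack_text)
    n = len(signature)
    return any(frames[i:i + n] == signature for i in range(len(frames) - n + 1))
```

A match only indicates that the stack resembles the one in this note; confirming the bug still requires analysis by Solaris Support.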

 

Cause

This issue is caused by known Bug 23543706 in IB HCA card firmware version 2.11.1282 on X4-2 and X5-2 racks.
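
Firmware versions should be compared numerically, field by field, rather than as strings. The sketch below illustrates this with the two versions named in this note. It rests on assumptions: the helper names are hypothetical, how the installed version is obtained is out of scope, and treating every version below 2.31.5350 as a candidate is an assumption (this note confirms only 2.11.1282 as affected).

```python
FIXED_FW = "2.31.5350"      # firmware named in the title of Bug 23543706
KNOWN_BAD_FW = "2.11.1282"  # version this note identifies as affected


def parse_fw(version):
    """Turn '2.11.1282' into (2, 11, 1282) so fields compare as numbers."""
    return tuple(int(part) for part in version.strip().split("."))


def needs_fw_update(installed, fixed=FIXED_FW):
    """Assumption: any version below the fixed firmware is a candidate."""
    return parse_fw(installed) < parse_fw(fixed)


if __name__ == "__main__":
    for fw in (KNOWN_BAD_FW, FIXED_FW):
        print(fw, "needs update:", needs_fw_update(fw))
```

Tuple comparison avoids the pitfall of plain string comparison, where a field such as "9" would sort after "31".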

Solution

If you encounter this issue, open a Service Request with Oracle Support for further assistance and guidance.

INTERNAL NOTE TO SUPPORT

Once the Solaris Support team confirms that the customer is hitting the known IB HCA card firmware issue described in Bug 23543706, contact Exalogic Development (Dev Prabhu) to obtain a one-off patch for the customer. The fix for Bug 23543706 will be included in the upcoming October 2016 PSU.

BUG 23543706 - IB CARD FIRMWARE V2.31.5350 FOR EXALOGIC X4-2 AND X5-2 COMPUTE NODES 

References

<BUG:20546737> - EIB_DATA_TX_COMP TRIPS OVER FREED DATA
<BUG:23543706> - IB CARD FIRMWARE V2.31.5350 FOR EXALOGIC X4-2 AND X5-2 COMPUTE NODES
<NOTE:1268557.1> - Exalogic Elastic Cloud Software Known Issues

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.