Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1991608.1
Update Date:2018-02-27
Keywords:

Solution Type  Problem Resolution Sure

Solution  1991608.1 :   IB Errors: Kernel: IB0: ipoib_cm_handle_tx_wc: failed cm send event: vend_err 81  


Related Items
  • Linux OS
  • Exalogic Elastic Cloud X4-2 Hardware
Related Categories
  • PLA-Support>Infrastructure>Operating Systems and Virtualization>Operating Systems>Oracle Linux




In this Document
Symptoms
Cause
Solution
References


Applies to:

Linux OS - Version Oracle Linux 5.8 with Unbreakable Enterprise Kernel [2.6.39] to Oracle Linux 6.9 with Unbreakable Enterprise Kernel [3.8.13] [Release OL5U8 to OL6U9]
Linux OS - Version Oracle Linux 6.6 with Unbreakable Enterprise Kernel [3.8.13] to Oracle Linux 7.3 with Unbreakable Enterprise Kernel [4.1.12] [Release OL6U6 to OL7U3]
Exalogic Elastic Cloud X4-2 Hardware - Version X4 to X4 [Release X4]
Linux x86-64

Symptoms

On the InfiniBand client, the following IB errors were logged for many hours:

kernel: ib0: ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=423 vend_err 81)
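
When many of these messages accumulate, it can help to pull the individual fields out of each line. The following is a minimal sketch in Python, assuming the message layout matches the sample line above. The function name parse_tx_failure and the returned field names are illustrative and are not part of any Oracle tool.

```python
import re

# Pattern for the IPoIB connected-mode send-failure message shown above.
# The layout is inferred from the sample line; vend_err is kept as a string
# because the sample line alone does not say how the value is encoded.
IPOIB_TX_FAIL = re.compile(
    r"kernel: (?P<ifname>ib\d+): ipoib_cm_handle_tx_wc: failed cm send event "
    r"\(status=(?P<status>\d+), wrid=(?P<wrid>\d+) "
    r"vend_err (?P<vend_err>[0-9a-fA-F]+)\)"
)


def parse_tx_failure(line):
    """Return the message fields as a dict, or None if the line does not match."""
    m = IPOIB_TX_FAIL.search(line)
    if m is None:
        return None
    return {
        "ifname": m.group("ifname"),
        "status": int(m.group("status")),
        "wrid": int(m.group("wrid")),
        "vend_err": m.group("vend_err"),
    }


sample = ("kernel: ib0: ipoib_cm_handle_tx_wc: failed cm send event "
          "(status=12, wrid=423 vend_err 81)")
print(parse_tx_failure(sample))
```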

Along with the above IB errors, some processes hung around the same time and I/O stopped:

kernel: INFO: task xxxx:nnnn blocked for more than 120 seconds.

Or even

kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
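
To see how widespread the problem is, the two kinds of messages can be counted together from a saved kernel log. The sketch below assumes the message formats shown in this section. The log excerpt, the task name "exampled", the PID 1234, and the helper name summarize are hypothetical and used only for illustration.

```python
import re
from collections import Counter

TX_FAIL = re.compile(r"(ib\d+): ipoib_cm_handle_tx_wc: failed cm send event")
HUNG_TASK = re.compile(r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds")


def summarize(lines):
    """Count send failures per interface and collect hung-task reports."""
    tx_failures = Counter()
    hung_tasks = []
    for line in lines:
        m = TX_FAIL.search(line)
        if m:
            tx_failures[m.group(1)] += 1
            continue
        m = HUNG_TASK.search(line)
        if m:
            # (task name, PID, blocked-for threshold in seconds)
            hung_tasks.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return tx_failures, hung_tasks


# Hypothetical excerpt: the second wrid, the task name, and the PID are placeholders.
example_log = [
    "kernel: ib0: ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=423 vend_err 81)",
    "kernel: ib0: ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=424 vend_err 81)",
    "kernel: INFO: task exampled:1234 blocked for more than 120 seconds.",
    'kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.',
]
tx, hung = summarize(example_log)
print(dict(tx), hung)
```

In practice the list of lines would come from a file such as /var/log/messages rather than an inline list.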

 

Cause

The vend_err 81 is a timeout due to no response.

In IB connected mode, this means that the transmit could not be acknowledged by the remote system (e.g. ZFS Storage Appliance), so this timeout event is logged.
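
The status=12 field in the message points the same way. As a hedged illustration, the sketch below maps a few work-completion status codes to their names, following the ordering of enum ib_wc_status in the upstream Linux kernel header include/rdma/ib_verbs.h. The table is deliberately partial, the helper name describe_status is hypothetical, and the values should be verified against the headers of the kernel actually in use.

```python
# Partial mapping of work-completion status codes to names, following the
# ordering of `enum ib_wc_status` in the upstream Linux kernel
# (include/rdma/ib_verbs.h). Assumption: the running kernel uses the same
# ordering; only a few entries are listed here.
IB_WC_STATUS = {
    0: "IB_WC_SUCCESS",
    5: "IB_WC_WR_FLUSH_ERR",
    12: "IB_WC_RETRY_EXC_ERR",      # transport retry counter exceeded
    13: "IB_WC_RNR_RETRY_EXC_ERR",  # receiver-not-ready retry counter exceeded
}


def describe_status(status):
    """Return the symbolic name for a status code, if it is in the partial table."""
    return IB_WC_STATUS.get(status, "unknown status %d" % status)


print(describe_status(12))
```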

The UEK2 kernel behaves the same as the UEK3 kernel; code review shows that the function ipoib_cm_handle_tx_wc() is unchanged between these Oracle Linux versions.

Solution

Check the InfiniBand components on the remote system (e.g. ZFS Storage Appliance), or the remote system as a whole, for any crash, reboot, or hang during the time frame of the errors.

References

<NOTE:1950074.1> - Oracle Linux: Warning Message "ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=## vend_err ##)"
<BUG:19781867> - OL 6.5/7.0 IB ERRORS: KERNEL: IB0: IPOIB_CM_HANDLE_TX_WC: FAILED CM SEND EVENT
<NOTE:2215794.1> - Oracle Linux : Error " kernel: ib1: ipoib_cm_handle_tx_wc: failed cm send event (status=12, wrid=459 vend_err 85) "

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.