Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure

Solution 1614576.1: What Does "Hermon0: CQE Transport Retry Counter Exceeded" in /var/log/messages File Mean?
Created from <SR 3-8280566671>

Applies to:
Oracle Exadata Hardware - Version 11.2.3.1.1 and later
SPARC SuperCluster T4-4 Half Rack - Version All Versions and later
Solaris SPARC Operating System - Version 11.1 to 11.1 [Release 11.0]
Information in this document applies to any platform.

Symptoms
The following entry appears in the /var/log/messages file. What does it mean?

hermon0: CQE transport retry counter exceeded

or

WARNING: mcxnex0: CQE ERR: cqe fffff61bcf02f280 QPN 4000f7 indx 14 status 0x15 vendor syndrome 81
WARNING: mcxnex0: CQE transport retry counter exceeded

Cause
The message "hermon0: CQE transport retry counter exceeded" comes from the InfiniBand HCA driver and means that the IB connection has gone down.

Solution
When an InfiniBand stack client (a ULP, or upper layer protocol) creates a queue pair (QP), it specifies a retry timeout and a retry count to the hardware. These two values dictate what the HCA hardware/firmware does once a message is sent via that QP onto the fabric: if no acknowledgement arrives within the timeout, the HCA resends the message, and once the retry count is exhausted it gives up and reports the "transport retry counter exceeded" error.

In other words, the remote end of the connection stopped responding. If you are in the midst of a maintenance operation during which the connection may be restarted, for example a remote machine reboot, this message is expected. The message is purely informational as long as it does not occur frequently and unexpectedly.
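
To judge whether the message occurs "frequently and unexpectedly", it can help to count occurrences per hour and per device and compare them against known maintenance windows. The following is a minimal sketch, not an Oracle-supplied tool. The function names `count_retry_errors` and `report` are hypothetical, and the sketch assumes the classic syslog timestamp prefix (`Mon DD HH:MM:SS`) and the two driver names shown in this document (hermon, mcxnex).

```python
import re
from collections import Counter

# Matches the summary message from either driver named in this document.
RETRY_RE = re.compile(
    r"\b(hermon\d+|mcxnex\d+): CQE transport retry counter exceeded",
    re.IGNORECASE,
)

# Assumption: lines start with a classic syslog stamp such as "May  3 10:15:01".
# The captured group keeps only month, day and hour.
STAMP_RE = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}):\d{2}:\d{2}\b")


def count_retry_errors(lines):
    """Return a Counter mapping (hour_stamp, device) to the number of matches.

    Lines without a recognizable timestamp are grouped under "unknown".
    """
    counts = Counter()
    for line in lines:
        match = RETRY_RE.search(line)
        if not match:
            continue
        stamp = STAMP_RE.match(line)
        hour = stamp.group(1) if stamp else "unknown"
        counts[(hour, match.group(1).lower())] += 1
    return counts


def report(path):
    """Print one tab-separated line per (hour, device) pair found in the file."""
    with open(path, errors="replace") as handle:
        counts = count_retry_errors(handle)
    # Sorted as plain strings, which is not strictly chronological across months.
    for (hour, device), total in sorted(counts.items()):
        print(hour, device, total, sep="\t")
```

Calling `report("/var/log/messages")` prints the hour stamp, the device, and the number of occurrences. A handful of entries that line up with a planned reboot of a remote node matches the expected case described above, while entries spread across many hours with no maintenance activity suggest the fabric deserves a closer look.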

Attachments
This solution has no attachment