Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2280208.1
Update Date: 2017-07-25

Solution Type  Problem Resolution Sure

Solution  2280208.1 :   Exalogic : "Bad Q_Key" messages reported in opensm.log from Exalogic compute nodes


Related Items
  • Exalogic Elastic Cloud X5-2 Hardware
  • Exalogic Elastic Cloud X4-2 Hardware
  • Oracle Exalogic Elastic Cloud Software
Related Categories
  • PLA-Support>Eng Systems>Exalogic/OVCA>Oracle Exalogic>MW: Exalogic Core




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-14715216201>

Applies to:

Oracle Exalogic Elastic Cloud Software - Version 2.0.6.2.161018 to 2.0.6.2.170117
Exalogic Elastic Cloud X4-2 Hardware - Version X4 to X4 [Release X4]
Exalogic Elastic Cloud X5-2 Hardware - Version X5 to X5 [Release X5]
Linux x86-64
Oracle Virtual Server (64-bit)

Symptoms

In an Exalogic X5-2 environment, the /var/log filesystem on both NM2-GW switches fills up quickly.

[root@XXXXX01gw01 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda5 418M 382M 23M 95% /
tmpfs 247M 268K 247M 1% /dev/shm
/dev/sda3 16M 16M 0M 100% /var/log
/dev/sda2 16M 1.6M 14M 11% /config
tmpfs 247M 1.0M 246M 1% /tmp
[root@XXXXX01gw01 ~]#

[root@XXXXX01gw02 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda5 418M 382M 23M 95% /
tmpfs 247M 268K 247M 1% /dev/shm
/dev/sda3 16M 16M 0M 100% /var/log
/dev/sda2 16M 1.6M 14M 11% /config
tmpfs 247M 1020K 246M 1% /tmp
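
The check above can be automated. The following is a minimal sketch, assuming the `df -h` output has been captured as text and copied off the switch. The function name `full_filesystems` and the 90% threshold are illustrative choices, not part of any Oracle tool.

```python
def full_filesystems(df_output, threshold=90):
    """Return (mount point, use%) pairs at or above the threshold."""
    flagged = []
    for line in df_output.strip().splitlines():
        fields = line.split()
        # Keep only six-column data rows; this skips the header row
        # and any trailing shell prompt captured with the output.
        if len(fields) != 6:
            continue
        pct = fields[4]
        if not (pct.endswith("%") and pct[:-1].isdigit()):
            continue
        use_pct = int(pct[:-1])
        if use_pct >= threshold:
            flagged.append((fields[5], use_pct))
    return flagged


# df -h output from gw01, as shown in this note.
SAMPLE = """\
Filesystem Size Used Avail Use% Mounted on
/dev/sda5 418M 382M 23M 95% /
tmpfs 247M 268K 247M 1% /dev/shm
/dev/sda3 16M 16M 0M 100% /var/log
/dev/sda2 16M 1.6M 14M 11% /config
tmpfs 247M 1.0M 246M 1% /tmp
"""

print(full_filesystems(SAMPLE))
```

With the gw01 output from this note, the sketch flags / at 95% and /var/log at 100%.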

A check of the non-empty opensm logs shows that they are being filled by "Bad Q_Key" messages from all compute nodes. The following example is from a quarter-rack X5-2 connected to Exadata.

## GW01 ##

opensm.log.1.5 May 11 21:05:41 ~ May 11 21:17:30
opensm.log.1.4 May 11 21:17:30 ~ May 11 21:29:29
opensm.log.1.3 May 11 21:29:29 ~ May 11 21:41:29
opensm.log.1.2 May 11 21:41:29 ~ May 11 21:52:28
opensm.log.1.1 May 11 21:52:28 ~ May 11 22:01:11
opensm.log.1 <== empty
opensm.log <== empty

The "Bad Q_Key" messages come from all 8 Exalogic compute nodes. The message counts per node for the following period are:

May 11 21:05:41 ~ May 12 10:48:57

6660 (XXXXX01cn01
6659 (XXXXX01cn02
6660 (XXXXX01cn03
13318 (XXXXX01cn04
13320 (XXXXX01cn05
13322 (XXXXX01cn06
6660 (XXXXX01cn07
6660 (XXXXX01cn08

## GW02 ##

opensm.log.1.5 May 11 22:42:57 ~ May 11 22:50:57
opensm.log.1.4 May 11 22:50:57 ~ May 11 22:58:56
opensm.log.1.3 May 11 22:58:56 ~ May 11 23:05:41
opensm.log.1.2 May 11 23:05:41 ~ May 11 23:13:09
opensm.log.1.1 May 11 23:13:09 ~ May 11 23:21:27
opensm.log.1 <== empty
opensm.log <== empty
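
To see how quickly the logs rotate, the time covered by each rotated file can be computed from the listings for GW01 and GW02. The following is a minimal sketch. The listings do not include a year, so 2017 is assumed only to make the timestamps parseable, and the names `span_seconds` and `spans` are illustrative.

```python
from datetime import datetime


def span_seconds(start, end, year=2017):
    """Seconds between two 'Mon DD HH:MM:SS' stamps (year is an assumption)."""
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{year} {start}", fmt)
    t1 = datetime.strptime(f"{year} {end}", fmt)
    return int((t1 - t0).total_seconds())


# Rotated-log time ranges copied from the listings in this note.
GW01 = [
    ("opensm.log.1.5", "May 11 21:05:41", "May 11 21:17:30"),
    ("opensm.log.1.4", "May 11 21:17:30", "May 11 21:29:29"),
    ("opensm.log.1.3", "May 11 21:29:29", "May 11 21:41:29"),
    ("opensm.log.1.2", "May 11 21:41:29", "May 11 21:52:28"),
    ("opensm.log.1.1", "May 11 21:52:28", "May 11 22:01:11"),
]
GW02 = [
    ("opensm.log.1.5", "May 11 22:42:57", "May 11 22:50:57"),
    ("opensm.log.1.4", "May 11 22:50:57", "May 11 22:58:56"),
    ("opensm.log.1.3", "May 11 22:58:56", "May 11 23:05:41"),
    ("opensm.log.1.2", "May 11 23:05:41", "May 11 23:13:09"),
    ("opensm.log.1.1", "May 11 23:13:09", "May 11 23:21:27"),
]


def spans(listing):
    """Per-file coverage in seconds, in listing order."""
    return [span_seconds(start, end) for _name, start, end in listing]


for label, listing in (("GW01", GW01), ("GW02", GW02)):
    secs = spans(listing)
    print(label, secs, "total", sum(secs))
```

On these listings, each rotated file covers roughly 7 to 12 minutes, and the five rotated files together cover less than one hour on either switch.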

The "Bad Q_Key" messages come from all 8 Exalogic compute nodes. The message counts per node for the following period are:

May 11 22:42:57 ~ May 11 23:21:27

9268 (XXXXX01cn01
9260 (XXXXX01cn02
9256 (XXXXX01cn03
9257 (XXXXX01cn04
9262 (XXXXX01cn05
9262 (XXXXX01cn06
9248 (XXXXX01cn07
9252 (XXXXX01cn08
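
Per-node counts like those above can be produced from a copy of the opensm log. The following is a minimal sketch. This note does not show the raw log lines, so the sample lines and the pattern `NODE_RE` are hypothetical: they assume only that each "Bad Q_Key" line contains the node name after an opening parenthesis, as the counts above suggest. Adjust the pattern to the actual log format.

```python
import re
from collections import Counter

# Assumption: the node name follows "(" and ends in "cn<digits>".
NODE_RE = re.compile(r"\(([A-Za-z0-9_-]+cn\d+)")


def count_bad_qkey(lines):
    """Count "Bad Q_Key" lines per compute node name."""
    counts = Counter()
    for line in lines:
        if "Bad Q_Key" not in line:
            continue
        match = NODE_RE.search(line)
        counts[match.group(1) if match else "<unknown>"] += 1
    return counts


# Hypothetical lines for illustration only; not real opensm output.
SAMPLE_LINES = [
    "May 11 21:05:41 hypothetical: (Bad Q_Key) from (XXXXX01cn01 HCA-1)",
    "May 11 21:05:42 hypothetical: (Bad Q_Key) from (XXXXX01cn02 HCA-1)",
    "May 11 21:05:43 hypothetical: unrelated message (XXXXX01cn02 HCA-1)",
    "May 11 21:05:44 hypothetical: (Bad Q_Key) from (XXXXX01cn01 HCA-1)",
]

for node, total in sorted(count_bad_qkey(SAMPLE_LINES).items()):
    print(total, node)
```

Lines without "Bad Q_Key" are ignored, and lines where no node name is found are counted under `<unknown>` so that they are not silently dropped.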

As a result of this issue:

  • The GW switches are compromised because the opensmd and whereismasterd services do not function properly.
  • There is no master SM in the IB fabric.
  • Exalogic operations such as creating guest vServers and restarting the Control Stack do not work properly.
  • Exadata outages occur, such as the default IPoIB network going missing on compute and cell nodes.

Changes

The system was upgraded to the October 2016 PSU or the January 2017 PSU.

Cause

This issue is caused by the following bug, which affects Exalogic X4 and X5 systems after the October 2016 PSU or the January 2017 PSU is applied.

BUG 25720649 - "Bad Q_Key" reporting on opensm.log from nodes

Solution

The fix for Bug 25720649 is included in the April 2017 PSU.

Exalogic Infrastructure April 2017 PSU – Fixed Bugs List (Doc ID 2251392.1)

Upgrade to the April 2017 PSU or later.
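
To decide whether a rack needs the upgrade, the installed software version can be compared with the affected range given under "Applies to" (2.0.6.2.161018 to 2.0.6.2.170117). The following is a minimal sketch, assuming the version is available as a dotted string. The function names are illustrative, and the out-of-range versions used in the usage lines are hypothetical values, not confirmed PSU version strings.

```python
def parse_version(text):
    """Turn a dotted version such as '2.0.6.2.161018' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Affected range as stated under "Applies to" in this note.
AFFECTED_LOW = parse_version("2.0.6.2.161018")
AFFECTED_HIGH = parse_version("2.0.6.2.170117")


def in_affected_range(version):
    """True if the version falls inside the affected range (inclusive)."""
    return AFFECTED_LOW <= parse_version(version) <= AFFECTED_HIGH


print(in_affected_range("2.0.6.2.161018"))  # lower bound of the range
print(in_affected_range("2.0.6.2.160101"))  # hypothetical older version
print(in_affected_range("2.0.6.2.999999"))  # hypothetical newer version
```

A version outside this range does not by itself confirm that the fix is present. The guidance in this note remains to run the April 2017 PSU or later.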

Exalogic Patch Set Updates (PSU) Master Note (Doc ID 1314535.1)

 

References

<BUG:25720649> - "BAD Q_KEY" REPORTING ON OPENSM.LOG FROM NODES
<NOTE:2251392.1> - Exalogic Infrastructure April 2017 PSU – Fixed Bugs List
<NOTE:2156050.1> - Known Issues for Exalogic 2.0.6.2.X Patch Set Updates
<NOTE:1314535.1> - Exalogic Patch Set Updates (PSU) Master Note

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.