Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1406959.1
Update Date: 2012-04-10

Solution Type  Problem Resolution Sure

Solution  1406959.1 :   Sun ZFS Storage Appliance: OpenSM messages continually sent to Subnet Manager in Exadata environment  


Related Items
  • Oracle Exadata Hardware
  • Sun Datacenter InfiniBand Switch 72
  • Sun InfiniBand Switch 9P
  • Sun Datacenter InfiniBand Switch 36
  • Sun Datacenter InfiniBand Switch 648
  • Sun Datacenter Switch 3456
  • Sun Network QDR InfiniBand Gateway Switch
  • Sun Datacenter Switch 3x24
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS
  • _Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage
  • _Old GCS Categories>Sun Microsystems>Switches>Sun InfiniBand IB




In this Document
  Symptoms
  Cause
  Solution


Applies to:

Sun Datacenter InfiniBand Switch 36 - Version: Not Applicable and later [Release: N/A and later]
Sun Datacenter InfiniBand Switch 648 - Version: Not Applicable and later [Release: N/A and later]
Sun Datacenter InfiniBand Switch 72 - Version: Not Applicable and later [Release: N/A and later]
Sun Datacenter Switch 3456 - Version: Not Applicable and later [Release: N/A and later]
Sun Network QDR InfiniBand Gateway Switch - Version: Not Applicable and later [Release: N/A and later]
Information in this document applies to any platform.

Symptoms

- ZFSSA has an HCA (4242A)
- Subnet Manager (running on Linux)
- OpenSM 3.3.9_MLNX_20110704_c986c53

The ZFSSA appears to be sending error messages continuously to the Subnet Manager, and the customer would like these messages to stop.
The details of the error message are:

--------------------------------------------------------
OpenSM 3.3.9_MLNX_20110704_c986c53

Dec 08 15:06:57 404517 [B1D9A700] 0x01 -> sa_mad_ctrl_send_err_callback: ERR 1A06: MAD transaction completed in error
Dec 08 15:06:57 404523 [B1D9A700] 0x01 -> SA MAD dump:
                base_ver................0x1
                mgmt_class..............0x3
                class_ver...............0x2
                method..................0x6 (SubnAdmReport)
                status..................0x0
                resv....................0x0
                trans_id................0x2db0890
                attr_id.................0x2 (Notice)
                resv1...................0x0
                attr_mod................0x0
                rmpp_version............0x0
                rmpp_type...............0x0
                rmpp_flags..............0x0
                rmpp_status.............0x0
                seg_num.................0x0
                payload_len/new_win.....0x0
                sm_key..................0x0000000000000000
                attr_offset.............0x0
                resv2...................0x0
                comp_mask...............0x0000000000000000

# Subnet Manager is receiving this error message continuously
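
To check that a logged dump is one of these messages, its fields can be read programmatically. The following is a minimal Python sketch, not an Oracle or OpenSM tool: the helper names parse_mad_dump and is_notice_report are hypothetical, the field layout is assumed to match the excerpt above (name, a run of dots, a hex value, an optional label in parentheses), and the values tested (mgmt_class 0x3, method 0x6, attr_id 0x2) are taken directly from that excerpt.

```python
import re

# One dump field per line: name, a run of dots, a hex value, optional "(label)".
# This layout is an assumption based on the log excerpt in this document.
FIELD_RE = re.compile(
    r"^\s*([A-Za-z0-9_/]+)\.{2,}(0x[0-9a-fA-F]+)(?:\s+\((\w+)\))?\s*$"
)


def parse_mad_dump(text):
    """Return {field_name: (int_value, label_or_None)} for a dump body."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            name, value, label = match.groups()
            fields[name] = (int(value, 16), label)
    return fields


def is_notice_report(fields):
    """True if the dump matches the values shown in the excerpt:
    mgmt_class 0x3, method 0x6 (SubnAdmReport), attr_id 0x2 (Notice)."""
    def value(name):
        return fields.get(name, (None, None))[0]

    return (
        value("mgmt_class") == 0x3
        and value("method") == 0x6
        and value("attr_id") == 0x2
    )


# Three lines copied from the excerpt, used here as sample input.
SAMPLE = """\
                mgmt_class..............0x3
                method..................0x6 (SubnAdmReport)
                attr_id.................0x2 (Notice)
"""

fields = parse_mad_dump(SAMPLE)
print(is_notice_report(fields))  # prints True for the sample above
```

A dump whose method or attr_id differs from these values is a different message and is outside the scope of this document.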

Cause

The messages shown above are related to event forwarding, not to a timeout while retrieving NodeInfo, as is commonly misunderstood.
The timeout issue was addressed by CR 6923287.
This issue is not related to that CR.

That said, the messages referred to are not known to cause any harm. They are only "annoying" entries in the log.


Solution

Ignore the above message.
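
If the repeated entries make the log hard to read, they can be hidden when viewing a copy of the log. The following is a minimal Python sketch under stated assumptions, not an OpenSM feature or an Oracle-supplied tool: the helper name filter_log is hypothetical, and the log layout is assumed to match the excerpt in the Symptoms section (an ERR 1A06 line, followed by an "SA MAD dump:" line and indented field lines). It drops every ERR 1A06 entry and the dump that follows it, without inspecting the dump fields, and it does not change what the Subnet Manager writes.

```python
import re

# Patterns are assumptions based on the log excerpt in this document.
ERR_RE = re.compile(r"sa_mad_ctrl_send_err_callback: ERR 1A06")
DUMP_HEADER_RE = re.compile(r"SA MAD dump:\s*$")
FIELD_RE = re.compile(r"^\s+[A-Za-z0-9_/]+\.{2,}0x[0-9a-fA-F]+")


def filter_log(lines):
    """Return the lines with ERR 1A06 entries and their MAD dumps removed."""
    kept = []
    state = "normal"  # one of: "normal", "after_err", "in_dump"
    for line in lines:
        # Indented field lines belonging to a dump we are suppressing.
        if state == "in_dump" and FIELD_RE.match(line):
            continue
        if ERR_RE.search(line):
            # The dump header may be fused onto the same line as the error.
            state = "in_dump" if DUMP_HEADER_RE.search(line) else "after_err"
            continue
        if state == "after_err" and DUMP_HEADER_RE.search(line):
            state = "in_dump"
            continue
        state = "normal"
        kept.append(line)
    return kept


# The first and last lines are made-up placeholders for unrelated entries.
example = [
    "unrelated entry before\n",
    "Dec 08 15:06:57 404517 [B1D9A700] 0x01 -> sa_mad_ctrl_send_err_callback: "
    "ERR 1A06: MAD transaction completed in error\n",
    "Dec 08 15:06:57 404523 [B1D9A700] 0x01 -> SA MAD dump:\n",
    "                method..................0x6 (SubnAdmReport)\n",
    "                attr_id.................0x2 (Notice)\n",
    "unrelated entry after\n",
]
filtered = filter_log(example)
```

In this example, filtered contains only the two unrelated entries. The original log file is left untouched, so the full record remains available if it is needed for a service request.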

Note also that Oracle may not officially support the OpenSM version in use.

The following background on OpenSM was learned in the SR associated with this document.

OpenSM is an open source implementation of a Subnet Manager (SM), administered through the OpenFabrics Alliance.
Many companies in the industry contribute to the software delivered by this organization.

A modified version of OpenSM runs embedded on Oracle's InfiniBand switches.
Oracle has added features to make OpenSM integrate better with higher-level management software (typically for all the Exa* programs).

Oracle supports the embedded OpenSM, but does not support OpenSM in general.
That kind of support must be provided by the company or organization supplying the OpenSM version in question.
Oracle makes a few exceptions to this for key HPC customers who use Oracle IB switches as a key component in their systems.

If a customer is using neither an Oracle IB switch nor an OpenSM version supplied by Oracle,
then Oracle does not have a release vehicle for such software in the host environment.



Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.