Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1678281.1
Update Date: 2018-01-08

Solution Type  Problem Resolution Sure

Solution  1678281.1 :   Pillar Axiom: Solaris 9 Reports APM Errors on /var/adm/messages  


Related Items
  • Pillar Axiom 600 Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Axiom>SN-DK: Ax600




Created from <SR 3-8421626671>

Applies to:

Pillar Axiom 600 Storage System - All Versions
Information in this document applies to any platform.

Symptoms


A Solaris 9 host continuously reports "HBA_STATUS_ERROR_INVALID_HANDLE" errors. The errors appear after a few days of normal operation following a system reboot. The error strings are recorded in /var/adm/messages and can also be collected via APM host traces.

 

axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-375-3108-xx-1                  status 3
axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-375-3108-xx-2                  status 3
axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-375-3108-xx-3                  status 3
axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-2200-4                  status 3
axiompmd[169]: [ID 702911 user.error] HBA_GetAdapterAttributes adapter 0 failed, status HBA_STATUS_ERROR_INVALID_HANDLE
axiompmd[169]: [ID 702911 user.error] HBA_GetAdapterAttributes adapter 1 failed, status HBA_STATUS_ERROR_INVALID_HANDLE
axiompmd[169]: [ID 702911 user.error] HBA_GetAdapterAttributes adapter 2 failed, status HBA_STATUS_ERROR_INVALID_HANDLE
axiompmd[169]: [ID 702911 user.error] HBA_GetAdapterAttributes adapter 3 failed, status HBA_STATUS_ERROR_INVALID_HANDLE
axiompmd[169]: [ID 702911 user.error] HBA_GetAdapterAttributes adapter 4 failed, status HBA_STATUS_ERROR_INVALID_HANDLE
axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-375-3108-xx-0                  status 3
axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-375-3108-xx-1                  status 3
axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-375-3108-xx-2                  status 3
axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-375-3108-xx-3                  status 3
axiompmd[169]: [ID 702911 user.error] Unable to retrieve HBA_GetAdapterAttributes name QLogic Corp.-2200-4                  status 3
axiompmd[169]: [ID 702911 user.error] Resetting Connection to 192.168.202.220 after Error in Response
axiompmd[169]: [ID 702911 user.error] SSL_read failed: returned 0, SSL_ERROR_SYSCALL; socket error 0, Error 0
axiompmd[169]: [ID 702911 user.error] ssl_recvMsg 192.168.202.220 failed MGMTILIB_SSL_ERROR
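
To gauge how often the condition recurs, the messages file can be searched for the strings shown above. The following is a minimal sketch, not an Oracle-supplied tool: the function names count_invalid_handle and failing_adapters are hypothetical, and the patterns assume the log lines match the excerpt above exactly.

```shell
#!/bin/sh
# Hypothetical helpers; the search strings are copied from the excerpt above.

# Print the number of axiompmd lines that report the invalid-handle status.
# $1: log file to scan (assumed default: /var/adm/messages)
count_invalid_handle() {
    log=${1:-/var/adm/messages}
    grep 'axiompmd' "$log" | grep -c 'HBA_STATUS_ERROR_INVALID_HANDLE'
}

# Print each adapter index that appears in an
# "HBA_GetAdapterAttributes adapter N failed" line, one per line, without repeats.
failing_adapters() {
    log=${1:-/var/adm/messages}
    sed -n 's/.*HBA_GetAdapterAttributes adapter \([0-9][0-9]*\) failed.*/\1/p' "$log" |
        sort -n -u
}
```

For example, after loading these functions into a shell, "count_invalid_handle /var/adm/messages" prints the number of matching lines and "failing_adapters /var/adm/messages" lists the affected adapter indices.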

  

Cause

APM first calls HBA_GetNumberOfAdapters(), which succeeds. For each adapter, it then calls HBA_GetAdapterName(), which also succeeds. For each name, it calls HBA_OpenAdapter(), which succeeds and returns a handle to the adapter object. APM then passes that handle to HBA_GetAdapterAttributes(), which fails with status HBA_STATUS_ERROR_INVALID_HANDLE.

This failure results from one or more resource leaks in the HBA API library, or possibly in the IMA library, which this daemon also uses. Several such problems have been seen over the years. All known ones are fixed in recent updates to Solaris 11, but presumably not all of the fixes have been back-ported to Solaris 9.
 

Solution

APM cannot provide a fix for this problem because it originates in the OS storage drivers and libraries. The fix should come from the Solaris storage drivers team in the latest SFK (SAN Foundation Kit) patches. Check the SAN patches installed on the host and update to the latest released patches.

The APM team has provided the following workaround for this problem:

1 - Restart the APM daemon/service to clear the error condition. The restart is not disruptive.

  1.a - Stop the APM daemon with the "/etc/rc2.d/S31axiompmd stop" command.

  1.b - Start APM again with the "/etc/rc2.d/S31axiompmd start" command.

2 - If restarting the daemon does not clear the errors, restart the host.
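
The daemon restart in step 1 can be wrapped in a small function so that the stop and start always run in order. This is a minimal sketch under stated assumptions: the function name restart_apm and the APM_INIT and APM_WAIT variables are hypothetical, and the pause between stop and start is an assumption, not a documented requirement. Only the init script path and its stop/start arguments come from the workaround above.

```shell
#!/bin/sh
# Init script path taken from the workaround; APM_INIT is a hypothetical override.
APM_INIT=${APM_INIT:-/etc/rc2.d/S31axiompmd}

# Stop the APM daemon, wait briefly, then start it again.
# Returns non-zero if the init script is missing or the stop step fails.
restart_apm() {
    if [ ! -x "$APM_INIT" ]; then
        echo "APM init script not found or not executable: $APM_INIT" >&2
        return 1
    fi
    "$APM_INIT" stop || return 1
    # Assumed pause (seconds) to let the daemon exit before it is started again.
    sleep "${APM_WAIT:-5}"
    "$APM_INIT" start
}
```

Run restart_apm as root after loading the function. If the errors return after the restart, continue with step 2.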



References

<NOTE:1500298.1> - Pillar Axiom: How to collect AxiomOne Path Manager (APM) logs on a host
<BUG:18403940> - SOLARIS HBA API RETURNS HBA_STATUS_ERROR_INVALID_HANDLE
<NOTE:1906876.1> - Pillar Axiom: How to Collect a System Information Log and Transfer it to Oracle

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.