Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1542550.1 : Sun Storage 7000 Unified Storage System: Communication with the cluster peer via a cluster interconnect link has been lost
Created from <SR 3-7001849821>

Applies to:
Oracle ZFS Storage ZS3-4 - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage ZS3-BA - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage ZS4-4 - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage Appliance Racked System ZS4-4 - Version All Versions to All Versions [Release All Releases]
Sun ZFS Storage 7320 - Version All Versions to All Versions [Release All Releases]
7000 Appliance OS (Fishworks)

Symptoms

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - Disk Storage ZFS Storage Appliance
NOTE: To confirm that the cluster 'links' cabling is correctly configured - See Document ID 2081179.1
The alert below may be seen on a 7000 Series ZFS Storage Appliance running in a cluster configuration. The issue was reported on AK firmware version 2011.1.x.

SUNW-MSG-ID: AK-8001-RK, TYPE: alert, VER: 1, SEVERITY: Minor
EVENT-TIME: Thu Jan 31 15:42:59 2013
PLATFORM: i86pc, CSN: <serialno>, HOSTNAME: <hostname>
SOURCE: svc:/appliance/kit/akd:default, REV: 1.0
EVENT-ID: 665840e4-94f8-6516-eb1a-f90c0edd0c59
DESC: Communication with the cluster peer via a cluster interconnect link has been lost.
AUTO-RESPONSE: None.
IMPACT: Cluster reliability is impaired. If the cluster peer is functioning normally but no cluster interconnects remain active, arbitrary and unwanted cluster takeover may occur.
REC-ACTION: Check the cluster interconnect cables and the state of the cluster peer. Contact your vendor for support if an interconnect link remains inexplicably down.
PLEASE NOTE: After upgrade to 2013.1.6.0, 'cluster link down' alerts are reported even on a 'normal' reboot. See MOS Doc ID 2195659.1 (See also - Bug 23092294 clustron component fault shows up in problems while links are still active)
Changes
Cause

Check whether a support bundle was being generated on the cluster peer head at the time the alert was raised. One possible cause is 'gcore' taking more than 30 seconds to collect the akd core while a support bundle is being generated.

Collecting a support bundle (or a manual 'gcore' executed by a Technical Support Engineer) can trigger this alert on the peer head. Support bundle collection runs 'gcore' against the akd process, which freezes akd so that a consistent memory image can be written to the core file. Once the core file has been created, akd is unfrozen and normal operation resumes. Heartbeats over the dlpi link (ethernet) stop while akd is frozen. Serial port heartbeats continue, because they use the clustron kernel driver and are not affected by the akd process. The peer head notices that the heartbeats on the dlpi link have stopped and posts an alert (alert.ak.xmlrpc.cluster.link.down) after cio_alert_delay (30 seconds by default).

Example of a manual 'gcore' of the akd process:
cd /tmp; gcore -o akd `pgrep -ox akd` && mv /tmp/akd.`pgrep -ox akd` /var/ak/dropbox
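The timing relationship described above can be sketched as a small shell helper. This is only an illustration, not part of the appliance software: the function names `link_alert_expected` and `timed_run` and the shell variable `CIO_ALERT_DELAY` are hypothetical, and the 30-second default is taken from the text. It also assumes the alert is posted only when the pause is strictly longer than the delay, which follows the "more than 30 seconds" wording but has not been verified against akd.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; they do not query the appliance.

# Assumed default, mirroring cio_alert_delay (30 seconds) from the text.
CIO_ALERT_DELAY=${CIO_ALERT_DELAY:-30}

# link_alert_expected PAUSE_SECONDS [DELAY_SECONDS]
# Prints "alert" if akd was frozen for longer than the delay, else "no-alert".
link_alert_expected() {
    pause=$1
    delay=${2:-$CIO_ALERT_DELAY}
    if [ "$pause" -gt "$delay" ]; then
        echo "alert"
    else
        echo "no-alert"
    fi
}

# timed_run COMMAND [ARGS...]
# Runs a command, discards its output, and prints the elapsed whole seconds.
timed_run() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}
```

For example, `link_alert_expected 45` prints `alert` and `link_alert_expected 10` prints `no-alert`. The elapsed time of any command can be fed in with `link_alert_expected "$(timed_run sleep 2)"`, which gives a rough idea of whether a pause of that length would outlast the delay.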
Solution

If the issue matches the cause described in the section above, the alerts are known not to affect cluster functionality or reliability and can be ignored.

Bug 16083259 was closed as a duplicate of Bug 21224255, which is fixed in 2013.1.6.0.

References

<BUG:16083259> - GCORE OF AKD SLOWER IN ZFS. CAUSES ALERT.AK.XMLRPC.CLUSTER.LINK.DOWN ON PEER
<NOTE:1402545.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Cluster Problems
<NOTE:2021771.1> - Oracle ZFS Storage Appliance: Software Updates
<BUG:21224255> - UNRESOLVED CLUSTRON LINK DOWN STATE SHOULD BE TRACKED AS A FMA FAULT
<NOTE:2262465.1> - Oracle ZFS Storage Appliance: Reboot "Unexpectedly found SAS zone locks held"

Attachments

This solution has no attachment