Oracle ZFS Storage Appliance: After performing a "maintenance system restart", the system reported a Cluster Rejoin

Asset ID:	1-72-2044471.1
Update Date:	2017-10-05
Keywords:

Solution Type:	Problem Resolution Sure

Solution 2044471.1 : Oracle ZFS Storage Appliance: After performing a "maintenance system restart", the system reported a Cluster Rejoin

Applies to:

Sun ZFS Storage 7320 - Version All Versions and later
Oracle ZFS Storage ZS3-4 - Version All Versions and later
Sun ZFS Storage 7420 - Version All Versions and later
Sun Storage 7310 Unified Storage System - Version All Versions and later
Sun Storage 7410 Unified Storage System - Version All Versions and later
7000 Appliance OS (Fishworks)

Symptoms

After running the CLI command "maintenance system restart", the customer reported that a takeover of the storage had occurred.

Cause

NOTE: To confirm that the cluster 'links' cabling is correctly configured - See Document ID 2081179.1

When "maintenance system restart" is run, the Appliance Kit Daemon (akd) restarts.

This daemon is responsible for BUI/CLI management of the storage as well as for communication between the cluster heads.

While akd is restarting, communication between the heads is interrupted for a short time. Once akd comes back online, it resumes communicating with the partner head.

That is why messages similar to the following are seen:

        class = alert.ak.kit.reset.manual
        class = alert.ak.xmlrpc.cluster.link.up
        class = alert.ak.xmlrpc.cluster.link.up
        class = alert.ak.xmlrpc.cluster.link.up
        class = alert.ak.xmlrpc.cluster.rejoin.success
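
As a rough illustration, the pattern above can be checked with a small script. This is only a sketch: the alert class names are the ones listed in this document, but the helper names and the classification rule (a manual reset followed only by link-up and rejoin-success alerts) are assumptions made for the example, not part of the appliance software.

```python
# Alert classes taken from this document; the classification rule is an assumption.
RESTART_MARKER = "alert.ak.kit.reset.manual"
BENIGN_FOLLOWUPS = {
    "alert.ak.xmlrpc.cluster.link.up",
    "alert.ak.xmlrpc.cluster.rejoin.success",
}


def parse_alert_classes(text):
    """Extract the value of every 'class = ...' line from a log excerpt."""
    classes = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "class":
            classes.append(value.strip())
    return classes


def looks_like_akd_restart(classes):
    """Return True if a manual reset is followed only by link-up/rejoin alerts."""
    if RESTART_MARKER not in classes:
        return False
    after = classes[classes.index(RESTART_MARKER) + 1:]
    return bool(after) and all(c in BENIGN_FOLLOWUPS for c in after)


sample = """
        class = alert.ak.kit.reset.manual
        class = alert.ak.xmlrpc.cluster.link.up
        class = alert.ak.xmlrpc.cluster.link.up
        class = alert.ak.xmlrpc.cluster.link.up
        class = alert.ak.xmlrpc.cluster.rejoin.success
"""

print(looks_like_akd_restart(parse_alert_classes(sample)))
```

If any other alert class appears after the manual reset, the sketch returns False, which would be a reason to inspect the logs more closely rather than proof that a takeover happened.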

The rm.ak, akd.ak and alert.ak logs, together with the system uptime, can also be checked to confirm this.

In reality no takeover happened; the appliance is only reporting cluster rejoin messages as part of the akd restart.

The resources are still on the same storage heads as before.
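
One simple way to confirm that nothing moved is to compare the resource owners recorded before and after the restart. The sketch below assumes the owners have already been noted down by hand into two dictionaries; the resource names and head names are hypothetical, and the script does not query the appliance.

```python
def moved_resources(before, after):
    """Return {resource: (old_owner, new_owner)} for resources whose owner changed."""
    return {
        name: (owner, after.get(name))
        for name, owner in before.items()
        if after.get(name) != owner
    }


# Hypothetical snapshots of resource ownership, recorded manually.
before = {"net/igb0": "head-a", "zfs/pool-0": "head-a", "zfs/pool-1": "head-b"}
after = dict(before)  # unchanged after the akd restart

print(moved_resources(before, after))
```

An empty result means every resource has the same owner in both snapshots, which matches the behaviour described here.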

Solution

This is only an alert and can be ignored. A manual restart of akd, or "maintenance system restart", is expected to produce these messages even though no real resource movement or takeover is involved.

Attachments

This solution has no attachment