Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2372030.1
Update Date:2018-03-20
Keywords:

Solution Type  Problem Resolution Sure

Solution  2372030.1 :   Oracle ZFS Storage Appliance: Node is stuck at "Joining Cluster..." after an upgrade  


Related Items
  • Sun ZFS Storage 7420
  • Oracle ZFS Storage ZS5-2
  • Oracle ZFS Storage ZS3-2
  • Oracle ZFS Storage ZS4-4
  • Oracle ZFS Storage ZS5-4
  • Oracle ZFS Storage ZS3-4
  • Sun ZFS Storage 7320
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS




In this Document
Symptoms
Changes
Cause
Solution


Created from <SR 3-17046099271>

Applies to:

Sun ZFS Storage 7420 - Version All Versions and later
Sun ZFS Storage 7320 - Version All Versions and later
Oracle ZFS Storage ZS3-4 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
Oracle ZFS Storage ZS4-4 - Version All Versions and later
7000 Appliance OS (Fishworks)

Symptoms

Node is stuck at "Joining Cluster..." after an upgrade.

The peer head cannot be accessed from the SP because the console session displays nothing. This node was upgraded first.

 

Changes

The issue was noticed during an AK upgrade.

 

Cause

Node (nas19) is stuck at "Joining Cluster..." after an upgrade.  The console of the peer head (nas20) was unavailable: the session returned nothing after multiple attempts.

From the SP, nas20 was reporting a fault:

  Properties:
  type = Host System
  ipmi_name = /SYS
  product_name = SUN FIRE X4470 M2 SERVER
  product_part_number = 31295957+3+1
  product_serial_number = 1249FMJ02G
  product_manufacturer = Oracle Corporation
  fault_state = Faulted                   
  clear_fault_action = (none)
  power_state = On
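
When the same check has to be repeated across several heads, the fault state can be read from captured SP output with a short script. The following is a minimal sketch, assuming the "key = value" listing above has been saved as text. The helper names parse_properties and is_faulted are hypothetical and are not part of ILOM or the appliance software.

```python
def parse_properties(text):
    """Parse 'key = value' lines into a dict, ignoring any other lines."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            props[key.strip()] = value.strip()
    return props


def is_faulted(props):
    """Return True when the listing reports fault_state = Faulted."""
    return props.get("fault_state", "").lower() == "faulted"


# Sample text copied from the SP output shown above.
sample = """
  Properties:
  type = Host System
  ipmi_name = /SYS
  fault_state = Faulted
  clear_fault_action = (none)
  power_state = On
"""

props = parse_properties(sample)
print(props["power_state"], is_faulted(props))
```

The sketch only interprets text that has already been collected; it does not connect to the SP.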

 

 

Solution

Please contact Oracle Support to help resolve the issue.

 

1)  Booted node nas19 (hung at "Joining Cluster...") to milestone none.
2)  Ran "svcadm milestone all" to bring the remaining services online.
3)  When nas19 reached the "Joining Cluster..." state, powered off nas20 from the SP.
4)  A few minutes later, nas19 joined the cluster successfully.  Logging in at this point opens a Solaris shell. If no further troubleshooting is required, exit and log in again as root.
5)  After confirming that everything was fine on nas19, powered on nas20.
6)  Found that boot disk-001 on nas20 had a predictive failure.

xxxxnas20:maintenance problems> show
Problems:

COMPONENT    DIAGNOSED            TYPE            DESCRIPTION
problem-000  2018-3-10 05:50:06   Major Fault     SMART health-monitoring
                                                  firmware reported that a
                                                  disk failure is imminent.

nas20:maintenance> hardware list
             NAME          STATE     MANUFACTURER  MODEL                   SERIAL        RPM    TYPE  
chassis-000  cheis01nas20  faulted   Oracle        Sun ZFS Storage 7420    1249FMJ02G    --     system
chassis-001  1351NMT010    ok        Oracle        Oracle Storage DE2-24C  1351NMT010    7200   hdd   
chassis-002  1351NMT019    ok        Oracle        Oracle Storage DE2-24C  1351NMT019    7200   hdd   
chassis-003  1351NMT01B    ok        Oracle        Oracle Storage DE2-24C  1351NMT01B    7200   hdd   
chassis-004  1350NMT014    ok        Oracle        Oracle Storage DE2-24C  1350NMT014    7200   hdd   
chassis-005  1351NMT017    ok        Oracle        Oracle Storage DE2-24C  1351NMT017    7200   hdd   
chassis-006  1351NMT014    ok        Oracle        Oracle Storage DE2-24C  1351NMT014    7200   hdd   
chassis-007  1350NMT019    ok        Oracle        Oracle Storage DE2-24C  1350NMT019    7200   hdd   
chassis-008  1351NMT011    ok        Oracle        Oracle Storage DE2-24C  1351NMT011    7200   hdd 

maintenance chassis-000> select disk select disk-000 show
Properties:
                         label = HDD 1
                       present = true
                       faulted = true
                  manufacturer = HITACHI
                         model = H109090SESUN900G
                        serial = 001236ALU97F        KPGLU97F
                      revision = A7E0
                          size = 838G
                          type = data
                           use = system
                           rpm = 10000
                        device = c0t5000CCA016223694d0
                     interface = SAS
                        locate = false
                       offline = false

7)  Replace the disk.
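
The checks made in step 6 can also be run against saved CLI output. The following is a rough sketch, assuming the "show" and "hardware list" listings have been captured as text in the layout shown above. The function names parse_problems and non_ok_chassis are hypothetical helpers, and the parsing rules (columns separated by two or more spaces, chassis names without spaces) are assumptions drawn from this one sample, not a documented output format.

```python
import re


def parse_problems(text):
    """Return one dict per problem, joining wrapped description lines."""
    problems = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(("Problems:", "COMPONENT")):
            continue
        if line[0].isspace():
            # Continuation line: only the DESCRIPTION column is populated.
            if problems:
                problems[-1]["description"] += " " + line.strip()
            continue
        # Assumption: columns are separated by two or more spaces.
        fields = re.split(r"\s{2,}", line.strip(), maxsplit=3)
        if len(fields) == 4:
            keys = ("component", "diagnosed", "type", "description")
            problems.append(dict(zip(keys, fields)))
    return problems


def non_ok_chassis(text):
    """Return (id, name, state) for every chassis whose state is not 'ok'."""
    flagged = []
    for line in text.splitlines():
        tokens = line.split()
        # Assumption: data rows start with 'chassis-NNN' and the name has
        # no spaces, so the state is always the third token.
        if len(tokens) >= 3 and tokens[0].startswith("chassis-"):
            if tokens[2] != "ok":
                flagged.append((tokens[0], tokens[1], tokens[2]))
    return flagged


# Sample text copied from the nas20 output shown above.
problems_text = """\
Problems:

COMPONENT    DIAGNOSED            TYPE            DESCRIPTION
problem-000  2018-3-10 05:50:06   Major Fault     SMART health-monitoring
                                                  firmware reported that a
                                                  disk failure is imminent.
"""

hardware_text = """\
             NAME          STATE     MANUFACTURER  MODEL                   SERIAL        RPM    TYPE
chassis-000  cheis01nas20  faulted   Oracle        Sun ZFS Storage 7420    1249FMJ02G    --     system
chassis-001  1351NMT010    ok        Oracle        Oracle Storage DE2-24C  1351NMT010    7200   hdd
"""

problems = parse_problems(problems_text)
flagged = non_ok_chassis(hardware_text)
print(problems[0]["description"])
print(flagged)
```

Only the first three columns of the hardware listing are read, because the MODEL column contains spaces and cannot be split reliably on whitespace.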

 


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.