| Doc ID: | 2372030.1 |
| Asset ID: | 1-72-2372030.1 |
| Update Date: | 2018-03-20 |
| Solution Type: | Problem Resolution Sure Solution |
Oracle ZFS Storage Appliance: Node is stuck at "Joining Cluster..." after an upgrade
| Related Items |
- Sun ZFS Storage 7420
- Oracle ZFS Storage ZS5-2
- Oracle ZFS Storage ZS3-2
- Oracle ZFS Storage ZS4-4
- Oracle ZFS Storage ZS5-4
- Oracle ZFS Storage ZS3-4
- Sun ZFS Storage 7320
|
| Related Categories |
- PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS
|
Created from <SR 3-17046099271>
Applies to:
Sun ZFS Storage 7420 - Version All Versions and later
Sun ZFS Storage 7320 - Version All Versions and later
Oracle ZFS Storage ZS3-4 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
Oracle ZFS Storage ZS4-4 - Version All Versions and later
7000 Appliance OS (Fishworks)
Symptoms
Node is stuck at "Joining Cluster..." after an upgrade.
The peer head could not be reached from the SP: opening the console displayed nothing. This node was upgraded first.
Changes
The issue was noticed during an AK upgrade.
Cause
Node nas19 is stuck at "Joining Cluster..." after an upgrade. The console to the peer head (nas20) was unavailable: the session did not return anything after multiple attempts.
From the SP, we noticed that nas20 was reporting a fault:
Properties:
type = Host System
ipmi_name = /SYS
product_name = SUN FIRE X4470 M2 SERVER
product_part_number = 31295957+3+1
product_serial_number = 1249FMJ02G
product_manufacturer = Oracle Corporation
fault_state = Faulted
clear_fault_action = (none)
power_state = On
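The properties above resemble what the service processor reports for the host system target. A minimal sketch of how such state can be queried, assuming the heads use the Oracle ILOM CLI (exact targets vary by firmware version; the commands are printed here for reference only):

```shell
#!/bin/sh
# Illustrative ILOM CLI commands for reading host fault state from the SP.
# These are entered at the ILOM prompt, not in a Solaris shell; this script
# only prints them as a reference outline.
ilom_cmds='show /SYS             # host system properties, including fault_state
show /SP/faultmgmt      # list active faults managed by the SP'
printf '%s\n' "$ilom_cmds"
```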
Solution
Please contact Oracle Support to help resolve the issue.
1) Booted node nas19 (hung at "Joining Cluster...") to milestone none.
2) Ran svcadm milestone all.
3) When nas19 reached the "Joining Cluster..." state, powered off nas20 from its SP.
4) A few minutes later, nas19 joined the cluster successfully. Logging in at this point drops you into a Solaris shell; exit and log in again as root if no further troubleshooting is required.
5) After confirming everything was fine on nas19, powered nas20 back on.
6) Found that boot disk-001 on nas20 had a predictive failure.
nas20:maintenance problems> show
Problems:
COMPONENT DIAGNOSED TYPE DESCRIPTION
problem-000 2018-3-10 05:50:06 Major Fault SMART health-monitoring
firmware reported that a
disk failure is imminent.
nas20:maintenance> hardware list
NAME STATE MANUFACTURER MODEL SERIAL RPM TYPE
chassis-000 cheis01nas20 faulted Oracle Sun ZFS Storage 7420 1249FMJ02G -- system
chassis-001 1351NMT010 ok Oracle Oracle Storage DE2-24C 1351NMT010 7200 hdd
chassis-002 1351NMT019 ok Oracle Oracle Storage DE2-24C 1351NMT019 7200 hdd
chassis-003 1351NMT01B ok Oracle Oracle Storage DE2-24C 1351NMT01B 7200 hdd
chassis-004 1350NMT014 ok Oracle Oracle Storage DE2-24C 1350NMT014 7200 hdd
chassis-005 1351NMT017 ok Oracle Oracle Storage DE2-24C 1351NMT017 7200 hdd
chassis-006 1351NMT014 ok Oracle Oracle Storage DE2-24C 1351NMT014 7200 hdd
chassis-007 1350NMT019 ok Oracle Oracle Storage DE2-24C 1350NMT019 7200 hdd
chassis-008 1351NMT011 ok Oracle Oracle Storage DE2-24C 1351NMT011 7200 hdd
maintenance chassis-000> select disk
maintenance chassis-000 disk> select disk-000
maintenance chassis-000 disk-000> show
Properties:
label = HDD 1
present = true
faulted = true
manufacturer = HITACHI
model = H109090SESUN900G
serial = 001236ALU97F KPGLU97F
revision = A7E0
size = 838G
type = data
use = system
rpm = 10000
device = c0t5000CCA016223694d0
interface = SAS
locate = false
offline = false
7) Replace the faulted disk.
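The recovery sequence above can be summarized as the following command outline. This is a sketch for reference only: the milestone boot argument and the SP power commands (assumed here to be ILOM's stop/start of /SYS) depend on the OS and firmware versions in use, and must be run on the appliance heads and SP, not on a workstation.

```shell
#!/bin/sh
# Outline of the recovery steps from this note, printed as a reference.
outline='-m milestone=none      # kernel boot argument on hung head: boot with no services
svcadm milestone all   # once at the shell: resume all SMF services
stop /SYS              # on the peer SP (ILOM): power the faulted head off
start /SYS             # after the cluster is healthy again: power it back on'
printf '%s\n' "$outline"
```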
Attachments
This solution has no attachment