Asset ID: 1-71-1506299.1
Update Date: 2018-05-17
Solution Type: Technical Instruction Sure Solution
1506299.1: How to Replace a Sun Fire X4500 HDD (Predictive Failure)
How to Replace a Hard Drive in an x4500 (Predictive Failure)
Created from <SR HOW>
Applies to:
Sun Fire X4500 Server - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Goal
How to Replace a Sun Fire X4500 HDD (Predictive Failure).
Solution
DISPATCH INSTRUCTIONS
WHAT SKILLS DOES THE FIELD ENGINEER/ADMINISTRATOR NEED:
No special skills required, Customer Replaceable Unit (CRU) procedure
TIME ESTIMATE: 30 minutes
TASK COMPLEXITY: 0
FIELD ENGINEER/ADMINISTRATOR INSTRUCTIONS:
PROBLEM OVERVIEW: A Sun Fire X4500 HDD (Predictive Failure) needs replacement
WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?:
Caution: To avoid overheating the server, do not leave an HDD out of a powered-on server for longer than 60 seconds at a time. Remove and replace only one HDD at a time. Replace the HDD access cover as soon as the service tasks are completed. Before removing a drive, have the replacement drive ready to be installed.
WHAT ACTION DOES THE ENGINEER NEED TO TAKE:
1. Remove the hard drive access cover.
2. Identify the drive to be removed by checking its LEDs. If the middle LED is on (amber), the drive is faulty and should be replaced.
3. Use the operating system or management software to take the HDD offline before you replace it. Not doing so could cause data loss or unexpected error messages. Instructions for Solaris zfs follow. For Linux or MS Windows, confirm with the customer that the disk has been offlined in the OS before the hot-plug replacement is performed. Once the drive has been taken offline, the left (blue) LED should turn on. This means the drive is ready to be removed and service action is allowed.
Caution: Pulling a drive that has not been prepared for removal can cause a loss of the drive cell memory map or loss of data in its in/out buffers.
4. Check faulted disk status in zfs
# zpool status POOLNAME
  pool: POOLNAME
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        POOLNAME    ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c2t3d0  FAULTED      0     0     0

errors: No known data errors
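On a server with many drives, the faulted disk can be picked out of the status listing with a small filter instead of by eye. The following is a minimal sketch and not part of the original procedure: the function name non_online_devices is hypothetical, and it assumes a POSIX awk (on Solaris that is likely nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk) and the column layout shown in the listing above.

```shell
# Hypothetical helper: list every entry in the config table whose
# STATE column is not ONLINE. Reads `zpool status` output on stdin.
#
# Usage (on the server):  zpool status POOLNAME | non_online_devices
non_online_devices() {
    awk '
        # The config table starts at its NAME/STATE header row.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The table ends at the "errors:" summary line.
        $1 == "errors:"               { in_config = 0 }
        # Table rows have five columns: NAME STATE READ WRITE CKSUM.
        in_config && NF >= 5 && $2 != "ONLINE" { print $1, $2 }
    '
}
```

Note that once a disk has been offlined, the pool and its raidz1 group are reported as DEGRADED, so they appear in the output along with the disk itself.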
5. Bring disk cXtYd0 offline
# zpool offline POOLNAME cXtYd0
6. Confirm zfs shows the disk is offline
# zpool status POOLNAME
  pool: POOLNAME
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        POOLNAME    DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            c1t0d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c2t3d0  OFFLINE      0     0     0

errors: No known data errors
7. Get the sataB/C attachment point name for the disk
# cfgadm | grep sata | grep cXtYd0
sata2/3::dsk/c2t3d0 disk connected configured ok
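A plain grep on the disk name matches substrings, so an exact-match filter can be a safer way to map a disk to its attachment point. The following is a minimal sketch under stated assumptions: the function name sata_ap_for_disk is hypothetical, the Ap_Id is assumed to have the sataB/C::dsk/cXtYd0 form shown above, and a POSIX awk is assumed (on Solaris, likely nawk or /usr/xpg4/bin/awk).

```shell
# Hypothetical helper: print the SATA attachment point (for example
# sata2/3) that holds the named disk. Reads `cfgadm` output on stdin.
#
# Usage (on the server):  cfgadm | sata_ap_for_disk c2t3d0
sata_ap_for_disk() {
    awk -v disk="$1" '
        {
            # Column 1 is the Ap_Id, e.g. sata2/3::dsk/c2t3d0.
            n = split($1, parts, "::")
            # Require an exact match on the dsk/<disk> part.
            if (n == 2 && parts[2] == "dsk/" disk) print parts[1]
        }
    '
}
```

The printed value is the sataB/C name used by the cfgadm unconfigure and configure commands.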
8. Unconfigure the disk using the sataB/C name
# cfgadm -c unconfigure sataB/C
Unconfigure the device at: /devices/pci@1,0/pci1022,7458@3/pci11ab,11ab@1:3
This operation will suspend activity on the SATA device
Continue (yes/no)? yes
9. Confirm disk is unconfigured and ready for replacement
# cfgadm | grep sata | grep sataB/C
sata2/3 disk connected unconfigured ok
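Because pulling a drive that is still configured risks data loss, the occupant state can also be checked by a script before anyone touches the hardware. The following is a minimal sketch, not part of the original procedure: the function name ap_is_unconfigured is hypothetical, and it assumes the five-column cfgadm layout shown above (Ap_Id, Type, Receptacle, Occupant, Condition) and a POSIX awk (on Solaris, likely nawk or /usr/xpg4/bin/awk).

```shell
# Hypothetical helper: succeed (exit status 0) only if the given
# attachment point is listed with an Occupant state of "unconfigured".
# Reads `cfgadm` output on stdin.
#
# Usage (on the server):
#   cfgadm | ap_is_unconfigured sata2/3 && echo "safe to remove"
ap_is_unconfigured() {
    awk -v ap="$1" '
        {
            # Strip any ::dsk/... suffix from the Ap_Id column.
            split($1, parts, "::")
            if (parts[1] == ap) { found = 1; state = $4 }
        }
        END {
            if (found && state == "unconfigured") exit 0
            exit 1
        }
    '
}
```

The check fails both when the attachment point is still configured and when it is not listed at all, so a mistyped name does not pass.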
10. The blue OK to Remove (OK2RM) LED will now be on.
11. Remove the drive. Lift the metal latch and remove the drive from the drive bay, as shown on the service label.
12. Install the new drive. Push the drive into the bay until it stops, and make sure the drive is fully engaged with the connector on the drive backplane.
13. Make sure the metal handle is properly seated.
14. Replace the HDD access cover.
15. Reconfigure the disk and check its status
# cfgadm -c configure sataB/C
# cfgadm | grep sataB/C
sata2/3::dsk/c2t3d0 disk connected configured ok
16. Disk may still be offline in zfs
# zpool status POOLNAME
  pool: POOLNAME
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        POOLNAME    DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            c1t0d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c2t3d0  OFFLINE      0     0     0

errors: No known data errors
17. Bring the disk online in zfs
# zpool online POOLNAME cXtYd0
Bringing device c2t3d0 online
18. Confirm disk is back online
# zpool status POOLNAME
  pool: POOLNAME
 state: ONLINE
 scrub: resilver completed with 0 errors on Fri Aug 17 07:33:10 2012
config:

        NAME        STATE     READ WRITE CKSUM
        POOLNAME    ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0

errors: No known data errors
# exit
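The offline and online confirmations in the procedure above both come down to reading one device's STATE column from the zpool status listing. The following is a minimal sketch of that check, not part of the original procedure: the function name device_state is hypothetical, and it assumes the config table layout shown in the listings above and a POSIX awk (on Solaris, likely nawk or /usr/xpg4/bin/awk).

```shell
# Hypothetical helper: print the STATE of one device (for example
# ONLINE, OFFLINE or FAULTED). Reads `zpool status` output on stdin
# and prints nothing if the device is not in the config table.
#
# Usage (on the server):  zpool status POOLNAME | device_state c2t3d0
device_state() {
    awk -v dev="$1" '
        # Only look at rows between the NAME/STATE header and "errors:".
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        $1 == "errors:"               { in_config = 0 }
        # Exact match on the device name; stop at the first hit.
        in_config && $1 == dev        { print $2; exit }
    '
}
```

An ONLINE result for the replaced disk, together with a scrub line that reports the resilver as completed, corresponds to the final state shown in the last listing.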
WHAT ACTION DOES THE CUSTOMER NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:
For hot plug, configure the drive and verify drive availability.
Use the appropriate software commands to re-activate/re-sync the mirror if manual intervention is required.
PARTS NOTE:
Note: Before removing a drive, have the replacement drive ready to be installed.
REFERENCE INFORMATION:
See the section "To replace a hard drive (CRU)" in the
Sun Fire X4500/X4540 Server Service Manual
http://download.oracle.com/docs/cd/E19121-01/sf.x4500/819-4359-19/index.html
"To replace a hard drive (CRU)" section
http://docs.oracle.com/cd/E19121-01/sf.x4500/819-4359-19/CH3-maint.html#50647083_22785
References
<NOTE:1002753.1> - How to Replace a Drive in Solaris[TM] ZFS