Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2054589.1
Update Date:2018-04-17
Keywords:

Solution Type  Technical Instruction Sure

Solution  2054589.1 :   How to Replace an Exalogic Elastic Cloud X5-2/X6-2 Compute Node Solid State Drive (SSD)


Related Items
  • Exalogic Elastic Cloud X5-2 Hardware
  • Exalogic Elastic Cloud X6-2 Hardware
  • Exalogic Elastic Cloud X5-2 Half Rack
  • Exalogic Elastic Cloud X5-2 Eighth Rack
  • Exalogic Elastic Cloud X5-2 Quarter Rack
  • Exalogic Elastic Cloud X5-2 Full Rack

Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP




In this Document
Goal
Solution


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: partner can replace disk drives

Applies to:

Exalogic Elastic Cloud X5-2 Half Rack - Version X5 and later
Exalogic Elastic Cloud X5-2 Full Rack - Version X5 and later
Exalogic Elastic Cloud X5-2 Quarter Rack - Version X5 and later
Exalogic Elastic Cloud X5-2 Eighth Rack - Version X5 and later
Exalogic Elastic Cloud X5-2 Hardware - Version X5 and later
Information in this document applies to any platform.

Goal

How to remove and replace a failed Solid State Drive (SSD) on an Exalogic Elastic Cloud X5-2/X6-2 Compute Node.

Solution

 

- PROBLEM OVERVIEW:

This document describes the steps required to successfully remove and replace a failed Solid State Drive (SSD) on a Compute Node (CN) within an Exalogic Elastic Cloud X5-2 Machine.

- TIME ESTIMATE: 30 Minutes

Note: This process is NOT intended to be used for proactive disk drive replacements where the drive has NOT been failed by the RAID Controller. If you are performing a proactive disk replacement, you should seek additional help from Oracle Support.

- COMPLEXITY: 3


- WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?:

The SSD is part of a mirrored RAID volume and is hot swappable, so it can be replaced at any time. However, care should be exercised to ensure that the volume is in the appropriate state (Degraded) and that you are removing the correct device.

The volume State should be listed as 'Degraded' when running the following command:

# /opt/MegaRAID/MegaCli/MegaCli64 -Ldinfo -lAll -aAll


Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 371.597 GB
Mirror Data         : 371.597 GB
State               : Degraded
Strip Size          : 256 KB
Number Of Drives    : 2
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Bad Blocks Exist: No
PI type: No PI

Is VD Cached: No
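
Note: As a convenience, the State field can be filtered out of the output above rather than read by eye. The following is a minimal sketch only and is not part of the official procedure. The function name vd_states is hypothetical, and the sketch assumes the output layout shown above.

```shell
# vd_states: hypothetical helper (not an Oracle or MegaCli tool).
# Reads "MegaCli64 -LdInfo -lAll -aAll" output on stdin and prints
# one "<virtual drive number> <state>" line per virtual drive.
vd_states() {
    awk '
        /^Virtual Drive:/ { vd = $3 }    # e.g. "Virtual Drive: 0 (Target Id: 0)"
        /^State[ \t]*:/   { sub(/^State[ \t]*:[ \t]*/, ""); print vd, $0 }
    '
}

# Assumed usage on the compute node (as root, with MegaCli installed):
#   /opt/MegaRAID/MegaCli/MegaCli64 -LdInfo -lAll -aAll | vd_states
```

For the sample output above, the helper would report virtual drive 0 with the state Degraded, which is the state expected before the drive is pulled.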


The physical disk may have the amber LED illuminated, though this is not a requirement. The drive 'Firmware state' should be something other than 'Online, Spun Up', though it is also possible that the drive will be missing from the drive list completely. The drive state can be seen using the following command:

# /opt/MegaRAID/MegaCli/MegaCli64 -PdList -aAll

Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 8
WWN: 5000CCA04E07500F
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 372.611 GB [0x2e9390b0 Sectors]
Non Coerced Size: 372.111 GB [0x2e8390b0 Sectors]
Coerced Size: 371.597 GB [0x2e732000 Sectors]
Firmware state: Unconfigured(bad)
Is Commissioned Spare : NO
Device Firmware Level: A122
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000cca04e07500d
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: HGST    HSCAC2DA4SUN400GA1221437JQ0NWA
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 12.0Gb/s
Link Speed: 12.0Gb/s
Media Type: Solid State Device
Drive Temperature :24C (75.20 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 12.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 12.0Gb/s
Drive has flagged a S.M.A.R.T alert : No


Note: The 'Firmware state: Unconfigured(bad)' shown above came from an injected fault caused by a mock drive replacement (the drive was replaced with itself). Although this is a valid fault, this type of fault requires additional manual steps to completely recover the volume, and those steps are not discussed in this document.
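
Note: Because the -PdList output is long, it can help to reduce it to one line per drive. The following is a sketch under the assumption that each drive record contains the 'Enclosure Device ID', 'Slot Number' and 'Firmware state' lines in the order shown above. The function name pd_states is hypothetical.

```shell
# pd_states: hypothetical helper (not an Oracle or MegaCli tool).
# Reads "MegaCli64 -PdList -aAll" output on stdin and prints
# "<enclosure>:<slot> <firmware state>" for each physical drive listed.
pd_states() {
    awk -F': *' '
        /^Enclosure Device ID:/ { enc = $2 }
        /^Slot Number:/         { slot = $2 }
        /^Firmware state:/      { print enc ":" slot, $2 }
    '
}

# Assumed usage on the compute node (as root, with MegaCli installed):
#   /opt/MegaRAID/MegaCli/MegaCli64 -PdList -aAll | pd_states
```

A drive that is missing from the drive list completely produces no line at all, so a slot that is absent from the summary should be treated as suspect in the same way as a slot reporting a state other than 'Online, Spun Up'.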

- WHAT ACTION DOES THE ADMINISTRATOR NEED TO TAKE:

1. Identify the system with the failed SSD. Be sure to note the drive number (slot) that has failed; the drive may also have its amber light illuminated. The following commands also give status information for the RAID volume and physical drives.

RAID Volume:

# /opt/MegaRAID/MegaCli/MegaCli64 -LdInfo -lAll -aAll

Physical Drives:

# /opt/MegaRAID/MegaCli/MegaCli64 -Pdlist -aAll

2. Once you have verified that the drive is in the correct state (the amber light may also be illuminated), press the release button on the failed drive.
3. When the drive tray opens, remove the drive from the compute node by pulling gently on the drive handle.
4. Insert the new drive tray in the empty drive slot in the compute node by sliding the drive tray into the slot and pushing until the handle clicks.

Note: It is generally a good practice to wait at least 1 minute between part removal and insertion.

5. Verify that the new drive is detected. The firmware state should be "Online, Spun Up".

# /opt/MegaRAID/MegaCli/MegaCli64 -PdList -aAll

Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 8
WWN: 5000CCA04E07500F
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 372.611 GB [0x2e9390b0 Sectors]
Non Coerced Size: 372.111 GB [0x2e8390b0 Sectors]
Coerced Size: 371.597 GB [0x2e732000 Sectors]
Firmware state: Online, Spun Up
Is Commissioned Spare : NO
Device Firmware Level: A122
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000cca04e07500d
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: HGST    HSCAC2DA4SUN400GA1221437JQ0NWA
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 12.0Gb/s
Link Speed: 12.0Gb/s
Media Type: Solid State Device
Drive Temperature :24C (75.20 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 12.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 12.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
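
Note: To check only the slot that was just replaced, the drive list can be filtered by enclosure and slot. This is an illustrative sketch, not part of the official procedure. The function name slot_state is hypothetical, and the sketch assumes the record layout shown in the output above.

```shell
# slot_state: hypothetical helper (not an Oracle or MegaCli tool).
# Usage: slot_state <enclosure device id> <slot number>
# Reads "MegaCli64 -PdList -aAll" output on stdin and prints the firmware
# state of the matching drive. Prints nothing if that slot is not listed.
slot_state() {
    awk -F': *' -v enc="$1" -v slot="$2" '
        /^Enclosure Device ID:/ { e = $2 }
        /^Slot Number:/         { s = $2 }
        /^Firmware state:/ && e == enc && s == slot { print $2 }
    '
}

# Assumed usage on the compute node (as root, with MegaCli installed):
#   /opt/MegaRAID/MegaCli/MegaCli64 -PdList -aAll | slot_state 252 0
```

An empty result means the replacement drive has not been detected in that slot.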


6. You may also want to verify that the drive's reconstruction is in progress.

# /opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ShowProg -PhysDrv [252:0] -aAll

Rebuild Progress on Device at Enclosure 252, Slot 0 Completed 60% in 1 Minutes.


Note: [252:0] is the 'Enclosure Device ID' and 'Slot Number' of the drive, both of which can be obtained from 'MegaCli64 -PdList -aAll' (see the output from step 5 above).
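
Note: If you want to watch the rebuild without rereading the full message each time, the percentage can be extracted from the progress line. The following is a minimal sketch that assumes the progress message has the form shown in step 6. The function name rebuild_percent is hypothetical, and the polling loop in the comments additionally assumes that no such progress line is printed once the rebuild has finished.

```shell
# rebuild_percent: hypothetical helper (not an Oracle or MegaCli tool).
# Reads the rebuild progress output on stdin and prints the completed
# percentage as a bare integer. Prints nothing if no progress line is found.
rebuild_percent() {
    sed -n 's/.*Completed \([0-9][0-9]*\)%.*/\1/p'
}

# Assumed usage on the compute node (as root, with MegaCli installed).
# The brackets are quoted so the shell cannot treat [252:0] as a file pattern.
#   while pct=$(/opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ShowProg \
#                   -PhysDrv '[252:0]' -aAll | rebuild_percent) && [ -n "$pct" ]
#   do
#       echo "rebuild at ${pct}%"
#       sleep 60
#   done
```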


7. Verify the status of the RAID volume and confirm that it returns to the 'Optimal' state. The volume rebuild process normally completes in 5-20 minutes.

Example: These are the final expected volume and disk drive states.

RAID Volume Status:

# /opt/MegaRAID/MegaCli/MegaCli64 -Ldinfo -lAll -aAll

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 371.597 GB
Mirror Data         : 371.597 GB
State               : Optimal
Strip Size          : 256 KB
Number Of Drives    : 2
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Bad Blocks Exist: No
PI type: No PI

Is VD Cached: No


Physical Disk Status:

# /opt/MegaRAID/MegaCli/MegaCli64 -PdList -aAll

Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 8
WWN: 5000CCA04E07500F
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 372.611 GB [0x2e9390b0 Sectors]
Non Coerced Size: 372.111 GB [0x2e8390b0 Sectors]
Coerced Size: 371.597 GB [0x2e732000 Sectors]
Firmware state: Online, Spun Up
Is Commissioned Spare : NO
Device Firmware Level: A122
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000cca04e07500d
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: HGST    HSCAC2DA4SUN400GA1221437JQ0NWA
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 12.0Gb/s
Link Speed: 12.0Gb/s
Media Type: Solid State Device
Drive Temperature :24C (75.20 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 12.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 12.0Gb/s
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 1
Drive's postion: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: 0
Device Id: 9
WWN: 5000CCA04E07A947
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 372.611 GB [0x2e9390b0 Sectors]
Non Coerced Size: 372.111 GB [0x2e8390b0 Sectors]
Coerced Size: 371.597 GB [0x2e732000 Sectors]
Firmware state: Online, Spun Up
Is Commissioned Spare : NO
Device Firmware Level: A122
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000cca04e07a945
SAS Address(1): 0x0
Connected Port Number: 1(path0)
Inquiry Data: HGST    HSCAC2DA4SUN400GA1221438JQ6M2A
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 12.0Gb/s
Link Speed: 12.0Gb/s
Media Type: Solid State Device
Drive Temperature :24C (75.20 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 12.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 12.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
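
Note: The final states above can also be confirmed with a single pass or fail check. The following is a sketch only and is not part of the official procedure. The function name check_final_state is hypothetical, and the requirement of at least two drives is an assumption taken from the 'Number Of Drives : 2' line in the volume output.

```shell
# check_final_state: hypothetical helper (not an Oracle or MegaCli tool).
# Reads the combined -LdInfo and -PdList output, either on stdin or from the
# files named as arguments. Returns 0 only if at least one volume is listed,
# every volume State is "Optimal", and at least two drives are listed with
# every Firmware state equal to "Online, Spun Up". Returns 1 otherwise.
check_final_state() {
    awk -F': *' '
        /^State[ \t]*:/    { vols++;   if ($2 != "Optimal")         bad++ }
        /^Firmware state:/ { drives++; if ($2 != "Online, Spun Up") bad++ }
        END { exit !(vols > 0 && drives >= 2 && bad == 0) }
    ' "$@"
}

# Assumed usage on the compute node (as root, with MegaCli installed):
#   { /opt/MegaRAID/MegaCli/MegaCli64 -LdInfo -lAll -aAll
#     /opt/MegaRAID/MegaCli/MegaCli64 -PdList -aAll
#   } | check_final_state && echo "RAID volume healthy"
```

The check deliberately fails when no volume or too few drives are found, so that empty or truncated command output is not mistaken for a healthy system.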

Oracle Server X5-2 Service Manual: http://docs.oracle.com/cd/E41059_01/html/E48320/index.html
Exalogic Machine Owner's Guide: https://docs.oracle.com/cd/E18476_01/index.htm

  Copyright © 2018 Oracle, Inc.  All rights reserved.