Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2362260.1
Update Date: 2018-05-10
Keywords:

Solution Type  Technical Instruction Sure

Solution  2362260.1 :   How to Replace an Exadata X7-2 Storage Cell Server Flash F640 Card  


Related Items
  • Exadata X7-2 Hardware
  • Zero Data Loss Recovery Appliance X7 Hardware
  • Oracle SuperCluster M8 Hardware
  • Exadata X7-8 Hardware
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP




In this Document
Goal
Solution


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Exadata internal only, for use by Oracle support engineers and approved HW partners

Applies to:

Exadata X7-8 Hardware - Version All Versions and later
Oracle SuperCluster M8 Hardware - Version All Versions and later
Zero Data Loss Recovery Appliance X7 Hardware - Version All Versions and later
Exadata X7-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Goal

How to Replace an Exadata X7-2 Storage Cell Server Flash F640 Card.

Solution

DISPATCH INSTRUCTIONS

Special Instructions for Dispatch are required for this part.

For Attention of Dispatcher:

The parts required in this action plan may be available as spares owned by the customer, which they received with the Engineered System. (These are sometimes referred to as ride-along spares.)

If parts are not available to meet the customer's preferred delivery time/planned end date, ask the TAM or field manager to contact the customer and ask whether the customer has parts available and would be prepared to use them.

If customer spare parts are used, inform the customer that Oracle will replenish the customer part stock as soon as we can. More details on this process can be found in GDMR procedure "Handling Where No Parts Available" step 2: https://ptp.oraclecorp.com/pls/apex/f?p=151:138:38504529393::::DN,BRNID,DP,P138_DLID:2,86687,4,9082,


WHAT SKILLS DOES THE ENGINEER NEED: Exadata X7-2 Training

TIME ESTIMATE: 30 minutes

TASK COMPLEXITY: 2

 

FIELD ENGINEER/ADMINISTRATOR INSTRUCTIONS:

PROBLEM OVERVIEW:  An Exadata X7-2 Storage Cell Server Flash F640 Card needs replacement.

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?:

IMPORTANT NOTE TO TSC ENGINEER:  CUT & PASTE the “CUSTOMER ACTIVITY” sections of the Pre-Replacement and Post-Replacement steps into an SR Note, and ensure the customer knows to perform these steps before, during, and after the scheduled field engineer activity.


CUSTOMER ACTIVITY:

The Flash F640 Card is hot-pluggable in the Exadata X7-2 Storage Cell Server and can be replaced while the system is running, provided the card is first prepared for removal.

If a flash disk is detected to have failed, then an alert is generated indicating that the flash disk, as well as the LUN on it, has failed. The alert message includes either the PCI slot number and FDOM number or the NVMe slot number. These numbers uniquely identify the field replaceable unit (FRU). The alert will indicate that power to the card has been removed.

The flash disk status should be "dropped for replacement", which indicates the flash disk is ready for online replacement. To verify the card is dropped for replacement:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS LIKE '.*dropped for replacement.*' DETAIL

name: FLASH_6_1
deviceName: /dev/nvme0n1
diskType: FlashDisk
luns: 6_0
makeModel: "Oracle Flash Accelerator F640 PCIe Card"
physicalFirmware: QDV1RD09
physicalInsertTime: 2017-08-11T12:25:00-07:00
physicalSerial: PHLE6514003R6P4BGN-1
physicalSize: 2.910957656800747T
slotNumber: "PCI Slot: 6; FDOM: 1"
status: failed - dropped for replacement

These steps are also provided in the documentation:
 https://docs.oracle.com/cd/E80920_01/DBMMN/maintaining-exadata-storage-servers.htm#DBMMN-GUID-9E21B6EB-58B9-4502-8AF0-242A5EBD763B  
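
The status check above can also be scripted once the DETAIL output has been captured as text. The following is a minimal Python sketch, not part of the official procedure. It assumes a single disk record in the "attribute: value" layout shown above; the helper names parse_detail and ready_for_replacement are hypothetical.

```python
def parse_detail(text):
    """Parse 'attribute: value' lines of one CellCLI DETAIL record into a dict."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as
        # "PCI Slot: 6; FDOM: 1" or timestamps stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            record[key.strip()] = value.strip().strip('"')
    return record


def ready_for_replacement(record):
    """True only if the status says the disk was dropped for replacement."""
    return "dropped for replacement" in record.get("status", "")


# Sample taken from the output shown in this document (shortened).
sample = """\
name: FLASH_6_1
deviceName: /dev/nvme0n1
diskType: FlashDisk
luns: 6_0
slotNumber: "PCI Slot: 6; FDOM: 1"
status: failed - dropped for replacement
"""

disk = parse_detail(sample)
print(disk["name"], "|", disk["slotNumber"], "|", ready_for_replacement(disk))
```

If ready_for_replacement returns False for the card that is due to be replaced, the card has not been prepared for removal and should not be pulled.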

 

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE?:

Prepare the Server for Service

The Flash F640 Card is hot-pluggable in the Exadata X7-2 Storage Cell Server and does not require you to power off the server.

1. Locate the cell that requires maintenance; its white LED is lit.
2. Ensure the "Do NOT Service" LED is not lit on this node. If it is, then ASM activity must complete and the LED must be turned off before the Flash F640 card in this node can be serviced. Verify ASM activity status with the customer.
3. Extend the server to the maintenance position.
4. Attach an anti-static wrist strap to your wrist and to a metal area on the chassis or the rack.
5. Remove the server top cover. Use a Torx T10 screwdriver to unlock the release button latch.

Caution - These procedures require that you handle components that are sensitive to electrostatic discharge. This sensitivity can cause the component to fail. To avoid damage, ensure that you follow safe anti-static practices.

 

Identifying and Removing the Flash F640 Card

1. Identify and note the location of the faulty Flash F640 Card by pressing the Fault Remind button on the motherboard, and reviewing the cell alert message details.  The faulty Flash F640 card PCIe Slot Number is identified with a corresponding amber LED near the rear card edge of the Flash F640 Card.  The faulty Flash F640 card will have its green power LED turned off indicating the server has prepared for its removal.

Caution - Removing a Flash F640 Card with the green power LED illuminated will cause the system to crash suddenly. If the green power LED is illuminated for the slot intended to be replaced, ensure the customer completes the preparation steps prior to removing the card.

2. Rotate open the PCIe card locking mechanism on the flash card that requires replacement.

3. Lift up on the PCIe card to disengage it from the motherboard connector, and place the card on an anti-static mat.

 

Installing a Flash F640 Card

1. Install the new flash card into the required PCIe slot. 

2. Rotate closed the PCIe locking mechanism to secure the PCIe card in place.

The card will automatically power on and initialize.

 

Return the Server to Operation

1. Install the server top cover. Use a Torx T10 screwdriver to lock the release button latch.
2. Return the server to the normal rack position.

OBTAIN CUSTOMER ACCEPTANCE

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE?:

FIELD SERVICE ENGINEER and CUSTOMER ACTIVITY:

After the Flash F640 card is replaced, Exadata Storage Server Software automatically adds the new device to the cell configuration and starts the rebuilding process.

1. Verify all expected hardware is visible to the cell server software and the fault is cleared. Assistance from the customer for server login access will be required.

a) Physical disk status:

CellCLI> list physicaldisk where disktype=flashdisk

This shows all the flash disks. There should be two listed for each slot, including the slot that was just replaced. Use each name in the next command.

b) Physical disk detail:

CellCLI> list physicaldisk <name> detail

Repeat for each flash disk named in 1a.  Details should be consistent with the replacement and status normal.  Make a note of the "lun" name.

c) Lun status for the lun named in 1b:

CellCLI> list lun <name> detail

Details should be consistent with the replacement and status normal.  Make a note of the "celldisk" name.

d) Celldisk status for the celldisk named in 1c:

CellCLI> list celldisk <name> detail

Details should be consistent with the replacement and status normal. 

e) If the cell is running in flash cache mode "writeback", verify that the celldisk named in 1c is listed in the "cachedby" attribute of some griddisks:

CellCLI> list cell attributes name,flashCacheMode

CellCLI> list griddisk attributes name,cachedby

f) Flash cache details:

CellCLI> list flashcache detail

The output should include the celldisk named in 1c, and the status should not be 'degraded'.

g) Flash log details:

CellCLI> list flashlog detail

The output should include the celldisk named in 1c, and the status should not be 'degraded'.

2. Verify there are no outstanding alerts in the Cell:  

# cellcli -e list alerthistory
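
The flash disk listing from step 1a can be sanity-checked with a short script. The following is a minimal Python sketch, not part of the official procedure. It assumes the brief listing prints name, serial number, and status in whitespace-separated columns, and that flash disk names follow the FLASH_<slot>_<fdom> pattern seen in FLASH_6_1 earlier in this document. The function names and the sample serial numbers are hypothetical.

```python
import re
from collections import defaultdict

# Name pattern assumed from the FLASH_6_1 example in this document.
NAME_RE = re.compile(r"^FLASH_(\d+)_(\d+)$")


def parse_flash_list(text):
    """Return {slot: {fdom: status}} from brief 'list physicaldisk' output.

    Assumes each line holds name, serial and status in whitespace-separated
    columns; the status itself may contain spaces.
    """
    slots = defaultdict(dict)
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or unexpected lines
        match = NAME_RE.match(parts[0])
        if match:
            slots[int(match.group(1))][int(match.group(2))] = parts[2].strip()
    return dict(slots)


def find_problems(slots, disks_per_slot=2):
    """List slots without the expected disk count and disks not 'normal'."""
    problems = []
    for slot, disks in sorted(slots.items()):
        if len(disks) != disks_per_slot:
            problems.append(f"slot {slot}: {len(disks)} flash disk(s) listed")
        for fdom, status in sorted(disks.items()):
            if status != "normal":
                problems.append(f"FLASH_{slot}_{fdom}: status {status}")
    return problems


# Illustrative data only; not captured from a real cell.
sample = """\
FLASH_5_1   SERIAL-A   normal
FLASH_5_2   SERIAL-B   normal
FLASH_6_1   SERIAL-C   normal
FLASH_6_2   SERIAL-D   failed - dropped for replacement
"""

for problem in find_problems(parse_flash_list(sample)):
    print(problem)
```

An empty result from find_problems matches what step 1a expects: two flash disks per slot, all with status normal. The remaining checks in steps 1b through 1g and step 2 still need to be reviewed manually.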


PARTS NOTE:

7335943 [F] 6.4TB Flash Accelerator F640 NVMe Card

 

REFERENCE INFORMATION:

 

Oracle Exadata Database Machine Maintenance Guide: https://docs.oracle.com/cd/E80920_01/DBMMN/maintaining-exadata-storage-servers.htm#DBMMN-GUID-9E21B6EB-58B9-4502-8AF0-242A5EBD763B

Oracle Server X7-2L Documentation: https://docs.oracle.com/cd/E72463_01/index.html

 

 


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.