Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Technical Instruction Sure Solution
2375541.1: How to Replace an Exadata X7-2 Storage Cell Server InfiniBand HCA Card
Oracle Confidential PARTNER - Available to partners (SUN).

Applies to:
Oracle SuperCluster M8 Hardware - Version All Versions to All Versions [Release All Releases]
Exadata X7-2 Hardware - Version All Versions to All Versions [Release All Releases]
Zero Data Loss Recovery Appliance X7 Hardware - Version All Versions to All Versions [Release All Releases]
Exadata X7-8 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal
How to Replace an Exadata X7-2 Storage Cell Server InfiniBand HCA Card.

Solution
DISPATCH INSTRUCTIONS
Exadata X7-2 Training.
TASK COMPLEXITY: 2
PROBLEM OVERVIEW: An Exadata X7-2 Storage Cell Server InfiniBand HCA Card needs replacement.

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?:

IMPORTANT NOTE TO TSC ENGINEER: Cut and paste the “CUSTOMER ACTIVITY” sections of the Pre-Replacement and Post-Replacement steps into an SR Note, and ensure the customer knows to perform these steps before the scheduled field engineer activity, and during and after the replacement activity.

1. Determine whether the HCA that needs to be replaced is within an InfiniBand network where IB partitions exist by following steps 1 and 2 of Doc ID 1985159.1.
2. The storage cell must be shut down before the part is replaced. Complete Steps 1 to 6 of MOS Note ID 1188080.1 “Steps to shut down or reboot an Exadata storage cell without affecting ASM”.

Where noted, the SQL, CellCLI and ‘root’ commands should be run by the Customer's DBA, unless the Customer provides login access to the Field Engineer. These steps are also provided in the documentation (see REFERENCE INFORMATION below).
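As an illustration of the pre-shutdown check that the note cited in step 2 asks the DBA to perform, the sketch below scans grid disk status rows and flags any disk that cannot be safely deactivated. It is a minimal sketch under stated assumptions: the attribute names `asmmodestatus` and `asmdeactivationoutcome` and the `list griddisk attributes` invocation are recalled from the CellCLI procedure in Doc ID 1188080.1 and do not appear in this document, the function name `griddisks_safe_to_deactivate` is hypothetical, and the note itself remains the authoritative procedure.

```shell
#!/bin/sh
# Sketch only: read rows of "name asmmodestatus asmdeactivationoutcome" on stdin
# and print every grid disk whose deactivation outcome is not "Yes".
# Returns 0 only when at least one row was read and every outcome is "Yes".
# griddisks_safe_to_deactivate is a hypothetical helper name (assumption).
griddisks_safe_to_deactivate() {
    awk '
        NF >= 3 {
            rows++
            if ($3 != "Yes") {
                # Rebuild the full outcome text, which may span several words.
                msg = $3
                for (i = 4; i <= NF; i++) msg = msg " " $i
                print $1 ": " msg
                bad = 1
            }
        }
        END { if (rows == 0 || bad) exit 1 }
    '
}

# Assumed usage on the storage cell, run as root by the customer DBA:
#   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome \
#       | griddisks_safe_to_deactivate && echo "all grid disks report Yes"
```

An empty result with a zero exit status suggests every grid disk reported "Yes"; any printed line should be investigated before the cell is shut down.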
Prepare the Server for Service
The customer should have already prepared the server and powered it off. If not, provide them the instructions in the previous section.
1. Extend the server to the maintenance position.

Caution - Ensure that all power is removed from the server before removing or installing the IB HCA. You must disconnect the power cables from the system before performing these procedures.
Caution - These procedures require that you handle components that are sensitive to electrostatic discharge. This sensitivity can cause the component to fail. To avoid damage, ensure that you follow safe anti-static practices.
Removing the IB HCA
1. Disconnect and remove the IB cables from the HCA. Pull on the white tab only to unlock and disengage the cables, and then pull on the cable transceiver end to pull them completely out of the HCA port.
2. The IB HCA is located in PCIe Slot 7. This is the first slot to the left of the center I/O devices in the middle of the chassis when looking from the front. Rotate the Slot 7 PCIe card locking mechanism open.
3. Lift and remove the IB HCA out of the motherboard slot.
4. Place the IB HCA card on an anti-static mat.
Installing the IB HCA
1. Unpack the replacement Oracle Dual-port QDR (40Gb/s) InfiniBand Host Channel Adapter (HCA) card and place it on an anti-static mat.
2. Insert the IB HCA card into PCIe Slot 7.
3. Rotate the PCIe locking mechanism to secure the IB HCA card in place. You will hear an audible click when the PCIe card is secured into the slot.
4. Reconnect the IB cables into the HCA following the port labels on the cables. Port 1 is the upper slot, and Port 2 is the lower slot. Ensure the cables are oriented correctly and fully seated such that they lock in place and cannot be pulled out without pulling on the pull tab. If the cables are inserted upside down, they will insert ~80% but will not make a connection or lock. Use other servers in the rack to visually compare the orientation of the pull tab.
Return the Server to Operation
1. Install the server top cover. Use a Torx T10 screwdriver to lock the release button latch.
OBTAIN CUSTOMER ACCEPTANCE
WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE?:

FIELD SERVICE ENGINEER and CUSTOMER ACTIVITY:
1. Verify all expected hardware is visible to the server and the fault is cleared. Assistance from the customer for server login access will be required.
2. Verify the IB links are up and active at 40 Gb/sec:
    # ibstatus
    Infiniband device 'mlx4_0' port 1 status:
        default gid:     fe80:0000:0000:0000:0010:e000:01cb:6761
        base lid:        0x1f
        sm lid:          0x2
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)
        link_layer:      InfiniBand
    Infiniband device 'mlx4_0' port 2 status:
3. If the HCA is part of an InfiniBand network where IB partitions exist, follow steps 3 and 4 or step 5 of Doc ID 1985159.1.
4. Verify there are no outstanding alerts in the Cell:
    # cellcli -e list alerthistory
5. Re-activate the Storage Cell grid disks. Follow Steps 7 to 10 of Note ID 1188080.1 “Steps to shut down or reboot an Exadata storage cell without affecting ASM”. These steps are also provided in the documentation (see REFERENCE INFORMATION below).
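The link check in step 2 can be scripted rather than read by eye. The following is a minimal sketch, assuming `ibstatus` prints its fields in the layout shown in the sample output above; the function name `check_ib_links` is hypothetical and is not an Oracle-supplied tool. It reads `ibstatus` output on standard input and reports a port as OK only when the state is ACTIVE, the physical state is LinkUp, and the rate is 40 Gb/sec.

```shell
#!/bin/sh
# Sketch only: summarise `ibstatus` output read from stdin, one line per port.
# A port is OK when state is ACTIVE, phys state is LinkUp and rate is 40 Gb/sec,
# the values shown in the sample output in this document.
# check_ib_links is a hypothetical helper name (assumption).
# Returns 0 only when at least one port was seen and every port is OK.
check_ib_links() {
    awk -v q="'" '
        function report() {
            if (state == "ACTIVE" && phys == "LinkUp" && rate == "40 Gb/sec") {
                print port ": OK"
            } else {
                print port ": CHECK (state=" state ", phys=" phys ", rate=" rate ")"
                bad = 1
            }
        }
        /^Infiniband device/ {
            # A new port header: flush the previous port, then reset the fields.
            if (port != "") report()
            dev = $3
            gsub(q, "", dev)
            port = dev " port " $5
            state = phys = rate = ""
            next
        }
        $1 == "state:"                 { state = $3 }
        $1 == "phys" && $2 == "state:" { phys = $4 }
        $1 == "rate:"                  { rate = $2 " " $3 }
        END {
            if (port != "") report(); else bad = 1
            exit bad + 0
        }
    '
}

# Assumed usage on the storage cell:
#   ibstatus | check_ib_links
```

Any line marked CHECK points at a port to inspect for cable seating or orientation, as described in the installation steps.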
7092757 [F] Dual 40Gb/Sec (4x) QDR InfiniBand Host Channel Adapter Module M3
REFERENCE INFORMATION:
Oracle Server X7-2L Documentation
Steps to shut down or reboot an Exadata storage cell without affecting ASM (Doc ID 1188080.1)
Updating IB partitions after replacing an Infiniband HCA in any nodes within IB network - steps to do after replacing HCA (Doc ID 1985159.1)
For a documentation reference, in the "Exadata Database Maintenance Guide", use the section titled "General Maintenance Information", section “Powering On and Off Oracle Exadata Rack/Non-emergency Power Procedures”, sub-section “Powering Off (or On) Oracle Exadata Rack/Powering Off Storage Servers”.

Attachments
This solution has no attachment