Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2372529.1
Update Date:2018-05-10
Keywords:

Solution Type  Technical Instruction Sure

Solution  2372529.1 :   How to Replace an Exadata X7-2 Compute Node Server Internal RAID HBA Card  


Related Items
  • Exadata X7-2 Hardware
  • Zero Data Loss Recovery Appliance X7 Hardware
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP




In this Document
Goal
Solution
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Exadata internal only for Oracle support engineers use and approved HW partners

Applies to:

Zero Data Loss Recovery Appliance X7 Hardware - Version All Versions to All Versions [Release All Releases]
Exadata X7-2 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal

 How to Replace an Exadata X7-2 Compute Node Server Internal RAID HBA Card.

Solution

DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE FIELD ENGINEER/ADMINISTRATOR NEED:
Exadata X7-2 Training

TIME ESTIMATE: 60 minutes

TASK COMPLEXITY: 2



FIELD ENGINEER/ADMINISTRATOR INSTRUCTIONS

PROBLEM OVERVIEW: An Exadata X7-2 Compute Node Server RAID HBA (SAS disk controller) needs replacement

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY? :

IMPORTANT NOTE TO TSC ENGINEER: CUT & PASTE the “CUSTOMER ACTIVITY” sections of the Pre-Replacement and Post-Replacement steps into an SR Note, and ensure the customer knows to perform these steps before, during, and after the scheduled field engineer activity.

CUSTOMER ACTIVITY:

The RAID write cache must be flushed to disk and the database node shut down prior to the part replacement.

1. Shutdown of the database node is required prior to the part replacement:

If running Linux or Solaris native - follow Steps 1 to 7 of MOS Note:
How to shutdown the Exadata database nodes and storage cells in a rolling fashion so certain hardware tasks can be performed. (Doc ID 1539451.1)

If running OVM - follow Steps 1 to 4 of MOS Note:
How to Shutdown and Startup Exadata compute nodes running OVM (Doc ID 2367609.1)

2. Revert all the RAID disk volumes to WriteThrough mode to ensure all data in the RAID cache memory is flushed to disk and not lost when the SuperCap is disconnected. As the 'root' user, set the cache policy of all logical volumes to WriteThrough mode:

# /opt/MegaRAID/storcli/storcli64 /c0/vall set wrcache=WT

3. Verify the current cache policy for all logical volumes is now WriteThrough:

# /opt/MegaRAID/storcli/storcli64 /c0/vall show

In the volume table, the "Cache" column should report as "NRWTD" where WT indicates WriteThrough.

4. Once the database services are stopped and the cache policy is confirmed as WriteThrough, the customer may shut down the database node using the following command:

# shutdown -hP now
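
The cache-policy check in step 3 can also be scripted before the shutdown rather than read by eye. The following is a minimal sketch, not an Oracle-supplied tool: the function name check_cache_policy is hypothetical, and it assumes the policy appears in the storcli64 output as a standalone field such as NRWTD or NRWBD, as described in step 3. It reads the command output on standard input and takes the wanted write mode (WT or WB) as its argument.

```shell
# Sketch: confirm every cache-policy field on stdin reports the wanted write mode.
# Assumption: the policy is a standalone field such as NRWTD or NRWBD
# (read policy, write policy, I/O policy).
check_cache_policy() {
    want="$1"   # WT or WB
    awk -v want="$want" '
        {
            for (i = 1; i <= NF; i++) {
                # Fields shaped like NRWTD, NRWBD, RWBC or NRAWBD
                if ($i ~ /^N?R(WT|WB|AWB)[DC]$/) {
                    seen++
                    if ($i !~ ("^N?R" want "[DC]$")) bad++
                }
            }
        }
        END {
            if (seen == 0) { print "no cache policy field found"; exit 2 }
            if (bad > 0)   { print bad " volume(s) not in " want " mode"; exit 1 }
            print "all " seen " volume(s) in " want " mode"
        }'
}
```

With the function defined in the root shell, it could be used as follows. The exit status is 0 only when every volume found is in the requested mode, and passing WB instead of WT covers the WriteBack check after the replacement.

# /opt/MegaRAID/storcli/storcli64 /c0/vall show | check_cache_policy WT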

 

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE?:

Prepare the Server for Service

The customer should have already prepared the server and powered it off. If not, provide them the instructions in the previous section.

1. Extend the server to the maintenance position
2. Disconnect the power cords from the power supplies
3. Attach an anti-static wrist strap to your wrist and to a metal area on the chassis or the rack.
4. Remove the server top cover.  Use a Torx T10 screwdriver to unlock the release button latch.

Caution - Ensure that all power is removed from the server before removing or installing the RAID HBA. You must disconnect the power cables from the system before performing these procedures.

 

Caution - These procedures require that you handle components that are sensitive to electrostatic discharge. This sensitivity can cause the components to fail. To avoid damage, ensure that you follow anti-static practices.

 

Removing the RAID HBA

The RAID HBA is located in internal PCIe slot 4 which is part of a riser shared with PCIe slot 3.

1. If there is a NIC card in PCIe slot 3, disconnect and label any network cables.

2. Remove the PCIe riser from slots 3 and 4.

   a. Open the green-tabbed latch located on the rear of the server's chassis next to the PCIe slot 3 to release the PCIe card holding bracket.
   b. To release the riser from the motherboard connector, lift the riser's green-tabbed release lever to the open position.
   c. Grasp the riser with both hands and remove it from the server carefully.

3. Remove the internal host bus adapter card from the riser.

   a. Hold the riser in one hand and use your other hand to carefully remove the card from slot 4 of the riser.
   b. Disconnect the rear bracket attached to the PCIe card from the rear of the PCIe riser.
   c. Place the PCIe riser on an anti-static mat.

4. Disconnect the SAS cables and the super capacitor cable from the RAID HBA card and place the card on an anti-static mat.

5. Use a No. 2 Phillips screwdriver to remove the special fitted bracket from the RAID HBA card.

You will need to install the special fitted bracket on the replacement RAID HBA card. Set aside the bracket and screws until you are ready to install the replacement RAID HBA card. 


Installing the RAID HBA

1. Unpack the replacement RAID HBA card

2. Using a No. 2 Phillips screwdriver, remove the standard HBA bracket that shipped with the replacement HBA card.

3. Install the special fitted bracket that was removed in Step 5 (above) in "Removing the RAID HBA".

4. Connect the 2 SAS cables and the super capacitor cable to the RAID HBA card. Ensure each SAS cable is installed in the correct connector, because each connector maps directly to a group of disk slots: Cable 1 goes to the connector for disk slots 0-3, and Cable 2 goes to the connector for disk slots 4-7. The connectors for disk slots 8-11 and 12-15 should be left empty.

5. Insert the RAID HBA card into the PCIe riser:

   a. Insert the rear bracket that is attached to the PCIe card into the PCIe riser.
   b. Hold the riser in one hand and use your other hand to carefully insert the PCIe card connector into the riser.

While inserting the HBA, ensure that the rear bracket on the HBA card fits into the connector slot on the PCIe riser.

6. Install the PCIe riser into the server:

   a. Raise the green-tabbed release lever on the PCIe riser to the open (up) position, and then gently press the riser into the motherboard connector until it is seated.
   b. Ensure that the rear bracket on the internal SAS HBA card in PCIe slot 4 is connected to the slot in the server's chassis side wall. If the bracket is not connected, remove the riser and re-position it so that the rear bracket connects to the side wall, then gently press the riser into the motherboard connector.
   c. Ensure the SAS cables and the super capacitor cable are routed to the channel on the chassis side wall.
   d. Press the green-tabbed release lever on the PCIe riser to the closed (down) position.
   e. To secure the PCIe card's rear bracket to the server, close the green-tabbed latch on the rear of the server's chassis.

7. If there is a NIC card in PCIe Slot 3, re-connect any network cables that were disconnected.


Return the Server to Operation

1. Install the server top cover. Use a Torx T10 screwdriver to lock the release button latch.
2. Reconnect the power cords to the server power supply and connect any other cables to their original locations.
3. Return the server to the normal rack position.
4. Once the power cords have been re-attached and the ILOM has booted you will see a slow blink on the green LED for the server. Power on the server by pressing the power button on the front of the unit.
5. Connect to the server console via the ILOM and monitor the boot.
    By default the ILOM serial console displays the primary console output.
    In the event of unexpected boot behavior, it is advisable to connect to both ILOM serial and ILOM graphics consoles at the same time and monitor.

 

OBTAIN CUSTOMER ACCEPTANCE

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE?:

FIELD SERVICE ENGINEER and CUSTOMER ACTIVITY:

1. Verify all expected hardware is visible to the server and the fault is cleared. Assistance from the customer for server login access will be required.

2. Verify all the expected disk devices are present. For Exadata X7-2 Compute Nodes, there should be a single disk volume:

# lsscsi | grep MR
[8:2:0:0] disk AVAGO MR9361-16i 4.72 /dev/sda

3. Verify the status of the Super Capacitor is visible and 'Optimal':

# /opt/MegaRAID/storcli/storcli64 /c0/cv show status

4. Set the cache policy of all logical drives to WriteBack cache mode:

# /opt/MegaRAID/storcli/storcli64 /c0/vall set wrcache=WB

5. Verify the current cache policy for all logical volumes is now WriteBack:

# /opt/MegaRAID/storcli/storcli64 /c0/vall show

In the volume table, the "Cache" column should report as "NRWBD" where WB indicates WriteBack.

6. If there is a NIC card in PCIe slot 3, verify the network ports are linked. The ports will be eth3 and eth4 for fiber connections, or eth5 and eth6 for copper connections. One of the bond interfaces (bondeth0 or bondeth1) should be up, and ethtool should report "Link detected: yes". Adjust the interface names in the commands below to match the configuration:

# ip addr show dev eth3
# ip addr show dev eth4
# ip addr show dev bondeth0
# ethtool eth3
# ethtool eth4
# ethtool bondeth0
# cat /proc/net/bonding/bondeth0

7. Re-enable and restart the Database services:

If running Linux or Solaris native - follow Steps 11 to 14 of MOS Note:
How to shutdown the Exadata database nodes and storage cells in a rolling fashion so certain hardware tasks can be performed. (Doc ID 1539451.1)

If running OVM - follow Steps 2 to 5 of MOS Note:
How to Shutdown and Startup Exadata compute nodes running OVM (Doc ID 2367609.1)
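
The disk-volume check in step 2 and the link check in step 6 can also be scripted. This is a minimal sketch under stated assumptions, not an Oracle-supplied tool: the function names are hypothetical, lsscsi is assumed to print one line containing "MR" per RAID volume (as in the sample output in step 2), and ethtool is assumed to print a "Link detected: yes" line for a linked port.

```shell
# Sketch: succeed only if the lsscsi output on stdin shows exactly one MR volume.
expect_single_mr_volume() {
    # grep -c prints 0 and returns non-zero when nothing matches, hence "|| true"
    n=$(grep -c 'MR' || true)
    if [ "$n" -eq 1 ]; then
        echo "OK: 1 RAID volume present"
    else
        echo "UNEXPECTED: $n RAID volumes present"
        return 1
    fi
}

# Sketch: succeed only if the ethtool output on stdin reports a detected link.
link_detected() {
    grep -qi 'Link detected:[[:space:]]*yes'
}

# Sketch: report link state for each interface name given as an argument.
# Interface names (eth3, eth4, bondeth0, ...) are supplied by the caller.
report_links() {
    for ifc in "$@"; do
        if ethtool "$ifc" 2>/dev/null | link_detected; then
            echo "$ifc: link detected"
        else
            echo "$ifc: NO link"
        fi
    done
}
```

With the functions defined in the root shell, they could be used as follows, substituting eth5 and eth6 for copper connections:

# lsscsi | expect_single_mr_volume
# report_links eth3 eth4 bondeth0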

 

PARTS NOTE:

7332895 [F] 16-Port 12Gbps SAS-3 Internal RAID HBA

REFERENCE INFORMATION:

Oracle Exadata Database Machine Maintenance Guide: https://docs.oracle.com/cd/E80920_01/DBMMN/maintaining-exadata-database-servers.htm#DBMMN22020

Oracle Server X7-2 Documentation https://docs.oracle.com/cd/E72435_01/index.html

How to shutdown the Exadata database nodes and storage cells in a rolling fashion so certain hardware tasks can be performed. (Doc ID 1539451.1)

How to Shutdown and Startup Exadata compute nodes running OVM (Doc ID 2367609.1)



  Copyright © 2018 Oracle, Inc.  All rights reserved.