Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2360561.1
Update Date:2018-05-10
Keywords:

Solution Type  Technical Instruction Sure

Solution  2360561.1 :   How to Replace an Exadata X7-2 Compute Node Server CPU  


Related Items
  • Exadata X7-2 Hardware
  • Zero Data Loss Recovery Appliance X7 Hardware
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP




In this Document
Goal
Solution
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Exadata internal only for Oracle support engineers use and approved HW partners

Applies to:

Zero Data Loss Recovery Appliance X7 Hardware - Version All Versions and later
Exadata X7-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Goal

How to Replace an Exadata X7-2 Compute Node Server CPU.

Solution

DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE FIELD ENGINEER/ADMINISTRATOR NEED:

Exadata X7-2 Training

TIME ESTIMATE: 120 minutes

TASK COMPLEXITY: 3-FRU



FIELD ENGINEER/ADMINISTRATOR INSTRUCTIONS

PROBLEM OVERVIEW: An Exadata X7-2 Compute Node Server CPU Processor needs replacement

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY? :

IMPORTANT NOTE TO TSC ENGINEER: CUT & PASTE the “CUSTOMER ACTIVITY” sections of the Pre-Replacement and Post-Replacement steps into an SR Note, and ensure the customer knows to perform these steps before, during, and after the scheduled field engineer activity.

CUSTOMER ACTIVITY:

Shutdown of the database node is required prior to the part replacement:

If running Linux or Solaris native - follow Steps 1 to 9 of MOS Note:
How to shutdown the Exadata database nodes and storage cells in a rolling fashion so certain hardware tasks can be performed. (Doc ID 1539451.1)

If running OVM then follow MOS Note:
How to Shutdown and Startup Exadata database nodes running OVM (Doc ID 2367609.1)

 

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE?:

Prepare the Server for Service

The customer should have already prepared the server and powered it off. If not, provide them with the instructions in the previous section.

1. Extend the server to the maintenance position
2. Disconnect the power cords from the power supplies
3. Attach an anti-static wrist strap to your wrist and to a metal area on the chassis or the rack.
4. Remove the server top cover.  Use a Torx T10 screwdriver to unlock the release button latch.

Caution - Ensure that all power is removed from the server before removing or installing the CPU processor. You must disconnect the power cables from the system before performing these procedures. 

 

Caution - These procedures require that you handle components that are sensitive to electrostatic discharge. This sensitivity can cause the components to fail. To avoid damage, ensure that you follow anti-static practices.

  

Identifying and Removing the CPU Processor

1. Lift the air baffles up and out of the server and set them aside.

2. Identify the location of the faulty CPU processor by pressing the Fault Remind button on the motherboard I/O card.

Note - When you press the Fault Remind button, an LED located next to the button lights green to indicate that there is sufficient voltage in the fault remind circuit to light any fault LEDs that were lit due to a failure. If this LED fails to light when you press the Fault Remind button, it is likely that the capacitor powering the fault remind circuit has lost its charge. This can happen if you press the Fault Remind button for a long time with fault LEDs lit, or if power was removed from the server for more than 15 minutes.

   The fault LED for the faulty CPU lights. The CPU processor fault LEDs are located next to the processors:

  • If the processor fault LED is off, then the processor is operating properly.
  • If the processor fault LED is on (amber), then the processor is faulty and must be replaced.

3. Using a Torx T30 screwdriver, loosen the four captive nuts that secure the processor-heatsink module to the socket in the following order:
    fully loosen nut 4, then 3, then 2, then 1.

4. Lift the processor-heatsink module from the socket.
    Always hold the processor-heatsink module along the axis of the fins to prevent damage.

5. Separate the CPU processor from the heatsink.

    a. Flip over the processor-heatsink module, place it on a flat surface, and locate the thermal interface material (TIM) breaker slot.
    b. While holding down the processor-heatsink module by the edges, insert a flat blade screwdriver into the TIM breaker slot.
        The blade of the screwdriver goes into the slot between the heatsink and processor carrier, not between the processor and processor carrier.
    c. Using a rocking motion, gently pry the corner of the processor carrier away from the heatsink.
    d. Remove the processor carrier with processor from the heatsink by prying or pinching the plastic latch tabs that attach the processor to the heatsink.

Note - A thin layer of thermal grease separates the heatsink and the processor. This grease acts as an adhesive. Do not allow the thermal grease to contaminate the work space or other components.  

6. If you plan on reusing either the heatsink or CPU processor, use an alcohol pad to clean the thermal grease on the underside of the heatsink and on the top of the CPU processor.

Caution - Failure to clean thermal grease from the heatsink could result in the accidental contamination of the processor socket or other components. Also, be careful not to get the grease on your fingers, as this could contaminate components. 

 

Installing the CPU Processor

Caution - Be careful not to touch the CPU processor socket pins. The processor socket pins are very fragile. A light touch can bend the processor socket pins beyond repair.  

1. Ensure that the replacement processor is identical to the failed processor that you removed.
    For a description of the processors that are supported by the server, see Product Description.

2. Use the syringe supplied with the new or replacement processor to apply 0.3 cc of thermal interface material (TIM) in an "X" pattern to the processor contact area of the heatsink.

Note - Do not distribute the TIM; the pressure of the heatsink will do so for you when you install the heatsink.

3. Install the new CPU processor.

    a. Align the pin 1 indicators between the heatsink and processor carrier in the packaging tray, and place the heatsink (thermal side down) onto the processor carrier until it snaps in place and lies flat.

Note - The processor carrier has latching posts at each corner: two that insert into heatsink holes and two that attach to the edge of the heatsink.

    b. Lift the processor-heatsink module out of the packaging tray.
    c. Align the processor-heatsink module to the processor socket bolster plate on the motherboard, matching the pin 1 location (a triangle indicator).
    d. Place the processor-heatsink module on the socket on the motherboard.
        The socket bolster plate has alignment pins that go into holes on the processor-heatsink module to help center the module during installation.
    e. Ensure that the processor-heatsink module lies evenly on the bolster plate and that the captive screws align with the threaded socket posts.
    f. Using a 12.0 in-lbs (inch-pounds) torque driver (part number 7352217) with a Torx T30 bit, tighten the processor-heatsink module to the socket. First, fully tighten captive nuts 1 and 2. Then fully tighten nuts 3 and 4.
       As you tighten nuts 3 and 4, some resistance occurs as the bolster leaf spring rises and comes in contact with the heatsink.
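
The nut sequences and the torque value in the removal and installation steps above can be summarized in a short sketch. The nut numbering and the 12.0 in-lbs figure come from this procedure; the function and constant names are hypothetical, and the newton-metre conversion is included only for engineers whose tools are marked in metric units. The torque driver listed in this procedure remains the tool to use.

```python
# Sketch only: nut order and the 12.0 in-lbs value are taken from the
# procedure text; all names below are illustrative, not Oracle tooling.

IN_LBF_TO_NM = 0.112984829   # one inch-pound (force) expressed in newton-metres

LOOSEN_ORDER = [4, 3, 2, 1]  # removal: fully loosen nut 4, then 3, then 2, then 1
TIGHTEN_ORDER = [1, 2, 3, 4] # installation: nuts 1 and 2 first, then 3 and 4
TORQUE_IN_LBF = 12.0         # rating of the torque driver named in step f


def torque_in_newton_metres(in_lbf):
    """Convert a torque in inch-pounds to newton-metres."""
    return in_lbf * IN_LBF_TO_NM


def checklist(order, action):
    """Return one checklist line per captive nut, in the given order."""
    return [f"{action} captive nut {n}" for n in order]


if __name__ == "__main__":
    for line in checklist(LOOSEN_ORDER, "Fully loosen"):
        print(line)
    for line in checklist(TIGHTEN_ORDER, "Fully tighten"):
        print(line)
    print(f"{TORQUE_IN_LBF} in-lbs is about "
          f"{torque_in_newton_metres(TORQUE_IN_LBF):.2f} N-m")
```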


Return the Server to Operation

1. Install the server top cover. Use a Torx T10 screwdriver to lock the release button latch.
2. Reconnect the power cords to the server power supply and connect any other cables to their original locations.
3. Return the server to the normal rack position.
4. Once the power cords have been reattached and the ILOM has booted, the green LED on the server blinks slowly. Power on the server by pressing the power button on the front of the unit.
5. Connect to the server console via the ILOM and monitor the boot.
    By default the ILOM serial console displays the primary console output.
    In the event of unexpected boot behavior, it is advisable to connect to both ILOM serial and ILOM graphics consoles at the same time and monitor.

 

 

OBTAIN CUSTOMER ACCEPTANCE

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE?:

FIELD SERVICE ENGINEER and CUSTOMER ACTIVITY:

1. Verify all expected hardware is visible to the server and the fault is cleared. Assistance from the customer for server login access will be required. Oracle ILOM access is required to clear server CPU processor faults.

    a. To show server faults, log in to the server as root using the Oracle ILOM CLI, and type the following command to list all known faults on the system:

-> show /SP/faultmgmt

The server lists all known faults, for example:

-> show /SP/faultmgmt
Targets:
shell
0 (/SYS/MB/P0)
Properties:
Commands:
cd
show  

    b. To clear a fault, set clear_fault_action=true on the faulted target. For example, to clear the fault on processor 0:

-> set /SYS/MB/P0 clear_fault_action=true
Are you sure you want to clear /SYS/MB/P0 (y/n)? y
Set 'clear_fault_action' to 'true'

2. Verify there are no outstanding faults in ILOM:

# ipmitool sunoem cli 'show faulty'
Connected. Use ^D to exit.
-> show faulty
Target | Property | Value
-------------------+-----------------------+-----------------------------------
-> Session closed
Disconnected
#

3. Verify there are no outstanding alerts in the Database Node:

# dbmcli -e list alerthistory

4. Re-enable and restart the Database services: 

If running Linux or Solaris native - follow Steps 11 to 14 of MOS Note:
How to shutdown the Exadata database nodes and storage cells in a rolling fashion so certain hardware tasks can be performed. (Doc ID 1539451.1)

If running OVM then follow MOS Note:
How to Shutdown and Startup Exadata compute nodes running OVM (Doc ID 2367609.1)
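
When the console output from steps 1 and 2 is captured to a text file or SR note, the checks can be cross-read with a small script. The following is a minimal sketch that assumes the output has the same shape as the examples in this document. The function names and sample constants are hypothetical, and the script only parses text; it does not connect to ILOM or run any command.

```python
import re

# Sample text copied from the examples in this document.
SAMPLE_FAULTMGMT = """-> show /SP/faultmgmt
Targets:
shell
0 (/SYS/MB/P0)
Properties:
Commands:
cd
show
"""

SAMPLE_SHOW_FAULTY = """Connected. Use ^D to exit.
-> show faulty
Target | Property | Value
-------------------+-----------------------+-----------------------------------
-> Session closed
Disconnected
"""


def faulted_targets(faultmgmt_output):
    """Return component paths such as /SYS/MB/P0 listed under Targets."""
    pattern = r"^[ \t]*\d+[ \t]+\((/SYS/[^)]+)\)[ \t]*$"
    return re.findall(pattern, faultmgmt_output, flags=re.MULTILINE)


def clear_fault_commands(targets):
    """Build the ILOM CLI command shown in step 1b for each faulted target."""
    return [f"set {t} clear_fault_action=true" for t in targets]


def fault_rows(show_faulty_output):
    """Return the table rows that follow the separator line of 'show faulty'."""
    rows = []
    seen_separator = False
    for line in show_faulty_output.splitlines():
        stripped = line.strip()
        if stripped and set(stripped) <= {"-", "+"}:
            seen_separator = True          # the ----+---- line under the header
        elif seen_separator and "|" in stripped:
            rows.append([cell.strip() for cell in stripped.split("|")])
    return rows


if __name__ == "__main__":
    for command in clear_fault_commands(faulted_targets(SAMPLE_FAULTMGMT)):
        print(command)
    print("No outstanding faults" if not fault_rows(SAMPLE_SHOW_FAULTY)
          else "Faults still present")
```

An empty list from fault_rows corresponds to the clean 'show faulty' output shown in step 2. Any returned row should be reviewed in ILOM directly rather than trusted from the parsed text.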

 

PARTS NOTE:

7328735 [F] Pre-Greased 1U CPU Heatsink
- 7339550 CPU Clip, Intel H72851-002
7346820 [F] 24-Core 2.1GHz Xeon P-8160, 150W, SR3B0
7352219 CPU Heatsink Torque Tool
- 7352217 [F] 12 in/lb Torque Driver
- 7353594 5mm Drive x T30 Torx Bit x 6" Shank

 

REFERENCE INFORMATION: 

Oracle Exadata Database Machine Maintenance Guide: https://docs.oracle.com/cd/E80920_01/DBMMN/maintaining-exadata-database-servers.htm#DBMMN22020 

Oracle Server X7-2 Documentation https://docs.oracle.com/cd/E72435_01/index.html 

How to shutdown the Exadata database nodes and storage cells in a rolling fashion so certain hardware tasks can be performed. (Doc ID 1539451.1)

How to Shutdown and Startup Exadata compute nodes running OVM (Doc ID 2367609.1)



  Copyright © 2018 Oracle, Inc.  All rights reserved.