Asset ID: 1-71-2001252.1
Update Date: 2018-05-24
Solution Type: Technical Instruction Sure Solution
2001252.1: How to Replace an Oracle Database Appliance X5-2 Motherboard and Oracle Database Appliance X6-2HA Motherboard
Related Items
- Oracle Database Appliance X5-2
- Oracle Database Appliance X6-2 HA Hardware
Related Categories
- PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP
Oracle Confidential PARTNER - Available to partners (SUN).
Reason: internal CAP for FE's
Applies to:
Oracle Database Appliance X5-2 - Version All Versions and later
Oracle Database Appliance X6-2 HA Hardware - Version All Versions and later
x86_64
Goal
How to Replace an Oracle Database Appliance X5-2 Motherboard and Oracle Database Appliance X6-2HA Motherboard
Solution
CAP PROBLEM OVERVIEW: MOTHERBOARD ASSEMBLY REPLACEMENT
DISPATCH INSTRUCTIONS
WHAT SKILLS DOES THE ENGINEER NEED:
Oracle Database Appliance X5-2/X6-2HA training.
Note: The CPU removal/insertion tool is new for the Sandy Bridge M3 product lines. If you have not used this tool before, familiarize yourself with it before attempting to use it on-site. The tool is not intuitive, so consult the service manual before attempting this service action.
TIME ESTIMATE: 130 minutes
TASK COMPLEXITY: 3-FRU
FIELD ENGINEER INSTRUCTIONS
WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY? :
If the system is still up and functioning, the customer should perform an orderly, graceful shutdown of the applications and OS. Then power off the server and remove the AC power cords from the system.
Before shutting down, back up the BIOS configuration with this command:
/usr/sbin/ubiosconfig export all -f --expert -x /tmp/bios.xml
***This is especially important if the customer has limited the number of licensed cores. ***
Also before shutting down, back up the ILOM configuration.
Assuming the ILOM is not the reason for replacing the system MB, take a current backup of the ILOM SP configuration using a browser, under the “ILOM Administration → Configuration Management” tab in the left menu list.
This can also be done from the ILOM CLI as follows:
cd /SP/config
set passphrase=welcome1
set dump_uri=scp://root:password@laptop_IP/var/tmp/SP.config
***This is especially important if the customer has ASR configured. ***
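Because the BIOS backup is needed again after the replacement, it may be worth confirming that the export produced a readable file before powering off. The following is a minimal Python sketch, not part of the Oracle procedure. It assumes only that /tmp/bios.xml (the path used above) should be a non-empty, well-formed XML file; the helper name check_bios_backup is hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def check_bios_backup(path="/tmp/bios.xml"):
    """Return (ok, message) for the exported BIOS configuration file.

    Assumption: a usable export is a non-empty, well-formed XML file.
    This does not validate the BIOS settings themselves.
    """
    if not os.path.isfile(path):
        return False, f"{path} does not exist"
    if os.path.getsize(path) == 0:
        return False, f"{path} is empty"
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as exc:
        return False, f"{path} is not well-formed XML: {exc}"
    return True, f"{path} parsed, root element <{root.tag}>"
```

Calling check_bios_backup() with no argument checks the default path and returns a pass/fail flag together with a short explanation.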
WHAT ACTION DOES THE ENGINEER NEED TO TAKE:
Reference Doc:
Oracle Server X5-2 Remove the Motherboard:
http://docs.oracle.com/cd/E41059_01/html/E48312/napsm.z40017961418774.html#scrolltoc
Oracle Server X6-2 Remove the Motherboard:
http://docs.oracle.com/cd/E62159_01/html/E62171/z40011771418436.html
1. Log in to the ILOM, check the fruid container values, and sync them if needed.
- To avoid a failure caused by mismatched fruid values after a motherboard replacement, confirm that the fruid data matches in at least the Primary (DBP) and Backup2 (PS0) containers, so that the motherboard's container is updated automatically after replacement. Go into restricted mode and use the showpsnc command to check this.
-> set SESSION mode=restricted
WARNING: The "Restricted Shell" account is provided solely
to allow Services to perform diagnostic tasks.
[(restricted_shell) x5-2]# showpsnc
Primary: fruid:///SYS/DBP
Backup 1: fruid:///SYS/MB
Backup 2: fruid:///SYS/PS0
Element           | Primary           | Backup1           | Backup2
------------------+-------------------+-------------------+-------------------
PPN                 33154574+1+1        33154574+1+1        33154574+1+1
PSN                 1449NM1018          1449NM1018          1449NM1018
Product Name        ORACLE SERVER X5-2  ORACLE SERVER X5-2  ORACLE SERVER X5-2
[(restricted_shell) x5-2]# exit
- The above example shows a system with all three containers properly in sync. If the output from the system does not show matching values in all of the containers, reset the SP and then check the values again. An ILOM reset attempts to auto-populate the matching values if one container is out of sync.
-> reset /SP
Are you sure you want to reset /SP (y/n)? y
Performing reset on /SP
- After an ILOM reset, if the Primary and Backup2 containers match, proceed with the following steps to replace the motherboard. If these two containers do not match, DO NOT proceed with the replacement yet.
- If the containers do not match, use the copypsnc command from service or escalation mode to copy the data from the good container so that the Primary and Backup2 containers match (Backup1 is the MB, which is about to be replaced, so it is less important at this step). If you are unfamiliar with this process and require assistance, refer to the copypsnc steps for fixing the serial number in "How to update product serial number on systems which implement TLI functionality (Doc ID 1280913.1)" and contact the TSC if needed. See also "How to access service mode and escalation mode on ILOM 3.x and later platforms (Doc ID 1019946.1)".
- After the fruid data in the Primary and Backup2 containers has been confirmed to match, proceed with the following steps.
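The go/no-go check in this step can be expressed as a small script run against captured showpsnc output. This is a minimal Python sketch, not an Oracle tool. It assumes the PPN and PSN rows each consist of the element name followed by three whitespace-separated values, as in the sample above; the function names are hypothetical.

```python
def parse_showpsnc(output):
    """Extract the PPN and PSN rows from showpsnc output.

    Returns a dict such as {"PSN": {"Primary": ..., "Backup1": ...,
    "Backup2": ...}, "PPN": {...}}.
    Assumption: each of these rows is the element name followed by three
    whitespace-separated values, as in the sample output.
    """
    columns = ("Primary", "Backup1", "Backup2")
    rows = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0] in ("PPN", "PSN"):
            rows[fields[0]] = dict(zip(columns, fields[1:]))
    return rows


def safe_to_replace_motherboard(output):
    """True when Primary (DBP) and Backup2 (PS0) agree on both PPN and PSN."""
    rows = parse_showpsnc(output)
    if set(rows) != {"PPN", "PSN"}:
        return False  # could not find both rows; treat as not verified
    return all(row["Primary"] == row["Backup2"] for row in rows.values())
```

Backup1 is deliberately ignored here because it lives on the motherboard that is about to be replaced. The Product Name row is skipped because its values contain spaces and cannot be split reliably on whitespace.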
2. Prepare the server for service.
- Power off the server and disconnect the power cords from the power supplies.
- Extend the server to the maintenance position in the rack.
- Attach an anti-static wrist strap.
3. Remove the top cover and all of the Fan Modules.
4. Remove the power supplies.
- If the cable management arm (CMA) is installed, disconnect both CMA left-side connectors and move the CMA out of the way.
Caution - When disconnecting the CMA left-side connectors, use something to support the CMA so that it does not hang down under its own weight and stress the right-side connectors; otherwise, the CMA might be damaged. You must continue to support the CMA until you have reconnected both of the left-side connectors.
- Grasp the power supply handle and push the power supply latch to the left.
- Pull the power supply out of the chassis. Repeat the previous two sub-steps (release the latch, then pull) for the second power supply.
Caution - When removing the power supplies it is important to label power supplies with the slot numbers from which they were removed (PS0, PS1). This is required because the power supplies must be reinstalled into the slots from which they were removed; otherwise, the server key identity properties (KIP) data might be lost.
5. Remove the PCIe cards and PCIe risers.
6. Disconnect all the cables from the motherboard.
- To disconnect the disk backplane power cable from the motherboard, press in on the connector latch and pull the connector out.
- To eject the disk backplane auxiliary power and signal cable connector, open both side latches.
- To eject the FIM cable connector, open both side latches.
- If the server has a DVD drive, do the following:
- Disconnect the DVD drive cable from the motherboard.
- To move the DVD drive cable away from the motherboard, carefully guide it through the chassis mid-wall and place it on top of the disk cage. You do not need to disconnect the DVD drive cable from the DVD drive.
- To remove the SAS cables and the super capacitor cable that were connected to the HBA card, carefully lift them out of the chassis and place them on top of the disk cage so that they are away from the motherboard.
- To remove the cables that were connected to the switch card, carefully guide them through the chassis mid-wall and put them aside.
7. Remove the server mid-wall.
- Using a screwdriver (No. 2 Phillips or flathead), loosen the four green captive screws that secure the mid-wall to the server chassis.
- Lift up the mid-wall slightly to disengage it from the raised mushroom-shaped standoffs that are located on the server chassis sidewall (one on each end of the mid-wall), then lift it out of the server and set it aside.
8. Remove the motherboard from the server chassis.
9. Remove the motherboard components.
- Remove the air baffle from the motherboard and set it aside.
- Remove the internal USB flash drives from the motherboard making note of the original port locations.
- Remove the DIMMs from the motherboard.
- Remove the processors from the failed motherboard.
- Gently press down on the top of the heatsink to counteract the pressure of the captive spring-loaded screws that secure the heatsink to the motherboard and loosen the four Phillips captive screws in the heatsink for the failed processor.
- Using a No. 2 Phillips screwdriver, turn the screws counterclockwise alternately one and one half turns until they are fully removed.
- To separate the heatsink from the top of the processor, gently twist the heatsink left and right, while pulling upward, and then lift off the heatsink and place it upside down on a flat surface. A thin layer of thermal grease separates the heatsink and the processor. This grease acts as an adhesive.
- Use an alcohol pad to clean the thermal grease from the underside of the heatsink. Be very careful not to get the thermal grease on your fingers.
- Disengage the processor release lever on the right side of the processor socket (viewing the server from the front) by pushing down on the lever and moving it to the side away from the processor, and then rotating the lever upward.
- Disengage the processor release lever on the left side of the processor socket (viewing the server from the front) by pushing down on the lever and moving it to the side away from the processor, and then rotating the lever upward.
- To lift the ILM assembly load plate off of the processor socket, rotate the processor release lever on the right side of the processor toward the closed position (the ILM assembly load plate is lifted up as the release lever is lowered toward the closed position) and carefully swing the ILM load plate to the fully open position.
- To remove the processor from the processor socket, acquire the processor removal and replacement tool and perform the following steps:
- Identify the correct processor removal and replacement tool based on the size of the processor. Processors with 10 or fewer cores are smaller than processors with 12 or more cores. You can determine the size of the processor that you are going to remove and replace in either of two ways: via ILOM or visually.
- After the heatsink has been removed, you can determine whether the larger processors are installed by looking at the right and left edges of the processor. If the edges extend beyond the boundaries of the processor alignment brackets, they are the larger processors. If they are within the alignment brackets, they are the smaller processors.
- The processor removal and replacement tool is color coded:
- Green: removal and replacement tool for the smaller processor models (10 or fewer cores).
- Pink: removal and replacement tool for the larger processor models (12 or more cores).
- Locate the button in the center of the top of the processor removal and replacement tool and press it to the down position.
- Properly position the tool over the processor socket and lower it into place over the processor socket. To properly position the tool over the processor socket, rotate the tool until the colored triangle on the side of the tool is facing the front of the server and it is over the left side of the processor socket when viewing the server from the front.
- Press the release lever on the tool to release the center button and engage the processor. An audible click indicates that the processor is engaged.
- Grasp the tool by the sides and remove it from the server.
- Turn the tool upside down and verify that it contains the processor.
- While holding the processor tool upside down, press the center button on the tool to release the processor.
- Carefully grasp the processor by the front and back edges, lift it out of the tool and place it with the circuit side down (the installed orientation) onto the antistatic mat.
- Carefully clean the thermal grease off the top of the processor.
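The tool selection rule in this step (green for 10 or fewer cores, pink for 12 or more) is simple enough to encode as a lookup, for example in a pre-visit checklist script that already knows the core count reported by ILOM. This is a minimal Python sketch; the function name is hypothetical, and the 11-core case is rejected because the text above does not describe it.

```python
def processor_tool_color(core_count):
    """Return the color of the removal/replacement tool for a processor.

    Mapping taken from the procedure: 10 or fewer cores -> green,
    12 or more cores -> pink. An 11-core part is not described there,
    so it is rejected rather than guessed.
    """
    if core_count < 1:
        raise ValueError("core count must be positive")
    if core_count <= 10:
        return "green"
    if core_count >= 12:
        return "pink"
    raise ValueError("11-core processors are not covered by the procedure")
```

The visual check against the alignment brackets remains the final authority once the heatsink is off.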
10. Install the motherboard components on the replacement board.
- Remove the processor socket covers from the replacement motherboard.
- Disengage the processor ILM (independent loading mechanism) assembly hinge lever on the right side of the processor socket (viewing the server from the front) by pushing down on the lever and moving it to the side away from the processor, and then rotating the lever upward.
- Disengage the processor ILM assembly load lever on the left side of the processor socket (viewing the server from the front) by pushing down on the lever and moving it to the side away from the processor, and then rotating the lever upward.
- To lift the processor ILM assembly load plate off of the processor socket, rotate the ILM assembly hinge lever on the right side of the processor toward the closed position (the load plate is lifted up as the hinge lever is lowered) and carefully swing the load plate to the fully open position.
- Grasp the top and underside of the processor socket cover with one hand (place your thumb against the underside of the cover), place your other thumb against the underside of the cover, and carefully push the cover out of the processor ILM assembly load plate. Be careful not to allow the processor socket cover to fall into the processor socket as this could result in damage to the socket.
- Repeat the preceding sub-steps to remove the second processor socket cover from the replacement motherboard.
- Install the socket covers on the bad motherboard processor sockets to protect the sockets during transport.
- Open one of the processor ILM assemblies on the failed motherboard.
- Hold the processor ILM assembly load plate open with one hand and position the processor socket cover over the top of the ILM assembly load plate so that 1) the arrow on the processor socket cover is aligned with the arrow on the top left bottom of the load plate and 2) the fasteners on one side of the cover (the fasteners are located on the underside of the cover) are inside the load plate (it does not matter which side), and use your thumb to press the other side of the processor socket cover into the load plate. You will hear a clicking sound when the processor socket cover snaps into place.
- Close the processor ILM assembly load plate.
- Repeat the three preceding sub-steps to install the second processor socket cover on the failed motherboard.
- Install the processors on the replacement motherboard.
- Ensure that the two processor ILM assembly levers and the ILM assembly load plate are in the fully open position.
- To install the replacement processor into the socket, acquire the processor removal and replacement tool and perform the following steps:
- Press the button in the center of the tool to the down position.
- Turn the tool upside down, grasp the processor by its front and back edges, and position the processor (circuit side up) in the tool so that the triangle on the corner of the processor aligns with the triangle on the side of the processor removal and replacement tool.
- Lower the processor into the tool and press the tool release lever to release the center button and engage the processor. An audible click indicates that the processor is locked in place.
- Properly position the tool over the processor socket and lower it into place. To properly position the tool in the processor socket, rotate the tool until the colored triangle on the side of the tool is facing the front of the server and it is over the left side of the processor socket (when viewing the server from the front) and lower the tool into the processor socket.
- Press the center button in the tool down to release the processor so that it is placed in the socket.
- Remove the processor removal and replacement tool.
- Visually check the alignment of the processor in the socket. When properly aligned, the processor sits flat in the processor socket.
Caution - Do not press down on the processor. Irreparable damage to the processor or motherboard might occur from excessive downward pressure. Do not forcibly seat the processor into the socket. Excessive downward pressure might damage the socket pins.
- Swing the processor ILM assembly load plate into the closed position. Ensure that the load plate sits flat around the periphery of the processor.
- Engage the socket release lever on the left side of the socket (viewing the server from the front) by rotating it downward and slipping it under the catch.
- Engage the socket release lever on the right side of the socket (viewing the server from the front) by rotating it downward and slipping it under the catch.
- Use the syringe (supplied with the replacement motherboard) to apply approximately 0.1 ml of thermal grease to the center of the top of the processor. To measure 0.1 ml of thermal grease, use the graduated scale on the thermal grease syringe.
Note - Do not distribute the grease; the pressure of the heatsink will do so for you when you install the heatsink.
- Inspect the heatsink for dust and lint. Clean the heatsink if necessary.
- Orient the heatsink so that the screws line up with the mounting studs.
- Carefully position the heatsink on the processor, aligning it with the mounting posts to reduce movement after it makes initial contact with the layer of thermal grease.
Caution - Avoid moving the heatsink after it has contacted the top of the processor. Too much movement could disturb the layer of thermal grease, causing voids, and leading to ineffective heat dissipation and component damage.
- Tighten the Phillips screws with a No. 2 Phillips screwdriver alternately one-half turn until fully seated.
- Repeat the processor installation sub-steps above to install the second processor on the replacement motherboard.
- Install the DIMMs in the corresponding DIMM sockets on the replacement motherboard. Install each DIMM only in the socket (connector) that corresponds to the socket from which it was removed. Performing a one-to-one replacement of the DIMMs significantly reduces the possibility that the DIMMs will be installed in the wrong slots.
- Install the internal USB flash drives onto the replacement motherboard. Make sure that each drive goes into its original USB port location.
- Install the air baffle on the replacement motherboard.
11. Install the motherboard into the server chassis.
12. Install the server mid-wall.
13. Reconnect all the cables to the motherboard.
14. Reinstall the PCIe cards and PCIe risers.
15. Reinstall the power supplies.
- Align the replacement power supply with the empty power supply slot.
Caution - When reinstalling power supplies, it is important to reinstall them into the slots from which they were removed during the motherboard removal procedure; otherwise, the server key identity properties (KIP) data might be lost. When a server requires service, the KIP is used by Oracle to verify that the warranty on the server has not expired.
- Slide the power supply into the bay until it is fully seated. You will hear an audible click when the power supply fully seats. Repeat the align and slide sub-steps for the second power supply.
- If you disconnected the two CMA left-side connectors, reconnect the connectors.
16. Reinstall all of the Fan Modules and the top cover.
17. Return the Server to operation.
- Remove any anti-static measures that were used.
- Return the server to its normal operating position within the rack.
- Reinstall the AC power cords and any data cables that were removed.
- Power on the server. Verify that the Power/OK indicator LED lights steady on.
18. Set the system serial number/fruid data if needed.
- The motherboard is not the primary fruid container in this server, so when it is replaced you should not normally need to fix the serial number information.
- Log in to the ILOM as root and then enter the restricted shell to check the fruid values. Follow the example below to enter the restricted shell and use the showpsnc command.
-> set SESSION mode=restricted
WARNING: The "Restricted Shell" account is provided solely
to allow Services to perform diagnostic tasks.
[(restricted_shell) x5-2:~]# showpsnc
Primary: fruid:///SYS/DBP
Backup 1: fruid:///SYS/MB
Backup 2: fruid:///SYS/PS0
Element           | Primary           | Backup1           | Backup2
------------------+-------------------+-------------------+-------------------
PPN                 33154574+1+1        33154574+1+1        33154574+1+1
PSN                 1449NM1018          0000000000          1449NM1018
Product Name        ORACLE SERVER X5-2  ORACLE SERVER X5-2  ORACLE SERVER X5-2
[(restricted_shell) x5-2:~]#
- When the motherboard is replaced, the Backup1 fruid container will likely not match the Primary entry. If it does not, you must enter escalation or service mode to fix it (if all three entries match, this step is done).
- Contact the TSC to request an escalation password. Service mode is also sufficient if only the copypsnc command is needed; if the setpsnc command is needed, escalation mode is required. setpsnc is not covered in this procedure.
- Provide your TSC contact with the output from the following ILOM commands: "version", "show /SYS product_serial_number", and "show /SP/clock". If the product_serial_number information does not give good output, also provide the showpsnc output seen in the previous step.
** REFER TO DOC ID 1280913.1 for the procedure on how to update the TLI serial number fields **
- At this point, if all of the fruid containers match and have the correct serial number data, this step is done. If more than one of the fruid containers had invalid entries, use the copypsnc command to copy the valid data over to the other container that is not valid (for example, "copypsnc Primary Backup2" copies Primary to Backup2). After confirming that all fruid data is correct, reset the ILOM to confirm that the fruid data persists through a reboot, and remove the escalation user if needed.
-> reset /SP
Are you sure you want to reset /SP (y/n)? y
Performing reset on /SP
..........
***login as the root user again and check the fruid data***
-> set SESSION mode=restricted
WARNING: The "Restricted Shell" account is provided solely
to allow Services to perform diagnostic tasks.
[(restricted_shell) x5-2]# showpsnc
Primary: fruid:///SYS/DBP
Backup 1: fruid:///SYS/MB
Backup 2: fruid:///SYS/PS0
Element           | Primary           | Backup1           | Backup2
------------------+-------------------+-------------------+-------------------
PPN                 33154574+1+1        33154574+1+1        33154574+1+1
PSN                 1449NM1018          1449NM1018          1449NM1018
Product Name        ORACLE SERVER X5-2  ORACLE SERVER X5-2  ORACLE SERVER X5-2
[(restricted_shell) x5-2]# exit
-> cd /SP/users
/SP/users
-> delete escuser
Are you sure you want to delete /SP/users/escuser (y/n)? y
Deleted /SP/users/escuser
- If trouble is encountered during any of the steps of accessing escalation mode and fixing the fruid containers please contact the TSC for assistance.
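As a rough aid for step 18, the decision about which containers need a copy can be sketched in Python. This is an illustrative helper only. It assumes that the Primary container holds the correct serial number (otherwise setpsnc, which this procedure does not cover, would be needed). The generated command strings follow the "copypsnc Primary Backup2" form quoted above and should be checked against Doc ID 1280913.1 before use. The function name is hypothetical.

```python
def plan_copypsnc(psn):
    """Suggest copypsnc commands that bring the backups in line with Primary.

    `psn` maps container names ("Primary", "Backup1", "Backup2") to the PSN
    value shown by showpsnc.
    Assumption: Primary holds the correct serial number.
    """
    good = psn["Primary"]
    return [
        f"copypsnc Primary {name}"
        for name in ("Backup1", "Backup2")
        if psn[name] != good
    ]
```

An empty list means all three containers already agree and no copy is required.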
How to verify that the motherboard is working properly.
1. Log in to the ILOM and confirm that the motherboard status is OK.
Sample output:
-> show /SYS/MB
/SYS/MB
Targets:
BIOS
CPLD
FM0
FM1
FM2
FM3
NET0
NET1
NET2
NET3
P0
P1
RISER1
RISER2
RISER3
T_CORE_NET01
T_CORE_NET23
T_IN_PS
T_IN_SLOT1
T_IN_SLOT2
T_IN_SLOT3
T_OUT_SLOT1
T_OUT_SLOT2
T_OUT_SLOT3
Properties:
type = Motherboard
ipmi_name = MB
fru_description = ASM,MOTHERBOARD,1U
fru_manufacturer = MiTAC International Corporation
fru_part_number = 7098505
fru_rev_level = 06
fru_serial_number = 489089M+14364B00M8
fault_state = OK
clear_fault_action = (none)
Commands:
cd
set
show
->
2. Check the ILOM fault management and event log for any errors related to the motherboard.
-> show /SP/faultmgmt
-> show /SP/logs/event/list
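If the output of "show /SYS/MB" is captured to a file or a session log, the fault_state check can be automated. This is a minimal Python sketch that assumes properties appear as "name = value" lines, as in the sample above. The function names are hypothetical and the check covers only the fault_state property, not the event log.

```python
def parse_ilom_properties(show_output):
    """Collect "name = value" pairs from ILOM `show` output into a dict."""
    properties = {}
    for line in show_output.splitlines():
        name, sep, value = line.partition(" = ")
        if sep:
            properties[name.strip()] = value.strip()
    return properties


def motherboard_is_healthy(show_output):
    """True only when the output reports fault_state = OK."""
    return parse_ilom_properties(show_output).get("fault_state") == "OK"
```

Missing output or a missing fault_state line is treated as unhealthy, so an incomplete capture cannot pass the check by accident.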
OBTAIN CUSTOMER ACCEPTANCE
WHAT ACTION DOES THE CUSTOMER NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:
Boot up system and verify full functionality.
19. Edit the following files in the /etc/sysconfig/network-scripts directory to change the MAC address to the new MAC address.
* If the system is running the Virtualized OS, then this is done in DOM0 *
If the server node has an IB card installed in slot 1:
ifcfg-eth0 <<< /SYS/MB/NET0
ifcfg-eth1 <<< /SYS/MB/NET1
ifcfg-eth2 <<< /SYS/MB/NET2
ifcfg-eth3 <<< /SYS/MB/NET3
If the server node has a fiber card installed in slot 1:
ifcfg-eth0 <<< /SYS/MB/NET0
ifcfg-eth1 <<< /SYS/MB/NET1
ifcfg-eth4 <<< /SYS/MB/NET2
ifcfg-eth5 <<< /SYS/MB/NET3
There are two ways to get the MAC address: with ipmitool or from the ILOM command line.
From ipmitool:
# ipmitool sunoem cli "show /SYS/MB/NET0 fru_macaddress" << repeat for NET1, NET2, NET3
From the ILOM command line:
-> show /System/Networking/Ethernet_NICs/Ethernet_NIC_0 mac_addresses << repeat for NIC_1, NIC_2, NIC_3
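The file-to-port mapping and the MAC substitution in step 19 can be sketched as follows. This is a minimal Python illustration, not an Oracle utility. It assumes that each ifcfg file records the MAC address in a HWADDR= line, which is the usual convention for network-scripts files but should be confirmed on the system. The names IFCFG_TO_NET, extract_mac, and set_hwaddr are hypothetical.

```python
import re

# ifcfg file -> ILOM target, as listed in step 19 for each card type in slot 1.
IFCFG_TO_NET = {
    "ib": {"ifcfg-eth0": "NET0", "ifcfg-eth1": "NET1",
           "ifcfg-eth2": "NET2", "ifcfg-eth3": "NET3"},
    "fiber": {"ifcfg-eth0": "NET0", "ifcfg-eth1": "NET1",
              "ifcfg-eth4": "NET2", "ifcfg-eth5": "NET3"},
}

# Six colon-separated pairs of hex digits.
MAC_RE = re.compile(r"\b([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})\b")


def extract_mac(command_output):
    """Return the first MAC address found in the text, or None."""
    match = MAC_RE.search(command_output)
    return match.group(1) if match else None


def set_hwaddr(ifcfg_text, new_mac):
    """Return ifcfg file content with its HWADDR line set to new_mac.

    Assumption: the file stores the MAC in a HWADDR= line; if no such
    line exists, one is appended.
    """
    lines = ifcfg_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("HWADDR="):
            lines[i] = f"HWADDR={new_mac}"
            break
    else:
        lines.append(f"HWADDR={new_mac}")
    return "\n".join(lines) + "\n"
```

The functions operate on strings only, so the result can be reviewed before anything is written back to /etc/sysconfig/network-scripts. Keeping a copy of each original file before editing is a sensible precaution.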
20. Check if ILOM firmware is running at the correct level for the ODA software release:
# oakcli show version -detail
Reading the metadata. It takes a while...
System Version  Component Name          Installed Version   Supported Version
--------------  ---------------         ------------------  -----------------
12.1.2.2.0
                Controller_INT          4.230.40-3739       Up-to-date
                Controller_EXT          04.00.00.00         Up-to-date
                Expander                0018                Up-to-date
                SSD_SHARED {
                [ c1d16,c1d17,c1d18,    A122                Up-to-date
                c1d19 ]
                [ c1d20,c1d21,c1d22,    A122                Up-to-date
                c1d23 ]
                }
                HDD_LOCAL               A690                Up-to-date
                HDD_SHARED              A2D2                Up-to-date
                ILOM                    3.2.4.34 r95732     Up-to-date
                BIOS                    30030800            Up-to-date
Note on the ILOM row: if the installed version is not equal to or newer than the supported version, it will need updating. It is not necessary to downgrade firmware to match the "Supported Version" column.
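The rule for the ILOM row (update only when the installed version is older than the supported one, never downgrade) amounts to a version comparison. This is a minimal Python sketch. It assumes that the Supported Version column contains either the text "Up-to-date", as in the sample, or a dotted version string; the second form does not appear in the sample and is an assumption. The function names are hypothetical.

```python
import re


def version_key(text):
    """Turn the leading dotted number of a version string into a tuple.

    "3.2.4.34 r95732" -> (3, 2, 4, 34). Only dotted-decimal versions are
    handled; other formats (for example "A122") raise ValueError.
    """
    match = re.match(r"\d+(?:\.\d+)+", text.strip())
    if not match:
        raise ValueError(f"not a dotted version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def needs_update(installed, supported):
    """True when the installed version is older than the supported one.

    Assumption: the Supported Version column holds either "Up-to-date"
    or a dotted version string.
    """
    if supported.strip().lower() == "up-to-date":
        return False
    return version_key(installed) < version_key(supported)
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "3.2.10" as older than "3.2.4". A newer installed version returns False, which matches the guidance not to downgrade.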
To restore the BIOS configuration, use the following command:
# /usr/sbin/ubiosconfig import config -f --expert -y -x /tmp/bios.xml
This works only if /tmp/bios.xml was created before the motherboard replacement, as shown in the example at the beginning of this document.
To restore the ILOM configuration:
Use the web browser under the Maintenance tab, or use the ILOM CLI:
cd /SP/config
set passphrase=welcome1
set load_uri=scp://root:password@laptop_IP/var/tmp/SP.config
To update the ILOM version on systems running 12.1.2.5.0 and below, use the following command:
# oakcli update -patch 12.1.2.2.0 --infra <<< make sure you use the correct version; 12.1.2.2.0 is only an example
To update the ILOM version on systems running 12.1.2.6.0, do not use oakcli; use the ILOM web GUI instead. Follow the instructions in the ILOM Administrator's Guide for Configuration and Maintenance.
To update the ILOM version on systems running 12.1.2.7.0 and above, use the following command:
# oakcli update -patch 12.1.2.7.0 --local --server <<< make sure you use the correct version; 12.1.2.7.0 is only an example
* The oakcli update commands above reboot the node whose ILOM/BIOS is being updated.
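The three version-dependent update paths above can be summarized as a small decision function. This is a minimal Python sketch. The return labels ("oakcli-infra", "ilom-web-gui", "oakcli-server") are invented for illustration and are not oakcli options. The function expects the five-part release form used in the text and rejects releases that fall between the documented versions instead of guessing.

```python
def ilom_update_method(oda_version):
    """Pick the ILOM update path for an ODA software release.

    Returns "oakcli-infra", "ilom-web-gui", or "oakcli-server" (labels
    invented for this sketch). Releases that fall between the versions
    named in the procedure are rejected rather than guessed.
    """
    version = tuple(int(part) for part in oda_version.split("."))
    if version <= (12, 1, 2, 5, 0):
        return "oakcli-infra"
    if version == (12, 1, 2, 6, 0):
        return "ilom-web-gui"
    if version >= (12, 1, 2, 7, 0):
        return "oakcli-server"
    raise ValueError(f"no update method documented for {oda_version}")
```

For example, a node on 12.1.2.2.0 maps to the --infra form of the oakcli command, while a node on 12.1.2.6.0 maps to the web GUI procedure.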
REFERENCE INFORMATION:
Oracle Database Appliance Documentation
http://docs.oracle.com/cd/E88491_01/doc.121/e86800/toc.htm
Oracle Integrated Lights Out Manager (ILOM) 3.2 Documentation
http://docs.oracle.com/cd/E37444_01/index.html
References
<NOTE:2105219.1> - After Motherboard Replacement, Issues With The NICs showing __tmpXXX