Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Technical Instruction Sure Solution

2134593.1 : How to Replace a SPARC S7-2 (1 or 2 Processor) Motherboard and SP (Embedded Service Processor) [VCAP]
Oracle Confidential PARTNER - Available to partners (SUN).

Applies to:
MiniCluster S7-2 Hardware - Version All Versions to All Versions [Release All Releases]
SPARC S7-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal

How to Replace a SPARC S7-2 (1 or 2 Processor) Motherboard and SP (Embedded Service Processor)

ESD Caution:
Contamination Caution:
Solution

DISPATCH INSTRUCTIONS

Note - When replacing a SPARC S7-2 P/N 7334020 (2 processor MB) or P/N 7334023 (1 processor MB), refer to "How to Properly Order the Correct Systemboard/Motherboard on a SPARC S7-2 (Doc ID 2162560.1)". It explains the differences between the two motherboards and how to order and replace the correct one, so that the wrong motherboard is not installed into the system.
DAMAGE ALERT: Visually inspect the replacement part to make sure that nothing was damaged during shipping (damaged components or connectors, bent pins, damaged packages, etc.). If the part is damaged, do not install it into the system; order a new part. Handle the return FRU with caution and package it just as carefully as the new FRU was packaged, to avoid damage during shipping.
Note - The LDOM configuration (if any) needs to be saved before motherboard replacement to avoid losing it. Refer to Doc ID 1019720.1 for details.
Note - A data backup is not a prerequisite but is a wise precaution.
The customer should perform an orderly and graceful shutdown of applications and the OS to get to the OpenBoot PROM prompt. Run the printenv command and make a note of any OpenBoot PROM variables that have been modified. Then power off the server and remove the AC power cords from the system.

WHAT ACTION DOES THE ENGINEER NEED TO TAKE:

Verify/Update TLI Prior to Replacement

1. Log into the ILOM, check the fruid container values, and sync them if needed.

a. To keep mismatched fruid values from causing a failure after a motherboard replacement, confirm that at least the Primary (DBP) and Backup1 (PS0) containers hold matching data, so that the motherboard will have its container updated automatically after replacement. Go into restricted mode and use the showpsnc command to check this.

-> set SESSION mode=restricted
WARNING: The "Restricted Shell" account is provided solely to allow Services to perform diagnostic tasks.
[(restricted_shell) s7-2-bur09-a-sp:~]$ showpsnc
Primary: fruid:///SYS/DBP
Backup 1: fruid:///SYS/PS0
Backup 2: fruid:///SYS/MB
Element           | Primary              | Backup1              | Backup2
------------------+----------------------+----------------------+-------------------
PPN                 34235727+1+1           34235727+1+1           34235727+1+1
PSN                 AK00370269             AK00370269             AK00370269
MACADDR             00:10:E0:B3:0C:28      00:10:E0:B3:0C:28      00:10:E0:B3:0C:28
HOSTID              86b30c28               86b30c28               86b30c28
Product Name        SPARC S7-2             SPARC S7-2             SPARC S7-2
[(restricted_shell) s7-2-bur09-a-sp:~]$ exit

b. The above example shows a system with all three containers properly in sync. If the output from the system does not show all of the containers with matching values, reset the SP and then check the values again. An ILOM reset will attempt to auto-populate the matching values if one container is out of sync.

-> reset /SP
Are you sure you want to reset /SP (y/n)? y
Performing reset on /SP

c. After the ILOM reset, if the Primary and Backup1 containers match, proceed with the following steps to replace the motherboard. If these two containers do not match, DO NOT proceed with the replacement yet. If you are unfamiliar with this process and require assistance, refer to the steps for using copypsnc to fix the serial number detailed in "How to update product serial number on systems which implement TLI functionality (Doc ID 1280913.1)" and contact the TSC if needed. See also "How to access service mode and escalation mode on ILOM 3.x and later platforms (Doc ID 1019946.1)".

Remove Motherboard (w/ Embedded Service Processor)

1. Prepare the server for service.

Caution - Components inside the chassis might be hot. Use caution when servicing components inside the chassis.
Note - When replacing the motherboard, you will need to remove the SCC PROM from the old motherboard and install the component on the new motherboard. The SCC PROM contains the system host ID and MAC address. Transferring these components preserves the system-specific information stored on these modules.
2. Remove the top cover and open the fan door to remove all of the fan modules.

Note - On the one-socket S7-2 only, there are three fans and a fan filler block, which must be removed by removing a Torx screw. The two-socket S7-2 has four fans and no fan filler block.

c. To open the server top cover, press and hold down the top cover release button and use the recessed area to slide the top cover toward the rear of the server about 0.5 inches (12.7 mm).

3. Remove the clear plastic air duct assembly cover:

4. Remove all PCIe cards.

Note - Always remove transceivers from a PCIe card before removing the card from the server.
Note - Keep track of which slot each PCIe card was in so you can return them to their original positions.

A. Removing PCIe cards in slots 1 and 2:

Note - If the riser does not have a PCIe card installed, then lift the latch to release the PCIe slot filler panel.

3. Lift the green-tabbed riser release lever on the PCIe riser with one hand and use your other hand to remove the riser from the motherboard connector.

B. Removing PCIe cards from slots 3 and 4:

Note - This PCIe riser is actually installed in PCIe slot 3, but it supports up to two PCIe cards. The upper slot, referred to as slot 3, can be used for any supported PCIe card and is therefore optionally populated. The lower slot, referred to as slot 4, is dedicated to the internal HBA card and is therefore always populated. The internal HBA card should be serviced only by authorized Oracle Services personnel.

1. If there is a PCIe card installed in the riser, disconnect any external or internal cables.

Note - Do not disconnect the SAS cable from the internal host bus adapter card until after you have removed the riser from the server.

2. Open the green-tabbed latch located on the rear of the server chassis next to PCIe slot 3 to release the rear bracket on the PCIe card.

Note - If the riser does not have a PCIe card installed in slot 3, then lift the latch to release the PCIe slot 3 filler panel.

a. To release the riser from the motherboard connector, lift the green-tabbed release lever on the PCIe riser to the open position.

5. If you are replacing the motherboard, remove the following components and place them on an ESD mat:

Note - Install the DIMMs only in the sockets (connectors) that correspond to the sockets from which they were removed. Performing a one-to-one replacement of the DIMMs significantly reduces the possibility that the DIMMs will be installed in the wrong slots. If you do not reinstall the DIMMs in the same sockets, server performance might suffer and some DIMMs might not be used by the server.
6. Remove the server mid-wall.

7. Remove the HBA SAS Cable Assembly.

8. Remove the NVMe Cables:

9. Removing Power Supplies

Note - During the motherboard removal procedure, it is important to label power supplies with the slot numbers from which they were removed (PS0, PS1). This is required because the power supplies must be reinstalled into the slots from which they were removed; otherwise, the server key identity properties (TLI) data might be lost.

a. Disconnect the power cord from the faulty power supply.

10. Internal cables

11. Remove the motherboard from the server chassis.

12. Remove the eUSB drive from the motherboard and install it on the replacement motherboard.

13. Remove the SEEPROM from the motherboard and install it on the replacement motherboard.

Install Motherboard (w/ Embedded Service Processor)

1. Insert the motherboard into the server chassis.

Note - Take extra care to ensure that the locator LED/push-button protrudes through the hole in the rear I/O panel and is not bound up.
2. Internal cables

3. Installing Power Supplies

Note - When reinstalling power supplies, it is important to reinstall them into the slots from which they were removed during the motherboard removal procedure; otherwise, the server key identity properties (TLI) data might be lost.

a. Push each power supply into the chassis fully until you feel it click into position.

4. Installing NVMe Cables

5. Installing PCIe Cards

Note - If the riser does not have a PCIe card installed, install a PCIe slot filler panel and close the green-tabbed latch to secure the filler panel.

5. If there were any external cables connected to the PCIe card, reconnect them.

Note - If the riser does not have a PCIe card installed in slot 3, install a PCIe slot filler panel and close the green-tabbed latch to secure the PCIe slot filler panel.

8. If there is a PCIe card installed in slot 3 of the riser, reconnect any external or internal cables to the card.

6. Installing the server mid-wall

7. Install Fans

Note - On the one-socket S7-2 only, there are three fans and a fan filler block, which must be installed by tightening a Torx screw. The two-socket S7-2 has four fans and no fan filler block.

8. Take the clear plastic air duct assembly and place it over the top of the CPUs and DIMMs, allowing the DIMMs to protrude through the open slots and the assembly to fit between the chassis walls, power supply metal casing, and mid-wall.

9. Install the server top cover and close the fan door:

10. Return the server to operation.

11. Prior to powering on the server, connect a terminal or a terminal emulator (PC or workstation) to the SER MGT port.

Note - The LDOM configuration (if any) needs to be restored after motherboard replacement to avoid losing it. Refer to Doc ID 1019720.1 for details.
12. Power on the server. Verify that the Power/OK indicator LED lights steady on.

Verify/Update TLI After Replacement

1. Set the system serial number/fruid data if needed.

-> cd /SP/users
/SP/users
-> create escuser
Creating user...
Enter new password: ********
Enter new password again: ********
Created /SP/users/escuser
-> set escuser role=aucros
Set 'role' to 'aucros'
-> show escuser
/SP/users/escuser
Targets:
ssh
Properties:
role = aucros
password = *****

g. Set check_physical_presence to false and then exit from the ILOM so that you can log in as the newly created user.

-> set /SP check_physical_presence=false
Set 'check_physical_presence' to 'false'
-> show /SP check_physical_presence
/SP
Properties:
check_physical_presence = false
-> exit

h. Log in as escuser and enter escalation mode using the password that was provided by the TSC.

s7-2-bur09-a-sp login: escuser
Password:
Oracle(R) Integrated Lights Out Manager
Version 3.2.4.34 r95732
Copyright (c) 2014, Oracle and/or its affiliates. All rights reserved.
Warning: The system appears to be in manufacturing test mode. Contact Service immediately.
Hostname: s7-2-bur09-a-sp
-> cd /SP/users/escuser/escalation
-> set SESSION mode=escalation
Password:**** **** **** **** **** *** *** **** **** **** **** **** **** **** **** **** *** *** **** *** **** **** **** *** **** **** *** **** *** *
Short form password is: NOSE HAAG MED
[(escalation_mode) s7-2-bur09-a-sp:~]#

i. Use the showpsnc command to confirm the current container values. Confirm that the primary container has a serial number (the value on the PSN line) that matches the system serial number. The system serial number can be checked against the serial number RFID tag on the front left-hand side of the server. After confirming that the primary fruid container is valid, use the copypsnc command to write the good data from the primary to the backup2 container on the MB. The following example shows copying from primary to backup2, but you could also copy from backup1 if needed.

[(escalation mode) s7-2-bur09-a-sp:~]# showpsnc
[(escalation mode) s7-2-bur09-a-sp:~]# copypsnc Primary Backup2
[(escalation mode) s7-2-bur09-a-sp:~]# showpsnc

j. At this point, if all of the fruid containers match and have the correct serial number data, this step is done. If more than one of the fruid containers had invalid entries, use the copypsnc command to copy the valid data to the other invalid container (e.g. "copypsnc Primary Backup1"). After confirming that all fruid data is correct, reset the ILOM to confirm that the fruid data persists through a reboot, and remove the escalation user if needed.

-> reset /SP
Are you sure you want to reset /SP (y/n)? y
Performing reset on /SP
..........
***login as the root user again and check the fruid data***
-> set SESSION mode=restricted
WARNING: The "Restricted Shell" account is provided solely to allow Services to perform diagnostic tasks.
[(restricted_shell) s7-2-bur09-a-sp:~]# showpsnc
Primary: fruid:///SYS/DBP
Backup 1: fruid:///SYS/PS0
Backup 2: fruid:///SYS/MB
Element           | Primary              | Backup1              | Backup2
------------------+----------------------+----------------------+-------------------
PPN                 34235727+1+1           34235727+1+1           34235727+1+1
PSN                 AK00370269             AK00370269             AK00370269
MACADDR             00:10:E0:B3:0C:28      00:10:E0:B3:0C:28      00:10:E0:B3:0C:28
HOSTID              86b30c28               86b30c28               86b30c28
Product Name        SPARC S7-2             SPARC S7-2             SPARC S7-2
[(restricted_shell) s7-2-bur09-a-sp:~]# exit
-> cd /SP/users
/SP/users
-> delete escuser
Are you sure you want to delete /SP/users/escuser (y/n)? y
Deleted /SP/users/escuser

k. If trouble is encountered during any of the steps for accessing escalation mode and fixing the fruid containers, contact the TSC for assistance.

How to verify the Motherboard is working properly

1. Log into ILOM to confirm that the motherboard is working properly. Sample:

-> show /SYS/MB
.
Properties:
Commands:
->

2. Check the ILOM fault management data and event log for any errors related to the motherboard.

-> show /SP/faultmgmt
-> show /SP/logs/event/list
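
The showpsnc comparison used in the TLI checks (Primary against Backup1 before the replacement, all three containers afterward) can be sketched in a few lines of Python. This is only an illustration, not an Oracle tool: the function names `parse_showpsnc` and `mismatched` are hypothetical, and the sketch assumes the showpsnc output has been captured as plain text in the layout of the sample above, with every compared value being a single token without spaces. The Product Name row contains spaces and is deliberately skipped.

```python
# Hypothetical helper for illustration only; it is not part of ILOM.
# Assumption: showpsnc output was saved as plain text and each compared value
# is one whitespace-free token. "Product Name" has spaces and is skipped.
FIELDS = ("PPN", "PSN", "MACADDR", "HOSTID")


def parse_showpsnc(text):
    """Return {element: (primary, backup1, backup2)} for the rows in FIELDS."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] in FIELDS:
            rows[parts[0]] = tuple(parts[1:])
    return rows


def mismatched(rows, containers=(0, 1)):
    """List elements that are missing or differ across the chosen containers.

    Container indexes: 0 = Primary, 1 = Backup1, 2 = Backup2.
    """
    problems = []
    for name in FIELDS:
        values = rows.get(name)
        if values is None or len({values[i] for i in containers}) > 1:
            problems.append(name)
    return problems


SAMPLE = """\
Primary: fruid:///SYS/DBP
Backup 1: fruid:///SYS/PS0
Backup 2: fruid:///SYS/MB
Element | Primary | Backup1 | Backup2
PPN 34235727+1+1 34235727+1+1 34235727+1+1
PSN AK00370269 AK00370269 AK00370269
MACADDR 00:10:E0:B3:0C:28 00:10:E0:B3:0C:28 00:10:E0:B3:0C:28
HOSTID 86b30c28 86b30c28 86b30c28
Product Name SPARC S7-2 SPARC S7-2 SPARC S7-2
"""

rows = parse_showpsnc(SAMPLE)
print("Primary vs Backup1:", mismatched(rows))
print("All containers:", mismatched(rows, (0, 1, 2)))
```

An empty list means the selected containers agree. A missing row is reported as a problem rather than silently ignored, since an unreadable container should stop the procedure just as a mismatch does. The PSN must still be compared by hand against the serial number tag on the server.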
OBTAIN CUSTOMER ACCEPTANCE

WHAT ACTION DOES THE FE/CUSTOMER NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:

REFERENCE INFORMATION:

References
<NOTE:1019946.1> - How to access service mode and escalation mode on ILOM 3.x and later platforms
<NOTE:1280913.1> - How to update System, Chassis, and Product level Key Identity Properties on ILOM based systems which implement Top Level Identifier (TLI) functionality

Attachments
This solution has no attachment