Asset ID: 1-71-1636232.1
Update Date: 2016-06-21
Keywords:
Solution Type: Technical Instruction Sure
Solution 1636232.1:
How to Replace an Oracle Virtual Compute Appliance (OVCA) X3-2, X4-2, PCA X5-2 Hardware Motherboard on a Management Node
Related Items
- Oracle Virtual Compute Appliance X3-2 Hardware
- Oracle Virtual Compute Appliance X4-2 Hardware
- Private Cloud Appliance X5-2 Hardware
- Private Cloud Appliance
Related Categories
- PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP
Oracle Confidential PARTNER - Available to partners (SUN).
Reason: CAPs are not external
Applies to:
Oracle Virtual Compute Appliance X3-2 Hardware - Version All Versions to All Versions [Release All Releases]
Private Cloud Appliance X5-2 Hardware - Version All Versions and later
Oracle Virtual Compute Appliance X4-2 Hardware - Version All Versions and later
Private Cloud Appliance - Version 2.0.2 and later
Information in this document applies to any platform.
Goal
How to Replace a Motherboard within a PCA Management Node
Solution
CAP PROBLEM OVERVIEW: MOTHERBOARD ASSEMBLY REPLACEMENT
DISPATCH INSTRUCTIONS
WHAT SKILLS DOES THE ENGINEER NEED: The engineer must be OVCA trained.
TIME ESTIMATE: 135 minutes
TASK COMPLEXITY: 4
FIELD ENGINEER INSTRUCTIONS
The general steps are broken into three sections:
- Pre-motherboard FRU replacement steps.
- Physical replacement of the motherboard using the appropriate CAP:
How to Replace a Sun Server X3-2(X4170M3) Motherboard assembly (Doc ID 1495251.1)
How to Replace a Sun Server X4-2 Motherboard assembly (Doc ID 1592250.1)
How to Remove and Replace a Motherboard Assembly in an Oracle Server X5-2 and X6-2 (Doc ID 1992420.1)
- Post-motherboard FRU replacement steps.
WHAT ACTION DOES THE ENGINEER NEED TO TAKE:
Pre-motherboard FRU replacement steps.
1. Locate and note the rack's Master Serial Number, and obtain the usernames and passwords for both the OVCA GUI and the OVMM GUI from the customer.
a. The rack's Master Serial Number is located on the top left side wall (viewed from the rear) inside the rack, at the rear of the chassis.
b. The customer must supply the cluster virtual IP address (manager-VIP) for the OVCA cluster, as well as the usernames and passwords if they have been changed from the defaults. The default username/password for both the OVCA GUI and the OVMM GUI is: ovcaadmin/Welcome1
2. Check the status of the two-node cluster and fail over the master if needed.
a. There are two management nodes that make up the cluster: ovcamn05r1 and ovcamn06r1.
b. The manager-VIP is the customer supplied virtual IP of the cluster.
c. Use PuTTY or ssh to connect to the management node, e.g. ssh root@manager-VIP
d. When you log in to the VIP, the command-line prompt indicates which management node you are logged in to, for example [root@ovcamn05r1 ~]. It should be the master node.
e. If the VIP is unknown, set your laptop's IP address to 192.168.4.254 and connect the FE service cable to port 19 of the ES1-24 switch. The IP address of ovcamn05 is 192.168.4.3 and that of ovcamn06 is 192.168.4.4.
f. Under normal conditions, one node is the master node and the other is the standby node. To verify, run the /usr/sbin/ovca-check-master command on both nodes.
Examples:
[root@ovcamn05r1 ~]# /usr/sbin/ovca-check-master
NODE: 192.168.4.3 MASTER: True
[root@ovcamn06r1 ~]# /usr/sbin/ovca-check-master
NODE: 192.168.4.4 MASTER: False
g. Determine which node needs the motherboard replacement. It may already be down, offline and ready for service, or it may be up and not the master node.
h. If it is up and running as the master node, you will need to fail it over so that it becomes the standby node before you replace the motherboard.
i. To fail the master node over to the standby node (or to power down the standby node if it is up), perform a graceful power-off via ILOM (-> stop /SYS), from the OS (init 0), or by momentarily pressing the power button. Run the '/usr/sbin/ovca-check-master' command on the other node to confirm that it has become the master before you replace the motherboard.
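The master check in step 2 can be scripted once the output from both nodes has been collected. The following is a minimal Python sketch, assuming the output line has exactly the form shown in the examples above (NODE: <ip> MASTER: True|False). The helper names parse_check_master and find_master are hypothetical and are not part of the OVCA software.

```python
import re

# Matches a line of the form shown in the examples:
#   NODE: 192.168.4.3 MASTER: True
_LINE = re.compile(r"NODE:\s*(\S+)\s+MASTER:\s*(True|False)")


def parse_check_master(output):
    """Return (node_ip, is_master) parsed from ovca-check-master output text."""
    match = _LINE.search(output)
    if match is None:
        raise ValueError("unrecognized ovca-check-master output: %r" % output)
    return match.group(1), match.group(2) == "True"


def find_master(outputs):
    """Given the output text from each node, return the IP of the single master.

    Raises ValueError unless exactly one node reports MASTER: True.
    """
    masters = [ip for ip, is_master in map(parse_check_master, outputs) if is_master]
    if len(masters) != 1:
        raise ValueError("expected exactly one master, found %d" % len(masters))
    return masters[0]


if __name__ == "__main__":
    samples = [
        "NODE: 192.168.4.3 MASTER: True",
        "NODE: 192.168.4.4 MASTER: False",
    ]
    print(find_master(samples))
```

Raising an error when zero or two masters are reported is a deliberate choice in this sketch: either condition means the cluster is not in the state this procedure expects, and the replacement should not proceed until that is understood.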
3. Determine the motherboard firmware required and optionally copy it onto your laptop now as you may need it later in this process.
Note: The management nodes use the same system firmware as the compute nodes. In the next steps, you will use the firmware located in the 'compute' directory. There is no management_node directory.
a. Use PuTTY or ssh to connect to the management node, e.g. ssh root@manager-VIP
b. The firmware is located in /nfs/shared_storage/mgmt_image/firmware/compute/X3-2_BIOS
c. Make note of the version and optionally copy it onto your laptop in the event you need it later.
This completes the Pre-motherboard FRU replacement steps.
Replace the motherboard
4. Before you begin, please note it is extremely important that you DO NOT CONNECT ANY NETWORK CABLES in the CMA cable bundle to the new motherboard until instructed.
5. Replace the motherboard using the appropriate CAP:
Replace a Sun Server X3-2(X4170M3) Motherboard assembly (Doc ID 1495251.1)
Replace a Sun Server X4-2 Motherboard assembly (Doc ID 1592250.1)
How to Remove and Replace a Motherboard Assembly in an Oracle Server X5-2 and X6-2 (Doc ID 1992420.1)
This completes the motherboard replacement steps.
Post-motherboard FRU replacement steps
6. Using your laptop, connect a serial cable from your laptop to the SER MGT port on the node and connect a network cable from your laptop to the NET MGT port on the node.
7. Re-install the AC power cords but DO NOT CONNECT ANY OF THE NETWORK CABLES that were originally connected.
8. Wait for the SP to boot.
9. There are two management nodes, ovcamn05r1 and ovcamn06r1. Perform only step 10 or step 11, not both.
10. If you are working on ovcamn05r1 make the following changes to the SP:
a. -> cd /SP/network
b. -> set pendingipdiscovery=static
c. -> set pendingmanagementport=NET0
d. -> set pendingipaddress=192.168.4.103
e. -> set pendingipnetmask=255.255.255.0
f. -> set pendingipgateway=192.168.4.201
g. -> set commitpending=true
h. -> cd /SP
i. -> set hostname=ilom-ovcamn05r1
j. -> set /SP system_identifier="Oracle Virtual Compute Appliance X3-2 <top level rack SN>"
k. -> set /SP/users/root password=Welcome1
11. If you are working on ovcamn06r1 make the following changes to the SP:
a. -> cd /SP/network
b. -> set pendingipdiscovery=static
c. -> set pendingmanagementport=NET0
d. -> set pendingipaddress=192.168.4.104
e. -> set pendingipnetmask=255.255.255.0
f. -> set pendingipgateway=192.168.4.201
g. -> set commitpending=true
h. -> cd /SP
i. -> set hostname=ilom-ovcamn06r1
j. -> set /SP system_identifier="Oracle Virtual Compute Appliance X3-2 <top level rack SN>"
k. -> set /SP/users/root password=Welcome1
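Steps 10 and 11 differ only in the SP IP address and hostname, so the command sequence can be generated from the node name to reduce the risk of typing the wrong node's values. The Python sketch below is illustrative only. The function name sp_commands and the sample rack serial are hypothetical, and the hostname line is written with an explicit "set", which is an assumption about standard ILOM property syntax. All addresses and values are those given in steps 10 and 11.

```python
# Per-node SP addresses taken from steps 10 and 11 of this procedure.
SP_ADDRESSES = {
    "ovcamn05r1": "192.168.4.103",
    "ovcamn06r1": "192.168.4.104",
}


def sp_commands(node, rack_serial):
    """Return the ILOM commands from step 10 or 11 for the given management node.

    rack_serial is the top-level rack serial number noted in step 1.
    """
    if node not in SP_ADDRESSES:
        raise ValueError("unknown management node: %s" % node)
    return [
        "cd /SP/network",
        "set pendingipdiscovery=static",
        "set pendingmanagementport=NET0",
        "set pendingipaddress=%s" % SP_ADDRESSES[node],
        "set pendingipnetmask=255.255.255.0",
        "set pendingipgateway=192.168.4.201",
        "set commitpending=true",
        "cd /SP",
        # Assumption: hostname is set with an explicit "set" after "cd /SP".
        "set hostname=ilom-%s" % node,
        'set /SP system_identifier="Oracle Virtual Compute Appliance X3-2 %s"' % rack_serial,
        "set /SP/users/root password=Welcome1",
    ]


if __name__ == "__main__":
    # "EXAMPLE-SN" is a placeholder, not a real serial number.
    for command in sp_commands("ovcamn05r1", "EXAMPLE-SN"):
        print("-> " + command)
```

The printed list is intended to be read and typed (or pasted) at the ILOM prompt over the serial connection; the sketch does not connect to the SP itself.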
12. Take note of the new MAC addresses; you will need them later:
a. -> show /SYS/MB/NET0 fru_macaddress
b. -> show /SYS/MB/NET1 fru_macaddress
c. -> show /SYS/MB/NET2 fru_macaddress
d. -> show /SYS/MB/NET3 fru_macaddress
e. -> show /SP/network macaddress
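Because the MAC addresses from step 12 are reused in several later steps, it can help to record them in one normalized form. The Python sketch below pulls the first MAC address out of whatever console text is pasted in and lower-cases it. It makes no assumption about the exact layout of the ILOM output; the sample input strings and the helper names extract_mac and record_macs are hypothetical.

```python
import re

# Six colon-separated hex octets, e.g. 00:10:e0:2e:82:83
_MAC = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")


def extract_mac(text):
    """Return the first MAC address found in text, lower-cased."""
    match = _MAC.search(text)
    if match is None:
        raise ValueError("no MAC address found in %r" % text)
    return match.group(0).lower()


def record_macs(pasted):
    """Map each ILOM target name to the MAC found in its pasted console text."""
    return {target: extract_mac(text) for target, text in pasted.items()}


if __name__ == "__main__":
    # Hypothetical pasted text; the real console layout may differ.
    pasted = {
        "/SYS/MB/NET0": "fru_macaddress = 00:10:E0:2E:82:83",
        "/SP/network": "macaddress = 00:10:E0:2E:82:85",
    }
    for target, mac in sorted(record_macs(pasted).items()):
        print(target, mac)
```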
13. Check that the firmware version is at the required revision level that you noted in step #3.
a. -> show /System system_fw_version
b. -> version
14. If the firmware doesn’t match, you will need to manually up- or downgrade it. One method is to temporarily configure the network on your laptop.
a. Configure the IP address on your laptop to 192.168.4.254 and the netmask to 255.255.255.0.
b. If working on ovcamn05r1, using your browser, browse to 192.168.4.103
c. If working on ovcamn06r1, using your browser, browse to 192.168.4.104
d. Log in to the SP as root.
e. On the left side of the screen, click on “ILOM Administration”
f. In the same area, click on “Maintenance”
g. On the “Firmware Upgrade” tab, click “Enter Upgrade Mode”
h. Click OK, then click on the “Browse…” button and supply the path to the firmware you obtained in step #3.
i. Complete the firmware update.
j. After the SP has rebooted, log in and continue with the next step.
15. It is recommended to run PC-Check to verify that there are no issues after the motherboard has been replaced.
a. From the SP GUI, in the left pane, click on "Host Management"
b. Click on "Diagnostics"
c. Select "Run Diagnostics on Boot"
d. Reboot and run the diagnostics
e. After PC-Check runs without any issues, verify that 'Run Diagnostics on Boot' is set to "Disabled".
16. Configure the node to boot into the BIOS, power cycle it, then start the serial console and check the following three BIOS settings: System Date/Time, Boot List, and OSA support:
a. -> set /HOST boot_device=bios
b. -> start /SP/console
c. On the BIOS Main screen, set the system date and time
d. On the BIOS Boot screen, make sure the “RAID:PCIE4…PCI RAID Adaptor” is on the top of the boot list and the four PXE:NETx devices are after it.
e. On the BIOS Boot/OSA Configuration screen, set “OSA Internal Support” to [Disabled]
f. Select Save & Exit to exit from the BIOS setup and allow the node to boot into the OS.
17. After the OS has booted, log in as root.
18. cd /etc/sysconfig/network-scripts
19. Using the MAC addresses you obtained in step #12, edit the four ifcfg-eth[0-3] files.
a. Edit ifcfg-eth0 and enter the new MAC for /SYS/MB/NET0
b. Edit ifcfg-eth1 and enter the new MAC for /SYS/MB/NET1
c. Edit ifcfg-eth2 and enter the new MAC for /SYS/MB/NET2
d. Edit ifcfg-eth3 and enter the new MAC for /SYS/MB/NET3
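The four edits in step 19 follow the same pattern, which the Python sketch below illustrates on the text of a single file. It assumes the MAC address is stored in a HWADDR= line, which is the usual key in ifcfg files but is not stated in this document; confirm against the actual files on the node. The function name set_hwaddr and the sample file contents are hypothetical.

```python
import re


def set_hwaddr(ifcfg_text, new_mac):
    """Return ifcfg_text with its HWADDR line set to new_mac.

    Assumption: the MAC is held in a line beginning with "HWADDR=".
    If no such line exists, one is appended.
    """
    new_line = "HWADDR=%s" % new_mac
    updated, count = re.subn(r"(?m)^HWADDR=.*$", new_line, ifcfg_text)
    if count == 0:
        # Add a separating newline only if the text does not already end with one.
        sep = "" if ifcfg_text.endswith("\n") or not ifcfg_text else "\n"
        updated = ifcfg_text + sep + new_line + "\n"
    return updated


if __name__ == "__main__":
    # Hypothetical file contents for illustration.
    sample = "DEVICE=eth0\nHWADDR=00:10:e0:1e:7a:05\nONBOOT=yes\n"
    print(set_hwaddr(sample, "00:10:e0:2e:82:85"), end="")
```

The function works on text rather than on a file path so that the result can be reviewed before anything on the node is overwritten.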
20. Shut down the node using the power button or # init 0.
21. Remove the laptop cables connected to the node's NET MGT and SER MGT ports. Also remove the power cables now so that the SP reboots when they are reconnected in step 23.
22. Reconnect all of the original network cables: the two IB cables and the one to NET0.
23. Reconnect the power cables.
24. Push the node back into the rack and wait for the SP to boot. Once it has booted, power on the node.
25. Allow the node to boot.
26. Once the OS has booted, wait a couple of minutes, then log in and check the status of both nodes. One node will report "True" and the other (the one whose motherboard you just replaced) should report "False".
a. ssh root@192.168.4.3
b. # ovca-check-master
NODE: 192.168.4.3 MASTER: False
c. ssh root@192.168.4.4
d. # ovca-check-master
NODE: 192.168.4.4 MASTER: True
27. Get the old HOST and old ILOM MAC addresses of the inactive/down management node from the OVCA DB on the active management node (MN).
a. # ovca-node-db list | grep <MN hostname>
EXAMPLE:
[root@ovcamn06r1 ~]# ovca-node-db list | grep ovcamn05
mac=00:10:e0:1e:7a:03 ip=192.168.4.103 link=00:10:e0:1e:7a:05 name=ilom-ovcamn05r1 type=mgmt_ilom state=running
mac=00:10:e0:1e:7a:05 ip=192.168.4.3 link=00:10:e0:1e:7a:03 name=ovcamn05r1 type=mgmt state=RUNNING
28. You will need the new motherboard HOST and ILOM MAC addresses you collected in step #12. Below are two sets of three steps that use the “list”, “add”, and “delete” options of ‘ovca-node-db’ to list the old entry, add the new MAC, and delete the old MAC. Perform the sequence twice: steps a, b, and c are for the host MAC, and steps d, e, and f are for the ILOM MAC:
a. # ovca-node-db list mac=<old MN HOST mac address (from step 27)>
This information will be used in the next step.
b. # ovca-node-db add mac=<new MN HOST mac address (from step 12a)> link=<new MN ILOM mac address (from step 12e)> ip=<from step 28a> name=<from step 28a> type=mgmt state=RUNNING
c. # ovca-node-db delete mac=<old MN HOST mac address (from step 27)>
d. # ovca-node-db list mac=<old MN ILOM mac address (from step 27)>
This information will be used in the next step.
e. # ovca-node-db add mac=<new MN ILOM mac address (from step 12e)> link=<new MN HOST mac address (from step 12a)> ip=<from step 28d> name=<from step 28d> type=mgmt_ilom state=dead
f. # ovca-node-db delete mac=<old MN ILOM mac address (from step 27)>
EXAMPLES:
a. [root@ovcamn06r1 ~]# ovca-node-db list mac=00:10:e0:1e:7a:03
mac=00:10:e0:1e:7a:03 ip=192.168.4.3 link=00:10:e0:1e:7a:05 name=ovcamn05r1 type=mgmt state=xxx
b. [root@ovcamn06r1 ~]# ovca-node-db add mac=00:10:e0:2e:82:83 ip=192.168.4.3 link=00:10:e0:2e:82:85 name=ovcamn05r1 type=mgmt state=RUNNING
ADDED: 00:10:e0:2e:82:83 = {'ip': '192.168.4.3', 'state': 'RUNNING', 'link': '00:10:e0:2e:82:85', 'name': 'ovcamn05r1', 'type': 'mgmt'}
c. [root@ovcamn06r1 ~]# ovca-node-db delete mac=00:10:e0:1e:7a:03
NODE: 00:10:e0:1e:7a:03 <<< DELETED
Continue with steps d, e, and f which are similar to steps a, b, and c, except they use the ILOM mac.
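The list/add/delete sequence in step 28 carries the ip, name, and type values over from the old entry and changes only the mac and link values. The Python sketch below builds the add and delete command strings from a line of 'ovca-node-db list' output, assuming the space-separated key=value layout shown in the examples above. The helper names parse_node_entry and replacement_commands are hypothetical; the sketch only produces text for review and does not run ovca-node-db.

```python
def parse_node_entry(line):
    """Parse one 'key=value key=value ...' line from 'ovca-node-db list' into a dict."""
    return dict(field.split("=", 1) for field in line.split())


def replacement_commands(old_entry_line, new_mac, new_link, state):
    """Build the add and delete commands that swap an old entry for a new MAC.

    ip, name and type are carried over from the old entry; state is supplied
    by the caller as given in step 28 (RUNNING for the host, dead for the ILOM).
    """
    old = parse_node_entry(old_entry_line)
    add = "ovca-node-db add mac=%s link=%s ip=%s name=%s type=%s state=%s" % (
        new_mac, new_link, old["ip"], old["name"], old["type"], state)
    delete = "ovca-node-db delete mac=%s" % old["mac"]
    return [add, delete]


if __name__ == "__main__":
    # Old entry as printed in the step 28a example above.
    old_line = ("mac=00:10:e0:1e:7a:03 ip=192.168.4.3 link=00:10:e0:1e:7a:05 "
                "name=ovcamn05r1 type=mgmt state=xxx")
    for command in replacement_commands(
            old_line, "00:10:e0:2e:82:83", "00:10:e0:2e:82:85", "RUNNING"):
        print("# " + command)
```

Generating the add command before the delete command mirrors the order of steps b and c, so the old entry is still available for reference if the add fails.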
29. Ensure the correct addresses are assigned by listing the entries. This command will list a pair of HOST and ILOM entries for the new MB which should show as DEAD and dead, respectively. Ensure that the old mac entries have been deleted.
a. # ovca-node-db list | grep <MN hostname>
30. OVCA DHCP entry update:
While still logged in to the active management node (MN), edit the dynamic DHCP configuration file, /etc/dhcp/dhcpd_dynamic.conf, as follows:
a. Find the lines that contain the old HOST and ILOM MAC addresses.
b. These lines start with "host ovcamn0" for the host MAC entry, and "host ilom-ovcamn0" for the ILOM MAC entry.
c. Replace the old MAC address (after the string "hardware ethernet") with the corresponding new host or ILOM MAC address. Ensure that the two MAC addresses are not swapped and are correctly assigned to the ILOM and HOST entries.
EXAMPLE:
host ilom-ovcamn05r1 { hardware ethernet 00:10:e0:2e:82:83; fixed-address 192.168.4.103; default-lease-time 600;}
host ovcamn05r1 { hardware ethernet 00:10:e0:2e:82:85; fixed-address 192.168.4.3; filename "/tftpboot/pxelinux.0"; default-lease-time 600;}
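The main risk in step 30 is putting a MAC address into the wrong entry, since the host and ILOM names differ only by the "ilom-" prefix. The Python sketch below changes the "hardware ethernet" value only inside the block whose name matches exactly, assuming entries have the single-block layout shown in the example above. The function name replace_host_mac is hypothetical, and the sketch operates on text so that the result can be reviewed before the file is saved.

```python
import re


def replace_host_mac(conf_text, host_name, new_mac):
    """Set the 'hardware ethernet' value inside the named host block.

    Only the block that starts with 'host <host_name> {' is changed, so
    'ovcamn05r1' does not match 'ilom-ovcamn05r1'. Raises ValueError unless
    exactly one matching entry is found.
    """
    pattern = re.compile(
        r"(^\s*host\s+" + re.escape(host_name) + r"\s*\{[^}]*?hardware ethernet\s+)"
        r"[0-9A-Fa-f:]+(\s*;)",
        re.MULTILINE,
    )
    updated, count = pattern.subn(
        lambda m: m.group(1) + new_mac + m.group(2), conf_text)
    if count != 1:
        raise ValueError("expected one entry for %s, found %d" % (host_name, count))
    return updated


if __name__ == "__main__":
    conf = (
        "host ilom-ovcamn05r1 { hardware ethernet 00:10:e0:1e:7a:03; "
        "fixed-address 192.168.4.103; default-lease-time 600;}\n"
        "host ovcamn05r1 { hardware ethernet 00:10:e0:1e:7a:05; "
        "fixed-address 192.168.4.3; filename \"/tftpboot/pxelinux.0\"; "
        "default-lease-time 600;}\n"
    )
    print(replace_host_mac(conf, "ovcamn05r1", "00:10:e0:2e:82:85"), end="")
```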
31. You're done.
OBTAIN CUSTOMER ACCEPTANCE
REFERENCE INFORMATION:
Sun Server X3-2 Documentation
http://docs.oracle.com/cd/E22368_01/index.html
How to Remove and Replace a Motherboard Assembly in an Oracle Server X5-2
Attachments
This solution has no attachment