Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-2281894.1
Update Date: 2017-12-22
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  2281894.1 :   [PCA] How to Prepare a Compute Node for Motherboard Replacement  


Related Items
  • Private Cloud Appliance
  • Private Cloud Appliance X6-2 Server Upgrade
  • Oracle Virtual Compute Appliance X4-2 Hardware
  • Private Cloud Appliance X5-2 Hardware
Related Categories
  • PLA-Support>Eng Systems>Exalogic/OVCA>Oracle Virtual Compute Appliance>DB: OVCA_EST




In this Document
Purpose
Scope
Details
 Prepare the compute node for Service
 1. Backup the OVM database.
 2. List Custom networks for the compute node.
 3. Acknowledge Events via the OVM GUI
 4. Disable provisioning
 5. Check if node belongs to a tenant group (applicable to 2.2.1 and above only). 
 6. Shut down the node.
 7. Delete the node from the server pool
 Replace the Motherboard
References


Applies to:

Oracle Virtual Compute Appliance X4-2 Hardware
Private Cloud Appliance X5-2 Hardware
Private Cloud Appliance X6-2 Server Upgrade
Private Cloud Appliance - Version 2.0.5 and later
Oracle Linux and Virtualization > Oracle Virtual Compute Appliance > Oracle Virtual Compute Appliance Software
Information in this document applies to any platform.

Purpose

The purpose of this document is to assist in preparing a PCA compute node for a motherboard replacement.

Scope

The steps in this document outline the customer's responsibility to prepare for an onsite visit from a field engineer. This note assumes that a motherboard replacement has already been recommended by Oracle in an existing Service Request. If you do not have a Service Request open and believe a compute node motherboard may be faulty, please log a new Service Request using your PCA hardware serial number and CSI.

Details

Prepare the compute node for Service

The following steps will need to be executed before your onsite visit.

1. Backup the OVM database.
  1. ssh into the VIP of the management node cluster
  2. Execute the following
     
    For releases earlier than PCA 2.0.5:
     /usr/sbin/ovca-backup

    For PCA 2.0.5 and above:
    /usr/sbin/pca-backup
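The release check above can be sketched as a small shell helper. This is a minimal sketch, not part of the appliance tooling: the `backup_cmd` function name is invented for illustration, and only the two backup paths (`/usr/sbin/ovca-backup`, `/usr/sbin/pca-backup`) come from this note.

```shell
#!/bin/sh
# Hedged sketch: pick the backup command for a given PCA release string.
# Releases earlier than 2.0.5 ship /usr/sbin/ovca-backup; 2.0.5 and
# later ship /usr/sbin/pca-backup. Relies on GNU "sort -V" for the
# version comparison.
backup_cmd() {
  ver="$1"
  # if $ver sorts before 2.0.5 (and is not 2.0.5 itself), it is older
  if [ "$(printf '%s\n' "$ver" 2.0.5 | sort -V | head -n1)" = "$ver" ] \
     && [ "$ver" != "2.0.5" ]; then
    echo /usr/sbin/ovca-backup
  else
    echo /usr/sbin/pca-backup
  fi
}

backup_cmd 2.0.4   # -> /usr/sbin/ovca-backup
backup_cmd 2.3.1   # -> /usr/sbin/pca-backup
```

Run the command it prints on the management node VIP after logging in with ssh.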
2. List Custom networks for the compute node.
  1. List out and take note of the networks assigned to the compute node. The custom networks will be detached from the compute node after the node is reprovisioned.    
 pca-admin list network ovcacnXXr1
3. Acknowledge Events via the OVM GUI
  1. Enter the following address in a Web browser
      https://manager-vip:7002/ovm/console
  2. Click on the “Servers and VMs” tab
  3. On the navigation tree, locate and highlight the server that requires the motherboard replacement, under either Rack1_ServerPool or Unassigned Servers.
  4. Observe the node icon and perform these steps as needed.
    If there are events associated with the node (yellow ! or red X), please acknowledge them.
    1. Highlight the node in the navigation tree.
    2. From the management section of the GUI, select “Events” from the “Perspective” drop down menu.
    3. To the right of the drop down menu, click on “Acknowledge All”
    4. Next click “OK” on the confirmation screen
4. Disable provisioning
  1. Go to the pca dashboard
    https://XXXX:7002/dashboard
  2. On the Hardware view tab, select “Disable CN Provisioning”
  3. Confirm you wish to continue to disable compute node provisioning.
  4. Compute node Provisioning should now be disabled.
5. Check if node belongs to a tenant group (applicable to 2.2.1 and above only). 
  1. Execute the following on the PCA CLI on the master management node.
    [root@ovcamn05r1 ~]# pca-admin
    Welcome to PCA! Release: 2.3.1
    PCA> list tenant-group
     
  2. If there is only one default tenant group listed named Rack1_ServerPool, then proceed to step 6.
  3. If there are additional tenant groups, check each one to find which tenant group the node to be serviced is a part of.
    PCA> show tenant-group <tenant group name>
      
  4. If it is discovered the node is a part of the default Rack1_ServerPool, proceed to step 6.
  5. If it is discovered the node is a part of another group, remove the node from the group.
    PCA> remove compute-node ovcacnXXr1 <tenant group name>
      
  6. If an exception is raised, please update the service request.
6. Shut down the node.
  1. If the node which requires maintenance is up and running (indicated by a green arrow in the OVM GUI), place the node into maintenance mode.
    Please see the following note on how to place the node in maintenance mode and power down.
    Steps to Gracefully Shutdown and Power Off a Node in Oracle Private Cloud Appliance Prior to Maintenance (Doc ID 2256834.1)

  2. If the node which requires maintenance is down (indicated by a red X because of faulted hardware), do not execute step 7. Please update the SR for assistance on the next steps.

 

If the customer requests assistance with step 6, the following steps will need to be executed with guidance from support. Please do not provide these steps to the customer.

a. From the VIP, login to OVM shell.

[root@ovcamn05r1 /]# ovm_shell.sh -u admin -p Welcome1
OVM Shell: 3.2.11.775 Interactive Mode
>>> from com.oracle.ovm.mgr.api import *
>>> from com.oracle.ovm.mgr.api.event import *
>>> from com.oracle.ovm.mgr.api.virtual import *
>>> from com.oracle.ovm.mgr.api.physical import *
>>> from com.oracle.ovm.mgr.api.physical.network import *
>>> from com.oracle.ovm.mgr.api.physical.storage import *
---
>>>

b. Execute the commands below step by step in the OVM shell, supplying the index in brackets [ ] to choose the server to be removed:
c. Connect to OVM Manager instance

>>>ovmm = OvmClient.getOvmManager()

d. Load the foundry to have access to all objects in the model

>>>fdry = ovmm.getFoundryContext()

e. Retrieve the server information

>>>srvr = fdry.getServers()

 f. Print all the servers present in your OVM setup

>>>print srvr

g. Choose the server to be deleted from OVM by supplying its index in brackets [ ]. The numbering follows the order of the output of the command above, starting from 0 (for example, use [0] to select the 1st server in the printed list, [1] to select the 2nd, and so on).

>>>srvr = fdry.getServers()[x]

h. Check that the server to be deleted is correctly selected

>>>print srvr

i. Create a job to remove server from server pool

>>>job = ovmm.createJob("Remove server from pool")
>>>job.begin()
>>>fdry.deleteServerFromModelOnly(srvr)
>>>job.commit()

j. Use ^C to exit.

>>> [Type control-C to exit]
[root@ovcamn06r1 ~]#

Then use the internal steps under step 7 below to remove the node from the node DB.

 

7. Delete the node from the server pool
  1. Right click on the server
  2. Select “Delete” from the pop-up menu
  3. Click “OK” on the confirmation pop-up screen.
  4. Wait for the server to be removed from the list.
  5. It may take a couple of minutes, but the server should be removed from the list. You can observe the status by looking at the "Job Summary:" at the bottom of the GUI for "Delete server ovcacnXX Completed".
  6. Verify the node is no longer shown in both the OVM GUI and the PCA GUI.
  7. Once the node is removed from the pool, please notify the support engineer working the SR to assist with removing the node from the PCA node database.

If the customer requests assistance with step 7, please execute the following steps with the customer. Do not provide these instructions to the customer.

The following steps apply to versions 2.2.X and above. For earlier versions, see the 2.1.1-and-below steps further down.

Login to management node (MN) and execute the following commands to remove the downed CN from the PCA node database. Substitute XX below with the CN number:

 1. Enable support mode.

[root@ovcamn05r1 /]# export ORACLE_SUPPORT_MODE=True
[root@ovcamn05r1 /]# pca-admin
Using Oracle PCA Support interface may invalidate your service contract. Continue? [y/N]:y
Welcome to PCA! Release: 2.2.2

WARNING !!!
===========
You are entering PCA support CLI. THIS ACTION MAY BREAK THE SUPPORT AGREEMENT.

PCA support CLI is ONLY allowed to be used by the Oracle PCA support team. It
is not designed for the regular operation. Using PCA support CLI without any
official direction can result a serious damage for the environment.
===========

  2. Delete the node from DB

PCA> delete nodedb-row ilom-ovcacnXXr1
************************************************************
WARNING !!! THIS IS A DESTRUCTIVE OPERATION.
************************************************************
Are you sure [y/N]:y

Status: Success

PCA> delete nodedb-row ovcacnXXr1
************************************************************
WARNING !!! THIS IS A DESTRUCTIVE OPERATION.
************************************************************
Are you sure [y/N]:y

Status: Success

  3. Execute a list compute node to ensure the node is removed from the node-db

PCA> list compute-node

 4. Remove support mode flag

export ORACLE_SUPPORT_MODE=False

 

The following steps should be followed for versions 2.1.1 and below. Please ignore if previous steps have been followed. 

Login to management node (MN) and execute the following commands to remove the downed CN from the PCA node database. Substitute XX below with the CN number:

1. Execute the following command on the master management node. 

# ovca-node-db list mac | grep ovcacnXX

The output from the command above is used in the steps that follow.

Sample output:

[root@ovcamn06r1 ~]# ovca-node-db list mac | grep ovcacn37
mac=00:10:e0:31:a7:f2 ip=192.168.4.21 link=00:10:e0:31:a7:f7 name=ovcacn37r1 type=compute state=RUNNING
mac=00:10:e0:31:a7:f7 ip=192.168.4.121 link=00:10:e0:31:a7:f2 name=ilom-ovcacn37r1 type=ilom state=running

In the sample above, 00:10:e0:31:a7:f2 is the <CN HOST mac address> for steps 3 and 5 below.
In the sample above, 00:10:e0:31:a7:f7 is the <CN ILOM mac address> for steps 2 and 4 below.
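Picking the two MAC addresses out of the node-db output can be sketched in shell. This is an illustrative sketch only: the sample lines are hard-coded from the example above, and in practice you would pipe the live `ovca-node-db list mac | grep ovcacnXX` output into the same awk filters.

```shell
#!/bin/sh
# Hedged sketch: extract the HOST and ILOM MAC addresses from
# "ovca-node-db list mac" output. Sample output is hard-coded here
# for illustration; substitute the live command output in practice.
nodedb_out='mac=00:10:e0:31:a7:f2 ip=192.168.4.21 link=00:10:e0:31:a7:f7 name=ovcacn37r1 type=compute state=RUNNING
mac=00:10:e0:31:a7:f7 ip=192.168.4.121 link=00:10:e0:31:a7:f2 name=ilom-ovcacn37r1 type=ilom state=running'

# the type=compute row carries the HOST MAC; the type=ilom row the ILOM MAC
host_mac=$(printf '%s\n' "$nodedb_out" | awk '/type=compute/ {sub(/^mac=/,"",$1); print $1}')
ilom_mac=$(printf '%s\n' "$nodedb_out" | awk '/type=ilom/ {sub(/^mac=/,"",$1); print $1}')

echo "HOST MAC: $host_mac"   # 00:10:e0:31:a7:f2
echo "ILOM MAC: $ilom_mac"   # 00:10:e0:31:a7:f7
```

These values are then substituted into the delete and list commands in steps 2 through 5.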

 

2. Delete the relevant ILOM mac addresses

# ovca-node-db delete mac=<CN ILOM mac address>

Example:

[root@ovcamn06r1 ~]# ovca-node-db delete mac=00:10:e0:31:a7:f7
NODE: 00:10:e0:31:a7:f7 <<< DELETED

 

3. Delete the relevant HOST mac addresses

# ovca-node-db delete mac=<CN HOST mac address>

Example: 

[root@ovcamn06r1 ~]# ovca-node-db delete mac=00:10:e0:31:a7:f2
NODE: 00:10:e0:31:a7:f2 <<< DELETED

 

4. Confirm the ILOM MAC is deleted by listing.

# ovca-node-db list mac=<CN ILOM mac address>

Example: 

[root@ovcamn06r1 ~]# ovca-node-db list mac=00:10:e0:31:a7:f7
[root@ovcamn06r1 ~]# 

 

5. Confirm the HOST MAC is deleted by listing.  

# ovca-node-db list mac=<CN HOST mac address>

Example:

[root@ovcamn06r1 ~]# ovca-node-db list mac=00:10:e0:31:a7:f2
[root@ovcamn06r1 ~]# 

Replace the Motherboard

Once the field engineer has arrived onsite to replace the motherboard, please wait for the service to be completed. Once it is completed, please execute the following post-installation steps.

1. Re-enable compute node provisioning.
2. The PCA will automatically find and re-provision the node with the new motherboard. The entire process takes about 45 minutes. You can watch the process on the PCA GUI by hovering the mouse over the node. Some re-provisioning steps take much longer than others.
3. After provisioning is complete, the node may appear briefly under the Unassigned ServerPool section before it moves into the Rack1_ServerPool.
4. Add back any custom networks noted in step 2 of the preparation steps that are no longer attached to the compute node.

PCA> list network <compute node name>

PCA> add network <network name> <compute node name>
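Re-attaching several custom networks can be scripted as a dry run that prints one `add network` command per saved name. This is a hedged sketch: the node name `ovcacn07r1`, the file `custom-networks.txt`, and the network names are all invented for illustration; only the `add network <network name> <compute node name>` form comes from this note, and the commands themselves should be run in the PCA CLI on the master management node.

```shell
#!/bin/sh
# Hedged sketch: generate "add network" commands for every custom network
# recorded before service. Node name, file name, and network names below
# are illustrative placeholders only.
node=ovcacn07r1                      # hypothetical compute node name
networks_file=custom-networks.txt    # hypothetical list saved in step 2

# illustrative contents; in practice use the names noted before service
printf '%s\n' mynet1 mynet2 > "$networks_file"

# dry run: print one command per saved network instead of executing it
cmds=$(while IFS= read -r net; do
  echo "pca-admin add network $net $node"
done < "$networks_file")
echo "$cmds"
```

Review the printed commands, then run each one against the PCA CLI and verify the result with `list network <compute node name>`.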

References

<NOTE:2256834.1> - Steps to Gracefully Shutdown and Power Off a Node in Oracle Private Cloud Appliance Prior to Maintenance
<NOTE:1633452.1> - How to Replace an OVCA X3-2, OVCA X4-2, PCA X5-2 or PCA X6-2 upgrade Hardware Motherboard in a Compute Node
<BUG:26486250> - PCA COMPUTE NODE FAILS TO PROVISION W/ PXE MEDIA FAILURE

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.