Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1929576.1
Update Date:2016-12-19
Keywords:

Solution Type  Problem Resolution Sure

Solution  1929576.1 :   ODA ODAVP Created Shared Repositories with VM Clones are Missing / Gone on both DOM0 DOM1 Nodes in /OVS on 2.10 using non-lowercase Node Names  


Related Items
  • Oracle Database Appliance X3-2
  • Oracle Database Appliance Software
  • Oracle VM
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Database Appliance>DB: ODA_EST
  • Tools>Primary Use>Configuration




Created from <SR 3-9626053411>

Applies to:

Oracle Database Appliance Software - Version 2.10.0.0 to 2.10.0.0
Oracle Database Appliance X3-2 - Version All Versions to All Versions [Release All Releases]
Oracle VM - Version 3.2.3 to 3.2.3 [Release OVM32]
Information in this document applies to any platform.

Symptoms

Brief overview of the three types of ODA Virtualized Platform (ODAVP) domains:  

Oracle Database Appliance Base Domain (ODA_BASE): A privileged virtual machine domain, specifically for databases, that provides database performance similar to bare metal deployments. A PCI pass-through driver provides ODA_BASE direct access to the shared storage.

Domain 0 (Dom0) and Dom1: The default domain that initiates Oracle Database Appliance Virtualized Platform provisioning processes and hosts virtual machine templates. In this note, Dom1 refers to Dom0 on the second ODA node.
Most of the responsibility for hardware detection in an Oracle Database Appliance Virtualized Platform environment is passed to the management domain, referred to as domain zero (Dom0). On x86-based servers, the Dom0 (Dom1) kernel is a small-footprint Linux kernel with support for a broad array of devices, file systems, software RAID, and volume management. In Oracle Database Appliance Virtualized Platform, Dom0 provides access to much of the system hardware; creates, deletes, and controls guest operating systems; and presents those guests with a set of common virtual hardware.

Guest Domains (Domain U): Virtual machines that are provisioned to host non-database workloads, such as applications and middleware. Guest operating systems each have their own management domain, called a user domain, abbreviated to "Domain U". These domains are unprivileged domains that have no direct access to the hardware or to the device drivers. Each Domain U starts after Dom0 is running on Oracle Database Appliance Virtualized Platform.

For more information, refer to "Managing Oracle Database Appliance Virtualized Platform" in the ODA documentation on docs.oracle.com.

  

SYMPTOMS

  • The odarepo1 and odarepo2 repositories are visible on the public nodes and on the Dom0/Dom1 nodes.
  • New shared repositories were created to clone new VMs into.
  • The shared repositories are present on the public nodes, as seen with df -k.
  • However, the Dom0/Dom1 nodes show none of them; they should show the IP and /OVS/Repositories/sharedreponame.
  • Only the odarepo repositories are displayed.
  • Some VMs had already been cloned into the new shared repositories.
  • While cloning, public node 1 reported I/O errors when running the df command.
  • All the shared repositories had the I/O errors.
  • The ODA was rebooted several times and the problem still persists.

Changes

Using

1)  ODAVP

  - and -

2)  Version 2.10 +

  - and -

3a) New usage

  - or -

3b) Migration

  - or -

3c) Node name changed to use upper or mixed case

 

Cause

You may be hitting one of the following bugs:

<Bug 18939777> SHARED REPO AND VM IS NOT AVAILABLE AFTER ODA_BASE REBOOT IN OAK 2.10
<Bug 18769746> + deletion of invalid entries in acfsutil registry

 

The script sharedrepoactions.py does not match node names that are not entirely lowercase.

 
REDISCOVERY INFORMATION:
 

The node name uses mixed or upper case, on ODAVP with 2.10.

Pseudo example:

 The Node name is ABCNODE   << upper case
 The Node name is Abcnode   << mixed case
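
To illustrate why the case of the node name matters, here is a minimal Python 3 sketch comparing a case-sensitive and a case-insensitive substring check. It is not the code from sharedrepoactions.py: the function names, the sample output string, and the use of str.find (the Python 3 method corresponding to the Python 2 string.find call) are assumptions made for illustration only.

```python
def found_case_sensitive(output, sub_str):
    """Case-sensitive check of the form find(...) > 0."""
    return output.find(sub_str) > 0


def found_case_insensitive(output, sub_str):
    """Same check after lowercasing both the text and the search string."""
    return output.lower().find(sub_str.lower()) > 0


# Hypothetical sample: the text being searched carries the node name in
# lowercase, while the configured node name is upper case.
sample_output = "repo1 mounted on abcnode"
node_name = "ABCNODE"

print(found_case_sensitive(sample_output, node_name))    # False
print(found_case_insensitive(sample_output, node_name))  # True
```

Note that find returns -1 when there is no match and 0 when the match starts at the first character, so a "> 0" test only succeeds when the match begins after the first character.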

 

Solution

1) Alter the following script:

    /opt/oracle/oak/adapters/sharedrepoactions.py
         line 342:

   if (string.find(output,sub_str) > 0):

       --to--

    if (string.find(output.lower(),sub_str.lower()) > 0):

 

2)  Kill the odaBaseAgent.py process on oda_base on both nodes.

3)  Wait for the process to respawn on both nodes. If it does not, run "init q" a couple of times until the process comes back.
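
The kill-and-wait steps can be checked with a small helper. The following Python 3 sketch assumes "ps -ef" output in the usual column layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD). It extracts the PIDs of odaBaseAgent.py while skipping the grep process itself, and compares the PIDs seen before and after the kill. The function names agent_pids and respawned are hypothetical and not part of any Oracle tool.

```python
def agent_pids(ps_output, script="odaBaseAgent.py"):
    """Return the PIDs of processes whose command line runs the script."""
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so CMD keeps its embedded spaces.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        cmd = fields[7]
        if script in cmd and not cmd.startswith("grep"):
            pids.append(int(fields[1]))
    return pids


def respawned(pids_before, pids_after):
    """True if an agent is running again under a PID not seen before."""
    return bool(pids_after) and not (set(pids_after) & set(pids_before))


before = agent_pids(
    "root 45494 84694 0 00:07 pts/0 00:00:00 grep -i odaBaseAgent.py\n"
    "root 62616 1 0 00:00 ? 00:00:00 /usr/bin/python "
    "/opt/oracle/oak/adapters/odaBaseAgent.py\n"
)
after = agent_pids(
    "root 48873 1 0 00:07 ? 00:00:00 /usr/bin/python "
    "/opt/oracle/oak/adapters/odaBaseAgent.py\n"
)
print(before, after, respawned(before, after))  # [62616] [48873] True
```

The sample lines are taken from the log in this note; the PIDs will differ on every system.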
 

Worked Example


"... I updated and then replaced the sharedrepoactions.py on both nodes as instructed.

      I killed the process on both nodes and waited for them to start.
      Once they were started I ran the show repo command and I still do not see the repositories.

      I restarted oak afterwards and the repositories were still not listed.

However, after a few minutes I tried to start both repo1 and repo2 and it said they were already online!!

NOTE: Based on user feedback, you may need to restart oak and respawn the process multiple times. Some users said they had to do this '...several times...'
      - We will try to provide better details as they become available and confirmed - CL
  

The original VM (VMNEW) also started up automatically on Node 0 and is up and running
..."


Below are the logs of the commands that were run after updating and replacing the sharedrepoactions.py files on both DOM1 Node 0 and Node1.


ps -ef | grep -i odaBaseAgent.py

  root 45494 84694 0 00:07 pts/0 00:00:00 grep -i odaBaseAgent.py
  root 62616 1 0 00:00 ? 00:00:00 /usr/bin/python /opt/oracle/oak/adapters/odaBaseAgent.py

[root@ODA1 bin]# kill 62616            <<<< this is only this example's process#
[root@ODA1 bin]# ps -ef | grep -i odaBaseAgent.py

root 48873 1 0 00:07 ? 00:00:00 /usr/bin/python /opt/oracle/oak/adapters/odaBaseAgent.py
root 49502 84694 0 00:08 pts/0 00:00:00 grep -i odaBaseAgent.py

[root@ODA1 bin]# ./oakcli show repo


NAME      TYPE NODENUM STATE
odarepo1  local     0   N/A
odarepo2  local     1   N/A


[root@ODA1 bin]# ./oakcli start repo repo1 -node 0

Resource is already ONLINE

[root@ODA1 bin]# ./oakcli start repo repo1 -node 1
Resource is already ONLINE

[root@ODA1 bin]# ./oakcli show repo


NAME      TYPE NODENUM STATE
odarepo1  local     0   N/A
odarepo2  local     1   N/A
repo1    shared     0   ONLINE
repo1    shared     1   ONLINE
repo2    shared     0   ONLINE
repo2    shared     1   ONLINE


[root@ODA1 bin]# ./oakcli start vm VMNEW
Resource is already ONLINE
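
The final check in the log above can be expressed as a short script. This Python 3 sketch assumes the "oakcli show repo" output has the four columns shown in this note (NAME, TYPE, NODENUM, STATE); the function names parse_show_repo and shared_repos_online are hypothetical and are not part of oakcli.

```python
def parse_show_repo(text):
    """Parse rows of NAME TYPE NODENUM STATE into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Keep only 4-column rows whose third column is a node number;
        # this skips the header, blank lines, and status messages.
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        name, repo_type, nodenum, state = fields
        rows.append({"name": name, "type": repo_type,
                     "node": int(nodenum), "state": state})
    return rows


def shared_repos_online(rows):
    """True if at least one shared repo is listed and all are ONLINE."""
    shared = [row for row in rows if row["type"] == "shared"]
    return bool(shared) and all(row["state"] == "ONLINE" for row in shared)


sample = """\
NAME      TYPE NODENUM STATE
odarepo1  local     0   N/A
odarepo2  local     1   N/A
repo1    shared     0   ONLINE
repo1    shared     1   ONLINE
repo2    shared     0   ONLINE
repo2    shared     1   ONLINE
"""

rows = parse_show_repo(sample)
print(len(rows))                  # 6
print(shared_repos_online(rows))  # True
```

With only the two local odarepo rows in the output, as in the first "show repo" run, shared_repos_online returns False, which matches the symptom described in this note.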
 

 


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.