Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1468364.1
Update Date:2017-03-02

Solution Type  Technical Instruction Sure

Solution  1468364.1 :   Exadata - How to Change interconnect bonding interface on the compute nodes  


Related Items
  • Exadata Database Machine V2
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Exadata>DB: Exadata_EST




In this Document
Goal
Solution
 Overview
 Details
 KNOWN ISSUES
References


This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process and therefore has not been subject to an independent technical review.

Applies to:

Exadata Database Machine V2 - Version All Versions and later
Information in this document applies to any platform.
***Checked for relevance on 09-Dec-2013***

Goal

As described in the Exadata Database Machine Owners guide, Chapter 8
Extending Oracle Exadata Database Machine, Configuring the New Hardware:

"Earlier releases of Oracle Exadata Database Machine X2-2 (with X4170 and X4275 servers) used bond0 and bond1 as the names for the bonded InfiniBand
and bonded Ethernet client networks, respectively. In the current release, bondib0 and bondeth0 are used for the bonded InfiniBand and bonded Ethernet
client networks. If you are adding new servers to an existing Oracle Exadata Database Machine X2-2 (with X4170 and X4275 servers), then ensure the
database servers use the same names for bonded configuration. You can either change the new database servers to match the existing server interface names,
or change the existing server interface names and Oracle Cluster Registry (OCR) configuration to match the new servers. Use the oifcfg utility to
change the OCR configuration. The interface names for Exadata Storage Servers do not have to be changed."

This document describes the steps to rename the bonding interface on the existing nodes of the cluster, primarily when expanding a cluster by adding another rack.

 

Applies to:

Only compute nodes running on SUN FIRE X4170 servers whose bonding interface is defined as bond0.

After the procedure, the bonding interface on the compute nodes is named bondib0.

Solution

Overview


The procedure is simple because it does not include configuring a new subnet.

The steps are:

  1. Determine if the CLUSTER_INTERCONNECTS parameter is used in the Oracle Database and Oracle ASM instances
  2. Shut down all cluster-managed services on each database server as the oracle user
  3. Modify the cluster interconnect interface to use the BONDIB0 interface on the first database server
  4. Shut down Oracle Clusterware and Oracle Clusterware CRS on each database server
  5. Change the InfiniBand IP addresses on each database server
  6. Start Oracle Clusterware on each database server
  7. Start all cluster-managed services using the SRVCTL utility
  8. Enable Oracle Clusterware CRS automatic restart on each database server
  9. Delete old cluster interconnect interface bond0
  10. Perform a health check of Oracle Exadata Rack using the steps described in My Oracle Support note 1070954.1.

 

Details



1. Determine if the CLUSTER_INTERCONNECTS parameter is used in the Oracle Database and Oracle ASM instances

The database and ASM instances should already have the cluster_interconnects parameter defined, set to their IP addresses on the
InfiniBand network.

Use the following SQL command:

sql> SELECT inst_id, name, value FROM gv$parameter WHERE name = 'cluster_interconnects';




If it is not defined, set it with ALTER SYSTEM commands such as:

sql>ALTER SYSTEM SET CLUSTER_INTERCONNECTS='192.168.3.1' SCOPE=SPFILE SID='+ASM1';
sql>ALTER SYSTEM SET CLUSTER_INTERCONNECTS='192.168.3.2' SCOPE=SPFILE SID='+ASM2';
sql>ALTER SYSTEM SET CLUSTER_INTERCONNECTS='192.168.3.3' SCOPE=SPFILE SID='+ASM3';

 

  • Run the command for each of the instances running on the cluster.
  • Each instance uses its own IP address on the InfiniBand network.
  • Repeat the procedure for all the RDBMS instances.
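
Because one statement is needed per instance, it can help to generate them from a list. The following is a minimal sketch, not part of the original procedure: the function name gen_cluster_interconnects and the SID=IP argument format are assumptions made for illustration, and the addresses are the example values used above.

```shell
# Print one ALTER SYSTEM statement per "SID=IP" argument.
# The SID/IP pairs are illustrative; take the real values from your own cluster.
gen_cluster_interconnects() {
    for pair in "$@"; do
        sid=${pair%%=*}    # text before the first '='
        ip=${pair#*=}      # text after the first '='
        printf "ALTER SYSTEM SET CLUSTER_INTERCONNECTS='%s' SCOPE=SPFILE SID='%s';\n" "$ip" "$sid"
    done
}

gen_cluster_interconnects +ASM1=192.168.3.1 +ASM2=192.168.3.2 +ASM3=192.168.3.3
```

The output can be reviewed and then pasted into a SQL*Plus session; the sketch itself never connects to a database.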


2. Shut down all cluster-managed services on each database server.

   Log in as the oracle user that owns the Grid Infrastructure:

  

$ srvctl stop home -o db_home -s state_filename -n node_name



3. Modify the cluster interconnect interface to use the BONDIB0 interface on the first database server


  * Log in as the oracle user that owns the Grid Infrastructure.

  * Set ORACLE_HOME to the Grid Infrastructure home.

  * Set ORACLE_SID to the ASM instance

  * List the available cluster interfaces using the following command:

   

$ oifcfg iflist

    The following is an example of the output:

    eth0  10.141.134.0
    eth3  10.141.140.0
    bondeth0  10.141.132.0
    bond0  192.168.8.0
    bond0  169.254.0.0

    Note:  Results differ from system to system; for example, bonding may not be used over the Ethernet adapters.

    In the example output above:

    eth0 is the management network
    eth3 is an additional subnet over 1 Gb Ethernet
    bondeth0 is the bonding interface for the public network
    bond0 is the bonding interface for the InfiniBand network



  * List the currently-assigned cluster interfaces using the following command:

   

$ oifcfg getif

    The following is an example of the output:

    bondeth0  10.141.132.0  global  public
    bond0     192.168.8.0  global  cluster_interconnect


  * Assign BONDIB0 as the global cluster interconnect interface using the following command:

    $ oifcfg setif -global c_interface/c_IP_address:cluster_interconnect

    In the preceding command, c_interface is the interface to be used for cluster interconnect,
    and c_IP_address is the IP address for the cluster interconnect.

    The following is an example of the command:

    $ oifcfg setif -global bondib0/192.168.8.0:cluster_interconnect



  * List the current interfaces using the following command:

   

$ oifcfg getif

    The following is an example of the output:

    bondeth0  10.141.132.0  global  public
    bond0     192.168.8.0  global  cluster_interconnect
    bondib0   192.168.8.0  global  cluster_interconnect
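
To check the result without reading the table by eye, the getif output can be filtered on its last column. This is a minimal sketch: the function name interconnect_ifaces is hypothetical, and it assumes the four-column output format shown in the examples above, where the last field is exactly cluster_interconnect.

```shell
# Read `oifcfg getif`-style output on stdin and print the names of the
# interfaces whose type column is cluster_interconnect.
interconnect_ifaces() {
    awk '$NF == "cluster_interconnect" { print $1 }'
}

# Sample taken from the example output above; on a live node you would
# pipe the real command instead:  oifcfg getif | interconnect_ifaces
printf '%s\n' \
    'bondeth0  10.141.132.0  global  public' \
    'bond0     192.168.8.0  global  cluster_interconnect' \
    'bondib0   192.168.8.0  global  cluster_interconnect' |
    interconnect_ifaces
```

At this point both bond0 and bondib0 are expected. Once the old interface has been deleted in step 9, only bondib0 should remain.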



4. Shut down Oracle Clusterware and Oracle Clusterware CRS on each database server

  * Log in as the root user.

    Shut down Oracle Clusterware CRS on each database server using the following command:

   

# GRID_HOME/grid/bin/crsctl stop crs -f



    Disable automatic Oracle Clusterware CRS restart on each database server using the following command:

   

# GRID_HOME/grid/bin/crsctl disable crs



5. Change the InfiniBand IP addresses on each database server

   * Modify the file /opt/oracle.cellos/cell.conf, replacing bond0 with bondib0.

  

There should be 3 replacements.

    

# mv /opt/oracle.cellos/cell.conf /opt/oracle.cellos/cell.conf.old
# sed 's/bond0/bondib0/g' /opt/oracle.cellos/cell.conf.old > /opt/oracle.cellos/cell.conf
# grep bondib0 /opt/oracle.cellos/cell.conf

The grep should return something like:

                              'Name' => 'bondib0',
                              'Master' => 'bondib0'
                              'Master' => 'bondib0'
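
Since exactly 3 replacements are expected, the count can be checked before the file is rewritten. The following is a minimal sketch under stated assumptions, not the procedure's own method: the function name rename_bond is hypothetical, it matches the quoted string 'bond0' (stricter than the bare pattern used above), and it copies the file to a .old backup instead of moving it.

```shell
# Rewrite 'bond0' -> 'bondib0' in a cell.conf-style file, keeping a .old backup,
# but only when the file holds exactly the expected number of matching lines.
rename_bond() (
    conf=$1
    expected=${2:-3}
    # grep -c counts matching lines; cell.conf has one entry per line.
    found=$(grep -c "'bond0'" "$conf")
    if [ "$found" -ne "$expected" ]; then
        echo "expected $expected lines with 'bond0', found $found" >&2
        exit 1
    fi
    cp -p "$conf" "$conf.old" &&
        sed "s/'bond0'/'bondib0'/g" "$conf.old" > "$conf"
)
```

A call such as rename_bond /opt/oracle.cellos/cell.conf leaves the file untouched and returns a non-zero status if the count is not 3, which is a cue to inspect the file by hand.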

For reference, below is an example of file cell.conf:

$VAR1 = {
          'Internal' => {
                          'Interface infiniband prefix' => 'ib',
                          'Interface ethernet prefix' => 'eth'
                        },
          'Hostname' => 'host.domain.com',
          'Timezone' => 'America/New_York',
          'Interfaces' => [
                            {
                              'IP address' => '192.168.10.5',
                              'Hostname' => 'host-priv.domain.com',
                              'Netmask' => '255.255.252.0',
                              'Net type' => 'Private',
                              'Slaves' => [
                                            'ib0',
                                            'ib1'
                                          ],
                              'Name' => 'bondib0',
                              'State' => 1
                            },
                            {
                              'Hostname' => 'host.domain.com',
                              'IP address' => 'XX.XX.XX.XX',
                              'Netmask' => '255.255.254.0',
                              'Net type' => 'Management',
                              'Name' => 'eth0',
                              'State' => 1,
                              'Gateway' => 'XX.XX.XX.XX'
                            },
                            {
                              'Hostname' => 'host-dr.domain.com',
                              'IP address' => 'XX.XX.XX.XX',
                              'Netmask' => '255.255.255.0',
                              'Net type' => 'Other',
                              'Name' => 'eth3',
                              'State' => 1,
                              'Gateway' => 'XX.XX.XX.XX'
                            },
                            {
                              'Hostname' => 'hostpublic.domain.com',
                              'IP address' => 'XX.XX.XX.XX',
                              'Netmask' => '255.255.254.0',
                              'Net type' => 'SCAN',
                              'Name' => 'eth4',
                              'State' => 1,
                              'Gateway' => 'XX.XX.XX.XX'
                            },
                            {
                              'Name' => 'ib0',
                              'State' => 1,
                              'Master' => 'bondib0'
                            },
                            {
                              'Name' => 'ib1',
                              'State' => 1,
                              'Master' => 'bondib0'
                            }
                          ],
          'Ntp drift' => '/var/lib/ntp/drift',
          'Version' => '11.2.2.3.0',
          'Ntp servers' => [
                             'XX.XX.XX.XX'
                           ],
          'Nameservers' => [
                             'XX.XX.XX.XX',
                             'XX.XX.XX.XX',
                             'XX.XX.XX.XX'
                           ],
          'Unlinked interfaces' => [],
          'Node type' => 'db',
          'Default gateway device' => 'eth4',
          'ilom' => {
                      'ILOM Nameserver' => 'XX.XX.XX.XX',
                      'ILOM Timezone' => 'America/New_York',
                      'ILOM Netmask' => '255.255.254.0',
                      'ILOM IP address' => 'XX.XX.XX.XX',
                      'ILOM Search' => 'us.oracle.com',
                      'ILOM Second NTP server' => '0.0.0.0',
                      'ILOM Short Hostname' => 'host-ilom',
                      'ILOM Fully qualified hostname' => 'host-ilom.us.oracle.com',
                      'ILOM First NTP server' => 'XX.XX.XX.XX',
                      'ILOM Gateway' => 'XX.XX.XX.XX',
                      'ILOM Use NTP Servers' => 'enabled'
                    }
        };




   * Execute /opt/oracle.cellos/ipconf -nocodes -f

   * Validate configuration files:

# cd /etc/sysconfig/network-scripts
# grep bondib0 ifc*

This should report entries in 3 files:

ifcfg-bondib0:DEVICE=bondib0
ifcfg-ib0:MASTER=bondib0
ifcfg-ib1:MASTER=bondib0
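
The same check can be scripted so that it fails when the number of files is not 3. This is a minimal sketch: the function name check_ifcfg and its two optional arguments are assumptions made for illustration.

```shell
# Print the ifcfg-* files in a directory that mention a given bond name,
# and fail unless exactly three files do (the bond itself plus two slaves).
check_ifcfg() (
    dir=${1:-/etc/sysconfig/network-scripts}
    bond=${2:-bondib0}
    # -l lists file names only; -w avoids matching longer interface names.
    files=$(grep -l -w "$bond" "$dir"/ifcfg-* 2>/dev/null)
    printf '%s\n' "$files"
    [ "$(printf '%s\n' "$files" | grep -c .)" -eq 3 ]
)
```

Running check_ifcfg with no arguments inspects /etc/sysconfig/network-scripts for bondib0; a non-zero exit status suggests ipconf did not regenerate every file.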



Example of the configuration files:

## ifcfg-bondib0

#### DO NOT REMOVE THESE LINES ####
#### %GENERATED BY CELL% ####
DEVICE=bondib0
USERCTL=no
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.10.5
NETMASK=255.255.252.0
NETWORK=192.168.8.0
BROADCAST=192.168.11.255
BONDING_OPTS="mode=active-backup miimon=100 downdelay=5000 updelay=5000 num_grat_arp=100"
IPV6INIT=no
MTU=65520
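
The NETWORK and BROADCAST values in ifcfg-bondib0 follow from IPADDR and NETMASK: the network is the address ANDed with the mask, and the broadcast is the address ORed with the inverted mask. The sketch below reproduces that arithmetic with shell integers; the helper names are hypothetical and the script is only a way to cross-check a generated file.

```shell
# Convert a dotted quad to a 32-bit integer (runs in a subshell so the
# IFS change does not leak into the caller).
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit integer back to a dotted quad.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# network = address AND mask
network_of() {
    int_to_ip $(( $(ip_to_int "$1") & $(ip_to_int "$2") ))
}

# broadcast = address OR (NOT mask), limited to 32 bits
broadcast_of() {
    int_to_ip $(( $(ip_to_int "$1") | (~$(ip_to_int "$2") & 0xFFFFFFFF) ))
}

network_of   192.168.10.5 255.255.252.0   # prints 192.168.8.0
broadcast_of 192.168.10.5 255.255.252.0   # prints 192.168.11.255
```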



## ifcfg-ib0

#### DO NOT REMOVE THESE LINES ####
#### %GENERATED BY CELL% ####
DEVICE=ib0
USERCTL=no
ONBOOT=yes
MASTER=bondib0
SLAVE=yes
BOOTPROTO=none
HOTPLUG=no
IPV6INIT=no
CONNECTED_MODE=yes
MTU=65520



## ifcfg-ib1

#### DO NOT REMOVE THESE LINES ####
#### %GENERATED BY CELL% ####
DEVICE=ib1
USERCTL=no
ONBOOT=yes
MASTER=bondib0
SLAVE=yes
BOOTPROTO=none
HOTPLUG=no
IPV6INIT=no
CONNECTED_MODE=yes
MTU=65520

 


 

   * Delete old file ifcfg-bond0

   * Reboot the nodes

   * Validate that the connection over IPoIB works by pinging the cells and the other compute nodes:

    

#rds-ping -c 10 <IPoIB>
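
With several cells and compute nodes to test, the ping can be wrapped in a loop. This is a minimal sketch: the PROBE variable and the probe_all function are hypothetical names, and the loop assumes the probe command exits non-zero when the peer is unreachable, which should be confirmed for rds-ping on your image before relying on the summary.

```shell
# Command used to probe one address. It defaults to the rds-ping call from
# this note and is a variable only so the loop can be tried without
# InfiniBand hardware.
PROBE=${PROBE:-rds-ping -c 10}

# Probe every address given as an argument; return non-zero if any fails.
probe_all() {
    failed=0
    for addr in "$@"; do
        # $PROBE is left unquoted on purpose so its options are split into words.
        if $PROBE "$addr" >/dev/null 2>&1; then
            echo "OK   $addr"
        else
            echo "FAIL $addr"
            failed=1
        fi
    done
    return "$failed"
}
```

It would be called with the IPoIB addresses of the cells and the other compute nodes, for example probe_all followed by the list of addresses.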



6. Start Oracle Clusterware on each database server

  

# GRID_HOME/grid/bin/crsctl start crs



7. Start all cluster-managed services using the SRVCTL utility
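
The note gives no command for this step. A likely counterpart to the srvctl stop home command from step 2 is srvctl start home with the same state file; treat that as an assumption and confirm it against the SRVCTL reference for your release. The sketch below only prints the command so it can be reviewed first; the function name start_home_cmd is hypothetical, and the arguments are the same placeholders used in step 2.

```shell
# Print the srvctl command that would restart what `srvctl stop home`
# recorded in the state file. Nothing is executed.
start_home_cmd() {
    printf 'srvctl start home -o %s -s %s -n %s\n' "$1" "$2" "$3"
}

start_home_cmd db_home state_filename node_name
```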


8. Enable Oracle Clusterware CRS automatic restart on each database server

  

# GRID_HOME/grid/bin/crsctl enable crs



9. Delete old cluster interconnect interface bond0

   * Log in as the owner of the Grid Infrastructure and execute:

    

$ oifcfg delif -global bond0



   * Validate cluster_interconnect is using bondib0

     

$oifcfg getif
      bondeth0  10.141.132.0  global  public
      bondib0  192.168.8.0  global  cluster_interconnect


10. Perform a health check of Oracle Exadata Rack using the steps described in My Oracle Support note 1070954.1.

KNOWN ISSUES

  1.  The execution of ipconf reports errors.


While executing ipconf, the following messages may be observed:

Use of uninitialized value in string ne at /opt/oracle.cellos/ipconf.pl line 2523.
Use of uninitialized value in concatenation (.) or string at /opt/oracle.cellos/ipconf.pl line 2472.

Subsequently, the start of network services failed:

# dmesg | grep bond
bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.

Solution

As a result of an incomplete execution of ipconf, some of the configuration files used for the bonding interface may not have been
updated.

1. Check the current entry in the file /etc/modprobe.conf. Most likely it still has an entry for the old bonding name, such as:

alias bond0 bonding

2. If that is the case, change the entry in the file to:

alias bondib0 bonding

3. Restart the network services
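
The alias change in steps 1 and 2 can be sketched as follows. The function name fix_bond_alias and the .bak backup are assumptions made for illustration; only the alias line described above is touched.

```shell
# Rename the bonding alias in a modprobe.conf-style file, keeping a .bak copy.
# Does nothing if no "alias bond0 ..." line is present.
fix_bond_alias() (
    conf=${1:-/etc/modprobe.conf}
    grep -q '^alias[[:space:]]\{1,\}bond0[[:space:]]' "$conf" || exit 0
    cp -p "$conf" "$conf.bak" &&
        sed 's/^\(alias[[:space:]]\{1,\}\)bond0\([[:space:]]\)/\1bondib0\2/' "$conf.bak" > "$conf"
)
```

After the file is corrected, the network services are restarted as in step 3; on images of this generation that is typically done with service network restart, though that command is an assumption and not taken from this note.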

References

<BUG:14079368> - ADDNODE FAILING WITH CTSS DAEMON ABORTING

  Copyright © 2018 Oracle, Inc.  All rights reserved.