Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1921433.1
Update Date:2018-01-10
Keywords:

Solution Type  Problem Resolution Sure

Solution  1921433.1 :   Running the networksetup-two Script for a New Installation on Oracle Big Data Appliance (BDA) X3-2 Fails with "networksetup-two: ping (element# 2) failed"  


Related Items
  • Big Data Appliance X3-2 Hardware
  • Big Data Appliance X3-2 Full Rack
  • Big Data Appliance X3-2 In-Rack Expansion
  • Big Data Appliance X4-2 Hardware
  • Big Data Appliance X4-2 Full Rack
  • Big Data Appliance X4-2 Starter Rack
  • Big Data Appliance X4-2 In-Rack Expansion
  • Big Data Appliance X3-2 Starter Rack
Related Categories
  • PLA-Support>Eng Systems>BDA>Big Data Appliance>DB: BDA_EST




In this Document
Symptoms
Cause
Solution


Created from <SR 3-9500359951>

Applies to:

Big Data Appliance X3-2 Hardware - Version All Versions and later
Big Data Appliance X4-2 Starter Rack - Version All Versions and later
Big Data Appliance X4-2 Hardware - Version All Versions and later
Big Data Appliance X4-2 Full Rack - Version All Versions and later
Big Data Appliance X3-2 Full Rack - Version All Versions and later
Linux x86-64

Symptoms

The following symptoms exist while configuring the Oracle Big Data Appliance network.

 

1. The networksetup-two script fails when trying to ping the client data gateway as shown below:

...

networksetup-two: passed
networksetup-two: ssh into servers on client network by name
networksetup-two: passed
networksetup-two: test ntp servers
networksetup-two: passed
networksetup-two: ping client gateway
networksetup-two: ping 10.xx.xx.xxx (element# 2) failed

 

2. The bdacheckvnics script also shows a "no" for pinging the gateway:

[root@node05 ~]# cd /opt/oracle/*/network
[root@node05 network]# ./bdacheckvnics

    host      if   status actv primary    switch          gw port   ping gw
============== ===  ====== ==== === ====================== ========= ======
node05       ib0   up    yes  yes ---                    ---       ---
node05       ib1   up    no   no  ---                    ---       ---
node05       eth8  up    no   no  <switch>-ib-sw1        0A-ETH-1  ---
node05       eth9  up    yes  yes <switch>-ib-sw2        0A-ETH-1  no

 

3. The arp test shows output like:

[root@node05 network]# arp -a
node03-adm (10.xx.xxx.xx) at 00:21:28:ff:51:9c [ether] on eth0
? (10.xx.xx.xx) at 00:18:51:60:66:04 [ether] on bondeth0
? (10.xx.xx.xxx) at 00:00:5e:00:01:2f [ether] on bondeth0
? (10.xx.xx.xxx) at 00:03:d2:f3:43:50 [ether] on bondeth0

 

4. The bdachecknet script fails with errors as follows:

...

bdachecknet: test admin name array matches ip array
1x.xx.xxx.xx
bdachecknet: host name json element does not correspond to json element ip address: node01-adm (!=  index 0)
1x.xx.xxx.xx
bdachecknet: host name json element does not correspond to json element ip address: node02-adm (!=  index 1)
1x.xx.xxx.xx
bdachecknet: host name json element does not correspond to json element ip address: node03-adm (!=  index 2)
1x.xx.xxx.xx
bdachecknet: host name json element does not correspond to json element ip address: node04-adm (!=  index 3)
...

 

 

Cause

The root cause of these errors is:

1. A customer choice to block Admin IPs in DNS.

and at the same time

2. Setting  "ADMIN_HOSTS_IN_DNS" : "true" in /opt/oracle/bda/BdaDeploy.json.

The result of this combination is that the client gateway has ping responses disabled and the network tests are failing because the BDA servers don't have access to DNS servers as specified.  Basically the active client gateway has a blocking firewall.

Solution

To resolve this issue, first verify that /opt/oracle/bda/BdaDeploy.json has "ADMIN_HOSTS_IN_DNS" set to "true":

"ADMIN_HOSTS_IN_DNS" : "true"

Once confirmed, the solution is to rerun the BDA Configuration Utility so that "ADMIN_HOSTS_IN_DNS" is set to "FALSE".

Follow these steps:

a. If "ADMIN_HOSTS_IN_DNS" : "true", rerun the BDA Configuration Utility, answer "No" to the question "Are administration host name entries in DNS?", and regenerate the files.

Ensure that the newly generated BdaDeploy.json file has:

"ADMIN_HOSTS_IN_DNS" : "FALSE"

b. Back up the current copy of /opt/oracle/bda/BdaDeploy.json on Node 1 as the 'root' user.

# cp /opt/oracle/bda/BdaDeploy.json /opt/oracle/bda/BdaDeploy.jsonORIG

 

c.  Copy the newly generated BdaDeploy.json to /opt/oracle/bda/ on Node 1.

d. Replace the old /opt/oracle/bda/BdaDeploy.json with the new one on all nodes with the following command.  Do this on Node 1 as 'root' user.

# dcli -C -f /opt/oracle/bda/BdaDeploy.json -d /opt/oracle/bda/BdaDeploy.json

 

Once "ADMIN_HOSTS_IN_DNS" is set to "FALSE", the following additional tests should produce the output described below. Note that the value is case sensitive: the tests will still fail if it is set to "false". If the tests report the output shown below, the Mammoth installation can be started.
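Because the exact spelling of the value matters, it can help to check it with a script rather than by eye. The following is a minimal sketch only: the function names admin_hosts_flag and admin_hosts_disabled are hypothetical and not part of the BDA software, and the sketch assumes the setting appears on a single line in the form shown above.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the BDA software.

# Print the value of ADMIN_HOSTS_IN_DNS from the file given as $1.
# Assumes the setting sits on one line as:  "ADMIN_HOSTS_IN_DNS" : "value"
admin_hosts_flag() {
    sed -n 's/.*"ADMIN_HOSTS_IN_DNS"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Succeed only when the value is exactly FALSE (upper case),
# which is the spelling this note says is required.
admin_hosts_disabled() {
    [ "$(admin_hosts_flag "$1")" = "FALSE" ]
}
```

For example, running admin_hosts_disabled /opt/oracle/bda/BdaDeploy.json && echo "value is FALSE" on a node prints the message only when the file carries the upper-case value.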

1. Ping the BDA nodes using the admin network to confirm the ping works.  This is done from each BDA node on the BDA Cluster. Ensure that ping works to all the BDA nodes using the admin network. All BDA nodes should be able to ping each of the other BDA nodes in the cluster.

a. In the /opt/oracle/bda/BdaDeploy.json file, find the ETH0_NAMES section and the ETH0_IPS section that follows it. The ETH0_IPS entries are the admin IP addresses to ping:

"ETH0_NAMES":
 [
  "bda1node01-adm",
  "bda1node02-adm",
  "bda1node03-adm",
...
 ],
 "ETH0_IPS":
 [
  "1x.xxx.xxx.xx",
  "1x.xxx.xxx.xx",
  "1x.xxx.xxx.xx",
  ...

 b. From each BDA node on the BDA Cluster ping each of the IP Addresses found under ETH0_IPS using ping -c # <IP address for node>

[root@bda1node01 ~]# ping -c 1 <IP address node01-adm>
PING 1x.xxx.xxx.xx(1x.xxx.xxx.xx) 56(84) bytes of data.
64 bytes from 1x.xxx.xxx.xx: icmp_seq=1 ttl=64 time=0.152 ms

--- 1x.xxx.xxx.xx ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.152/0.152/0.152/0.000 ms


[root@bda1node01 ~]# ping -c 1 <IP address node02-adm>
PING 1x.xxx.xxx.xx(1x.xxx.xxx.xx) 56(84) bytes of data.
64 bytes from 1x.xxx.xxx.xx: icmp_seq=1 ttl=64 time=0.152 ms

--- 1x.xxx.xxx.xx ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.152/0.152/0.152/0.000 ms

[root@bda1node01 ~]# ping -c 1 <IP address node0n-adm>
PING 1x.xxx.xxx.xx(1x.xxx.xxx.xx) 56(84) bytes of data.
64 bytes from 1x.xxx.xxx.xx: icmp_seq=1 ttl=64 time=0.152 ms

--- 1x.xxx.xxx.xx ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.152/0.152/0.152/0.000 ms

...

After pinging all IP addresses in ETH0_IPS, ssh to the next node and repeat the pings, as in this example:


[root@bda1node01 ~]# ssh <bda1node02-adm>
Last login: Wed Sep 10 06:34:49 2014 from <bda node>

[root@bda1node02 ~]# ping -c 1 <IP address node01-adm>
PING 1x.xxx.xxx.xx(1x.xxx.xxx.xx) 56(84) bytes of data.
64 bytes from 1x.xxx.xxx.xx: icmp_seq=1 ttl=64 time=0.152 ms

--- 1x.xxx.xxx.xx ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.152/0.152/0.152/0.000 ms

[root@bda1node02 ~]# ping -c 1 <IP address node02-adm>
PING 1x.xxx.xxx.xx(1x.xxx.xxx.xx) 56(84) bytes of data.
64 bytes from 1x.xxx.xxx.xx: icmp_seq=1 ttl=64 time=0.152 ms

--- 1x.xxx.xxx.xx ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.152/0.152/0.152/0.000 ms

 

[root@bda1node02 ~]# ping -c 1 <IP address node0n-adm>
PING 1x.xxx.xxx.xx(1x.xxx.xxx.xx) 56(84) bytes of data.
64 bytes from 1x.xxx.xxx.xx: icmp_seq=1 ttl=64 time=0.152 ms

--- 1x.xxx.xxx.xx ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.152/0.152/0.152/0.000 ms



[root@bda1node02 ~]# exit
logout
Connection to bda1node02-adm closed.

...

Repeat on each BDA node.
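The per-address pings in this step can also be run in a loop. The following is a minimal sketch, assuming the addresses listed under ETH0_IPS are passed as arguments; ping_all is a hypothetical helper name and not part of the BDA software.

```shell
#!/bin/bash
# Hypothetical helper, not part of the BDA software.
# Send one ping to each address given as an argument and report the result.
# Returns 0 if every address answered, 1 if at least one did not.
ping_all() {
    ping_all_failed=0
    for ping_all_ip in "$@"; do
        if ping -c 1 "$ping_all_ip" >/dev/null 2>&1; then
            echo "$ping_all_ip: ok"
        else
            echo "$ping_all_ip: FAILED"
            ping_all_failed=1
        fi
    done
    return "$ping_all_failed"
}
```

Call it on each node with the admin addresses from the file, for example ping_all <IP address node01-adm> <IP address node02-adm>, and investigate any address reported as FAILED before continuing.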

  

2. From the BDA cluster nodes, confirm client network connectivity by pinging an external host IP address. Choose any IP address that is not on the BDA and not on the admin network, and check that each BDA server can ping it via the client gateway. This confirms that the client network is accessible to all servers of the BDA.

[root@bda1node01 ~]# ping -c 1 <external host IP address>
PING 1x.xxx.xx.xxx (1x.xxx.xx.xxx) 56(84) bytes of data.
64 bytes from 1x.xxx.xx.xxx: icmp_seq=1 ttl=115 time=40.6 ms

--- 1x.xxx.xx.xxx ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 40ms
rtt min/avg/max/mdev = 40.616/40.616/40.616/0.000 ms


[root@bda1node02 ~]# ping -c 1 <external host IP address>
PING 1x.xxx.xx.xxx (1x.xxx.xx.xxx) 56(84) bytes of data.
64 bytes from 1x.xxx.xx.xxx: icmp_seq=1 ttl=115 time=40.6 ms

--- 1x.xxx.xx.xxx ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 40ms
rtt min/avg/max/mdev = 40.616/40.616/40.616/0.000 ms

...

Repeat on each BDA node.

  


3. Ping the BDA client IPs from a host on the client network. This example is from a Windows desktop external to the BDA. From the host on the client network, ping each BDA node via its client IP address as below:

C:\Users\<username>> ping -n 1 1x.xxx.xxx.xx

Pinging 1x.xxx.xxx.xx with 32 bytes of data:
Reply from 1x.xxx.xxx.xx: bytes=32 time=36ms TTL=51

Ping statistics for 1x.xxx.xxx.xx:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 36ms, Maximum = 36ms, Average = 36ms

C:\Users\<username>> ping -n 1 1x.xxx.xxx.xx

Pinging 1x.xxx.xxx.xx with 32 bytes of data:
Reply from 1x.xxx.xxx.xx: bytes=32 time=36ms TTL=51

Ping statistics for 1x.xxx.xxx.xx:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 36ms, Maximum = 36ms, Average = 36ms

C:\Users\<username>> ping -n 1 1x.xxx.xxx.xx

Pinging 1x.xxx.xxx.xx with 32 bytes of data:
Reply from 1x.xxx.xxx.xx: bytes=32 time=36ms TTL=51

Ping statistics for 1x.xxx.xxx.xx:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 36ms, Maximum = 36ms, Average = 36ms

...

  

It may not be possible to ping the BDA client IP addresses; however, you should be able to ssh to the client IPs of all the nodes in the BDA cluster. If the client network is accessible from all BDA hosts, it should be OK to proceed with the Mammoth install.

4. The bdachecknet script output should pass all checks up to "ping client gateway."

Example of the expected output:

# bdachecknet
bdachecknet: analyse /opt/oracle/bda/BdaDeploy.json
bdachecknet: passed
bdachecknet: checking for  BdaExpansion.json
bdachecknet: ping test private infiniband ips (bondib0 40gbs)
bdachecknet: passed
bdachecknet: ping test admin ips (eth0 1gbs)
bdachecknet: passed
bdachecknet: test client network (eoib) resolve and reverse resolve
bdachecknet: passed
bdachecknet: test client name array matches ip array
bdachecknet: passed
bdachecknet: ping servers on client network by ip
bdachecknet: passed
bdachecknet: test ntp servers
bdachecknet: passed
bdachecknet: ping client gateway
bdachecknet: ping 1x.xx.xx.xxx (element# 2) failed

5. The script normally runs two more checks after the client gateway ping. Because it stops at the failed gateway ping shown above, run these two checks manually to confirm that they would have succeeded:

a. Test that "arp -a" completes without errors i.e. run "arp -a >/dev/null ; echo $?" and ensure that the only output is "0".

arp -a >/dev/null ; echo $?

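Example output when the check is run on all nodes with dcli. The command is in single quotes so that $? is evaluated on each node rather than by the local shell:

# dcli 'arp -a >/dev/null ; echo $?'
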
1x.xx.xxx.xx: 0
1x.xx.xxx.xx: 0
1x.xx.xxx.xx: 0
...
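On a full rack the list of per-node results is long, so it can be easier to print only the nodes that did not return "0". The following is a minimal sketch, assuming the dcli output has the "address: status" form shown above; nonzero_nodes is a hypothetical helper name and not part of the BDA software.

```shell
#!/bin/bash
# Hypothetical helper, not part of the BDA software.
# Read lines of the form "host: status" on stdin and print the host
# of every line whose status is anything other than 0.
nonzero_nodes() {
    awk -F': *' 'NF >= 2 && $NF != "0" { print $1 }'
}

# Example use on Node 1 (single quotes keep $? from being expanded locally):
#   dcli 'arp -a >/dev/null ; echo $?' | nonzero_nodes
```

No output means every node reported "0".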

 

b. Run /opt/oracle/bda/network/bdacheckvnics.

# /opt/oracle/bda/network/bdacheckvnics

Sample correct output follows:

    host      if   status actv primary    switch          gw port   ping gw
============== ===  ====== ==== === ====================== ========= ======
rack1bda01   ib0   up    yes  yes ---                    ---       ---
rack1bda01   ib1   up    no   no  ---                    ---       ---
rack1bda01   eth8  up    no   no  rack1sw-ib2           0A-ETH-1  ---
rack1bda01   eth9  up    yes  yes rack1sw-ib3           0A-ETH-1  yes

 

However in this case, "correct" output is expected to be similar except "ping gw" will be a "no" instead of "yes" as shown below:

    host      if   status actv primary    switch          gw port   ping gw
============== ===  ====== ==== === ====================== ========= ======
node05       ib0   up    yes  yes ---                    ---       ---
node05       ib1   up    no   no  ---                    ---       ---
node05       eth8  up    no   no  <switch>-ib-sw1        0A-ETH-1  ---
node05       eth9  up    yes  yes <switch>-ib-sw2        0A-ETH-1  no
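
To read only the relevant cell of the bdacheckvnics output, the active vNIC row can be filtered out of the table. The following is a minimal sketch, assuming the whitespace-separated column layout shown in the tables above and switch names without spaces; active_gw_ping is a hypothetical helper name and not part of the BDA software.

```shell
#!/bin/bash
# Hypothetical helper, not part of the BDA software.
# Read bdacheckvnics output on stdin and, for each eth interface whose
# "actv" column is "yes", print the interface name and its "ping gw" value.
active_gw_ping() {
    awk '$2 ~ /^eth/ && $4 == "yes" { print $2, $NF }'
}

# Example use:
#   /opt/oracle/bda/network/bdacheckvnics | active_gw_ping
```

For the node05 table above this would print "eth9 no", which is the result expected in this situation.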

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.