Sun Microsystems, Inc. Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1988829.1
Update Date: 2015-08-14

Solution Type: Problem Resolution Sure

Solution 1988829.1: InfiniBand HCA Replaced and Post Hardware Replacement IPMP bondib0 Group Is Marked As Failed


Related Items
  • Oracle SuperCluster T5-8 Full Rack
  • SPARC SuperCluster T4-4
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>SPARC SuperCluster>DB: SuperCluster_EST




Applies to:

Oracle SuperCluster T5-8 Full Rack - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 - Version All Versions to All Versions [Release All Releases]
Oracle Solaris on SPARC (64-bit)

Symptoms

Grid Infrastructure fails to start on a SuperCluster database compute node, and the underlying IB IPMP configuration is not persistent after the IB card replacement.

No IB hosts are visible after the card replacement.
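For example, when the fabric nodes are listed from one of the IB switches, the replaced node's HCA may be missing from the output (an illustrative check; ibhosts is the standard OpenFabrics fabric-listing command available from the switch CLI):

# ibhosts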

Changes

An InfiniBand HCA (host channel adapter) card was replaced on a SuperCluster database compute node.

Cause

After the IB card is replaced and the node is rebooted, the Oracle Clusterware stack fails to come online. The issue is caused by the IPMP bondib0 group retaining the failed status it acquired when the HCA card failed.

E.g.
root@dbnode01:~# ipmpstat -g
GROUP       GROUPNAME   STATE     FDT       INTERFACES
bondib0     bondib0     failed    --        --             --->  This suggests the IPMP group is configured, but it is incorrect!
bondeth0    bondeth0    ok        --        eth2 eth1

 

In some cases the presence of IB partitions on the switch can cause this; however, most deployments do not have partitioning configured. You can verify this on every IB switch, as shown below.

E.g.

# smpartition list active
# Sun DCS IB partition config file
# This file is generated, do not edit
#! version_number : 1
Default=0x7fff, ipoib : ALL_CAS=full, ALL_SWITCHES=full, SELF=full;


The above output shows only the default partition, so no partitions are configured.
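To check every switch in one pass, a loop such as the following can be used (the switch hostnames ibswitch01-03 are placeholders; substitute the IB switch names in your rack):

root@dbnode01:~# for sw in ibswitch01 ibswitch02 ibswitch03; do
> echo "== $sw =="
> ssh root@$sw smpartition list active
> done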

Solution

1.  If CRS is partially running or attempting to start, shut it down on the problem node.

root@dbnode01:~# crsctl stop crs -f
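Optionally, confirm that no Clusterware daemons are still running before proceeding (the process-name pattern is illustrative):

root@dbnode01:~# ps -ef | grep -E 'ohasd|crsd|ocssd' | grep -v grep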


2.  Confirm the IP address that was originally used for the bondib0 interface.

Ways to retrieve this information:

- See if the host/IP information is stored in /etc/hosts.
- Check the value of the cluster_interconnects parameter in the spfile/init.ora, if configured.
- Check the original deployment documentation.
- Check the ASM instance alert log, scanning backwards for the string 'CELL communication is configured'; this lists the IP used at the last startup (see the example below).
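For example, the ASM alert log can be searched as follows (the path is an assumption based on a typical Grid Infrastructure diagnostic destination; adjust it for your environment):

root@dbnode01:~# grep 'CELL communication is configured' \
    /u01/app/grid/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log | tail -1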


3.  From a working node, confirm how the setup should be configured, including the subnet mask:

E.g.

root@dbnode02:~# ipmpstat -g
GROUP       GROUPNAME   STATE     FDT       INTERFACES
bondib0     bondib0     ok        --        bondib0_1 bondib0_0              -------> IPMP group configuration
bondeth0    bondeth0    ok        --        eth2 eth1

root@dbnode02:~# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
eth0/v4           static   ok           97.253.193.18/26
eth3/v4a          static   ok           172.17.80.239/22
bondeth0/v4       static   ok           97.253.193.82/26
bondeth0/v4a      static   ok           97.253.193.90/26
bondib0/v4        static   ok           192.168.2.200/26                    -------> IP address of this working node and its netmask
net6/v4           static   ok           169.254.182.77/24
lo0/v6            static   ok           ::1/128
eth3/v4           static   disabled     97.253.193.146/26
eth3/bkp          static   disabled     172.17.80.239/22

 


We are using the above output only as a reference, so that we can confirm the correct IPMP group setup, the IP address, and the netmask. In this example the working node uses 192.168.2.200/26, so the problem node takes its own address in the same /26 subnet (192.168.2.199, used in step 4 below).



4. On the problem node, re-create the IB partition data links and the IPMP group as follows:

root@dbnode01:~# dladm create-part -l ib0 -P 0xffff bondib0_0
root@dbnode01:~# dladm create-part -l ib1 -P 0xffff bondib0_1

root@dbnode01:~# dladm show-part
LINK         PKEY  OVER         STATE    FLAGS
bondib0_0    FFFF  ib0          unknown  ----
bondib0_1    FFFF  ib1          unknown  ----
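PKEY 0xFFFF is the default full-membership partition key, which is consistent with the unpartitioned switch configuration confirmed earlier. Optionally, verify that the physical IB ports are up and carry this default PKEY:

root@dbnode01:~# dladm show-ib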


root@dbnode01:~# ipmpstat -g
GROUP       GROUPNAME   STATE     FDT       INTERFACES
bondib0     bondib0     failed    --        --                  
bondeth0    bondeth0    ok        --        eth2 eth1

Next, remove the existing IP interface object on bondib0_0:

root@dbnode01:~# ipadm delete-ip bondib0_0

root@dbnode01:~# ipadm show-if
IFNAME     CLASS    STATE    ACTIVE OVER
lo0        loopback ok       yes    --
eth0       ip       ok       yes    --
eth1       ip       ok       yes    --
eth2       ip       ok       yes    --
eth3       ip       failed   no     --
bondeth0   ipmp     ok       yes    eth1 eth2
bondib0_0  ip       down     no     --
bondib0_1  ip       down     no     --

root@dbnode01:~# ipadm create-ipmp bondib0
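If the underlying interfaces do not rejoin the group on their own (their IPMP group membership may not have persisted), they can be added explicitly using standard Solaris 11 syntax:

root@dbnode01:~# ipadm add-ipmp -i bondib0_0 -i bondib0_1 bondib0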

root@dbnode01:~# ipadm show-if
IFNAME     CLASS    STATE    ACTIVE OVER
lo0        loopback ok       yes    --
eth0       ip       ok       yes    --
eth1       ip       ok       yes    --
eth2       ip       ok       yes    --
eth3       ip       failed   no     --
bondeth0   ipmp     ok       yes    eth1 eth2
bondib0_0  ip       ok       yes    --
bondib0_1  ip       ok       yes    --
bondib0    ipmp     down     no     bondib0_0 bondib0_1

 

You will now need the IP address to be used for this interface on the problem node, as determined in step 2.

 

root@dbnode01:~# ipadm create-addr -T static -a local=192.168.2.199/26 bondib0/v4

root@dbnode01:~# ipadm show-if
IFNAME     CLASS    STATE    ACTIVE OVER
lo0        loopback ok       yes    --
eth0       ip       ok       yes    --
eth1       ip       ok       yes    --
eth2       ip       ok       yes    --
eth3       ip       failed   no     --
bondeth0   ipmp     ok       yes    eth1 eth2
bondib0_0  ip       ok       yes    --
bondib0_1  ip       ok       yes    --
bondib0    ipmp     ok       yes    bondib0_0 bondib0_1


The IP address is now up.

Verify you can rds-ping this address from the remote nodes.
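For example, from one of the other nodes (192.168.2.199 is the address configured above; -c sets the packet count, per the usual rds-tools syntax):

root@dbnode02:~# rds-ping -c 3 192.168.2.199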


5. Start up CRS:


root@dbnode01:~# crsctl start crs

Verify the stack is healthy.
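For example, using the standard Clusterware status commands:

root@dbnode01:~# crsctl check crs
root@dbnode01:~# crsctl stat res -t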

Please note this change is persistent and will survive a reboot!

References

3-10313575601

Attachments
This solution has no attachment