Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2215807.1
Update Date:2017-02-09
Keywords:

Solution Type  Problem Resolution Sure

Solution  2215807.1 :   exasw-ibs0 Broken Madrpc_init: Can't Open UMAD Port at ibswitches after switch upgrade  


Related Items
  • Exadata X3-8 Hardware
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband




Created from <SR 3-13857068681>

Applies to:

Exadata X3-8 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

After the IB switch upgrade (to 2.1.8-1 with patch_12.1.2.3.2.160721), the ibswitches command on the switch fails with "ibpanic: [3922] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)".

 

1) ibswitches and ibcheckerrors both fail with ibpanic:

[root@exasw-ibs0 ~]# ibswitches
ibpanic: [3922] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
[root@exasw-ibs0 ~]# ibcheckerrors
ibpanic: [4289] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)

## Summary: 0 nodes checked, 0 bad nodes found
## 0 ports checked, 0 ports have errors beyond threshold
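
When the output of several switches has been captured to text, the failure signature above can be searched for mechanically. The following is a minimal sketch, assuming the captured output is available as a string; the function name find_umad_panics is hypothetical and not part of any Oracle tool.

```python
import re
from typing import List

# Matches the failure signature quoted in the Symptoms section.
IBPANIC_RE = re.compile(
    r"ibpanic: \[(?P<pid>\d+)\] madrpc_init: can't open UMAD port"
)


def find_umad_panics(output: str) -> List[int]:
    """Return the PID of every madrpc_init UMAD-port panic line in `output`."""
    return [int(m.group("pid")) for m in IBPANIC_RE.finditer(output)]


if __name__ == "__main__":
    sample = (
        "ibpanic: [3922] madrpc_init: can't open UMAD port ((null):0): "
        "(No such file or directory)\n"
    )
    print(find_umad_panics(sample))
```

An empty list means the signature was not found in the captured text; it does not by itself prove the switch is healthy.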

 

2) The version command reports the upgraded version, but the SP board info cannot be read ("Failed to open jida"):

[root@exasw-ibs0 ~]# version
SUN DCS 36p version: 2.1.8-1
Build time: Sep 18 2015 10:26:47
SP board info:
Failed to open jida

3) All processes and applications on the database and storage servers are up and running.

 

 

Changes

Exadata storage/switch upgrade with patch 12.1.2.3.2.160721.

Cause

The IB switch upgrade did not complete on all switches. It completed on every switch except the spine switch.
The cause of the partial upgrade is not yet known.

 

 

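1) Upgrade log for exasw-ibs0. The firmware load command ended with CMD_STATUS 255 ("Write failed: Broken pipe"), yet the script went on to report the firmware load as successful: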

[1481772833][2016-12-14 22:39:30 -0500][INFO][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][upgradeIBSwitchSW][2505][DISPLAY] Starting upgrade on exasw-ibs0 to 2.1.8_1. Please give upto 15 mins for the process to complete. DO NOT INTERRUPT or HIT CTRL+C during the upgrade
[1481772833][2016-12-14 22:39:30 -0500][CMD][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][runCmdOnIBSwitch][] [CMD: ssh root@exasw-ibs0 getmaster] [CMD_STATUS: 0]
----- START STDOUT -----
Local SM enabled and running, state MASTER
20140925 23:37:00 Master SubnetManager on sm lid 1 sm guid 0x10e035c2b5a0a0 : SUN DCS 36P QDR exasw-ibs0 xxx.xxx.xxx.xxx
----- END STDOUT -----
[1481772833][2016-12-14 22:39:32 -0500][INFO][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][getIBSwitchMaster][] [CMD: ssh root@exasw-ibs0 getmaster]
Local SM enabled and running, state MASTER
20140925 23:37:00 Master SubnetManager on sm lid 1 sm guid 0x10e035c2b5a0a0 : SUN DCS 36P QDR exasw-ibs0 xxx.xxx.xxx.xxx
[1481772833][2016-12-14 22:39:32 -0500][WARNING][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][runCmdOnIBSwitch][] [CMD: ssh root@exasw-ibs0 spshexec load -o verbose -script -source /dev/shm/sundcs_36p_repository_2.1.8_1.pkg] [CMD_STATUS: 255]
----- START STDOUT -----

Oracle(R) Integrated Lights Out Manager

Version ILOM 3.0 r47111

Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.

-> load -o verbose -script -source /dev/shm/sundcs_36p_repository_2.1.8_1.pkg^M
----- END STDOUT -----
----- START STDERR -----
Write failed: Broken pipe^M
----- END STDERR -----
[1481772833][2016-12-15 00:39:40 -0500][INFO][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][loadIBSwitchSW][2097] Return value from firmware load is : 0
[1481772833][2016-12-15 00:39:40 -0500][CMD][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][runCmdOnIBSwitch][] [CMD: ssh root@exasw-ibs0 type -t /usr/local/bin/version] [CMD_STATUS: 0]
----- START STDOUT -----
file
----- END STDOUT -----
[1481772833][2016-12-15 00:39:41 -0500][INFO][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][getIBSwitchVersionExec][] [CMD: ssh root@exasw-ibs0 type -t /usr/local/bin/version]
file
[1481772833][2016-12-15 00:39:41 -0500][CMD][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][runCmdOnIBSwitch][] [CMD: ssh root@exasw-ibs0 /usr/local/bin/version] [CMD_STATUS: 0]
----- START STDOUT -----
SUN DCS 36p version: 2.1.8-1
Build time: Sep 18 2015 10:26:47
SP board info:
Failed to open jida
[1481772833][2016-12-15 00:39:41 -0500][INFO][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][getIBSwitchVersion][] [CMD: ssh root@exasw-ibs0 /usr/local/bin/version]
SUN DCS 36p version: 2.1.8-1
Build time: Sep 18 2015 10:26:47
SP board info:
Failed to open jida
[1481772833][2016-12-15 00:39:41 -0500][CMD][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][runCmdOnIBSwitch][] [CMD: ssh root@exasw-ibs0 rpm -q kernel-2.6.27.13nm2_v2-31.i386] [CMD_STATUS: 0]
----- START STDOUT -----
kernel-2.6.27.13nm2_v2-31
----- END STDOUT -----
[1481772833][2016-12-15 00:39:44 -0500][SUCCESS][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][upgradeIBSwitchSW][2579][DISPLAY] Load firmware 2.1.8_1 onto exasw-ibs0
[1481772833][2016-12-15 00:39:44 -0500][CMD][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][runCmdOnIBSwitch][] [CMD: ssh root@exasw-ibs0 type -t /usr/local/bin/version] [CMD_STATUS: 0]
----- START STDOUT -----
file
----- END STDOUT -----
[1481772833][2016-12-15 00:39:45 -0500][INFO][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][getIBSwitchVersionExec][] [CMD: ssh root@exasw-ibs0 type -t /usr/local/bin/version]
file
[1481772833][2016-12-15 00:39:45 -0500][CMD][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][runCmdOnIBSwitch][] [CMD: ssh root@exasw-ibs0 /usr/local/bin/version] [CMD_STATUS: 0]
----- START STDOUT -----
SUN DCS 36p version: 2.1.8-1
Build time: Sep 18 2015 10:26:47
SP board info:
Failed to open jida
----- END STDOUT -----
[1481772833][2016-12-15 00:39:45 -0500][INFO][/tmp/12.1.2.3.0/12.1.2.3.0/patch_12.1.2.3.0.160207.3/upgradeIBSwitch.sh][getIBSwitchVersion][] [CMD: ssh root@exasw-ibs0 /usr/local/bin/version]
SUN DCS 36p version: 2.1.8-1
Build time: Sep 18 2015 10:26:47
SP board info:
Failed to open jida
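
Because the script's final SUCCESS line hides the failed load, it can help to scan the upgrade log for any command that ended with a non-zero CMD_STATUS. The following is a minimal sketch, assuming the log uses the "[CMD: ...] [CMD_STATUS: n]" layout shown above; the function name failed_commands is hypothetical.

```python
import re
from typing import List, Tuple

# A command entry looks like: [CMD: <command>] [CMD_STATUS: <n>]
# INFO lines repeat the command without a CMD_STATUS field and are skipped.
CMD_RE = re.compile(
    r"\[CMD: (?P<cmd>[^\]\n]+)\] \[CMD_STATUS: (?P<status>\d+)\]"
)


def failed_commands(log_text: str) -> List[Tuple[str, int]]:
    """Return (command, status) for every logged command with non-zero status."""
    failures = []
    for match in CMD_RE.finditer(log_text):
        status = int(match.group("status"))
        if status != 0:
            failures.append((match.group("cmd"), status))
    return failures


if __name__ == "__main__":
    sample = (
        "[1][t][CMD][s][f][] [CMD: ssh root@exasw-ibs0 getmaster] [CMD_STATUS: 0]\n"
        "[1][t][WARNING][s][f][] [CMD: ssh root@exasw-ibs0 spshexec load "
        "-o verbose -script -source /dev/shm/sundcs_36p_repository_2.1.8_1.pkg] "
        "[CMD_STATUS: 255]\n"
    )
    for cmd, status in failed_commands(sample):
        print(status, cmd)
```

Run against the excerpt above, this would flag only the spshexec load command, which is the step that did not complete.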

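2) fwverify reports both unexpected and missing packages on the switch. Packages from 2.1.8-1 are installed while the switch still expects the older package set (for example nm2-phs-2.1.3-4):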

[root@exasw-ibs0 ~]# fwverify

Checking all present packages:
............................................................................................................................................. FAILED

=======================================================================
Packages not belonging to the installed switch firmware version found!
These packages are:
JidaDrv-1.1-1 bash-3.2-33.el5_11.4 tzdata-2014i-1.el5 nm2-phs-2.1.8-1
=======================================================================

Checking if any packages are missing:
...................................................................................................................................... FAILED

=======================================================================
Some required packages are missing!
These packages are:
tzdata-2012c-1.el5.i386 bash-3.2-21.el5.i386 nm2-phs-2.1.3-4.i386
=======================================================================

Verifying installed files:
....................................................................................................................................... FAILED

* Package tzdata-2012c-1.el5.i386:
package tzdata-2012c-1.el5.i386 is not installed

* Package bash-3.2-21.el5.i386:
package bash-3.2-21.el5.i386 is not installed

* Package nm2-phs-2.1.3-4.i386:
package nm2-phs-2.1.3-4.i386 is not installed

Checking FW Coreswitch:
-E- Can not open /dev/mst/mt48436_pci_cr0: No such file or directory MFE_CR_ERROR
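
To compare the two package lists side by side, the fwverify output can be parsed from a saved copy. The following is a minimal sketch, assuming the output keeps the exact section headings shown above; the function name parse_fwverify and the dictionary keys are hypothetical.

```python
from typing import Dict, List

# Section headings copied from the fwverify output shown above.
UNEXPECTED_HEADER = (
    "Packages not belonging to the installed switch firmware version found!"
)
MISSING_HEADER = "Some required packages are missing!"
LIST_MARKER = "These packages are:"


def parse_fwverify(output: str) -> Dict[str, List[str]]:
    """Collect the unexpected and missing package names from fwverify output."""
    result: Dict[str, List[str]] = {"unexpected": [], "missing": []}
    current = None           # which list the next package line belongs to
    expect_packages = False  # True right after a "These packages are:" line
    for raw in output.splitlines():
        line = raw.strip()
        if line == UNEXPECTED_HEADER:
            current = "unexpected"
        elif line == MISSING_HEADER:
            current = "missing"
        elif line == LIST_MARKER and current is not None:
            expect_packages = True
        elif expect_packages:
            # Ignore a separator line in case the package list is empty.
            if not line.startswith("="):
                result[current].extend(line.split())
            expect_packages = False
            current = None
    return result


if __name__ == "__main__":
    sample = "\n".join([
        UNEXPECTED_HEADER,
        LIST_MARKER,
        "JidaDrv-1.1-1 bash-3.2-33.el5_11.4 tzdata-2014i-1.el5 nm2-phs-2.1.8-1",
        MISSING_HEADER,
        LIST_MARKER,
        "tzdata-2012c-1.el5.i386 bash-3.2-21.el5.i386 nm2-phs-2.1.3-4.i386",
    ])
    print(parse_fwverify(sample))
```

In this case the unexpected list holds the 2.1.8-1 era packages and the missing list holds the older ones, which is consistent with a partially applied upgrade.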


 

 

Solution

<workaround  from Bug 24932932>

-------------------------------------------

1) Install firmware 2.1.8 using the -force option:

-> load -force -source <path to 2.1.8 .pkg file>

For example:

-> load -force -source /tmp/12.1.2.3.2/CELL/patch_12.1.2.3.2.160721/sundcs_36p_repository_2.1.8_1.pkg

2) Reboot the switch.

3) Run fwverify, env_test, and showunhealthy to verify that the switch has recovered.



References

<BUG:24932932> - IB SWITCH UPGRADE FAILED UBAGXIBSW083 -- PARTIALLY UPGRADE

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.