Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2362646.1
Update Date:2018-03-15
Keywords:

Solution Type  Problem Resolution Sure

Solution  2362646.1 :   Logging in as ilom-admin on a 36P or GW InfiniBand Switch running Firmware 2.2.8-2 hangs, and the spsh command also hangs


Related Items
  • Sun Datacenter InfiniBand Switch 36
  • Sun Network QDR InfiniBand Gateway Switch
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-16881578619>

Applies to:

Sun Datacenter InfiniBand Switch 36 - Version All Versions to All Versions [Release All Releases]
Sun Network QDR InfiniBand Gateway Switch - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Unable to SSH to the ILOM of NM2-GW IB switches as ilom-user after upgrading to IB switch firmware 2.2.8-2.

Unable to SSH to the ILOM of NM2-GW and 36P IB switches as user ilom-admin; the command appears to hang.

Symptoms

Logging in to the InfiniBand switch as the ilom-admin user hangs. Logging in as root and starting the ILOM shell with the spsh command also hangs.

For example:
[root@cn01 ~]# ssh ilom-admin@10.10.10.184
Password:
^C
[root@cn01 ~]#

The spsh command also hangs.

The ILOM service stack is only partly started:

[root@ibswitchgw01-adm ~]# service ilom status
ILOM stack is partly started with 2 processes.
ILOM daemons that failed to start are : logmgr etcd fdd IPMIMain MsgHndlr PEF LAN plathwsvcd lumain ealertd webgo ssl_proxyd stdiscoverer stlistener
And, ILOM stack subsystem is not locked! ILOM daemons may have been started manually!
[root@elorl03gw01-adm ~]#

[root@ibswitchgw02-adm ~]# service ilom status
ILOM stack is partly started with 7 processes.
ILOM daemons that failed to start are : etcd fdd IPMIMain MsgHndlr PEF LAN plathwsvcd stdiscoverer stlistener
And, ILOM stack subsystem is not locked! ILOM daemons may have been started manually!
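
The failed-daemon list in the status output above can be extracted with a small shell helper, which may be handy when comparing several switches. This is a minimal sketch, not part of the switch firmware: the function name ilom_failed_daemons is hypothetical, it assumes sed and tr are available on the host where it runs, and it treats every whitespace-separated token as one daemon name (so "PEF LAN" is counted as two entries).

```shell
#!/bin/sh
# Hypothetical helper: read `service ilom status` output on stdin and
# print one failed daemon name per line.
ilom_failed_daemons() {
    # Keep only the text after "ILOM daemons that failed to start are :",
    # split it on spaces, and drop any empty lines.
    sed -n 's/^ILOM daemons that failed to start are *: *//p' |
        tr -s ' ' '\n' |
        sed '/^$/d'
}
```

For example, `service ilom status | ilom_failed_daemons` prints nothing when the status output contains no failed-daemon line.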

 

Changes

The switch firmware was upgraded to version 2.2.8-2.

Cause

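After the firmware upgrade, the ILOM services stack is only partly started: several ILOM daemons fail to start. Stopping and starting the stack brings all daemons up, as the following session shows.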

[root@IBSwitch-adm ~]# version
SUN DCS gw version: 2.2.8-2
Build time: Nov 24 2017 11:27:04 
[root@IBSwitchgw02-adm ~]# service ilom status
ILOM stack is partly started with 9 processes.
ILOM daemons that failed to start are : IPMIMain MsgHndlr PEF LAN plathwsvcd stdiscoverer stlistener
And, ILOM stack subsystem is not locked! ILOM daemons may have been started manually!
[root@IBSwitch-adm ~]# service ilom stop
Stopping ILOM stack
Stopping Servicetags listener: stlistener.
Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Stopping Servicetags discoverer: stdiscoverer.
Stopping webgo Web Server: webgo Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Stopping lu main daemon: lumain Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Stopping Platform Services Daemon: plathwsvcd Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Stopping IPMI Stack: Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Stopping Fault Diagnosis Daemon: fdd Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Stopping Error Telemetry Collection Daemon: etcd Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Stopping ipmi log manager daemon: logmgr Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Stopping Event Manager: eventmgr Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Stopping capidirect daemon: capidirectd Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Done
Invalid ring ID: 'KPALIVE'
Usage:
ilom_trace [-f <dblog_file>] [-n <filename>] -r <ring_id> [-m <ring_mask>] [-l <TRACE | DEBUG | EXIT | FUNC | ENTRY | INFO | WARN | ERROR | ERR_EXIT | CRITICAL | FATAL | MAX | <number>> ] args...
Could not get TMB block
Could not get TMB block
[root@IBSwitch-adm ~]# service ilom start
Creating home directories
Updating FW version
Running plat...Done running plat
Preparsing sensor.xml... ( took 0 seconds ) done
Starting capidirect daemon: capidirectd . Done
Starting Event Manager: eventmgr . Done
Starting ipmi log manager daemon: logmgr . Done
Starting Error Telemetry Collection Daemon: etcd . Done
Starting Fault Diagnosis Daemon: fdd . Done
Starting IPMI Stack: . Done
Starting Platform Services Daemon: plathwsvcd . Done
Setting ILOM IPMI firewall rules
Starting lu main daemon: lumain . Done
Starting webgo Web Server: webgo . Done
Starting Servicetags discoverer: stdiscoverer.
Starting Servicetags listener: stlistener.
Warning... Configuring stlistener for keep_me_alive when already configured
Starting platform_logger

[root@IBSwitch-adm ~]# service ilom status
ILOM stack is running.
 

Solution

Restart the ILOM services stack:

# service ilom status

# service ilom stop

# service ilom start

Wait a minute or so, then check the status again:

# service ilom status
ILOM stack is running.
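
Because the symptom returns after every power cycle, the stop/start/status sequence above can be wrapped in a small script. The following is a sketch under stated assumptions, not an Oracle-supplied tool: the function names ilom_is_running and restart_ilom and the variables ILOM_WAIT_TRIES and ILOM_WAIT_SECS are hypothetical, and the script relies only on the `service ilom` commands and the "ILOM stack is running" status line shown in this document. It also assumes grep and sleep are available in the switch's root shell.

```shell
#!/bin/sh
# Sketch: restart the ILOM stack and poll until it reports "running".
# ILOM_WAIT_TRIES and ILOM_WAIT_SECS are hypothetical tunables for this
# script only; they are not switch settings.

# Succeeds only when the status output contains the healthy status line.
ilom_is_running() {
    service ilom status 2>&1 | grep -q 'ILOM stack is running'
}

restart_ilom() {
    tries=${ILOM_WAIT_TRIES:-12}
    secs=${ILOM_WAIT_SECS:-10}

    # Leave a healthy stack alone.
    if ilom_is_running; then
        echo "ILOM stack is already running; nothing to do."
        return 0
    fi

    service ilom stop
    service ilom start

    # Poll the status instead of waiting a fixed amount of time.
    i=0
    while [ "$i" -lt "$tries" ]; do
        if ilom_is_running; then
            echo "ILOM stack is running."
            return 0
        fi
        sleep "$secs"
        i=$((i + 1))
    done

    echo "ILOM stack is still not fully started; check 'service ilom status'." >&2
    return 1
}
```

Calling `restart_ilom` as root on the switch runs the sequence once; the function returns a non-zero status if the stack has not reported "running" after the configured number of checks.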

 




This was an internal Exalogic lab environment. The customer had already rebooted both IB switches, but that did not resolve the problem.

This symptom recurs every time the switch is power-cycled.

 

 

References

<BUG:27478121> - JAN 2018 PSU IB FIRMWARE 2.2.8-2 STATE LEAVES GUEST VMS UNSTARTABLE

  Copyright © 2018 Oracle, Inc.  All rights reserved.