Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type Problem Resolution Sure Solution

1668946.1 : Sun Fire[TM] 12K/15K/E20K/E25K Servers: Solaris instance on System Controllers (SCs) may hang during the execution of network command 'dladm show-link'
Created from <SR 3-8623519861>

Applies to:
Sun Fire E20K Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Fire 12K Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Fire 15K Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Fire E25K Server - Version Not Applicable to Not Applicable [Release N/A]
Oracle Solaris on SPARC (32-bit)

Symptoms
The Solaris instance on a Sun Fire 12K/15K/E20K/E25K System Controller may hang, causing an unexpected System Controller outage, after the data link administration command "dladm" is issued directly on the console of the System Controller OS. The hang may occur on the first invocation or only after several attempts. The following commands are not needed on Starcat System Controllers or for platform management:

sc_root# /usr/sbin/dladm show-link
sc_root# /usr/sbin/dladm show-linkprop

The issue could therefore be avoided by not running these commands on Sun Fire 12K/15K/E20K/E25K System Controllers. However, Oracle Explorer Data Collector has executed dladm commands in its "netinfo" module since version 6.x, so an Explorer run can trigger the hang:

[ snip ]
Oracle Explorer Data Collector 8.0
Apr 06 19:34:11 sc explorer: Explorer ID: explorer.<snip>-2014.04.06.18.33
Apr 06 19:34:18 sc rda: RUNNING
Apr 06 19:34:18 sc rda: Initializing RDA
Apr 06 19:34:18 sc rda: Running RDA
------------------------------------------------------------------------------
RDA Data Collection Started 06-Apr-2014 19:34:54
------------------------------------------------------------------------------
Processing RDA.BEGIN module ...
Processing RDA.CONFIG module ...
Processing XPLR module ...
Apr 06 19:34:56 sc begin:RUNNING
Apr 06 19:34:57 sc ilomsnapshot_start:RUNNING
Apr 06 19:34:57 sc patch:RUNNING
Apr 06 19:35:12 sc pkg:RUNNING
Apr 06 19:38:12 sc sysconfig:RUNNING
Apr 06 19:40:40 sc ndd:RUNNING
Apr 06 19:41:20 sc netinfo:RUNNING
<hang>

Output of a ptree command run in parallel in a second command shell:

Wed Apr 9 08:41:22 BST 2014
1 /sbin/init
9 /lib/svc/bin/svc.startd
410 -sh
2169 ksh -o vi
2827 /bin/sh ./explorer
3601 /bin/sh /opt/SUNWexplo/tools/rda.sh
3611 /usr/bin/perl -T /usr/lib/rda/rda.pl -nXExplorer run -d/opt/SUNWexplo/output/ex
6995 /usr/sbin/dladm show-link

[ output abbreviated ]

Sun Apr 6 19:41:21 BST 2014
1 /sbin/init
7 /lib/svc/bin/svc.startd
401 -sh
2040 /bin/sh ./explorer
2740 /bin/sh /opt/SUNWexplo/tools/rda.sh
2750 /usr/bin/perl -T /usr/lib/rda/rda.pl -nXExplorer run -d/opt/SUNWexplo/output/ex
6144 /usr/sbin/dladm show-linkprop

Sun Apr 6 19:41:22 BST 2014
1 /sbin/init
7 /lib/svc/bin/svc.startd
401 -sh
2040 /bin/sh ./explorer
2740 /bin/sh /opt/SUNWexplo/tools/rda.sh
2750 /usr/bin/perl -T /usr/lib/rda/rda.pl -nXExplorer run -d/opt/SUNWexplo/output/ex
6144 /usr/sbin/dladm show-linkprop

Sun Apr 6 19:41:24 BST 2014
1 /sbin/init
7 /lib/svc/bin/svc.startd
401 -sh
2040 /bin/sh ./explorer
2740 /bin/sh /opt/SUNWexplo/tools/rda.sh
2750 /usr/bin/perl -T /usr/lib/rda/rda.pl -nXExplorer run -d/opt/SUNWexplo/output/ex
6144 /usr/sbin/dladm show-linkprop
<hang>

Changes
The described symptoms have so far been seen only on Sun Fire 12K/15K/E20K/E25K System Controllers running SMS 1.6 on Solaris 10.

Cause
The /etc/system entry "set kmem_flags=0x1" has been identified as the root cause of this behavior. It had been added to the file during an earlier troubleshooting session for advanced Solaris kernel debugging.

# /usr/sbin/dladm show-link
eri0     type: legacy   mtu: 1500   device: eri0
eri1     type: legacy   mtu: 1500   device: eri1
eri2     type: legacy   mtu: 1500   device: eri2
eri3     type: legacy   mtu: 1500   device: eri3
eri4     type: legacy   mtu: 1500   device: eri4
eri5     type: legacy   mtu: 1500   device: eri5
eri6     type: legacy   mtu: 1500   device: eri6
eri7     type: legacy   mtu: 1500   device: eri7
eri8     type: legacy   mtu: 1500   device: eri8
eri9     type: legacy   mtu: 1500   device: eri9
eri10    type: legacy   mtu: 1500   device: eri10
eri11    type: legacy   mtu: 1500   device: eri11
eri12    type: legacy   mtu: 1500   device: eri12
eri13    type: legacy   mtu: 1500   device: eri13
eri14    type: legacy   mtu: 1500   device: eri14
eri15    type: legacy   mtu: 1500   device: eri15
eri16    type: legacy   mtu: 1500   device: eri16
eri17    type: legacy   mtu: 1500   device: eri17
eri18    type: legacy   mtu: 1500   device: eri18
eri19    type: legacy   mtu: 1500   device: eri19
eri20    type: legacy   mtu: 1500   device: eri20
eri21    type: legacy   mtu: 1500   device: eri21
scman0   type: legacy   mtu: 1500   device: scman0
scman1   type: legacy   mtu: 1500   device: scman1

Solution
1. Recover from the hung OS
2. Resolve the root cause
3. Verify the resolution

sms-user:> resetsc
About to reset other SC. Are you sure you want to continue? (y or [n])
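
For step 2, it can help to confirm whether an active kmem_flags entry is still present before dladm or Explorer is run again on the System Controller. The following is only a minimal sketch, not part of the original solution: the function name find_kmem_flags is hypothetical, and it assumes a POSIX-compatible shell (for example ksh) and that commented-out lines in /etc/system start with '*'.

```shell
#!/bin/sh
# Sketch: list active (uncommented) "set kmem_flags" lines in a system file.
# find_kmem_flags is a hypothetical helper name used for illustration only.

find_kmem_flags() {
    # $1: path of the file to inspect; defaults to /etc/system
    file="${1:-/etc/system}"
    tab=$(printf '\t')
    # The pattern is anchored at the start of the line, so lines that are
    # commented out with a leading '*' do not match.
    # grep -n prints "line-number:line" for each match and returns
    # non-zero when no active entry is found.
    grep -n "^[ ${tab}]*set[ ${tab}][ ${tab}]*kmem_flags[ ${tab}]*=" "$file"
}
```

Called as `find_kmem_flags /etc/system`, the function prints any active entry with its line number. Changes to /etc/system are read at boot, so removing or commenting out the entry only takes effect after the System Controller has been rebooted.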

References
<NOTE:1538483.1> - Collect a System Controller (SC) Explorer using STB7.3 (or newer) on Sun Fire 12K/15K/E20K/E25K (Starcat) Servers
<NOTE:1007746.1> - SunFire[TM] 12K/15K/E20K/E25K: Expected behavior of domains in different scenarios when the SCs are powered down or rebooted
<NOTE:1006092.1> - Sun Fire[TM] 12K/15K/E20K/E25K: Enterprise Installation Standards(EIS) EEPROM settings
<NOTE:1153444.1> - Oracle Services Tools Bundle (STB) - RDA/Explorer, SNEEP, ACT

Attachments
This solution has no attachment