Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Troubleshooting Sure

Solution 2296998.1: How to Troubleshoot ILOM Interconnect Problems
Applies to:

Netra SPARC T4-1 Server - Version All Versions and later
SPARC T5-2 - Version All Versions and later
SPARC T5-4 - Version All Versions and later
SPARC S7-2 - Version All Versions and later
SPARC T8-1 - Version All Versions and later
Information in this document applies to any platform.

Purpose

The Integrated Lights Out Manager (ILOM) Interconnect is an internal 10Mb Ethernet-over-USB interface between the ILOM and the server's Solaris host / primary LDom, which is used to communicate Fault Management Architecture (FMA) and other data. The Fault Management Architecture uses the Interconnect to proxy faults diagnosed on the Service Processor (SP) to the Host, and to proxy faults diagnosed on the Solaris Host to the SP. This is known as FMA Fault Proxying, which keeps the FMA faults in sync between the host and the SP on T5-x or newer servers. When the FMA Fault Proxying mechanism does not work, FMA on the SP and the Solaris Host still works, but the faults diagnosed on one side are no longer proxied to the other. Faults such as the following can occur for many reasons, and some are related to the ILOM interconnect:

FMD-8000-D6 - alert.oracle.solaris.fmd.ip-transport.interconnect-down

This document provides information on troubleshooting ILOM interconnect problems.
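As a quick first check on the host side (a sketch only; exact output and event classes vary by Solaris release), the FMA state can be inspected for the interconnect-down alert:

# fmadm faulty                          <--- lists active faults and alerts, if any
# fmdump -v                             <--- fault/alert log with event class names
# fmdump -v | grep -i interconnect      <--- look for ip-transport.interconnect-down events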
Troubleshooting Steps

Also see doc: 2281470.1

This interface was either auto-configured via the OS or Oracle Hardware Management Pack (OHMP) at initial boot, or was manually configured by the system admin. Please note that Solaris will detect this link going down if the ILOM reboots or becomes unresponsive. Determine whether the ILOM was down at the time of this fault (typically an FMD-8000-ET) before proceeding with this troubleshooting document. Also determine whether the ILOM was hung at the time of the interconnect outage, since the interconnect will be affected.

OHMP's ilomconfig will enable the interconnect by default if ILOM 3.0.12 and Solaris 10 U11 are both loaded. OHMP was added to the Solaris 11.2 distribution and is enabled by default. We recommend that OHMP version 2.3 or newer be used, and that it be upgraded to a more current version via a Solaris upgrade. OHMP can be obtained at the following URL when Solaris 11.1 or earlier is in use:

http://www.oracle.com/technetwork/documentation/sys-mgmt-networking-190072.html#hwmgmt
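To confirm that OHMP's ilomconfig is actually present on the host (a sketch; IPS package names differ between OHMP releases and Solaris versions):

# ls -l /usr/sbin/ilomconfig            <--- delivered by OHMP / Solaris 11.2+
# pkg list | grep -i hmp                <--- on Solaris 11, lists installed OHMP packages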
Fault FMD-8000-ET [failed getpeername()] was detected after an upgrade to Solaris 11.3 SRU 22.3 (or newer) even though the ILOM interconnect was properly configured. The presence of the IP transport can be verified in the Explorer fmstat output; if the IP transport is missing, then this is resolved by upgrading the system firmware to 9.7.4 (or newer) on T7-x systems and to 9.6.7.a (or newer) on T5-x systems.

##### fma/fmstat-T.out #####
10 RUN ip-transport server-name=169.254.182.76:24
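If that Explorer file is not available, the loaded fmd modules can be checked directly on the host (a sketch); the ip-transport module should appear in the list:

# fmadm config | grep ip-transport
# fmstat | grep ip-transport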
The user should ensure that the Solaris host / primary LDom is configured properly, as follows:

1. Interconnect service online:

##### # svcs -av | grep :default / Explorer: sysconfig/svcs-av.out #####
STATE     NSTATE   STIME    CTID   FMRI
online    -        Apr_01   -      svc:/network/ilomconfig-interconnect:default

If the Solaris service is offline or not present, then use svcadm to enable the service:

# svcadm enable svc:/network/ilomconfig-interconnect:default
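If the service is in the maintenance state rather than simply disabled, svcs -xv explains why, and svcadm clear retries it once the underlying problem is fixed (a sketch):

# svcs -xv svc:/network/ilomconfig-interconnect:default
# svcadm clear svc:/network/ilomconfig-interconnect:default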
The history of this service is found in explorer file: /var/svc/log/network-ilomconfig-interconnect:default.log

[ Feb 1 13:51:31 Executing start method ("/lib/svc/method/svc-ilomconfig-interconnect start"). ]
[ Feb 1 13:51:31 Method "start" exited with status 0. ]
Host-to-ILOM interconnect successfully configured.
[ Feb 1 13:56:54 Stopping because service disabled. ]
[ Feb 1 13:56:55 Executing stop method ("/lib/svc/method/svc-ilomconfig-interconnect stop"). ]
ERROR: Cannot modify interconnect when disabled (use enable command)
ERROR: Cannot modify interconnect when disabled (use enable command)
ERROR: Cannot modify interconnect when disabled (use enable command)
[ Feb 1 13:57:09 Method "stop" exited with status 0. ]
[ Apr 16 13:48:10 Disabled. ]
[ Apr 17 22:08:52 Disabled. ]
[ Apr 17 22:54:40 Disabled. ]
[ Apr 19 11:43:32 Enabled. ]
[ Apr 19 11:43:32 Executing start method ("/lib/svc/method/svc-ilomconfig-interconnect start"). ]
[ Apr 19 11:43:32 Method "start" exited with status 0. ]
2. Interconnect enabled:

##### # /usr/sbin/ilomconfig list interconnect #####
Interconnect
============
State: enabled
Type: USB Ethernet
SP Interconnect IP Address: 169.254.182.76
Host Interconnect IP Address: 169.254.182.77
Interconnect Netmask: 255.255.255.0
SP Interconnect MAC Address: 03:23:23:57:47:16
Host Interconnect MAC Address: 03:23:23:57:47:17

If not enabled, then enable it via the command:

# /usr/sbin/ilomconfig enable interconnect
If online, this Solaris interface can be tested via the commands:

# ping 169.254.182.77
# ping 169.254.182.76
# ipmitool sunoem version

If the Solaris services are online and "ilomconfig list interconnect" appears normal, but the interconnect isn't operational, then attempt to disable and enable it via OHMP's ilomconfig:

# /usr/sbin/ilomconfig disable interconnect
# /usr/sbin/ilomconfig enable interconnect

If the Host Interconnect IP Address is (none) or 0, then there is a possibility that ipmitool is hung. ipmitool is used by ilomconfig to initialize the ILOM Interconnect after system boot, so this address isn't configured if ipmitool is hung. The ILOM must be reset to get ipmitool working again, but any applications that use ipmitool should include the "-I lanplus" option, which uses the IPMI v2.0 (lanplus) interface. For example:

# ipmitool -I lanplus -H "SP ipaddress" -U root fru
ILOM firmware 3.2.4 (System firmware 9.3.0.b or 8.6.0.b) disables IPMI v1.5 sessions by default to increase system security. They can be re-enabled on the ILOM, as follows:

-> set /SP/services/ipmi v1_5_sessions=enabled
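The current IPMI session settings can be confirmed from the ILOM CLI before and after the change (a sketch; available properties vary slightly across ILOM releases):

-> show /SP/services/ipmi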
3. usbecm2's net interface is online and is addressed as 169.254.182.77 (unless the customer modified the server's configuration):

##### # ifconfig -a / Explorer: sysconfig/ifconfig-a.out #####
##### # dladm show-phys -Z | grep usb / Explorer: netinfo/dladm/dladm_show-phys_-Z.out #####
##### # netstat -in / Explorer: netinfo/netstat-in.out #####
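Roughly what a healthy link looks like in that output (illustrative values only; the net# and column layout vary by Solaris release and configuration; net8, usbecm2 and 169.254.182.77 are the defaults used in this example):

# dladm show-phys | grep usbecm
net8              Ethernet             up         10     full      usbecm2

# ipadm show-addr | grep 169.254
net8/v4           static   ok           169.254.182.77/24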
The following may only be found on M-series servers:

##### # hotplug list -v | grep usb / Explorer: sysconfig/hotplug_list_-v.out #####
##### # hotplug list -l | grep usb / Explorer: sysconfig/hotplug_list_-l.out #####

The net# will vary depending on the server's network configuration, and is net8 in the example above. Notice that the interface is up at a speed of 10Mb. The IP address for net8 is 169.254.182.77, which is the default configured by OHMP for usbecm2. If the IP address is modified by the customer during server configuration, it must be located in the address range 169.254.x.x. Oracle RAC HAIP initially used an IP address range which overlapped this range, but was most likely modified years ago to allow usage of a different range. See SR 3-5732004651.

If the usbecm2 interface isn't configured or operational, then attempt to disable and enable the interconnect via OHMP's ilomconfig:

# /usr/sbin/ilomconfig disable interconnect
# /usr/sbin/ilomconfig enable interconnect

If the interface fails to configure (especially with "ERROR: Internal error"), then manually configure the network interface:

root@pdom03:~# ipadm
NAME        CLASS/TYPE  STATE         UNDER   ADDR
net8        ip          failed        --      --
   net8/v4  static      inaccessible  --      169.254.182.77/24

root@pdom03:~# ipadm delete-addr net8/v4
root@pdom03:~# ipadm create-addr -T static -a 169.254.182.77/24 net8/v4

root@pdom03:~# ipadm
NAME        CLASS/TYPE  STATE         UNDER   ADDR
net8        ip          ok            --      --
   net8/v4  static      ok            --      169.254.182.77/24

Finally, ensure that a failed USB device (like a memory stick) is not inserted in any of the USB connectors. A failed device could disable the entire bus.
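After the manual address configuration, a quick sanity check (a sketch; the FMRI is the interconnect service shown earlier) is to ping both sides of the link and restart the interconnect service so it re-verifies the configuration:

# ping 169.254.182.76
# ping 169.254.182.77
# svcadm restart svc:/network/ilomconfig-interconnect:default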
4. If LDoms are in use, the primary LDom must own the Interconnect's PCI path.

##### # prtdiag -v / Explorer: sysconfig/prtdiag-v.out #####

================================= IO Devices =================================
"/pci@340/pci@1/pci@0/pci@3/usb@0/hub@2/communications@3" 2 "usbecm"

##### # ldm list -l / Explorer: sysconfig/ldm_list_-l.out #####

NAME  STATE  FLAGS  CONS  VCPU  MEMORY  UTIL  NORM  UPTIME

Notice that the USB interconnect is in path /pci@340 in this case, and the primary LDom owns the pci@340 PCI path. If not, then this path must be moved so that the primary LDom controls it, and both domains should be reset, followed by an ILOM reset.
5. Solaris's IP filter can also disable communication on this link. Files /etc/ipf/ipf.conf or /etc/ipf/ippool.conf should be configured to allow communication to the interconnect's ILOM IP address, which is 169.254.182.76 by default. For example, ipf.conf may contain the following:

# ssh:
# block everything else

In this case, communication is blocked by Solaris' IP filter since the interconnect's address is not included. This file must also include the following for the internal interconnect to work:

pass in quick proto tcp from 169.254.182.76 to any port = 24
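The active IP filter state and the currently loaded rules can be inspected directly (a sketch):

# svcs ipfilter                    <--- is the IP filter service enabled at all?
# ipfstat -io                      <--- currently loaded inbound/outbound rules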
In one case where IP filter misconfiguration prevented the interconnect's operation, the Solaris IP address for usbecm2 was configured to 0:0:0:0 and a different network contained 169.254.182.77. This system was repaired by:

- properly configuring ipf.conf,
- removing the misprogrammed network,
- disabling and enabling the interconnect with ilomconfig,
- power cycling the system,
- clearing the related faults.
If Solaris IP filtering isn't the problem, then use OHMP to disable and enable the interface:

# /usr/sbin/ilomconfig disable interconnect
# /usr/sbin/ilomconfig enable interconnect
The ILOM should have the following configuration for proper operation:

1. Ensure that the ILOM is online. See the known T5 ILOM issue related to ILOM hangs.

2. Ensure that /SP/network and /SP/network/interconnect are properly configured:

/SP/network state = enabled                                               <--- or ipv4-only if ipv6 is disabled
/SP/network/interconnect state = enabled                                  <--- may be missing if host managed with older firmware
/SP/network/interconnect allowed_services = fault-transport, ipmi, snmp   <--- fault-transport only needed for T5-x through T8-x FMA Fault Proxying
/SP/network/interconnect hostmanaged = true                               <--- set when auto-configured
/SP/network/interconnect ipaddress = 169.254.182.76

Please note that /SP/network/interconnect "state" will be missing on older system FW when the interface is host managed. If ipv6 is disabled, then the /SP/network state must be ipv4-only. If /SP/network is not enabled, enable it with:

-> set /SP/network state=enabled
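These properties can be listed directly from the ILOM CLI (a sketch):

-> show /SP/network
-> show /SP/network/interconnect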
The allowable interconnect services are: fault-transport, https, ipmi, ssh, snmp. Also note that IP address 169.254.182.76 is the default configured by the ILOM for the interface, but it may be changed by the customer.

The interconnect can be tested from the ILOM with the built-in ping test:

-> set /SP/network/test ping=169.254.182.76
Ping of 169.254.182.76 succeeded

-> set /SP/network/test ping=169.254.182.77
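If the ping of the host-side address (169.254.182.77) fails while the host configuration checked out in the earlier steps, resetting the SP is a common next step, since a hung or unresponsive ILOM affects the interconnect as noted above (a sketch; schedule the reset with the customer first):

-> reset /SP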
The related M5/M6 doc is 1683087.1.

Attachments

This solution has no attachment.