Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2078303.1
Update Date: 2015-11-23

Solution Type  Problem Resolution Sure

Solution  2078303.1 :   IMF Frequent Failover Because of HP Server Connection Instability  


Related Items
  • Oracle Communications Performance Intelligence Center (PIC) Software
Related Categories
  • PLA-Support>Sun Systems>CommsGBU>Global Signaling Solutions>SN-SND: Tekelec PIC


Connection instability on HP servers causes IMF failover.

In this Document
Symptoms
Cause
Solution
 Preferred solution
 Alternate solution
References


Created from <SR 3-11648342031>

Applies to:

Oracle Communications Performance Intelligence Center (PIC) Software - Version 10.1.0 and later
Information in this document applies to any platform.

Symptoms

Failovers occur regularly on the subsystem: the IMF changes from IS (in service) to OOS (out of service) status.

The /var/log/messages log file is filled with entries such as:

kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280()
(Not tainted)
kernel: Hardware name: ProLiant BL460c G6
kernel: NETDEV WATCHDOG: eth02 (bnx2x): transmit queue 8 timed out
kernel: Modules linked in: tcp_diag inet_diag ipmi_watchdog bonding 8021q
garp stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter
ip_tables ip6t_REJECT xt_comment nf_conntrack_ipv6 nf_defrag_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables ipmi_devintf dm_snapshot dm_multipath
tcp_TKLC_cubic power_meter hpilo bnx2x(U) libcrc32c mdio microcode ipv6
serio_raw sg i7core_edac edac_core shpchp ext4 mbcache jbd2 video output
sd_mod crc_t10dif hpsa(U) radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core
dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
kernel: Pid: 0, comm: swapper Not tainted
2.6.32-358.6.1.el6prerel6.5.0_82.15.0.x86_64 #1
kernel: Call Trace:
kernel: <IRQ> [<ffffffff8106e247>] ? warn_slowpath_common+0x87/0xc0
kernel: [<ffffffff8106e336>] ? warn_slowpath_fmt+0x46/0x50
kernel: [<ffffffff81467d5d>] ? dev_watchdog+0x26d/0x280
kernel: [<ffffffff81012b69>] ? sched_clock+0x9/0x10
kernel: [<ffffffff81467af0>] ? dev_watchdog+0x0/0x280
kernel: [<ffffffff81081797>] ? run_timer_softirq+0x197/0x340
kernel: [<ffffffff810a7f70>] ? tick_sched_timer+0x0/0xc0
kernel: [<ffffffff8102e99d>] ? lapic_next_event+0x1d/0x30
kernel: [<ffffffff81076f11>] ? __do_softirq+0xc1/0x1e0
kernel: [<ffffffff8109b6fb>] ? hrtimer_interrupt+0x14b/0x260
kernel: [<ffffffff8100c1cc>] ? call_softirq+0x1c/0x30
kernel: [<ffffffff8100de05>] ? do_softirq+0x65/0xa0
kernel: [<ffffffff81076cf5>] ? irq_exit+0x85/0x90
kernel: [<ffffffff815176d0>] ? smp_apic_timer_interrupt+0x70/0x9b
kernel: [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
kernel: <EOI> [<ffffffff812d392e>] ? intel_idle+0xde/0x170
kernel: [<ffffffff812d3911>] ? intel_idle+0xc1/0x170
kernel: [<ffffffff814153c7>] ? cpuidle_idle_call+0xa7/0x140
kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
kernel: [<ffffffff8150747c>] ? start_secondary+0x2ac/0x2ef
kernel: ---[ end trace 679a04660581ca63 ]---
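
To judge how often the symptom occurs, the watchdog entries can be counted per interface. The following is a minimal sketch, not part of the original solution: it assumes the log lines have the same "NETDEV WATCHDOG" format as the excerpt above, and the function names count_watchdog_timeouts and scan_log are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   kernel: NETDEV WATCHDOG: eth02 (bnx2x): transmit queue 8 timed out
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def count_watchdog_timeouts(lines):
    """Count transmit-queue timeouts per (interface, driver) pair."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[(match.group("iface"), match.group("driver"))] += 1
    return counts


def scan_log(path="/var/log/messages"):
    """Read the log file at `path` and return the timeout counts.

    The default path is the one named in the Symptoms section; reading it
    usually requires root privileges.
    """
    with open(path, errors="replace") as handle:
        return count_watchdog_timeouts(handle)
```

Calling scan_log() on an affected server returns a counter keyed by interface and driver, for example ("eth02", "bnx2x"). A steadily growing count for a bnx2x interface is consistent with the symptom described here, but it does not by itself confirm the cause.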

Cause

The system BIOS has enabled interrupt remapping on a chipset that contains an erratum that makes this feature unstable.

The various logs show connection issues and frequent failover.

This is a known problem with older HP firmware.

Solution

Preferred solution

When connection problems are identified, check the HP firmware on the server and, if necessary, upgrade it to the latest available release. The issue is fixed in package version 2.2.5.
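
The version check can be sketched as a simple comparison against the fixed release. This is only an illustration under stated assumptions: it assumes the package version is a purely numeric dotted string such as "2.2.5", and it does not cover how to read the installed version from the server, which this document does not specify. The names parse_version and needs_upgrade are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.2.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Release in which the issue is fixed, as stated in this document.
FIXED_VERSION = parse_version("2.2.5")


def needs_upgrade(installed):
    """Return True if the installed version is older than the fixed release."""
    return parse_version(installed) < FIXED_VERSION
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "2.10.0" as older than "2.2.5".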

Alternate solution

Disable Intel VT-d, which provides interrupt remapping, in the BIOS:

  1. Reboot the server.
  2. Enter the BIOS menu.
  3. Select System Options.
  4. Select Processor Options.
  5. Set Intel(R) VT-d to Disabled.
  6. Save the changes and exit the BIOS menu.

References

<BUG:19104425> - [229892]LOST NETWORK CONNECTIVITY TO BL460CG6 BLADES

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.