Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-77-2164046.1
Update Date: 2018-01-03
Keywords:

Solution Type  Sun Alert Sure

Solution  2164046.1 :   Certain Cases of E5 A-Series Card Boot May Cause HIPR2 Cards to Boot and Result in the STP Node Booting (Node Isolation)  


Related Items
  • Oracle Communications EAGLE (Software)
  • Oracle Communications EAGLE (Hardware)
Related Categories
  • PLA-Support>Sun Systems>CommsGBU>Global Signaling Solutions>SN-SND: Tekelec Eagle 5




In this Document
Description
Occurrence
Symptoms
Workaround
Patches
History
References


Applies to:

Oracle Communications EAGLE (Software) - Version EAGLE 45.0 and later
Oracle Communications EAGLE (Hardware) - Version EAGLE 46.0 to EAGLE 46.0 [Release EAGLE 46.0]
Tekelec

Description

Type-A class cards are the slower cards in the system, so the system must support a procedure that prevents faster cards from overrunning an A-class card’s IMT receive buffers and causing it to boot. This support is performed on the HIPR2 cards. However, in very limited cases (such as extreme card or subsystem overloads involving A-class cards, or a communication problem on a particular A-class card), a HIPR2 card may hang and reboot. If this occurs, the HIPR2 cards on the opposite IMT bus may also boot. Failure of a HIPR2 card on each IMT bus causes both buses to reset. All cards in the system then reload, which results in nodal isolation until the STP node recovers from the boot.

Occurrence

A few cases can cause a HIPR2 card to hang and reboot. Failure of HIPR2 cards on both IMT buses results in node isolation until the STP node recovers from the boot. The scenarios below apply to EAGLE systems running releases 45.0 through 46.3.0.0.1.

Scenario 1 – Type A-class LIM and SCCP card boots

  • An E5 A-series LIM or SCCP card boots with a specific COM failure. Note: The obit "Module pmtc_ixp.c Class 0243" has been identified as a specific case that has the potential to cause the issue.
  • The HIPR2 cards are in the shelf that contains the E5 A-series card that booted.


Scenario 2 - Extreme card / subsystem overloads involving A-class cards

  • Overload of subsystems (EROUTE, SCCP, SLAN, etc.) involving Type-A cards

           An overload on a subsystem that involves Type-A cards may in turn overload the Type-A cards in that subsystem.

  • Extreme overload of A-class LIM cards

           This condition results from heavy traffic and congestion.

  • The HIPR2 cards are in the shelf that contains the overloaded E5 A-series card.

The part numbers of the affected E5 A-series cards are given below:

E5-ATM  - 870-1872-xx
E5-SM4G - 870-2860-xx
E5-E1T1 - 870-1873-xx
E5-ENET - 870-2212-xx
HCMIM   - 870-2671-xx
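
As a rough aid for checking an inventory against the list above, the part numbers can be matched by prefix. This is a minimal sketch, not an Oracle tool: the function name `affected_card_type` is hypothetical, and it assumes the trailing "xx" stands for a two-character revision suffix.

```python
import re

# Part-number prefixes copied from the list above.
# Assumption: the trailing "xx" is a two-character revision suffix.
AFFECTED_CARDS = {
    "870-1872": "E5-ATM",
    "870-2860": "E5-SM4G",
    "870-1873": "E5-E1T1",
    "870-2212": "E5-ENET",
    "870-2671": "HCMIM",
}

_PART_RE = re.compile(r"^(\d{3}-\d{4})-\w{2}$")


def affected_card_type(part_number):
    """Return the card type for an affected part number, or None otherwise."""
    match = _PART_RE.match(part_number.strip())
    if match is None:
        return None  # not in the NNN-NNNN-xx format assumed here
    return AFFECTED_CARDS.get(match.group(1))
```

For example, `affected_card_type("870-1872-01")` yields the E5-ATM entry, while a part number with an unlisted prefix yields `None`.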

 

 

Symptoms

1) An A-class card reboots with Module pmtc_ixp.c obits, followed by HIPR2 card reboots with Module hipr2op_isr obits and/or network card reboots with Module pmtc_mgr.c obits.

A sample sequence of OBIT events which can be expected:
         STH: Received a BOOT IMT-Obituary reply for restart
           Card xxxx Module pmtc_ixp.c Line 263 Class 0243

         STH: Received a BOOT IMT-Obituary reply for restart
           Card 3109 Module hipr2op_isr. Line 355 Class 01c3

         STH: Received a BOOT APPL-Obituary reply for restart
           Card 4105 Module pmtc_mgr.c Line 599 Class 0241

2) Card overload and/or subsystem capacity alarms, followed by HIPR2 card reboots with Module hipr2op_isr obits and/or network card reboots with Module pmtc_mgr.c obits.

A sample sequence of UAM and OBIT events which can be expected (depends on the features enabled in the system):
         XXXX.0477 * SLK 3213,A Congestion: Copy Function De-activated

         XXXX.0472 * EROUTE SYSTEM EROUTE System Threshold Exceeded
         XXXX.0473 ** EROUTE SYSTEM EROUTE System Capacity Exceeded
         XXXX.0482 ** EROUTE SYSTEM Card(s) have been denied EROUTE service

         XXXX.0330 ** SCCP SYSTEM System SCCP TPS Threshold exceeded
         XXXX.0437 *C SCCP SYSTEM System SCCP TPS capacity exceeded
         XXXX.0336 ** SCCP SYSTEM LIM(s) have been denied SCCP service

         XXXX.0111 ** IMT SYSTEM Failure on both IMT A and IMT B

         STH: Received a BOOT IMT-Obituary reply for restart
           Card 3109 Module hipr2op_isr. Line 355 Class 01c3

         STH: Received a BOOT APPL-Obituary reply for restart
           Card 4105 Module pmtc_mgr.c Line 599 Class 0241
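
When reviewing captured terminal output for these symptoms, the obit lines can be picked out mechanically. The following is a minimal sketch under stated assumptions: it relies only on the "Card ... Module ... Line ... Class ..." layout shown in the samples above, the names `parse_obits` and `summarize` are hypothetical, and real output may be formatted differently.

```python
import re

# Matches obit detail lines in the layout shown in the samples above.
OBIT_RE = re.compile(
    r"Card\s+(?P<card>\S+)\s+Module\s+(?P<module>\S+)\s+"
    r"Line\s+(?P<line>\d+)\s+Class\s+(?P<cls>[0-9A-Fa-f]+)"
)


def parse_obits(log_text):
    """Return (card, module, line, class) tuples in the order they appear."""
    return [
        (m["card"], m["module"], int(m["line"]), m["cls"].lower())
        for m in OBIT_RE.finditer(log_text)
    ]


def summarize(log_text):
    """Flag which of the obit types described in this document are present."""
    obits = parse_obits(log_text)
    return {
        # A-class COM failure obit (Module pmtc_ixp.c, Class 0243)
        "a_class_com_obit": any(
            mod.startswith("pmtc_ixp") and cls == "0243"
            for _, mod, _, cls in obits
        ),
        # HIPR2 card reboot obit
        "hipr2_obit": any(mod.startswith("hipr2op_isr") for _, mod, _, _ in obits),
        # Network card reboot obit
        "network_card_obit": any(mod.startswith("pmtc_mgr") for _, mod, _, _ in obits),
    }


# Sample text copied from the OBIT sequence in this document.
SAMPLE = """
STH: Received a BOOT IMT-Obituary reply for restart
  Card xxxx Module pmtc_ixp.c Line 263 Class 0243
STH: Received a BOOT IMT-Obituary reply for restart
  Card 3109 Module hipr2op_isr. Line 355 Class 01c3
STH: Received a BOOT APPL-Obituary reply for restart
  Card 4105 Module pmtc_mgr.c Line 599 Class 0241
"""
```

Calling `summarize(SAMPLE)` sets all three flags, which corresponds to symptom 1. A log showing only the HIPR2 and network card obits, preceded by overload alarms, would correspond to symptom 2.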

 

Workaround

Replace any Type-A class card that has booted with the COM obit "Module pmtc_ixp.c Line 263 Class 0243" with a spare card, because this obit indicates a hardware failure, and then raise an RMA.

Patches

A permanent fix is available in EAGLE release 46.3.0.0.2.

Additionally, a new HIPR2 GPL (Generic Program Load) containing the fix is provided via MOS for select EAGLE releases earlier than 46.3.0.0.2, under the corresponding patch IDs. The available patches for each release are:

Release 45.0.x - Patch 24368436
Release 46.0.x - Patch 24393663
Release 46.2.x - Patch 24461795
Release 46.3.0.0.1 - Patch 24433593
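
The release-to-patch mapping above can be expressed as a small lookup. This is an illustrative sketch only: the function name `patch_for_release` is hypothetical, and the dotted release-string format (for example "46.0.1.0.0") is an assumption about how a release is written.

```python
# Patch IDs copied from the list above.
# ".x" entries apply to a whole release family (first two fields).
FAMILY_PATCHES = {
    "45.0": "24368436",
    "46.0": "24393663",
    "46.2": "24461795",
}
# This entry applies to one exact release only.
EXACT_PATCHES = {
    "46.3.0.0.1": "24433593",
}


def patch_for_release(release):
    """Return the listed patch ID for a release string, or None if none is listed."""
    release = release.strip()
    if release in EXACT_PATCHES:
        return EXACT_PATCHES[release]
    family = ".".join(release.split(".")[:2])  # e.g. "46.0.1.0.0" -> "46.0"
    return FAMILY_PATCHES.get(family)
```

A result of `None` means only that this document lists no patch for that release. Release 46.3.0.0.2, for example, returns `None` because it already contains the permanent fix.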

The fix resolves the following bugs:

24334123 - HIPR2 - booting when a "A" class lim holds Flow Control to long.
24942059 - R46.2_ST2:Both IMTs up, we keep getting HIPR2 boots in an oversubscribed EROUTE.
24942071 - R46.2_ST2:System boots, when SCCP capacity oversubscribed with one IMT up.
24942084 - R_46.3_ST: Complete SYSTEM booted in congestion scenario.
21856737 - R46.2_ST2:BOTH IMTS UP, WE KEEP GETTING HIPR2 BOOTS IN AN OVERSUBSCRIBED EROUTE

 

Follow the steps below before and after applying the new HIPR2 GPL to verify that the IMT bus is free of errors and in good health. If it is not in good health, contact TAC via an SR before proceeding:

  1. Clear the IMT statistics by running the “clr-imt-stats:all=yes” command, then wait at least 15 minutes.
  2. Check the IMT status by following Procedure 15, Verifying IMT Status, in the EAGLE System Health Check Guide (E54339 Revision 4) (http://docs.oracle.com/cd/E76234_01/docs.463/E54339_rev_4.pdf).

History

 22-JUL-2016 - KM Document Created.

25-JUL-2016 - KM Document updated with review comments

11-AUG-2016 - KM Document updated with release 46.3 patch number

17-AUG-2016 - KM Document updated with release 46.2 patch number

26-OCT-2016 - KM Document updated with additional details

References

<BUG:24334123> - HIPR2 - BOOTING WHEN A "A" CLASS LIM HOLDS FLOW CONTROL TO LONG
<BUG:24942059> - R46.2_ST2:BOTH IMTS UP, WE KEEP GETTING HIPR2 BOOTS IN AN OVERSUBSCRIBED EROUTE.
<BUG:24942071> - R46.2_ST2:SYSTEM BOOTS, WHEN SCCP CAPACITY OVERSUBSCRIBED WITH ONE IMT UP.
<BUG:24942084> - R_46.3_ST: COMPLETE SYSTEM BOOTED IN CONGESTION SCENARIO
<BUG:21856737> - R46.2_ST2:BOTH IMTS UP, WE KEEP GETTING HIPR2 BOOTS IN AN OVERSUBSCRIBED EROUTE.

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.