Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Troubleshooting Sure
Solution 1009200.1: Ecache (E$) events and what to do about them
PreviouslyPublishedAs 212735

Applies to:
Sun Enterprise 4000 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Enterprise 4500 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Enterprise 5000 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Enterprise 10000 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Enterprise 6000 Server - Version Not Applicable and later
All Platforms

Purpose
This document explains how to identify an ecache (or e$) event on a customer's system. It also details Oracle's Best Practice for ecache events and whether to replace the CPU.

Troubleshooting Steps

Background:
An ecache event (pronounced e-cash) is a hardware event that can occur on any UltraSPARC-based system. Such an event occurs when a bit in a CPU's cache memory is unintentionally modified. An UltraSPARC I, II, or IIi system will usually panic or reboot when an ecache event occurs. Later CPUs, which currently include the UltraSPARC III and IV families, have error correction features that usually correct the errors without any impact on system operation.

Jul 10 01:43:34 tronsd81 unix: WARNING: [AFT1] WP event on CPU11, errID 0x00092f64.77726c8d
Jul 10 01:43:34 tronsd81 unix: AFSR 0x00000000.00800100 AFAR 0x00000179.fe75f940
Jul 10 01:43:34 tronsd81 unix: AFSR.PSYND 0x0100(Score 95) AFSR.ETS 0x00 Fault_PC 0x100171b0
Jul 10 01:43:34 tronsd81 unix: UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Jul 10 01:43:58 tronsd81 unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data access at TL=0, errID 0x00092f6a.34dfd7e1
Jul 10 01:43:58 tronsd81 unix: AFSR 0x00000000.80200000 AFAR 0x00000002.95b74000
Jul 10 01:43:58 tronsd81 unix: AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10021058
Jul 10 01:43:58 tronsd81 unix: UDBH 0x0203 UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00
Jul 10 01:43:58 tronsd81 unix: UDBH Syndrome 0x3 Memory Module Board 4 J3101 J3201 J3301 J3401 J3501 J3601 J3701 J3801

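When reviewing a messages file for records like the sample above, it can help to pull out the CPU number, error ID, and score for each [AFT1] warning. The following is a minimal Python sketch, not an Oracle tool. The function name `parse_aft1` and both regular expressions are assumptions derived only from the sample lines shown here, so other message layouts may not match. The sketch extracts fields only and does not decide which records are ecache events.

```python
import re

# Assumed layout, taken from the sample above:
#   "... unix: WARNING: [AFT1] <event text> on CPU<n> ... errID <hex>.<hex>"
EVENT_RE = re.compile(
    r"unix: WARNING: \[AFT1\] (?P<event>.+?) on CPU(?P<cpu>\d+)"
    r".*?errID (?P<errid>0x[0-9a-fA-F]+\.[0-9a-fA-F]+)"
)
# Assumed layout: "AFSR.PSYND <hex>(Score <nn>)"
SCORE_RE = re.compile(r"AFSR\.PSYND 0x[0-9a-fA-F]+\(Score (?P<score>\d+)\)")


def parse_aft1(lines):
    """Return one dict per [AFT1] warning, with the score from its follow-up line."""
    events = []
    for line in lines:
        header = EVENT_RE.search(line)
        if header:
            events.append({
                "event": header["event"],
                "cpu": int(header["cpu"]),
                "errid": header["errid"],
                "score": None,  # filled in if an AFSR.PSYND line follows
            })
            continue
        score = SCORE_RE.search(line)
        if score and events and events[-1]["score"] is None:
            events[-1]["score"] = int(score["score"])
    return events


if __name__ == "__main__":
    sample = [
        "Jul 10 01:43:34 tronsd81 unix: WARNING: [AFT1] WP event on CPU11, errID 0x00092f64.77726c8d",
        "Jul 10 01:43:34 tronsd81 unix: AFSR.PSYND 0x0100(Score 95) AFSR.ETS 0x00 Fault_PC 0x100171b0",
    ]
    for record in parse_aft1(sample):
        print(record)
```

Grouping the resulting records by the `cpu` field makes it easier to see whether a given CPU has logged more than one event.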
Oracle's Best Practice for ecache events:
Once an ecache event has been identified, the first step is to determine whether it is the first ecache event on that CPU. A single event is considered transient, and the customer should be instructed to record the fault and monitor for any repeat event on the same CPU over the next six months. There is ONE exclusion to the Best Practice rule:
In cases where it is unknown whether the CPU has had a previous ecache error, where case/service request data is unavailable and the status is unknown, the recommendation is to treat the fault as transient and consider the event a first event.

Additional Resources:
ecache, score, 95, AFT, panic, reboot, EDP, e$, bestpractices.central, best practice
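
The Best Practice rule and its exclusion described above can be sketched as a small decision function. This is an illustrative Python sketch under stated assumptions, not an Oracle procedure. The name `classify_event`, the `history` structure, and the 183-day approximation of the six-month window are hypothetical. The sketch also assumes that an earlier event outside the window does not count as a repeat. The action to take on a repeat event is not stated in this excerpt, so the function only labels the event.

```python
from datetime import datetime, timedelta

# Assumption: the six-month monitoring period is approximated as 183 days.
MONITOR_WINDOW = timedelta(days=183)

FIRST = "first"    # treat as transient: record the fault and monitor the CPU
REPEAT = "repeat"  # repeat on the same CPU within the monitoring window


def classify_event(cpu_id, event_time, history):
    """Label an ecache event as FIRST or REPEAT for the given CPU.

    history maps a CPU id to a list of datetimes of earlier ecache events
    on that CPU. Pass None when no case/service request data is available;
    per the exclusion above, the event is then treated as a first event.
    """
    if history is None:
        return FIRST
    earlier = history.get(cpu_id, [])
    # Only earlier events on the same CPU inside the window count as prior.
    in_window = [
        t for t in earlier
        if timedelta(0) <= event_time - t <= MONITOR_WINDOW
    ]
    return REPEAT if in_window else FIRST


if __name__ == "__main__":
    known = {11: [datetime(2018, 1, 10)]}
    print(classify_event(11, datetime(2018, 3, 1), known))
    print(classify_event(11, datetime(2018, 3, 1), None))
```

Keeping the history keyed by CPU reflects the rule that only a repeat on the same CPU matters; events on other CPUs are each evaluated on their own.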
Attachments
This solution has no attachment