Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1456741.1 : Sun Storage 7000 Unified Storage System: Service Processor has stopped responding to requests
Created from <SR 3-5604756781>

Applies to:
Oracle Exalogic Elastic Cloud X2-2 Hardware - Version X2 to X2 [Release X2]
Sun ZFS Storage 7320 - Version All Versions to All Versions [Release All Releases]
Exalogic Elastic Cloud X5-2 Hardware - Version X5 to X5 [Release X5]
7000 Appliance OS (Fishworks)

Symptoms
An Exalogic or NAS 7000 Appliance may see the below alert:

SUNW-MSG-ID: AK-8001-GU, TYPE: alert, VER: 1, SEVERITY: Minor
EVENT-TIME: Wed Apr 18 05:55:05 2012
PLATFORM: i86pc, CSN: 1122334455, HOSTNAME: node01
SOURCE: svc:/appliance/kit/akd:default, REV: 1.0
EVENT-ID: 49cca6f4-3e56-c046-d441-825c88b33e2b
DESC: The service processor has stopped responding to requests.
AUTO-RESPONSE: None.
IMPACT: Features that depend on service processor functionality, including hardware inventory, LED control, and fault diagnosis, will not function correctly while the service processor is in this state.
REC-ACTION: Restart the service processor. Contact your service provider if the problem persists.
A few minutes after this alert is generated, the system may post a second alert, "The service processor has resumed responding to requests.", once the SP resumes working. The service processor resets itself to recover from this situation.
The reset is recorded as a sysevent, which can be seen in the support bundle under fm/infolog_hival.txt or by running "fmdump -l":
Apr 18 06:13:10.0638 resource.sysevent.EC_platform.ESC_platform_sp_reset
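
To check a collected log for this sysevent without reading it by eye, a short script can filter for the event class. The following is a minimal sketch, not an Oracle-supplied tool: the function name `find_sp_resets` is hypothetical, and the line layout (month, day, time, event class) is assumed from the single sample line shown in this note.

```python
import re
from typing import Iterable, List, Tuple

# Event class copied from the sample sysevent line in this note.
SP_RESET_CLASS = "resource.sysevent.EC_platform.ESC_platform_sp_reset"

# Assumed layout, based only on the sample line:
#   "<Mon> <DD> <HH:MM:SS.ffff> <event class>"
LINE_RE = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\.\d+)\s+(\S+)$")


def find_sp_resets(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, event_class) for each SP reset sysevent found."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(2) == SP_RESET_CLASS:
            hits.append((match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    # Sample input: the line quoted in this note.
    sample = [
        "Apr 18 06:13:10.0638 resource.sysevent.EC_platform.ESC_platform_sp_reset",
    ]
    for timestamp, event_class in find_sp_resets(sample):
        print(timestamp, event_class)
```

In practice the input would be the lines of fm/infolog_hival.txt from the bundle. Lines that do not match the assumed layout, such as a column header, are skipped.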
Cause
The Service Processor may have become unresponsive because of known software limitations, such as the SP being probed too frequently or its temporary filesystem filling up.
Solution
As a workaround, if the Service Processor does not come back, it can be reset using one of the methods mentioned below:
Resetting the Service Processor does not reboot the Appliance; it only resets the Service Processor.
If the problem appears again, engage Oracle Support for investigation of the issue.
Also check Bug 20859787, where the issue was caused not by an SP memory leak but by the customer environment.
External requests, for example from Oracle Ops Center, Oracle Enterprise Manager, or similar tools, can keep the Service Processor so busy that it cannot respond to akd poll requests. If the issue is still seen after resetting the SP, consider increasing the IPMI timeout parameter "ipmi_timeout" from 5 to 30 seconds using the workflow attached to the bug.
NOTE: Running this workflow will restart akd.
NOTE: The SP/BIOS version is fixed for a particular Appliance Firmware Release version. Upgraded SP/BIOS firmware is only available as part of an Appliance Firmware Release upgrade (which has first been mandated by Fishworks Engineering). As such, the SP/BIOS version can be downrev compared with the latest version available for the underlying hardware/server platform.
References
<BUG:15631318> - SUNBT6937107 SP IS NOT RESET BASED ON SP_RESET_FATAL, SP_RESET_WARN, SP_KILL_TO,
<BUG:15672172> - SUNBT6988621-X64_3.0.14 CALLISTO: SP MEMORY LEAK OBSERVED WITH X64_3.0.14 R58793
<BUG:20859787> - SERVICE PROCESSORS CONTINUE TO STOP RESPONDING TO REQUESTS FOLLOWING SP RESET
<BUG:15708995> - SUNBT7036162-AK-8 FAN TRAYS AND POWER SUPPLIES: BOGUS ADD/REMOVE ALERTS

Attachments
This solution has no attachment