Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Predictive Self-Healing Sure Solution

1173666.1 : SMF-8000-YX - A Service In Maintenance State
Applies to:
Solaris Operating System - Version 10 3/05 and later
Oracle ZFS Storage ZS3-4
SPARC T8-4
SPARC T8-2
SPARC T8-1
Information in this document applies to any platform.

Purpose
Provide additional information for Message ID: SMF-8000-YX

Details
Type: Defect (defect.sunos.smf.svc.maintenance)
Severity: Major
Description: A service failed and could not be restarted.
Automated Response: The service has been placed into the maintenance state.
Impact: The service is unavailable.
Suggested Action for System Administrator: Run svcs -x to determine why the service failed and the location of logfiles (/var/svc/log), if any.

Details

Summary
When the service management facility (see smf(5)) determines that a service instance should be placed into the maintenance state, the fault management subsystem tracks this maintenance state via a new problem diagnosis with diagnosis message id SMF-8000-YX.

The message id SMF-8000-YX is a generic identifier for "a service entering maintenance state", whatever the affected service or the reason for it entering that state.

After investigating and addressing the cause of the maintenance state (see below), a suitably-privileged administrator may clear that state using either SMF commands (svcadm) or fault management commands (fmadm).

Why Do Service Instances Enter Maintenance State?
Several common failure modes result in an instance being placed in the maintenance state.
The failure modes will be tabulated below, but note that these are simply the generically observable failure symptoms and do not describe the particular reason why a given service is exhibiting those symptoms. For example, if the start method for a service instance exits with $SMF_EXIT_ERR_CONFIG because a required configuration file is corrupt, the failure mode is "(start) method failed", and the administrator will have to look into log files and the like for the affected service to determine exactly why that instance has a configuration problem.
What To Do?
In the following we write <fmri> for the FMRI of the affected service instance - for example svc:/network/ntp:default for the ntp service.
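As a small illustration of the FMRI notation, the sketch below splits an instance FMRI of the form svc:/<service>:<instance> into its parts using plain shell parameter expansion. The function names are hypothetical, and the log file naming in fmri_logfile is an assumption generalised from the single log path that appears in the svcs -xv example in this document; it is not taken from SMF documentation.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of SMF.

# Print the service name, e.g. "network/ntp" for svc:/network/ntp:default.
fmri_service() {
    s=${1#svc:/}          # drop the leading "svc:/" scheme
    echo "${s%:*}"        # drop the trailing ":<instance>"
}

# Print the instance name, e.g. "default".
fmri_instance() {
    echo "${1##*:}"       # keep only what follows the last ":"
}

# ASSUMPTION: the log name is the service name with "/" replaced by "-",
# followed by ":<instance>.log" under /var/svc/log, as in the ntp example.
fmri_logfile() {
    svc=$(fmri_service "$1" | tr '/' '-')
    echo "/var/svc/log/${svc}:$(fmri_instance "$1").log"
}

fmri_service  svc:/network/ntp:default
fmri_instance svc:/network/ntp:default
fmri_logfile  svc:/network/ntp:default
```

On a live system the authoritative log location is the one reported by svcs -x for the instance, so the derived path should be treated only as a convenience.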
Example
Suppose that svc:/network/ntp:default is enabled but that the configuration file /etc/inet/ntp.conf is absent. On the console we see a new problem diagnosis as follows (other configured notification mechanisms, such as SNMP and email, will show similar information):

SUNW-MSG-ID: SMF-8000-YX, TYPE: defect, VER: 1, SEVERITY: major
EVENT-TIME: Mon May 17 22:38:34 PDT 2010
PLATFORM: Sun-Fire-V40z, CSN: XG051535088, HOSTNAME: parity
SOURCE: software-diagnosis, REV: 0.1
EVENT-ID: 97911e1b-f7a3-cc69-f850-c969e0a7c222
DESC: A service failed - a start, stop or refresh method failed
  Refer to http://sun.com/msg/SMF-8000-YX for more information.
AUTO-RESPONSE: The service has been placed into the maintenance state.
IMPACT: svc:/network/ntp:default is unavailable
REC-ACTION:

While SMF-8000-YX is a generic "maintenance state" code, the message above does have some dynamic aspects that are specific to this particular case, such as the affected FMRI on the IMPACT line. Running the suggested command we see:

# svcs -xv svc:/network/ntp:default
svc:/network/ntp:default (Network Time Protocol (NTP) Version 4)
 State: maintenance since Mon May 17 22:38:34 2010
Reason: Start method exited with $SMF_EXIT_ERR_CONFIG.
   See: http://sun.com/msg/SMF-8000-KS
   See: man -M /usr/share/man -s 1M ntpd
   See: man -M /usr/share/man -s 4 ntp.conf
   See: man -M /usr/share/man -s 1M ntp
   See: /var/svc/log/network-ntp:default.log
Impact: This service is not running.
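When several instances are in maintenance, it can help to pull the state, the reason and the log file pointer out of the svcs -xv output mechanically. The following is a minimal sketch that filters such output on standard input; the function names are hypothetical, and the field labels (State:, Reason:, See:) are assumed to appear exactly as in the example output above, which is embedded here as sample data so the sketch runs without a Solaris host.

```shell
#!/bin/sh
# Sketch only: filters for `svcs -xv` style output read from stdin.
# Function names are hypothetical; labels are taken from the example above.

# Print the text after "State:".
svcs_state() {
    sed -n 's/^[[:space:]]*State:[[:space:]]*//p'
}

# Print the text after "Reason:".
svcs_reason() {
    sed -n 's/^[[:space:]]*Reason:[[:space:]]*//p'
}

# Print only the "See:" entries that point under /var/svc/log.
svcs_logfile() {
    sed -n 's|^[[:space:]]*See:[[:space:]]*\(/var/svc/log/.*\)$|\1|p'
}

# Sample data: the svcs -xv output from the ntp example.
sample_output() {
cat <<'EOF'
svc:/network/ntp:default (Network Time Protocol (NTP) Version 4)
 State: maintenance since Mon May 17 22:38:34 2010
Reason: Start method exited with $SMF_EXIT_ERR_CONFIG.
   See: http://sun.com/msg/SMF-8000-KS
   See: man -M /usr/share/man -s 1M ntpd
   See: man -M /usr/share/man -s 4 ntp.conf
   See: man -M /usr/share/man -s 1M ntp
   See: /var/svc/log/network-ntp:default.log
Impact: This service is not running.
EOF
}

sample_output | svcs_state
sample_output | svcs_reason
sample_output | svcs_logfile    # prints /var/svc/log/network-ntp:default.log
```

On a real system the sample data would be replaced by the command itself, for example svcs -xv <fmri> | svcs_logfile, and the resulting path could then be passed to tail to inspect the most recent method output.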
Note that svcs -xv output has been more specific than the console messaging in that it has indicated which method failed and what it returned; it also links to an article that elaborates on the "start method failed" failure mode, and provides a pointer to the service instance log. Inspecting the tail of that log and correlating with the timestamp of 22:38 above we see:
The error message is the result of the instance start method writing to standard output or standard error. In this case the cause is obvious; in more complex cases the above is simply the beginning of an investigation to debug the root cause of the maintenance state, typically involving some service-specific expertise. Suppose we now create a valid /etc/inet/ntp.conf and wish to clear maintenance state; the instance will attempt to move to online state since it is enabled in the repository:
Note that the abbreviation ntp was used instead of the full FMRI string svc:/network/ntp:default (which would also have worked but would be longer to type). One could also have used fmadm repaired 97911e1b-f7a3-cc69-f850-c969e0a7c222.

NOTE: In Solaris, if none of DNS, LDAP, NIS, or AD is configured as a network naming service, set the DNS server to the loopback IP address 127.0.0.1 so that svc:/system/auditd:default and svc:/system/auditset:default do not go into the maintenance state.
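The two ways of clearing the maintenance state mentioned in this example (svcadm with the FMRI or its abbreviation, and fmadm repaired with the EVENT-ID of the diagnosis) can be sketched as follows. This is a dry-run illustration under stated assumptions: the helper names event_id and clear_cmd are hypothetical, and the script only prints the command line it would use rather than executing it, so it can be reviewed before being run by a suitably-privileged administrator.

```shell
#!/bin/sh
# Sketch only: builds, but does not run, the command that clears maintenance.
# event_id and clear_cmd are hypothetical helper names.

# Extract the EVENT-ID value from a fault message read from stdin.
event_id() {
    sed -n 's/^EVENT-ID:[[:space:]]*//p'
}

# Print the command line for the chosen route:
#   smf <fmri>      -> svcadm clear <fmri>
#   fma <event-id>  -> fmadm repaired <event-id>
clear_cmd() {
    case "$1" in
        smf) echo "svcadm clear $2" ;;
        fma) echo "fmadm repaired $2" ;;
        *)   echo "usage: clear_cmd smf <fmri> | fma <event-id>" >&2
             return 2 ;;
    esac
}

# EVENT-ID line copied from the console message in the example.
uuid=$(printf '%s\n' 'EVENT-ID: 97911e1b-f7a3-cc69-f850-c969e0a7c222' | event_id)

clear_cmd smf ntp        # prints: svcadm clear ntp
clear_cmd fma "$uuid"    # prints: fmadm repaired 97911e1b-f7a3-cc69-f850-c969e0a7c222
```

Either route should be taken only after the underlying cause has been addressed; otherwise an enabled instance will try to start again and may return to the maintenance state.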
Maintenance Reasons
In the table below we list the possible reasons for entering maintenance state, and link to the corresponding article that provides more information for this particular reason.
See "Predictive Self-Healing" for additional information. Specifically, see the section entitled "Retrieving Dependency Tree Information" in the "SMF How To Guide", which provides detailed information for troubleshooting service dependency issues. For step-by-step troubleshooting, refer to the Systems Administration Guide: Basic Administration: Troubleshooting the Service Management Facility documentation.

Attachments
This solution has no attachment