Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1008805.1: Sun Fire[TM] 12K/15K/E20K/E25K: Remote Dynamic Reconfiguration (DR) generates "DCA/DCS Communication Error" and showdevices reports "Unable to get device information from domain".
Previously Published As: 212092

***Checked for relevance on 09-May-2011***

Applies to:
Sun Fire 15K Server - Version Not Applicable and later
Sun Fire E25K Server - Version Not Applicable and later
Sun Fire E20K Server - Version Not Applicable and later
Sun Fire 12K Server - Version Not Applicable and later
Sun SPARC Sun OS

Symptoms

The rcfgadm or showdevices commands generate errors when run from the system controller (SC). The error message might be "DCA/DCS Communication Error".

The showdevices command might generate the following error (where x is the domain ID):

    # showdevices -v -d x
    Unable to get device information from domain x

This showdevices error can also be seen in Explorer data from the main SC. The file showdevices_-v_-d_x.out, in the /explorer/sf15k/ directory of the Explorer output, shows the same "Unable to get device information from domain x" error message.

The following messages might also be logged in the platform log file on the SC ($SMSVAR/adm/platform/messages):

    xcat-sc0 showdevices[7496]: [0 2197706244444996 ERR ri_init.cc 85] rcfgaRequestProxy->ri_init failed. Status= 4315
    xcat-sc0 showdevices[7496]: [4509 2197706254650079 ERR RcfgaCallback.cc 521] server accept failed. RcfgaCallback::serverAccept: failed in ioctl domain id = I

Changes
Cause

These errors can be caused by configuration problems on the SC or on the domain.

Solution

The following is a summary of the items that need to be checked:
scman0 and dman0

The dca <-> dcs handshaking takes place over the I1 network. This means that scman0 on the SC and dman0 on the domain must be configured and running properly. This is often overlooked, so be sure to verify this information with the following command:

On SC:

    # ifconfig -a
    scman0: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 3
            inet 10.10.1.1 netmask ffffffe0 broadcast 10.10.1.31

On domain:

    dman0: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 3
            inet 10.10.1.3 netmask ffffffe0 broadcast 10.10.1.31
            ether 0:0:be:a8:17:57

Note that the IP addresses and netmasks on the dman0 and scman0 interfaces should match the information stored in the /etc/SUNWMSMS/config/MAN.cf file on the SC.
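As a rough aid for the check above, the following shell sketch tests whether a named interface reports both the UP and RUNNING flags in ifconfig -a output. The function name iface_is_up is hypothetical (it is not part of SMS or Solaris), and the sketch assumes the Solaris-style "name: flags=...<...>" line format shown above. It does not compare addresses against MAN.cf.

```shell
#!/bin/sh
# Hypothetical helper (not an SMS or Solaris command).
# Reads `ifconfig -a` output on stdin and succeeds only if the interface
# named in $1 has both UP and RUNNING in its flags field.
iface_is_up() {
    awk -v ifname="$1" '
        $1 == (ifname ":") {
            if ($2 ~ /[<,]UP[,>]/ && $2 ~ /[<,]RUNNING[,>]/) ok = 1
        }
        END { exit !ok }'
}

# Example usage on the SC (on a domain, pass dman0 instead):
#   ifconfig -a | iface_is_up scman0 && echo "scman0 is up and running"
```

The function only reads text from standard input, so it can also be run against saved ifconfig output, such as a file collected by Explorer.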
This should be further confirmed by running the following command on the domain:

    # ndd /dev/dman man_get_hostinfo
    manc_magic = 0x4d414e43
    manc_version = 01
    manc_csum = 0x0
    manc_ip_type = AF_INET
    manc_dom_ipaddr = 10.10.1.3
    manc_dom_ip_netmask = 255.255.255.224
    manc_dom_ip_netnum = 10.10.1.0
    manc_sc_ipaddr = 10.10.1.1
    manc_dom_eaddr = 0:0:be:a8:17:57
    manc_sc_eaddr = 8:0:20:fa:5f:1a
    manc_iob_bitmap = 0xa0 io boards = 5.1, 7.1,
    manc_golden_iob = 5

Domain Configuration Agent (DCA)

The Domain Configuration Agent (DCA) daemon runs on the SC, one per domain. Similar to a netcon session on a Sun Enterprise[TM] 10000 server, the DCA provides communication between the SC and the Domain Configuration Server (DCS) on the specified domain. If DCA is not running, the showdevices and rcfgadm commands fail. To verify that DCA is running, issue the following command on the SC:

    # ps -ef | grep dca
    sms-dca 1614 361 0 Feb 26 0:00 dca -d A
Domain Configuration Server (DCS)

DCS is a domain daemon process that supports remote dynamic reconfiguration. DCS must also be running on the domain in order for the showdevices or rcfgadm commands to work on the domain. If either command fails, check the domain for the following lines in the /etc/inetd.conf file:

    sun-dr stream tcp wait root /usr/lib/dcs dcs
    sun-dr stream tcp6 wait root /usr/lib/dcs dcs

These lines must be in the /etc/inetd.conf file for the rcfgadm and showdevices commands to work properly. If the lines are not in the file, and showdevices fails from the SC, add the lines shown above and restart the inetd process as follows:

    # ps -ef | grep inetd
    root 151 1 0 Mar 11 0:00 /usr/sbin/inetd -s
    # kill -HUP 151
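The inetd.conf check above can be scripted. The following is a minimal sketch, assuming the entry layout shown above (service name in the first field, /usr/lib/dcs in the sixth). The function name count_sun_dr_entries is hypothetical. It prints the number of active (uncommented) sun-dr entries, which should be 2 on a domain that still configures dcs through /etc/inetd.conf.

```shell
#!/bin/sh
# Hypothetical helper: count the active sun-dr entries that start /usr/lib/dcs
# in an inetd.conf-style file. Defaults to /etc/inetd.conf when no path is given.
count_sun_dr_entries() {
    conf="${1:-/etc/inetd.conf}"
    # Commented-out entries never have "sun-dr" as their first field,
    # so they are skipped automatically.
    awk '$1 == "sun-dr" && $6 == "/usr/lib/dcs" { n++ } END { print n + 0 }' "$conf"
}

# Example usage on the domain:
#   [ "$(count_sun_dr_entries)" -eq 2 ] || echo "sun-dr entries missing from /etc/inetd.conf"
```

This check applies only to domains where inetd still reads /etc/inetd.conf directly; it does not apply once dcs is managed by SMF.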
For additional information, refer to the dcs(1M) man page.

Note for domains running Solaris[TM] 10 (without patch 120253-02):

The /etc/inetd.conf file is no longer directly used to configure inetd. inetd is now configured in the Service Management Facility (SMF). You can list the inetd-managed SMF services that are installed:

    # inetadm
    ENABLED   STATE          FMRI
    enabled   online         svc:/application/font/stfsloader:default
    [output omitted]
    disabled  disabled       svc:/network/talk:default
    enabled   online         svc:/platform/sun4u/dcs:default
    [output omitted]

The /platform/sun4u/dcs service must be enabled/online. You can get more information about the svc:/platform/sun4u/dcs service and list its properties with the svccfg command:

    # /usr/sbin/svccfg -s svc:/platform/sun4u/dcs:default listprop
    general framework
    general/enabled boolean true
    restarter framework NONPERSISTENT
    restarter/auxiliary_state astring none
    restarter/next_state astring none
    restarter/state astring online
    restarter/state_timestamp time 1117463395.870876000
    restarter/contract count 94
    inetd_state framework NONPERSISTENT
    inetd_state/cur_state integer 1
    inetd_state/next_state integer 13
    inetd_state/start_pids integer
    svc:/platform/sun4u/dcs:default> quit
If any dcs processes are running, their PIDs are reported in inetd_state/start_pids. Note that, on domains running Solaris 10 without patch 120253-02, the dcs process is not running if the SC has not recently communicated with the domain. It is forked by inetd on request (a remote DR request started from the SC). Hence, the PPID of dcs is the inetd PID. Example:

    # ptree 304
    159 /usr/sbin/inetd -s
      304 dcs

Note for domains running Solaris[TM] 10 Update 2 (with patch 120253-02):

Due to the fixes for:

    Bug ID 4792021 per-socket level IPsec policy for dynamic reconfiguration
    Bug ID 6380945 Changes required for PSARC 2006/038

introduced in patch 120253-02, dcs no longer belongs to inetd. Because inetd does not support per-socket IPsec, dcs was changed to run standalone. Both dcs and cvcd are controlled by SMF and use SMF properties to define command-line options. Hence, running "inetadm | grep dcs" no longer returns information about dcs. Use the following command to get the status of the dcs service:

    # svcs dcs
    STATE STIME FMRI
    online 13:53:40 svc:/platform/sun4u/dcs:default

Note that, on domains running Solaris 10 Update 2 or with patch 120253-02, the dcs process starts at boot time. Due to the new implementation, dcs now runs with different options and accepts command-line arguments ("-a", "-e", and "-u") that allow the administrator to configure the IPsec encryption and authentication options. See the dcs(1M) man page for a description of these options. Example:

    # ptree 220
    220 /usr/lib/dcs -a md5

Note that the dcs process might not be running if the SC has not recently communicated with the domain.
To check whether any process is actually listening on the sun-dr port (port 665), run:

    e25ka-dom-c# netstat -an | grep 665
    *.665 *.* 0 0 49152 0 LISTEN
    *.665 *.* 0 0 49152 0 LISTEN

This verifies that some process is listening on the sun-dr port, 665. If nothing is listening on port 665, the showdevices and addboard/deleteboard commands on the SC cannot work.

The /etc/services File

The /etc/services file must also have the following entry on the domain for remote Dynamic Reconfiguration (DR):

    sun-dr 665/tcp # Remote Dynamic Reconfiguration

If you are using NIS+, make sure that the above entry is present in the /etc/services file of the NIS+ server. You can check this using the following command:

    $ niscat services.org_dir | grep sun-dr
    sun-dr sun-dr tcp 665 Remote Dynamic Reconfiguration
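Because "grep 665" also matches unrelated ports and addresses that merely contain 665, a stricter test can help. The following is a minimal sketch, assuming the Solaris netstat -an layout shown above (local address in the first column, state in the last). The function name has_sun_dr_listener is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -an` output on stdin and succeeds only
# if some socket whose local address ends in port 665 is in the LISTEN state.
has_sun_dr_listener() {
    awk '$1 ~ /[.:]665$/ && $NF == "LISTEN" { found = 1 } END { exit !found }'
}

# Example usage on the domain:
#   netstat -an | has_sun_dr_listener || echo "nothing is listening on port 665"
```

Anchoring the match on the end of the local-address column avoids false positives from ports such as 6650 or 1665.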
/etc/inet/ipsecinit.conf File on the Domain

When running Solaris 9 or below, the /etc/inet/ipsecinit.conf file should contain the following entries:

    { dport sun-dr ulp tcp } permit { auth_algs md5 }
    { sport sun-dr ulp tcp } apply { auth_algs md5 sa unique }
    { dport cvc_hostd ulp tcp } permit { auth_algs md5 }
    { sport cvc_hostd ulp tcp } apply { auth_algs md5 sa unique }

If the entries do not exist, add them and then issue:

    # ipsecconf -a /etc/inet/ipsecinit.conf

Use the following command to check that the system is now running with these settings:

    # ipsecconf

If the domain is running Solaris 10 with patch 120253, the service is managed by SMF and does not need the ipsecinit.conf file.

The /etc/inet/ipsecinit.conf file MUST NOT be present on the System Controller (SC); otherwise the failover mechanism may not work properly.
Domain X Server (DXS)

The console command uses DXS. It is similar to the netcon_server on the Sun Enterprise[TM] 10000 server. DXS runs on the SC, one per domain. To verify that DXS is running, issue the following command on the SC:

    # ps -ef | grep dxs
    sms-dxs 1609 361 0 Feb 26 0:57 dxs -d A
Console traffic normally travels over the console bus, but it can be toggled between the console bus and the I1 network using the ~= command. When the domain is rebooting, a message similar to "dxs disconnecting" appears on the SC. The reboot of a domain causes an hpost -Q, which is a quick POST run from the SC.

Sun Fire[TM] 12K/15K/E20K/E25K key management daemon (sckmd)

The sckmd server process resides on a Sun Fire[TM] 12K/15K/E20K/E25K domain. The sckmd daemon maintains the Internet Protocol Security (IPsec) Security Associations (SAs) needed to secure the communication between the SC and the cvcd and dcs daemons running on the domains. The sckmd daemon must be running on the domain in order for the showdevices or rcfgadm commands to work on the domain. To verify that the sckmd daemon is running, issue the following command on the domain:

    # ps -ef | grep sckmd
    root 24156 1 0 Apr 02 0:00 /usr/platform/SUNW,Sun-Fire-15000/lib/sckmd
Failure after a Solaris[TM] 10+ OS initial installation

After the initial installation of a Solaris 10+ domain, showdevices and rcfgadm do not work. Running the commands generates domain-side console messages such as:

    Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=ADD, errno=22: Invalid argument, diagnostic code=40: Unsupported authentication algorithm
    Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=DELETE, errno=3: No such process, diagnostic code=0: No diagnostic
    Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=ADD, errno=22: Invalid argument, diagnostic code=40: Unsupported authentication algorithm
    Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=DELETE, errno=3: No such process, diagnostic code=0: No diagnostic
    Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=ADD, errno=22: Invalid argument, diagnostic code=40: Unsupported authentication algorithm
    Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=DELETE, errno=3: No such process, diagnostic code=0: No diagnostic

To fix this, on the domain, issue the command:

    # ipsecalgs -s
For a more detailed explanation of this issue, see Bug ID 6233334.

Failure after a Solaris[TM] 10 Update 2 installation or after installing 120253-02 on Solaris[TM] 10

After an upgrade to Solaris[TM] 10 Update 2 or after the patch installation, the dcs service may fail to go online and stay in maintenance mode, with no dcs process running:

    Jul 27 13:50:30 inetd[284]: Unspecified inetd_start method for instance svc:/platform/sun4u/dcs:default
    Jul 27 13:50:30 inetd[284]: Invalid configuration for instance svc:/platform/sun4u/dcs:default, placing in maintenance
    Jul 27 13:50:30 inetd[284]: Invalid configuration for instance svc:/platform/sun4u/dcs:default, placing in maintenance

    # svcs dcs
    STATE STIME FMRI
    maintenance 13:52:23 svc:/platform/sun4u/dcs:default

To find out why dcs never came online, check the /etc/svc/volatile/platform-sun4u-dcs:default.log log file.

    # svcs -xv
    svc:/platform/sun4u/dcs:default (domain configuration server)
     State: maintenance since Thu 20 Jul 2006 13:50:30 AM MEST
    Reason: Start method failed repeatedly, last exited with status 1.
       See: http://sun.com/msg/SMF-8000-KS
       See: man -M /usr/share/man -s 1M dcs
       See: /etc/svc/volatile/platform-sun4u-dcs:default.log
    Impact: This service is not running.

To fix this, on the domain, restart the service:

    # svcadm disable dcs
    # svcadm enable dcs
    # svcs dcs
    STATE STIME FMRI
    online 13:53:40 svc:/platform/sun4u/dcs:default
While dcs is not available, rcfgadm and showdevices do not work.

If the domain uses a separate /usr partition, the workaround for CR# 15345596 must be applied to define a dependency on the /usr filesystem:

    svccfg -s svc:/platform/sun4u/dcs
    svc:/platform/sun4u/dcs> addpg SUNW,workaround dependency
    svc:/platform/sun4u/dcs> setprop SUNW,workaround/entities = fmri:svc:/system/filesystem/local
    svc:/platform/sun4u/dcs> setprop SUNW,workaround/grouping = astring: require_all
    svc:/platform/sun4u/dcs> setprop SUNW,workaround/restart_on = astring: none
    svc:/platform/sun4u/dcs> setprop SUNW,workaround/type = astring: service
    svc:/platform/sun4u/dcs> exit
    # svcadm refresh svc:/platform/sun4u/dcs
    # svcs -d dcs
    STATE STIME FMRI
    online 17:46:06 svc:/network/loopback:default
    online 17:46:09 svc:/system/identity:node
    online 17:46:22 svc:/system/filesystem/local:default

Failure after an upgrade to Solaris[TM] 10 Update 2

After an upgrade to Solaris[TM] 10 Update 2, the dcs service may fail to go online and stay in maintenance mode, with no dcs process running:

    Sep 19 10:57:55 inetd[250]: Property 'name' of instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid
    Sep 19 10:57:55 inetd[250]: Property 'endpoint_type' of instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid
    Sep 19 10:57:55 inetd[250]: Property 'isrpc' of instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid
    Sep 19 10:57:55 inetd[250]: Property 'wait' of instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid
    Sep 19 10:57:55 inetd[250]: Unspecified inetd_start method for instance svc:/platform/sun4u/dcs:default
    Sep 19 10:57:55 inetd[250]: Invalid configuration for instance svc:/platform/sun4u/dcs:default, placing in maintenance

    # svcs -xv
    svc:/platform/sun4u/dcs:default (domain configuration server)
     State: maintenance since Tue Sep 19 10:57:55 2006
    Reason: Restarter svc:/network/inetd:default gave no explanation.
       See: http://sun.com/msg/SMF-8000-9C
       See: man -M /usr/share/man -s 1M dcs
    Impact: This service is not running

The new manifest /var/svc/manifest/platform/sun4u/dcs.xml provided by 120253-02 (bundled in S10U2) has not been applied properly, so inetd is still trying to start the service. The general/restarter property for the dcs service should now be startd, not inetd.

    # svcprop dcs
    general/enabled boolean true
    general/entity_stability astring Unstable
    general/restarter fmri

See CR# 15351311 for more details.

To fix this problem, the new manifest must be imported using the following procedure:

    # svcs dcs
    STATE STIME FMRI
    maintenance 10:57:55 svc:/platform/sun4u/dcs:default
    # svcadm disable dcs
    # Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Property 'name' of instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid
    Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Property 'endpoint_type' of instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid
    Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Property 'isrpc' of instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid
    Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Property 'wait' of instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid
    Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Unspecified inetd_start method for instance svc:/platform/sun4u/dcs:default
    # svcs dcs
    STATE STIME FMRI
    disabled 11:02:13 svc:/platform/sun4u/dcs:default
    # svccfg -v delete dcs
    # svcs dcs
    svcs: Pattern 'dcs' doesn't match any instances
    STATE STIME FMRI
    # svccfg -v import /var/svc/manifest/platform/sun4u/dcs.xml
    svccfg: Taking "initial" snapshot for svc:/platform/sun4u/dcs:default.
    svccfg: Taking "last-import" snapshot for svc:/platform/sun4u/dcs:default.
    svccfg: Refreshed svc:/platform/sun4u/dcs:default.
    svccfg: Successful import.
    # svcs dcs
    STATE STIME FMRI
    disabled 11:03:04 svc:/platform/sun4u/dcs:default
    # svcadm enable dcs
    # svcs dcs
    STATE STIME FMRI
    online 11:03:20 svc:/platform/sun4u/dcs:default
    # svcs -p dcs
    STATE STIME FMRI
    online 11:03:20 svc:/platform/sun4u/dcs:default
                   11:03:20 717 dcs
    # svcprop dcs
    general/enabled boolean false
    general/entity_stability astring Unstable
    dcs/ah_auth astring md5
    [...]

Note that when no general/restarter property is specified, the default restarter, startd, is used.
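When checking several domains, it can be convenient to extract just the service state from the svcs output. The following is a minimal sketch, assuming the three-column STATE/STIME/FMRI layout shown above. The function name dcs_state is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `svcs dcs` output on stdin and prints the STATE
# column (for example online, disabled, or maintenance) of the dcs instance.
dcs_state() {
    awk '$NF == "svc:/platform/sun4u/dcs:default" { print $1 }'
}

# Example usage on the domain:
#   state=$(svcs dcs | dcs_state)
#   [ "$state" = "online" ] || echo "dcs is not online (state: ${state:-unknown})"
```

If the service is not registered at all, the function prints nothing, which the usage example reports as "unknown".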
Note: In certain instances this workaround is not the complete fix. On some systems an inetconv command has been run, which creates two services named sun-dr. These services prevent the DCS service from starting even after the above workaround has been followed.
To check for this condition:

    # svcs -xv
    svc:/platform/sun4u/dcs:default (domain configuration server)
     State: maintenance since Thu Nov 15 19:16:38 2007
    Reason: Restarter svc:/network/inetd:default gave no explanation.
       See: http://sun.com/msg/SMF-8000-9C
       See: man -M /usr/share/man -s 1M dcs
    Impact: This service is not running.

    # svcs -a | grep sun-dr
    online - 19:14:48 - svc:/network/sun-dr/tcp6:default
    online - 19:14:48 - svc:/network/sun-dr/tcp:default

To clear this condition:

    1. Remove the 2 sun-dr lines from /etc/inetd.conf
    2. svcadm disable svc:/network/sun-dr/tcp:default
    3. svcadm disable svc:/network/sun-dr/tcp6:default
    4. svccfg delete -f svc:/network/sun-dr/tcp:default
    5. svccfg delete -f svc:/network/sun-dr/tcp6:default
    6. rm /var/svc/manifest/network/sun-dr-tcp.xml
    7. rm /var/svc/manifest/network/sun-dr-tcp6.xml
    8. svcadm disable svc:/platform/sun4u/dcs
    9. svccfg delete -f svc:/platform/sun4u/dcs
    10. svccfg -v import /var/svc/manifest/platform/sun4u/dcs.xml
    11. svcadm enable svc:/platform/sun4u/dcs

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in an appropriate My Oracle Support Community - Oracle Sun Technologies Community.
References

<BUG:15351311> - SUNBT6472374 DCS FAILS TO START AFTER AN UPGRADE TO SOLARIS10U2
<BUG:15345596> - SUNBT6453706-SOLARIS_10U3 THE SVC:/PLATFORM/SUN4U/DCS SERVICE SHOULD DEPEND ON S

Attachments

This solution has no attachment