Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1004922.1
Update Date:2017-06-06
Keywords:

Solution Type  Technical Instruction Sure

Solution  1004922.1 :   Sun Fire[TM] servers: Troubleshooting RCM failure events in DR operations


Related Items
  • Sun Fire 4810 Server
  • Sun Fire 3800 Server
  • Sun Netra 1290 Server
  • Sun Fire 6800 Server
  • Sun Fire E6900 Server
  • Sun Fire V1280 Server
  • Sun Fire 4800 Server
  • Sun Fire E2900 Server
  • Sun Fire E4900 Server
  • Sun Netra 1280 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: SF-x8x0/Ex900
  • _Old GCS Categories>Sun Microsystems>Servers>Entry-Level Servers
  • _Old GCS Categories>Sun Microsystems>Servers>Midrange Servers
  • _Old GCS Categories>Sun Microsystems>Servers>Midrange V and Netra Servers

PreviouslyPublishedAs
206907


Applies to:

Sun Fire 4810 Server - Version Not Applicable and later
Sun Fire 6800 Server - Version Not Applicable and later
Sun Netra 1280 Server - Version Not Applicable and later
Sun Netra 1290 Server - Version Not Applicable and later
Sun Fire V1280 Server - Version Not Applicable and later
All Platforms

Goal

This document describes how to troubleshoot RCM failure events that occur while performing Dynamic Reconfiguration (DR) operations.

 

Solution


Dynamic reconfiguration (DR), which is provided as part of the Solaris[TM]
Operating Environment, enables you to safely add and remove CPU/Memory
boards and I/O assemblies while the system is still running. DR controls
the software aspects of dynamically changing the hardware used by a domain,
with minimal disruption to user processes running in the domain.

The DR software uses the cfgadm command, which is a command-line interface
for configuration administration. Specifically, the cfgadm_sbd plugin
provides dynamic reconfiguration functionality for connecting,
configuring, unconfiguring, and disconnecting class sbd system boards.
For example, on a platform employing UltraSPARC(R) III / UltraSPARC(R) III+ CPUs :

# cfgadm -s "select=class(sbd)"
Ap_Id Type Receptacle Occupant Condition
N0.IB8 PCI_I/O_Boa connected configured ok
N0.SB2 CPU_V2 connected configured ok
N0.SB4 CPU_V2 connected configured ok

On a platform employing UltraSPARC(R) IV CPUs :

# cfgadm -s "select=class(sbd)"
Ap_Id Type Receptacle Occupant Condition
N0.IB6 PCI_I/O_Boa connected configured ok
N0.SB0 CPU_V3 connected configured ok
N0.SB2 CPU_V3 connected configured ok
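
When board state has to be checked from a script, listings like the ones above are easy to split into fields. The following is a minimal Python sketch, assuming the five whitespace-separated columns shown in the sample output; the class and function names are illustrative and not part of cfgadm.

```python
from typing import List, NamedTuple


class AttachmentPoint(NamedTuple):
    ap_id: str
    type: str
    receptacle: str
    occupant: str
    condition: str


def parse_cfgadm_listing(text: str) -> List[AttachmentPoint]:
    """Parse cfgadm listing output into records, skipping the header line."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: every data line has exactly the five columns shown above.
        if len(fields) != 5 or fields[0] == "Ap_Id":
            continue
        points.append(AttachmentPoint(*fields))
    return points


# Sample text copied from the UltraSPARC IV listing above.
SAMPLE = """\
Ap_Id Type Receptacle Occupant Condition
N0.IB6 PCI_I/O_Boa connected configured ok
N0.SB0 CPU_V3 connected configured ok
N0.SB2 CPU_V3 connected configured ok
"""

boards = parse_cfgadm_listing(SAMPLE)
cpu_boards = [b.ap_id for b in boards if b.type.startswith("CPU")]
print(cpu_boards)
```

Filtering on the Type column, as in the last lines, is one way to pick out the CPU/Memory boards that are candidates for a DR detach.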

Inherent to the above DR architecture is the Reconfiguration Coordination Manager (RCM), a framework that lets 'external' software applications interact with DR operations; that is, RCM coordinates device consumers during Solaris[TM] Dynamic Reconfiguration (DR).

The purpose of this document is to detail a procedure by which a user can troubleshoot RCM failure events during a DR operation, i.e., collect the appropriate data on the RCM fault that is causing the DR operation to fail.

The RCM interface allows device consumers, such as application vendors or site administrators, to act before and after DR operations take place by providing RCM scripts. For example, RCM scripts can be used to shut down applications, or to cleanly release the devices from your applications during dynamic remove operations.

An RCM script is an executable Perl script, a shell script, or a binary.
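
The way such a script answers the RCM framework can be sketched roughly as follows. This is an illustration only: Python is used here just to model the logic, and the command names, name=value output lines and exit statuses are written from memory of the rcmscript(4) man page, so they should be checked against that page on the target release. The resource name /dev/rmt/0, the script name and the busy check are hypothetical.

```python
# Hypothetical resource, for illustration only.
RESOURCE = "/dev/rmt/0"


def app_is_busy() -> bool:
    """Placeholder: a real script would check whether the application holds the device."""
    return False


def handle(argv, busy=app_is_busy):
    """Return (exit_status, output_lines) for one RCM script invocation.

    Assumed protocol (verify against rcmscript(4)): the command is the first
    argument, results are printed as name=value lines, and the exit status is
    0 for success, 2 for an unsupported command, 3 for a refusal.
    """
    command = argv[1] if len(argv) > 1 else ""
    if command == "scriptinfo":
        return 0, ["rcm_script_version=1",
                   "rcm_script_func_info=Example DR script (illustrative)"]
    if command == "register":
        return 0, ["rcm_resource_name=" + RESOURCE]
    if command == "resourceinfo":
        return 0, ["rcm_resource_usage_info=Used by the example application"]
    if command == "queryremove":
        if busy():
            return 3, ["rcm_failure_reason=example application is using the device"]
        return 0, []
    if command == "preremove":
        # A real script would release the device here (e.g. stop the application).
        return 0, []
    return 2, []


# Simulate the framework asking whether the resource may be removed while in use.
status, lines = handle(["example,dr_script", "queryremove", RESOURCE],
                       busy=lambda: True)
print(status, lines)
```

A refusal of this kind is one way an RCM client can cause a DR request to fail, which is why the RCM side is worth examining when cfgadm reports a library error.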
A simple example of an RCM fault that causes a DR operation to fail might be as follows:

# cfgadm -s "select=class(sbd)"
Ap_Id Type Receptacle Occupant Condition
N0.IB6 PCI_I/O_Boa connected configured ok
N0.SB0 CPU_V3 connected configured ok
N0.SB2 CPU_V3 connected configured ok

# cfgadm -v -c disconnect N0.SB2
request delete capacity (8 cpus)
notify add capacity (8 cpus)
cfgadm: Library error: RCM request delete capacity failed for N0.SB2
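
If cfgadm is driven from a script, the failure shown above can be recognised from its final line. The following Python sketch assumes the message always has the exact wording seen in this transcript; the function name is illustrative.

```python
import re

# Pattern modelled on the single error line shown in this document.
RCM_ERROR = re.compile(
    r"^cfgadm: Library error: RCM (?P<operation>.+?) failed for (?P<ap_id>\S+)$"
)


def find_rcm_failure(output: str):
    """Return (operation, ap_id) for the first RCM library error, or None."""
    for line in output.splitlines():
        match = RCM_ERROR.match(line.strip())
        if match:
            return match.group("operation"), match.group("ap_id")
    return None


# Output copied from the failing disconnect above.
SAMPLE_OUTPUT = """\
request delete capacity (8 cpus)
notify add capacity (8 cpus)
cfgadm: Library error: RCM request delete capacity failed for N0.SB2
"""

failure = find_rcm_failure(SAMPLE_OUTPUT)
print(failure)
```

The returned operation name ("request delete capacity" here) identifies which RCM request was rejected or could not be completed.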

A useful approach to troubleshooting the above RCM fault and the corresponding DR failure is as follows:

  • The libcfgadm plugin for system board (slot0) DR, the cfgadm_sbd plugin (which resides in /usr/platform/sun4u/lib/cfgadm), provides DR functionality for connecting, configuring, unconfiguring and disconnecting class sbd system boards. It also enables you to connect or disconnect a system board from a running system without having to reboot the system. SBD plugin debugging is enabled by the environment variable SBD_DEBUG. In general, the value assigned to the variable is the name of the file to which debugging information is written; if no value is assigned, debug data is written to stdout.
  • Together with the debug data returned by the SBD plugin, it is also useful to start the rcm_daemon in debug mode, in a separate window, with the following command:


/usr/lib/rcm/rcm_daemon -d100
Note: Any rcm_daemon process that is already running must be killed before the above rcm_daemon command is executed.
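
The two data-collection steps above can be prepared from a script along these lines. This is a sketch under stated assumptions: it only builds the command lines and environment and does not run anything, the output path /var/tmp/sbd_debug.out is a hypothetical choice, and the function name is illustrative.

```python
import os


def sbd_debug_invocation(ap_id, debug_file=""):
    """Build (argv, env) for a verbose disconnect with SBD plugin debugging on.

    An empty SBD_DEBUG value corresponds to the "no value assigned" case
    described above, where debug data goes to stdout.
    """
    env = dict(os.environ)
    env["SBD_DEBUG"] = debug_file
    argv = ["cfgadm", "-v", "-c", "disconnect", ap_id]
    return argv, env


# rcm_daemon in debug mode, exactly as given in the text above.
RCM_DAEMON_DEBUG = ["/usr/lib/rcm/rcm_daemon", "-d100"]

# Hypothetical debug file location, for illustration only.
argv, env = sbd_debug_invocation("N0.SB2", "/var/tmp/sbd_debug.out")
print(argv, env["SBD_DEBUG"])
```

On the affected domain the pair could be passed to subprocess.run(argv, env=env), with the rcm_daemon command started beforehand in a separate window after any existing rcm_daemon has been killed.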
The following walks through the above troubleshooting approach for an RCM fault that causes a DR failure:

# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
N0.IB6 PCI_I/O_Boa connected configured ok
N0.IB6::pci0 io connected configured ok
N0.IB6::pci1 io connected configured ok
N0.IB6::pci2 io connected configured ok
N0.IB6::pci3 io connected configured ok
N0.SB0 CPU_V3 connected configured ok
N0.SB0::cpu0 cpu connected configured ok
N0.SB0::cpu1 cpu connected configured ok
N0.SB0::cpu2 cpu connected configured ok
N0.SB0::cpu3 cpu connected configured ok
N0.SB0::memory memory connected configured ok
N0.SB2 CPU_V3 connected configured ok
N0.SB2::cpu0 cpu connected configured ok
N0.SB2::cpu1 cpu connected configured ok
N0.SB2::cpu2 cpu connected configured ok
N0.SB2::cpu3 cpu connected configured ok
N0.SB2::memory memory connected configured ok
N0.SB4 unknown empty unconfigured unknown
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 disk connected configured unknown
c0::dsk/c0t6d0 CD-ROM connected configured unknown
c1 scsi-bus connected unconfigured unknown

# cfgadm -v -c disconnect N0.SB2
request delete capacity (8 cpus)
notify add capacity (8 cpus)
cfgadm: Library error: RCM request delete capacity failed for N0.SB2

After the above DR failure, apply the troubleshooting approach documented above:

1. SBD plugin data --

# setenv SBD_DEBUG

# cfgadm -v -c disconnect N0.SB2
Debug started, pid=1535
path=</devices/ssm@0,0:N0.SB2> drv=<ssm> inst=0 minor=<N0.SB2> target=<N0.SB2>
cid=<> cname=<> cnum=-1
tgt=1 opts=80000000
ap_stat(/devices/ssm@0,0:N0.SB2)
open(/devices/ssm@0,0:N0.SB2)
ioctl(3 SBD_CMD_GETNCM, 0x26c38)
ncm(0)=5
ncm=5
ioctl(3 SBD_CMD_STATUS, sc=0x27080 sz=5892 flags=2)
ap_stat()=0
tgt=1
ap_rcm_init(267b0)
Looking for /usr/lib/librcm.so
/usr/lib/librcm.so found
ap_capinfo(267b0)
ap_cm_capacity(0)=(8 520 5)
ap_cm_capacity(1)=(9 521 5)
ap_cm_capacity(2)=(10 522 5)
ap_cm_capacity(3)=(11 523 5)
ap_cm_capacity(4)=(2097152 5)
cmd=disconnect(13) tmask=0x2 cmask=0x2 omask=0x80000189
ap_seq(3, 5, 13, ffbff01c, ffbff018) = (7, 15)
exec suspend check
ap_ioc(8)
ap_ioc(8)=0x0
ap_ioc(9)
ap_ioc(9)=0x0
ap_ioc(10)
ap_ioc(10)=0x0
ap_ioc(11)
ap_ioc(11)=0x445208
ap_ioc(12)
ap_ioc(12)=0x0
ap_ioc(13)
ap_ioc(13)=0x445209
ap_ioc(14)
ap_ioc(14)=0x445204
ap_ioc(15)
ap_ioc(15)=0x445202
exec request suspend
exec request delete capacity
ap_rcm_ctl(267b0)
ap_rcm_request_cap(267b0)
ap_rcm_cap_cpu(267b0)
getsyscpuids
syscpuids: 0 1 2 3 512 513 514 515 8 520 9 521 10 522 11 523
oldcpuids: 0 1 2 3 512 513 514 515 8 520 9 521 10 522 11 523
change : 8 520 9 521 10 522 11 523
newcpuids: 0 1 2 3 512 513 514 515
ap_msg(267b0)
<0><request delete capacity><(8 cpus)>
request delete capacity (8 cpus)
ap_err(267b0)
<request delete capacity><N0.SB2>ap_rcm_info(267b0)
<Interrupted system call><><>
ap_seq_exec: rcm_cap_del failed
Sequencing recovery: first = 6, last = 6
exec notify add capacity
ap_rcm_ctl(267b0)
ap_rcm_notify_cap(267b0)
ap_capinfo(267b0)
ap_cm_capacity(0)=(8 520 5)
ap_cm_capacity(1)=(9 521 5)
ap_cm_capacity(2)=(10 522 5)
ap_cm_capacity(3)=(11 523 5)
ap_cm_capacity(4)=(2097152 5)
cm=0 valid=1 type=5, prevos=5 os=5
cm=1 valid=1 type=5, prevos=5 os=5
cm=2 valid=1 type=5, prevos=5 os=5
cm=3 valid=1 type=5, prevos=5 os=5
cm=4 valid=1 type=3, prevos=5 os=5
ap_rcm_cap_cpu(267b0)
getsyscpuids
syscpuids: 0 1 2 3 512 513 514 515 8 520 9 521 10 522 11 523
ap_rcm_cap_cpu: CPU capacity, old = 8, new = 16
oldcpuids: 0 1 2 3 512 513 514 515
change : 8 520 9 521 10 522 11 523
newcpuids: 0 1 2 3 512 513 514 515 8 520 9 521 10 522 11 523
ap_msg(267b0)
<0><notify add capacity><(8 cpus)>
notify add capacity (8 cpus)
ap_err(267b0)
recovery complete!
ap_rcm_fini(267b0)
cfgadm: Library error: RCM request delete capacity failed for N0.SB2
From the above, we can clearly observe that the following fault caused
the DR detach failure:
ap_seq_exec: rcm_cap_del failed <--
i.e., the DR failure stems from an RCM fault: the internal library call
that asks the RCM framework to delete the board's capacity from the
current capacity failed.
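
In a long SBD debug trace the failing step can be located mechanically. The following Python sketch assumes the line formats seen in the trace above ("exec <step>", an error text in the form "<text><><>", and "ap_seq_exec: ... failed"); the function name is illustrative.

```python
import re


def find_failed_step(trace: str):
    """Return (exec_step, error_text, failure_line) for the first failure, or None."""
    current_step = None
    error_text = None
    for raw in trace.splitlines():
        line = raw.strip()
        if line.startswith("exec "):
            # Remember the most recent sequencing step.
            current_step = line[len("exec "):]
            continue
        match = re.match(r"^<([^<>]+)><><>$", line)
        if match:
            # Error text reported just before the failure line.
            error_text = match.group(1)
            continue
        if re.match(r"^ap_seq_exec: .* failed$", line):
            return current_step, error_text, line
    return None


# Excerpt copied from the SBD_DEBUG trace above.
TRACE_EXCERPT = """\
exec request suspend
exec request delete capacity
ap_rcm_ctl(267b0)
ap_rcm_request_cap(267b0)
<0><request delete capacity><(8 cpus)>
request delete capacity (8 cpus)
ap_err(267b0)
<request delete capacity><N0.SB2>ap_rcm_info(267b0)
<Interrupted system call><><>
ap_seq_exec: rcm_cap_del failed
Sequencing recovery: first = 6, last = 6
exec notify add capacity
"""

result = find_failed_step(TRACE_EXCERPT)
print(result)
```

For the excerpt this reports the "request delete capacity" step together with the "Interrupted system call" text logged just before the failure line.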

2. rcm_daemon debug data --


Based on the SBD plugin debug information acquired above, we can reasonably assume that the RCM framework itself is causing the DR failure. The
next phase of data collection is to confirm that no rcm_daemon is already running (in the ps output below only the grep itself is listed) and then to start the daemon in debug mode:

# ps -ef|grep -i rcm_daemon
root 1547 907 0 11:45:57 console 0:00 grep -i rcm_daemon

# /usr/lib/rcm/rcm_daemon -d100
enter_daemon_lock: lock file = /var/run/rcm_daemon_lock
rcm_daemon started, debug level = 100
rcmd_db_init(): initialize database
rn_alloc(/, 0)
search directory /usr/lib/rcm/modules/
cli_module_hold(SUNW_cluster_rcm.so)
module_load(name=SUNW_cluster_rcm.so)
module_attach(name=SUNW_cluster_rcm.so)
cli_module_rele(name=SUNW_cluster_rcm.so)
cli_module_hold(SUNW_dump_rcm.so)
module_load(name=SUNW_dump_rcm.so)
module_attach(name=SUNW_dump_rcm.so)
add_resource_client(SUNW_dump_rcm.so, /dev/dsk/c0t0d0s1, 0, 0x1000)
rn_node_find(/dev/dsk/c0t0d0s1, 0x1)
rn_find_child(parent=/, child=SYSTEM, 0x1, 0)
rn_alloc(SYSTEM, 0)
rn_find_child(parent=SYSTEM, child=devices, 0x1, 1)
rn_alloc(devices, 1)
rn_find_child(parent=devices, child=ssm@0,0, 0x1, 1)
rn_alloc(ssm@0,0, 1)
rn_find_child(parent=ssm@0,0, child=pci@18,700000, 0x1, 1)
rn_alloc(pci@18,700000, 1)
rn_find_child(parent=pci@18,700000, child=pci@1, 0x1, 1)
rn_alloc(pci@1, 1)
rn_find_child(parent=pci@1, child=scsi@2, 0x1, 1)
rn_alloc(scsi@2, 1)
rn_find_child(parent=scsi@2, child=sd@0,0, 0x1, 1)
rn_alloc(sd@0,0, 1)
rn_find_child(parent=sd@0,0, child=b, 0x1, 1)
rn_alloc(b, 1)
rsrc_client_find(SUNW_dump_rcm.so, 0, 30f70)
rsrc_node_add_user(b, /dev/dsk/c0t0d0s1, SUNW_dump_rcm.so, 0, 0x1000)
rsrc_client_find(SUNW_dump_rcm.so, 0, 30f70)
rsrc_client_alloc(/dev/dsk/c0t0d0s1, SUNW_dump_rcm.so, 0)
cli_module_hold(SUNW_dump_rcm.so)
rsrc_client_add: /dev/dsk/c0t0d0s1, SUNW_dump_rcm.so, 0
registered /dev/dsk/c0t0d0s1
cli_module_rele(name=SUNW_dump_rcm.so)
cli_module_hold(SUNW_filesys_rcm.so)
module_load(name=SUNW_filesys_rcm.so)
module_attach(name=SUNW_filesys_rcm.so)
FILESYS: register()
FILESYS: registering /dev/dsk/c0t0d0s0
add_resource_client(SUNW_filesys_rcm.so, /dev/dsk/c0t0d0s0, 0, 0x1000)
rn_node_find(/dev/dsk/c0t0d0s0, 0x1)
rn_find_child(parent=/, child=SYSTEM, 0x1, 0)
rn_find_child(parent=SYSTEM, child=devices, 0x1, 1)
rn_find_child(parent=devices, child=ssm@0,0, 0x1, 1)
rn_find_child(parent=ssm@0,0, child=pci@18,700000, 0x1, 1)
rn_find_child(parent=pci@18,700000, child=pci@1, 0x1, 1)
rn_find_child(parent=pci@1, child=scsi@2, 0x1, 1)
rn_find_child(parent=scsi@2, child=sd@0,0, 0x1, 1)
rn_find_child(parent=sd@0,0, child=a, 0x1, 1)
rn_alloc(a, 1)
rsrc_client_find(SUNW_filesys_rcm.so, 0, 31010)
rsrc_node_add_user(a, /dev/dsk/c0t0d0s0, SUNW_filesys_rcm.so, 0, 0x1000)
rsrc_client_find(SUNW_filesys_rcm.so, 0, 31010)
rsrc_client_alloc(/dev/dsk/c0t0d0s0, SUNW_filesys_rcm.so, 0)
cli_module_hold(SUNW_filesys_rcm.so)
rsrc_client_add: /dev/dsk/c0t0d0s0, SUNW_filesys_rcm.so, 0
cli_module_rele(name=SUNW_filesys_rcm.so)
cli_module_hold(SUNW_ip_rcm.so)
module_load(name=SUNW_ip_rcm.so)
IP: mod_init
module_attach(name=SUNW_ip_rcm.so)
IP: register
IP: update_cache
IP: scanning IPv4 interfaces
IP: update_ipifs
IP: update_pif(lo0)
IP: DLPI style2 (lo0)
IP: if ignored (lo0)
IP: update_pif(ce0)
IP: DLPI style2 (ce0)
IP: cache lookup(SUNW_network/ce0)
IP: adding lifs to ce0
IP: update_pif: (SUNW_network/ce0) success
IP: scanning IPv6 interfaces
IP: update_ipifs
IP: update_pif(lo0)
IP: DLPI style2 (lo0)
IP: if ignored (lo0)
IP: update_pif(ce0)
IP: DLPI style2 (ce0)
IP: cache lookup(SUNW_network/ce0)
IP: cache lookup success(SUNW_network/ce0)
IP: adding lifs to ce0
IP: update_pif: (SUNW_network/ce0) success
add_resource_client(SUNW_ip_rcm.so, SUNW_network/ce0, 0, 0x1000)
rn_node_find(SUNW_network/ce0, 0x1)
rn_find_child(parent=/, child=ABSTRACT, 0x1, 0)
rn_alloc(ABSTRACT, 0)
rn_find_child(parent=ABSTRACT, child=SUNW_network, 0x1, 3)
rn_alloc(SUNW_network, 3)
rn_find_child(parent=SUNW_network, child=ce0, 0x1, 3)
rn_alloc(ce0, 3)
rsrc_client_find(SUNW_ip_rcm.so, 0, 310f0)
rsrc_node_add_user(ce0, SUNW_network/ce0, SUNW_ip_rcm.so, 0, 0x1000)
rsrc_client_find(SUNW_ip_rcm.so, 0, 310f0)
rsrc_client_alloc(SUNW_network/ce0, SUNW_ip_rcm.so, 0)
cli_module_hold(SUNW_ip_rcm.so)
rsrc_client_add: SUNW_network/ce0, SUNW_ip_rcm.so, 0
IP: registered SUNW_network/ce0
add_resource_client(SUNW_ip_rcm.so, SUNW_event/resource/new/network, 0, 0x2000)
rn_node_find(SUNW_event/resource/new/network, 0x1)
rn_find_child(parent=/, child=ABSTRACT, 0x1, 0)
rn_find_child(parent=ABSTRACT, child=SUNW_event, 0x1, 3)
rn_alloc(SUNW_event, 3)
rn_find_child(parent=SUNW_event, child=resource, 0x1, 3)
rn_alloc(resource, 3)
rn_find_child(parent=resource, child=new, 0x1, 3)
rn_alloc(new, 3)
rn_find_child(parent=new, child=network, 0x1, 3)
rn_alloc(network, 3)
rsrc_client_find(SUNW_ip_rcm.so, 0, 31170)
rsrc_node_add_user(network, SUNW_event/resource/new/network,
SUNW_ip_rcm.so, 0, 0x2000)
rsrc_client_find(SUNW_ip_rcm.so, 0, 31170)
rsrc_client_alloc(SUNW_event/resource/new/network, SUNW_ip_rcm.so, 0)
cli_module_hold(SUNW_ip_rcm.so)
rsrc_client_add: SUNW_event/resource/new/network, SUNW_ip_rcm.so, 0
IP: registered SUNW_event/resource/new/network
cli_module_rele(name=SUNW_ip_rcm.so)
cli_module_hold(SUNW_mpxio_rcm.so)
module_load(name=SUNW_mpxio_rcm.so)
MPXIO: rcm_mod_init()
module_attach(name=SUNW_mpxio_rcm.so)
MPXIO: register()
MPXIO: found 0 clients.
cli_module_rele(name=SUNW_mpxio_rcm.so)
cli_module_hold(SUNW_network_rcm.so)
module_load(name=SUNW_network_rcm.so)
module_attach(name=SUNW_network_rcm.so)
add_resource_client(SUNW_network_rcm.so, SUNW_resource/new, 0, 0x2000)
rn_node_find(SUNW_resource/new, 0x1)
rn_find_child(parent=/, child=ABSTRACT, 0x1, 0)
rn_find_child(parent=ABSTRACT, child=SUNW_resource, 0x1, 3)
rn_alloc(SUNW_resource, 3)
rn_find_child(parent=SUNW_resource, child=new, 0x1, 3)
rn_alloc(new, 3)
rsrc_client_find(SUNW_network_rcm.so, 0, 31250)
rsrc_node_add_user(new, SUNW_resource/new, SUNW_network_rcm.so, 0, 0x2000)
rsrc_client_find(SUNW_network_rcm.so, 0, 31250)
rsrc_client_alloc(SUNW_resource/new, SUNW_network_rcm.so, 0)
cli_module_hold(SUNW_network_rcm.so)
rsrc_client_add: SUNW_resource/new, SUNW_network_rcm.so, 0
NET: /devices/ssm@0,0/pci@18,700000/pci@1/network@0 is new resource
NET: /devices/ssm@0,0/pci@18,700000/pci@1/network@1 is new resource
NET: ignoring pseudo device /pseudo/clone@0
NET: ignoring pseudo device /pseudo/clone@0
NET: registering /devices/ssm@0,0/pci@18,700000/pci@1/network@1
add_resource_client(SUNW_network_rcm.so,
/devices/ssm@0,0/pci@18,700000/pci@1/network@1, 0, 0x1000)
rn_node_find(/devices/ssm@0,0/pci@18,700000/pci@1/network@1, 0x1)
rn_find_child(parent=/, child=SYSTEM, 0x1, 0)
rn_find_child(parent=SYSTEM, child=devices, 0x1, 1)
rn_find_child(parent=devices, child=ssm@0,0, 0x1, 1)
rn_find_child(parent=ssm@0,0, child=pci@18,700000, 0x1, 1)
rn_find_child(parent=pci@18,700000, child=pci@1, 0x1, 1)
rn_find_child(parent=pci@1, child=network@1, 0x1, 1)
rn_alloc(network@1, 1)
rsrc_client_find(SUNW_network_rcm.so, 0, 312f0)
rsrc_node_add_user(network@1,
/devices/ssm@0,0/pci@18,700000/pci@1/network@1, SUNW_network_rcm.so, 0, 0x1000)
rsrc_client_find(SUNW_network_rcm.so, 0, 312f0)
rsrc_client_alloc(/devices/ssm@0,0/pci@18,700000/pci@1/network@1,
SUNW_network_rcm.so, 0)
cli_module_hold(SUNW_network_rcm.so)
rsrc_client_add: /devices/ssm@0,0/pci@18,700000/pci@1/network@1,
SUNW_network_rcm.so, 0
NET: registered /devices/ssm@0,0/pci@18,700000/pci@1/network@1 (as
SUNW_network/ce1)
NET: registering /devices/ssm@0,0/pci@18,700000/pci@1/network@0
add_resource_client(SUNW_network_rcm.so,
/devices/ssm@0,0/pci@18,700000/pci@1/network@0, 0, 0x1000)
rn_node_find(/devices/ssm@0,0/pci@18,700000/pci@1/network@0, 0x1)
rn_find_child(parent=/, child=SYSTEM, 0x1, 0)
rn_find_child(parent=SYSTEM, child=devices, 0x1, 1)
rn_find_child(parent=devices, child=ssm@0,0, 0x1, 1)
rn_find_child(parent=ssm@0,0, child=pci@18,700000, 0x1, 1)
rn_find_child(parent=pci@18,700000, child=pci@1, 0x1, 1)
rn_find_child(parent=pci@1, child=network@0, 0x1, 1)
rn_alloc(network@0, 1)
rsrc_client_find(SUNW_network_rcm.so, 0, 31310)
rsrc_node_add_user(network@0,
/devices/ssm@0,0/pci@18,700000/pci@1/network@0, SUNW_network_rcm.so, 0, 0x1000)
rsrc_client_find(SUNW_network_rcm.so, 0, 31310)
rsrc_client_alloc(/devices/ssm@0,0/pci@18,700000/pci@1/network@0,
SUNW_network_rcm.so, 0)
cli_module_hold(SUNW_network_rcm.so)
rsrc_client_add: /devices/ssm@0,0/pci@18,700000/pci@1/network@0,
SUNW_network_rcm.so, 0
NET: registered /devices/ssm@0,0/pci@18,700000/pci@1/network@0 (as
SUNW_network/ce0)
cli_module_rele(name=SUNW_network_rcm.so)
cli_module_hold(SUNW_swap_rcm.so)
module_load(name=SUNW_swap_rcm.so)
module_attach(name=SUNW_swap_rcm.so)
add_resource_client(SUNW_swap_rcm.so, /dev/dsk/c0t0d0s1, 0, 0x1000)
rn_node_find(/dev/dsk/c0t0d0s1, 0x1)
rn_find_child(parent=/, child=SYSTEM, 0x1, 0)
rn_find_child(parent=SYSTEM, child=devices, 0x1, 1)
rn_find_child(parent=devices, child=ssm@0,0, 0x1, 1)
rn_find_child(parent=ssm@0,0, child=pci@18,700000, 0x1, 1)
rn_find_child(parent=pci@18,700000, child=pci@1, 0x1, 1)
rn_find_child(parent=pci@1, child=scsi@2, 0x1, 1)
rn_find_child(parent=scsi@2, child=sd@0,0, 0x1, 1)
rn_find_child(parent=sd@0,0, child=b, 0x1, 1)
rsrc_client_find(SUNW_swap_rcm.so, 0, 30f70)
rsrc_node_add_user(b, /dev/dsk/c0t0d0s1, SUNW_swap_rcm.so, 0, 0x1000)
rsrc_client_find(SUNW_swap_rcm.so, 0, 30f70)
rsrc_client_alloc(/dev/dsk/c0t0d0s1, SUNW_swap_rcm.so, 0)
cli_module_hold(SUNW_swap_rcm.so)
rsrc_client_add: /dev/dsk/c0t0d0s1, SUNW_swap_rcm.so, 0
registered /dev/dsk/c0t0d0s1
cli_module_rele(name=SUNW_swap_rcm.so)
cli_module_hold(SUNW_ttymux_rcm.so)
module_load(name=SUNW_ttymux_rcm.so)
TTYMUX: mod_init:
no node for ttymux
module_attach(name=SUNW_ttymux_rcm.so)
cli_module_rele(name=SUNW_ttymux_rcm.so)
cli_module_hold(SUNW_svm_rcm.so)
module_load(name=SUNW_svm_rcm.so)
SVM: cache_all_devices,max sets = 4
SVM: cache_all_devices_in_set
SVM: cache_all_devices no set: setno 1
SVM: exit cache_all_devices
module_attach(name=SUNW_svm_rcm.so)
SVM: register
cli_module_rele(name=SUNW_svm_rcm.so)
cli_module_hold(SUNW_pool_rcm.so)
module_load(name=SUNW_pool_rcm.so)
Segmentation Fault (core dumped)


Two observations can reasonably be made from the above data:

a. The rcm_daemon dumped core on a segmentation fault (SEGV), and this caused the DR detach failure;

b. The most likely cause of the SEGV is indicated by the last lines of the trace, where the daemon dies right after the SVM RCM module registers and is released, as the next module (SUNW_pool_rcm.so) is loaded:

SVM: register
cli_module_rele(name=SUNW_svm_rcm.so)
cli_module_hold(SUNW_pool_rcm.so)
module_load(name=SUNW_pool_rcm.so)
Segmentation Fault (core dumped)
i.e., the Solaris[TM] Volume Manager (SVM) RCM module is most likely responsible
for the rcm_daemon core dump.
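
To see where in the module-loading sequence the daemon stopped, the debug output can be scanned for the last module released and for any module whose load was not followed by an attach. The following Python sketch assumes the module_load, module_attach and cli_module_rele line formats shown in the trace; what those calls do internally is inferred from their names only. It reports the position in the trace and does not by itself identify the faulty module, which is what the core file is for.

```python
import re


def modules_at_crash(log: str):
    """Return the last released module and the module still loading at the SEGV."""
    loading = None
    last_released = None
    for raw in log.splitlines():
        line = raw.strip()
        match = re.match(r"^module_load\(name=([^)]+)\)$", line)
        if match:
            loading = match.group(1)
            continue
        match = re.match(r"^module_attach\(name=([^)]+)\)$", line)
        if match:
            if match.group(1) == loading:
                loading = None  # load completed, module attached
            continue
        match = re.match(r"^cli_module_rele\(name=([^)]+)\)$", line)
        if match:
            last_released = match.group(1)
            continue
        if line.startswith("Segmentation Fault"):
            return {"last_released": last_released, "loading": loading}
    return None


# Tail of the rcm_daemon -d100 output shown above.
LOG_TAIL = """\
cli_module_hold(SUNW_svm_rcm.so)
module_load(name=SUNW_svm_rcm.so)
SVM: cache_all_devices,max sets = 4
module_attach(name=SUNW_svm_rcm.so)
SVM: register
cli_module_rele(name=SUNW_svm_rcm.so)
cli_module_hold(SUNW_pool_rcm.so)
module_load(name=SUNW_pool_rcm.so)
Segmentation Fault (core dumped)
"""

state = modules_at_crash(LOG_TAIL)
print(state)
```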


In conclusion, the pertinent information for any further investigation would include:

the SBD plugin debug data and the rcm_daemon debug data;
and
the core file from the rcm_daemon SEGV, i.e.,
# pwd
/usr/lib/rcm
# file core
core: ELF 32-bit MSB core file SPARC Version 1, from 'rcm_daemon'
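
Where the file command is not at hand, the same basic facts can be read from the first bytes of the core file. The following Python sketch relies on the standard ELF header layout (class and byte order in e_ident, followed by e_type and e_machine); the function name is illustrative and the sample bytes are synthetic, not taken from a real core file.

```python
import struct

ET_CORE = 4    # ELF e_type value for a core file
EM_SPARC = 2   # ELF e_machine value for 32-bit SPARC


def describe_elf(header: bytes):
    """Summarise the first 20 bytes of an ELF file, or return None if not ELF."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    ei_class, ei_data = header[4], header[5]
    endian = ">" if ei_data == 2 else "<"
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": {1: 32, 2: 64}.get(ei_class),
        "byte_order": {1: "LSB", 2: "MSB"}.get(ei_data),
        "is_core": e_type == ET_CORE,
        "is_sparc": e_machine == EM_SPARC,
    }


# Synthetic header: 32-bit, MSB, version 1, padding, then ET_CORE / EM_SPARC.
sample = (b"\x7fELF" + bytes([1, 2, 1]) + bytes(9)
          + struct.pack(">HH", ET_CORE, EM_SPARC))
info = describe_elf(sample)
print(info)
```

On the affected system the function would be given the first 20 bytes of /usr/lib/rcm/core, read in binary mode.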

 

 

 






Product
Sun Fire 6800 Server
Sun Fire 3800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Sun Fire V1280 Server
Sun Fire E2900 Server
Sun Fire E4900 Server
Sun Fire E6900 Server
Netra 1280 Server
Sun Netra 1290 Server


Internal Comments
For the internal use of Sun employees.

 

See Bug ID: 5052373
debug, rcm_daemon, SBD_DEBUG, SUNW_svm_rcm.so, DR, dynamic, reconfiguration, rcm_cap_del, librcm, sbd, cfgadm_sbd, cfgadm
Previously Published As
76613

 

Change History
Date: 2009-11-30
User Name: Josh Freeman
Action: Rubber Stamp
Comment: Made no changes to the article at all - just a "rubber stamp".
Date: 2004-06-29
User Name: 25440
Action: Approved
Comment: Publishing
Version: 0
Date: 2004-06-29
User Name: 25440
Action: Accepted
Comment:
Version: 0


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.