Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type Problem Resolution Sure Solution
1620826.1 : Reboot Hangs Running dbnodeupdate.sh While Upgrading Exadata Db Server
Created from <SR 3-8503511291>

Applies to:
Exadata Database Machine X2-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

The process being followed is:
1. dbnodeupdate.sh is called, which kicks off the yum update.
2. The yum update process causes the node to reboot.
3. The reboot does not complete because of a bug.
This bug can be encountered on any reboot, regardless of whether dbnodeupdate.sh was called.

Symptoms
dbnodeupdate.sh hangs while attempting to upgrade the Exadata software on a database node. The hang most likely occurs in the reboot step of the patching process. It can also occur during any reboot or shutdown, including one that is not associated with a patching activity.

The console window returns a stack which often looks like the following:

INFO: task rmmod:17665 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
...
Call Trace:
[] rds_ib_remove_one+0xf0/0x110 [rds_rdma]
[] ? autoremove_wake_function+0x0/0x3d
[] ? _cond_resched+0xe/0x22
[] ib_unregister_device+0x36/0x103 [ib_core]
[] mlx4_ib_remove+0x3f/0xfa [mlx4_ib]
[] mlx4_remove_device+0x78/0xa0 [mlx4_core]
[] mlx4_unregister_interface+0x2f/0x99 [mlx4_core]
[] mlx4_ib_cleanup+0x15/0x23 [mlx4_ib]
[] sys_delete_module+0x1c3/0x244
[] ? audit_syscall_entry+0x103/0x12f
[] system_call_fastpath+0x16/0x1b

Changes
Applying a new version of the Exadata software (upgrade). This can also occur when simply rebooting or shutting down, as a process unrelated to patching activities.

Cause
The issue is related to <bug 17580227>, and the fix is part of future kernels (**version below) as per an internal-only bug.

Solution
Please wait at least 5 minutes to allow the system to autocorrect the issue. If it does not progress within 30 minutes then:
1. Log in to the ILOM and run: reset /SYS
# ./dbnodeupdate.sh -c
(*) 2014-02-09 06:45:38: Unzipping helpers (/u01/patches/YUM/dbupdate-helpers.zip) to /opt/oracle.SupportTools/dbnodeupdate_helpers
(*) 2014-02-09 06:45:38: Initializing logfile /var/log/cellos/dbnodeupdate.log
(*) 2014-02-09 06:45:38: Collecting system configuration details, this may take some time...
ERROR: Unable to determine hardware type, reset ILOM and retry, exiting
This means that not all of the firmware was fully updated. To resolve this, the system needs to be powered down and left off:
stop /SYS
Only if that does not work will you need to use the ILOM to force a shutdown and then start the system:
1) Force a shutdown, because the normal stop /SYS does not work. ssh to the ILOM, then run:
stop -force -script /SYS
2) Power needs to stay off for about 5 minutes. Verify that it is off:
show /SYS
3) Manually start the system back up (after 5 minutes):
start /SYS
show /SYS
Power needs to remain off for about 5 minutes; the system can then be started back up with start /SYS.
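The forced power cycle above can be sketched as a small shell wrapper. This is only an illustration under several assumptions: the names ILOM_HOST, ILOM_USER, WAIT_SECS and DRY_RUN are hypothetical, it assumes the ILOM accepts CLI commands passed on the ssh command line, and it uses start -script /SYS (not shown in this note) to suppress the confirmation prompt. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. ILOM_HOST, ILOM_USER, WAIT_SECS and DRY_RUN are hypothetical
# names introduced for this example; adjust them to the affected node.
ILOM_HOST="${ILOM_HOST:-dbnode01-ilom}"   # placeholder ILOM hostname
ILOM_USER="${ILOM_USER:-root}"
WAIT_SECS="${WAIT_SECS:-300}"             # "about 5 minutes" from the note
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print commands, do not run them

# Run one ILOM CLI command over ssh (assumes the ILOM accepts a command
# on the ssh command line), or just print it in dry-run mode.
ilom() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "ssh ${ILOM_USER}@${ILOM_HOST} $*"
    else
        ssh "${ILOM_USER}@${ILOM_HOST}" "$@"
    fi
}

# Wait with power off, or print the wait in dry-run mode.
pause() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "sleep $1"
    else
        sleep "$1"
    fi
}

power_cycle() {
    ilom stop -force -script /SYS    # 1) force the shutdown
    ilom show /SYS                   # 2) inspect the output to confirm power is off
    pause "$WAIT_SECS"               #    keep power off for about 5 minutes
    ilom start -script /SYS          # 3) start the system (-script is an assumption)
    ilom show /SYS                   #    inspect the output to confirm power is on
}
```

Calling power_cycle with DRY_RUN=1 prints the five steps in order so they can be reviewed first; setting DRY_RUN=0 would actually run them. The show /SYS output still has to be read by the operator, since the sketch does not parse it.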
dmidecode -s system-product-name
imageinfo
# ./dbnodeupdate.sh -c
# ipmitool -H <ip address of problematic db node> -U root -P mypassword1 mc reset cold
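The two failure signatures quoted in this note (the hung rmmod task and the "Unable to determine hardware type" error) can be searched for in a saved log. The following is a minimal sketch, not part of the dbnodeupdate tooling: classify_log and its output keywords are hypothetical names, and which file holds the console messages on a given node is an assumption left to the reader.

```shell
#!/bin/sh
# Sketch only: classify_log and the keywords it prints are hypothetical.
# Usage: classify_log FILE
# Prints one keyword per line for each signature found in FILE.
classify_log() {
    file="$1"
    # Hung-task message from the Symptoms section (PID and timeout may vary).
    if grep -Eq 'task rmmod:[0-9]+ blocked for more than [0-9]+ seconds' "$file"; then
        echo "rmmod_hang"
    fi
    # Error printed by dbnodeupdate.sh -c when the hardware type cannot be read.
    if grep -Fq 'Unable to determine hardware type' "$file"; then
        echo "hardware_type_error"
    fi
}
```

For example, classify_log /var/log/cellos/dbnodeupdate.log checks the log file named in the dbnodeupdate.sh output above. A saved console capture can be passed the same way.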
References
<NOTE:1570371.1> - DO_IRQ: NO IRQ HANDLER FOR VECTOR (IRQ -1)
<NOTE:1553103.1> - dbnodeupdate.sh: Exadata Database Server Patching using the DB Node Update Utility
<BUG:17580227> - WHILE SYSTEM IS DOING SHUTDOWN, RMMOD INTERMITTENTLY HANGS.
<BUG:16605377> - KERNEL PANIC WHEN RDMA SERVICE RESTARTED WHILE RDS-STRESS RUNS ON SERVER/CLIENT
<NOTE:1009715.1> - Integrated Lights Out Manager (ILOM) CLI Quick Reference

Attachments
This solution has no attachment