Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
2295141.1: Grid Infrastructure May Reboot Nodes If Information Must Be Printed To The Serial Console
The node was rebooted by CSS because of a CPU stall.
Applies to:
Oracle Database Appliance - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms
The system reboots suddenly. In some cases the lastgasp files show that CSS or cssdmonitor rebooted the node because interconnect communication was lost:

Network communication with node hpnplppmdb12 (2) missing for 90% of timeout interval. Removal of this node from cluster in 2.320 seconds
Cluster Synchronization Service daemon (CSSD) clssnmvKillBlockThread_0 not scheduled for 21140 msecs.

The kdump dmesg output or the OS messages file contains:

kernel: INFO: rcu_sched_state detected stalls on CPUs/tasks: { 30} (detected by 14, t=60002)

The call stack includes:

uart_console_write
serial8250_console_write
vprintk
PID: 0 TASK: ffff883f05660580 CPU: 23 COMMAND: "kworker/0:1"

In OSW data, overall CPU usage is low, but occasional spikes in system CPU can be seen.

Changes
Review the output of sysctl -a|grep printk. iptables is enabled.

Cause
In this case the customer had set up iptables tracing incorrectly, but the same problem can occur with any application that writes large volumes of messages to the console. While the kernel is printing those messages, Linux cannot assign CPU time to cssd quickly enough, and the node is rebooted.

Solution
Set dmesg -n 1 as a temporary measure, or change kernel.printk to a lower level.

Attachments
This solution has no attachment.
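The check described under Solution can be sketched as a small shell script. This is a minimal illustration, not part of the original note: the helper names `console_level` and `console_is_verbose` are hypothetical, and the threshold of 1 simply mirrors the `dmesg -n 1` setting recommended above. It assumes a Linux host where `/proc/sys/kernel/printk` exposes the same four values that `sysctl kernel.printk` prints, the first being the console log level.

```shell
#!/bin/sh
# Hypothetical helper: extract the console log level (the first field)
# from a kernel.printk value such as "7 4 1 7".
console_level() {
    printf '%s\n' "$1" | awk '{print $1}'
}

# Hypothetical helper: succeed when the console log level is above 1,
# i.e. more than emergency messages would reach the console.
console_is_verbose() {
    [ "$(console_level "$1")" -gt 1 ]
}

# Inspect the live setting when the proc file is readable (Linux only).
if [ -r /proc/sys/kernel/printk ]; then
    current=$(cat /proc/sys/kernel/printk)
    if console_is_verbose "$current"; then
        echo "console log level is $(console_level "$current"); consider: dmesg -n 1"
    else
        echo "console log level is already $(console_level "$current")"
    fi
fi
```

The script only reports the current state. Applying `dmesg -n 1` requires root and does not persist across reboots, which is why the note also suggests lowering kernel.printk.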