Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1642258.1 : ODA: Node Inaccessible Due to OS "page allocation failure: order:1"
Created from <SR 3-8678439001>

Applies to:
Oracle Database Appliance - Version All Versions and later
Information in this document applies to any platform.

Symptoms
An ODA machine went offline and required a manual reboot with direct access to the box, because there was no ILOM console access to it. Once the machine was restarted, the ILOM console history was captured; it reported a variety of error stacks indicating "page allocation failure: order:1".

Locations that you can check for diagnostic information include:
-----------------------------
Example 1
swapper: page allocation failure. order:2, mode:0x20 <<< page allocation failure.
Pid: 0, comm: swapper Tainted: P 2.6.32-300.32.5.el5uek #1
Call Trace:
<IRQ> [<ffffffff810ddd8b>] __alloc_pages_nodemask+0x524/0x595
[<ffffffff8110d6ef>] kmem_getpages+0x4f/0xf4
[<ffffffff8110d8ec>] fallback_alloc+0x158/0x1ce
[<ffffffff8110da83>] ____cache_alloc_node+0x121/0x134
[<ffffffff8110e0a3>] kmem_cache_alloc_node_notrace+0x84/0xb9
[<ffffffff8110e11e>] __kmalloc_node+0x46/0x73
[<ffffffff813b9518>] ? __alloc_skb+0x72/0x13d
[<ffffffff813b9518>] __alloc_skb+0x72/0x13d
[<ffffffffa0157d93>] ixgbe_alloc_rx_buffers+0x93/0x204 [ixgbe]
[<ffffffffa015ac08>] ixgbe_poll+0xeea/0x1071 [ixgbe]
[<ffffffffa0157df6>] ? ixgbe_alloc_rx_buffers+0xf6/0x204 [ixgbe]
[<ffffffff8123786c>] ? rb_insert_color+0x68/0xe3
[<ffffffff813c45d9>] net_rx_action+0xc6/0x1cd
[<ffffffff8105e8c5>] __do_softirq+0xd7/0x19e
[<ffffffff810aee94>] ? handle_IRQ_event+0x10a/0x120
[<ffffffff81012eec>] call_softirq+0x1c/0x30
[<ffffffff81014695>] do_softirq+0x46/0x89
[<ffffffff8105e74a>] irq_exit+0x3b/0x7a
[<ffffffff8145b8c1>] do_IRQ+0x99/0xb0
[<ffffffff81012713>] ret_from_intr+0x0/0x11
<EOI> [<ffffffff810199d6>] ? mwait_idle+0x74/0x7f
[<ffffffff810199c9>] ? mwait_idle+0x67/0x7f
[<ffffffff81010d6f>] ? cpu_idle+0xa5/0xd4
[<ffffffff8145121f>] ? start_secondary+0x1fd/0x23c
...
rpciod/1: page allocation failure. order:2, mode:0x20
swapper: page allocation failure. order:2, mode:0x20
Pid: 0, comm: swapper Tainted: P 2.6.32-300.32.5.el5uek #1
Call Trace:
<IRQ> [<ffffffff810ddd8b>] __alloc_pages_nodemask+0x524/0x595
[<ffffffff8110d6ef>] kmem_getpages+0x4f/0xf4
[<ffffffff8110d8ec>] fallback_alloc+0x158/0x1ce
[<ffffffff8110da83>] ____cache_alloc_node+0x121/0x134
[<ffffffff8110e0a3>] kmem_cache_alloc_node_notrace+0x84/0xb9
[<ffffffff8110e11e>] __kmalloc_node+0x46/0x73
[<ffffffff813b9518>] ? __alloc_skb+0x72/0x13d
[<ffffffff813b9518>] __alloc_skb+0x72/0x13d
[<ffffffffa0157d93>] ixgbe_alloc_rx_buffers+0x93/0x204 [ixgbe]
[<ffffffffa015ac08>] ixgbe_poll+0xeea/0x1071 [ixgbe]
[<ffffffff813c45d9>] net_rx_action+0xc6/0x1cd
[<ffffffff8105e8c5>] __do_softirq+0xd7/0x19e
[<ffffffff810aee94>] ? handle_IRQ_event+0x10a/0x120
[<ffffffff81012eec>] call_softirq+0x1c/0x30
[<ffffffff81014695>] do_softirq+0x46/0x89
[<ffffffff8105e74a>] irq_exit+0x3b/0x7a
[<ffffffff8145b8c1>] do_IRQ+0x99/0xb0
[<ffffffff81012713>] ret_from_intr+0x0/0x11
<EOI> [<ffffffff810199d6>] ? mwait_idle+0x74/0x7f
[<ffffffff810199c9>] ? mwait_idle+0x67/0x7f
[<ffffffff81010d6f>] ? cpu_idle+0xa5/0xd4
[<ffffffff8145121f>] ? start_secondary+0x1fd/0x23c
...
Call Trace:
<IRQ> [<ffffffff810ddd8b>] __alloc_pages_nodemask+0x524/0x595
[<ffffffff8110d6ef>] kmem_getpages+0x4f/0xf4
[<ffffffff8110d8ec>] fallback_alloc+0x158/0x1ce
[<ffffffff8110da83>] ____cache_alloc_node+0x121/0x134
[<ffffffff8110e0a3>] kmem_cache_alloc_node_notrace+0x84/0xb9
[<ffffffff8110e11e>] __kmalloc_node+0x46/0x73
[<ffffffff813b9518>] ? __alloc_skb+0x72/0x13d
[<ffffffff813b9518>] __alloc_skb+0x72/0x13d
[<ffffffffa0157d93>] ixgbe_alloc_rx_buffers+0x93/0x204 [ixgbe]
[<ffffffffa015ac08>] ixgbe_poll+0xeea/0x1071 [ixgbe]
[<ffffffff812b2d0e>] ? mix_pool_bytes_extract+0x145/0x154
[<ffffffff812b31a8>] ? add_timer_randomness+0x107/0x110
[<ffffffff813c45d9>] net_rx_action+0xc6/0x1cd
[<ffffffff8105e8c5>] __do_softirq+0xd7/0x19e
[<ffffffff81012eec>] call_softirq+0x1c/0x30
<EOI> [<ffffffff81014695>] do_softirq+0x46/0x89
[<ffffffff8105df02>] _local_bh_enable_ip+0x82/0x93
[<ffffffff8105e00b>] local_bh_enable+0x12/0x14
[<ffffffff813c1b31>] rcu_read_unlock_bh+0xe/0x10
[<ffffffff813c4dac>] dev_queue_xmit+0x2ed/0x310
[<ffffffff813c8536>] neigh_resolve_output+0x1db/0x210
[<ffffffff813b9568>] ? __alloc_skb+0xc2/0x13d
[<ffffffff813eb026>] ip_finish_output2+0x1a1/0x1e5
[<ffffffff813eb0cc>] ip_finish_output+0x62/0x67
[<ffffffff813eb17f>] ip_output+0xae/0xb5
[<ffffffff813e957d>] dst_output+0x10/0x12
[<ffffffff813eabe4>] ip_local_out+0x23/0x28
[<ffffffff813ebc2e>] ip_queue_xmit+0x301/0x371
[<ffffffff813b9518>] ? __alloc_skb+0x72/0x13d
[<ffffffff813fcad5>] tcp_transmit_skb+0x62d/0x66d
[<ffffffff813fdeec>] tcp_write_xmit+0x6d7/0x7bd
[<ffffffff813fc151>] ? tcp_current_mss+0x4b/0x6a
[<ffffffff813fe037>] __tcp_push_pending_frames+0x2f/0x62
[<ffffffff813f13af>] tcp_push+0x86/0x88
[<ffffffff813f21bb>] tcp_sendpage+0x375/0x3b3
[<ffffffffa0418d14>] xs_sendpages+0x120/0x1b5 [sunrpc]
[<ffffffffa041abb3>] xs_tcp_send_request+0x49/0x11a [sunrpc]
[<ffffffffa0417ac6>] xprt_transmit+0x10d/0x1e7 [sunrpc]
[<ffffffffa04bd732>] ? nfs3_xdr_writeargs+0x0/0x7a [nfs]
[<ffffffffa0415164>] call_transmit+0x1d3/0x21e [sunrpc]
[<ffffffffa041b8ba>] __rpc_execute+0x85/0x270 [sunrpc]
[<ffffffffa041baa5>] ? rpc_async_schedule+0x0/0x17 [sunrpc]
[<ffffffffa041baba>] rpc_async_schedule+0x15/0x17 [sunrpc]
[<ffffffff81072d62>] worker_thread+0x14d/0x1ed
[<ffffffff81077028>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff81072c15>] ? worker_thread+0x0/0x1ed
[<ffffffff81076c7f>] kthread+0x6e/0x76
[<ffffffff81012dea>] child_rip+0xa/0x20
[<ffffffff81076c11>] ? kthread+0x0/0x76
[<ffffffff81012de0>] ? child_rip+0x0/0x20
...
swapper: page allocation failure. order:2, mode:0x20
Pid: 0, comm: swapper Tainted: P 2.6.32-300.32.5.el5uek #1
Call Trace:
<IRQ> [<ffffffff810ddd8b>] __alloc_pages_nodemask+0x524/0x595
[<ffffffff8110d6ef>] kmem_getpages+0x4f/0xf4
[<ffffffff8110d8ec>] fallback_alloc+0x158/0x1ce
[<ffffffff8110da83>] ____cache_alloc_node+0x121/0x134
[<ffffffff8110e0a3>] kmem_cache_alloc_node_notrace+0x84/0xb9
[<ffffffff8110e11e>] __kmalloc_node+0x46/0x73
[<ffffffff813b9518>] ? __alloc_skb+0x72/0x13d
[<ffffffff813b9518>] __alloc_skb+0x72/0x13d
[<ffffffffa0157d93>] ixgbe_alloc_rx_buffers+0x93/0x204 [ixgbe]
[<ffffffffa015ac08>] ixgbe_poll+0xeea/0x1071 [ixgbe]
[<ffffffff8101859a>] ? native_sched_clock+0x37/0x39
[<ffffffff8123786c>] ? rb_insert_color+0x68/0xe3
[<ffffffff813c45d9>] net_rx_action+0xc6/0x1cd
[<ffffffff8105e8c5>] __do_softirq+0xd7/0x19e
[<ffffffff810aee94>] ? handle_IRQ_event+0x10a/0x120
[<ffffffff81012eec>] call_softirq+0x1c/0x30
[<ffffffff81014695>] do_softirq+0x46/0x89
[<ffffffff8105e74a>] irq_exit+0x3b/0x7a
[<ffffffff8145b8c1>] do_IRQ+0x99/0xb0
[<ffffffff81012713>] ret_from_intr+0x0/0x11
<EOI> [<ffffffff810199d6>] ? mwait_idle+0x74/0x7f
[<ffffffff810199c9>] ? mwait_idle+0x67/0x7f
[<ffffffff81010d6f>] ? cpu_idle+0xa5/0xd4
[<ffffffff8145121f>] ? start_secondary+0x1fd/0x23c
...
sshd: page allocation failure. order:1, mode:0x20
Pid: 9578, comm: sshd Tainted: P 2.6.32-300.32.5.el5uek #1
Call Trace:
<IRQ> [<ffffffff810ddd8b>] __alloc_pages_nodemask+0x524/0x595
[<ffffffff8110d6ef>] kmem_getpages+0x4f/0xf4
[<ffffffff8110d8ec>] fallback_alloc+0x158/0x1ce
[<ffffffff8110da83>] ____cache_alloc_node+0x121/0x134
[<ffffffff8110ee1c>] kmem_cache_alloc+0x7f/0xf7
[<ffffffff813b5035>] sk_prot_alloc+0x3b/0x13e
[<ffffffff813b63ad>] sk_clone+0x1e/0x270
[<ffffffff813efe30>] inet_csk_clone+0x16/0x9c
[<ffffffff81404272>] tcp_create_openreq_child+0x23/0x3f5
[<ffffffff81402ce0>] tcp_v4_syn_recv_sock+0x5c/0x21a
[<ffffffff81404157>] tcp_check_req+0x1f3/0x2eb
[<ffffffff813efcbd>] ? inet_csk_search_req+0x3c/0x9d
[<ffffffff814015bf>] tcp_v4_do_rcv+0x225/0x352
[<ffffffff8105e00b>] ? local_bh_enable+0x12/0x14
[<ffffffff81402982>] tcp_v4_rcv+0x459/0x6d0
[<ffffffff813e6e52>] ip_local_deliver_finish+0x152/0x1fa
[<ffffffff813e724d>] ip_local_deliver+0x72/0x7d
[<ffffffff813e6c7e>] ip_rcv_finish+0x372/0x38c
[<ffffffff813f370b>] ? tcp_gro_receive+0x7e/0x1e5
[<ffffffff813e719c>] ip_rcv+0x2a2/0x2e1
[<ffffffff813c14ab>] __netif_receive_skb+0x41b/0x440
[<ffffffff813c1519>] netif_receive_skb+0x49/0x50
[<ffffffff813c15b5>] napi_skb_finish+0x2b/0x42
[<ffffffff813c1a2e>] napi_gro_receive+0x2f/0x34
[<ffffffffa01935e8>] igb_poll+0x808/0xb78 [igb]
[<ffffffff8101859a>] ? native_sched_clock+0x37/0x39
[<ffffffff810182e0>] ? sched_clock+0x9/0xd
[<ffffffff8107bd51>] ? sched_clock_cpu+0x4c/0xdc
[<ffffffff813c45d9>] net_rx_action+0xc6/0x1cd
[<ffffffff8105e8c5>] __do_softirq+0xd7/0x19e
[<ffffffff810aee94>] ? handle_IRQ_event+0x10a/0x120
[<ffffffff81012eec>] call_softirq+0x1c/0x30
[<ffffffff81014695>] do_softirq+0x46/0x89
[<ffffffff8105e74a>] irq_exit+0x3b/0x7a
[<ffffffff8145b8c1>] do_IRQ+0x99/0xb0
[<ffffffff81012713>] ret_from_intr+0x0/0x11
<EOI>
...
swapper: page allocation failure. order:2, mode:0x20
Pid: 0, comm: swapper Tainted: P 2.6.32-300.32.5.el5uek #1
Call Trace:
<IRQ> [<ffffffff810ddd8b>] __alloc_pages_nodemask+0x524/0x595
[<ffffffff8110d6ef>] kmem_getpages+0x4f/0xf4
[<ffffffff8110d8ec>] fallback_alloc+0x158/0x1ce
[<ffffffff8110da83>] ____cache_alloc_node+0x121/0x134
[<ffffffff8110e0a3>] kmem_cache_alloc_node_notrace+0x84/0xb9
[<ffffffff8110e11e>] __kmalloc_node+0x46/0x73
[<ffffffff813b9518>] ? __alloc_skb+0x72/0x13d
[<ffffffff813b9518>] __alloc_skb+0x72/0x13d
[<ffffffffa0157d93>] ixgbe_alloc_rx_buffers+0x93/0x204 [ixgbe]
[<ffffffffa015ac08>] ixgbe_poll+0xeea/0x1071 [ixgbe]
[<ffffffff81044498>] ? __wake_up+0x48/0x55
[<ffffffff812b3098>] ? credit_entropy_bits+0x90/0x99
[<ffffffff813c45d9>] net_rx_action+0xc6/0x1cd
[<ffffffff8105e8c5>] __do_softirq+0xd7/0x19e
[<ffffffff810aee94>] ? handle_IRQ_event+0x10a/0x120
[<ffffffff81012eec>] call_softirq+0x1c/0x30
[<ffffffff81014695>] do_softirq+0x46/0x89
[<ffffffff8105e74a>] irq_exit+0x3b/0x7a
[<ffffffff8145b8c1>] do_IRQ+0x99/0xb0
[<ffffffff81012713>] ret_from_intr+0x0/0x11
<EOI> [<ffffffff810199d6>] ? mwait_idle+0x74/0x7f
[<ffffffff810199c9>] ? mwait_idle+0x67/0x7f
[<ffffffff81010d6f>] ? cpu_idle+0xa5/0xd4
[<ffffffff8145121f>] ? start_secondary+0x1fd/0x23c
@ e.g. from SR 3-9966841401: Database lost connection because server was not accessible

Here are a few more examples by file type / location

Example 2
OS Messages
Dec 2 03:53:59 oda03 kernel: swapper: page allocation failure. order:2, mode:0x20 <<<
Dec 2 03:53:59 oda03 kernel: Pid: 0, comm: swapper Tainted: P W 2.6.32-300.32.5.el5uek #1
Dec 2 03:53:59 oda03 kernel: Call Trace:
Dec 2 03:53:59 oda03 kernel: <IRQ> [<ffffffff810ddd8b>] __alloc_pages_nodemask+0x524/0x595
Dec 2 03:53:59 oda03 kernel: [<ffffffff8110d6ef>] kmem_getpages+0x4f/0xf4
Dec 2 03:53:59 oda03 kernel: [<ffffffff8110d8ec>] fallback_alloc+0x158/0x1ce
Dec 2 03:53:59 oda03 kernel: [<ffffffff8110da83>] ____cache_alloc_node+0x121/0x134
Dec 2 03:53:59 oda03 kernel: [<ffffffff8110e0a3>] kmem_cache_alloc_node_notrace+0x84/0xb9
Dec 2 03:53:59 oda03 kernel: [<ffffffff8110e11e>] __kmalloc_node+0x46/0x73
Dec 2 03:53:59 oda03 kernel: [<ffffffff813b9518>] ? __alloc_skb+0x72/0x13d
Dec 2 03:53:59 oda03 kernel: [<ffffffff813b9518>] __alloc_skb+0x72/0x13d
Dec 2 03:53:59 oda03 kernel: [<ffffffffa0157d93>] ixgbe_alloc_rx_buffers+0x93/0x204 [ixgbe]
Dec 2 03:53:59 oda03 kernel: [<ffffffffa015ac08>] ixgbe_poll+0xeea/0x1071 [ixgbe]
Dec 2 03:53:59 oda03 kernel: [<ffffffff813c45d9>] net_rx_action+0xc6/0x1cd
Dec 2 03:53:59 oda03 kernel: [<ffffffff8105e8c5>] __do_softirq+0xd7/0x19e
Dec 2 03:53:59 oda03 kernel: [<ffffffff810aee94>] ? handle_IRQ_event+0x10a/0x120
Dec 2 03:53:59 oda03 kernel: [<ffffffff81012eec>] call_softirq+0x1c/0x30
Dec 2 03:53:59 oda03 kernel: [<ffffffff81014695>] do_softirq+0x46/0x89
Dec 2 03:54:00 oda03 kernel: [<ffffffff8105e74a>] irq_exit+0x3b/0x7a
Dec 2 03:54:00 oda03 kernel: [<ffffffff8145b8c1>] do_IRQ+0x99/0xb0
Dec 2 03:54:00 oda03 kernel: [<ffffffff81012713>] ret_from_intr+0x0/0x11
Dec 2 03:54:00 oda03 kernel: <EOI> [<ffffffff810199d6>] ? mwait_idle+0x74/0x7f
Dec 2 03:54:00 oda03 kernel: [<ffffffff810199c9>] ? mwait_idle+0x67/0x7f
Dec 2 03:54:00 oda03 kernel: [<ffffffff81010d6f>] ? cpu_idle+0xa5/0xd4
Dec 2 03:54:00 oda03 kernel: [<ffffffff8145121f>] ? start_secondary+0x1fd/0x23c
Dec 2 03:54:00 oda03 kernel: Mem-Info:
...
Dec 2 05:18:13 oda03 syslogd 1.4.1: restart. <<<
Dec 2 05:18:13 oda03 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Dec 2 05:18:13 oda03 kernel: 6-11,18-23 (cpu_power = 7068)
Dec 2 05:18:13 oda03 kernel: CPU1 attaching sched-domain:
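When reviewing a large OS messages file, it can help to tally the failure headers rather than read every stack. The following is a minimal sketch, not an Oracle-supplied tool: the function and class names are hypothetical, and the pattern is only assumed to match the header format shown in the excerpts above. An order:N request asks the kernel for 2^N contiguous pages, so the sketch reports that page count as well.

```python
import re
from collections import Counter
from typing import Iterable, List, NamedTuple

# Matches the failure header as it appears in the excerpts above, e.g.
# "swapper: page allocation failure. order:2, mode:0x20"
FAILURE_RE = re.compile(
    r"(?P<comm>\S+): page allocation failure\. "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


class AllocFailure(NamedTuple):
    comm: str   # process name reported by the kernel
    order: int  # allocation order from the log line
    mode: str   # allocation flags, kept as the raw hex string
    pages: int  # contiguous pages requested: 2 ** order


def find_alloc_failures(lines: Iterable[str]) -> List[AllocFailure]:
    """Return one record per 'page allocation failure' header line."""
    hits = []
    for line in lines:
        m = FAILURE_RE.search(line)
        if m:
            order = int(m.group("order"))
            hits.append(
                AllocFailure(m.group("comm"), order, m.group("mode"), 2 ** order)
            )
    return hits


# Sample lines copied from the excerpts in this note.
sample = [
    "Dec 2 03:53:59 oda03 kernel: swapper: page allocation failure. order:2, mode:0x20",
    "Dec 2 03:53:59 oda03 kernel: Call Trace:",
    "rpciod/1: page allocation failure. order:2, mode:0x20",
    "sshd: page allocation failure. order:1, mode:0x20",
]

counts = Counter((h.comm, h.order) for h in find_alloc_failures(sample))
for (comm, order), count in sorted(counts.items()):
    print(f"{comm}: order {order} ({2 ** order} pages), {count} failure(s)")
```

To run it against a real log, pass an open file object (for example the system messages file) to `find_alloc_failures` instead of the sample list.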
MEMINFO - (excerpt showing memfree is running out just before the reboot) ... Meminfo
OS Top
Tasks: 940 total, 2 running, 938 sleeping, 0 stopped, 0 zombie
ASM ALERT.LOG - shows restart at the time of the reboot matching the time stamps of confirmed memory issues
ASM1
=====
...
Tue Dec 02 03:54:05 2014
Time drift detected. Please check VKTM trace file for more details.
Tue Dec 02 05:16:19 2014 <<< No activity recorded on this instance at the time of the reboot until some time later (over an hour for this case)
NOTE: No asm libraries found in the system
* instance_number obtained from CSS = 1, checking for the existence of node 0...
* node 0 does not exist. instance_number = 1
NOTE: parameter asm_diskstring not allowed in ODA appliance; overriding asm_diskstring to "/dev/mapper/*D_*"
Starting ORACLE instance (normal)
... ...
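The silent period in the alert log can be located mechanically by comparing consecutive timestamp lines. The sketch below is illustrative only: the function name is hypothetical, and it assumes timestamps in the "Tue Dec 02 03:54:05 2014" form shown in the excerpt, parsed with English day and month names.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Timestamp layout assumed from the alert log excerpt in this note.
TS_RE = re.compile(r"^\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4}")
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def largest_gap(
    lines: Iterable[str],
) -> Optional[Tuple[datetime, datetime, timedelta]]:
    """Return (before, after, gap) for the longest pause between timestamps."""
    stamps = []
    for line in lines:
        m = TS_RE.match(line.strip())
        if m:
            stamps.append(datetime.strptime(m.group(0), TS_FORMAT))
    best = None
    for before, after in zip(stamps, stamps[1:]):
        gap = after - before
        if best is None or gap > best[2]:
            best = (before, after, gap)
    return best


# Lines copied from the ASM alert log excerpt in this note.
alert_lines = [
    "Tue Dec 02 03:54:05 2014",
    "Time drift detected. Please check VKTM trace file for more details.",
    "Tue Dec 02 05:16:19 2014",
    "NOTE: No asm libraries found in the system",
]

result = largest_gap(alert_lines)
if result is not None:
    before, after, gap = result
    print(f"Longest silence: {gap} between {before} and {after}")
```

For the excerpt above this reports a gap of 1 hour 22 minutes 14 seconds, consistent with the "over an hour" annotation.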
Cause
The value of vm.min_free_kbytes is too small.

Solution
Upgrade to ODA 2.9, where the setting is increased. As a workaround, you may set vm.min_free_kbytes manually until you are able to upgrade:

vm.min_free_kbytes=512000
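Before applying the workaround, it may be useful to confirm the value currently in effect. The sketch below only reads the setting and changes nothing. It assumes the standard Linux procfs path /proc/sys/vm/min_free_kbytes; the function names are hypothetical, and the `sysctl -w` command in the message is the generic Linux way to change a kernel parameter at runtime, not an ODA-specific procedure taken from this note.

```python
from pathlib import Path
from typing import Optional

TARGET_KBYTES = 512000  # workaround value quoted in this note
PROC_PATH = Path("/proc/sys/vm/min_free_kbytes")  # standard Linux procfs location


def read_min_free_kbytes(path=PROC_PATH) -> Optional[int]:
    """Return the current value as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def report(current: Optional[int], target: int = TARGET_KBYTES) -> str:
    """Describe whether the current value meets the workaround target."""
    if current is None:
        return "vm.min_free_kbytes could not be read"
    if current >= target:
        return f"vm.min_free_kbytes={current} (at or above {target}, no change needed)"
    return (
        f"vm.min_free_kbytes={current} is below {target}; "
        f"consider: sysctl -w vm.min_free_kbytes={target}"
    )


print(report(read_min_free_kbytes()))
```

A value set with `sysctl -w` does not survive a reboot; on most Linux distributions the `vm.min_free_kbytes=512000` line is placed in /etc/sysctl.conf to make it persistent.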
References
<BUG:14849704> - PAGE ALLOCATION FAILURE. ORDER:1, MODE:0X20
<NOTE:1546861.1> - [Linux OS] System Hung with Large Numbers of Page Allocation Failures with "order:5" on Exadata Environments