Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2173394.1 : ODA: ODA Nodes Are Panicking and Rebooting in a Loop During ACFS Filesystem Startup
Created from <SR 3-12933779821>

Applies to:
Oracle Database Appliance Software - Version 2.1.0.1 to 12.1.2.7 [Release 2.1 to 12.1]
Oracle Database - Enterprise Edition - Version 12.1.0.1 to 12.1.0.2 [Release 12.1]
Oracle Database Appliance X4-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance X5-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms
1) The ODA nodes panic and reboot in a loop on every attempt to start an ACFS filesystem:

[root@asmcloud1 log]# srvctl start filesystem -d /dev/asm/datastore-456
Message from syslogd@asmcloud1 at Jul 26 10:48:52 ...

[root@asmcloud2 log]# srvctl start filesystem -d /dev/asm/datastore-456
Message from syslogd@asmcloud2 at Jul 26 10:55:30 ...
2) During boot, the OS generates the following kernel panic:

Pid: 20646, comm: UsmRslvrMk:RECO Tainted: P 2.6.39-400.278.3.el6uek.x86_64 #1
Call Trace:
[] panic+0xa6/0x1bd
[] __KsPanic+0x92/0xa0 [oracleoks]
[] ? DLM_LOG+0x45/0x50 [oracleoks]
[] odlm_quelock+0x366/0x490 [oracleoks]
[] ? AsmBlogClassic+0x56/0x58 [oracleadvm]
[] Asm_acqRegionLock+0x3d4/0x1240 [oracleadvm]
[] ? Asm_relRegionLock+0x54e/0x54e [oracleadvm]
[] Asm_startIo+0x181/0x1343 [oracleadvm]
[] ? __schedule+0x3f6/0x810
[] ? AsmBlogClassic+0x56/0x58 [oracleadvm]
[] AsmVolIoRequest+0x261/0x2f8 [oracleadvm]
[] Asm_issueResilverIo+0x67/0x1c2 [oracleadvm]
[] ? Asm_getNextRegion+0x2b1/0x2ed [oracleadvm]
[] Asm_regIoMakeCallback+0x277/0x288 [oracleadvm]
[] KsOsdInvokeCallback+0x87/0xd0 [oracleoks]
[] ? Asm_wakeRecovery+0x243/0x243 [oracleadvm]
[] ? pick_next_task_fair+0xc8/0x120
[] Ks_invokeCallback+0x8d/0x150 [oracleoks]
[] Ks_processRequests+0x161/0x3c0 [oracleoks]
[] Ks_main+0xc2/0x130 [oracleoks]
[] ? Ks_processRequests+0x3c0/0x3c0 [oracleoks]
[] KsKthreadRun+0x7f/0xb0 [oracleoks]
[] kernel_thread_helper+0x4/0x10
[] ? KsSetTimer+0x70/0x70 [oracleoks]
[] ? gs_change+0x13/0x13

[] panic+0xa6/0x1bd
[] ? kmem_getpages+0xc1/0x170
[] __KsPanic+0x92/0xa0 [oracleoks]
[] ? DLM_LOG+0x45/0x50 [oracleoks]
[] odlm_lock+0x356/0x460 [oracleoks]
[] ? AsmBlogClassic+0x56/0x58 [oracleadvm]
[] Asm_cicAcqLock+0x119/0x49c [oracleadvm]
[] Asm_dgUse+0x1c4/0x52f [oracleadvm]
[] ? _raw_spin_lock_irq+0x15/0x20
[] ? __down+0x91/0xc0
[] AsmDgStateVolOpen+0x111/0x26c [oracleadvm]
[] ? KsMalloc+0xae/0x1f0 [oracleoks]
[] AsmDgAction+0x152/0x201 [oracleadvm]
[] AsmRootStateVolOpen+0x209/0x363 [oracleadvm]
[] AsmRootAction+0x250/0x30f [oracleadvm]
[] asmOpen_int+0x5f1/0x87a [oracleadvm]
[] ? kobject_get+0x1a/0x30
[] asmOpen+0xe/0x10 [oracleadvm]
[] __blkdev_get+0xd9/0x490
[] blkdev_get+0x5c/0x210
[] ? _raw_spin_lock+0xe/0x20
[] blkdev_open+0x65/0x80
[] __dentry_open+0x138/0x320
[] ? do_lookup+0x4b/0x330
[] ? blkdev_get+0x210/0x210
[] nameidata_to_filp+0x71/0x80
[] do_last+0x19e/0x8e0
[] path_openat+0xcd/0x3d0
[] ? OfsUpdateMntEntry+0x1aa/0x270 [oracleacfs]
[] ? ofs_ctldev_ioctl+0x375/0x1c90 [oracleacfs]
[] do_filp_open+0x49/0xa0
[] ? strncpy_from_user+0x4a/0x90
[] ? _raw_spin_lock+0xe/0x20
[] ? alloc_fd+0x10a/0x150
[] do_sys_open+0x108/0x1f0
[] ? audit_syscall_entry+0x1d7/0x200
[] sys_open+0x20/0x30
[] system_call_fastpath+0x16/0x1b
Cause
1) The kernel reports a stack size of 16k instead of 8k on both nodes:

[root@asmcloud0 ~]# echo $(($(awk '$1 == "KernelStack:" {print $2}' /proc/meminfo)/$(awk '$1 == "nr_kernel_stack" {stacks += $2;} END {print stacks}' /proc/zoneinfo)))

[root@asmcloud1 ~]# echo $(($(awk '$1 == "KernelStack:" {print $2}' /proc/meminfo)/$(awk '$1 == "nr_kernel_stack" {stacks += $2;} END {print stacks}' /proc/zoneinfo)))
2) The ODA nodes were upgraded to a Linux kernel that is neither certified nor supported with ODA and ACFS:

asmcloud1-ilom_1440NML0DB_2016-06-28T15-40-18/spos_logs/@var@log@messages.1: Jun 24 09:31:10 asmcloud-ilom kernel: Linux version 2.6.27.43 (ilom@oracle) (gcc
asmcloud1-ilom_1440NML0DB_2016-06-28T15-40-18/ilom/@persist@hostconsole.log.1: Linux version 2.6.39-400.278.3.el6uek.x86_64 (mockbuild@x86-ol6-builder-04)
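
The kernel versions recorded in collected logs, such as the excerpts above, can be listed with a short shell helper. This is a minimal sketch: the function name `kernel_versions_in_logs` is hypothetical and not part of any Oracle tool, and it assumes the logs contain the standard `Linux version <release>` boot banner.

```shell
# Hypothetical helper (an illustration, not an Oracle-provided tool):
# print each distinct kernel release found in the given log files.
kernel_versions_in_logs() {
    # -h suppresses file names, -o prints only the matched text;
    # the release string is the third word of "Linux version <release>".
    grep -ho 'Linux version [^ ]*' "$@" | awk '{print $3}' | sort -u
}
```

It would be called with the paths of the extracted log files as arguments, and the releases it prints can then be compared by hand against the kernel shipped with the installed ODA software version.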
Note: A correct and valid kernel stack size is 8k, as shown below:

[root@odax5rm1-base ~]# echo $(($(awk '$1 == "KernelStack:" {print $2}' /proc/meminfo)/$(awk '$1 == "nr_kernel_stack" {stacks += $2;} END {print stacks}' /proc/zoneinfo)))
8
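
The stack-size check used in this note divides the total `KernelStack` memory (in kB) by the number of kernel stacks summed over all zones. The same calculation can be wrapped in a shell function that fails cleanly when a value is missing. This is a minimal sketch: the name `kernel_stack_kb` and its optional file arguments are hypothetical, and it assumes, as the one-liner in this note does, that `nr_kernel_stack` in /proc/zoneinfo counts stacks, which may not hold on newer kernels.

```shell
# Hypothetical helper (an illustration, not an Oracle-provided tool):
# print the average kernel stack size in kB.
# Optional arguments: paths to meminfo and zoneinfo files (default: /proc).
kernel_stack_kb() {
    local meminfo="${1:-/proc/meminfo}" zoneinfo="${2:-/proc/zoneinfo}"
    local total_kb stacks
    # Total memory used by kernel stacks, in kB.
    total_kb=$(awk '$1 == "KernelStack:" {print $2}' "$meminfo")
    # Number of kernel stacks, summed across all memory zones.
    stacks=$(awk '$1 == "nr_kernel_stack" {n += $2} END {print n + 0}' "$zoneinfo")
    if [ -z "$total_kb" ] || [ "$stacks" -eq 0 ]; then
        echo "cannot determine kernel stack size" >&2
        return 1
    fi
    echo $(( total_kb / stacks ))
}
```

Running `kernel_stack_kb` with no arguments on a node reads the live /proc files. Per this note, 8 is the expected result and 16 indicates the unsupported kernel.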
Solution
1) The manual kernel update left the OS and the entire ODA configuration in an inconsistent state.
2) The solution is therefore to reimage the ODA nodes as described in the following document:
References
<BUG:24355916> - ODA NODE IS PANICKING AND REBOOTING EVERY ATTEMPT OF STARTING AN ACFS FILESYSTEM
<BUG:23312691> - LNX64: ACFS SUPPORT FOR 16K KERNEL STACK SIZE