Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2173394.1
Update Date:2016-08-17

Solution Type  Problem Resolution Sure

Solution  2173394.1 :   ODA: ODA Nodes Are Panicking and Rebooting in a Loop During ACFS Filesystem Startup  


Related Items
  • Oracle Database Appliance X4-2
  • Oracle Database Appliance Software
  • Oracle Database - Enterprise Edition
  • Oracle Database Appliance X5-2
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Database Appliance>DB: ODA_EST




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-12933779821>

Applies to:

Oracle Database Appliance Software - Version 2.1.0.1 to 12.1.2.7 [Release 2.1 to 12.1]
Oracle Database - Enterprise Edition - Version 12.1.0.1 to 12.1.0.2 [Release 12.1]
Oracle Database Appliance X4-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance X5-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

1) The ODA nodes panic and reboot in a loop on every attempt to start an ACFS filesystem:

[root@asmcloud1 log]# srvctl start filesystem -d /dev/asm/datastore-456

Message from syslogd@asmcloud1 at Jul 26 10:48:52 ...
kernel:Kernel panic - not syncing: odlm_kernel_lock: called by interrupt
service routine

 

[root@asmcloud2 log]# srvctl start filesystem -d /dev/asm/datastore-456

Message from syslogd@asmcloud2 at Jul 26 10:55:30 ...
kernel:Kernel panic - not syncing: odlm_kernel_lock: called by interrupt
service routine

 

2) During system boot, the OS generates the following kernel panic:

Pid: 20646, comm: UsmRslvrMk:RECO Tainted: P 2.6.39-400.278.3.el6uek.x86_64 #1
Call Trace:
[] panic+0xa6/0x1bd
[] __KsPanic+0x92/0xa0[oracleoks]
[] ? DLM_LOG+0x45/0x50[oracleoks]
[] odlm_quelock+0x366/0x490[oracleoks]
[] ? AsmBlogClassic+0x56/0x58[oracleadvm]
[] Asm_acqRegionLock+0x3d4/0x1240[oracleadvm]
[] ? Asm_relRegionLock+0x54e/0x54e[oracleadvm]
[] Asm_startIo+0x181/0x1343[oracleadvm]
[] ? __schedule+0x3f6/0x810
[] ? AsmBlogClassic+0x56/0x58[oracleadvm]
[] AsmVolIoRequest+0x261/0x2f8[oracleadvm]
[] Asm_issueResilverIo+0x67/0x1c2[oracleadvm]
[] ? Asm_getNextRegion+0x2b1/0x2ed[oracleadvm]
[] Asm_regIoMakeCallback+0x277/0x288[oracleadvm]
[] KsOsdInvokeCallback+0x87/0xd0[oracleoks]
[] ? Asm_wakeRecovery+0x243/0x243[oracleadvm]
[] ? pick_next_task_fair+0xc8/0x120
[] Ks_invokeCallback+0x8d/0x150[oracleoks]
[] Ks_processRequests+0x161/0x3c0[oracleoks]
[] Ks_main+0xc2/0x130[oracleoks]
[] ? Ks_processRequests+0x3c0/0x3c0[oracleoks]
[] KsKthreadRun+0x7f/0xb0[oracleoks]
[] kernel_thread_helper+0x4/0x10
[] ? KsSetTimer+0x70/0x70[oracleoks]
[] ? gs_change+0x13/0x13
[] panic+0xa6/0x1bd
[] ? kmem_getpages+0xc1/0x170
[] __KsPanic+0x92/0xa0[oracleoks]
[] ? DLM_LOG+0x45/0x50[oracleoks]
[] odlm_lock+0x356/0x460[oracleoks]
[] ? AsmBlogClassic+0x56/0x58[oracleadvm]
[] Asm_cicAcqLock+0x119/0x49c[oracleadvm]
[] Asm_dgUse+0x1c4/0x52f[oracleadvm]
[] ? _raw_spin_lock_irq+0x15/0x20
[] ? __down+0x91/0xc0
[] AsmDgStateVolOpen+0x111/0x26c[oracleadvm]
[] ? KsMalloc+0xae/0x1f0[oracleoks]
[] AsmDgAction+0x152/0x201[oracleadvm]
[] AsmRootStateVolOpen+0x209/0x363[oracleadvm]
[] AsmRootAction+0x250/0x30f[oracleadvm]
[] asmOpen_int+0x5f1/0x87a[oracleadvm]
[] ? kobject_get+0x1a/0x30
[] asmOpen+0xe/0x10[oracleadvm]
[] __blkdev_get+0xd9/0x490
[] blkdev_get+0x5c/0x210
[] ? _raw_spin_lock+0xe/0x20
[] blkdev_open+0x65/0x80
[] __dentry_open+0x138/0x320
[] ? do_lookup+0x4b/0x330
[] ? blkdev_get+0x210/0x210
[] nameidata_to_filp+0x71/0x80
[] do_last+0x19e/0x8e0
[] path_openat+0xcd/0x3d0
[] ? OfsUpdateMntEntry+0x1aa/0x270[oracleacfs]
[] ? ofs_ctldev_ioctl+0x375/0x1c90[oracleacfs]
[] do_filp_open+0x49/0xa0
[] ? strncpy_from_user+0x4a/0x90
[] ? _raw_spin_lock+0xe/0x20
[] ? alloc_fd+0x10a/0x150
[] do_sys_open+0x108/0x1f0
[] ? audit_syscall_entry+0x1d7/0x200
[] sys_open+0x20/0x30
[] system_call_fastpath+0x16/0x1b
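
To confirm that a node is hitting this particular panic rather than an unrelated one, the log can be searched for the message shown in symptom 1. The following is a minimal sketch only: the function name count_odlm_panics is hypothetical, and the default path /var/log/messages is an assumption about where syslog writes on the node.

```shell
#!/bin/bash
# Sketch: count log lines that carry the panic signature from symptom 1.
# count_odlm_panics is a hypothetical helper name, not an Oracle tool.
#   $1 - log file to scan (assumed default: /var/log/messages)
count_odlm_panics() {
    local logfile="${1:-/var/log/messages}"
    # -F treats the pattern as a fixed string; -c prints the number of matching lines.
    grep -c -F 'odlm_kernel_lock: called by interrupt' "$logfile"
}
```

A count greater than zero suggests the node has logged this panic at least once. Because the node reboots immediately after the panic, the message may not always reach the on-disk log, so a count of zero does not rule the problem out.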

 

Cause

1) The kernel reports a stack size of 16k instead of 8k on both nodes:

[root@asmcloud0 ~]# echo $(($(awk '$1 == "KernelStack:" {print $2}' /proc/meminfo)/$( awk '$1 =="nr_kernel_stack" {stacks += $2;} END {print stacks}' /proc/zoneinfo)) )
16

[root@asmcloud1 ~]# echo $(($(awk '$1 == "KernelStack:" {print $2}' /proc/meminfo)/$( awk '$1 =="nr_kernel_stack" {stacks += $2;} END {print stacks}' /proc/zoneinfo)) )
16

 

2) The ODA nodes were upgraded to a Linux kernel that is neither certified nor supported with ODA and ACFS:

asmcloud1-ilom_1440NML0DB_2016-06-28T15-40-18/spos_logs/@var@log@messages.1:

Jun 24 09:31:10 asmcloud-ilom kernel: Linux version 2.6.27.43 (ilom@oracle) (gcc
version 4.4.5 (Debian 4.4.5-8) ) #1 Thu Aug 6 11:07:33 CST 2015

asmcloud1-ilom_1440NML0DB_2016-06-28T15-40-18/ilom/@persist@hostconsole.log.1:

Linux version 2.6.39-400.278.3.el6uek.x86_64 (mockbuild@x86-ol6-builder-04)
(gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC) ) #1 SMP Thu May 19
11:54:25 PDT 2016

 

Note: On a correct and supported OS kernel, the kernel stack size is 8k, as shown below:

[root@odax5rm1-base ~]# echo $(($(awk '$1 == "KernelStack:" {print $2}' /proc/meminfo)/$(awk '$1 =="nr_kernel_stack" {stacks += $2;} END {print stacks}' /proc/zoneinfo)))
8
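
The one-line check used above divides the total kernel stack memory (KernelStack in /proc/meminfo, in kB) by the number of kernel stacks (nr_kernel_stack summed over all zones in /proc/zoneinfo). A more readable form is sketched below. The function name kernel_stack_kb and its two optional file arguments are hypothetical, and the sketch assumes a kernel, such as the 2.6.39 UEK shown here, where nr_kernel_stack in /proc/zoneinfo is a count of stacks.

```shell
#!/bin/bash
# Sketch: average kernel stack size per thread, in kB.
# kernel_stack_kb is a hypothetical helper name, not an Oracle tool.
#   $1 - meminfo-style file  (default: /proc/meminfo)
#   $2 - zoneinfo-style file (default: /proc/zoneinfo)
kernel_stack_kb() {
    local meminfo="${1:-/proc/meminfo}"
    local zoneinfo="${2:-/proc/zoneinfo}"
    local total_kb stacks

    # Total memory used by kernel stacks, in kB.
    total_kb=$(awk '$1 == "KernelStack:" {print $2}' "$meminfo")
    # Number of kernel stacks, summed over all memory zones.
    stacks=$(awk '$1 == "nr_kernel_stack" {n += $2} END {print n + 0}' "$zoneinfo")

    # Refuse to divide if either counter is missing.
    if [ -z "$total_kb" ] || [ "$stacks" -eq 0 ]; then
        echo "kernel_stack_kb: required counters not found" >&2
        return 1
    fi
    echo $(( total_kb / stacks ))
}
```

Called without arguments on a node, kernel_stack_kb reads the live /proc files. Per this document, a result of 16 matches the failing nodes and a result of 8 matches a supported ODA kernel.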

  

Solution

1) The manual kernel update on the ODA nodes left the OS and the entire ODA configuration in an inconsistent state.

 

2) Therefore, the solution is to reimage the ODA nodes as described in the following document:

  • Oracle Database Appliance - 12.1.2 and 2.X Supported ODA Versions & Known Issues (Doc ID 888888.1)

 
 


 


References

<BUG:24355916> - ODA NODE IS PANICKING AND REBOOTING EVERY ATTEMPT OF STARTING AN ACFS FILESYSTEM
<BUG:23312691> - LNX64: ACFS SUPPORT FOR 16K KERNEL STACK SIZE

  Copyright © 2018 Oracle, Inc.  All rights reserved.