Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1458626.1
Update Date:2018-05-29
Keywords:

Solution Type: Problem Resolution Sure Solution

1458626.1: ASM In Hung State Across All Nodes In Exadata Server


Related Items
  • Exadata X3-2 Hardware
  • Oracle Exadata Hardware
  • Exadata Database Machine X2-8
  • Exadata X3-8 Hardware
  • Exadata Database Machine X2-2 Hardware
  • Exadata Database Machine V2
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Exadata>DB: Exadata_EST


HugePages (USE_LARGE_PAGES) should in general not be used with ASM. This note reviews a problem in which ASM hung across several nodes after its memory configuration was changed from AMM or ASMM to HugePages, using the more current 11.2.0.2+ parameter USE_LARGE_PAGES.

Created from <SR 3-5706628341>

Applies to:

Oracle Exadata Hardware - Version 11.2.0.2 and later
Exadata Database Machine V2 - Version All Versions and later
Exadata Database Machine X2-2 Hardware - Version All Versions and later
Exadata Database Machine X2-8 - Version All Versions and later
Exadata X3-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Symptoms

  • ASM in hung state across all nodes in Exadata server
  • System state dump requested by (instance=2, osid=3618), summary=[SYSTEMSTATE_GLOBAL: global system state dump request (kjdgdss_g)] **
     System State dumped to trace file /u01/app/oracle/diag/asm/+asm/+ASM8/trace/+ASMx_diag_7745.trc
  • Enqueue blocker waiting on 'GCS lock esc X'

 

More from the ASM ALERT.logs

Because this was an eight-node ASM cluster and the ALTER SYSTEM commands were
issued from a single node, the following commands appear only in the alert log
of that originating node:

...
...
  Sat Apr 28 06:16:49 2012
  ALTER SYSTEM SET memory_target='0' SCOPE=SPFILE SID='*';
  Sat Apr 28 06:17:26 2012
  ALTER SYSTEM SET memory_max_target='0' SCOPE=SPFILE SID='*';
  Sat Apr 28 06:17:50 2012
  ALTER SYSTEM RESET memory_max_target SCOPE=SPFILE SID='*';
  Sat Apr 28 06:18:18 2012
  ALTER SYSTEM SET use_large_pages='TRUE' SCOPE=SPFILE SID='*';
  Sat Apr 28 06:23:51 2012
...
...

 

However, evidence of the changes still shows up in the alert logs, especially at instance restarts (which were frequent).
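To find which node's alert log recorded the parameter changes, the ALTER SYSTEM lines can be paired with the timestamp line that precedes them. The following is a minimal sketch only: the function name find_alter_system and the sample text are illustrative assumptions, and the timestamp pattern is inferred from the excerpt above rather than from any documented log format.

```python
import re

# Timestamp lines in the excerpt look like "Sat Apr 28 06:16:49 2012".
# This pattern is an assumption based on that excerpt only.
TIMESTAMP = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$"
)


def find_alter_system(alert_log_text):
    """Return (timestamp, statement) pairs for ALTER SYSTEM lines."""
    results = []
    last_timestamp = None
    for raw in alert_log_text.splitlines():
        line = raw.strip()
        if TIMESTAMP.match(line):
            # Remember the most recent timestamp seen so far.
            last_timestamp = line
        elif line.upper().startswith("ALTER SYSTEM"):
            results.append((last_timestamp, line))
    return results


# Sample text copied from the alert log excerpt in this note.
sample = """\
Sat Apr 28 06:16:49 2012
ALTER SYSTEM SET memory_target='0' SCOPE=SPFILE SID='*';
Sat Apr 28 06:18:18 2012
ALTER SYSTEM SET use_large_pages='TRUE' SCOPE=SPFILE SID='*';
Sat Apr 28 06:23:51 2012
"""

for stamp, statement in find_alter_system(sample):
    print(stamp, "|", statement)
```

Running the same function over the alert log text from each node would show the statements only on the node where they were issued.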


Changes

Switched ASM from using AMM to USE_LARGE_PAGES.

Cause

ASM was set to use large pages (HugePages) without the associated changes being made to the OS, so no large pages were configured system wide.

The alert log excerpt below confirms that the use_large_pages parameter was in use:

Machine: x86_64
Using parameter settings in server-side spfile +DBFS_DG/....
System parameters with non-default values:
  processes = 1200
  use_large_pages = "TRUE"   <<<<< Older parameter setting, no longer used in favor of ONLY
  large_pool_size = 16M
  instance_type = "asm"    
  ....
  sga_target = 1264M
  memory_target = 0
  pga_aggregate_target = 400M
...
...

****************** Large Pages Information *****************

Total Shared Global Region in Large Pages = 0 KB (0%)

Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 16 MB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB

RECOMMENDATION:
 Total Shared Global Region size is 1258 MB. For optimal performance,
 prior to the next instance restart increase the number
 of unused Large Pages by atleast 629 2048 KB Large Pages (1258 MB)
 system wide to get 100% of the Shared
 Global Region allocated with Large pages
***********************************************************
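The page count in the RECOMMENDATION above follows directly from the SGA size and the large page size. As a worked check, here is a minimal sketch; the function name large_pages_needed is hypothetical, and it covers this one instance only, not other instances on the same host.

```python
def large_pages_needed(sga_mb, page_size_kb=2048):
    """Number of large pages required to hold an SGA of sga_mb megabytes.

    Uses integer ceiling division so a partial page still counts as one page.
    """
    sga_kb = sga_mb * 1024
    return (sga_kb + page_size_kb - 1) // page_size_kb


# Values taken from the alert log excerpt: 1258 MB SGA, 2048 KB pages.
print(large_pages_needed(1258))  # 629
```

On Linux the system-wide count is set through the vm.nr_hugepages kernel parameter. The script in Note 401749.1 remains the recommended way to derive the value, since it takes all running instances into account.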


 

Solution

1) Set use_large_pages = FALSE on all ASM instances; do not use HugePages on ASM.

OR

   Configure HugePages at the OS level using the values calculated by the script in Note 401749.1 - Oracle Linux: Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration,
   and then set use_large_pages=ONLY.
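With use_large_pages=ONLY the instance does not start unless its whole SGA can be allocated from HugePages, so it is worth confirming that the OS has enough free pages first. The sketch below parses text in the format of /proc/meminfo on Linux. The function names and the sample text are illustrative assumptions; the sample is constructed to mirror the zero-page state reported in the alert log above, and the check ignores reserved pages for simplicity.

```python
import re


def parse_hugepages(meminfo_text):
    """Extract HugePages_Total, HugePages_Free and Hugepagesize (in kB)."""
    fields = {}
    for key in ("HugePages_Total", "HugePages_Free", "Hugepagesize"):
        match = re.search(rf"^{key}:\s+(\d+)", meminfo_text, re.MULTILINE)
        # Treat a missing field as zero (HugePages not available).
        fields[key] = int(match.group(1)) if match else 0
    return fields


def sga_fits_in_hugepages(meminfo_text, sga_mb):
    """True if the free HugePages can hold an SGA of sga_mb megabytes."""
    info = parse_hugepages(meminfo_text)
    free_kb = info["HugePages_Free"] * info["Hugepagesize"]
    return free_kb >= sga_mb * 1024


# Illustrative sample: no HugePages configured, 2048 kB page size.
sample = """\
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
Hugepagesize:       2048 kB
"""

print(sga_fits_in_hugepages(sample, 1258))  # False
```

On a live node the text would come from reading /proc/meminfo, for example with open("/proc/meminfo").read().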


2) Although explicit SGA and PGA settings can be used for ASM, it is recommended to use AMM with MEMORY_TARGET set to 1.6G.

    Leave the minimum settings at 400M for PGA and 600M for SGA; the remaining balance is acquired dynamically by either as needed.

AMM relies on the /dev/shm mount point on Linux. Alternatively, you can configure ASMM instead of AMM:

  . Set SGA_TARGET, SGA_MAX_SIZE and PGA_AGGREGATE_TARGET instead of MEMORY_TARGET.

Please see the References section for more information on ASM, AMM, HugePages and USE_LARGE_PAGES.

 

NOTE: The symptom "global system state dump request (kjdgdss_g)" is actually due to a known bug that generates excessive systemstate dumps during hang conditions.
      Here it is a consequence of the hangs caused by the memory misconfiguration, but it can occur under many other circumstances and is discussed in Note 10256843.8.

 

References

<NOTE:10256843.8> - Bug 10256843 - Hang manager may trigger unnecessary SYSTEMSTATE dumps on ASM instance
<NOTE:401749.1> - Oracle Linux: Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration
<NOTE:265633.1> - ASM Technical Best Practices For 10g and 11gR1 Release
<NOTE:1279458.1> - Exadata Database Machine Reference Guide for Upgrade 11.2.0.1 to 11.2.0.2
<BUG:10256843> - HANG MANAGER MAY TRIGGER UNNECESSARY SYSTEMSTATE DUMPS ON ASM INSTANCE
<NOTE:1392497.1> - USE_LARGE_PAGES To Enable HugePages

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.