Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2018298.1
Update Date: 2018-05-30

Solution Type  Problem Resolution Sure

Solution  2018298.1 :   Oracle ZFS Storage Appliance: Systems "hang" due to 'AKD threads running in the RT class' in 2013.1.x releases  


Related Items
  • Sun ZFS Storage 7420
  • Oracle ZFS Storage ZS5-2
  • Oracle ZFS Storage ZS3-2
  • Oracle ZFS Storage ZS4-4
  • Oracle ZFS Storage ZS5-4
  • Sun ZFS Storage 7120
  • Oracle ZFS Storage ZS3-4
  • Sun ZFS Storage 7320
  • Oracle ZFS Storage Appliance Racked System ZS4-4
  • Oracle ZFS Storage ZS3-BA
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS


An NMI crash dump collected while the system is hung shows the system out of memory, with the 'pageout' threads blocked by AKD threads running in the 'Real Time' (RT) scheduler class.

In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-10247940061>

Applies to:

Sun ZFS Storage 7320 - Version All Versions and later
Sun ZFS Storage 7420 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
Sun ZFS Storage 7120 - Version All Versions and later
Oracle ZFS Storage ZS3-4 - Version All Versions and later
7000 Appliance OS (Fishworks)

Symptoms

ZFS Storage Appliances hang during certain workloads after upgrading to 2013.1.x versions.

Generating an NMI crash dump (see KM Doc 1173064.1) while the system is hung allows Oracle TSC to analyze the state of the system at the time of the hang.

In this case, the crash dump will reveal a system almost completely out of free memory, with a large file sync pending.

Typically ::memstat will show a system running with 0% free memory.
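
When reviewing saved ::memstat output from a crash dump, the free-memory check described above can be sketched as a small text filter. This is only an illustration: the function names, the threshold, and the sample text (made-up numbers in the general layout of ::memstat output) are assumptions, not data from an affected system.

```python
import re

def free_percent(memstat_text):
    """Sum the %Tot column of every row whose label starts with 'Free'."""
    total = 0
    for line in memstat_text.splitlines():
        if line.startswith("Free"):
            match = re.search(r"(\d+)%\s*$", line)
            if match:
                total += int(match.group(1))
    return total

def looks_exhausted(memstat_text, threshold=1):
    """True when the combined free percentage is below the (assumed) threshold."""
    return free_percent(memstat_text) < threshold

# Illustrative input only: invented numbers, not taken from a real crash dump.
SAMPLE = """\
Page Summary                Pages                MB  %Tot
Kernel                   60000000            234375   89%
Anon                      7000000             27343   10%
Free (cachelist)             1000                 3    0%
Free (freelist)              2000                 7    0%
Total                    67003000            261728
"""

if __name__ == "__main__":
    print(free_percent(SAMPLE), looks_exhausted(SAMPLE))
```

A result of 0% free only confirms the memory symptom; the blocked 'pageout' threads still have to be identified from the crash dump itself.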

A good analysis can be found in bug 17709054, which has since been marked as a duplicate of bug 19591405.

Changes

A recent upgrade to any 2013.1.x version.

Cause

The issue is caused by bug 19591405 "akd running in RT class is blocking critical memory management threads".

This bug is not yet resolved but a fully functional workaround can be provided.

Solution

Upload and run the attached workflow. It prevents most AKD threads from running in the 'Real Time' (RT) scheduler class, where they can keep memory management threads off the CPU.

Attached workflow: Workaround for 19591405
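
Whether AKD threads are still in the RT class can be checked from a per-thread process listing, if one is available. The sketch below assumes text in the column layout that a Solaris command such as `ps -eLo class,pid,lwp,comm` would print; the function name, the sample listing, and the PIDs are hypothetical, and on an appliance such a listing would normally be collected only under the direction of Oracle Support.

```python
def rt_threads(ps_text, name="akd"):
    """Return (pid, lwp) pairs for threads of `name` listed in the RT class."""
    hits = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        sched_class, pid, lwp, command = fields
        # Compare on the command's base name so a full path also matches.
        if sched_class == "RT" and command.strip().rsplit("/", 1)[-1] == name:
            hits.append((int(pid), int(lwp)))
    return hits

# Hypothetical listing: the classes, PIDs and LWP ids are invented.
SAMPLE = """\
 CLS   PID   LWP COMMAND
 SYS     2     1 pageout
  RT  1234     1 akd
  RT  1234     2 akd
  TS  1234     3 akd
  TS  2000     1 sshd
"""

if __name__ == "__main__":
    for pid, lwp in rt_threads(SAMPLE):
        print(f"akd thread in RT class: pid={pid} lwp={lwp}")
```

Because the workaround moves most, not all, AKD threads out of the RT class, a short list after the workflow has run is expected; a long one suggests the workflow has not taken effect.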

***Checked for relevance on 30-MAY-2018***

References

<BUG:19591405> - AKD RUNNING IN RT CLASS IS BLOCKING CRITICAL SYSTEM MANAGEMENT THREADS
<NOTE:1173064.1> - Oracle ZFS Storage Appliance: How to generate a system core dump in case of system hang (BUI and CLI fails to respond) using NMI when directed to do so by an Oracle Support Engineer
<BUG:17709054> - 7120 LARGER FSYNC OPERATION SHOULD BE BLOCKED IN LOW MEMORY CONDITIONS

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.