Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2112032.1
Update Date: 2017-07-18
Keywords:

Solution Type: Problem Resolution

Solution  2112032.1 :   FS System: How to Avoid Exceeding the Stripe Handle Threshold Limit  


Related Items
  • Oracle FS1-2 Flash Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Flash Storage>SN-EStor: FSx




Created from <SR 3-12210140961>

Applies to:

Oracle FS1-2 Flash Storage System - Version All Versions and later
Information in this document applies to any platform.

Symptoms

This document describes the stripe handle threshold limits and how best to manage a Flash Storage System so that the threshold is not exceeded.  Systems that reach the limit report Pinned Data events and place Volumes in Read Only status.  In Release 06.02.02 and lower, Controllers can also warm start, and in extreme cases access to data can be lost.

Changes

 

Cause

FS1-2 Flash Storage Systems before R6.2.11 have a system-wide stripe handle threshold limit of 131,072.  Beginning with R6.2.11, customers can raise that limit to 1,048,576 by enabling Enhanced Allocation (see the patch Readme for details on Enhanced Allocation).  On systems with large numbers of Hard Disk Drive (HDD) Drive Enclosures, the threshold can easily be reached.  Systems with capacity Solid State Drives (SSDs) are even more susceptible because they consume more stripe handles than their HDD counterparts.  If the limit is reached while Auto Tiering is enabled, the system first attempts to use all of the available capacity; further storage infill is then required, causing repeated software faults on the Controllers.

If an FS1-2 experiences this condition, contact Oracle Customer Support for assistance in recovering the system.  Recovery will likely require a Disruptive Upgrade to at least release 06.02.03.  While release 06.02.03 inhibits the software faults on the Controllers, it can still result in Pinned Data and Read Only Volumes for the resources that attempted to allocate storage.  As such, upgrading to R6.2.11 or higher and enabling Enhanced Allocation is the better solution.

A Drive Group is a collection of physical drives (either 12 HDDs or 6 SSDs) within the same Drive Enclosure.  The table below shows how many stripe handles are used for each Drive Group based on the drive capacity:

Maximum Stripe Handles

Drive Capacity   Stripe Handles per Drive Group   Drive Groups per Enclosure
300GB HDD          750                            2
900GB HDD         2250                            2
1.2TB HDD         3000                            2
4TB HDD           1250                            2
8TB HDD           2500                            2
10TB HDD          3125                            2
400GB SSD         4000                            1-2
1.6TB SSD         8000                            1-3
3.2TB SSD        16000                            1-3

These numbers assume that 100% of the available capacity is used by Auto Tiering. If there are Single Tiered LUNs on the storage classes, the number of stripe handles will be lower.
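As an illustration only (not an Oracle-supplied tool), the per-Drive-Group figures in the table above can be combined with simple shell arithmetic to estimate a system's total; the Drive Group counts below are hypothetical:

```shell
#!/bin/sh
# Sketch: estimate total stripe handles from Drive Group counts,
# using the per-Drive-Group values from the table above.
# Drive Group counts here are hypothetical examples.
PERF_900GB_DGS=10     # 900GB HDD Drive Groups (2250 handles each)
NEARLINE_4TB_DGS=8    # 4TB HDD Drive Groups (1250 handles each)
SLC_400GB_DGS=4       # 400GB SSD Drive Groups (4000 handles each)
MLC_1600GB_DGS=4      # 1.6TB SSD Drive Groups (8000 handles each)

total=$(( PERF_900GB_DGS*2250 + NEARLINE_4TB_DGS*1250 \
        + SLC_400GB_DGS*4000 + MLC_1600GB_DGS*8000 ))
echo "Total stripe handles: $total (limit: 131072)"
# prints: Total stripe handles: 80500 (limit: 131072)
```

Any system total below the 131,072 limit (1,048,576 with Enhanced Allocation) leaves headroom; the worked examples in the Solution section apply the same arithmetic.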

Solution


Based on the relationship between storage capacity and the number of stripes, it may be necessary to adjust the 'Allowable Storage Capacity for Auto-tiered LUNs' percentage within a Storage Domain.  If you are unsure how to calculate the stripe handle count or how to adjust the Storage Domain settings, contact Oracle Support Services.

  1. Extract the Drive Group information from FS1-2.  This can be obtained in two ways:
    • Using the Oracle FS System Manager GUI:
      1. From the System tab, expand Hardware if needed and select Drive Groups and note the Media Type:
        [Screenshot: Drive Groups]

      2. Right-click each Drive Group, select View, and note the Drive Capacity:
        [Screenshot: Drive Group Size]

    • Using the Oracle FS System Manager CLI (fscli), run the following commands:
      1. Gather the raw data on the Drive Groups:
        # fscli login -u administrator -oracleFS <FS1 management IP address>
        # fscli drive_group -list -details > drive_group_details.txt


      2. Examine the output to determine the Media Type and Drive Capacity of HDDs/SSDs in each Drive Group:
        /DRIVE_GROUP-000
            Id                          : 414B303031323639A2860D2E27EF9F10
            ManagementState             : AVAILABLE
        ...
            DiskDrive
                DriveStatus                 : NORMAL
                Model                       : H7280A520SUN8.0T
          
  2.  Using the chart in the previous section, calculate the number of stripe handles:

    Example 1 - FS1-2 Flash Storage System with 13 Drive Enclosures:

    Enclosures:
    2 x ORACLE  DE2-24P/378GB/STORAGE_CLASS_SLC_SSD
    4 x ORACLE  DE2-24C/3784GB/STORAGE_CLASS_NEARLINE_HDD
    5 x ORACLE  DE2-24P/851GB/STORAGE_CLASS_PERF_HDD
    2 x ORACLE  DE2-24P/1513GB/STORAGE_CLASS_MLC_SSD


    Drive Group Information:
    10 x 900GB PERF_HDD DG (2250)   =  22500
     8 x 4TB NEARLINE_HDD DG (1250) =  10000
     4 x 400GB SLC_SSD DG (4000)    =  16000
     4 x 1.6TB MLC_SSD DG (8000)    =  32000
                                 Total 80500
       

    80500 is below the limit, so there is no issue with providing 100% of the available capacity for Auto-Tier usage.

    Example 2 - FS1-2 Flash Storage System with 30 Drive Enclosures: 
    Enclosures:
     4 x ORACLE  DE2-24P/378GB/STORAGE_CLASS_SLC_SSD
    22 x ORACLE  DE2-24P/851GB/STORAGE_CLASS_PERF_HDD
     4 x ORACLE  DE2-24P/1513GB/STORAGE_CLASS_MLC_SSD


    Drive Group Information:
    44 x 900GB PERF_HDD DG (2250) =  99000
     8 x 400GB SLC_SSD DG (4000)  =  32000
    12 x 1.6TB MLC_SSD DG (8000)  =  96000
                              Total 227000

    227000 is above the threshold limit in releases before Enhanced Allocation (R6.2.11), so the available capacity for Auto-Tier usage must be reduced.



  3. Making the following adjustments lowers the total number of stripe handles to 124700, which is under the threshold limit:
    Perf HDD: room for 99000 stripe handles.  Limit auto-tier to  50% to consume only 49500
    SLC  SSD: room for 32000 stripe handles.  Leave auto-tier at 100% to consume only 32000
    MLC  SSD: room for 96000 stripe handles.  Limit auto-tier to  45% to consume only 43200

    These values leave just over 6300 stripe handles available for use by Single Tier LUNs and Metadata LUNs.

    NOTE: To receive the full benefit of Auto-Tiering, 100% should be allocated for Performance SSDs. Performance SSD usage may need to be limited if this media is used for both Single-Tiered and Auto-Tiered LUNs. See Document 2044814.1, FS System: LUNs May Experience Higher Latency When Storage Class Free Space Is Low, for more details.
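The arithmetic behind the step 3 adjustment can be checked with a short sketch; the percentages and per-class totals are the ones from Example 2 above:

```shell
#!/bin/sh
# Sketch: verify the Example 2 adjustment against the 131072 limit.
perf=$(( 99000 * 50 / 100 ))    # Perf HDD limited to 50%  -> 49500
slc=$(( 32000 * 100 / 100 ))    # SLC SSD left at 100%     -> 32000
mlc=$(( 96000 * 45 / 100 ))     # MLC SSD limited to 45%   -> 43200
total=$(( perf + slc + mlc ))
headroom=$(( 131072 - total ))
echo "Adjusted total: $total, headroom: $headroom"
# prints: Adjusted total: 124700, headroom: 6372
```

The headroom of 6372 is the "just over 6300 stripe handles" available for Single Tier LUNs and Metadata LUNs mentioned above.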

 

 

Internal section:

Number of stripes and stripe handles per DG

  2 MAUs make one stripe of SLC SSD
  4 MAUs make one stripe of MLC SSD
 16 MAUs make one stripe of Perf HDD
128 MAUs make one stripe of Cap HDD

Type of drive    #MAUs per DG    #MAUs per DG in hexadecimal    #Stripe handles per DG
300GB HDD               12000                         0x2EE0                       750
900GB HDD               36000                         0x8CA0                      2250
1.2TB HDD               48000                         0xBB80                      3000
4TB HDD                160000                        0x27100                      1250
8TB HDD                320000                        0x4E200                      2500
10TB HDD               400000                        0x61A80                      3125
400GB SSD                8000                         0x1F40                      4000
1.6TB SSD               32000                         0x7D00                      8000
3.2TB SSD               64000                         0xFA00                     16000
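The stripe handle column follows directly from the MAU figures: stripe handles per Drive Group = MAUs per Drive Group / MAUs per stripe for that media type. A few rows can be spot-checked with shell arithmetic:

```shell
#!/bin/sh
# Sketch: derive stripe handles per DG from the MAU figures above.
# Stripe sizes: SLC SSD = 2 MAUs, MLC SSD = 4, Perf HDD = 16, Cap HDD = 128.
echo $(( 36000 / 16 ))     # 900GB Perf HDD  -> 2250
echo $(( 160000 / 128 ))   # 4TB Cap HDD    -> 1250
echo $(( 32000 / 4 ))      # 1.6TB MLC SSD  -> 8000
```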

 

Procedure to obtain the number of used stripe handles using the system logs:

To determine the number of stripe handles currently in use, the system log collection must include COD data. COD data is generally available in every log bundle, including periodic log collections.

 

  1. Locate the system log tar file and then execute the following in the tar file directory: 
    # scanlog6 -t <tar_log_bundle_name> -separate
      

  2. Extract the COD information in a readable format:
    # geomap_r6.pl -s off
     

  3. Count the stripe handle numbers using the geomap.Chapter.txt file
    # grep stripeHandle geomap.Chapter.txt | sort | uniq | wc
      
    NOTE: The value needed is the first number in the output (the line count)
     
    Example:
      
    # geomap_r6.pl -s off
    Generating geomap.Chapter.txt
    Generating geomap.EnclosureRMAP.txt
    Generating geomap.EnclosureSummary.txt
    Generating geomap.EnclosureVolumes.txt
    Generating geomap.SlatSummary.txt
    Generating geomap.StorageDomainSummary.txt
    Generating geomap.StorageDomainVolumes.txt
    Generating geomap.SystemCapacity.txt
    Generating geomap.VolumeExtents.txt
    Generating geomap.VolumeSummary.txt
    Generating geomap.VolumeExtents.xml and geomap.BrickVolumes.xml
    Generating geomap.csv
    Generating lun2Vlun list
    Skipping geomap.SlatExtent.txt
    # grep stripeHandle geomap.Chapter.txt | sort | uniq | wc
      62336   124672 2276392

    The numbers shown are the output of the wc tool, which reports the newline, word, and byte counts for its input. Only the newline count matters here, because each unique stripeHandle line represents one used stripe handle. To get just the line count, use the following command instead:

    # grep stripeHandle geomap.Chapter.txt | sort | uniq | wc -l
       62336  
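As an unofficial convenience (a sketch, not an Oracle-supplied tool), the same pipeline can be wrapped to compare the used count against the system-wide limit; adjust LIMIT to 1048576 if Enhanced Allocation is enabled:

```shell
#!/bin/sh
# Sketch: compare the used stripe handle count from geomap.Chapter.txt
# (produced by geomap_r6.pl as shown above) against the system-wide limit.
LIMIT=131072   # pre-R6.2.11 default; 1048576 with Enhanced Allocation
used=$(grep stripeHandle geomap.Chapter.txt | sort | uniq | wc -l)
pct=$(( used * 100 / LIMIT ))
echo "Used: $used of $LIMIT stripe handles (${pct}%)"
```

Run it in the same directory as geomap.Chapter.txt; a result approaching 100% indicates the Auto-Tier percentage adjustments described in the Solution section are needed.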

      


 

References

<NOTE:2044814.1> - FS System: LUNs May Experience Higher Latency When Storage Class Free Space Is Low
<BUG:22763703> - FS1 IS DOWN

  Copyright © 2018 Oracle, Inc.  All rights reserved.