Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1553160.1
Update Date:2015-09-11

Solution Type  Problem Resolution Sure

Solution  1553160.1 :   Sun Storage 7000 Unified Storage System: Vertical Outlier Elimination for Analytics broken down by latency shows less operations when set to 0%  


Related Items
  • Sun ZFS Storage 7320
  • Sun Storage 7210 Unified Storage System
  • Sun Storage 7410 Unified Storage System
  • Sun ZFS Storage 7420
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun Storage 7110 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS




In this Document
Symptoms
Cause
Solution


Created from <SR 3-7094481451>

Applies to:

Sun ZFS Storage 7120 - Version All Versions to All Versions [Release All Releases]
Sun ZFS Storage 7320 - Version All Versions to All Versions [Release All Releases]
Sun ZFS Storage 7420 - Version All Versions to All Versions [Release All Releases]
Sun Storage 7210 Unified Storage System - Version All Versions to All Versions [Release All Releases]
Sun Storage 7110 Unified Storage System - Version All Versions to All Versions [Release All Releases]
7000 Appliance OS (Fishworks)

Symptoms


Analytics, when configured with vertical outlier elimination, shows incorrect data. If no elimination is set (0%), fewer operations are shown and almost no latency detail is visible. If elimination is set to 1% or 5%, many operations with latency are shown.

Normally, 0% elimination should include all the peaks and all the latencies. The problem is that more operations with latency are shown when elimination is enabled, when in fact elimination should exclude some of the operations.

Thus, there should always be more operations shown when the elimination is set to 0% than when it is set to 1% or 5%.

Object: Problem has been noticed on the following analytics:
  -> Protocol: NFSv4 operations per second broken down by latency
  -> Protocol: SMB operations per second broken down by latency

System: Sun Storage Unified Systems

AK version: 2011.1.4.2

 

In the BUI, the following outputs were observed for the statistic "NFSv4 operations per second broken down by latency".

With 0% elimination, fewer IOPS are shown. More precise information is provided with 1% or 5% elimination.

- with 0% elimination (all the iops) there are:
             -> 2 ops with 26.1 ms latency
             -> 6 ops with 13.0 ms latency
             -> 1859 ops with 0 us latency

- with 0.1% elimination there are:
             -> 2 ops with 28.9 ms latency
             -> 1 ops with 16.7 ms latency
             -> 1 ops with 15.2 ms latency
             -> 4 ops with 13.7 ms latency
             -> 2 ops with 12.2 ms latency
             -> 2 ops with 10.7 ms latency
             -> 4 ops with 9.13 ms latency
             -> 1 ops with 7.61 ms latency
             -> 3 ops with 6.09 ms latency
             -> 1 ops with 4.57 ms latency
             -> 8 ops with 1.52 ms latency
             -> 1838 ops with 0 us latency

- with 1% elimination there are:
             -> 1 ops with 2.96 ms latency
             -> 1 ops with 2.61 ms latency
             -> 1 ops with 2.44 ms latency
             -> 1 ops with 2.26 ms latency
             -> 1 ops with 2.09 ms latency
             -> 1 ops with 1.91 ms latency
             -> 1 ops with 1.74 ms latency
             -> 1 ops with 1.57 ms latency
             -> 2 ops with 1.39 ms latency
             -> 2 ops with 1.22 ms latency
             -> 5 ops with 1.04 ms latency
             -> 6 ops with 870 us latency
             -> 27 ops with 696 us latency
             -> 97 ops with 522 us latency
             -> 552 ops with 348 us latency
             -> 23 ops with 174 us latency
             -> 1124 ops with 0 us latency

- with 5% elimination there are:
             -> 2 ops with 957 us latency
             -> 1 ops with 914 us latency
             -> 2 ops with 870 us latency
             -> 5 ops with 827 us latency
             -> 5 ops with 783 us latency
             -> 9 ops with 740 us latency
             -> 8 ops with 696 us latency
             -> 9 ops with 653 us latency
             -> 24 ops with 690 us latency
             -> 25 ops with 566 us latency
             -> 39 ops with 522 us latency
             -> 104 ops with 479 us latency
             -> 202 ops with 435 us latency
             -> 211 ops with 392 us latency
             -> 35 ops with 348 us latency
             -> 3 ops with 305 us latency
             -> 6 ops with 261 us latency
             -> 4 ops with 218 us latency
             -> 10 ops with 174 us latency
             -> 14 ops with 131 us latency
             -> 182 ops with 87 us latency
             -> 269 ops with 44 us latency
             -> 659 ops with 0 us latency
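
As a quick cross-check of the listings above, the following Python sketch adds up the operation counts for each elimination setting and counts the latency rows shown. The counts are copied from the listings; the names `rows` and `summarize` are introduced here only for illustration.

```python
# Operation counts per latency row, copied from the listings above
# (highest latency first). Only the counts are needed for the totals.
rows = {
    "0%":   [2, 6, 1859],
    "0.1%": [2, 1, 1, 4, 2, 2, 4, 1, 3, 1, 8, 1838],
    "1%":   [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 5, 6, 27, 97, 552, 23, 1124],
    "5%":   [2, 1, 2, 5, 5, 9, 8, 9, 24, 25, 39, 104, 202, 211, 35,
             3, 6, 4, 10, 14, 182, 269, 659],
}


def summarize(rows):
    """Return {setting: (number_of_rows, total_operations)}."""
    return {setting: (len(counts), sum(counts))
            for setting, counts in rows.items()}


summary = summarize(rows)
for setting, (n_rows, total) in summary.items():
    print(f"{setting:>5}: {n_rows:2d} latency rows, {total} operations in total")
```

For these particular samples the totals are 1867, 1867, 1846 and 1828 operations, spread over 3, 12, 17 and 23 latency rows respectively. In this data the settings differ mainly in how finely the operations are spread across latency rows.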

Cause

The cause of this problem is the way the DTrace script behind these statistics calculates the latencies.

With 0% elimination, the output shows the maximum and the minimum values and rounds the latency values in between. The result is a reduced, imprecise output that amounts to an average of the latencies.

With 1% elimination, and especially with 5% elimination, the latency information is more accurate and more detailed.

DTrace uses the lquantize() aggregating function, which produces a linear frequency distribution of the values of the specified expression, sized by the specified range. For each value, it increments the count in the highest bucket whose lower bound does not exceed that value.

An example of the lquantize() function as used in the code is @us0 = lquantize(this->usec, 0, 1000, 10), where the latency is expressed in microseconds, the range runs from 0 to 1000, and the step is 10.
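
The bucketing performed by that call can be modelled outside DTrace. The Python sketch below is an illustration only, not appliance code: `lquantize_bucket` and `lquantize` are names chosen here, and the sample latencies are made up.

```python
def lquantize_bucket(value, lower, upper, step):
    """Return the label of the linear bucket an integer value falls into.

    Values below `lower` go to an underflow bucket and values at or above
    `upper` go to an overflow bucket, as in DTrace's lquantize().
    """
    if value < lower:
        return f"< {lower}"
    if value >= upper:
        return f">= {upper}"
    return str(lower + ((value - lower) // step) * step)


def lquantize(values, lower, upper, step):
    """Count how many values land in each bucket."""
    counts = {}
    for v in values:
        key = lquantize_bucket(v, lower, upper, step)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up latencies in microseconds, for illustration only.
latencies_us = [3, 9, 10, 17, 348, 999, 1000, 2500]
hist = lquantize(latencies_us, 0, 1000, 10)
print(hist)
```

With a step of 10, the values 3 and 9 share bucket 0, 10 and 17 share bucket 10, and anything of 1000 microseconds or more is counted only in the overflow bucket of this aggregation.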

So 0% elimination does not actually show every operation and every latency; it summarizes the values and calculates them according to the Gaussian function. This is why some values are missing from the output.

5% elimination provides the most precise values, with all the operations and the latencies.
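
One way to picture why trimming the outliers yields finer rows is the following Python sketch. It is a hypothetical model, not the appliance's rendering code: it assumes the display splits the range from 0 to the largest retained latency into a fixed number of equal-width rows. The function `display_rows`, the row count of 10, and the sample data are all assumptions made for illustration.

```python
def display_rows(latencies_us, eliminate_pct, n_rows):
    """Hypothetical model of the display (an assumption, not appliance code).

    Drop the highest `eliminate_pct` percent of samples, then split the
    range 0..max(remaining) into `n_rows` equal-width rows and count the
    samples per row. Returns (row_width_us, counts).
    """
    ordered = sorted(latencies_us)
    keep = len(ordered) - int(len(ordered) * eliminate_pct / 100)
    kept = ordered[:keep]
    width = max(kept) / n_rows
    counts = [0] * n_rows
    for v in kept:
        # The maximum itself would index one past the end; clamp it.
        idx = min(int(v // width), n_rows - 1)
        counts[idx] += 1
    return width, counts


# Made-up workload: 95 fast operations (100..570 us) and 5 slow outliers.
samples = [100 + 5 * i for i in range(95)] + [20000, 22000, 24000, 26000, 30000]

w0, c0 = display_rows(samples, 0, 10)   # no elimination
w5, c5 = display_rows(samples, 5, 10)   # top 5% eliminated
print(w0, c0)
print(w5, c5)
```

Under these assumptions, 0% elimination gives rows 3000 microseconds wide, so all 95 fast operations collapse into the first row. With 5% elimination the rows are 57 microseconds wide and the same operations spread over nine rows.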

Best practice is to keep the elimination at the default value of 0.1%. Never set the elimination to 0%, as you might lose important data from the analysis.

Using 5% is not recommended either, as it is too detailed and might hamper the analysis of a performance issue.

 


To obtain the code behind the analytics, go to Browser User Interface -> Analytics -> Open Worksheets and choose the statistic "Protocol: SMB (or NFS) operations per second broken down by latency".
Press the "Export Data" button and the following code appears:

smb:::op-*-start
{
        self->opstart = timestamp;
}

smb:::op-*-done
/(self->opstart != 0)/
{
        this->usec = (timestamp - self->opstart) / 1000;
        this->msec = this->usec / 1000;

        @us0 = lquantize(this->usec, 0, 1000, 10);
        @us1 = lquantize(this->usec, 1000, 10000, 100);
        @ms0 = lquantize(this->msec, 10, 100, 1);
        @ms1 = lquantize(this->msec, 100, 1000, 10);
        @ms2 = lquantize(this->msec, 1000, 10000, 100);
        @ms3 = lquantize(this->msec, 10000, 100000, 1000);
}


This explains how the latencies are calculated.
For more details about the lquantize() function in DTrace, see the DTrace documentation at:  https://docs.oracle.com/cd/E18752_01/html/817-6223/chp-aggs-2.html#indexterm-155
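
The six lquantize() calls in the script cover successive latency ranges with progressively coarser steps. The Python sketch below mirrors their arguments to show which aggregation holds a given latency inside its range and at what resolution. The names `TIERS` and `classify` are introduced here for illustration and are not part of the exported code.

```python
# (aggregation, unit, lower, upper, step), mirroring the six lquantize() calls.
TIERS = [
    ("@us0", "usec", 0, 1000, 10),
    ("@us1", "usec", 1000, 10000, 100),
    ("@ms0", "msec", 10, 100, 1),
    ("@ms1", "msec", 100, 1000, 10),
    ("@ms2", "msec", 1000, 10000, 100),
    ("@ms3", "msec", 10000, 100000, 1000),
]


def classify(elapsed_ns):
    """Return (aggregation, bucket, unit) for the tier whose range holds the latency."""
    usec = elapsed_ns // 1000   # same integer division as this->usec
    msec = usec // 1000         # same integer division as this->msec
    for name, unit, lower, upper, step in TIERS:
        value = usec if unit == "usec" else msec
        if lower <= value < upper:
            bucket = lower + ((value - lower) // step) * step
            return name, bucket, unit
    return None  # 100 s or more: only overflow buckets are incremented


for ns in (348_000, 2_960_000, 26_100_000):
    print(ns, classify(ns))
```

A 348 microsecond operation is recorded at 10 microsecond resolution in @us0, a 2.96 ms operation at 100 microsecond resolution in @us1, and a 26.1 ms operation at 1 ms resolution in @ms0.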

 

Solution

Always use the default value of 0.1% for vertical outlier elimination for analytics broken down by latency.


  Copyright © 2018 Oracle, Inc.  All rights reserved.