Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-2288649.1
Update Date:2018-05-02
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  2288649.1 :   SuperCluster: Best Practice - Establish Threshold Alerts on the ZFSSA  


Related Items
  • Oracle SuperCluster M6-32 Hardware
  • Oracle SuperCluster T5-8 Hardware
  • Oracle SuperCluster Specific Software
  • Oracle SuperCluster M7 Hardware
  • SPARC SuperCluster T4-4 Half Rack
  • Oracle SuperCluster T5-8 Full Rack
  • SPARC SuperCluster T4-4
  • SPARC SuperCluster T4-4 Full Rack
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>SPARC SuperCluster>DB: SuperCluster_EST




In this Document
Purpose
Scope
Details
References


Applies to:

Oracle SuperCluster Specific Software - Version 1.x to 2.x [Release 1.0 to 2.0]
Oracle SuperCluster M7 Hardware - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 Half Rack - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 Full Rack - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Purpose

Best Practice:

Establish Threshold Alerts on the ZFSSA to ensure that availability and performance are not degraded for iSCSI LUNs consumed by SuperCluster tenant rpools

Scope

Benefit:

A properly monitored and maintained internal ZFSSA helps keep all LDoms and zones that consume iSCSI LUNs from it operating optimally. Although the internal ZFSSA is also available as general-purpose shared storage over NFS, the single-JBOD ZFSSA is not designed for heavy IO over extended periods of time. Monitoring utilization also shows which of your virtual tenants may be using the ZFSSA excessively and negatively impacting other tenants on the SuperCluster.

Risk:

An improperly utilized ZFSSA without monitoring can leave one or many tenants with local IO operations so slow that their rpools become unresponsive. This can have a wide range of database and application repercussions.

 

Details

Mandatory actions:

Run the attached zfs_thrsh_alert.sh script and follow the prompts to create the analytics worksheet and IOPS threshold. You will then be alerted when the ZFSSA is too heavily loaded and can obtain more detail about what is overloading it.

  

root@cinqodemayo:~# ls -l ~/.ssh/id_rsa.pub
root/.ssh/id_rsa.pub: No such file or directory

  

If you do not have an RSA public key for the user you are running the script as (see above), create one:

  

root@cinqodemayo:~# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
5e:................:14:e9:5a root@cinqodemayo
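
The check-then-create step above can be sketched as a small shell function. This is only an illustration: ensure_rsa_key is a hypothetical helper name that is not part of zfs_thrsh_alert.sh, and the empty passphrase (-N "") is an assumption made so that later ssh runs do not prompt.

```shell
#!/bin/sh
# Sketch: create an RSA key pair only when no public key exists yet.
# ensure_rsa_key is a hypothetical helper, not part of zfs_thrsh_alert.sh.
ensure_rsa_key() {
    key="${1:-$HOME/.ssh/id_rsa}"   # private key path; public key is "$key.pub"
    if [ -f "$key.pub" ]; then
        echo "exists: $key.pub"
        return 0
    fi
    # Create the key directory if it is missing (mode 700 for a new directory).
    mkdir -p -m 700 "$(dirname "$key")" || return 1
    # Assumption: empty passphrase, so the key can be used without a prompt.
    ssh-keygen -q -t rsa -N "" -f "$key" && echo "created: $key.pub"
}

# Usage (run as the user who will run the script):
#   ensure_rsa_key              # uses ~/.ssh/id_rsa
#   ensure_rsa_key /path/to/key # uses an explicit key path
```

An existing key is left untouched, so the function is safe to run more than once.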

  

  

Make sure the script has the execute bit set, and run it twice, each time passing the name of one of the ZFSSA head nodes for your SuperCluster.  If you do not know the ZFSSA node name, use the IP address that appears in the locations of the solaris and exa-family publishers when you issue the command “pkg publisher”: connect to that address and note the hostname.  If the hostname is *-h1-*, then the other ZFSSA head node will be *-h2-*, and vice versa; if your SuperCluster does not adhere to that standard, look in /etc/hosts to find both head nodes.
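
As a rough sketch of the publisher lookup described above, the following function pulls the host part out of any file:///net/<host>/... location in "pkg publisher" output. The name repo_hosts is hypothetical, and the sketch assumes the publisher locations use the /net/<host>/ form shown in this document.

```shell
#!/bin/sh
# Sketch: print the unique host parts of file:///net/<host>/... publisher
# locations. repo_hosts is a hypothetical helper; it reads `pkg publisher`
# output on standard input.
repo_hosts() {
    sed -n 's|.*file:///net/\([^/]*\)/.*|\1|p' | sort -u
}

# Usage on a SuperCluster domain:
#   pkg publisher | repo_hosts
```

If more than one address is printed, the publishers point at different hosts and the ZFSSA address has to be confirmed another way, for example from /etc/hosts.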

  

  

root@cinqodemayo:~# pkg publisher
PUBLISHER                   TYPE     STATUS P LOCATION
solaris                     origin   online F file:///net/192.168.28.1/export/IPS-repos/solaris11/repo/
exa-family                  origin   online F file:///net/192.168.28.1/export/IPS-repos/exafamily/repo/
root@cinqodemayo:~# ssh 192.168.28.1
Password:
Last Login: Tue May 5 05:05:05 2017 from 5.5.5.5
cinqo5m7-h1-storadm:> exit

root@cinqodemayo:~# chmod +x zfs_thrsh_alert.sh
root@cinqodemayo:~# zfs_thrsh_alert.sh cinqo5m7-h1-storadm
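
The *-h1-* / *-h2-* naming convention mentioned earlier can be expressed as a small helper that derives the second head node name from the first. This is a minimal sketch: peer_head is a hypothetical name, and it only applies when the hostnames follow that convention.

```shell
#!/bin/sh
# Sketch: derive the other ZFSSA head node name from the -h1-/-h2- convention.
# peer_head is a hypothetical helper, not part of zfs_thrsh_alert.sh.
peer_head() {
    case "$1" in
        *-h1-*) printf '%s\n' "$1" | sed 's/-h1-/-h2-/' ;;
        *-h2-*) printf '%s\n' "$1" | sed 's/-h2-/-h1-/' ;;
        *)
            # Convention not followed: fall back to a manual lookup.
            echo "no -h1-/-h2- in '$1'; look in /etc/hosts instead" >&2
            return 1
            ;;
    esac
}

# Usage:
#   peer_head cinqo5m7-h1-storadm
```

The function returns a non-zero status when the name does not follow the convention, so a caller can fall back to /etc/hosts.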

  


In our case, passwordless ssh to the ZFSSA has not yet been set up, so we let the script set it up:

  

Share passwordless ssh (y/[n])?
y
Password:
                          type = RSA (uncommitted)
                           key = AAAAB3NzaC1yc2EAAAABIwAAAQEAzqaYW2lxZpc5RfgsdPLNzV4dP6QZHcOp7UajP4Ny1yL3VMrJaCVcc/a5abNdOCz7b9AGwxl/P59YdE2wZO6aRmByLy+bkRQGFywGuQ+zrqLRsnIEgxlXaZAHyYwVPZUdkMlHWBxJo6y/BMySDfzwRCj30LN2r514dSUBItYozFwWp1axd61bL9Jt2RnEi9Ajv082rfb3CveOg0TU2k8Ks6wz0uQuyJCSyejflJj/lcXLD69jBt97+b4+SkpCeVfjP/6bfwWRoY3o28KTHe7K2t1A7XKJkFctPeibqYy3zx899bnIwB/e1ZHUxpEvj2rT0RO/WJJw75b4qFikLBsDhQ== (uncommitted)
                       comment = IO_util_worksheet_alert (uncommitted)

Current worksheets:
zfssa_threshold_collection
Current thresholds:
Enter email address to send alerts to, blank to abort the process :
root@cinqodemayo.org
Press return to create new worksheet and utilization threshold alert emailed to: root@cinqodemayo.org

  

 

If you see a current worksheet with a name like zfssa_threshold_collection and a threshold with a name like io.utilization, this has probably already been set up manually and you do not need to continue (press Ctrl-C).

If you don't see such items, proceed by pressing return/enter key and you will see something like:

  


                      category = thresholds
                   thresholdid = bd86c855-03f6-e162-802c-937e3d02d680 (uncommitted)
                       handler = resume_worksheet
                     worksheet = b33a5e5f-f000-484d-94b8-f4cac88afbb3 (uncommitted)
                       handler = email
                       address = root@cinqodemayo.org (uncommitted)
                       subject = ZFS Storage appliance IOPS threshold exceeded. (uncommitted)
Current worksheets:
worksheet-000   root    zfssa_threshold_collection
Current thresholds:
threshold-000         75     normal io.utilization

 

 

This is indicative of the worksheet, threshold, and alert being set as desired. 
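
The check for an existing worksheet and threshold can also be sketched in shell, for example against a saved copy of the script output. The function name already_configured is hypothetical, and the sketch assumes the "Current worksheets:" and "Current thresholds:" listing format shown in this document.

```shell
#!/bin/sh
# Sketch: succeed only if the listing on standard input shows both a
# zfssa_threshold_collection worksheet and an io.utilization threshold.
# already_configured is a hypothetical helper, not part of zfs_thrsh_alert.sh.
already_configured() {
    awk '
        /^Current worksheets:/ { section = "w"; next }
        /^Current thresholds:/ { section = "t"; next }
        section == "w" && /zfssa_threshold_collection/ { w = 1 }
        section == "t" && /io\.utilization/            { t = 1 }
        END { exit !(w && t) }   # exit status 0 only when both were seen
    '
}

# Usage with output saved to a file (file name is an example):
#   already_configured < zfs_thrsh_alert.out && echo "worksheet and threshold present"
```

Both items must be listed for the check to succeed; a worksheet without a threshold still counts as not configured.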

 

Run the script again for the second ZFSSA node, where it will only create the analytics. You can then exit with ^C, or by entering a blank email address as in the example below, because you will see that the threshold already exists:

  

root@cinqodemayo:~# zfs_thrsh_alert.sh cinqo5m7-h2-storadm
Share passwordless ssh (y/[n])?
y
Password:
                          type = RSA (uncommitted)
                           key =
……
== (uncommitted)
                       comment = IO_util_worksheet_alert (uncommitted)

Current worksheets:
Current thresholds:
Enter email address to send alerts to, blank to abort the process :

No Email address entered so not creationg worksheet and alert.
root@cinqodemayo:~#

  

References

<NOTE:1957864.2> - x

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.