Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-75-1402579.1
Update Date:2018-01-05
Keywords:

Solution Type  Troubleshooting Sure

Solution  1402579.1 :   Sun Storage 7000 Unified Storage System: How to Troubleshoot Problems with the NFS Service  


Related Items
  • Sun ZFS Storage 7420
  • Oracle ZFS Storage ZS3-2
  • Sun Storage 7110 Unified Storage System
  • Sun Storage 7210 Unified Storage System
  • Sun Storage 7410 Unified Storage System
  • Oracle ZFS Storage ZS3-4
  • Sun ZFS Storage 7120
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7320
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS
  • _Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Purpose
Troubleshooting Steps
 Framing the problem
 Gather useful data
 Common NFS problems
 Network Ports common to NFS
 Performance Problems
 Client Mount Options
 No Lock Available
 Unable to clear locks for non-Solaris clients
 NFSv4 Domain must match on appliance and clients
 Shares accessed via NFSv4 hang whilst the same share accessed via NFSv3 or SMB is responsive
 Dropped Packets on Fast Networks
 RHEL Clients using NFSv3 see errors like "bad nfs status return value: 88"
 Further help required:
References


Applies to:

Oracle ZFS Storage ZS3-4 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
Sun ZFS Storage 7420 - Version All Versions and later
Sun Storage 7110 Unified Storage System - Version All Versions and later
Sun ZFS Storage 7120 - Version All Versions and later
7000 Appliance OS (Fishworks)

Purpose

This document is intended to provide a troubleshooting path for problems that involve loss of access to shares that are accessed by clients using the NFS protocol.

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - Disk Storage ZFS Storage Appliance Community

Troubleshooting Steps

Problems with client access to data shares via the NFS protocol can have many causes. Many problems that at first glance appear to be network connectivity problems turn out to be due to other issues.
Links to other troubleshooting resolution path documents are provided below where applicable.

Framing the problem

First it must be decided whether the problem is likely to be due to issues with the NFS protocol or elsewhere.
Some basic questions must be asked.

Is the problem with loss of connectivity to shares?
If the problem is due to an inability to read or write to a particular file or directory, but the share itself is mounted and other subdirectories or files are accessible, then this is likely to be a permissions issue that will be better covered in:

  Document 1439378.1 "Sun Storage 7000 Unified Storage System: How to Troubleshoot UNIX/NFS File and Directory Permission Issues".

If other clients access the same share via another protocol such as SMB, this may be the cause of the permissions problems. See Document 1428753.1 "Sun Storage 7000 Unified Storage System: How to Troubleshoot Identity Mapping and Cross-Platform File Sharing Issues".

Does the problem involve loss of access for clients that access the shares via other protocols than NFS?
If so, check the general network troubleshooting resolution path: Document 1392086.1 "Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems".

Gather useful data

Useful information to gather will include:

  • The NFS version(s) that the client(s) use.
  • The Directory Service(s) used.
  • Whether the appliance can ping the client IP address, and vice versa.
  • Whether the appliance can resolve the client hostname, and vice versa.
  • Any errors generated on the client. The files to check depend on the client operating system, for example:
    • Linux - /var/log/messages
    • Solaris - /var/adm/messages
    • Windows - The "system log" viewable through the "Event Viewer".
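As one way to review the client-side logs listed above, the following is a minimal Python sketch that filters a log file for NFS-related lines. The keyword list and the function names are illustrative assumptions, not an exhaustive or official set of NFS client errors; adjust the path for the client operating system.

```python
import re

# Illustrative keywords only (an assumption, not an official list of NFS errors).
NFS_PATTERN = re.compile(r"nfs|rpc|lockd|statd", re.IGNORECASE)


def find_nfs_messages(lines):
    """Return the log lines that mention NFS-related components."""
    return [line.rstrip("\n") for line in lines if NFS_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Read a client log file and return its NFS-related lines.

    The default path is the Linux location; on Solaris use /var/adm/messages.
    """
    with open(path, errors="replace") as handle:
        return find_nfs_messages(handle)
```

Calling scan_log() on a Linux client, or scan_log("/var/adm/messages") on Solaris, returns the matching lines, which can then be supplied together with the other data gathered.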

At the very least a support bundle will be useful to examine the network configuration of the appliance. See Document 1019887.1 "Sun Storage 7000 Unified Storage System: How to collect supportfile bundle using the BUI or CLI".

For complex issues, a network trace will most likely be needed to troubleshoot the problem effectively. The trace must be gathered from both the client side and the appliance side. For more details see Document 1398376.1 "Sun Storage 7000 Unified Storage System: How to get a network trace to assist in troubleshooting network problems".

If the problem is more of a performance issue (for example, I/O to the share takes longer and longer until the timeout is reached and the share becomes unavailable), collect the recommended analytics datasets described in Document 1230143.1 "Sun Storage 7000 Unified Storage System: Collecting Analytics data for NFS performance issues".

Common NFS problems

Network Ports common to NFS

Sometimes a firewall prevents NFS clients from successfully connecting to NFS servers. Several daemons are involved in opening a connection, so the ports for the following daemons must be open:

  • rpcbind on port 111
  • nfsd on port 2049
  • lockd on port 4045

Other daemons, such as mountd or statd, may use ports higher than 32000. For more details on them, see Document 1175804.1 "NFS and NIS Interaction with Network Firewalls".
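A quick way to see whether a firewall is blocking the fixed ports listed above is to attempt a TCP connection to each one from the client. The following Python sketch does this; the function names and the timeout value are illustrative assumptions. It only tests TCP, so a successful result does not rule out filtering of UDP traffic or of the dynamically assigned mountd and statd ports.

```python
import socket

# Fixed TCP ports named in the list above.
NFS_PORTS = {"rpcbind": 111, "nfsd": 2049, "lockd": 4045}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_nfs_ports(host, ports=NFS_PORTS, timeout=3.0):
    """Map each service name to whether its TCP port accepted a connection."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

For example, check_nfs_ports("192.168.0.20") (a hypothetical appliance address) returns a dictionary keyed by daemon name; any False entry is a candidate for a firewall rule review.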

Performance Problems

A loss of connection to NFS shares often results from performance issues caused by a poor choice of configuration, or by an appliance-side or client-side tuning issue.
These scenarios are covered in detail in the following documents. For recommendations on how best to initially configure the pool and hardware to maximize NFS performance, see Document 1213725.1 "Sun Storage 7000 Unified Storage System: Configuration and tuning for NFS performance".

For known NFS performance limitations, and for what may be tuned on the appliance side (such as the number of server threads) or on the client side for different types of client or workload, see Document 1213739.1 "Sun Storage 7000 Unified Storage System: Known NFS performance issues".

Client Mount Options

The mount options that should be specified on a client mounting an NFS share depend on many factors, such as the client OS, the type of workload and the applications accessing the data. For recommended mount options for shares used by Oracle Clusterware and RAC in Unix, Linux and Windows environments, see Document 359515.1 "Mount Options for Oracle files when used with NFS on NAS devices".

No Lock Available

If an error similar to "No locks available" is seen when trying to mount an NFS share on a client, then the default number of NFS locks available on the appliance may need to be increased.
This is controlled by the parameter LOCKD_SERVERS. For details on how to increase it, see Document 1375378.1 "Sun Storage 7000 Unified Storage System: Possible cause of NFS Lock Issue".

Unable to clear locks for non-Solaris clients

Prior to appliance software version 2011.04.24.5.0 there was no supported way to remove stale or otherwise malfunctioning NFS locks for a non-Solaris client.

See Bug 15771518 "SUNBT6858883-AK-2011.04.24 Need way to clear NFS locks on behalf of non-Solaris" for details.

 

As of 2011.04.24.5.0 this functionality is provided by means of a workflow. When run, the workflow prompts for the hostname and IP address of the client whose locks are to be cleared.
Take care to enter the correct hostname and IP address, so that locks are not cleared for clients that are working correctly.


To run the workflow from the BUI:

Maintenance -> WORKFLOWS -> Clear Locks

Enter client hostname and IP address.

 

To run the workflow from the CLI:

lbl-262:> maintenance workflows show

Properties:
                    showhidden = false

Workflows:

WORKFLOW     NAME                                               OWNER      SETID ORIGIN
workflow-000 Clear locks                                        root       false Oracle Corporation
workflow-001 Configure for Oracle Solaris Cluster NFS           root       false Oracle Corporation
workflow-002 Unconfigure Oracle Solaris Cluster NFS             root       false Oracle Corporation
workflow-003 Configure for Oracle Enterprise Manager Monitoring root       false Sun Microsystems, Inc.
workflow-004 Unconfigure Oracle Enterprise Manager Monitoring   root       false Sun Microsystems, Inc.


lbl-262:> maintenance workflows select workflow-000
lbl-262:maintenance workflow-000>
lbl-262:maintenance workflow-000> execute
lbl-262:maintenance workflow-000 execute (uncommitted)> show
Properties:
                      hostname = (unset)
                       ipaddrs = (unset)

lbl-262:maintenance workflow-000 execute (uncommitted)> set hostname=testhost
                      hostname = testhost
lbl-262:maintenance workflow-000 execute (uncommitted)> set ipaddrs=192.168.0.10
                       ipaddrs = 192.168.0.10
lbl-262:maintenance workflow-000 execute (uncommitted)> commit
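
Because the workflow acts on whatever hostname and IP address are entered, it can be worth confirming beforehand that the two refer to the same client. The following Python sketch is one possible pre-check, run from any host that uses the same name service; the function names are illustrative and the injectable resolver parameter is an assumption added for testing. It is not part of the appliance workflow.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the set of IPv4 addresses that hostname resolves to."""
    results = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return {sockaddr[0] for _family, _type, _proto, _canon, sockaddr in results}


def hostname_matches_ip(hostname, ipaddr, resolver=socket.getaddrinfo):
    """Return True if ipaddr is one of the addresses hostname resolves to."""
    try:
        return ipaddr in resolve_ipv4(hostname, resolver)
    except socket.gaierror:
        # The hostname could not be resolved at all.
        return False
```

With the values from the session above, hostname_matches_ip("testhost", "192.168.0.10") should return True before the workflow is committed; a False result suggests a typing error or a stale name service entry.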

NFSv4 Domain must match on appliance and clients

If NFSv4 is used then special attention must be paid to the concept of the NFS domain. This must match on both appliance and clients for the clients to be able to mount appliance shares over NFSv4.

See Document 1409693.1 "Sun Storage 7000 Unified Storage System: NFSv4 clients cannot mount shares if NFSv4 identity domains do not match".
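
On Linux clients the NFSv4 domain is commonly set by the Domain entry in the [General] section of /etc/idmapd.conf. That file location, the function names and the exact string comparison are assumptions of this sketch and are not taken from the document above; other client operating systems configure the domain differently. The sketch reads the client setting so it can be compared with the domain configured on the appliance.

```python
import configparser


def read_idmapd_domain(text):
    """Return the Domain value from idmapd.conf content, or None if unset."""
    parser = configparser.ConfigParser(strict=False)
    parser.read_string(text)
    # Option names are matched case-insensitively by configparser.
    return parser.get("General", "Domain", fallback=None)


def domains_match(client_domain, appliance_domain):
    """Return True only if both domains are set and identical after trimming."""
    if not client_domain or not appliance_domain:
        return False
    return client_domain.strip() == appliance_domain.strip()
```

A None result from read_idmapd_domain means the entry is absent or commented out, in which case the client falls back to a default of its own, and that default should be checked against the appliance setting as well.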

Shares accessed via NFSv4 hang whilst the same share accessed via NFSv3 or SMB is responsive

In an environment where clients mount filesystems from the ZFS Storage Appliance using the NFSv4 protocol, requests for file access may hang, and commands such as ls and df may also become unresponsive.
The same filesystems mounted via NFSv3 or SMB/CIFS remain responsive. This happens when the limit on the number of entries in the OpenOwner table has been reached.
See Document 1385644.1 "Sun Storage 7000 Unified Storage System: OpenOwner Entries Count Increases on ZFS Storage Appliance".

Dropped Packets on Fast Networks

If NFS performance suffers from dropped packets and TCP transmission problems when using a fast network technology such as 10 Gigabit Ethernet or IPoIB, the default TCP buffer size on the appliance may be too small to cope with these fast technologies.
More details on this problem and how to increase the TCP buffer size can be found here:

   Document 1401076.1 "Sun Storage 7000 Unified Storage System: How to increase the TCP buffer size to improve NFS performance in fast 10Gb Ethernet or IPoIB networks".
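
As background on why a small TCP buffer limits fast links, the amount of data that must be in flight to keep a link full is the bandwidth-delay product (link speed multiplied by round-trip time). The following Python sketch computes it; the round-trip time used in the example is an assumed value for illustration only, and the actual buffer settings to apply are those given in Document 1401076.1.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_ms):
    """Return the bytes in flight needed to fill a link at the given RTT."""
    # bits/s * seconds gives bits; divide by 8 to convert to bytes.
    return bandwidth_bits_per_s * (rtt_ms / 1000) / 8


def buffer_is_sufficient(buffer_bytes, bandwidth_bits_per_s, rtt_ms):
    """Return True if the buffer can hold a full bandwidth-delay product."""
    return buffer_bytes >= bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_ms)
```

For a 10 Gigabit Ethernet link with an assumed round-trip time of 1 ms, the product is about 1.25 MB, so a buffer much smaller than that cannot keep the link busy on a single connection.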

RHEL Clients using NFSv3 see errors like "bad nfs status return value: 88"

This is related to NFSv4 using UTF-8 encoding, whereas the clients mounting the share may not.
See Document 1405993.1 "Sun Storage 7000 Unified Storage System: Nfs_stat_to_errno: Bad Nfs Status Return Value: 88".

Further help required:

If the problem still exists at this point and further troubleshooting is required, please contact Oracle Support Services to raise a new Service Request (SR) via the My Oracle Support portal (https://support.oracle.com) or by telephoning your Oracle Global Support telephone number.
Please provide all data collected in the steps above to assist in the troubleshooting process.

Back to Document 1416406.1 Sun ZFS Storage Appliances Troubleshooting Resource Center

References

<NOTE:1392086.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems
<NOTE:1398376.1> - Sun Storage 7000 Unified Storage System: How to get a Network Trace to assist in Troubleshooting Network Problems
<NOTE:1401076.1> - Sun Storage 7000 Unified Storage System: How to increase the TCP buffer size to improve NFS performance in fast 10Gb Ethernet or IPoIB networks
<NOTE:145194.1> - ORA-1157 ORA-1110 ORA-27086 Starting up Database
<NOTE:1019887.1> - Sun Storage 7000 Unified Storage System: How to Collect a Support Bundle using the BUI or CLI
<NOTE:1175804.1> - NFS and NIS Interaction with Network Firewalls
<NOTE:1230143.1> - Sun Storage 7000 Unified Storage System: Collecting analytics data for NFS performance issues
<NOTE:1385644.1> - Sun Storage 7000 Unified Storage System: OpenOwner Entries count increases on ZFS Storage Appliance.
<NOTE:1409693.1> - Sun Storage 7000 Unified Storage System: NFSv4 clients cannot mount shares if NFSv4 identity domains do not match
<NOTE:1416406.1> - Sun ZFS Storage Appliances Troubleshooting Resource Center
<NOTE:1213725.1> - Sun Storage 7000 Unified Storage System: Configuration and tuning for NFS performance
<NOTE:1375378.1> - Sun Storage 7000 Unified Storage System: Possible cause of NFS Lock Issue
<NOTE:1439378.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot UNIX/NFS File and Directory Permission Issues
<NOTE:1405993.1> - Sun Storage 7000 Unified Storage System: Nfs_stat_to_errno: Bad Nfs Status Return Value: 88
<NOTE:1428753.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Identity Mapping and Cross-Platform File Sharing Issues
<BUG:15771518> - SUNBT6858883-AK-2011.04.24 NEED WAY TO CLEAR NFS LOCKS ON BEHALF OF NON-SOLARIS
<NOTE:359515.1> - Mount Options for Oracle files when used with NFS on NAS devices
<NOTE:1213739.1> - Sun Storage 7000 Unified Storage System: Known NFS performance issues

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.