Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-75-1395461.1
Update Date:2018-05-11
Keywords:

Solution Type  Troubleshooting Sure

Solution  1395461.1 :   Sun Storage 7000 Unified Storage System: Best Practice Recommendations for Network Configuration  


Related Items
  • Sun ZFS Storage 7420
  • Oracle ZFS Storage ZS5-2
  • Oracle ZFS Storage ZS3-2
  • Sun Storage 7110 Unified Storage System
  • Sun Storage 7210 Unified Storage System
  • Oracle ZFS Storage ZS4-4
  • Sun Storage 7410 Unified Storage System
  • Oracle ZFS Storage ZS5-4
  • Oracle ZFS Storage ZS3-4
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun ZFS Storage 7320
  • Oracle ZFS Storage ZS3-BA
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS
  • _Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Purpose
Troubleshooting Steps
 Datalink layer
 Interface Layer
 Routing
 Services
 Clustering Considerations
References


Applies to:

Sun Storage 7210 Unified Storage System - Version All Versions and later
Oracle ZFS Storage ZS3-4 - Version All Versions and later
Oracle ZFS Storage ZS4-4 - Version All Versions and later
Oracle ZFS Storage ZS3-BA - Version All Versions and later
Sun Storage 7410 Unified Storage System - Version All Versions and later
7000 Appliance OS (Fishworks)

Purpose

This document explains the best practice recommendations for setting up network-related configuration on the Sun Storage 7000 Unified Storage System.

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - Disk Storage ZFS Storage Appliance Community

Troubleshooting Steps

The appliance uses a four-layer model for network configuration.

  • Devices - these are the physical instances of a network interface. For example, the onboard quad gigabit Ethernet card in a 7110 provides four devices out of the box (nge0, nge1, nge2 and nge3), and the ZS4-4 has four built-in ixgbe 10 Gigabit NICs.
  • Datalinks - these are the means by which packets are sent and received. A datalink can be associated with one or more devices. Link aggregation (LACP), VNICs and VLANs are configured at the datalink layer.
  • Interfaces - these are the means by which addressing is configured. An interface is either associated with a single datalink, or is associated with a group of other interfaces in an IPMP group.
  • Routing - governs how IP packets are directed. Routes can be added automatically by the system or manually by an administrator. Automatically added routes show as "System", "DHCP" or "Dynamic" depending on how they were added. A manually added route shows as "Static", or as "Inactive" if it is associated with an offline or inactive interface.

Datalink layer

Link Aggregation (LACP) is used primarily as a means of increasing performance. It works by associating two or more devices with a single datalink to increase the throughput available to that datalink. For Link Aggregation to work correctly, the devices to be used in the aggregation must therefore be cabled to the same switch before the datalink is configured.
To configure a link aggregation via the BUI, navigate to Configuration > Network and click on the "+" icon next to the Datalinks section. Next, name the new datalink, check the LACP checkbox and select the devices to use in the datalink from the list of available devices.
Please note that some switches do not use the LACP protocol. See the Configuration:Network:#Datalinks section of the appropriate Administration manual for your system, or the same section available through the HELP link in the BUI, for recommendations on the properties to use in different scenarios. Some configuration on the switch may also be necessary; see your switch manufacturer's documentation for details.

An example of configuring a switch to support Link Aggregation can be seen here:
Document 1400154.1 (Sun Storage 7000 Unified Storage System: An example of how to configure Link Aggregation on a switch).

Attempting to configure an LACP datalink to a switch that is not configured or able to support it can cause the Management Interface on the appliance to become unusable.
See Document 1396100.1 (Sun Storage 7000 Unified Storage System: Causes and Solutions for Well Known General Networking Problems) for details on this.

VLANs can be used to increase network security and isolation, and also to increase the number of available datalinks if only a small number of network devices are available. Again, you must make sure that your network switch supports VLANs.
Using VLANs often means configuring many datalinks, with many interfaces associated with those datalinks. Please note that if you wish to use SNMP to monitor your system, there is a bug in older code which limits monitoring to 20 interfaces.

As of 2013.1 code, VNICs can be used similarly to VLANs to create multiple interfaces on the same physical datalink/device.

Interface Layer

IPMP is used primarily as a way of increasing redundancy, so that network connectivity is unaffected by the failure of a single component, be it a physical network port, a cable or a switch. To provide this redundancy, an IPMP group is created in which the IPMP interface sits above two or more interfaces that are associated with datalinks. For maximum redundancy, the datalinks under those lower-level interfaces must use devices that are physically connected to different switches, so that if one switch fails the other datalinks are still active.
To configure an IPMP interface via the BUI, navigate to Configuration > Network and click on the "+" icon next to Interfaces. Now check the "IP MultiPathing Group" checkbox and select the interfaces to include in the IPMP group from the list of available interfaces.
For more details on configuring interfaces and IPMP, see the Configuration:Network#Interfaces section of the appropriate Administration manual for your system, or check the same section available through the online HELP in the BUI.
The appliance uses two methods to determine whether an interface has failed.

  • Probe-based failure detection - ICMP probes are issued in turn from each of the datalink-associated interfaces (the test interfaces) in the IPMP group, to either a default gateway or the first 5 systems on the same subnet that respond to a multicast ICMP probe. If 5 consecutive pings are unanswered, the interface is considered failed. If 10 further consecutive pings are then answered, the interface is considered repaired. Please note that a failed state does not necessarily mean that the device itself has physically failed; indeed this is probably the least likely cause of the problem. See Document 1396100.1 (Sun Storage 7000 Unified Storage System: Causes and Solutions for Well Known General Networking Problems) for further details.
  • Link-based failure detection - uses properties of the network device driver to check on whether the link to the network is active.
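
The probe-based thresholds described above can be illustrated with a small Python sketch. This is only a model of the counting rule (5 consecutive unanswered probes mark an interface failed, 10 further consecutive answered probes mark it repaired); the class and field names are hypothetical and it does not reflect how the appliance implements detection internally.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 5   # consecutive unanswered probes before "failed" (from the text)
REPAIR_THRESHOLD = 10   # consecutive answered probes before "repaired" (from the text)


@dataclass
class ProbeTracker:
    """Hypothetical model of the probe history of one test interface."""
    failed: bool = False
    missed: int = 0      # current run of unanswered probes
    answered: int = 0    # current run of answered probes

    def record(self, reply_received: bool) -> bool:
        """Record one probe result; return True while the interface counts as failed."""
        if reply_received:
            self.missed = 0
            self.answered += 1
            if self.failed and self.answered >= REPAIR_THRESHOLD:
                self.failed = False
        else:
            self.answered = 0
            self.missed += 1
            if not self.failed and self.missed >= FAILURE_THRESHOLD:
                self.failed = True
        return self.failed
```

Because a single answered probe resets the run of misses, an intermittently lossy network path can keep an interface flapping between states, which is one reason the recommendation below favours link-based detection.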

The best practice recommendation is to use link-based failure detection on the appliance. This removes the dependence on networking components external to the appliance to provide a stable network interface. To enable link-based failure detection, make sure that the test interfaces in an IPMP group do not have traditional IP addresses configured. Instead, they should be configured with an address and netmask of 0.0.0.0/8. Only the IPMP interface itself should be configured with a valid IP address and netmask for the appropriate subnet.

It is possible to have both link aggregation for performance reasons and IPMP for redundancy reasons. The best practice is to create two or more aggregated datalinks on the appliance, such that each aggregated datalink contains two or more devices connected from the appliance to the same network switch. An interface is then created for each aggregated datalink with the 0.0.0.0/8 address so that it uses link-based failure detection. Finally, an IPMP interface is created and configured with an appropriate IP address on the correct subnet, and this IPMP interface is linked to all the 0.0.0.0/8 test interfaces on the aggregated datalinks. Each of those test interfaces can be set to ACTIVE or STANDBY as required.
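
As a rough illustration of the combined layout, the following Python sketch checks a planned configuration against the rules stated above. It is a paper exercise only, not an appliance API: the Aggregation class, the check_layout function and the switch names are hypothetical, and the check that the aggregations span more than one switch is taken from the IPMP redundancy advice earlier in this section.

```python
from dataclasses import dataclass
from typing import Dict, List

TEST_ADDRESS = "0.0.0.0/8"  # address recommended above for link-based detection


@dataclass
class Aggregation:
    """Hypothetical description of one aggregated datalink and its test interface."""
    name: str
    device_switch: Dict[str, str]  # device name -> switch it is cabled to
    interface_address: str         # address of the interface on this datalink


def check_layout(aggregations: List[Aggregation]) -> List[str]:
    """Return a list of problems; an empty list means the plan follows the rules."""
    problems = []
    if len(aggregations) < 2:
        problems.append("IPMP group needs two or more aggregated datalinks")
    for agg in aggregations:
        if len(agg.device_switch) < 2:
            problems.append(f"{agg.name}: needs two or more devices")
        if len(set(agg.device_switch.values())) > 1:
            problems.append(f"{agg.name}: devices must be cabled to the same switch")
        if agg.interface_address != TEST_ADDRESS:
            problems.append(f"{agg.name}: test interface should use {TEST_ADDRESS}")
    # Redundancy: the aggregations together should not depend on a single switch.
    all_switches = {s for agg in aggregations for s in agg.device_switch.values()}
    if len(aggregations) >= 2 and len(all_switches) < 2:
        problems.append("all aggregated datalinks use the same switch")
    return problems


plan = [
    Aggregation("aggr1", {"ixgbe0": "switchA", "ixgbe1": "switchA"}, TEST_ADDRESS),
    Aggregation("aggr2", {"ixgbe2": "switchB", "ixgbe3": "switchB"}, TEST_ADDRESS),
]
print(check_layout(plan))
```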

Routing

If routing is administered manually and RIP and RIPng routing protocols are not allowed to automatically configure dynamic routes, then follow these best practices:

  • Have a single default route configured to use the main admin network interface on the appliance
  • Disable the dynrouting service in configuration services
  • Configure individual static routes for each subnet that the data share clients use

This will ensure that requests made by clients on the data networks are not routed back through the admin interface. The interfaces and datalinks that connect to the client data networks should use the higher throughput devices if any are installed, e.g. the 10 Gbps Ethernet devices rather than the 1 Gbps onboard devices. The onboard devices can be used for the admin interface, as this does not require high throughput.
Best practice in a cluster is to have the default route via the locked management network(s), and to use higher precedence "subnet routes" via the data interface(s). If you have management and data network interfaces on the same subnet, you should set multihoming to adaptive or strict.
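
The effect of pairing a single default route with per-subnet static routes can be sketched in Python using standard longest-prefix matching. This is a simplified model that ignores the multihoming policy; the function name, the interface names (admin0, data0) and the documentation-range addresses are assumptions chosen for illustration only.

```python
import ipaddress


def select_route(destination, routes):
    """Return the interface of the most specific route matching destination.

    routes is a list of (network, interface) pairs. The route with the
    longest prefix wins, so a subnet route takes precedence over the default.
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for network, interface in routes:
        net = ipaddress.ip_network(network)
        if dest in net:
            matches.append((net.prefixlen, interface))
    if not matches:
        return None
    return max(matches)[1]


# Hypothetical routing table: default via the admin interface,
# one static route for a client data subnet.
routes = [
    ("0.0.0.0/0", "admin0"),
    ("192.0.2.0/24", "data0"),
]

print(select_route("192.0.2.25", routes))   # client on the data subnet
print(select_route("203.0.113.9", routes))  # any other destination
```

In this model, replies to clients on the data subnet leave through the data interface, and only traffic with no matching subnet route falls back to the default route on the admin interface.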


Please see the Configuration:Network#Routing section of the appropriate Administration manual for your system, or the same section of the online guide available via the BUI "HELP" icon, for further details on routing and the "multihoming" policy.

Services

PLEASE NOTE: For all appliances running 2013.1.x releases, 'DNS-less' operation is NOT supported and could cause undesirable results.

       The Appliance DNS service must be configured with a working DNS server which contains the appropriate A and PTR records for all names and IP addresses used by the appliance.

DNS - The appliance works best when the DNS service is correctly configured and able to resolve all hostnames and client IP addresses successfully. Although it is possible to specify the loopback IP address 127.0.0.1 for the DNS servers during initial configuration, this is not recommended in a production environment and is only suitable for testing purposes. In this situation the appliance will not be able to resolve its own hostname or those of other servers, and critical services may not work. This is especially true if Active Directory is used as a directory service. In this case, at least one of the DNS servers must be able to resolve hostname and server records in the Active Directory portion of the domain namespace. The DNS server(s) should contain both forward and reverse lookup entries for the appliance.
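
One way to confirm from a client or admin workstation that both forward and reverse entries exist is a short Python check such as the sketch below. It uses the standard library calls socket.gethostbyname and socket.gethostbyaddr; the function name check_dns and the example hostname and address are hypothetical, and the result reflects the resolver configuration of the machine running the check, not that of the appliance.

```python
import socket


def check_dns(hostname, expected_ip,
              forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of problems with the A and PTR records for one host."""
    problems = []
    try:
        resolved = forward(hostname)
        if resolved != expected_ip:
            problems.append(f"{hostname} resolves to {resolved}, expected {expected_ip}")
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        problems.append(f"forward lookup failed for {hostname}: {exc}")
    try:
        name = reverse(expected_ip)[0]
        # Compare case-insensitively and ignore a trailing dot; a PTR record
        # that returns a short name instead of the FQDN is reported as a mismatch.
        if name.rstrip(".").lower() != hostname.rstrip(".").lower():
            problems.append(f"{expected_ip} maps back to {name}, expected {hostname}")
    except OSError as exc:
        problems.append(f"reverse lookup failed for {expected_ip}: {exc}")
    return problems


if __name__ == "__main__":
    # Hypothetical appliance name and address; substitute real values.
    for problem in check_dns("zfssa.example.com", "192.0.2.10"):
        print(problem)
```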

Please ensure that at least one of your configured DNS servers is a physical machine that does NOT reside on the ZFSSA (as a VM).

There have been numerous DNS-related issues where the customer hosted all of their DNS servers on VMs residing on the ZFSSA and, after a power outage, could not bring things up cleanly because they relied on a DNS server that was not available.

NTP - It is recommended that NTP be used to synchronize the time on the appliance and on any other servers that may be required to provide client access to shares. For example, if Active Directory is used to authenticate users of an SMB share, then the time on the Active Directory server and the appliance must agree. NTP is the best way to achieve this. For NTP to synchronize the times, there must be less than 5 minutes difference between the time on the appliance and the time provided by the NTP server when NTP is configured.
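
The 5 minute limit can be expressed as a simple pre-check before NTP is enabled. The Python sketch below only compares two timestamps that the administrator has already obtained (for example, by reading the appliance clock and a reference clock); the function and constant names are hypothetical and it does not query an NTP server.

```python
from datetime import datetime, timedelta

MAX_INITIAL_DIFFERENCE = timedelta(minutes=5)  # limit stated in the text above


def can_ntp_synchronize(appliance_time: datetime, ntp_time: datetime) -> bool:
    """Return True if the two clocks differ by less than 5 minutes."""
    return abs(appliance_time - ntp_time) < MAX_INITIAL_DIFFERENCE


reference = datetime(2018, 5, 11, 12, 0, 0)
print(can_ntp_synchronize(reference + timedelta(minutes=2), reference))
print(can_ntp_synchronize(reference - timedelta(minutes=7), reference))
```

If the check fails, the appliance clock would need to be set manually to within the limit before NTP can take over.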

Dynrouting - It is recommended that dynrouting be disabled unless the production network is specifically using RIP to dynamically advertise routes.

Clustering Considerations

When configuring a cluster, it is recommended that each cluster head have its own dedicated admin interface that is private and locked, so that it does not move with the cluster resources during takeover and failback operations. The reason for this is so that each head will always have access to DNS and still be accessible via the BUI when the head is stripped (passive). This enables faster troubleshooting to find the root cause of unexpected takeovers or reboots. It also allows a support bundle to be collected from a cluster head in a stripped state. Usually two of the built-in interfaces are used for management, e.g. nge0 on headA and nge1 on headB, which makes nge0 unusable on headB and nge1 unusable on headA. If the system is running at least 2013.1 code, management can be done via VNICs that are both on the same physical NIC, such as ixgbe0, thus using only a single physical network port for management.

Back to Document 1392086.1 (Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems).

 

Oracle ZFS Storage Appliance services and associated IP port numbers

If a firewall is present between the clients and the Oracle ZFS Storage Appliance, make sure the ports for services used by the client(s) are unblocked in the firewall.

 

[Image: IP_port]


Check for relevancy - 11-May-2018

References

http://www.oracle.com/technetwork/server-storage/sun-unified-storage/documentation/networking-bestprac-zfssa-2215767.pdf
<NOTE:1396100.1> - Sun Storage 7000 Unified Storage System: Causes and Solutions for Well Known General Networking Problems
<NOTE:1542826.2> - Information Center: Disk Storage
<NOTE:1400154.1> - Sun Storage 7000 Unified Storage System: An Example of How to Configure Link Aggregation on a Switch
<NOTE:1392086.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.