Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1439858.1
Update Date:2017-06-13
Keywords:

Solution Type  Problem Resolution Sure

Solution  1439858.1 :   Sun Storage 7000 Unified Storage System: All SMB (Windows) Shares Are Inaccessible  


Related Items
  • Sun ZFS Storage 7420
  • Oracle ZFS Storage ZS5-2
  • Oracle ZFS Storage ZS3-2
  • Sun Storage 7110 Unified Storage System
  • Sun Storage 7210 Unified Storage System
  • Oracle ZFS Storage ZS4-4
  • Sun Storage 7410 Unified Storage System
  • Oracle ZFS Storage ZS5-4
  • Sun ZFS Storage 7120
  • Oracle ZFS Storage ZS3-4
  • Sun Storage 7310 Unified Storage System
  • Oracle ZFS Storage Appliance Racked System ZS4-4
  • Sun ZFS Storage 7320

Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS
  • _Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Symptoms
Cause
Solution
References


Applies to:

Oracle ZFS Storage ZS4-4 - Version All Versions and later
Sun Storage 7110 Unified Storage System - Version All Versions and later
Sun Storage 7310 Unified Storage System - Version All Versions and later
Sun ZFS Storage 7320 - Version All Versions and later
Sun ZFS Storage 7420 - Version All Versions and later
7000 Appliance OS (Fishworks)

Symptoms

When accessing appliance shares, either all shares are inaccessible or users are prompted for a user name and password.


Cause

1. Lost connection to an Active Directory domain controller
2. SMB server service fault

Solution

Once joined to an Active Directory domain, the ZFS Storage Appliance (ZFSSA) maintains a connection to the domain controller it used when it joined the domain.  This connection is used for pass-through authentication when a member of the domain accesses a resource.

When this connection is lost, users are prompted for a user name and password, whereas normally any domain credential can fulfill the authentication.

Therefore, the first step is to verify this connection.  To do this, navigate to Configuration / Services / AD in either the BUI or the CLI.  When ZFSSA has an active connection to a domain controller, you will see the host name and IP address of the Active Directory domain controller.  If no active connection to a domain controller is available, the screen will look like the second example below.  (The examples below are from the BUI.)

When active connection is available:

Mode: Domain
Domain: naslab2k8.west.oracle.com
Selected Domain Controller: naslab03.naslab2k8.oracle.sun.com (192.168.2.22)

When active connection is NOT available:

Mode: Domain
Domain: naslab2k8.west.sun.com
Selected Domain Controller: None

-- OR --

Selected Domain Controller: (0.0.0.0)
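
If the status text has been copied from the screen into a file or ticket, the check described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be plain text in the "Selected Domain Controller: ..." form shown above, and only the two "not available" forms shown in this article ("None" and "(0.0.0.0)") are treated as a lost connection.

```python
import re

# Values this article shows when no domain controller is selected.
# Treating exactly these two forms as "disconnected" is an assumption.
_NO_DC_VALUES = {"none", "(0.0.0.0)"}

_DC_LINE = re.compile(
    r"^[ \t]*Selected Domain Controller:[ \t]*(.*?)[ \t]*$",
    flags=re.MULTILINE,
)


def selected_domain_controller(status_text):
    """Return the 'Selected Domain Controller' value, or None if the line is absent."""
    match = _DC_LINE.search(status_text)
    return match.group(1).strip() if match else None


def has_active_dc(status_text):
    """Return True when the status text names an actual domain controller."""
    value = selected_domain_controller(status_text)
    if not value:
        # Line missing or value empty: treat as no active connection.
        return False
    return value.lower() not in _NO_DC_VALUES
```

A result of False from has_active_dc corresponds to the "active connection is NOT available" case and suggests that the rejoin procedure is the next thing to try.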


Software upgrades, temporary AD server failures, DNS server unavailability, or network issues can cause connection problems.  Release 2011.1 improved the way the appliance reconnects to the domain controller(s); however, there are times when manual intervention is still required.
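
Because DNS and network problems are among the listed causes, it can help to rule them out before rejoining the domain.  The following is a minimal Python sketch, assuming it is run from an administrative workstation on the same network (not on the appliance itself), so a passing result does not prove that the appliance sees the same thing.  The function names are hypothetical, and the port list (Kerberos 88, LDAP 389, SMB 445) is the usual set of well-known domain controller ports rather than something this article specifies.

```python
import socket

# Well-known TCP ports for services a domain controller normally offers.
# Which of them matter at a given site is an assumption of this sketch.
DC_PORTS = {"kerberos": 88, "ldap": 389, "smb": 445}


def resolve_host(hostname):
    """Return the sorted IP addresses for hostname, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dc(hostname, ports=DC_PORTS, timeout=3.0):
    """Summarize name resolution and TCP reachability for one domain controller."""
    addresses = resolve_host(hostname)
    reachable = {
        name: bool(addresses) and port_open(hostname, port, timeout)
        for name, port in ports.items()
    }
    return {"addresses": addresses, "ports": reachable}
```

For example, check_dc("naslab03.naslab2k8.oracle.sun.com") would report whether the controller name from the example above resolves and which of the listed ports accept a connection.  An empty address list points toward DNS, while resolved addresses with closed ports point toward the network or the controller itself.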

To resolve this, ZFSSA needs to rejoin the domain.  In the Configuration / Services / AD screen, follow the steps below.

  • Click "Join Workgroup".
  • Enter a workgroup name and click "APPLY".
  • Confirm.
  • Click "Join Domain".
  • Enter the administrator credentials to rejoin the domain.

If these steps require further explanation, or if the appliance cannot rejoin the domain, please refer to <Document:1402353.1>.

If the appliance has an active connection to AD but no users can be authenticated, the problem may lie with the SMB service.


Please contact ZFSSA technical support to file a service request.  The engineer responsible for the service request may provide steps for collecting information for further diagnosis, or may arrange a remote session to collect the necessary information.



Support note:

First, check whether messages such as the following appear in /var/ak/logs/debug.sys:

Jul 23 09:07:28 server-1 smbd[5293]: [ID 216626 daemon.debug] smbrdr[43]: reply mismatch (117)
Jul 23 09:07:28 server-1 smbd[5293]: [ID 216626 daemon.debug] smbrdr[117]: reply mismatch (43)
Jul 23 09:07:28 server-1 smbd[5293]: [ID 898164 daemon.debug] smbrdr_tree_connectx: REPLY_MESSAGE_MISMATCH


If you see these messages, the issue is likely <BUG:15661658> P3 utility/cifs CIFS server cannot recover from DC reply mismatch in libsmbrdr.
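
When a copy of debug.sys is available for offline review, the search for these messages can be sketched in Python as below.  This is an illustration only: the function name is hypothetical, and the pattern is assumed from the three sample lines above (an smbd entry containing "reply mismatch" or "REPLY_MESSAGE_MISMATCH"), so other message variants would not be matched.

```python
import re

# Pattern assumed from the sample lines in this note: an smbd log entry
# that mentions either form of the reply-mismatch message.
MISMATCH_RE = re.compile(
    r"smbd\[\d+\]:.*(?:reply mismatch|REPLY_MESSAGE_MISMATCH)"
)


def find_reply_mismatches(lines):
    """Return the log lines that look like smbrdr reply-mismatch messages."""
    return [line.rstrip("\n") for line in lines if MISMATCH_RE.search(line)]
```

Passing an open file object for the log copy to find_reply_mismatches returns the matching lines; an empty list corresponds to the case where no "reply mismatch" message is seen.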

Binary fixes are available for recent 2010.Q3 versions.  If your customer is running a 2010.Q3 version and does not plan to update, please request the binary fix.  This is because we do not have sufficient evidence that 2011.1 entirely resolved the problem, although we have not seen this issue with 2011.1.  If you do come across it, please engage RPE/backline.

If you do not see the "reply mismatch" messages, it may be a case that RPE is already working on.  Since we do not have details, please confirm that the service is online and obtain a crash dump.

If your customer is unwilling to reboot, please collect a gcore of smbd (gcore -g `pgrep smbd`) for diagnosis, and request a crash dump should the problem recur.



If you cannot wait for support and need to bring your service back online, you have two options.


The first is to restart the SMB service.  To do this, in the BUI, navigate to Configuration / Services and click the circular arrow icon next to SMB.  This usually restarts the SMB service and brings it back online.

However, please note that restarting the SMB service makes it next to impossible for technical support staff to identify the cause.


The other option is a Diagnostic Reboot.  To do this, in the BUI, click the circular arrow icon at the top left of the screen, right beneath the Oracle logo.  When prompted to confirm, select the "Gather Diagnostics" check box and reboot.

Because the collected data must be read, compressed, and written, this reboot takes longer than a normal reboot.  The duration is directly related to the amount of memory installed; in a large configuration, it may take a few minutes.

Back to <Document 1402353.1> Sun Storage 7000 Unified Storage System: How to Troubleshoot Active Directory Issues

References

<BUG:15661658> - SUNBT6975798 CIFS SERVER CANNOT RECOVER FROM DC REPLY MISMATCH IN LIBSMBRDR
<NOTE:1402353.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Active Directory Issues

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.