
Asset ID: 1-72-1538731.1
Update Date: 2017-10-05

Solution Type: Problem Resolution Sure

Solution 1538731.1: Sun Storage 7000 Unified Storage System: Replication fails with "space quota exceeded"


Related Items
  • Sun ZFS Storage 7320
  • Sun Storage 7210 Unified Storage System
  • Oracle ZFS Storage ZS3-BA
  • Oracle ZFS Storage Appliance Racked System ZS4-4
  • Oracle ZFS Storage ZS3-2
  • Sun Storage 7410 Unified Storage System
  • Oracle ZFS Storage ZS3-4
  • Sun ZFS Storage 7420
  • Sun Storage 7310 Unified Storage System
  • Oracle ZFS Storage ZS4-4
  • Sun Storage 7110 Unified Storage System
  • Sun ZFS Storage 7120
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS

In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-6781443331>

Applies to:

Sun ZFS Storage 7420 - Version All Versions to All Versions [Release All Releases]
Sun ZFS Storage 7320 - Version All Versions to All Versions [Release All Releases]
Sun Storage 7110 Unified Storage System - Version All Versions to All Versions [Release All Releases]
Sun ZFS Storage 7120 - Version All Versions to All Versions [Release All Releases]
Sun Storage 7410 Unified Storage System - Version All Versions to All Versions [Release All Releases]
7000 Appliance OS (Fishworks)

Symptoms

From the replication SOURCE system:

Fri Feb 8 04:54:21 2013
nvlist version: 0
time = 0x511484fd
hrtime = 0xa69b652282c50
action = (embedded nvlist)
nvlist version: 0
target_label = u1eis04nas17-bkp
target_uuid = 3ca5516f-0ac0-e587-b8ea-dfdad7c90231
uuid = 1a52ef6e-064b-eb62-d0b0-d2f9e5e1d68a
state = sending
dataset = exalogic/local/NODE_8/general
(end action)

event = update done
result = failure
errmsg = stage 'stream_send' failed: zfs_send: cannot send 'exalogic/local/NODE_8': Broken pipe
remote_status = ok


From the replication TARGET system:

Fri Feb 8 04:50:50 2013
nvlist version: 0
time = 0x5114842a
hrtime = 0x117cd408d57f89
pkg = (embedded nvlist)
nvlist version: 0
source_asn = bd8f7331-c1bb-c49e-8186-e23ff5c3f597
source_name = aueis12nasx03
uuid = 1a52ef6e-064b-eb62-d0b0-d2f9e5e1d68a
state = receiving
(end pkg)

event = recv_done
result = failed
error = zfs_receive failed: cannot receive new filesystem stream: destination pool17a/nas-rr-1a52ef6e-064b-eb62-d0b0-d2f9e5e1d68a/NODE_8 space quota exceeded


From the replication TARGET system ALERTS:

Fri Feb 8 04:38:51 2013
nvlist version: 0
class = alert.ak.appliance.nas.project.replication.receive.start
source = svc:/appliance/kit/akd:default
project = NODE_8
source_host = aueis12nasx03
uuid = 788e8679-0936-6d60-8472-9cc60ad6b438
link =

Fri Feb 8 04:50:50 2013
nvlist version: 0
class = alert.ak.appliance.nas.project.replication.receive.fail.misc
source = svc:/appliance/kit/akd:default
link = 788e8679-0936-6d60-8472-9cc60ad6b438
project = NODE_8
source_host = aueis12nasx03
ak_errmsg = zfs_receive failed: cannot receive new filesystem stream: destination pool17a/nas-rr-1a52ef6e-064b-eb62-d0b0-d2f9e5e1d68a/NODE_8 space quota exceeded
uuid = 501adb71-c35f-c420-e929-b4899247620b

 

Cause

The "space quota exceeded" error message for the 'NODE_8' replications is related to the 'quota_snap' property setting.

Data quotas

A data quota enforces a limit on the amount of space a filesystem or project can use. By default, it will include the data in the filesystem and all snapshots. Clients attempting to write new data will get an error when the filesystem is full, either because of a quota or because the storage pool is out of space. As described in the snapshot section, this behavior may not be intuitive in all situations, particularly when snapshots are present. Removing a file may cause the filesystem to write new data if the data blocks are referenced by a snapshot, so it may be the case that the only way to decrease space usage is to destroy existing snapshots.

If the 'include snapshots' property is unset, then the quota applies only to the immediate data referenced by the filesystem, not any snapshots. The space used by snapshots is enforced by the project-level quota but is otherwise not enforced. In this situation, removing a file referenced by a snapshot will cause the filesystem's referenced data to decrease, even though the system as a whole is using more space. If the storage pool is full (as opposed to the filesystem reaching a preset quota), then the only way to free up space may be to destroy snapshots.
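
At the underlying ZFS level this distinction is usually described with the quota and refquota properties: quota counts the data plus snapshots and descendants, while refquota counts only the data directly referenced by the filesystem. A minimal sketch of the two forms, shown with standard zfs commands for illustration only (on the appliance these limits are normally set through the BUI or the appliance CLI, as shown further below); the dataset name and the 10G size are taken from and modeled on the example later in this document:

Limit total space including snapshots (comparable to a quota with 'include snapshots' set):
# zfs set quota=10G pool-570/local/FMW_dr_clusters/IDMMS1

Limit only the data directly referenced by the filesystem (comparable to 'include snapshots' unset, i.e. quota_snap=false):
# zfs set refquota=10G pool-570/local/FMW_dr_clusters/IDMMS1

Show how much space each form would count against the limit:
# zfs get used,referenced,usedbysnapshots pool-570/local/FMW_dr_clusters/IDMMS1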

Data quotas are strictly enforced, which means that as space usage nears the limit, the amount of data that can be written must be throttled as the precise amount of data to be written is not known until after writes have been acknowledged. This can affect performance when operating at or near the quota. Because of this, it is generally advisable to remain below the quota during normal operating procedures.

Quotas are managed through the BUI under Shares -> General -> Space Usage -> Data.

They are managed in the CLI as the quota and quota_snap properties.

To set the 'quota_snap' property for a share via the CLI (example):

clownfish:shares default/foo > get
aclinherit = restricted (inherited)
atime = true (inherited)
checksum = fletcher4 (inherited)
compression = off (inherited)
copies = 1 (inherited)
mountpoint = /export/foo (inherited)
quota = 0 (inherited)
readonly = false (inherited)
recordsize = 128K (inherited)
reservation = 0 (inherited)
secondarycache = all (inherited)
nbmand = false (inherited)
sharesmb = off (inherited)
sharenfs = on (inherited)
snapdir = hidden (inherited)
vscan = false (inherited)
sharedav = off (inherited)
shareftp = off (inherited)
root_group = other (default)
root_permissions = 700 (default)
root_user = nobody (default)
casesensitivity = (default)
normalization = (default)
utf8only = (default)
quota_snap = (default)
reservation_snap = (default)
custom:int = (default)
custom:string = (default)
custom:email = (default)
clownfish:shares default/foo > set quota_snap=false
quota_snap = false(uncommitted)


clownfish:shares default/foo > commit
clownfish:shares default>

Solution

Recommendation: Set the 'quota_snap' property on the project/share to FALSE.
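
For a project, a similar sequence applies in the appliance CLI. The sketch below uses a hypothetical project name ('myproject') and assumes the project exposes the same quota_snap property as the share example above; shares that inherit the property from the project will typically pick up the new value, while shares with a local override need the same change applied individually:

clownfish:> shares select myproject
clownfish:shares myproject> set quota_snap=false
                    quota_snap = false (uncommitted)
clownfish:shares myproject> commit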

The "space quota exceeded" can happen when there is no available space on shares as well (on source node):

# zfs list
NAME                                                                                   USED  AVAIL  REFER  MOUNTPOINT
pool-570/local/FMW                                                                      31K  55.4T    31K  /export
pool-570/local/FMW_dr_clusters                                                        1.18T  22.9T    31K  /export
pool-570/local/FMW_dr_clusters/IDMDOMAIN                                              10.3M  4.99G  9.96M  /export/IDMDOMAIN
pool-570/local/FMW_dr_clusters/IDMMS1                                                 20.0G      0  20.0G  /export/IDMMS1   <<< full
pool-570/local/FMW_dr_clusters/IDMMS2                                                 31.5K  5.00G  31.5K  /export/IDMMS2
pool-570/local/FMW_dr_clusters/IDM_BIN                                                4.39G  5.61G  4.38G  /export/IDM_BIN
pool-570/local/FMW_dr_clusters/IDM_DOMAIN                                             31.5K  5.00G  31.5K  /export/IDM_DOMAIN
pool-570/local/FMW_dr_clusters/IDM_P1                                                 30.0G      0  30.0G  /export/IDM_P1   <<< full
pool-570/local/FMW_dr_clusters/OCR2                                                   31.5K  10.0G  31.5K  /export/OCR2
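
To spot shares that have run out of space quickly, the same listing can be sorted by available space and restricted to the affected project, for example (standard zfs list options, shown here only as an illustration):

# zfs list -o name,avail,used,quota,refquota -s avail -r pool-570/local/FMW_dr_clusters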

The resolution was to increase the quota and reservation for the /export/IDMMS1 and /export/IDM_P1 shares (a CLI sketch follows the listing below):

# zfs list -o space | egrep "IDMMS1|IDM_P1"
NAME                                                              AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool-570/local/FMW_dr_clusters                                    22.8T   1.20T        0     31K              0      1.20T
pool-570/local/FMW_dr_clusters/IDMMS1                             10.0G   20.0G      64K   20.0G              0          0
pool-570/local/FMW_dr_clusters/IDM_P1                             15.0G   30.0G     340K   30.0G              0          0
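
A sketch of that change in the appliance CLI, using the share names from the listing above and illustrative sizes (the appropriate values depend on the expected data growth); the same change is then repeated for IDM_P1:

slcnas570:> shares select FMW_dr_clusters
slcnas570:shares FMW_dr_clusters> select IDMMS1
slcnas570:shares FMW_dr_clusters/IDMMS1> set quota=40G
                         quota = 40G (uncommitted)
slcnas570:shares FMW_dr_clusters/IDMMS1> set reservation=40G
                   reservation = 40G (uncommitted)
slcnas570:shares FMW_dr_clusters/IDMMS1> commit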

Finally, the replication succeeded:

slcnas570:shares FMW_dr_clusters replication> select action-000
cli:shares FMW_dr_clusters action-000> get
                           id = caf1f326-1e38-cebb-f3d5-bd69d1531419
                       target = slcnas505
                      enabled = true
                   continuous = false
                include_snaps = false
                max_bandwidth = unlimited
                   bytes_sent = 0
               estimated_size = 0
          estimated_time_left = 00:00:00
           average_throughput = 0B/s
                      use_ssl = true
                        state = idle
            state_description = Idle (no update pending)
                  next_update = Tue May 13 2014 09:00:00 GMT+0000 (UTC)
                    last_sync = Tue May 13 2014 08:47:57 GMT+0000 (UTC)
                     last_try = Tue May 13 2014 08:47:57 GMT+0000 (UTC)
                  last_result = success
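
Once the quota or quota_snap settings have been corrected, the next update can be triggered manually from the source instead of waiting for the scheduled run; a sketch using the same replication action context as above (sendupdate starts a manual update):

slcnas570:shares FMW_dr_clusters replication> select action-000
slcnas570:shares FMW_dr_clusters action-000> sendupdate
slcnas570:shares FMW_dr_clusters action-000> get last_result

When the update completes, last_result should again report success, as in the output above.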

It is worth mentioning a few quota-related bugs that have been fixed in release 2013.1.1.9:

   15505861     SUNBT6744280 write performance degrades when ZFS filesystem is near quota (18524586)
   15758542     SUNBT7117263 Replications can fail with 'space quota exceeded' after compression (18531721)
   16849863     clone of a share fails if the share run out of space due to quota (18531713)
   17192457     replication reverse fails due to quota validation (18531709)
   17529610     allow zfs_inherit to reset quota even when it is less than used space (18524588)
   17563136     Exceeding quota in replica package makes it completely useless (18531723)
   18110996     clone of a share fails if the share run out of space due to quota (18531731)
   18245698     clone into project that exceeded the space quota produce inconsistent results (18531733)
 

References

<NOTE:1503867.1> - Configure and Mount NFS shares from SUN ZFS Storage 7320 for SPARC SuperCluster
<BUG:15505861> - SUNBT6744280 WRITE PERFORMANCE DEGRADES WHEN ZFS FILESYSTEM IS NEAR QUOTA
<BUG:16849863> - CLONE OF A SHARE FAILS IF THE SHARE RUN OUT OF SPACE DUE TO QUOTA
<BUG:17192457> - REPLICATION REVERSE FAILS DUE TO QUOTA VALIDATION
<BUG:17529610> - ALLOW ZFS_INHERIT TO RESET QUOTA EVEN WHEN IT IS LESS THAN USED SPACE
<BUG:17563136> - EXCEEDING QUOTA IN REPLICA PACKAGE MAKES IT COMPLETELY USELESS
<BUG:18110996> - CLONE OF A SHARE FAILS IF THE SHARE RUN OUT OF SPACE DUE TO QUOTA
<BUG:18245698> - CLONE INTO PROJECT THAT EXCEEDED THE SPACE QUOTA PRODUCE INCONSISTENT RESULTS

Attachments
This solution has no attachment