Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure

Solution 2326695.1: Oracle ZFS Storage Appliance: Resumed Replication Does Not Correctly Report Bytes_Sent, Estimated_Size and Estimated_Time_Left
Created from <SR 3-16083948771>

Applies to:

Sun ZFS Storage 7120 - Version All Versions to All Versions [Release All Releases]
Sun ZFS Storage 7420 - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage ZS3-2 - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage ZS4-4 - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage ZS5-2 - Version All Versions to All Versions [Release All Releases]
7000 Appliance OS (Fishworks)

Symptoms

One of the new features available in OS8.7.0 is Resumable Replication. In previous versions of the code, whenever an initial (or seed) replication failed (for example, because a node panicked or the network between source and target dropped), all replicated data was lost. The replication action had to be destroyed and a new replication action created. With Resumable Replication, an initial replication is checkpointed, allowing the replication action to pick up where it left off should that initial replication fail.

The problem with Resumable Replication is that, once a replication has been resumed, the tools used to measure bytes_sent, estimated_size and estimated_time_left no longer report accurate information. As a result, you cannot tell how long the seed replication will take to complete.
Cause

Resumable Replication is always on. This reporting issue appears whenever the initial replication fails for any reason and the initial replication action is then restarted.
Solution

Consider the following example. A 1TB initial replication is set up between a source and a target appliance. The replication starts successfully, and the alert log records the activity:

    Began replicating 'Dbase' to appliance 'zs5'.

The replication action is monitored. Note the accurate reporting of bytes_sent, estimated_size and estimated_time_left:

    zs5:shares Dbase action-000> show
    Properties:
        id = 30812650-7933-47e0-9904-b81ec62c8225
        target_id = d1bbf791-71bd-4fb1-b04c-c78eb5b3b4bb
        target = xx.xxx.xxx.xxx
        target_pool = target-pool
        enabled = true
        continuous = false
        include_snaps = true
        retain_user_snaps_on_target = false
        dedup = false
        include_clone_origin_as_data = false
        max_bandwidth = unlimited
        bytes_sent = 461G
        estimated_size = 1.0T
        estimated_time_left = 01:31:23
        average_throughput = 112MB/s
        use_ssl = false
        compression = on
        export_path =
        state = sending
        state_description = Sending update
    .....

On the target node, the dataset accurately reflects the amount of data replicated:

    zs5# zfs list -t all | egrep 'USED|30812650-7933-47e0-9904-b81ec62c8225'
    NAME  USED  AVAIL  REFER  MOUNTPOINT
    target-pool/nas-rr-30812650-7933-47e0-9904-b81ec62c8225  461G  18.9T  87.5K  none
    target-pool/nas-rr-30812650-7933-47e0-9904-b81ec62c8225/Dbase  461G  18.9T  87.5K  /export
    target-pool/nas-rr-30812650-7933-47e0-9904-b81ec62c8225/Dbase@.rr-30812650-7933-47e0-9904-b81ec62c8225-1  0  -  87.5K  -

The replication is then interrupted. In this particular case, the target node panicked. The alert log identifies the break in replication:

    Replication of 'Dbase' to 'zs5' failed after sending '461G' out of '1.03T' because the appliance could not contact the replication target. Action - 30812650-7933-47e0-9904-b81ec62c8225.
Upon recovery of the target node, the initial replication action is resumed and the activity is recorded in the logs:

    zs5:shares Dbase action-000> sendupdate
    Began replicating 'Dbase' to appliance 'zs5'.

However, when you look at the action, no meaningful data is reported for bytes_sent, estimated_size and estimated_time_left:

    zs5:shares Dbase action-000> show
    Properties:
        id = 30812650-7933-47e0-9904-b81ec62c8225
        target_id = d1bbf791-71bd-4fb1-b04c-c78eb5b3b4bb
        target = xx.xxx.xxx.xxx
        target_pool = target-pool
        enabled = true
        continuous = false
        include_snaps = true
        retain_user_snaps_on_target = false
        dedup = false
        include_clone_origin_as_data = false
        max_bandwidth = unlimited
        bytes_sent = 12.6G
        estimated_size = 11.5K
        estimated_time_left = 00:00:00
        average_throughput = 110MB/s
        use_ssl = false
        compression = on
        export_path =
        state = sending
        state_description = Sending update

These values no longer make it possible to determine how much time is left before the replication completes. The BUI reports the same incorrect values.
Unfortunately, there is no accurate way to determine bytes_sent, estimated_size and estimated_time_left from the BUI or CLI. Please open an SR with Oracle Support, who can determine these values with internal tools.

You can also look at the dataset on the target. The USED column shows how much data has been replicated so far, and from that you can calculate the time left:

    # zfs list -t all | egrep 'USED|30812650-7933-47e0-9904-b81ec62c8225'
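
The time-left calculation can be sketched as follows. This is a minimal Python illustration, not an Oracle tool: the function names are hypothetical, and the inputs are the USED value from the target dataset, the total size quoted in the failure alert (1.03T in the example), and the average_throughput shown for the action. It assumes that the size suffixes are powers of 1024, that USED on the target approximates the amount already sent, and that throughput stays roughly constant, so the result is only a rough estimate.

```python
import re

# Assumption: K/M/G/T/P suffixes are binary multiples (powers of 1024).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4, "P": 1024**5}
_SIZE_RE = re.compile(r"([0-9]+(?:\.[0-9]+)?)\s*([KMGTP]?)B?(?:/s)?", re.IGNORECASE)


def parse_size(text):
    """Convert a string such as '461G', '1.03T' or '112MB/s' to bytes."""
    match = _SIZE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit.upper()]


def estimate_seconds_left(used, total, throughput):
    """Estimate remaining seconds from USED, total size and throughput per second."""
    rate = parse_size(throughput)
    if rate <= 0:
        raise ValueError("throughput must be greater than zero")
    # Clamp at zero in case USED already exceeds the quoted total.
    remaining = max(parse_size(total) - parse_size(used), 0.0)
    return remaining / rate


def format_hms(seconds):
    """Format a number of seconds as HH:MM:SS, dropping fractions of a second."""
    whole = int(seconds)
    hours, rest = divmod(whole, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Values taken from the example: USED on the target, the total from the
    # failure alert, and average_throughput from the action properties.
    left = estimate_seconds_left("461G", "1.03T", "112MB/s")
    print(format_hms(left))
```

With the example values at the moment of the failure (461G of 1.03T at 112MB/s), the sketch gives roughly an hour and a half, in line with the estimated_time_left that was reported before the interruption. In practice, substitute the current USED value from the target each time you re-run the estimate.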