Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure

Solution 1951504.1 : DAT-72 - ASC: 0x0 ASCQ: 0x2 : Volume_Overflow, End-of-Media Detected, End of Partition/Medium Detected
This document describes how to handle media errors on DAT 72 tape drives when the following symptoms appear in host OS logs:
- ASC/ASCQ: 0x0/0x2
- Volume_Overflow
- End-of-Media Detected
- End of partition/medium detected
Applies to:
Sun Storage DAT 72 Tape Drive - Version Not Applicable and later
Sun SPARC Enterprise T5220 Server - Version All Versions and later
Sun SPARC Enterprise T2000 Server - Version All Versions and later
Information in this document applies to any platform.

Symptoms

The host OS will report the following error in the /var/adm/messages file:
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.warning] WARNING: /pci@0/pci@0/pci@8/pci@0/pci@8/pci@0/scsi@8/st@0,0 (st3):
Dec 3 13:55:10 xxxxxxxx Error for Command: write Error Level: Fatal
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.notice] Requested Block: 1588995 Error Block: 1588995
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.notice] Vendor: HP Serial Number: 9 $DR-1
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.notice] Sense Key: Volume_Overflow
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.notice] ASC: 0x0 (end of partition/medium detected), ASCQ: 0x2, FRU: 0x0
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.notice] End-of-Media Detected
Dec 3 13:55:26 xxxxxxxx scsi: [ID 107833 kern.warning] WARNING: /pci@0/pci@0/pci@8/pci@0/pci@8/pci@0/scsi@8/st@0,0 (st3):
Dec 3 13:55:26 xxxxxxxx Error for Command: write_file_mark Error Level: Fatal
Dec 3 13:55:26 xxxxxxxx scsi: [ID 107833 kern.notice] Requested Block: 1588995 Error Block: 1588995
Dec 3 13:55:26 xxxxxxxx scsi: [ID 107833 kern.notice] Vendor: HP Serial Number: 9 $DR-1
Dec 3 13:55:26 xxxxxxxx scsi: [ID 107833 kern.notice] Sense Key: Volume_Overflow
Dec 3 13:55:26 xxxxxxxx scsi: [ID 107833 kern.notice] ASC: 0x0 (end of partition/medium detected), ASCQ: 0x2, FRU: 0x0
Dec 3 13:55:26 xxxxxxxx scsi: [ID 107833 kern.notice] End-of-Media Detected

Note that "Requested Block" and "Error Block" are identical in both messages and that the block numbers are very high. If the numbers do not match, the error is unlikely to have been reported correctly.

There will be a number of soft errors accumulated on this tape drive:

# iostat -E
[...]
st3 Soft Errors: 6 Hard Errors: 0 Transport Errors: 0
Vendor: HP Product: C7438A Revision: ZP8B Serial No: 9

Depending on what method is being used for the backup, different error messages will appear and they may be alarming. For example:

Filesystem backup started @ Wednesday, December 3, 2014 02:19:05 PM EAT
Backing up rpool file system
Backing up rpool/ROOT file system
Backing up rpool/ROOT/SDP5 file system
Backing up rpool/var file system
Backing up rpool/var/opt file system
Backing up rpool/var/opt/fds file system
warning: cannot send 'rpool/var/opt/fds@backup20141203x1341': I/O error
Filesystem backup ended @ Wednesday, December 3, 2014 05:41:56 PM EAT

As can be seen in the example above, the drive had been writing successfully for about 3 hours and 20 minutes when the error appeared. A DAT-72 drive writing at 5 MB/s can write approximately 18 GB of data in one hour. If the drive was able to write continuously for this long without an exception, the failure is unlikely to be a read/write issue.

Changes

There may have been a large amount of data imported to the server recently, or the server may have been upgraded, with a fallback installation taking up disk capacity. However, even if there are no apparent changes, the server may have been accumulating data and increasing the backup set until it grew beyond tape capacity.

Cause

The backup set has grown too large to fit on a single cartridge. A DAT 72 cartridge holds 36 GB of data natively (without compression) or up to 72 GB of data with 2:1 compression. This compression ratio is possible, but due to the small size of the drive's buffer, it is difficult to achieve. In the example presented above, the file systems being backed up were 64.3 GB in size:

# df -kl
Filesystem kbytes used avail capacity Mounted on
[...]
rpool/ROOT/SDP5 140894208 49825270 59212177 46% /
rpool 140894208 97 59212177 1% /rpool
rpool/var/opt/fds 140894208 12965342 59212177 18% /var/opt/fds

In addition to this data, there may have been other data already written to tape in this session. However, at 64 GB, it is already barely possible to fit this data on a single cartridge.

Solution

Workaround:

If the backup set size can be reduced to below the tape cartridge's native capacity, the currently critical backup will complete. This can be achieved by splitting the backup across several cartridges (if your application or script allows splitting the backup), by selectively backing up different data to different cartridges, or by erasing unnecessary files from your disk storage.

NOTE: Erasing data is risky. Please take extra care that you do not erase critical data by accident, especially if you do not have recent backups! As this is a system administration task, we cannot advise what data may be safely erased. If you cannot determine this on your own, please engage internal or external consultancy services to advise what data may be deleted.
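Before repeating the backup job, it can help to check whether the backup set fits on one cartridge. The following is a minimal Python sketch, not an Oracle tool: it sums the "used" column of the df -kl output shown under Cause and compares the total with the 36 GB native capacity stated in this document. The function names (used_gb, cartridges_needed) are hypothetical. The sketch assumes that the "used" figures approximate the amount of data the backup writes and that capacities are quoted in decimal gigabytes.

```python
import math

# Native capacity as stated in this document (decimal units: 1 GB = 10**9 bytes).
DAT72_NATIVE_GB = 36


def used_gb(df_output, filesystems):
    """Sum the 'used' column (in KB) of `df -k` lines for the named file systems."""
    wanted = set(filesystems)
    total_kb = 0
    for line in df_output.splitlines():
        fields = line.split()
        # Expected layout: Filesystem kbytes used avail capacity Mounted-on
        if len(fields) == 6 and fields[0] in wanted:
            total_kb += int(fields[2])
    return total_kb * 1024 / 10**9


def cartridges_needed(size_gb, capacity_gb=DAT72_NATIVE_GB):
    """Number of cartridges required if the backup is split at native capacity."""
    return math.ceil(size_gb / capacity_gb)


# The df -kl rows from the example in this document.
DF_SAMPLE = """\
rpool/ROOT/SDP5 140894208 49825270 59212177 46% /
rpool 140894208 97 59212177 1% /rpool
rpool/var/opt/fds 140894208 12965342 59212177 18% /var/opt/fds
"""

size = used_gb(DF_SAMPLE, ["rpool/ROOT/SDP5", "rpool", "rpool/var/opt/fds"])
print(f"backup set: {size:.1f} GB")                      # backup set: 64.3 GB
print(f"fits native capacity: {size <= DAT72_NATIVE_GB}")  # False
print(f"cartridges needed: {cartridges_needed(size)}")     # 2
```

For the example data, the check reproduces the 64.3 GB figure and shows that the set would have to be split across at least two cartridges if no compression is assumed.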
Test the workaround by repeating the backup job. If the backup set size is less than 36 GB and the job still fails, engage Oracle Tape Support to resolve the issue.

If the backup succeeds, note that this is only a workaround and will not resolve the issue permanently. As the backup set grows, you will inevitably hit a hard limit where you cannot erase any more data without compromising your system, and at that point you risk being unable to complete critical backups.

Solution:

As a proactive measure, please contact your Oracle Sales representative and inquire about backup solutions better suited to your environment. If it is impossible to reduce the backup set size to less than 36 GB without compromising your system, this is the only solution to the issue you are facing.

In addition to offering vastly more capacity, newer tape technology increases throughput by an order of magnitude. A half-height LTO-5 tape drive in a 1U rackmount takes up the same amount of space as a typical DAT 72 tape drive, but it allows you to write up to 1500 GB of data natively to a compatible LTO-5 cartridge, or up to 3000 GB with compression, roughly a 40-fold increase over DAT 72. Additionally, the maximum native throughput of an LTO-5 drive is 140 MB/s, a 28-fold improvement over DAT 72. This throughput is maintained on restore, allowing much faster disaster recovery.

Please bear in mind that replacing the existing DAT 72 hardware with the same type of hardware will not resolve this issue. Attempting such a replacement exposes you to unnecessary operational risk, as you may remain without backups for considerable periods of time.
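To help decide whether a logged failure matches the capacity signature described under Symptoms rather than a hardware fault, the check can be scripted. The following is a minimal Python sketch under stated assumptions: the function name capacity_overflow_events is hypothetical, and the line patterns are taken only from the /var/adm/messages excerpt in this document, so other driver versions may format these lines differently. It reports an event only when the sense key is Volume_Overflow, ASC/ASCQ is 0x0/0x2, and "Requested Block" equals "Error Block".

```python
import re

# Patterns modeled on the st driver messages quoted in this document.
BLOCKS = re.compile(r"Requested Block:\s*(\d+)\s+Error Block:\s*(\d+)")
END_OF_MEDIUM = re.compile(r"ASC: 0x0\b.*ASCQ: 0x2\b")


def capacity_overflow_events(log_text):
    """Return (requested_block, error_block) pairs that match the signature."""
    events = []
    blocks = None      # block numbers of the error currently being read
    overflow = False   # True once Volume_Overflow is seen for that error
    for line in log_text.splitlines():
        match = BLOCKS.search(line)
        if match:
            blocks = (int(match.group(1)), int(match.group(2)))
            overflow = False
        elif "Sense Key: Volume_Overflow" in line:
            overflow = True
        elif END_OF_MEDIUM.search(line):
            # Count the error only if all three conditions hold.
            if blocks and overflow and blocks[0] == blocks[1]:
                events.append(blocks)
            blocks, overflow = None, False
    return events


# One error record from the example in this document.
SAMPLE = """\
Dec 3 13:55:10 xxxxxxxx Error for Command: write Error Level: Fatal
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.notice] Requested Block: 1588995 Error Block: 1588995
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.notice] Sense Key: Volume_Overflow
Dec 3 13:55:10 xxxxxxxx scsi: [ID 107833 kern.notice] ASC: 0x0 (end of partition/medium detected), ASCQ: 0x2, FRU: 0x0
"""

print(capacity_overflow_events(SAMPLE))  # [(1588995, 1588995)]
```

An empty result does not rule out a capacity problem; it only means the log does not show the exact pattern described in this document.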
Attachments

This solution has no attachment