Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Predictive Self-Healing Sure Solution

1522925.1 : Snap Management Utility for the Oracle Database - Information and Troubleshooting

Applies to:
Oracle ZFS Storage ZS5-4 - Version All Versions and later
Oracle ZFS Storage ZS5-2 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
Oracle ZFS Storage ZS3-4 - Version All Versions and later
Oracle ZFS Storage ZS3-BA - Version All Versions and later
7000 Appliance OS (Fishworks)

Purpose
This document provides general information and troubleshooting tips for Oracle Snap Management Utility for Oracle Database.

Scope
This document is intended for database administrators and Oracle support engineers.
Details
Oracle Snap Management Utility for Oracle Database is a management tool for administering snapshot-based backups of Oracle databases hosted on Sun ZFS Storage Appliance systems. The tool allows an administrator to back up, restore, recover and clone Oracle databases using ZFS snapshot technology. These types of backups (backups to primary storage) have specific use cases and are not intended to replace the standard backup practices the administrator uses in their database environments. Additionally, the tool can create database clones from RMAN backups that are stored on the appliance.

The tool consists of a single Java application that is designed to run on a management station as a background process on Unix systems and as a Windows Service on Windows systems. Once the tool has started, you access it using standard desktop clients such as the ssh command, the Windows Remote Shell (winrs.exe) command or a web browser.

During operations the tool establishes secure shell sessions with the database host and storage systems and remotely executes a series of synchronized commands. The user must provide the tool with valid user accounts to use during snapshot operations. The user specified in these accounts must have the appropriate permissions and privileges to run various commands on the host or storage, including creating mountpoints, mounting filesystems, scanning the SCSI bus for new disks, taking a snapshot of a share, rolling back a share to a snapshot, cloning a snapshot and destroying a share. On host systems the tool alternates between running commands as the user specified in the account (required to be a privileged user who can perform filesystem and SCSI management) and as the Oracle database user.

The tool uses the sqlplus command to control and query the database during operations. It uses OS authentication when connecting to the database (as the Oracle user). Depending on the operation being performed, the tool may shut down and restart the database.

Software Data Directory
The software data directory is located in the following locations depending on the host operating system:
The program data directory is not removed when the software is uninstalled. The user can remove the software data directory after uninstalling the software, once the data is no longer needed.

Backing up the software data directory
The data directory can be backed up by first shutting down the software and then copying the directory and its contents.

Restoring the software data directory
The data directory can be restored by first shutting down the software and then restoring the data directory from a backup or other copy.

System Requirements
The software has the following system requirements.

Management Host
Database Host
Storage Appliance
Oracle Database
Support Matrix
Network (TCP) Port Usage
Incoming TCP ports:
The incoming ports are configurable by modifying the file /opt/oracle/smu/etc/smu.conf.
The outgoing ports are configurable by overriding the default port property of the account you create for the resource. This information is also described in the User Guide (http://docs.oracle.com/cd/E39520_01/pdf/E39313.pdf) in Table 3 on page 11 and in Table 9 on page 54.
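When a port setting is in doubt, a quick reachability check from the management host can help separate network problems from account problems. The following is a minimal sketch, not part of the product: the function name `port_is_reachable` is hypothetical, and the host and port to test are whatever your own smu.conf and account settings specify.

```python
import socket


def port_is_reachable(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s.

    Hypothetical helper for illustration; it only tests that the port
    accepts a TCP connection and does not authenticate.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_is_reachable("dbhost.example.com", 22)` would test the standard SSH port on a placeholder host name; substitute the address and port configured for the resource.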
Limitations and Restrictions
The software has the following limitations and restrictions.

General
Snap Backup
Snap Restore
Snap Recover
Snap Clone
RMAN Clone
Sample RMAN export runblock:
configure controlfile autobackup on;

Clone Deprovision
Troubleshooting
This section provides troubleshooting tips for the tool.

Expected activities not seen when viewing the activity log
Activities can be filtered. If you do not see the expected rows in the activity table, check whether any filter criteria are specified. Change or reset the criteria to view the activity logs you are interested in.

ORA-27102: out of memory while cloning a database
If you receive ORA-27102 when cloning a snap backup or RMAN backup, there is not enough shared memory available for the database clone. You can either add more shared memory or delete other databases that are running on the host or cluster. Alternatively, consider cloning the database to another host or cluster that has enough shared memory available.

ORA-01034: ORACLE not available during snap backup
Before creating a snap backup of the database, the tool must be able to connect to and query the database for vital information, including the list of files the database is using. This error indicates that the database instance the tool tried to connect to is shut down. You must restart the database instance so that the tool can operate correctly. If this is a RAC database, you can also modify the host account for the database to use one of the other cluster nodes that is up.

Cluster nodes that were down before offline snap backup are up and running afterward
SMU uses the srvctl stop and srvctl start commands to shut down the database temporarily when performing an offline backup of the database. For a cluster database, nodes that were down before the backup are brought up after the backup. If you want a particular cluster node to remain down after the backup, you must use the srvctl disable command to disable the node so that the srvctl start command will not restart it.

Auth fail or HTTP 401 when performing a task
When executing a task, the tool logs in to one or more host and storage systems. If the user name or password for the account is incorrect, the task fails with either “Auth fail” if the account is for a Linux, Solaris or storage appliance system, or an HTTP 401 error if the account is for a Windows system. You can test account settings prior to executing tasks to help ensure that they are correct. Use the “accounts test” command from the CLI, or click the test button in the column of the account whose settings you want to check.

Can not delete a backup
You can only delete a backup if it does not have any dependent clones. When performing a delete task, the tool checks whether any clones were made from the backup and are still active. If there are, the tool fails the task and does not delete the backup.

Can not restore to the specified backup
You can only restore a database to the specified backup if there are no database clones made from any newer backups of the database. This is because backups are based on ZFS snapshots: when you roll back to a ZFS snapshot, any newer snapshots are automatically destroyed.

Can not clone from snap backup to different host
Wrong network address is used in the mount entries for the clone database
This version of the product uses a simple algorithm to choose the network path to the clone shares.
If the network address chosen for the clone shares is not suitable or desired, the user can perform the following steps:

For Linux and Solaris database hosts
For Windows database hosts
SMU failed on the first mount of the clone of an rman backup - permission denied

FFAS: Error creating clone using Oracle Snap Management Utility

Invalid application file layout. Remote share X has already been backed up
This error indicates that the database file layout does not allow an online backup of the database to be taken. To take an online backup, the datafiles and archived logs must reside in separate shares. During an online backup, snapshots of the datafile shares are taken first while the database is in backup mode. Next, the current redo logs are archived. Then snapshots of the archived log shares are taken. If, during the backup sequence, the tool detects that snapshots of a share have already been taken, it fails the online backup task and displays this error.

host X login: timeout: socket is not established
This error indicates that the software could not connect to the database host (Linux or Solaris) or storage appliance. It occurs when the database host or storage appliance is not reachable or does not respond to the connection request within a timeout period. Verify that the database host or storage appliance is up and reachable over the network and try the operation again.

Tasks are not sortable by Task ID using the column sort controls
The software UI was developed using the Oracle Application Development Framework (ADF) toolkit. The first column in ADF tables is not sortable by design. In this case the first column in the Tasks table is the Task ID. Rows can be sorted by the first column using the Advanced Sorting menu, which can be accessed by clicking View → Sort → Advanced from the Tasks table menu.

Could not find all shares or shares were unavailable due to pool status
This error message can occur during a task when the software searches for the shares to operate on.
One of the main features the software provides is the ability to map shares from their external attributes (mountpoint or lunguid) to their internal appliance identifier (pool/collection/project/share). This error message indicates that the shares the software was looking for either do not exist on the appliance or are not available because the storage pool they are in is in a state other than online or degraded. You can encounter this error if you specify the wrong storage account with a database account. In particular, this error occurs when using ASM databases and the wrong storage account is specified for the database. The software is not able to determine which external storage system an iSCSI LUN is using and so only searches the storage that is linked to the database account.

ORA-19809 occurs when creating a snap clone database
This error message indicates that the size specified for the flash recovery area (FRA) is too small to support the clone database. The software sets the size of the FRA for the clone database based on the db_recovery_file_dest_size initialization parameter of the database that was backed up. Since a snap clone is an identical copy of the origin database, including the size of each redo log, the size parameter should normally be adequate for the clone database. It is nevertheless possible for the FRA size to be too small. To address this issue, create the clone database from a backup of the origin database that has a suitable FRA size.

The WS-Management service cannot process the request because the request contained invalid selectors for the resource
This error occurs with Windows hosts when the shell session that the software has established has been idle too long. The software uses Windows Remote Shell to connect to the host and establish a session. Windows Remote Shell automatically logs the session out if the idle timeout period expires. The software alternates issuing commands to the host and storage.
It is possible for the host session to be idle while the software sends commands to the storage. To resolve this issue, increase the WinRS idle timeout period (the software requires the timeout period to be 2 hours or more).
C:\>winrm set winrm/config/winrs @{IdleTimeout="7200000"}
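The IdleTimeout value is expressed in milliseconds, so the 2 hour minimum corresponds to 7200000. The sketch below shows the conversion and builds the same command line for other durations. The helper names are hypothetical and the script only prints the command; it does not run winrm.

```python
def hours_to_ms(hours: float) -> int:
    """Convert hours to milliseconds (1 hour = 3,600,000 ms)."""
    return int(hours * 60 * 60 * 1000)


def winrs_idle_timeout_command(hours: float) -> str:
    """Build the winrm command line that sets the WinRS idle timeout.

    Hypothetical helper: enforces the 2 hour minimum stated in the text.
    """
    if hours < 2:
        raise ValueError("the software requires an idle timeout of 2 hours or more")
    return 'winrm set winrm/config/winrs @{IdleTimeout="%d"}' % hours_to_ms(hours)


print(winrs_idle_timeout_command(2))
# prints: winrm set winrm/config/winrs @{IdleTimeout="7200000"}
```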
BUI always displays fetching data or displays it frequently
The BUI is designed to refresh itself regularly so that it can display the current status of tasks and other items in the various UI panes. The amount of data to display can grow over time, because completed tasks are retained until the user removes or deletes them. A large number of accumulated tasks can prevent the BUI from refreshing its display properly. To address this issue, either delete completed tasks that are no longer needed or disable the UI refresh by modifying the global refresh settings.

No way to delete, trim or clear the activity log
The activity log is a record of all actions performed by the software users. It is meant to maintain an audit trail, so there is no user-supported way to remove entries from the log. The software BUI only displays the 1000 most recent activity log records. If you need to view more entries, use the software CLI, which displays all records in the log.

The WS-Management service cannot process the request. The maximum number of concurrent operations for this user has been exceeded. Close existing operations for this user, or raise the quota for this user
This error indicates that the WinRM setting MaxConcurrentOperationsPerUser is set too low. The recommended value for this setting is 1500. While performing operations, the software executes many SQL*Plus, RMAN and system commands on the host, more than the number of commands allowed by default. To modify this setting, run the following command:
C:\>winrm set winrm/config/service @{MaxConcurrentOperationsPerUser="1500"}
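To confirm the current value before or after changing it, you can capture the WinRM service configuration as text and compare the setting against the recommended 1500. The sketch below assumes the captured text lists settings as `Name = value` lines; the sample text is made up for illustration and the function names are hypothetical.

```python
import re

RECOMMENDED_MAX_OPS_PER_USER = 1500  # recommended value stated in the text above


def read_setting(config_text: str, name: str):
    """Return the integer value of `name` from 'Name = value' lines, or None."""
    pattern = re.compile(
        r"^\s*" + re.escape(name) + r"\s*=\s*(\d+)\s*$", re.MULTILINE
    )
    match = pattern.search(config_text)
    return int(match.group(1)) if match else None


def per_user_quota_ok(config_text: str) -> bool:
    """True if MaxConcurrentOperationsPerUser is present and at least 1500."""
    value = read_setting(config_text, "MaxConcurrentOperationsPerUser")
    return value is not None and value >= RECOMMENDED_MAX_OPS_PER_USER


# Made-up sample in the assumed 'Name = value' layout; real output has more fields.
SAMPLE = """Service
    MaxConcurrentOperations = 4294967295
    MaxConcurrentOperationsPerUser = 15
"""
print(per_user_quota_ok(SAMPLE))  # prints: False
```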
Clone database task hangs when target database host is Linux running UEK kernel and dNFS is enabled in the target Oracle home
This error occurs when the UEK kernel 2.6.32-300.11.1.el5uek is running on the database host. More information on this issue is available in Doc ID 1460787.1. To resolve this issue, either upgrade the kernel (see Doc ID 1460787.1) or disable dNFS. The software does not support the use of an oranfstab file in this release.

Clone database task hangs during control file creation on Windows database host
This error occurs because the WinRM setting MaxTimeoutms is set too low. Some of the commands the software runs can take a while to complete, and this parameter must be set high enough to allow these long-running commands to finish. You can verify this error by examining the software log for an exception like the following:
Exception in thread "Thread-416" javax.xml.ws.soap.SOAPFaultException: The WS-Management service cannot complete the operation within the time specified in OperationTimeout.
To address the issue, modify the MaxTimeoutms setting and run the clone task again.
C:\>winrm set winrm/config @{MaxTimeoutms="72000000"}
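Checking the software log for this exception can be scripted. The following is a minimal sketch that scans log lines for the OperationTimeout fault text quoted above; the function name is hypothetical, the first sample line is made up, and the location of the log file is not specified here.

```python
TIMEOUT_MARKER = (
    "cannot complete the operation within the time specified in OperationTimeout"
)


def find_operation_timeouts(log_lines):
    """Return (line_number, text) pairs for lines containing the timeout fault.

    `log_lines` can be any iterable of strings, such as an open file object.
    """
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if TIMEOUT_MARKER in line
    ]


SAMPLE_LOG = [
    "INFO clone task started\n",  # made-up line for illustration
    'Exception in thread "Thread-416" javax.xml.ws.soap.SOAPFaultException: '
    "The WS-Management service cannot complete the operation within the time "
    "specified in OperationTimeout.\n",
]

for number, text in find_operation_timeouts(SAMPLE_LOG):
    print(number, text)
```

To scan a real log, pass an open file object instead of the sample list, for example `with open(path) as f: find_operation_timeouts(f)`, where `path` is the log file used by your installation.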
No rows for datafile X in v$datafile_copy system view
This error occurs during an RMAN clone operation when the software cannot find a row for one of the backed up data files in the v$datafile_copy system view of the backed up control file. This indicates that the data file is not part of the backup set or that the control file was backed up before the data file was backed up. The software requires that each data file in the backup has a row in this system view so that it can calculate the maximum SCN (system change number) to recover the database to. To resolve this issue, you must create a valid backup that contains the data files, archived logs and control file.

Database already in backup mode
This error indicates that a database that the software attempted to back up was already in backup mode. This can indicate that another process or program is backing up the database. The software will not back up a database that is already in backup mode. If the database is not being backed up by other utilities, it must be taken out of backup mode manually before the software can successfully back it up.

Can not map disk <hostname | ip address>:<lun guid>
This error indicates that the software could not find the clone disk on the database host or node. When cloning an ASM database, the software clones the appropriate snapshot on the appliance to create new LUNs. The software then uses operating system specific commands to discover the clone LUNs from the database host or node. If the clone LUNs cannot be discovered, the software reports this error. The usual cause is that the appropriate iSCSI targets have not been logged into on the database host or node. The software performs no SAN configuration and requires that all iSCSI targets be logged into before any ASM database clone operations are performed.

Does SMU need to communicate with Recovery catalog at main site for deploying Dev. DB with image copy (Clone) at DR site?
SMU does not currently use the RMAN catalog. SMU requires that the backup shares contain a single full image copy backup. SMU scans the backup shares and identifies the control files, data files and archive logs in them. It then snaps and clones the backup shares, mounts the clone shares on the target database host, and proceeds to configure and start a clone database that uses the files as is.

If Source DB is RAC, Target DB should be RAC as well?
No. SMU can detect whether the backup is of a single instance or RAC database and performs the appropriate processing on the clone based on whether it is targeted for a single instance or RAC environment. In other words, you can create single instance or RAC clones regardless of whether the source is single instance or RAC.

There are multiple OS (Redhat/IA, SPARC/Solaris) at Source side, We should prepare the same platform for Target DBs?
SMU BUI Unresponsive after Login
Backup format should be image copy, correct? and need the following files - Data & Control
Yes, the backup must be in image copy format. (See the "RMAN Clone" section above. Backup files must use the RMAN %U format specification and must contain only one control file, one or more data files and one or more archived logs.)

SMU Snapshot Deletion
BUG 25700694 - SMU is unable to delete snapshot; the snapshot is missing
SMU is unable to delete the snapshot because the snapshot is missing.
Workaround from the bug: The customer reports the following workaround for their issue.
So, I just renamed the snapshot on the ZFSSA to the name known by SMU and
So, in case of a mismatch a snapshot with the same name needs to be created

Wrong Oracle Home Used After Database Upgrade
See this document for information: Oracle ZFS Storage Appliance: Snap Management Utility (SMU) Uses Wrong Oracle Home After Database Upgrade <Doc ID 2324472.1>

Download Snap Management Utility from <Patch 152376-02>: Oracle Snap Management Utility for Oracle Database 1.3.0: S10 patch

References
<BUG:25700694> - SMU IS UNABLE TO DELETE SNAPSHOT; THE SNAPSHOT IS MISSING
<NOTE:1210656.1> - Clone your dNFS Production Database for Testing
<NOTE:1460787.1> - DB Hangs When DNFS is Enabled on UEK kernel
<BUG:9316059> - CAN'T RENAME DISK GROUP ON 11GR2
<NOTE:1452614.1> - How To Setup DNFS (Direct NFS) On Oracle Release 11.2
<BUG:13571798> - DATABASE CANNOT CREATE DIRECTORIES ON DIRECT NFS VOLUME HANDLING INSTRUCTIONS:
<NOTE:762374.1> - Step by Step - Configure Direct NFS Client (DNFS) on Linux (11g)

Attachments
This solution has no attachment