Oracle Cloud Infrastructure Documentation

Backup Failures on Bare Metal and Virtual Machine DB Systems

Database backups can fail for various reasons. Typically, a backup fails because either the database host cannot access the object store, or there are problems on the host or with the database configuration.

This topic includes information to help you determine the cause of the failure and fix the problem. The information is organized into several sections, based on the error condition. If you already know the cause, you can skip to the section with the suggested solution. Otherwise, use the procedure in Determining the Problem to get started.

Determining the Problem

In the Console, a failed database backup either displays a status of Failed or hangs in the Backup in Progress or Creating state. If the error message does not contain enough information to point you to a solution, you can use the database CLI and log files to gather more data. Then, refer to the applicable section in this topic for a solution.

To identify the root cause of the backup failure

Database Service Agent Issues

Your Oracle Cloud Infrastructure Database uses an agent framework that lets you manage your database through the cloud platform. To resolve a backup failure, you might occasionally need to restart the dcsagent program if it has a status of stop/waiting.

To restart the database service agent

Oracle Clusterware Issues

Oracle Clusterware enables servers to communicate with each other so that they can function as a collective unit. Occasionally you might need to restart the Clusterware program to resolve a backup failure.

To restart the Oracle Clusterware

Object Store Connectivity Issues

Backing up your database to Oracle Cloud Infrastructure Object Storage requires that the host can connect to the applicable Swift endpoint. You can test this connectivity by using a Swift user.

To ensure your database host can connect to the object store

Host Issues

One or more of the following conditions on the database host can cause backups to fail:

Interactive Commands in the Oracle Profile

If an interactive command such as oraenv, or any command that might return an error or warning message, was added to the .bash_profile file for the grid or oracle user, Database service operations like automatic backups can be interrupted and fail to complete. Check the .bash_profile file for these commands, and remove them.
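The check described above can be sketched as a small scan of the profile text. This is a minimal illustration, not an Oracle-provided tool: the function name find_suspect_lines and the SUSPECT_COMMANDS tuple are hypothetical, and only oraenv comes from this topic (read is an assumed example of another command that waits for input).

```python
import re

# Hypothetical watch list. Only "oraenv" is named in this topic;
# "read" is an assumption added for illustration.
SUSPECT_COMMANDS = ("oraenv", "read")


def find_suspect_lines(profile_text, commands=SUSPECT_COMMANDS):
    """Return (line_number, line) pairs that mention a suspect command.

    Blank lines and lines that start with '#' are skipped. Inline
    comments are not handled; this is a review aid, not a shell parser.
    """
    pattern = re.compile(r"\b(" + "|".join(re.escape(c) for c in commands) + r")\b")
    hits = []
    for number, line in enumerate(profile_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if pattern.search(stripped):
            hits.append((number, stripped))
    return hits
```

To use it, read the .bash_profile file of the grid or oracle user into a string, pass it to find_suspect_lines, and review each reported line by hand before removing anything.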

The File System Is Full

Backup operations require space in the /u01 directory on the host file system. Use the df -h command on the host to check the space available for backups. If the file system has insufficient space, you can remove old log or trace files to free up space.
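As a scripted alternative to reading df -h output, the following sketch reports free space for a path using the Python standard library. The function names and the 10 GiB default threshold are assumptions for illustration; this topic does not state how much free space a backup needs.

```python
import shutil


def free_space_gib(path="/u01"):
    """Return the free space, in GiB, of the file system that holds path."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


def has_enough_space(path="/u01", min_free_gib=10.0):
    """Compare free space with a threshold.

    min_free_gib is an assumed value, not an Oracle requirement.
    """
    return free_space_gib(path) >= min_free_gib
```

Choose a threshold that reflects the size of your own backups and logs before relying on has_enough_space in a monitoring script.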

Incorrect Version of the Oracle Database Cloud Backup Module

Your system might not have the required version of the backup module (opc_installer.jar). See Unable to use Managed Backups in your DB System for details about this known issue. To fix the problem, you can follow the procedure in that section or simply update your DB system and database with the latest bundle patch.

Changes to the Site Profile File (glogin.sql)

Customizing the site profile file ($ORACLE_HOME/sqlplus/admin/glogin.sql) can cause managed backups to fail in Oracle Cloud Infrastructure. In particular, interactive commands can lead to backup failures. Oracle recommends that you not modify this file for databases hosted in Oracle Cloud Infrastructure.
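One way to review the site profile is to list every line that is neither blank nor a comment, so that any customization stands out. The sketch below does that under the assumption that any active statement in glogin.sql deserves a manual look; the function name active_glogin_lines is hypothetical, and the comment handling covers only the common SQL*Plus forms (--, REM or REMARK, and /* ... */ blocks).

```python
def active_glogin_lines(glogin_text):
    """Return (line_number, line) pairs for non-blank, non-comment lines."""
    active = []
    in_block_comment = False
    for number, line in enumerate(glogin_text.splitlines(), start=1):
        stripped = line.strip()
        if in_block_comment:
            # Stay in the block comment until its closing marker appears.
            if "*/" in stripped:
                in_block_comment = False
            continue
        if not stripped or stripped.startswith("--"):
            continue
        if stripped.split()[0].upper() in ("REM", "REMARK"):
            continue
        if stripped.startswith("/*"):
            if "*/" not in stripped:
                in_block_comment = True
            continue
        active.append((number, stripped))
    return active
```

Lines that use commands such as ACCEPT or PAUSE wait for user input, so they are the first candidates to examine in the returned list.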

Database Issues

An improper database state or configuration can lead to failed backups.

Database Not Running During Backup

The database must be active and running while the backup is in progress.

To check that the database is active and running

Archiving Mode Set to NOARCHIVELOG

When you provision a new database, the archiving mode is set to ARCHIVELOG by default. This is the required archiving mode for backup operations. Check the archiving mode setting for the database and change it to ARCHIVELOG, if applicable.

To check and set the archiving mode

Stuck Database Archiver Process and Backup Failures

Backups can fail when the database instance has a stuck archiver process. For example, this can happen when the fast recovery area (FRA) is full. You can check for this condition using the srvctl status database -db <db_unique_name> -v command. If the command returns the following output, you must resolve the stuck archiver process issue before backups can succeed:

Instance <instance_identifier> is running on node <node_identifier>. Instance status: Stuck Archiver

Refer to ORA-00257:Archiver Error (Doc ID 2014425.1) for information on resolving a stuck archiver process.

After resolving the stuck process, the command should return the following output:

Instance <instance_identifier> is running on node <node_identifier>. Instance status: Open

If the instance status does not change after you resolve the underlying issue with the device or resource being full or unavailable, try one of the following workarounds:

  • Restart the database using the srvctl command to update the status of the database in the clusterware
  • Upgrade the database to the latest patchset levels
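The status check in this section can be sketched as a parser for the srvctl output lines shown above. This is a minimal illustration that assumes each instance is reported on one line in the format quoted in this section; the function names are hypothetical, and run_srvctl_status simply wraps the command given in the text.

```python
import re
import subprocess

# Matches lines such as:
#   Instance <id> is running on node <node>. Instance status: Stuck Archiver
STATUS_LINE = re.compile(
    r"Instance (\S+) is running on node (\S+?)\. Instance status: (.+?)\s*$"
)


def run_srvctl_status(db_unique_name):
    """Run the srvctl command from this section and return its output."""
    result = subprocess.run(
        ["srvctl", "status", "database", "-db", db_unique_name, "-v"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_instance_statuses(srvctl_output):
    """Map each instance identifier to its reported status string."""
    statuses = {}
    for line in srvctl_output.splitlines():
        match = STATUS_LINE.search(line)
        if match:
            instance, _node, status = match.groups()
            statuses[instance] = status
    return statuses


def stuck_instances(srvctl_output):
    """Return the instances whose status starts with 'Stuck Archiver'."""
    return [
        instance
        for instance, status in parse_instance_statuses(srvctl_output).items()
        if status.lower().startswith("stuck archiver")
    ]
```

On the database host, stuck_instances(run_srvctl_status("<db_unique_name>")) returns an empty list when no instance reports a stuck archiver. Real srvctl output can carry extra detail after the status, so treat the pattern as a starting point.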

Temporary Tablespace Errors

If fixed table statistics are not up to date on the database, backups can fail with errors in the dcs-agent.log file that reference the temporary tablespace. For example:

select status from v$rman_status where COMMAND_ID=<backup_id>

ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

Gather your fixed table statistics as follows to resolve this issue:

conn / as sysdba

exec dbms_stats.gather_fixed_objects_stats();

RMAN Configuration and Backup Failures

Editing certain RMAN configuration parameters can lead to backup failures in Oracle Cloud Infrastructure. To check your RMAN configuration, use the show all command at the RMAN command line prompt.

See the following list of parameters for details about the RMAN configuration settings that should not be altered for databases in Oracle Cloud Infrastructure.

RMAN configuration settings that should not be altered

RMAN Retention Policy and Backup Failures

The RMAN retention policy configuration can be the source of backup failures. Using the REDUNDANCY retention policy configuration instead of the RECOVERY WINDOW policy can lead to backup failures. Be sure to use the RECOVERY WINDOW OF 30 DAYS configuration.
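The retention policy check can be sketched as a parser for the text that the RMAN show all command prints. It assumes the policy appears on a line that begins with CONFIGURE RETENTION POLICY TO and ends with a semicolon; the function names are hypothetical.

```python
import re

RETENTION_LINE = re.compile(r"^CONFIGURE RETENTION POLICY TO (.+?);", re.MULTILINE)


def retention_policy(show_all_output):
    """Return the configured retention policy text, or None if not found."""
    match = RETENTION_LINE.search(show_all_output)
    return match.group(1).strip() if match else None


def uses_required_window(show_all_output, required="RECOVERY WINDOW OF 30 DAYS"):
    """Check for the policy that this topic says to use."""
    return retention_policy(show_all_output) == required
```

Paste or pipe the show all output into uses_required_window; a False result means the policy line is missing or differs from RECOVERY WINDOW OF 30 DAYS.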

To configure the RMAN retention policy setting

Loss of Objectstore Wallet File and Backup Failures

RMAN backups fail when an objectstore wallet file is lost. The wallet file is necessary to enable connectivity to the object store.

To confirm that the objectstore wallet file exists and has the correct permissions

TDE Wallet and Backup Failures

Incorrect TDE Wallet Location Specification

For backup operations to work, the $ORACLE_HOME/network/admin/sqlnet.ora file must contain the ENCRYPTION_WALLET_LOCATION parameter formatted exactly as follows:

ENCRYPTION_WALLET_LOCATION=(SOURCE=(METHOD=FILE)(METHOD_DATA=(DIRECTORY=/opt/oracle/dcs/commonstore/wallets/tde/$ORACLE_UNQNAME)))

Important

In this wallet location entry, $ORACLE_UNQNAME is an environment variable and should not be replaced with an actual value.
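Because the entry must match exactly, a strict string comparison is a reasonable first check. The sketch below looks for the expected entry on a single line of the sqlnet.ora text and distinguishes a missing parameter from one with a different value. The function name and the three return values are hypothetical, and an entry that is split across several lines is reported as a mismatch.

```python
# The entry exactly as given in this topic; $ORACLE_UNQNAME stays literal.
EXPECTED = (
    "ENCRYPTION_WALLET_LOCATION=(SOURCE=(METHOD=FILE)"
    "(METHOD_DATA=(DIRECTORY=/opt/oracle/dcs/commonstore/wallets/tde/$ORACLE_UNQNAME)))"
)


def check_wallet_location(sqlnet_text):
    """Return 'ok', 'mismatch', or 'missing' for the wallet location entry."""
    found = False
    for line in sqlnet_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        if stripped == EXPECTED:
            return "ok"
        if stripped.upper().startswith("ENCRYPTION_WALLET_LOCATION"):
            found = True
    return "mismatch" if found else "missing"
```

A mismatch result typically means the variable was replaced with an actual database name or the entry was reformatted, both of which this section warns against.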

To check the TDE wallet location specification

Incorrect State of the TDE Wallet

Database backups fail if the TDE wallet is not in the proper state. The following scenarios can cause this problem:

  • The ORACLE_UNQNAME environment variable was not set when the database was started using SQL*Plus
  • A pluggable database was added with an incorrectly configured master encryption key

Incorrect Configuration Related to the TDE Wallet

Several configuration parameters related to the TDE wallet can cause backups to fail.

To check configuration related to the TDE wallet

Missing TDE Wallet File

The TDE wallet file (ewallet.p12) can cause backups to fail if it is missing, or if it has incompatible file system permissions or ownership. Check the file as shown in the following example:

[oracle@orcl tde]$ ls -ltr /opt/oracle/dcs/commonstore/wallets/tde/$ORACLE_UNQNAME/ewallet.p12

-rwx------ 1 oracle oinstall 5680 Apr 18 13:09 /opt/oracle/dcs/commonstore/wallets/tde/orclbkp_iadxzy/ewallet.p12

The TDE wallet file should have file permissions with the octal value "700" (-rwx------), and the owner of this file should be a part of the oinstall operating system group.

Missing Auto Login Wallet File

The auto login wallet file (cwallet.sso) can cause backups to fail if it is missing, or if it has incompatible file system permissions or ownership. Check the file as shown in the following example:

[oracle@orcl tde]$ ls -ltr /opt/oracle/dcs/commonstore/wallets/tde/$ORACLE_UNQNAME/cwallet.sso

-rwx------ 1 oracle oinstall 5725 Apr 18 13:09 /opt/oracle/dcs/commonstore/wallets/tde/orclbkp_iadxyz/cwallet.sso

The auto login wallet file should have file permissions with the octal value "700" (-rwx------), and the owner of this file should be a part of the oinstall operating system group.
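The same two checks apply to both ewallet.p12 and cwallet.sso, so they can be sketched once. This is a minimal illustration that uses Unix-only standard library modules; the function names are hypothetical, and the defaults (mode 700, group oinstall) come from the text above.

```python
import grp
import os
import pwd
import stat


def mode_problem(path, expected_mode=0o700):
    """Return a description of a permission problem, or None if the mode matches."""
    try:
        mode = stat.S_IMODE(os.stat(path).st_mode)
    except FileNotFoundError:
        return "file is missing"
    if mode != expected_mode:
        return f"mode is {mode:o}, expected {expected_mode:o}"
    return None


def owner_in_group(path, group_name="oinstall"):
    """Check whether the file's owner belongs to the named OS group."""
    try:
        user = pwd.getpwuid(os.stat(path).st_uid)
        group = grp.getgrnam(group_name)
    except KeyError:
        # Unknown user or group on this host.
        return False
    # Membership is either the user's primary group or a listed member.
    return user.pw_gid == group.gr_gid or user.pw_name in group.gr_mem
```

Run both functions against each wallet file under /opt/oracle/dcs/commonstore/wallets/tde/ for your database; a None result from mode_problem together with True from owner_in_group matches the state described in this section.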

Other Causes of Backup Failures

Unmounted Commonstore Mount Point

The mount point /opt/oracle/dcs/commonstore must be mounted, or backups will fail.
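A quick scripted check for this condition is sketched below. It is an illustration only, not the documented procedure: the function names are hypothetical, find_mount assumes the Linux /proc/mounts layout (device, mount point, file system type, and so on, separated by whitespace), and the device name in the usage note is made up.

```python
import os

COMMONSTORE = "/opt/oracle/dcs/commonstore"


def commonstore_mounted(path=COMMONSTORE):
    """Return True if path is currently a mount point."""
    return os.path.ismount(path)


def find_mount(mounts_text, mount_point=COMMONSTORE):
    """Return (device, fs_type) for mount_point from /proc/mounts text, or None."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            return fields[0], fields[2]
    return None
```

For example, given a /proc/mounts line such as "/dev/asm/commonstore-123 /opt/oracle/dcs/commonstore acfs rw 0 0", find_mount returns the device and the acfs file system type, which lets you confirm that the ACFS volume, and not an empty local directory, is what sits at the mount point.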

To check the commonstore mount point
To confirm that ora.data.commonstore.acfs is online

The Database Is Not Properly Registered

Database backups fail if the database is not registered with the dcs-agent. This scenario can occur if you manually migrate the database to Oracle Cloud Infrastructure and do not run the dbcli register-database command.

To check whether the database is properly registered, review the information returned by running the srvctl config database command and the dbcli list-databases command. If either command does not return a record of the database, contact Oracle Support Services.

For instructions on how to register the database, refer to the following topics:

Obtaining Further Assistance

If you were unable to resolve the problem using the information in this topic, follow the procedures below to collect relevant database and diagnostic information. After you have collected this information, contact Oracle Support.

To collect database information for use in problem reports
To collect diagnostic information regarding failed jobs
To collect DCS agent log files
To collect TDE configuration details