# AWS Backup Monitoring

## AWS Backup - Overview

AWS Backup is a fully managed service that centralizes and automates data protection across AWS services and hybrid workloads. It provides a unified backup policy framework that lets you configure, schedule, and manage backup jobs for resources such as Amazon EBS, Amazon RDS, Amazon DynamoDB, Amazon EFS, and more from a single console.

AWS Backup monitoring in Applications Manager uses two monitor types: **Backup** and **Backup Vault**.

- **Backup** monitors provide region-level visibility into backup, restore, and copy job metrics, including job success and failure rates, completion times, and recovery point statuses.
- **Backup Vault** monitors provide vault-level visibility into recovery points, backup and copy job statistics, vault lock configurations, and retention policies. A backup vault is a secure, logical storage container within AWS Backup that stores and manages recovery points.

With proactive alerting and historical trend analysis, Applications Manager's AWS Backup monitoring tools help prevent backup failures, identify bottlenecks, and maintain data recovery readiness.

## Creating a new AWS Backup monitor

To learn how to create a new AWS Backup or AWS Backup Vault monitor, [refer here](https://www.manageengine.com/products/applications_manager/help/aws-monitoring-tools.html#NewMonitor).

## Monitored Parameters

Go to the **Monitors Category View** by clicking the **Monitors** tab. Click the **Backup** or **Backup Vault** instance available under **Amazon** in the **Cloud Apps** section.

Click the [AWS Backup](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#backup) monitor to see all the metrics listed under the following tabs:

- [Overview](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#backup-overview)
- [Backup Jobs](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#backup-backup-jobs)
- [Restore Jobs](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#backup-restore-jobs)
- [Copy Jobs](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#backup-copy-jobs)
- [Backup Plans](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#backup-plans)

Click the [AWS Backup Vault](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#vault) monitor to see all the metrics listed under the following tabs:

- [Overview](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#vault-overview)
- [Backup Jobs](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#vault-backup-jobs)
- [Copy Jobs](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#vault-copy-jobs)
- [Recovery Points](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#vault-recovery-points)
- [Configuration](https://www.manageengine.com/products/applications_manager/help/aws-backup-monitoring-tools.html#vault-configuration)

## AWS Backup Metrics

The AWS Backup bulk configuration view is divided into three tabs:

- The **Availability** tab shows the availability history for the past 24 hours or 30 days.
- The **Performance** tab shows the health status and events for the past 24 hours or 30 days.
- The **List view** tab lets you perform [bulk admin configurations](https://www.manageengine.com/products/applications_manager/help/bulk-config.html).

Click a monitor in the list to open the AWS Backup dashboard.

### AWS Backup - Overview

| Parameter | Description |
|---|---|
| **BACKUP JOBS STATISTICS** | |
| Failed Backup Jobs | The total number of backup jobs that failed during the poll interval. |
| Expired Backup Jobs | The total number of backup jobs that expired before completion during the poll interval. |
| Aborted Backup Jobs | The total number of backup jobs that were aborted during the poll interval. |
| Completed Backup Jobs | The total number of backup jobs that completed successfully during the poll interval. |
| Created Backup Jobs | The total number of backup jobs created during the poll interval. |
| **BACKUP JOBS IN-PROGRESS STATISTICS** | |
| Pending Backup Jobs | The average number of backup jobs in the pending state during the poll interval. |
| Running Backup Jobs | The average number of backup jobs running during the poll interval. |
| **COPY JOBS STATISTICS** | |
| Failed Copy Jobs | The total number of copy jobs that failed during the poll interval. |
| Completed Copy Jobs | The total number of copy jobs that completed successfully during the poll interval. |
| Created Copy Jobs | The total number of copy jobs created during the poll interval. |
| **COPY JOBS IN-PROGRESS STATISTICS** | |
| Running Copy Jobs | The average number of copy jobs running during the poll interval. |
| **RESTORE JOBS STATISTICS** | |
| Failed Restore Jobs | The total number of restore jobs that failed during the poll interval. |
| Completed Restore Jobs | The total number of restore jobs that completed successfully during the poll interval. |
| **RESTORE JOBS IN-PROGRESS STATISTICS** | |
| Running Restore Jobs | The average number of restore jobs running during the poll interval. |
| Pending Restore Jobs | The average number of restore jobs in the pending state during the poll interval. |
| **RECOVERY POINTS STATISTICS** | |
| Expired Recovery Points | The total number of recovery points that expired during the poll interval. |
| Partial Recovery Points | The total number of recovery points in the partial state during the poll interval. |
| Completed Recovery Points | The total number of recovery points in the completed state during the poll interval. |
| **DELETING RECOVERY POINTS** | |
| Deleting Recovery Points | The average number of recovery points being deleted during the poll interval. |
| **BACKUP COMPLETION TIME** | |
| Backup Completion Time | The average time taken for backup jobs to complete, calculated from CreationDate to CompletionDate across completed backup jobs (in seconds). |
| **BACKUP SIZE** | |
| Backup Size | The average size of completed backup jobs (in MB). |
| **COPY COMPLETION TIME** | |
| Copy Completion Time | The average time taken for copy jobs to complete, calculated from CreationDate to CompletionDate across completed copy jobs (in seconds). |

### AWS Backup - Backup Jobs

| Parameter | Description |
|---|---|
| **BACKUP JOB RATE** | |
| Backup Job Failure Rate | The percentage of backup jobs that failed out of all completed, failed, and expired backup jobs (in %). |
| Backup Job Success Rate | The percentage of backup jobs that completed successfully out of all completed, failed, and expired backup jobs (in %). |

**Note:**

- **Backup Rate Metrics:** Success and failure rates are calculated based only on terminal states (Completed, Failed, and Expired). Jobs in other states such as Created, Running, Pending, and Aborted are excluded.

| Parameter | Description |
|---|---|
| **Backup Job Details** | |
| Job ID | The unique identifier of the job. |
| Backup Plan | The ID of the backup plan associated with this job. |
| Service Type | The AWS service type of the backed-up resource. |
| Resource ID | The resource identifier associated with the job. |
| Vault | The name of the backup vault where the backup is stored. |
| Backup Size | The size of the backup (in MB). |
| Created Time | The date and time the job was created. |
| Start By | The date and time by which the job must start; otherwise, it is cancelled. |
| Completion Date | The date and time the job was completed. |
| Status | The current status of the job. |
| Status Message | A detailed status message explaining the reason for the job status. |

### AWS Backup - Restore Jobs

| Parameter | Description |
|---|---|
| **RESTORE JOB RATE** | |
| Restore Job Failure Rate | The percentage of restore jobs that failed out of all completed and failed restore jobs (in %). |
| Restore Job Success Rate | The percentage of restore jobs that completed successfully out of all completed and failed restore jobs (in %). |

**Note:**

- **Restore Rate Metrics:** Success and failure rates are calculated based only on terminal states (Completed and Failed). Jobs in other states such as Pending and Running are excluded.

| Parameter | Description |
|---|---|
| **Restore Job Details** | |
| Job ID | The unique identifier of the job. |
| Service Type | The AWS service type of the backed-up resource. |
| Resource ID | The resource identifier associated with the job. |
| Recovery Point ID | The ARN of the recovery point associated with this restore job. |
| Backup Size | The size of the backup (in MB). |
| Created Time | The date and time the job was created. |
| Completion Date | The date and time the job was completed. |
| Status | The current status of the job. |
| Status Message | A detailed status message explaining the reason for the job status. |

### AWS Backup - Copy Jobs

| Parameter | Description |
|---|---|
| **COPY JOB RATE** | |
| Copy Job Failure Rate | The percentage of copy jobs that failed out of all completed and failed copy jobs (in %). |
| Copy Job Success Rate | The percentage of copy jobs that completed successfully out of all completed and failed copy jobs (in %). |

**Note:**

- **Copy Rate Metrics:** Success and failure rates are calculated based only on terminal states (Completed and Failed). Jobs in other states such as Created and Running are excluded.

| Parameter | Description |
|---|---|
| **Copy Job Details** | |
| Job ID | The unique identifier of the job. |
| Source Vault | The name of the source backup vault from which the copy was initiated. |
| Destination Vault | The destination backup vault for the copy job. |
| Resource ID | The resource identifier associated with the job. |
| Backup Size | The size of the backup (in MB). |
| Created Time | The date and time the job was created. |
| Completion Date | The date and time the job was completed. |
| Status | The current status of the job. |
| Status Message | A detailed status message explaining the reason for the job status. |

### AWS Backup - Backup Plans

| Parameter | Description |
|---|---|
| **Backup Plan Details** | |
| Plan ID | The unique identifier of the backup plan. |
| Backup Plan Name | The display name of the backup plan. |
| Version ID | The unique version ID of the backup plan. |
| Created Time | The date and time the backup plan was created. |
| Last Executed Time | The date and time the backup plan was last executed. |

## AWS Backup Vault Metrics

The AWS Backup Vault bulk configuration view is divided into three tabs:

- The **Availability** tab shows the availability history for the past 24 hours or 30 days.
- The **Performance** tab shows the health status and events for the past 24 hours or 30 days.
- The **List view** tab lets you perform [bulk admin configurations](https://www.manageengine.com/products/applications_manager/help/bulk-config.html).

Click a monitor in the list to open the AWS Backup Vault dashboard.

### AWS Backup Vault - Overview

| Parameter | Description |
|---|---|
| **BACKUP JOBS STATISTICS** | |
| Failed Backup Jobs | The total number of backup jobs that failed during the poll interval. |
| Expired Backup Jobs | The total number of backup jobs that expired before completion during the poll interval. |
| Aborted Backup Jobs | The total number of backup jobs that were aborted during the poll interval. |
| Completed Backup Jobs | The total number of backup jobs that completed successfully during the poll interval. |
| Created Backup Jobs | The total number of backup jobs created during the poll interval. |
| **BACKUP JOBS IN-PROGRESS STATISTICS** | |
| Pending Backup Jobs | The average number of backup jobs in the pending state during the poll interval. |
| Running Backup Jobs | The average number of backup jobs running during the poll interval. |
| **COPY JOBS STATISTICS** | |
| Failed Copy Jobs | The total number of copy jobs that failed during the poll interval. |
| Completed Copy Jobs | The total number of copy jobs that completed successfully during the poll interval. |
| Created Copy Jobs | The total number of copy jobs created during the poll interval. |
| **COPY JOBS IN-PROGRESS STATISTICS** | |
| Running Copy Jobs | The average number of copy jobs running during the poll interval. |
| **RESTORE JOBS STATISTICS** | |
| Failed Restore Jobs | The total number of restore jobs that failed during the poll interval. |
| Completed Restore Jobs | The total number of restore jobs that completed successfully during the poll interval. |
| **RESTORE JOBS IN-PROGRESS STATISTICS** | |
| Running Restore Jobs | The average number of restore jobs running during the poll interval. |
| Pending Restore Jobs | The average number of restore jobs in the pending state during the poll interval. |
| **RECOVERY POINTS STATISTICS** | |
| Expired Recovery Points | The total number of recovery points that expired during the poll interval. |
| Cold Recovery Points | The total number of recovery points in cold storage during the poll interval. |
| Partial Recovery Points | The total number of recovery points in the partial state during the poll interval. |
| Completed Recovery Points | The total number of recovery points in the completed state during the poll interval. |
| **DELETING RECOVERY POINTS** | |
| Deleting Recovery Points | The average number of recovery points being deleted during the poll interval. |
| **BACKUP COMPLETION TIME** | |
| Backup Completion Time | The average time taken for backup jobs to complete, calculated from CreationDate to CompletionDate across completed backup jobs (in seconds). |
| **BACKUP SIZE** | |
| Backup Size | The average size of completed backup jobs (in MB). |
| **COPY COMPLETION TIME** | |
| Copy Completion Time | The average time taken for copy jobs to complete, calculated from CreationDate to CompletionDate across completed copy jobs (in seconds). |

### AWS Backup Vault - Backup Jobs

| Parameter | Description |
|---|---|
| **BACKUP JOB RATE** | |
| Backup Job Failure Rate | The percentage of backup jobs that failed out of all completed, failed, and expired backup jobs (in %). |
| Backup Job Success Rate | The percentage of backup jobs that completed successfully out of all completed, failed, and expired backup jobs (in %). |

**Note:**

- **Backup Rate Metrics:** Success and failure rates are calculated based only on terminal states (Completed, Failed, and Expired). Jobs in other states such as Created, Running, Pending, and Aborted are excluded.

| Parameter | Description |
|---|---|
| **Backup Job Details** | |
| Job ID | The unique identifier of the job. |
| Service Type | The AWS service type of the backed-up resource. |
| Resource ID | The resource identifier associated with the job. |
| Backup Size | The size of the backup (in MB). |
| Created Time | The date and time the job was created. |
| Completion Date | The date and time the job was completed. |
| Status | The current status of the job. |
| Status Message | A detailed status message explaining the reason for the job status. |

### AWS Backup Vault - Copy Jobs

| Parameter | Description |
|---|---|
| **COPY JOB RATE** | |
| Copy Job Failure Rate | The percentage of copy jobs that failed out of all completed and failed copy jobs (in %). |
| Copy Job Success Rate | The percentage of copy jobs that completed successfully out of all completed and failed copy jobs (in %). |

**Note:**

- **Copy Rate Metrics:** Success and failure rates are calculated based only on terminal states (Completed and Failed). Jobs in other states such as Created and Running are excluded.

| Parameter | Description |
|---|---|
| **Copy Job Details** | |
| Job ID | The unique identifier of the job. |
| Source Vault | The name of the source backup vault from which the copy was initiated. |
| Destination Vault | The destination backup vault for the copy job. |
| Resource ID | The resource identifier associated with the job. |
| Backup Size | The size of the backup (in MB). |
| Created Time | The date and time the job was created. |
| Completion Date | The date and time the job was completed. |
| Status | The current status of the job. |
| Status Message | A detailed status message explaining the reason for the job status. |

### AWS Backup Vault - Recovery Points

| Parameter | Description |
|---|---|
| **RECOVERY POINTS STATISTICS** | |
| Expired Recovery Points | The total number of recovery points that expired during the poll interval. |
| Cold Recovery Points | The total number of recovery points in cold storage during the poll interval. |
| Partial Recovery Points | The total number of recovery points in the partial state during the poll interval. |
| Completed Recovery Points | The total number of recovery points in the completed state during the poll interval. |
| **DELETING RECOVERY POINTS** | |
| Deleting Recovery Points | The average number of recovery points being deleted during the poll interval. |
| **Recovery Point Details** | |
| Recovery Point ID | The unique ARN of the recovery point. |
| Service Type | The AWS service type of the backed-up resource. |
| Resource ID | The resource identifier associated with the recovery point. |
| Created Time | The date and time the recovery point was created. |
| Last Restore Time | The date and time the recovery point was last used for a restore. |
| Status | The current status of the recovery point. |

### AWS Backup Vault - Configuration

| Parameter | Description |
|---|---|
| **VAULT INFORMATION** | |
| Vault Name | The name of the backup vault. |
| Creation Date | The date the backup vault was created. |
| Number of Recovery Points | The total number of recovery points stored in the vault. |
| **VAULT LOCK CONFIGURATION** | |
| Vault Lock Enabled | Specifies whether vault lock is enabled for this backup vault. |
| Lock Date | The date the vault lock was applied. |
| Max Retention Days | The maximum period for which the vault retains its recovery points (in days). |
| Min Retention Days | The minimum period for which the vault retains its recovery points (in days). |
| Encryption Key ARN | The server-side encryption key that is used to protect the backups stored in the vault. |
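
The rate and completion time definitions used throughout this page (rates computed over terminal states only, completion time averaged from CreationDate to CompletionDate across completed jobs) can be illustrated with a minimal Python sketch. This is not Applications Manager's implementation; the function names `job_rates` and `average_completion_seconds`, the `TERMINAL_STATES` table, and the dictionary shapes are hypothetical and chosen only for illustration.

```python
from datetime import datetime
from typing import Iterable, Mapping, Optional, Tuple

# Terminal states per job type, as listed in the notes on this page.
# The lowercase keys and the dict layout are assumptions for this sketch.
TERMINAL_STATES = {
    "backup": ("Completed", "Failed", "Expired"),
    "restore": ("Completed", "Failed"),
    "copy": ("Completed", "Failed"),
}


def job_rates(job_type: str, state_counts: Mapping[str, int]) -> Optional[Tuple[float, float]]:
    """Return (success %, failure %) over terminal states, or None if there are none.

    Counts for non-terminal states (Created, Running, Pending, Aborted) are ignored.
    """
    terminal = sum(state_counts.get(state, 0) for state in TERMINAL_STATES[job_type])
    if terminal == 0:
        return None  # no terminal jobs in this interval, so no rate can be computed
    success = 100.0 * state_counts.get("Completed", 0) / terminal
    failure = 100.0 * state_counts.get("Failed", 0) / terminal
    return success, failure


def average_completion_seconds(jobs: Iterable[Mapping[str, object]]) -> Optional[float]:
    """Average CompletionDate - CreationDate, in seconds, across completed jobs only."""
    durations = [
        (job["CompletionDate"] - job["CreationDate"]).total_seconds()
        for job in jobs
        if job.get("Status") == "Completed" and job.get("CompletionDate") is not None
    ]
    return sum(durations) / len(durations) if durations else None


if __name__ == "__main__":
    counts = {"Completed": 8, "Failed": 1, "Expired": 1, "Aborted": 2, "Running": 3}
    print(job_rates("backup", counts))  # (80.0, 10.0)

    jobs = [
        {"Status": "Completed",
         "CreationDate": datetime(2024, 1, 1, 0, 0),
         "CompletionDate": datetime(2024, 1, 1, 0, 10)},
        {"Status": "Completed",
         "CreationDate": datetime(2024, 1, 1, 0, 0),
         "CompletionDate": datetime(2024, 1, 1, 0, 20)},
        {"Status": "Failed",
         "CreationDate": datetime(2024, 1, 1, 0, 0),
         "CompletionDate": datetime(2024, 1, 1, 0, 5)},
    ]
    print(average_completion_seconds(jobs))  # 900.0
```

For backup jobs, the success and failure rates do not necessarily add up to 100%, because expired jobs count toward the denominator but toward neither rate.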