AWS Elastic Disaster Recovery (DRS) is a cost-effective, reliable disaster recovery solution that minimizes downtime and prevents data loss by continuously replicating source servers to AWS using block-level replication.
ManageEngine Applications Manager's integration with AWS DRS offers a streamlined way to monitor and manage your disaster recovery operations. It monitors your regional DRS environment in real time and gives an overview of source server fleet status, replication states, and recovery readiness across AWS Regions.
Applications Manager provides monitoring through three distinct components:

- Elastic Disaster Recovery (DRS)
- DRS Source Server
- DRS Recovery Instance

To learn how to create a new Elastic Disaster Recovery monitor, refer here.
Go to the Monitors Category View by clicking the Monitors tab. Click the Elastic Disaster Recovery (DRS) instance available under Amazon in the Cloud Apps section. The Elastic Disaster Recovery bulk configuration view is displayed, organized into three tabs.
Click a monitor name in the list to open the Elastic Disaster Recovery dashboard.
Click on the Elastic Disaster Recovery (DRS) monitor to see all the metrics listed under the following tab:
Click on the DRS Source Server monitor to see all the metrics listed under the following tabs:
Click on the DRS Recovery Instance monitor to see all the metrics listed under the following tabs:

| Parameter | Description |
|---|---|
| SOURCE SERVER FLEET OVERVIEW | |
| Source Servers | The average number of source servers being replicated to AWS DRS in this region during the poll interval. |
| Active Source Servers | The average number of source servers actively replicating data to AWS DRS in this region during the poll interval. |
| Protected Source Servers | The average number of source servers that are fully protected and ready for recovery in this region during the poll interval. |
| SOURCE SERVERS | |
| Source Server ID | The unique identifier for the source server. |
| Hostname | The hostname of the source server. |
| Data Replication State | The detailed data replication state of the source server. |
| Last Launch Result | The result of the last recovery launch attempt. |
| RECOVERY INSTANCES | |
| Recovery Instance ID | The unique identifier of the recovery instance. |
| Source Server ID | The unique identifier for the source server. |
| EC2 Instance ID | The EC2 instance ID of the recovery instance. |
| EC2 Instance State | The current state of the EC2 recovery instance. |
| Failback State | The current failback state of the recovery instance. |
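
The fleet overview counts above can be approximated outside Applications Manager from per-server replication states. The following is a minimal sketch, not the product's implementation. It assumes each record is shaped like an item of the AWS DRS `DescribeSourceServers` response (`sourceServerID`, `dataReplicationInfo.dataReplicationState`), and it assumes that a server in the `CONTINUOUS` state is the closest match to "protected". How Applications Manager actually defines "active" and "protected" is not stated here. The sample data and the `summarize_fleet` helper are hypothetical.

```python
from collections import Counter

# Hypothetical sample records, shaped like items of a DRS
# DescribeSourceServers response (only the fields used below).
sample_servers = [
    {"sourceServerID": "s-1111", "dataReplicationInfo": {"dataReplicationState": "CONTINUOUS"}},
    {"sourceServerID": "s-2222", "dataReplicationInfo": {"dataReplicationState": "STALLED"}},
    {"sourceServerID": "s-3333", "dataReplicationInfo": {"dataReplicationState": "CONTINUOUS"}},
]


def summarize_fleet(servers):
    """Count source servers overall and per data replication state."""
    states = Counter(
        server.get("dataReplicationInfo", {}).get("dataReplicationState", "UNKNOWN")
        for server in servers
    )
    return {
        "source_servers": len(servers),
        # Assumption: CONTINUOUS is treated as "protected" for this sketch.
        "continuous": states.get("CONTINUOUS", 0),
        "by_state": dict(states),
    }


if __name__ == "__main__":
    print(summarize_fleet(sample_servers))
```

In a live setup, the records would come from the DRS API rather than a hard-coded list. The per-state breakdown is kept because a single "protected" count hides whether the remaining servers are stalled, paused, or still in initial sync.
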

| Parameter | Description |
|---|---|
| LAST RECOVERY DETAILS | |
| Last Launch Type | The type of the last recovery launch. |
| Last Launch Job ID | The job ID of the last recovery launch. |
| Last Launch Result | The result of the last recovery launch attempt for this source server. |
| Last Launch Time | The time of the last recovery launch API call. |
| REPLICATION PROGRESS | |
| Replication Progress | The average percentage of data that has been replicated to the staging area at the time of polling (in %). |
| LAG DURATION | |
| Lag Duration | The average amount of time that the source server is behind the replication target during the poll interval (in s). |
| BACKLOG | |
| Backlog | The maximum amount of data that has not yet been replicated to the staging area at the time of polling (in MB). |
| ELAPSED REPLICATION DURATION | |
| Elapsed Replication Duration | The maximum elapsed time of the replication run at the time of polling (in mins). |
| DURATION SINCE LAST SUCCESSFUL RECOVERY LAUNCH | |
| Duration Since Last Successful Recovery Launch | The maximum elapsed time since the last successful recovery launch at the time of polling (in mins). |
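
To make the relationship between these metrics concrete, the sketch below derives a progress percentage, a backlog in MB, and an elapsed duration in minutes from raw values. It is illustrative only. It assumes per-disk records carry the byte counters that the AWS DRS API reports for replicated disks (`totalStorageBytes`, `replicatedStorageBytes`, `backloggedStorageBytes`), and it assumes 1 MB = 1024 × 1024 bytes, which may differ from the unit convention Applications Manager uses. The helper names and sample values are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical per-disk counters, shaped like the replicated disk entries
# in the DRS data replication info (only the fields used below).
GIB = 1024 ** 3
MIB = 1024 ** 2
sample_disks = [
    {"totalStorageBytes": 100 * GIB, "replicatedStorageBytes": 75 * GIB, "backloggedStorageBytes": 512 * MIB},
    {"totalStorageBytes": 100 * GIB, "replicatedStorageBytes": 25 * GIB, "backloggedStorageBytes": 0},
]


def replication_summary(disks):
    """Aggregate progress (%) and backlog (MB) across all disks of a server."""
    total = sum(d.get("totalStorageBytes", 0) for d in disks)
    replicated = sum(d.get("replicatedStorageBytes", 0) for d in disks)
    backlog = sum(d.get("backloggedStorageBytes", 0) for d in disks)
    progress = 100.0 * replicated / total if total else 0.0
    return {
        "progress_pct": round(progress, 2),
        # Assumption: 1 MB is taken as 1024 * 1024 bytes.
        "backlog_mb": round(backlog / MIB, 2),
    }


def minutes_since(timestamp_iso, now=None):
    """Minutes elapsed since an ISO 8601 timestamp with an explicit UTC offset."""
    started = datetime.fromisoformat(timestamp_iso)
    now = now or datetime.now(timezone.utc)
    return (now - started).total_seconds() / 60


if __name__ == "__main__":
    print(replication_summary(sample_disks))
    print(minutes_since("2024-05-01T10:00:00+00:00"))
```

Passing `now` explicitly keeps the elapsed-time calculation reproducible, which is useful when comparing against a value captured at a specific poll.
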

| Parameter | Description |
|---|---|
| RECOVERY INSTANCES | |
| Recovery Instance ID | The unique identifier of the recovery instance. |
| EC2 Instance ID | The EC2 instance ID of the recovery instance. |
| EC2 Instance State | The current state of the EC2 recovery instance. |
| Failback State | The current failback state of the recovery instance. |

| Parameter | Description |
|---|---|
| SOURCE SERVER DETAILS | |
| Recovery Instance ID | The recovery instance ID associated with this source server. |
| Hostname | The hostname of the source server. |
| Agent Version | The version of the AWS Replication Agent installed on the source server. |
| Creation Time | The date and time the source server was added to the DRS service. |
| Last Updated Time | The date and time the source server was last updated. |
| LAUNCH SETTINGS | |
| Instance Type Right-Sizing Method | The method used for right-sizing the target EC2 instance type. |
| Copy Private IP | Indicates if the private IP address is copied during launch. |
| Copy Tags | Indicates if the tags are copied from the source server to the recovery instance. |
| Launch Template ID | The EC2 launch template ID used for launching recovery instances. |
| OS BYOL | Indicates if Bring Your Own License (BYOL) is enabled for the operating system. |
| REPLICATION SETTINGS | |
| Staging Area Subnet ID | The subnet ID of the staging area. |
| EBS Encryption | The EBS encryption setting for replicated disks. |
| Replication Server Instance Type | The EC2 instance type used for the replication server. |
| Default Staging Disk Type (Large) | The default EBS volume type for large staging disks. |
| Auto-Replicate New Disks | Indicates if new disks added to the source server are automatically replicated. |
| Use Dedicated Replication Server | Indicates if a dedicated replication server is used for this source server. |
| Associate Default Security Group | Indicates if the default security group is associated with the replication server. |
| Replication Server Security Group IDs | The security group IDs associated with the replication server. |
| Data Plane Routing | The network routing used for data replication. |
| Bandwidth Throttling | The bandwidth throttling setting, where 0 indicates no throttling (in Mbps). |
| Create Public IP | Indicates if a public IP is created for the replication server. |
| REPLICATION STATUS DETAILS | |
| Replication Direction | The direction of data replication for the source server. |
| Data Replication Error | The error message for the current data replication, if any. |
| Data Replication State | The current state of data replication for this source server. |
| Replicating From | The Availability Zone from which data is being replicated. |
| Replicating To | The staging Availability Zone to which data is being replicated. |
| Replicated Storage | The total replicated storage across all disks (in GB). |
| Total Storage | The total storage capacity across all disks (in GB). |

| Parameter | Description |
|---|---|
| INSTANCE INFORMATION | |
| Data Replication State | The current state of data replication for this recovery instance. |
| Data Replication Error | The error message for the current data replication, if any. |
| Failback State | The current failback state of the recovery instance. |
| REPLICATION PROGRESS | |
| Replication Progress | The average progress of the data synchronization process for the recovery instance at the time of polling (in %). |
| REPLICATION LAG DURATION | |
| Lag Duration | The average time difference between the source and the recovery instance during the poll interval, representing potential data loss (RPO) (in s). |
| REPLICATION BACKLOG | |
| Replication Backlog | The maximum amount of data waiting to be synchronized to the recovery instance at the time of polling (in MB). |
| ELAPSED REPLICATION DURATION | |
| Elapsed Replication Duration | The maximum time the recovery instance has been in its current replication state at the time of polling (in mins). |
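
Because lag duration represents potential data loss, it is commonly compared against a recovery point objective (RPO). The sketch below shows one way to do that comparison in a script. It rests on two assumptions: that the raw lag value arrives as an ISO 8601 duration string (for example `PT1M30S`), and that the RPO target is expressed in seconds to match the unit in the table above. If the lag is already available in seconds, only the comparison is needed. The `duration_to_seconds` and `exceeds_rpo` helpers are hypothetical and handle only day, hour, minute, and second components.

```python
import re

# Matches a subset of ISO 8601 durations: days, hours, minutes, seconds.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def duration_to_seconds(value):
    """Convert an ISO 8601 duration such as 'PT1M30S' to seconds."""
    match = _DURATION.match(value)
    if not match or value in ("P", "PT"):
        raise ValueError(f"Unsupported duration: {value!r}")
    parts = {name: float(number) if number else 0.0 for name, number in match.groupdict().items()}
    return (
        parts["days"] * 86400
        + parts["hours"] * 3600
        + parts["minutes"] * 60
        + parts["seconds"]
    )


def exceeds_rpo(lag_duration, rpo_seconds):
    """Return True when the replication lag is larger than the RPO target."""
    return duration_to_seconds(lag_duration) > rpo_seconds


if __name__ == "__main__":
    # Hypothetical values: a 10-minute lag checked against a 5-minute RPO.
    print(exceeds_rpo("PT10M", rpo_seconds=300))
```

Year and month components are deliberately rejected, since their length in seconds is ambiguous and a replication lag of that size would indicate a stalled replication rather than a value worth converting.
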

| Parameter | Description |
|---|---|
| RECOVERY INSTANCE DETAILS | |
| Source Server ID | The unique identifier for the source server. |
| EC2 Instance ID | The EC2 instance ID of the recovery instance. |
| EC2 Instance State | The current state of the EC2 recovery instance. |
| Job ID | The job ID that initiated the recovery instance. |
| Drill Instance | Indicates if this recovery instance was launched as a drill. |
| Hostname | The hostname of the source server. |
| FQDN | The fully qualified domain name (FQDN) of the recovery instance. |
| Agent Version | The version of the AWS Replication Agent installed on the source server. |
| Point-in-Time Snapshot Timestamp | The timestamp of the last point-in-time snapshot. |
| Last Updation Time | The date and time the source server was last updated. |
| Agent Last Seen | The timestamp when the agent was last seen by the service. |
| REPLICATION STATUS DETAILS | |
| Replicating From | The Availability Zone from which data is being replicated. |
| Replicating To | The staging Availability Zone to which data is being replicated. |
| Replication Start Time | The timestamp when replication started. |
| Replicated Storage | The total replicated storage across all disks (in GB). |
| Total Storage | The total storage capacity across all disks (in GB). |
| FAILBACK DETAILS | |
| Failback State | The current failback state of the recovery instance. |
| Failback Client ID | The failback client ID of the recovery instance. |
| Failback Job ID | The failback job ID associated with the recovery instance. |
| Failback to Original Server | Indicates if failback is configured to return to the original source server. |
| Failback Client Last Seen | The timestamp when the failback client was last seen. |