
Amazon Redshift Monitoring

Overview

Redshift is a fully managed, cloud-based data warehousing service designed for fast analytic queries on large datasets. It uses a cluster-based architecture with massively parallel processing (MPP) and columnar storage to deliver petabyte-scale warehousing capabilities.

Applications Manager’s Redshift monitoring tool helps you gain visibility into the performance, resource utilization, and availability of your Redshift clusters. The monitor collects essential metrics related to cluster status, database performance, resource usage, and configuration parameters. Monitoring these metrics enables administrators to identify performance bottlenecks, optimize query workloads, and ensure smooth data warehouse operations.

Creating a new Redshift monitor

To learn how to create a new Redshift monitor, refer here.

Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab, then click the Redshift instance listed under Amazon in the Cloud Apps section. The Redshift bulk configuration view opens, organized into three tabs:

  • Availability tab shows the availability history for the past 24 hours or 30 days.
  • Performance tab shows health status and events for the past 24 hours or 30 days.
  • List view tab enables you to perform bulk admin configurations.

Click a monitor in the list to open the Amazon Redshift dashboard, which includes the following tabs:

Performance Overview

Parameter | Description
CLUSTER INFORMATION
Health Status | Indicates the health of the cluster. Possible values: Healthy, Unhealthy.
Maintenance Mode | Indicates whether the cluster is in maintenance mode. Possible values: ON, OFF.
Cluster Status | Displays the current operational state of the cluster.
Cluster Availability Status | Shows the availability of the cluster for queries. Possible values: Available, Unavailable, Maintenance, Modifying, Failed.
Number of Nodes | Total number of compute nodes in the cluster.
Total Storage Capacity | Total storage capacity of the cluster, measured in GiB.
CPU UTILIZATION
CPU Utilization (%) | Average percentage of CPU utilization for the entire cluster at the time of polling, representing the overall processing load across leader and compute nodes.
DISK UTILIZATION
Disk Utilization (%) | Average percentage of disk space used across the cluster at the time of polling, representing overall storage consumption.
DATA THROUGHPUT
Rate of Read Throughput | Amount of data read from disk per second by the cluster during the polling interval (MB/second).
Read Throughput | Total amount of data read from disk during the polling interval (MB).
Rate of Write Throughput | Amount of data written to disk per second by the cluster during the polling interval (MB/second).
Write Throughput | Total amount of data written to disk during the polling interval (MB).
NETWORK THROUGHPUT
Inbound Network Throughput | Average rate at which the cluster receives data over the network during the polling interval (MB/second).
Outbound Network Throughput | Average rate at which the cluster sends data over the network during the polling interval (MB/second).
CLUSTER IOPS
Read IOPS | Average number of disk read operations performed per second by the cluster during the polling interval.
Write IOPS | Average number of disk write operations performed per second by the cluster during the polling interval.
CLUSTER LATENCY
Read Latency | Average time taken by the cluster to complete disk read operations during the polling interval (seconds).
Write Latency | Average time taken by the cluster to complete disk write operations during the polling interval (seconds).
DATABASE CONNECTIONS
Database Connections | Average number of active database connections to the cluster during the polling interval.
COMMIT QUEUE LENGTH
Commit Queue Length | Maximum number of transactions waiting to be committed during the polling interval.
ACTIVE USER TABLES
Active User Tables | Total number of active user tables in the cluster during the polling interval, excluding Redshift Spectrum tables.
EXCEEDED SCHEMA QUOTAS
Exceeded Schema Quotas | Maximum number of schemas that exceeded their configured storage quotas during the polling interval.
CONCURRENCY SCALING SECONDS
Concurrency Scaling Seconds | Total seconds consumed by concurrency scaling clusters actively processing queries during the polling interval.
CONCURRENCY SCALING CLUSTERS
Active Concurrency Scaling Clusters | Maximum number of concurrency scaling clusters actively processing queries during the polling interval.
Max Concurrency Scaling Clusters | Maximum number of concurrency scaling clusters allowed for the cluster, based on parameter group settings at the time of polling.
 
Note: Data collection for the Concurrency Scaling Seconds and Concurrency Scaling Clusters graph metrics is disabled by default. These metrics are grouped under performance polling as Concurrency Scaling Clusters. To enable data collection, navigate to Settings → Performance Polling, select the Optimize Data Collection tab, choose Amazon Redshift as the monitor type and Concurrency Scaling Clusters as the metric name, then set the preferred time interval.
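
The Data Throughput rows above report the same activity twice: once as a total per polling interval (MB) and once as a per-second rate (MB/second). The sketch below shows how the two relate, assuming the rate is simply the total divided by the length of the polling interval. How Applications Manager actually derives the rate is not stated here, and the function name and sample values are hypothetical.

```python
def throughput_rate_mb_per_s(total_mb: float, polling_interval_s: float) -> float:
    """Convert a per-interval total (MB) into an average rate (MB/second).

    Assumption: rate = total / interval length. This is an illustration,
    not a description of the product's internal calculation.
    """
    if polling_interval_s <= 0:
        raise ValueError("polling interval must be positive")
    return total_mb / polling_interval_s


# Hypothetical poll: 1500 MB read during a 5-minute (300 s) polling interval.
rate = throughput_rate_mb_per_s(1500.0, 300.0)
print(f"Rate of Read Throughput: {rate} MB/second")
```

Reading the two values together helps distinguish a short burst from sustained load: a large total with a modest rate suggests the work was spread across the whole interval.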

Database Performance

Parameter | Description
QUERY DURATION
Long Query Duration (>10 min) | Average time taken to complete queries that run longer than 10 minutes during the polling interval (seconds).
Medium Query Duration (1–10 min) | Average time taken to complete queries that run between 1 and 10 minutes during the polling interval (seconds).
Short Query Duration (<1 min) | Average time taken to complete queries that run for less than 1 minute during the polling interval (seconds).
QUERY THROUGHPUT
Long Query Throughput (>10 min) | Average number of long queries completed per second during the polling interval (queries/second).
Medium Query Throughput (1–10 min) | Average number of medium queries completed per second during the polling interval (queries/second).
Short Query Throughput (<1 min) | Average number of short queries completed per second during the polling interval (queries/second).
QUERY LIFECYCLE PHASES
Query Planning Time | Average time queries spent in the QueryPlanning stage during the polling interval (milliseconds).
Query Waiting Time | Average time queries spent in the QueryWaiting stage during the polling interval (milliseconds).
Query Commit Time | Average time queries spent in the QueryCommit stage during the polling interval (milliseconds).
QUERY DATA READ TIME
Query Executing Read Time | Average time queries spent in the QueryExecutingRead stage during the polling interval (milliseconds).
Query Executing Unload Time | Average time queries spent in the QueryExecutingUnload stage during the polling interval (milliseconds).
QUERY DATA MODIFICATION TIME
Query Executing Insert Time | Average time queries spent in the QueryExecutingInsert stage during the polling interval (milliseconds).
Query Executing Delete Time | Average time queries spent in the QueryExecutingDelete stage during the polling interval (milliseconds).
Query Executing Update Time | Average time queries spent in the QueryExecutingUpdate stage during the polling interval (milliseconds).
Query Executing CTAS Time | Average time queries spent in the QueryExecutingCtas stage during the polling interval (milliseconds).
Query Executing Copy Time | Average time queries spent in the QueryExecutingCopy stage during the polling interval (milliseconds).
 
Note: Data collection for the Query Lifecycle Phases, Query Data Read Time, and Query Data Modification Time graph metrics is disabled by default. These metrics are grouped under performance polling as Query Runtime Breakdown. To enable data collection, navigate to Settings → Performance Polling, select the Optimize Data Collection tab, choose Amazon Redshift as the monitor type and Query Runtime Breakdown as the metric name, then set the preferred time interval.
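
The Query Duration and Query Throughput rows above split queries into three classes by run time: short (under 1 minute), medium (1 to 10 minutes), and long (over 10 minutes). The following sketch illustrates that bucketing and the per-bucket average on a hypothetical list of query durations. The function names and sample values are assumptions for illustration, and treating exactly 1 minute and exactly 10 minutes as "medium" is one reading of the ranges above.

```python
from statistics import mean

SHORT_MAX_S = 60    # queries under 1 minute count as "short"
LONG_MIN_S = 600    # queries over 10 minutes count as "long"


def bucket_for(duration_s: float) -> str:
    """Return the duration class for a single query run time in seconds."""
    if duration_s < SHORT_MAX_S:
        return "short"
    if duration_s <= LONG_MIN_S:
        return "medium"
    return "long"


def average_duration_by_bucket(durations_s):
    """Group run times by class and average each group (None if empty)."""
    buckets = {"short": [], "medium": [], "long": []}
    for duration in durations_s:
        buckets[bucket_for(duration)].append(duration)
    return {name: (mean(vals) if vals else None) for name, vals in buckets.items()}


# Hypothetical run times (seconds) for queries completed in one polling interval.
sample = [12.0, 45.0, 90.0, 300.0, 900.0]
print(average_duration_by_bucket(sample))
```

Splitting by class keeps a few long-running queries from masking a regression in the short, interactive ones, which a single overall average would hide.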

Node

Parameter | Description
NODE PERFORMANCE DETAILS
Node ID | The unique identifier of the node in the cluster.
CPU Utilization | Average percentage of CPU utilization for an individual node within the Amazon Redshift cluster at the time of polling, indicating the processing load on that specific leader or compute node (%).
Disk Utilization | Average percentage of disk space used on an individual node within the Amazon Redshift cluster at the time of polling, indicating storage consumption for that specific leader or compute node (%).
Read Throughput | Rate at which data is read from disk by an individual node within the Amazon Redshift cluster during the polling interval (MB/s).
Write Throughput | Rate at which data is written to disk by an individual node within the Amazon Redshift cluster during the polling interval (MB/s).
Inbound Network Throughput | Average rate at which an individual node within the Amazon Redshift cluster receives data over the network during the polling interval (MB/s).
Outbound Network Throughput | Average rate at which an individual node within the Amazon Redshift cluster sends data over the network during the polling interval (MB/s).
NODE IOPS DETAILS
Node ID | The unique identifier of the node in the cluster.
Read IOPS | Average number of disk read operations performed per second by an individual node within the Amazon Redshift cluster during the polling interval (operations/s).
Write IOPS | Average number of disk write operations performed per second by an individual node within the Amazon Redshift cluster during the polling interval (operations/s).
Read Latency | Average time taken by an individual node within the Amazon Redshift cluster to complete disk read I/O operations during the polling interval (seconds).
Write Latency | Average time taken by an individual node within the Amazon Redshift cluster to complete disk write I/O operations during the polling interval (seconds).
 
Note: Data collection for the Node Performance Details and Node IOPS Details metrics is disabled by default. These metrics are grouped under performance polling as Node Metrics. To enable data collection, navigate to Settings → Performance Polling, select the Optimize Data Collection tab, choose Amazon Redshift as the monitor type and Node Metrics as the metric name, then set the preferred time interval.
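
Node-level values are most useful when compared against each other, because a cluster-wide average can hide one node that is far busier or fuller than the rest. The sketch below is a minimal illustration of that comparison for Disk Utilization. The node IDs, the sample percentages, and the function name are hypothetical and are not taken from Applications Manager.

```python
from __future__ import annotations


def disk_skew(node_disk_pct: dict[str, float]) -> tuple[str, float, float]:
    """Return (fullest node ID, its utilization, mean utilization of all nodes)."""
    if not node_disk_pct:
        raise ValueError("at least one node is required")
    fullest = max(node_disk_pct, key=node_disk_pct.get)
    average = sum(node_disk_pct.values()) / len(node_disk_pct)
    return fullest, node_disk_pct[fullest], average


# Hypothetical per-node Disk Utilization (%) values from a single poll.
nodes = {"Compute-0": 40.0, "Compute-1": 80.0, "Compute-2": 45.0}
node_id, pct, avg = disk_skew(nodes)
print(f"{node_id}: {pct}% used (average across nodes: {avg}%)")
```

A large gap between one node and the average, as in this sample, is the kind of pattern that would be invisible in the cluster-level Disk Utilization value alone.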

Configuration

Parameter | Description
CLUSTER CONFIGURATION
Node Type | The node type of nodes in the cluster.
Endpoint Address | The DNS address of the cluster.
Cluster Creation Time | The date and time that the cluster was created.
Availability Zone | The name of the Availability Zone in which the cluster is located.
NETWORK CONFIGURATION
VPC ID | The identifier of the VPC the cluster is in, if the cluster is in a VPC.
Publicly Accessible | Indicates whether the cluster can be accessed from a public network. Possible values: Yes, No.
Cluster Subnet Group | The name of the subnet group that is associated with the cluster. This parameter is valid only when the cluster is in a VPC.
Automated Snapshot Retention Period | The number of days that automatic cluster snapshots are retained.
DATABASE CONFIGURATION
Database Name | The name of the initial database created with the cluster; defaults to dev if not specified.
Master Username | The admin username for the cluster. This name is used to connect to the database specified in the DBName parameter.
Parameter Group | The name of the parameter group associated with the cluster.
Parameter Apply Status | Indicates whether parameter changes have been applied to the cluster.
Encryption | Indicates whether data in the cluster is encrypted at rest. Possible values: Enabled, Disabled.
Endpoint Port | The port that the database engine is listening on.
MAINTENANCE CONFIGURATION
Cluster Version | The version ID of the Amazon Redshift engine running on the cluster.
Allow Version Upgrade | Indicates if major version upgrades are applied automatically during maintenance windows.
Preferred Maintenance Window | The weekly UTC time range when system maintenance can occur.
Next Schedule | The number of days until the next scheduled maintenance window.
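
The Preferred Maintenance Window and Next Schedule rows are related: the second can be derived from the first and the current time. The sketch below shows one way to do that, assuming the window is given in the `ddd:hh:mm-ddd:hh:mm` UTC form that the Amazon Redshift API uses for a cluster's preferred maintenance window (for example `sat:05:00-sat:05:30`). The function names are hypothetical, and counting only whole days is an assumption; how Applications Manager rounds the value is not stated here.

```python
from datetime import datetime, timedelta, timezone

# Order matches datetime.weekday(), where Monday is 0.
DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def next_window_start(window: str, now: datetime) -> datetime:
    """Return the next start time (UTC) of a weekly window like 'sat:05:00-sat:05:30'."""
    start = window.split("-")[0]
    day, hour, minute = start.split(":")
    now = now.astimezone(timezone.utc)
    days_ahead = (DAYS.index(day.lower()) - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=int(hour), minute=int(minute), second=0, microsecond=0
    )
    if candidate <= now:
        # This week's window start has already passed; use next week's.
        candidate += timedelta(days=7)
    return candidate


def days_until_window(window: str, now: datetime) -> int:
    """Whole days remaining until the next window start (assumed rounding: floor)."""
    return (next_window_start(window, now) - now).days


# Hypothetical check time: Monday, 1 January 2024, 12:00 UTC.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(days_until_window("sat:05:00-sat:05:30", now))
```

Pass a timezone-aware `now`; a naive datetime would be interpreted as local time by `astimezone`, which could shift the result near a day boundary.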
