Amazon MSK Cluster Monitoring


Amazon MSK Cluster - An Overview

Applications Manager provides comprehensive monitoring for Amazon Managed Streaming for Apache Kafka (Amazon MSK) clusters. By continuously tracking cluster health, performance, and configuration details, Applications Manager helps you ensure high availability, optimal performance, and quick issue detection across your MSK infrastructure.

Creating a new Amazon MSK Cluster Monitor

To learn how to create a new Amazon MSK Cluster Monitor, refer here.

Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. In the Cloud Apps table, click MSK Cluster under 'Child Monitors'. The bulk configuration view is displayed with three tabs:

  • Availability tab shows the availability history for the past 24 hours or 30 days.
  • Performance tab provides the health status and events for the past 24 hours or 30 days.
  • List view enables you to perform bulk admin configurations.

Click on the monitor name to see all the Amazon MSK Cluster metrics listed under the following tabs:

Performance Overview

Parameter | Description
CLUSTER INFORMATION
Cluster State | The current operational state of the MSK cluster. Possible values: ACTIVE, CREATING, UPDATING, DELETING, FAILED, MAINTENANCE, REBOOTING_BROKER, HEALING
Cluster Type | The type of backend cluster.
Broker Type | The type of broker used for the cluster. Possible values: STANDARD, EXPRESS
Metadata Mode | How cluster metadata is managed. Possible values: KRaft, ZooKeeper

Note: Metadata Mode applies only to the STANDARD broker type. EXPRESS clusters use AWS-managed serverless configuration by default and do not provide an option to select or modify the metadata mode.

Number of Brokers | The total number of broker nodes currently provisioned in the cluster.
ZooKeeper Session State | The connection status of the ZooKeeper session for brokers at the time of polling, reported as the worst state observed among all brokers. Possible values: NOT_CONNECTED, ASSOCIATING, CONNECTING, CONNECTED_READ_ONLY, CONNECTED, CLOSED, AUTH_FAILED

Note: ZooKeeper Session State applies only to the STANDARD broker type with ZooKeeper as the metadata mode. It is not applicable to KRaft or EXPRESS clusters.
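
The "worst state observed among all brokers" rule for ZooKeeper Session State can be illustrated with a minimal Python sketch. The severity ranking below is an assumption made for illustration only; the source does not say how Applications Manager orders the states, and the function name is hypothetical.

```python
# Hypothetical severity ranking: a higher number means a worse state.
# This ordering is an assumption; the product's actual ordering is not documented here.
SEVERITY = {
    "CONNECTED": 0,
    "CONNECTED_READ_ONLY": 1,
    "CONNECTING": 2,
    "ASSOCIATING": 3,
    "NOT_CONNECTED": 4,
    "CLOSED": 5,
    "AUTH_FAILED": 6,
}


def worst_session_state(broker_states):
    """Return the most severe ZooKeeper session state among the brokers."""
    if not broker_states:
        raise ValueError("at least one broker state is required")
    unknown = [state for state in broker_states if state not in SEVERITY]
    if unknown:
        raise ValueError(f"unknown session state(s): {unknown}")
    # max() with the severity lookup as key picks the worst state.
    return max(broker_states, key=SEVERITY.__getitem__)
```

With this ranking, a cluster whose brokers report CONNECTED, CONNECTING, and CONNECTED would be shown as CONNECTING.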

ACTIVE CONTROLLERS
Active Controllers | The minimum number of active controllers managing the cluster during the polling interval. This shows whether the cluster ever lost its active controller, which is critical for preventing control plane failures.
KAFKA DATALOGS DISK UTILIZATION
Kafka Datalogs Disk Utilization | The average disk space used for Kafka data logs at the time of polling (in %).

Note: Kafka Datalogs Disk Utilization applies only to the STANDARD broker type. EXPRESS clusters use serverless elastic storage and therefore do not report disk utilization.

OFFLINE PARTITIONS
Offline Partitions | The maximum number of offline partitions in the cluster during the polling interval. This helps track inactive or down partitions, which indicate data unavailability or broker outages.

Note: Offline Partitions applies only to the STANDARD broker type. EXPRESS clusters are fully AWS-managed and do not expose partition leadership or offline state details to users.
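
Active Controllers is reported as a minimum over the polling interval and Offline Partitions as a maximum, so a brief dip or spike between polls is not averaged away. The following Python sketch shows that aggregation under stated assumptions: the sample lists, the `IntervalSummary` type, and the health rule (at least one active controller, zero offline partitions) are hypothetical and are not taken from Applications Manager.

```python
from dataclasses import dataclass


@dataclass
class IntervalSummary:
    min_active_controllers: int
    max_offline_partitions: int

    @property
    def healthy(self) -> bool:
        # Assumed rule: the cluster never lost its controller and
        # no partition was offline at any sample in the interval.
        return self.min_active_controllers >= 1 and self.max_offline_partitions == 0


def summarize_interval(controller_samples, offline_partition_samples):
    """Reduce the samples collected within one polling interval to worst-case values."""
    if not controller_samples or not offline_partition_samples:
        raise ValueError("each metric needs at least one sample")
    return IntervalSummary(
        min_active_controllers=min(controller_samples),
        max_offline_partitions=max(offline_partition_samples),
    )
```

For example, samples of `[1, 1, 0, 1]` active controllers give a minimum of 0, which surfaces a controller loss that an average (0.75) would blur.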

GLOBAL PARTITIONS
Global Partitions | The maximum number of partitions in the cluster, excluding replicas, during the polling interval. This tracks the highest total partition count at any time, which is essential for resource allocation and capacity planning.
GLOBAL TOPICS
Global Topics | The maximum number of topics across all brokers in the cluster during the polling interval. This tracks the peak topic count, which is important for assessing capacity requirements and workload spikes.
CLIENT CONNECTIONS
Client Connections | The average number of clients actively connected to the brokers in the cluster during the polling interval.
ZOOKEEPER REQUEST MEAN LATENCY
ZooKeeper Request Mean Latency | The mean latency of Apache ZooKeeper requests during the polling interval (in ms).

Note: ZooKeeper Request Mean Latency applies only to the STANDARD broker type with ZooKeeper as the metadata mode. It is not applicable in KRaft mode or for EXPRESS clusters.

EXPRESS CLUSTER STORAGE USED
Express Cluster Storage Used | The average amount of storage used across all partitions in the cluster, excluding replicas, during the polling interval.

Note: Express Cluster Storage Used applies only to the EXPRESS broker type, which uses AWS-managed serverless elastic storage. It is not applicable to STANDARD clusters with EBS-backed storage.
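
The applicability notes in this table can be summarized as a small lookup. The Python sketch below encodes only what the notes state; the function name and the use of plain strings for broker type and metadata mode are assumptions for illustration.

```python
def applicable_metrics(broker_type, metadata_mode=None):
    """Return the performance metrics that apply to a cluster, per the notes above."""
    # Metrics that carry no applicability note in the table.
    common = {
        "Active Controllers",
        "Global Partitions",
        "Global Topics",
        "Client Connections",
    }
    if broker_type == "EXPRESS":
        return common | {"Express Cluster Storage Used"}
    if broker_type == "STANDARD":
        metrics = common | {"Kafka Datalogs Disk Utilization", "Offline Partitions"}
        # ZooKeeper metrics require ZooKeeper as the metadata mode.
        if metadata_mode == "ZooKeeper":
            metrics |= {"ZooKeeper Session State", "ZooKeeper Request Mean Latency"}
        return metrics
    raise ValueError(f"unknown broker type: {broker_type!r}")
```

Calling `applicable_metrics("STANDARD", "KRaft")` omits both ZooKeeper metrics, while `applicable_metrics("EXPRESS")` omits disk utilization and offline partitions.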

Configuration

Parameter | Description
CLUSTER CONFIGURATION
Kafka Version | The Apache Kafka version currently running on the cluster.
User Action Status | The type of action required from the user. Possible values: CRITICAL_ACTION_REQUIRED, ACTION_RECOMMENDED, NONE
Broker Size | The instance type selected for brokers in the Amazon MSK cluster. The broker size determines each broker's compute, memory, and storage capacities.
Creation Time | The time when the cluster was created.
Enhanced Monitoring | The level of cluster metrics that Amazon MSK publishes to the user's CloudWatch account. Possible values: DEFAULT, PER_BROKER, PER_TOPIC_PER_BROKER, PER_TOPIC_PER_PARTITION
SECURITY CONFIGURATION
Unauthenticated Access | Specifies whether the cluster allows unauthenticated client connections.
TLS Client Authentication | Specifies whether client connections to the cluster require TLS certificate-based authentication.
SASL/IAM Authentication | Indicates whether SASL authentication using AWS IAM credentials is enabled for the cluster.
SASL/SCRAM Authentication | Indicates whether SASL/SCRAM (username and password based) authentication is enabled for client access.
Encryption Key (Data At Rest) | The AWS KMS key used to encrypt data stored at rest within the cluster.
STORAGE CONFIGURATION

Note: Storage configuration applies only to the STANDARD broker type, which uses provisioned EBS volumes. EXPRESS clusters use AWS-managed serverless storage without provisioned capacity.

EBS Storage Volume Per Broker | The Amazon EBS volume size allocated for each broker in the cluster (in GiB).
Provisioned Storage Throughput Per Broker | The throughput of the EBS volumes for the data drive on each broker (in MiB/s). This value is available only when provisioned throughput is enabled.
NETWORK CONFIGURATION
Client Subnets | The list of configured subnets in which the cluster's brokers are deployed and client connections are established.
Public Access | Specifies whether the cluster allows client connections over the public internet.
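
As one way to use these configuration values, the Python sketch below flags a few settings that may deserve review. It assumes a hypothetical flat dictionary keyed by the parameter names in the table, with boolean values for the on/off settings; neither this structure nor the checks come from Applications Manager, and they are not a complete security audit.

```python
# Parameter names from the Security Configuration group above.
AUTH_METHODS = (
    "TLS Client Authentication",
    "SASL/IAM Authentication",
    "SASL/SCRAM Authentication",
)


def security_findings(config):
    """Return a list of review notes for a hypothetical flat config dict."""
    findings = []
    if config.get("Unauthenticated Access"):
        findings.append("unauthenticated client access is enabled")
    if not any(config.get(name) for name in AUTH_METHODS):
        findings.append("no client authentication method is enabled")
    if config.get("Public Access"):
        findings.append("public access is enabled")
    if config.get("User Action Status") == "CRITICAL_ACTION_REQUIRED":
        findings.append("a critical user action is pending")
    return findings
```

An empty list means none of these illustrative checks matched; it does not by itself mean the cluster is securely configured.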
