Apache Zookeeper Monitoring


Apache Zookeeper - An Overview

Apache Zookeeper is an open-source server that reliably coordinates distributed processes and applications. It allows distributed processes to coordinate with each other through a shared hierarchical namespace that is organized much like a standard file system. A ZooKeeper server maintains configuration information and naming, and provides distributed synchronization and group services to distributed applications.

Monitoring Apache Zookeeper - What we do.

Apache Zookeeper provides a hierarchical file system (with ZNodes as the system files) that helps with the discovery, registration, configuration, locking, leader selection, queueing, etc. of services running on different machines. By providing an efficient way to monitor Zookeeper, Applications Manager aims to help administrators manage their Zookeeper servers, collect all the metrics that help with troubleshooting, and be alerted automatically to potential issues. Let’s take a look at what you need to see to monitor Zookeeper and the performance metrics to gather with Applications Manager:

  • Resource utilization details - Automatically discover Zookeeper clusters, monitor memory (heap and non-heap) on each node, and get alerts on changes in resource consumption.
  • Thread and JVM usage - Track thread usage with metrics like Daemon, Peak and Live Thread Count. Ensure that started threads don’t overload the server's memory.
  • Performance Statistics - Gauge the time it takes for the server to respond to a client request, the queued requests and connections in the server, and performance degradation due to network usage (client packets sent and received).
  • Cluster and Configuration details - Track the number of znodes, the watchers set up on the nodes and the number of followers within the ensemble. Keep an eye on the leader election stats and client session times.
  • Fix Performance Problems Faster - Get instant notifications when there are performance issues with the components of Apache Zookeeper. Become aware of performance bottlenecks and take quick remedial actions before your end users experience issues.

Apache Zookeeper - Adding a new monitor

Supported versions: 3.4.9

Prerequisites for monitoring Apache Zookeeper metrics: Click here

Using the REST API to add a new Apache Zookeeper monitor: Click here

To create an Apache Zookeeper Monitor, follow the steps given below:

  1. Click the New Monitor link and choose Apache Zookeeper.
  2. Enter the Display Name of the monitor.
  3. Enter the IP address or hostname of the host where the Zookeeper server runs.
  4. Enter the JMX Port of the Zookeeper server. By default, it will be 7199. You can also check the zkServer.sh file for the JMX_PORT value.
  5. To discover only this node and not all nodes in the cluster, disable the option Discover all nodes in the Cluster. It is enabled by default, which means all the nodes in the cluster are discovered.
  6. Enter the credential details like user name, password and JNDIPath, or select credentials from the Credential Manager list.
  7. Check the Is Authentication Required field to provide the JMX credentials used to connect to the Zookeeper server.
  8. Enter the polling interval time in minutes.
  9. Click the Test Credentials button if you want to test access to the Apache Zookeeper server.
  10. Choose the Monitor Group with which you want to associate the Apache Zookeeper monitor from the combo box (optional). You can associate your monitor with multiple groups.
  11. Click Add Monitor(s). This discovers Apache Zookeeper from the network and starts monitoring.
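
Steps 3, 4 and 6 together identify a JMX endpoint. As a rough pre-flight check before adding the monitor, the sketch below builds the conventional JMX-over-RMI service URL from those values and tests whether the port accepts TCP connections. The function names and the default JNDI path /jmxrmi are assumptions made for illustration and are not part of Applications Manager. A successful TCP connection only shows that the port is open; it does not prove that JMX authentication will succeed.

```python
import socket


def jmx_service_url(host, port, jndi_path="/jmxrmi"):
    """Build the conventional JMX-over-RMI service URL for a host and port."""
    # /jmxrmi is the name usually bound by the JVM's built-in JMX agent
    # (assumed here; use the JNDIPath from step 6 if yours differs).
    if not jndi_path.startswith("/"):
        jndi_path = "/" + jndi_path
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}{jndi_path}"


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, jmx_service_url("zk1.example.com", 7199) yields the URL a generic JMX client would use for a (hypothetical) host zk1.example.com, and port_is_reachable("zk1.example.com", 7199) reports whether that port can be reached from the machine running the check.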

Note:
In case you are unable to add the monitor even after enabling JMX, try adding the following JVM argument to the Zookeeper server's startup options:
 -Djava.rmi.server.hostname=[YOUR_IP]
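
For context, remote JMX on a JVM is normally switched on with a handful of standard system properties, and the argument in the note above is added alongside them. The sketch below only assembles such an argument list. The helper name and the example values are hypothetical, and authentication and SSL are turned off here purely to keep the illustration short; the exact settings required are described in the prerequisites.

```python
def _bool_flag(value):
    """Render a Python bool the way JVM system properties expect it."""
    return "true" if value else "false"


def jmx_jvm_args(port, rmi_hostname, authenticate=False, ssl=False):
    """Return JVM arguments that expose the JVM's MBeans over remote JMX."""
    return [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={port}",
        f"-Dcom.sun.management.jmxremote.authenticate={_bool_flag(authenticate)}",
        f"-Dcom.sun.management.jmxremote.ssl={_bool_flag(ssl)}",
        # The argument from the note: the address that RMI advertises to clients.
        f"-Djava.rmi.server.hostname={rmi_hostname}",
    ]
```

Joining the result with spaces, for example " ".join(jmx_jvm_args(7199, "192.0.2.10")), gives a string that can be passed to the JVM running Zookeeper through whatever mechanism your start script provides for extra JVM options.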

Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. Click on Apache Zookeeper under the Services Table. This displays the Apache Zookeeper bulk configuration view, which is divided into three tabs:

  • Availability tab gives the Availability history for the past 24 hours or 30 days.
  • Performance tab gives the Health Status and events for the past 24 hours or 30 days.
  • List view enables you to perform bulk admin configurations.

Click on the monitor name to see all the server details listed under the following tabs:

Overview

PARAMETER - DESCRIPTION
LEADER ELECTION STATUS
Replica Name - The shard replica name.
State - The serving mode: leader, follower, leader election, or standalone if not running in an ensemble.
Election in Progress - Indicates whether an election is in progress. Values can be YES or NO.
Election Start Time - The time at which the leader election started.
MEMORY DETAILS
Total Physical Memory Size - The total size of physical memory that is available to Zookeeper for its operations and storage.
Free Physical Memory Size - The total size of physical memory that is free and available for the Zookeeper clusters and nodes.
Committed Virtual Memory Size - The total size of virtual memory that is currently occupied by the corresponding Zookeeper nodes.
Total Swap Space Size - The total size of the swap space that is available for swapping when the virtual memory reaches the limit.
Free Swap Space Size - The size of the free swap space that is available for swapping when the virtual memory reaches the limit.
THREAD DETAILS
Daemon Thread Count - The number of daemon threads that are running. A daemon thread is a thread that does not prevent the JVM from exiting when the program finishes, even if the thread is still running.
Peak Thread Count - The maximum thread count since JVM start.
Live Thread Count - The current number of live threads (daemon and non-daemon) on the node.
Total Started Thread Count - The total number of started threads.
HEAP MEMORY DETAILS
Committed Heap Memory - The total amount of committed heap memory.
Initial Heap Memory - The minimum heap memory allocated.
Maximum Heap Memory - The maximum heap memory that Zookeeper can use.
Used Heap Memory - The total used heap memory.
NON-HEAP MEMORY DETAILS
Committed Non-Heap Memory - The total amount of committed non-heap memory.
Initial Non-Heap Memory - The minimum non-heap memory allocated.
Maximum Non-Heap Memory - The maximum non-heap memory that Zookeeper can use.
Used Non-Heap Memory - The total used non-heap memory.
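
The heap and non-heap figures above are often easiest to alert on as a percentage. A minimal sketch, assuming the values come from the JVM's memory MBeans, where a maximum of -1 means that no limit is defined (common for non-heap memory). In that case this sketch falls back to the committed size, which is a choice made here for illustration and not documented Applications Manager behavior. The function name is hypothetical.

```python
def memory_utilization_pct(used, committed, maximum):
    """Return used memory as a percentage of the maximum (or committed) size."""
    # The JVM reports maximum == -1 when no upper bound is defined,
    # so fall back to the committed size in that case.
    limit = maximum if maximum > 0 else committed
    if limit <= 0:
        return 0.0
    return round(100.0 * used / limit, 2)
```

With all three values given in the same unit, memory_utilization_pct(256, 512, 1024) measures usage against the maximum, while memory_utilization_pct(256, 512, -1) measures it against the committed size.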

Performance

PARAMETER - DESCRIPTION
PACKETS STATISTICS
Packets Received/Min - The number of packets received per minute.
Packets Sent/Min - The number of packets sent per minute.
LATENCY
Minimum Request Latency - The minimum amount of time it takes for the server to respond to a client request since the server was started, in milliseconds.
Average Request Latency - The average time it takes for the server to respond to a client request since the server was started, in milliseconds.
Maximum Request Latency - The maximum amount of time it takes for the server to respond to a client request since the server was started, in milliseconds.
NUMBER OF CONNECTIONS ALIVE
Number of Connections Alive - The number of live connections.
NUMBER OF OUTSTANDING REQUESTS
No of Outstanding Requests - The number of queued requests in the server. This goes up when the server receives more requests than it can process.
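
Zookeeper's own packet counters are cumulative, so a per-minute figure like the ones above has to be derived from two samples. How Applications Manager does this internally is not stated here; the following is just one plausible sketch, with a hypothetical function name, that also guards against counter resets.

```python
def rate_per_minute(previous, current, elapsed_seconds):
    """Convert two cumulative counter samples into a per-minute rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # A lower current value means the counter was reset (for example by a
    # server restart), so only the packets counted since the reset are known.
    delta = current - previous if current >= previous else current
    return delta * 60.0 / elapsed_seconds
```

For instance, a counter that moves from 1000 to 1600 over a 300-second polling interval corresponds to 120 packets per minute.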

InMemory Data Tree

PARAMETER - DESCRIPTION
NODE COUNT
Node Count - The number of znodes in the Zookeeper data tree.
WATCH COUNT
Watch Count - The number of watchers set up on Zookeeper nodes.
EPHEMERAL NODE COUNT
Ephemeral Node Count - The number of ephemeral nodes.
APPROXIMATE DATA SIZE
Approximate Data Size - The approximate size of the data stored, in bytes.

Cluster Details

PARAMETER - DESCRIPTION
NODE COUNT
Node Count - The number of znodes in the Zookeeper cluster.
WATCH COUNT
Watch Count - The number of watchers set up on Zookeeper nodes.
EPHEMERAL NODE COUNT
Ephemeral Node Count - The number of ephemeral nodes. Ephemeral nodes are suited to transient data: these znodes exist only as long as the session that created them is active.
APPROXIMATE DATA SIZE
Approximate Data Size - The approximate size of the data stored, in bytes.

Configuration

PARAMETER - DESCRIPTION
ZOOKEEPER CONFIGURATION DETAILS
Replica Name - The shard replica name.
State - The serving mode: leader, follower, leader election, or standalone if not running in an ensemble.
Client Port - The port to listen on for client connections.
Init Limit - The amount of time, in ticks (see Tick Time), to allow followers to connect and sync to a leader. Increase this value as needed if the amount of data managed by ZooKeeper is large.
Max Client Connections Per Host - The maximum number of concurrent connections (at the socket level) that a single client, identified by IP address, may make to a single member of the ZooKeeper ensemble.
Max Session Timeout - The maximum session timeout in milliseconds that the server will allow the client to negotiate. Defaults to 20 times the Tick Time.
Min Session Timeout - The minimum session timeout in milliseconds that the server will allow the client to negotiate. Defaults to 2 times the Tick Time.
Quorum Address - A replicated group of servers in the same application is called a quorum. In replicated mode, each server has a quorum address that the other servers use to contact it.
Zookeeper Start Time - The time at which Zookeeper was started.
Sync Limit - The amount of time, in ticks (see Tick Time), to allow followers to sync with ZooKeeper. If followers fall too far behind a leader, they will be dropped.
Tick - The length of a single tick, which is the basic time unit used by ZooKeeper, as measured in milliseconds.
Tick Time - The length of a single tick, which is the basic time unit used by ZooKeeper, as measured in milliseconds. It is used to regulate heartbeats and timeouts. For example, the minimum session timeout will be two ticks.
Version - The version of Zookeeper installed.
CONFIGURATION DETAILS
VM Name - The Java virtual machine name.
Boot Class Path - The boot class path that is used by the bootstrap class loader to search for class files.
VM Vendor - The Java virtual machine implementation vendor.
Class Path - The Java class path that is used by the system class loader to search for class files.
Spec Vendor - The vendor of the JMX specification implemented by this product.
Spec Version - The version of the JMX specification implemented by this product.
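
Several of the Zookeeper settings listed above (Init Limit, Sync Limit and the session timeout defaults) are expressed in ticks, so their real durations depend on Tick Time. A small worked example follows, using a hypothetical helper and sample values of a 2000 ms tick, an Init Limit of 10 and a Sync Limit of 5. The values on your own servers are the ones shown in the Configuration tab.

```python
def tick_settings_ms(tick_time_ms, init_limit_ticks, sync_limit_ticks):
    """Translate tick-based Zookeeper settings into milliseconds."""
    return {
        # Time followers get to connect and sync to a leader.
        "init_limit_ms": init_limit_ticks * tick_time_ms,
        # Time followers get to sync before they are dropped.
        "sync_limit_ms": sync_limit_ticks * tick_time_ms,
        # Default session timeout bounds: 2 and 20 times the tick time.
        "min_session_timeout_ms": 2 * tick_time_ms,
        "max_session_timeout_ms": 20 * tick_time_ms,
    }


settings = tick_settings_ms(tick_time_ms=2000, init_limit_ticks=10, sync_limit_ticks=5)
```

With these sample values, followers get 20 seconds to connect and sync, 10 seconds to stay in sync, and client sessions can negotiate timeouts between 4 and 40 seconds.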