Elasticsearch Monitoring

Elasticsearch - An Overview

Elasticsearch is a highly scalable, distributed, open source RESTful search and analytics engine. It is multitenant-capable with an HTTP web interface and schema-free JSON documents. Based on Apache Lucene, Elasticsearch is one of the most popular enterprise search engines today and is capable of solving a growing number of use cases like log analytics, real-time application monitoring, and click stream analytics.

Monitoring Elasticsearch - What we do

Here is what to watch when monitoring Elasticsearch, which performance metrics to gather, and how Applications Manager's Elasticsearch monitoring helps you ensure that your search server is up and operating as expected:

Resource Utilization Details - Applications Manager automatically discovers Elasticsearch servers, monitors memory and CPU and notifies you of changes in resource consumption of thread pool queues.
Real-Time Data - You get up-to-the-second insight into cluster runtime metrics, individual cluster nodes, real-time threads and configurations.
Cluster and Node Monitoring - Stay on top of your cluster and node health in real time with fine-grained performance statistics, from disk I/O to Java memory usage metrics.
Search and Indexing Performance - Gain complete control of your indexes and mappings. Monitor query latency, file system cache usage, and request rates, and take action if any of them surpasses a threshold.
Fix Performance Problems Faster - Get instant notifications when there are performance issues. Become aware of performance bottlenecks and take quick remedial actions before your end users experience issues.

Creating a new Elasticsearch monitor

Using the REST API to add a new Elasticsearch monitor: Click here

To create an Elasticsearch Monitor, follow the steps given below:

Click the New Monitor link and choose ElasticsearchCluster.
Specify the Display Name of the Elasticsearch monitor.
Enter the HostName or IP Address of the host where Elasticsearch Cluster runs.
Enter the Port of the Elasticsearch Cluster. By default, it will be 9200.
Enter the polling interval time in minutes.
Click the Test Credentials button if you want to test access to the Elasticsearch server.
Choose the Monitor Group from the combo box with which you want to associate Elasticsearch Monitor (optional). You can choose multiple groups to associate your monitor.
Click Add Monitor(s). This discovers Elasticsearch from the network and starts monitoring.

Note:

Security/Firewall Requirements - The Elasticsearch cluster host and port should be accessible from the machine where Applications Manager is installed.
User Privilege - The required user credentials should be provided.
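
To check the firewall requirement above before adding the monitor, you can try a plain TCP connection from the Applications Manager machine to the Elasticsearch host and port. The following is a minimal sketch using only the Python standard library; the function name `is_port_open` is illustrative and is not part of Applications Manager.

```python
import socket


def is_port_open(host: str, port: int = 9200, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `is_port_open("es-host.example.com", 9200)` (a placeholder host name) returns False when a firewall blocks the port or the host cannot be resolved.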

Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. Click Elasticsearch or ElasticsearchCluster under the Web Server/Services table. The bulk configuration view for the selected monitor type is displayed, distributed into three tabs:

Availability tab displays the Availability history for the past 24 hours or 30 days.
Performance tab displays the Health Status and events for the past 24 hours or 30 days.
List view enables you to perform bulk admin configurations.

Click on the monitor name to see all the server details listed under the following tabs:

Elasticsearch Cluster

Overview

PARAMETER	DESCRIPTION
NODE DETAILS
Node Name	The name of the node
Node Type	The type of the node (Client, Data, Master-Eligible, or Data-Master).
Avg Query Time	The time taken to process the query across all shards. Query is the first phase of a search operation.
Avg Fetch Time	The time taken to retrieve the query results, only from the shards that hold the requested data. Fetch is the second phase of a search operation.
CLUSTER OVERVIEW
Cluster Status	The health status of the cluster, which depends on whether its primary and replica shards are allocated.
Total Nodes	The total number of nodes in the cluster.
Total Indices	The total number of indices in the cluster.
Total Shards	The total number of shards in the cluster.
Total Docs	The total number of documents present in the cluster.
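
Elasticsearch reports cluster status as green, yellow, or red according to shard allocation: green when every primary and replica shard is assigned, yellow when all primaries are assigned but at least one replica is not, and red when at least one primary is unassigned. The sketch below encodes that rule; the function and argument names are illustrative only.

```python
def cluster_status(unassigned_primaries: int, unassigned_replicas: int) -> str:
    """Map unassigned shard counts to the green/yellow/red cluster status."""
    if unassigned_primaries > 0:
        return "red"      # some data is unavailable
    if unassigned_replicas > 0:
        return "yellow"   # data is available but not fully replicated
    return "green"        # all primary and replica shards are assigned
```

A single-node cluster whose indices are configured with replicas therefore typically shows yellow, because the replicas have no second node to be assigned to.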

Cluster Details

PARAMETER	DESCRIPTION
NODES SPLITUP
Client Node	The total number of Client Nodes in the cluster.
Data Node	The total number of Data Nodes in the cluster.
Master Node	The total number of Master Eligible Nodes in the cluster.
Data-Master Node	The total number of Data Nodes that also act as Master Eligible Nodes in the cluster.
SHARDS COUNT
Active Shards	The number of Active Shards present in the cluster.
Active Primary Shards	The number of Primary Shards that are Active in the cluster.
Relocating Shards	The number of Relocating Shards present in the cluster.
Initializing Shards	The number of Initializing Shards present in the cluster.
Unassigned Shards	The number of Unassigned Shards present in the cluster.
Delayed Unassigned Shards	The number of Delayed Unassigned Shards present in the cluster.
Total Shards	The number of Shards present in the cluster.
Top 20 Pending Tasks by Priority
Insert Order	The order in which the pending task was inserted into the queue.
Priority	The priority assigned for the particular task.
Source	The source for the pending task.
Wait Time by Priority	The total waiting time of the pending task in that queue based on priority (in milliseconds).
Top 20 Pending Tasks by Wait Time
Insert Order	The order in which the pending task was inserted into the queue.
Priority by Wait Time	The priority assigned for the particular task based on Wait Time.
Source	The source for the pending task.
Wait Time	The total waiting time of the pending task in that queue (in milliseconds).
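
The two pending-task views above can be reproduced from a list of task records. The sketch below assumes each record carries the fields returned by the Elasticsearch pending cluster tasks API (`insert_order`, `priority`, `source`, `time_in_queue_millis`). The exact ordering rule Applications Manager applies is not stated here, so the tie-breaking by wait time is an assumption.

```python
# Priority names used by Elasticsearch, from most to least urgent.
PRIORITY_ORDER = ["IMMEDIATE", "URGENT", "HIGH", "NORMAL", "LOW", "LANGUID"]


def top_by_priority(tasks, limit=20):
    """Most urgent tasks first; ties broken by longest wait (an assumption)."""
    rank = {name: i for i, name in enumerate(PRIORITY_ORDER)}
    ordered = sorted(
        tasks,
        key=lambda t: (rank.get(t["priority"], len(rank)),
                       -t["time_in_queue_millis"]),
    )
    return ordered[:limit]


def top_by_wait_time(tasks, limit=20):
    """Tasks that have waited longest in the queue first."""
    ordered = sorted(tasks, key=lambda t: t["time_in_queue_millis"],
                     reverse=True)
    return ordered[:limit]
```

Unknown priority names sort after all known ones, so an unexpected value does not raise an error.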

Indices

PARAMETER	DESCRIPTION
INDICES OVERVIEW
Index Name	The name of the index representing a collection of documents.
Documents	Indicates the number of documents that are available in the particular index.
Indexing Latency	Amount of time taken to index a document in the particular index (in milliseconds).
Indexing Rate	The number of documents that are indexed per second.
Query Latency	Amount of time taken to process the query in the particular index (in milliseconds).
Query Rate	The number of queries that are processed by the index per second.
Fetch Latency	Amount of time taken to retrieve the query results from the particular index (in milliseconds).
Fetch Rate	The number of fetch operations that retrieve data from the index per second.
Current Merges	Indicates the number of merges currently in progress in the particular index.
Merge Time	Amount of time taken to merge segments in the particular index (in milliseconds).
Flush Time	Amount of time taken to flush the particular index to disk (in milliseconds).
Refresh Time	Amount of time taken to refresh the particular index (in milliseconds).
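
Elasticsearch exposes index statistics as cumulative counters (for example `index_total` and `index_time_in_millis`). One common way to turn them into latency and rate figures like those above is to take the difference between two polls. The sketch below shows that arithmetic; it is an assumption about how such values can be derived, not a description of Applications Manager's internals.

```python
def latency_ms(prev_total, prev_time_ms, curr_total, curr_time_ms):
    """Average time per operation between two samples, in milliseconds."""
    ops = curr_total - prev_total
    if ops <= 0:
        return 0.0  # no operations in the interval, or counters were reset
    return (curr_time_ms - prev_time_ms) / ops


def rate_per_second(prev_total, curr_total, interval_seconds):
    """Operations per second between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return max(curr_total - prev_total, 0) / interval_seconds
```

With a 5-minute polling interval, 600 new documents indexed in 1,200 ms of indexing time give a latency of 2 ms per document and a rate of 2 documents per second.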

Configuration

PARAMETER	DESCRIPTION
CONFIGURATION DETAILS
Cluster Name	The name of the cluster.
Total Nodes	The total number of nodes in the cluster.
Master Node Name	The name of the Master Node in the cluster.
Master Node Port	The port on which the Master node of Elasticsearch runs.
Master Node IP	The IP address on which the Master Node runs.
Publish Port	The publish port of the cluster.

Elasticsearch

Overview

PARAMETER	DESCRIPTION
AVERAGE SYSTEM LOAD
Avg. System Load	The average value of the amount of load that is being processed by the system (in the last 1 minute, 5 minutes, and 15 minutes).
CPU UTILIZATION
CPU Utilization	Amount of CPU currently being utilized by the node (in %).
SEARCH TIME
Average Query Time	The time taken to process the query across all shards. Query is the first phase of a search operation.
Average Fetch Time	The time taken to retrieve the query results, only from the shards that hold the requested data. Fetch is the second phase of a search operation.
SEGMENT TIME
Average Merge Time	The average time taken for segment merging in a node. (A shard in Elasticsearch is a Lucene index, broken down into segments. Segments are periodically merged into larger segments to keep the index size at bay and expunge deletes.)
Average Refresh Time	The average time spent in refreshing an index. (Refresh time increases with the number of file operations for the Lucene index.)
INDEXING TIME
Average Index Time	The average time taken to index a document. (Documents are indexed, that is, stored and made searchable.)
Average Delete Time	The average time taken to delete a document from an index.
Indexed Count	The number of documents indexed.
Deleted Count	The number of deleted documents.
Indexing Rate	The number of documents that are indexed per second.
GET TIME
Average Get Time	The average time taken to retrieve a document with a get request.
Existing Count	The number of get requests for which the requested document existed.
Missing Count	The number of get requests for which the requested document was missing.
FLUSH TIME
Average Flush Time	The average time taken to flush one or more indices to disk. (The flush process of an index basically frees memory from the index by flushing data to the index storage and clearing the internal transaction log.)
WARMER TIME
Average Warmer Time	The average time taken to perform a warmup search on an index. (Index warming runs registered search requests to warm up the index before it is available for search.)
PERCOLATE TIME
Average Percolate Time	The average time spent running percolator queries. (One of Elasticsearch's core features is the ability to search in reverse with the percolator. The percolator automatically indexes the query terms with the percolator queries, which allows it to percolate documents more quickly.)

Memory Details

PARAMETER	DESCRIPTION
HEAP MEMORY
Used Heap Percent	The percentage of JVM heap currently in use.
Free Heap Percent	The percentage of JVM heap currently free.
NON-HEAP MEMORY
Used Non-Heap Percent	The percentage of non-heap memory currently in use.
Free Non-Heap Percent	The percentage of non-heap memory currently free.
GARBAGE COLLECTION
GC Time - Young	The total time spent on young-generation garbage collections.
GC Time - Old	The total time spent on old-generation garbage collections.
GC Count - Young	The total number of young-generation garbage collections.
GC Count - Old	The total number of old-generation garbage collections.
BUFFER POOLS
Direct Buffer Space Used	The total space used in the Direct Buffer pool.
Mapped Buffer Space Used	The total space used in the Mapped Buffer pool.
Direct Buffer Connection Count	The total connections to Direct Buffer pool.
Mapped Buffer Connection Count	The total connections to Mapped Buffer pool.
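
The heap percentages and garbage collection figures above follow from simple arithmetic on byte and time counters. The sketch below assumes inputs shaped like the `heap_used_in_bytes` and `heap_max_in_bytes` values and the per-collector `collection_count` and `collection_time_in_millis` values in the Elasticsearch node stats; the function names are illustrative.

```python
def heap_percentages(heap_used_bytes: int, heap_max_bytes: int):
    """Return (used_percent, free_percent) for the JVM heap."""
    if heap_max_bytes <= 0:
        raise ValueError("heap_max_bytes must be positive")
    used = 100.0 * heap_used_bytes / heap_max_bytes
    return round(used, 2), round(100.0 - used, 2)


def avg_gc_pause_ms(collection_count: int, collection_time_ms: int) -> float:
    """Average duration of one garbage collection, in milliseconds."""
    if collection_count <= 0:
        return 0.0  # no collections recorded yet
    return collection_time_ms / collection_count
```

A steadily rising used-heap percentage together with growing old-generation GC time is a common sign of memory pressure on a node.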

I/O Details

PARAMETER	DESCRIPTION
DISK I/O COUNT
Disk Read Count	The number of read (from the disk) requests by Elasticsearch.
Disk Write Count	The number of write (to the disk) requests by Elasticsearch.
DISK I/O SIZE
Disk Read Size	The total size of read requests (from the disk) by Elasticsearch.
Disk Write Size	The total size of write requests (to the disk) by Elasticsearch.
CACHE DETAILS
Cache Name	The name of the cache.
Total Size (MB)	The size of the cache.
Evictions	The number of evictions from the cache.
BREAKER DETAILS
Breaker Name	The name of the circuit breaker. (Circuit breakers handle situations in which processing a request needs more memory than is available, which would otherwise cause an OutOfMemoryError (OOM). It is often better to fail a single query than to hit an OOM, because the JVM becomes unresponsive when an OOM occurs.)
Limit Size (MB)	The limit size of the particular Breaker.
Used Size (MB)	The used size of the particular Breaker.
Tripped	The total number of times the circuit breaker tripped.
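
To judge how close a breaker is to tripping, compare its used size with its limit. The sketch below assumes a mapping shaped like the `breakers` section of the Elasticsearch node stats, where each breaker reports `limit_size_in_bytes`, `estimated_size_in_bytes`, and `tripped`; the helper name and output keys are illustrative.

```python
BYTES_PER_MB = 1024 * 1024


def breaker_rows(breakers: dict):
    """Convert per-breaker byte counters to MB and a usage percentage."""
    rows = []
    for name, stats in sorted(breakers.items()):
        limit = stats["limit_size_in_bytes"]
        used = stats["estimated_size_in_bytes"]
        rows.append({
            "name": name,
            "limit_mb": round(limit / BYTES_PER_MB, 2),
            "used_mb": round(used / BYTES_PER_MB, 2),
            # Guard against a zero or unset limit.
            "used_percent": round(100.0 * used / limit, 2) if limit > 0 else 0.0,
            "tripped": stats["tripped"],
        })
    return rows
```

A breaker whose used percentage stays high, or whose tripped count keeps increasing between polls, is a candidate for an alarm threshold.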

Thread Pools

PARAMETER	DESCRIPTION
THREAD DETAILS
Thread Name	The name of the thread pool.
Configured Threads	The number of threads configured for the current thread pool type.
Queue	The number of tasks waiting in the queue of the current thread pool type.
Active	The number of active threads of the current type.
Rejected	The number of tasks rejected by the thread pool of the current type.
Largest	The largest number of threads that have been active at the same time in the thread pool of the current type.
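
Rejections and growing queues are usually the thread pool values worth alerting on. The sketch below flags such pools from a mapping shaped like the `thread_pool` section of the Elasticsearch node stats (`queue` and `rejected` per pool); the function name and the default threshold are assumptions chosen for illustration.

```python
def pools_under_pressure(thread_pools: dict, queue_threshold: int = 0):
    """Return names of pools with rejections or a queue above the threshold."""
    flagged = []
    for name, stats in sorted(thread_pools.items()):
        rejected = stats.get("rejected", 0)
        queued = stats.get("queue", 0)
        if rejected > 0 or queued > queue_threshold:
            flagged.append(name)
    return flagged
```

Because `rejected` is cumulative since node start, comparing it between two polls shows whether rejections are still happening.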

Network

PARAMETER	DESCRIPTION
TRANSPORT
Transmitted Bytes	The number of bytes sent by the network. (Transport metrics about cluster communication)
Received Bytes	The number of bytes received by the network. (Transport metrics about cluster communication)
Transmitted Packets	The number of data packets sent by the network. (Transport metrics about cluster communication)
Received Packets	The number of data packets received by the network. (Transport metrics about cluster communication)
TCP CONNECTOR
Active Connections	The number of active TCP connections.
Passive Connections	The number of passive TCP connections.
HTTP CONNECTOR
Current Connections	The number of HTTP connections currently active.
Total Connections	The total number of HTTP connections.

Configuration

PARAMETER	DESCRIPTION
CONFIGURATION DETAILS
Cluster Name	The name of the cluster.
Node Name	The name of the node in the cluster.
Node Type	The type of the node (Client/Data/Master-Eligible/Data-Master).
Host	The IP address of the Host.
ElasticSearch Version	The version of the installed Elasticsearch.
Port	The port on which Elasticsearch runs.
ElasticSearch Home	The home directory of Elasticsearch.
Total Processors	The total number of processors in the current node.
Java Version	The version of Java running in the node.
Java Vendor	The Java vendor.
