Kubernetes Performance Monitoring

Kubernetes - An Overview

Kubernetes (or k8s) is an open-source container orchestration system for automating deployment, scaling and management of application containers across clusters of hosts. Kubernetes clusters can span hosts across public, private, or hybrid clouds. K8s orchestration allows users to build application services across multiple containers, schedule those containers across a cluster, scale those containers, and manage the health of those containers over time.

Monitor all your Kubernetes performance workloads using a single tool

Applications Manager's Kubernetes monitoring tool lets administrators adapt their monitoring strategy to the additional infrastructure layers that containers and container orchestration introduce in a distributed Kubernetes environment.

Auto-discover the parts and map relationships between objects in the cluster - Kubernetes nodes, namespaces, deployments, replica sets, pods, and containers.
Track the capacity and resource utilization of your cluster and be able to drill into specific parts of the cluster.
Identify whether your cluster has enough nodes and whether the resources allocated to existing nodes are sufficient for the deployed applications.
Ensure all nodes on the cluster are healthy - monitor the CPU and memory for Kubernetes nodes (workers and masters).
Ensure all desired pods in a deployment are running and not in a restart loop.
Set up alerts for container restarts to identify issues with either a container or its host that affect application performance.
Monitor the performance outliers of the Kubernetes-hosted applications running inside your cluster and track down any individual errors.
View the status of Kubernetes Master and Node components – API Server, the Etcd key/value store, Scheduler and Controller.
Monitor crucial Kubernetes performance metrics to predict future resource requirements and ensure that your cluster has sufficient capacity to handle potential workload spikes.

Note: In the Kubernetes cluster architecture, it is sufficient to add only the primary master node to Applications Manager. Applications Manager automatically discovers all the other master and worker nodes within the cluster and monitors them. There is no need to add each node individually as a Kubernetes performance monitor; doing so will lead to performance issues.

Discover more with a Kubernetes performance monitor

Prerequisites for setting up Kubernetes performance monitor: kubectl should be installed on the machine where Kubernetes is installed.
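
As a quick pre-flight check for this prerequisite, a minimal sketch like the one below could be run on the machine where Kubernetes is installed to confirm that kubectl is reachable on the PATH. The function name kubectl_available is illustrative only and is not part of Applications Manager.

```python
import shutil


def kubectl_available(binary: str = "kubectl") -> bool:
    """Return True if the given binary can be found on the current PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    status = "found" if kubectl_available() else "not found"
    print(f"kubectl {status} on PATH")
```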

Using the REST API to add a new Kubernetes performance monitor: Click here

Follow the steps given below to create a new Kubernetes monitor:

Click on New Monitor link.
Select Kubernetes under Virtualization category.
Specify the Display Name of the Kubernetes Server.
Enter the Cluster hostname/ IP address of the server where Kubernetes is running.
Enter the credential details like user name and password for authentication, or select the required credentials from the Credential Manager list after enabling the Select from Credential list option.
Check the box to enable Public Key Authentication (supported for SSH2 only), and provide the SSH key for SSH authentication.
Specify the command prompt value, which is the last character in your command prompt. Default value is $ and possible values are >, #, etc.
Enter the SSH port. Default SSH port used is 22.
Enable the Monitor Specific Namespace(s) option if you wish to monitor only specific namespace(s) in the Kubernetes environment. After enabling it, specify the following details:
- Filter Condition: Select the filtering condition to include or exclude monitoring of specific namespace(s) in the Kubernetes environment.
- Namespace Name(s): Specify the name of the namespace(s) to be included/excluded while monitoring. You can enter multiple namespaces as comma-separated values.
Check the Enable Event Log Monitoring box to enable the option to monitor Event Log details.
Specify the Polling Interval in minutes.
Choose the Monitor Group with which you want to associate the Kubernetes monitor from the combo box (optional). You can associate the monitor with multiple groups.
Click Add Monitor(s). This discovers the Kubernetes cluster from the network and starts monitoring it.
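
The namespace filter described in the steps above can be pictured with the following minimal sketch. It assumes the Filter Condition is either "include" or "exclude" and that Namespace Name(s) is a comma-separated string; the function names are hypothetical, and the product's actual matching rules (for example, case sensitivity) are not documented here.

```python
from typing import List, Set


def parse_namespace_list(csv_value: str) -> Set[str]:
    """Split a comma-separated namespace string, ignoring blanks and spaces."""
    return {name.strip() for name in csv_value.split(",") if name.strip()}


def filter_namespaces(discovered: List[str], condition: str, csv_value: str) -> List[str]:
    """Apply an include/exclude filter to the discovered namespaces.

    `condition` is assumed to be either "include" or "exclude".
    """
    selected = parse_namespace_list(csv_value)
    if condition == "include":
        return [ns for ns in discovered if ns in selected]
    if condition == "exclude":
        return [ns for ns in discovered if ns not in selected]
    raise ValueError(f"unknown filter condition: {condition!r}")


if __name__ == "__main__":
    namespaces = ["default", "kube-system", "prod"]
    print(filter_namespaces(namespaces, "exclude", "kube-system"))
```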

Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. Click on Kubernetes under the Virtualization table. Displayed is the Kubernetes bulk configuration view distributed into three tabs:

Availability tab gives the Availability history for the past 24 hours or 30 days.
Performance tab gives the Health Status and events for the past 24 hours or 30 days.
List view enables you to perform bulk admin configurations.

On clicking a monitor from the list, you'll be taken to the Kubernetes performance monitor dashboard. It has thirteen tabs:

Overview

Parameter	Description	Supported in Prometheus
CLUSTER USAGE DETAILS
Average Cluster CPU Usage	Average CPU used by the cluster
Average Cluster Memory Usage	Average Memory used by the cluster
CLUSTER DETAILS
Control Plane	Control Plane URL.
Git Version	Git version used in Kubernetes.
Build Date	Build Date
Compiler	Name of the compiler.
Platform	Version of the Platform.
CLUSTER SUMMARY
Namespace Count	Total number of Namespace.
Service Count	Total number of Services.
Deployment Count	Total number of Deployment.
Daemonset Count	Total number of Daemonsets.
Statefulset Count	Total number of Statefulsets.
Total Jobs Count	Total number of Jobs.
Replication Controller Count	Total number of Replication Controllers.
Replica Set Count	Total number of Replica Sets.
Ingress Count	Total number of Ingress.
CLUSTER NODE SUMMARY
Total Node Count	Total number of Nodes.
Master Node Count	Total number of Master Nodes.
Worker Node Count	Total number of Worker Nodes.
CLUSTER PODS SUMMARY
Total Pods Count	Total number of Pods.
Running Pods Count	Total number of Running Pods.
Succeeded Pods Count	Total number of Succeeded Pods.
Pending Pods Count	Total number of Pending Pods.
Failed Pods Count	Total number of Failed Pods.
Unknown Pods Count	Total number of Unknown Pods.
CLUSTER CONTAINER SUMMARY
Total Containers Count	Total number of Containers.
Running Containers Count	Total number of Running Containers.
Completed Containers Count	Total number of Completed Containers.
Terminated Containers Count	Total number of Terminated Containers.
Waiting Containers Count	Total number of Waiting Containers.
CLUSTER PODS USAGE DETAILS
Used Pod Count	Total number of Used Pods.
Maximum Pod Count	Maximum number of Allocatable Pods.
Top 5 Nodes by Used Pod Count
Used Pod Count	Total number of Used Pods.
COMPONENT DETAILS
Component Name	Name of the component.
Status	Status of the component.
Component Message	Root Cause message of the Component.
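
To illustrate how the pod counts in the CLUSTER PODS SUMMARY could be derived, the sketch below tallies pod phases from the JSON that `kubectl get pods --all-namespaces -o json` returns. It relies only on the standard `items[].status.phase` field; the function name is hypothetical, and this is not necessarily how Applications Manager computes the values.

```python
from collections import Counter

# The five pod phases listed in the summary table above.
PHASES = ("Running", "Succeeded", "Pending", "Failed", "Unknown")


def summarize_pod_phases(pod_list: dict) -> dict:
    """Count pods per phase in a parsed `kubectl get pods -o json` document."""
    counts = Counter(
        item.get("status", {}).get("phase", "Unknown")
        for item in pod_list.get("items", [])
    )
    summary = {phase: counts.get(phase, 0) for phase in PHASES}
    summary["Total"] = sum(counts.values())
    return summary


if __name__ == "__main__":
    # Tiny hand-written sample standing in for real kubectl output.
    sample = {"items": [
        {"status": {"phase": "Running"}},
        {"status": {"phase": "Running"}},
        {"status": {"phase": "Pending"}},
    ]}
    print(summarize_pod_phases(sample))
```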

Namespace

Parameter	Description	Supported in Prometheus
NAMESPACE DETAILS
Namespace Name	Name of the Namespace.
Resource Version	Resource version of the Namespace.
Namespace Availability	Availability of Namespace.
Namespace Created Time	Time at which the Namespace was created.
NAMESPACE PODS USAGE DETAILS
Namespace Name	Name of Namespace.
Total Pods Count	Total number of Pods present in the Namespaces.
Running Pods Count	Total number of Running Pods present in the Namespaces.
Succeeded Pods Count	Total number of Succeeded Pods present in the Namespaces.
Pending Pods Count	Total number of Pending Pods present in the Namespaces.
Failed Pods Count	Total number of Failed Pods present in the Namespaces.
Unknown Pods Count	Total number of Unknown Pods present in the Namespaces.

Node

Parameter	Description	Supported in Prometheus
TOP 5 NODES BY MEMORY DETAILS
Memory Limit	Maximum limit of Node memory in GiB.
Memory Requests	Total number of memory requests.
TOP 5 NODES BY CPU DETAILS
CPU Limit	Maximum limit of CPU
CPU Request	Total number of CPU requests
NODE MEMORY AND CPU DETAILS
Name	Name of the node.
Allocatable Memory(Gi)	The memory resources of a node that are available for scheduling, in Gi.
Memory Limit(%)	The maximum limit of memory resource which can be used.
Memory Request(%)	Total memory requests in %.
Allocatable CPU Processor Count	Total number of CPU processors that are available.
CPU Limit(%)	The maximum limit of CPU resource which can be used.
CPU Request(%)	Total CPU requests in %.
NODE POD DETAILS
Name	Name of the node.
Pod Usage Details	Total number of pods available with used and free pods split-up.
Kube-system Pod Count	Total number of pods in the kube-system namespace.
Non-Kube-system Pod Count	Total number of pods outside the kube-system namespace.
Image Count	Total number of images in the node.
Used Pod Count	Total number of pods currently present on the node.
Allocatable Pod Count	Total number of pods that can be allocated on the node.
NODE DETAILS
Name	Name of the node.
OSImage	OSImage name.
OS	Name of the OS in which the container is deployed.
Architecture	Architecture details.
Type	Type of node.
Kubelet Version	The version of Kubelet used.
Allocatable Ephemeral Storage(Gi)	Size of ephemeral (temporary) storage available for scheduling, in Gi.
Created Time	Time at which the node was created.
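
The percentage columns in the Node tables (for example, Memory Request(%) and CPU Request(%)) can be understood as a requested amount divided by the node's allocatable amount. The sketch below shows that arithmetic for Kubernetes-style quantity strings such as "512Mi" or "250m". It handles only a subset of suffixes, the helper names are hypothetical, and the exact formula the product uses is an assumption.

```python
from decimal import Decimal

# Subset of Kubernetes quantity suffixes: binary (Ki..Ti) and decimal (k..T).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(value: str) -> Decimal:
    """Convert a quantity such as "512Mi", "2Gi", "250m", or "4" to base units."""
    if value.endswith("m"):  # milli-units, e.g. CPU "250m" is 0.25 cores
        return Decimal(value[:-1]) / 1000
    # Try two-letter suffixes before one-letter ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if value.endswith(suffix):
            return Decimal(value[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(value)


def percent_of_allocatable(requested: str, allocatable: str) -> float:
    """Return requested/allocatable as a percentage (0.0 if allocatable is zero)."""
    alloc = parse_quantity(allocatable)
    if alloc == 0:
        return 0.0
    return float(parse_quantity(requested) / alloc * 100)


if __name__ == "__main__":
    print(percent_of_allocatable("512Mi", "2Gi"))  # memory request share
    print(percent_of_allocatable("500m", "4"))     # CPU request share
```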

Pods

Parameter	Description	Supported in Prometheus
POD DETAILS
Pod Name	Name of the pod.
Pod Namespace	Namespace in which the pod resides
Pod Node Name	Name of the node on which the pod runs.
Pod Application	Name of the pod application.
Pod Type	Type of pod.
Pod created	The means by which the pod was created.
Pod Status	Status of the pod.
Pod Start Time	The start time of the pod.
Pod Created Time	Time at which the pod was created.
TOP 10 PODS BY MEMORY DETAILS
Pods Memory Limit	Maximum limit of memory.
Pods Memory Request	Total number of memory requests.
TOP 10 PODS BY CPU DETAILS
Pods CPU Limit	Maximum limit of CPU.
Pods CPU Request	Total number of CPU requests.
POD MEMORY AND CPU DETAILS
Pod Name	Name of the pod.
Pod Namespace	Namespace of the pod.
Total number of Containers	Total number of containers run by the pod.
Pod CPU Limit(%)	The maximum limit of CPU resource which can be used.
Pod CPU Request(%)	Total CPU requests by pod in %.
Pod Memory Limit(%)	The maximum limit of memory resource that can be used.
Pod Memory Request(%)	Total memory requested in %.
Pod created	The means by which the pod was created.
Pod Persistent Volumes Claim	Name of the Claim through which a pod can access the persistent volume.

Containers

Parameter	Description	Supported in Prometheus
TOP 5 CONTAINERS BY RESTART COUNT
Container Restart Count	Total number of times the container has been restarted.
CONTAINER DETAILS
Container Name	Name of the container.
Container Image	Name of the container image.
Container Pod Name	Name of the container pod.
Container Restart Count	Total number of times the container has been restarted.
Container Status	Status of the container.
Container Start Time	Start time of the container.
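
A ranking like TOP 5 CONTAINERS BY RESTART COUNT could be reproduced from `kubectl get pods --all-namespaces -o json` output, as in the minimal sketch below. It reads the standard `status.containerStatuses[].restartCount` field; the function name is hypothetical and the product's own collection method is not described in this document.

```python
import heapq
from typing import List, Tuple


def top_restarting_containers(pod_list: dict, limit: int = 5) -> List[Tuple[int, str, str]]:
    """Return up to `limit` (restart_count, pod_name, container_name) tuples,
    ordered from most to fewest restarts."""
    rows = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "")
        for cs in pod.get("status", {}).get("containerStatuses", []):
            rows.append((cs.get("restartCount", 0), pod_name, cs.get("name", "")))
    return heapq.nlargest(limit, rows)


if __name__ == "__main__":
    # Hand-written sample standing in for real kubectl output.
    sample = {"items": [
        {"metadata": {"name": "web"},
         "status": {"containerStatuses": [{"name": "nginx", "restartCount": 3}]}},
        {"metadata": {"name": "db"},
         "status": {"containerStatuses": [{"name": "postgres", "restartCount": 7}]}},
    ]}
    for restarts, pod, container in top_restarting_containers(sample):
        print(f"{pod}/{container}: {restarts} restarts")
```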

Services

Parameter	Description	Supported in Prometheus
SERVICE DETAILS
Services Name	Name of the service.
Services Namespace	Name of the Namespace in which the service resides.
Services Application	Name of the Service application.
Service Type	Type of the service.
Cluster IP	Cluster IP Address.
Service Ports	Name of the port that connects with the service.
Service Created Time	Creation time of the service.
DEPLOYMENT DETAILS
Deployment Name	Name of the deployment.
Deployment Namespace	Namespace where the deployment exists.
Deployment Replica Count	Total number of replicas in a deployment.
Running Replica	Total number of Running Pods in a deployment.
Deployment Available Replica Count	Total number of available replicas in a deployment.
Deployment Availability	Availability of the deployment.
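
One plausible reading of Deployment Availability is that a deployment counts as available when its available replicas meet the desired replica count. The sketch below encodes that assumption using the standard `spec.replicas` and `status.availableReplicas` fields of a Deployment object; the rule Applications Manager actually applies is not stated in this document, and the function name is hypothetical.

```python
def deployment_is_available(deployment: dict) -> bool:
    """Return True when available replicas meet or exceed the desired count.

    Kubernetes defaults spec.replicas to 1 and omits
    status.availableReplicas when it is zero.
    """
    desired = deployment.get("spec", {}).get("replicas", 1)
    available = deployment.get("status", {}).get("availableReplicas", 0)
    return available >= desired


if __name__ == "__main__":
    healthy = {"spec": {"replicas": 3}, "status": {"availableReplicas": 3}}
    degraded = {"spec": {"replicas": 3}, "status": {"availableReplicas": 1}}
    print(deployment_is_available(healthy), deployment_is_available(degraded))
```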

Daemonset

Parameter	Description	Supported in Prometheus
DAEMONSET DETAILS
Name	Name of the Daemonset.
Namespace Name	Name of the Namespace where the Daemonset is present.
Desired Replica	Total number of desired Pods. Default value is 1.
Current Replica	Total number of Current Pods.
Running Replica	Total number of Running Pods.
Available Replica	Total number of Available Pods.
Misscheduled Replica	Total number of Misscheduled Pods.

Statefulset

Parameter	Description	Supported in Prometheus
STATEFULSET DETAILS
Name	Name of the Statefulset.
Namespace Name	Name of the Namespace where the Statefulset is present.
Desired Replica	Total number of desired Pods. Default value is 1.
Running Replica	Total number of Running Pods.
Available Replica	Total number of Available Pods.

Replica

Parameter	Description	Supported in Prometheus
REPLICATION CONTROLLER DETAILS
Name	Name of the Replication Controller.
Namespace Name	Name of the Namespace where the Replication Controller is present.
Desired Replica	Total number of desired Pods. Default value is 1.
Running Replica	Total number of Running Pods.
Available Replica	Total number of Available Pods.
REPLICA SET DETAILS
Name	Name of the ReplicaSet.
Namespace Name	Name of the Namespace where the ReplicaSet is present.
Desired Replica	Total number of desired pods. Default value is 1.
Running Replica	Total number of Running Pods.
Available Replica	Total number of Available Pods.

Jobs

Parameter	Description	Supported in Prometheus
CLUSTER JOBS SUMMARY
Total Jobs Count	Total number of Jobs.
Running Jobs Count	Total number of Running Jobs.
Completed Jobs Count	Total number of Completed Jobs.
JOBS DETAILS
Name	Name of the Job.
Namespace Name	Name of the Namespace where the Jobs are present.
Parallelism Replica	Total number of Pod replicas that a job should run in parallel.
Desired Replica	Total number of desired Pods.
Successful Replica	Total number of Pods in successful state.
Job Start Time	The start time of the Job.
Job Completion(Min)	Time taken for job completion (in minutes).
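
A value like Job Completion(Min) can be derived from the `status.startTime` and `status.completionTime` timestamps that Kubernetes records on a Job. The sketch below assumes both are in the usual RFC 3339 UTC form (for example, 2024-01-01T10:00:00Z); the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional

_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def job_completion_minutes(job: dict) -> Optional[float]:
    """Return the job's run time in minutes, or None if it has not completed."""
    status = job.get("status", {})
    start = status.get("startTime")
    end = status.get("completionTime")
    if not start or not end:
        return None
    started = datetime.strptime(start, _TIME_FORMAT).replace(tzinfo=timezone.utc)
    finished = datetime.strptime(end, _TIME_FORMAT).replace(tzinfo=timezone.utc)
    return (finished - started).total_seconds() / 60


if __name__ == "__main__":
    job = {"status": {"startTime": "2024-01-01T10:00:00Z",
                      "completionTime": "2024-01-01T10:01:30Z"}}
    print(job_completion_minutes(job))
```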

Persistent Volumes

Parameter	Description	Supported in Prometheus
PERSISTENT VOLUMES DETAILS
PV Name	Name of the Persistent Volume.
PV Status	Status of the Persistent Volume.
PV Claim	Name of the Persistent Volume Claim.
PV Access Mode	The mode through which you can access the Persistent Volume.
PV Storage Class	Name of the Persistent Volume storage class.
PV Capacity(GiB)	The capacity of the Persistent Volume in GiB.
PV Created Time	Creation time of the Persistent Volume.
PERSISTENT VOLUMES CLAIM DETAILS
PVC Name	Name of the Persistent Volume Claim.
PVC Namespace	Name of the Namespace in which the Claim exists.
PVC Status	Status of the Persistent Volume Claim.
PV Name	Name of the Persistent Volume associated with this Claim.
PVC Access Mode	The mode through which you can access the Persistent Volume.
PVC Storage Class	Name of the Persistent Volume storage class.
PVC Requests(GiB)	Storage requested by the Persistent Volume Claim, in GiB.
PVC Created Time	Creation time of Persistent Volume Claim.

Events

Parameter	Description	Supported in Prometheus
CLUSTER EVENT SUMMARY
Total Event Count	Total number of Events.
Failed Event Count	Total number of Failed Events.
Normal Event Count	Total number of Normal Events.
Warning Event Count	Total number of Warning Events.
EVENT DETAILS
Event Name	Name of the Event.
Event Created Time	The time at which the Event was created.
Event Namespace	Name of the Namespace where the Event is associated.
Event Type	Type of the Event. Possible values: Warning/ Normal/ Failed
Event Kind	Module of the Event. Possible values: Pod/ Node
Involved Object	The module object involved.
Reason	Reason of the Event.
Message	Message of the Event.
Last Updated Time	The latest updated time of the Event.
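
When triaging the EVENT DETAILS table, it often helps to group Warning events by the object they involve. The sketch below does this for the JSON returned by `kubectl get events --all-namespaces -o json`, using the standard `type`, `reason`, and `involvedObject` fields of core/v1 events. The function name is hypothetical, and this grouping is an illustration rather than a feature of the product.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def warnings_by_object(event_list: dict) -> Dict[Tuple[str, str], List[str]]:
    """Map (kind, name) of each involved object to the reasons of its Warning events."""
    grouped = defaultdict(list)
    for event in event_list.get("items", []):
        if event.get("type") != "Warning":
            continue
        obj = event.get("involvedObject", {})
        key = (obj.get("kind", ""), obj.get("name", ""))
        grouped[key].append(event.get("reason", ""))
    return dict(grouped)


if __name__ == "__main__":
    # Hand-written sample standing in for real kubectl output.
    sample = {"items": [
        {"type": "Warning", "reason": "BackOff",
         "involvedObject": {"kind": "Pod", "name": "web-1"}},
        {"type": "Normal", "reason": "Scheduled",
         "involvedObject": {"kind": "Pod", "name": "web-1"}},
    ]}
    print(warnings_by_object(sample))
```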

Service Map

Displays a graphical map view containing namespace and service details.
All namespaces, along with their status and the pod count for each phase, are shown inside the cluster circle.
Green color indicates that the namespace is UP and red color indicates it is DOWN.
The cluster services under a namespace can be seen branching as a tree.
Each service contains its host IP address and port details.
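
The namespace-to-service tree described above can be approximated in text form from `kubectl get services --all-namespaces -o json` output. The sketch below groups services by namespace and lists each service's cluster IP and ports, using the standard `metadata.namespace`, `spec.clusterIP`, and `spec.ports[].port` fields. The function names and the output layout are hypothetical; the graphical map in the product may show different details.

```python
from collections import defaultdict
from typing import Dict, List


def build_service_map(service_list: dict) -> Dict[str, List[dict]]:
    """Group services by namespace, keeping name, cluster IP, and port numbers."""
    tree = defaultdict(list)
    for svc in service_list.get("items", []):
        meta = svc.get("metadata", {})
        spec = svc.get("spec", {})
        tree[meta.get("namespace", "default")].append({
            "name": meta.get("name", ""),
            "cluster_ip": spec.get("clusterIP", ""),
            "ports": [p.get("port") for p in spec.get("ports", [])],
        })
    return dict(tree)


def render_service_map(tree: Dict[str, List[dict]]) -> str:
    """Render the grouped services as an indented text tree."""
    lines = []
    for namespace in sorted(tree):
        lines.append(namespace)
        for svc in tree[namespace]:
            ports = ",".join(str(p) for p in svc["ports"])
            lines.append(f"  - {svc['name']} ({svc['cluster_ip']}:{ports})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-written sample standing in for real kubectl output.
    sample = {"items": [
        {"metadata": {"namespace": "default", "name": "kubernetes"},
         "spec": {"clusterIP": "10.96.0.1", "ports": [{"port": 443}]}},
    ]}
    print(render_service_map(build_service_map(sample)))
```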

Applications Manager Kubernetes Performance Monitoring: Service map of Kubernetes clusters
