Kubernetes Performance Monitoring



Kubernetes - An Overview

Kubernetes (or k8s) is an open-source container orchestration system for automating deployment, scaling and management of application containers across clusters of hosts. Kubernetes clusters can span hosts across public, private, or hybrid clouds. K8s orchestration allows users to build application services across multiple containers, schedule those containers across a cluster, scale those containers, and manage the health of those containers over time.

Monitor all your Kubernetes performance workloads using a single tool

Applications Manager's Kubernetes monitoring tool lets administrators adapt their monitoring strategies to account for the new infrastructure layers that containers and container orchestration introduce in a distributed Kubernetes environment.

  • Auto-discover the parts and map relationships between objects in the cluster - Kubernetes nodes, namespaces, deployments, replica sets, pods, and containers.
  • Track the capacity and resource utilization of your cluster and be able to drill into specific parts of the cluster.
  • Identify whether you have enough nodes in your cluster and whether the resource allocations to existing nodes are sufficient for deployed applications.
  • Ensure all nodes on the cluster are healthy - monitor the CPU and memory for Kubernetes nodes (workers and masters).
  • Ensure all desired pods in a deployment are running and not in a restart loop.
  • Set up alerts for container restarts to identify issues with either a container or its host that affect the performance of your applications.
  • Monitor the performance outliers of the Kubernetes-hosted applications running inside your cluster and track down any individual errors.
  • View the status of Kubernetes Master and Node components: API Server, the etcd key/value store, Scheduler, and Controller.
  • Monitor crucial Kubernetes performance metrics to predict future resource requirements and ensure that your cluster has sufficient capacity to handle potential workload spikes.
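
As an illustration of the restart-loop check described in the list above, here is a minimal Python sketch. It is not how Applications Manager implements the check; it assumes the input is the parsed JSON output of `kubectl get pods --all-namespaces -o json`, and the function name `find_restart_loops` and the `max_restarts` threshold are hypothetical.

```python
from typing import Any


def find_restart_loops(pod_list: dict[str, Any], max_restarts: int = 5) -> list[tuple[str, str, str, int]]:
    """Return (namespace, pod, container, restart_count) for suspicious containers.

    A container is flagged when it is waiting with reason CrashLoopBackOff
    or when its restart count exceeds max_restarts (a hypothetical threshold).
    """
    flagged = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            restarts = cs.get("restartCount", 0)
            waiting = cs.get("state", {}).get("waiting") or {}
            crash_looping = waiting.get("reason") == "CrashLoopBackOff"
            if crash_looping or restarts > max_restarts:
                flagged.append(
                    (meta.get("namespace", ""), meta.get("name", ""), cs.get("name", ""), restarts)
                )
    return flagged
```

Feeding the function the output of `json.loads(...)` on the kubectl result yields a list that could drive an alert; an empty list means no container matched either condition.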

Note: In the Kubernetes cluster architecture, it is sufficient to add only the primary master node to Applications Manager. Applications Manager automatically discovers all the other master and worker nodes within the cluster and monitors them. There is no need to add each node individually as a Kubernetes performance monitor; doing so will lead to performance issues.

Discover more with a Kubernetes performance monitor

Prerequisites for setting up Kubernetes performance monitor: kubectl should be installed on the machine where Kubernetes is installed.

Using the REST API to add a new Kubernetes performance monitor: Click here

Follow the steps given below to create a new Kubernetes monitor:

  1. Click the New Monitor link.
  2. Select Kubernetes under Virtualization category.
  3. Specify the Display Name of the Kubernetes Server.
  4. Enter the Cluster hostname/IP address of the server where Kubernetes is running.
  5. Enter the credential details like user name and password for authentication, or select the required credentials from the Credential Manager list after enabling the Select from Credential list option.
  6. Check the box to enable Public Key Authentication (supported for SSH2 only) and provide the SSH key for SSH authentication.
  7. Specify the command prompt value, which is the last character in your command prompt. Default value is $ and possible values are >, #, etc.
  8. Enter the SSH port. Default SSH port used is 22.
  9. Enable the Monitor Specific Namespace(s) option if you wish to monitor only specific namespace(s) in the Kubernetes environment. After enabling, specify the following details:
    • Filter Condition: Select the filtering condition to include or exclude monitoring of specific namespace(s) in the Kubernetes environment.
    • Namespace Name(s): Specify the name of the namespace(s) to be included/excluded while monitoring. You can enter multiple namespaces as comma-separated values.
  10. Check the Enable Event Log Monitoring box to monitor Event Log details.
  11. Specify the Polling Interval in minutes.
  12. From the combo box, choose the Monitor Group with which you want to associate the Kubernetes monitor (optional). You can associate the monitor with multiple groups.
  13. Click Add Monitor(s). This discovers the Kubernetes cluster from the network and starts monitoring it.
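
The namespace filter in step 9 can be pictured with a short Python sketch. This is only an illustration of include/exclude filtering over a comma-separated list; the function name `select_namespaces` and the condition values `"include"` and `"exclude"` are assumptions, not Applications Manager identifiers.

```python
def select_namespaces(discovered: list[str], names_csv: str, condition: str) -> list[str]:
    """Apply an include/exclude filter to the namespaces found in a cluster.

    discovered: namespace names found in the cluster.
    names_csv:  comma-separated namespace names, as typed into the form.
    condition:  "include" or "exclude" (hypothetical values for this sketch).
    """
    # Split on commas and ignore surrounding whitespace and empty entries.
    wanted = {name.strip() for name in names_csv.split(",") if name.strip()}
    if condition == "include":
        return [ns for ns in discovered if ns in wanted]
    if condition == "exclude":
        return [ns for ns in discovered if ns not in wanted]
    raise ValueError(f"unknown filter condition: {condition!r}")
```

For example, excluding `kube-system, shop` from a cluster whose namespaces are `default`, `kube-system`, and `shop` leaves only `default` to be monitored.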

Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. Click on Kubernetes under the Virtualization table. Displayed is the Kubernetes bulk configuration view distributed into three tabs:

  • Availability tab gives the Availability history for the past 24 hours or 30 days.
  • Performance tab gives the Health Status and events for the past 24 hours or 30 days.
  • List view enables you to perform bulk admin configurations.

On clicking a monitor from the list, you'll be taken to the Kubernetes performance monitor dashboard. It has thirteen tabs:

Overview

Parameter | Description | Supported in SSH mode | Supported in Prometheus
CLUSTER USAGE DETAILS
Average Cluster CPU Usage | Average CPU used by the cluster.
Average Cluster Memory Usage | Average memory used by the cluster.
CLUSTER DETAILS
Control Plane | Control Plane URL.
Git Version | Git version used in Kubernetes.
Build Date | Build date.
Compiler | Name of the compiler.
Platform | Version of the Platform.
CLUSTER SUMMARY
Namespace Count | Total number of Namespaces.
Service Count | Total number of Services.
Deployment Count | Total number of Deployments.
Daemonset Count | Total number of Daemonsets.
Statefulset Count | Total number of Statefulsets.
Total Jobs Count | Total number of Jobs.
Replication Controller Count | Total number of Replication Controllers.
Replica Set Count | Total number of Replica Sets.
Ingress Count | Total number of Ingresses.
COMPONENT DETAILS
Component Name | Name of the component.
Status | Status of the component.
Component Message | Root cause message of the component.

Namespace

Parameter | Description | Supported in SSH mode | Supported in Prometheus
NAMESPACE DETAILS
Namespace Name | Name of the Namespace.
Resource Version | The version number of the Namespace.
Namespace CPU Usage (%) | Percentage of CPU resources consumed by all pods running within the Namespace.
Namespace Memory Usage (%) | Percentage of memory resources utilized by all pods running within the Namespace.
Namespace Availability | Availability of the Namespace.
Namespace Created Time | Time at which the Namespace was created.
NAMESPACE PODS USAGE DETAILS
Namespace Name | Name of the Namespace.
Total Pods Count | Total number of Pods present in the Namespace.
Running Pods Count | Total number of Running Pods present in the Namespace.
Succeeded Pods Count | Total number of Succeeded Pods present in the Namespace.
Pending Pods Count | Total number of Pending Pods present in the Namespace.
Failed Pods Count | Total number of Failed Pods present in the Namespace.
Unknown Pods Count | Total number of Unknown Pods present in the Namespace.
NAMESPACE MEMORY & CPU DETAILS
Namespace Name | Name of the Namespace.
Namespace CPU Limit | Total CPU limit allocated to all workloads within the Namespace.
Namespace CPU Request | Total CPU requested by all pods running within the Namespace.
Namespace Memory Limit (Gi) | Total memory limit allocated to all workloads within the Namespace, measured in GiB.
Namespace Memory Request (Gi) | Total memory requested by all pods running within the Namespace, measured in GiB.
Namespace Network Received (Gi) | Total amount of network data received by all pods within the Namespace, measured in GiB.
Namespace Network Transmitted (Gi) | Total amount of network data transmitted by all pods within the Namespace, measured in GiB.
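
The per-phase pod counts in the Namespace Pods Usage Details group correspond to the standard Kubernetes pod phases (Pending, Running, Succeeded, Failed, Unknown). The following Python sketch shows one way such counts could be derived from the parsed output of `kubectl get pods --all-namespaces -o json`; the function name `count_pod_phases` is hypothetical and the product's own method is not documented here.

```python
from collections import Counter
from typing import Any

# The five pod phases defined by Kubernetes.
POD_PHASES = ("Pending", "Running", "Succeeded", "Failed", "Unknown")


def count_pod_phases(pod_list: dict[str, Any], namespace: str) -> dict[str, int]:
    """Count the pods of one namespace by phase and add an overall total."""
    counts: Counter = Counter()
    for pod in pod_list.get("items", []):
        if pod.get("metadata", {}).get("namespace") != namespace:
            continue
        # Treat a missing phase as Unknown.
        counts[pod.get("status", {}).get("phase", "Unknown")] += 1
    summary = {phase: counts.get(phase, 0) for phase in POD_PHASES}
    summary["Total"] = sum(counts.values())
    return summary
```

Each key of the returned dictionary lines up with one of the count columns listed above.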

Node

Parameter | Description | Supported in SSH mode | Supported in Prometheus
TOP 5 NODES BY MEMORY DETAILS
Memory Limit | Maximum limit of Node memory in GiB.
Memory Requests | Total number of memory requests.
TOP 5 NODES BY CPU DETAILS
CPU Limit | Maximum limit of CPU.
CPU Request | Total number of CPU requests.
NODE MEMORY AND CPU DETAILS
Name | Name of the node.
Allocatable Memory (Gi) | The memory resources of a node that are available for scheduling, measured in Gi.
Memory Limit (%) | The maximum limit of memory resource which can be used.
Memory Request (%) | Total memory requests in percentage.
Allocatable CPU Processor Count | Total number of CPU processors that are available for scheduling.
CPU Limit (%) | The maximum limit of CPU resource which can be used.
CPU Request (%) | Total CPU requests in percentage.
Node Disk Usage (%) | Percentage of disk space currently used on the node.
Allocatable Ephemeral Storage (Gi) | The amount of ephemeral storage available on the node for pod scheduling, measured in Gi.
NODE POD DETAILS
Name | Name of the pod.
Pod Usage Details | Total number of pods available, split into used and free pods.
Kube-system Pod Count | Total number of kube-system pods.
Non-Kube-system Pod Count | Total number of non-kube-system pods.
Image Count | Total number of images in the node.
Used Pod Count | Total number of pods present in Kubernetes.
Allocatable Pod Count | Total number of pods that are available.
Pod Utilization (%) | Percentage of allocatable pods that are currently in use.
NODE DETAILS
Name | Name of the node.
Hostname | The hostname of the node.
Internal IP | The Internal IP address of the node.
OSImage | Name of the OSImage.
OS | Name of the OS in which the container is deployed.
Architecture | Architecture details.
Type | Type of node.
Kubelet Version | The version of Kubelet used.
Allocatable Ephemeral Storage (Gi) | Size of temporary memory available in Gi.
Created Time | Time at which the node was created.
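
Several node metrics above are sizes in Gi or percentages of an allocatable amount. As a rough sketch, assuming node values arrive as Kubernetes quantity strings such as `16Gi` or `524288Ki`, the conversions could look like this in Python. The helper names are hypothetical, and only the common suffixes are handled.

```python
# Binary and decimal suffixes used by Kubernetes resource quantities.
# Only the common ones are handled in this sketch.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def quantity_to_bytes(quantity: str) -> float:
    """Convert a quantity string such as '16Gi' or '500M' to bytes."""
    q = quantity.strip()
    # Check two-character binary suffixes before one-character decimal ones.
    for suffix, factor in _BINARY.items():
        if q.endswith(suffix):
            return float(q[: -len(suffix)]) * factor
    for suffix, factor in _DECIMAL.items():
        if q.endswith(suffix):
            return float(q[: -len(suffix)]) * factor
    return float(q)  # a plain number is already in bytes


def to_gib(quantity: str) -> float:
    """Express a quantity string in GiB."""
    return quantity_to_bytes(quantity) / 2**30


def percent_of(part: float, whole: float) -> float:
    """Return part as a percentage of whole, rounded to two decimals."""
    return 0.0 if whole == 0 else round(100.0 * part / whole, 2)
```

With these helpers, a node with `16Gi` allocatable memory and `4Gi` of requests has a memory request of `percent_of(to_gib("4Gi"), to_gib("16Gi"))`, and pod utilization follows the same pattern with used and allocatable pod counts.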

Pods

Parameter | Description | Supported in SSH mode | Supported in Prometheus
POD DETAILS
Pod Name | Name of the pod.
Project Name | Name of the project in which the pod is created.
Pod Namespace | Namespace in which the pod resides.
Pod Node Name | Name of the node in which the pod resides.
Number of Containers | Total number of containers running in the pod.
Pod Type | Type of the pod.
Pod IP | IP address of the pod.
Pod Status | Status of the pod.
Pod Start Time | Time at which the pod was started.
Pod Created Time | Time at which the pod was created.
Pod Persistent Volume Claims | Number of Persistent Volume Claims associated with the pod.
TOP PODS BY CPU DETAILS
Top Pods by CPU Usage | Graph showing pods with highest CPU usage.
Top Pods by CPU Throttled | Graph showing pods experiencing CPU throttling.
POD MEMORY DETAILS
Pods Memory Request | Total memory requested by the pod.
POD MEMORY AND CPU DETAILS
Pod CPU Limit | CPU limit configured for the pod.
Pod CPU Request | CPU requested by the pod.
CPU Usage | Actual CPU consumed by the pod.
CPU Throttled | CPU throttling experienced due to limit enforcement.
Memory Limit | Memory limit configured for the pod.
Memory Request | Memory requested by the pod.
Memory Used | Actual memory consumed by the pod.
Disk Used | Disk space used by the pod.
Network Rx | Data received by the pod.
Network Tx | Data transmitted by the pod.

Containers

Parameter | Description | Supported in SSH mode | Supported in Prometheus
TOP 5 CONTAINERS BY RESTART COUNT
Container Restart Count | Total number of times the container has been restarted.
CONTAINER DETAILS
Container Name | Name of the container.
Container Image | Image of the container.
Pod Name | Name of the pod which hosts the container.
Container Status | Status of the container.
Container Restart Count | Number of times the container was restarted.
Container Start Time | Time at which the container was started.
TOP CONTAINERS BY CPU DETAILS
Top Containers by CPU Usage | Graph of top containers by CPU usage.
Top Containers by CPU Throttled | Graph of top containers by CPU throttling.
CONTAINER CPU DETAILS
CPU Limit | CPU limit configured for the container.
CPU Request | CPU requested by the container.
CPU Usage | Actual CPU consumed by the container.
CPU Throttled | CPU throttling experienced by the container.
CONTAINER MEMORY DETAILS
Container Memory Limit (Gi) | Memory limit configured for the container.
Container Memory Request (Gi) | Memory requested by the container.
Container Memory Usage (%) | Actual memory consumed by the container.
CONTAINER DISK & NETWORK DETAILS
Disk Used | Disk space used by the container.
Container Network Received (Gi) | Data received by the container.
Container Network Transmitted (Gi) | Data transmitted by the container.
Top Containers by Network Received | Graph of top containers by data received.
Top Containers by Network Transmitted | Graph of top containers by data transmitted.
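
The page does not state how CPU Throttled is calculated. One common approach in Prometheus-based setups, shown here purely as an assumption, is to compare two samples of the cAdvisor counters `container_cpu_cfs_throttled_periods_total` and `container_cpu_cfs_periods_total` and report the share of CPU periods that were throttled. The function name `throttled_percent` is hypothetical.

```python
def throttled_percent(
    periods_before: float,
    periods_after: float,
    throttled_before: float,
    throttled_after: float,
) -> float:
    """Share of CFS periods that were throttled between two counter samples.

    The *_before and *_after arguments are successive readings of the
    periods counter and the throttled-periods counter for one container.
    """
    periods = periods_after - periods_before
    throttled = throttled_after - throttled_before
    # No elapsed periods, or a counter reset, gives no meaningful ratio.
    if periods <= 0 or throttled < 0:
        return 0.0
    return round(100.0 * throttled / periods, 2)
```

A container whose period counter moved from 1000 to 1600 while its throttled counter moved from 50 to 200 was throttled in 150 of 600 periods.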

Services

Parameter | Description | Supported in SSH mode | Supported in Prometheus
SERVICE DETAILS
Services Name | Name of the service.
Services Namespace | Name of the Namespace in which the service resides.
Services Application | Name of the Service application.
Service Type | Type of the service.
Cluster IP | Cluster IP Address.
Service Ports | Name of the port that connects with the service.
Service Created Time | Creation time of the service.
DEPLOYMENT DETAILS
Deployment Name | Name of the deployment.
Deployment Namespace | Namespace where the deployment exists.
Deployment Replica Count | Total number of replicas in a deployment.
Running Replica | Total number of Running Pods in a deployment.
Deployment Available Replica Count | Total number of available replicas in a deployment.
Deployment Availability | Availability of the deployment.
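
To make the deployment replica fields concrete, the sketch below reads them from one Deployment object as returned by `kubectl get deployments -o json`. Mapping Running Replica to `status.readyReplicas` and treating a deployment as fully available when available replicas reach the desired count are assumptions of this example, not documented product behavior; `deployment_summary` is a hypothetical name.

```python
from typing import Any


def deployment_summary(deployment: dict[str, Any]) -> dict[str, Any]:
    """Summarize replica counts for a single Deployment object."""
    meta = deployment.get("metadata", {})
    status = deployment.get("status", {})
    # Kubernetes defaults spec.replicas to 1 when it is not set.
    desired = deployment.get("spec", {}).get("replicas", 1)
    # Status counters are omitted from the object when they are zero.
    ready = status.get("readyReplicas", 0)
    available = status.get("availableReplicas", 0)
    return {
        "name": meta.get("name", ""),
        "namespace": meta.get("namespace", ""),
        "desired": desired,
        "ready": ready,
        "available": available,
        "fully_available": available >= desired,
    }
```

A deployment with three desired replicas and two available ones would report `fully_available` as `False`, which is the kind of condition an availability alert could key on.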

Daemonset

Parameter | Description | Supported in SSH mode | Supported in Prometheus
DAEMONSET DETAILS
Name | Name of the Daemonset.
Namespace Name | Name of the Namespace where the Daemonset is present.
Desired Replica | Total number of desired Pods. Default value is 1.
Current Replica | Total number of Current Pods.
Running Replica | Total number of Running Pods.
Available Replica | Total number of Available Pods.
Misscheduled Replica | Total number of Misscheduled Pods.

Statefulset

Parameter | Description | Supported in SSH mode | Supported in Prometheus
STATEFULSET DETAILS
Name | Name of the Statefulset.
Namespace Name | Name of the Namespace where the Statefulset is present.
Desired Replica | Total number of desired Pods. Default value is 1.
Running Replica | Total number of Running Pods.
Available Replica | Total number of Available Pods.

Replica

Parameter | Description | Supported in SSH mode | Supported in Prometheus
REPLICATION CONTROLLER DETAILS
Name | Name of the Replication Controller.
Namespace Name | Name of the Namespace where the Replication Controller is present.
Desired Replica | Total number of desired Pods. Default value is 1.
Running Replica | Total number of Running Pods.
Available Replica | Total number of Available Pods.
REPLICA SET DETAILS
Name | Name of the ReplicaSet.
Namespace Name | Name of the Namespace where the ReplicaSet is present.
Desired Replica | Total number of desired Pods. Default value is 1.
Running Replica | Total number of Running Pods.
Available Replica | Total number of Available Pods.

Jobs

Parameter | Description | Supported in SSH mode | Supported in Prometheus
CLUSTER JOBS SUMMARY
Total Jobs Count | Total number of Jobs.
Running Jobs Count | Total number of Running Jobs.
Completed Jobs Count | Total number of Completed Jobs.
JOBS DETAILS
Name | Name of the Job.
Namespace Name | Name of the Namespace where the Job is present.
Parallelism Replica | Total number of Pod replicas that a Job should run in parallel.
Desired Replica | Total number of desired Pods.
Successful Replica | Total number of Pods in successful state.
Job Start Time | The start time of the Job.
Job Completion (Min) | Time taken for job completion (in minutes).
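
Job Completion (Min) is a duration, so it can be pictured as the gap between a Job's start and completion timestamps. The Python sketch below assumes those come from `status.startTime` and `status.completionTime` of a Job object in the usual `YYYY-MM-DDTHH:MM:SSZ` form; the function name `job_completion_minutes` is hypothetical and the product may compute the value differently.

```python
from datetime import datetime, timezone
from typing import Any, Optional

_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def _parse_utc(timestamp: str) -> datetime:
    """Parse a UTC timestamp such as '2024-05-01T10:00:00Z'."""
    return datetime.strptime(timestamp, _TIME_FORMAT).replace(tzinfo=timezone.utc)


def job_completion_minutes(job: dict[str, Any]) -> Optional[float]:
    """Minutes between a Job's start and completion, or None if unfinished."""
    status = job.get("status", {})
    start = status.get("startTime")
    end = status.get("completionTime")
    if not start or not end:
        return None  # the Job has not started or has not completed yet
    elapsed = _parse_utc(end) - _parse_utc(start)
    return round(elapsed.total_seconds() / 60, 2)
```

Returning `None` for a Job that is still running keeps an unfinished Job from being reported as a zero-minute completion.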

Persistent Volumes

Parameter | Description | Supported in SSH mode | Supported in Prometheus
PERSISTENT VOLUMES DETAILS
PV Name | Name of the Persistent Volume.
PV Status | Status of the Persistent Volume.
PV Claim | Name of the Persistent Volume Claim.
PV Access Mode | The mode through which you can access the Persistent Volume.
PV Storage Class | Name of the Persistent Volume storage class.
PV Capacity (GiB) | The capacity of the Persistent Volume in GiB.
PV Created Time | Creation time of the Persistent Volume.
PERSISTENT VOLUMES CLAIM DETAILS
PVC Name | Name of the Persistent Volume Claim.
PVC Namespace | Name of the Namespace in which the Claim exists.
PVC Status | Status of the Persistent Volume Claim.
PV Name | Name of the Persistent Volume associated with this Claim.
PVC Access Mode | The mode through which you can access the Persistent Volume.
PVC Storage Class | Name of the Persistent Volume storage class.
PVC Requests (GiB) | Total number of Persistent Volume Claim requests in GiB.
PVC Created Time | Creation time of Persistent Volume Claim.
PVC Usage | Storage space currently used by the Persistent Volume Claim.
PVC Free Space | Available free storage in the Persistent Volume Claim.
PVC Usage (%) | Percentage of storage utilized in the Persistent Volume Claim.

Events

Parameter | Description | Supported in SSH mode | Supported in Prometheus
CLUSTER EVENT SUMMARY
Total Event Count | Total number of Events.
Failed Event Count | Total number of Failed Events.
Normal Event Count | Total number of Normal Events.
Warning Event Count | Total number of Warning Events.
EVENT DETAILS
Event Name | Name of the Event.
Event Created Time | The time at which the Event was created.
Event Namespace | Name of the Namespace where the Event is associated.
Event Type | Type of the Event. Possible values: Warning/Normal/Failed
Event Kind | Module of the Event. Possible values: Pod/Node
Involved Object | The module object involved.
Reason | Reason of the Event.
Message | Message of the Event.
Last Updated Time | The latest updated time of the Event.

Service Map

  • Displays a graphical map view containing namespace and service details.
  • All namespaces, with their status and pod count for each phase, are shown inside the cluster circle.
  • Green color indicates that the namespace is UP and red color indicates it is DOWN.
  • The cluster services under a namespace can be seen branching as a tree.
  • Each service contains its host IP address and port details.
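
The tree that the service map draws, with services branching from their namespace, can be sketched as a simple grouping step. The Python example below assumes the input is the parsed output of `kubectl get services --all-namespaces -o json` and uses `spec.clusterIP` as the service address, which is an assumption about what the map displays; `build_service_tree` is a hypothetical name.

```python
from collections import defaultdict
from typing import Any


def build_service_tree(service_list: dict[str, Any]) -> dict[str, list[dict[str, Any]]]:
    """Group services under their namespace with IP and port details."""
    tree: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for svc in service_list.get("items", []):
        meta = svc.get("metadata", {})
        spec = svc.get("spec", {})
        tree[meta.get("namespace", "default")].append(
            {
                "name": meta.get("name", ""),
                "cluster_ip": spec.get("clusterIP"),
                # Each entry in spec.ports carries a numeric 'port' field.
                "ports": [p.get("port") for p in spec.get("ports", [])],
            }
        )
    return dict(tree)
```

Each key of the result is a namespace node and each list entry is one service branch beneath it.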

Applications Manager Kubernetes Performance Monitoring: Service map of Kubernetes clusters
