# Monitoring GPU with OpManager

From version 12.8.658, NVIDIA GPU monitoring is supported through 12 CLI monitors introduced for monitoring key metrics across GPU performance, power and thermal management, device status, and memory utilization. These monitors give sysadmins, DevOps teams, and AI engineers direct visibility into GPU resource consumption and health on Linux-based infrastructure, where NVIDIA GPUs are most commonly deployed for compute-intensive workloads.

### Associating NVIDIA GPU Monitors at Device Level

To associate an NVIDIA GPU monitor, go to the respective Linux-based device's **Snapshot page → Monitors → Performance Monitors → Actions → Add Performance Monitors**. The monitors are listed in the "NVIDIA GPU Monitors" section. Select the monitors required and click **Add**.

[![GPU Monitoring Support in OpManager](https://www.manageengine.com/network-monitoring/help/images/GPU_Monitors1.png)](https://www.manageengine.com/network-monitoring/help/images/GPU_Monitors1.png)

### Associating NVIDIA GPU Monitors to Multiple Devices

Performance monitors can be associated with multiple devices in bulk from the Settings page. The steps below cover associating the 12 NVIDIA GPU monitors available under the Net-SNMP vendor to one or more devices. Devices are associated with the monitors only when the respective CLI credentials are configured for those devices.

1. Navigate to **Settings → Monitoring → Performance Monitors**.
2. Click the **Associate** button at the top right corner of the page.
3. In the **Associate Monitors** panel, locate the **Vendors** dropdown and select **Net-SNMP**. The monitor list refreshes to display the monitors available under the Net-SNMP vendor.

   [![GPU Monitoring Support in OpManager](https://www.manageengine.com/network-monitoring/help/images/gpu_monitors2.png)](https://www.manageengine.com/network-monitoring/help/images/gpu_monitors2.png)

4. From the filtered list, select the required monitors from the **12 NVIDIA GPU monitors** displayed. To include all 12, select the checkbox at the top of the list and click **Next**.
5. In the **Monitor Association** panel, browse the available devices, select the devices requiring GPU monitors, and move them to the **Selected Devices** section.

   [![GPU Monitoring Support in OpManager](https://www.manageengine.com/network-monitoring/help/images/gpu_monitors3.png)](https://www.manageengine.com/network-monitoring/help/images/gpu_monitors3.png)

6. Click **Apply** to save the configuration.

Associating NVIDIA GPU monitors to devices through Settings in OpManager enables continuous performance tracking across GPU-enabled hosts from a single configuration point. Once associated, the collected metrics are available on the Device Snapshot page, where thresholds can be configured per device, or at the Settings level to apply uniform threshold values across multiple devices simultaneously.
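Because the monitors above depend on working CLI access to the host, it can help to confirm that the GPU and driver are reachable on the Linux device before associating them. The sketch below is illustrative only: OpManager's monitors run their own commands over the configured CLI credentials, and this standalone Python snippet simply reads the same classes of metrics locally through `nvidia-smi` (the query fields used are standard `nvidia-smi --query-gpu` fields; the script itself is not part of OpManager).

```python
# Minimal sketch: read GPU utilization, memory utilization, temperature,
# power draw, and fan speed via nvidia-smi on the monitored Linux host.
# This mirrors the metric categories listed in the table below but is an
# independent check, not the command OpManager itself executes.
import subprocess

FIELDS = "utilization.gpu,utilization.memory,temperature.gpu,power.draw,fan.speed"

def read_gpu_metrics():
    """Return one dict of metric name -> raw string value per detected GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    names = FIELDS.split(",")
    return [
        dict(zip(names, (value.strip() for value in line.split(","))))
        for line in out.strip().splitlines()
    ]

if __name__ == "__main__":
    for index, gpu in enumerate(read_gpu_metrics()):
        print(f"GPU {index}: {gpu}")
```

If this command fails or returns no rows, the device either has no NVIDIA driver installed or `nvidia-smi` is not on the PATH for the CLI user, and the associated monitors will not be able to collect data.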
### NVIDIA GPU Monitors (Linux-Based)

| Category | Monitor | Description |
|---|---|---|
| GPU Performance | NVIDIA GPU Utilization | Percentage of GPU resources engaged in processing workloads |
| GPU Performance | NVIDIA GPU Memory Utilization | Percentage of the total GPU memory (VRAM) that is actively allocated and being used by running applications and processes |
| GPU Performance | NVIDIA GPU Clock Speed in Percent | Percentage of GPU core clock speed relative to its maximum rated frequency |
| GPU Performance | NVIDIA GPU Memory Clock Speed in Percent | Percentage of GPU memory clock speed relative to its maximum rated frequency |
| GPU Status | NVIDIA GPU Availability | Indicates whether the NVIDIA GPU is currently available and accessible to the system |
| GPU Status | NVIDIA GPU Compute Mode | Indicates whether the NVIDIA GPU compute mode is set to a specific configuration |
| GPU Status | NVIDIA GPU Display Status | Indicates whether the GPU's display output is currently active or inactive |
| GPU Status | NVIDIA GPU Persistence Mode | Indicates whether NVIDIA GPU Persistence Mode is enabled, allowing the GPU to remain initialized between sessions and reducing initialization delay |
| GPU Power / Thermal Management | NVIDIA GPU Temperature in Celsius | Current operating temperature of the GPU in degrees Celsius |
| GPU Power / Thermal Management | NVIDIA GPU Power Draw in Watts | Amount of electrical power currently consumed by the GPU, measured in watts |
| GPU Power / Thermal Management | NVIDIA GPU Power Draw in Percent | Percentage of GPU power usage relative to its maximum rated power limit |
| GPU Power / Thermal Management | NVIDIA GPU Fan Speed in Percent | Percentage of GPU fan speed relative to its maximum rated speed |
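Thresholds for these monitors are configured in OpManager itself, either per device on the Snapshot page or uniformly from Settings. The sketch below only illustrates the idea of evaluating collected values against limits; the metric names match the `nvidia-smi` fields used in the earlier snippet, and the numeric limits are hypothetical examples, not OpManager defaults.

```python
# Hypothetical threshold check: flag metrics that exceed example limits.
# The limits below are illustrative placeholders, not product defaults.
SAMPLE_THRESHOLDS = {
    "utilization.gpu": 90.0,     # percent
    "utilization.memory": 85.0,  # percent
    "temperature.gpu": 80.0,     # degrees Celsius
    "power.draw": 300.0,         # watts
}

def violations(metrics: dict) -> dict:
    """Return the subset of metrics whose values exceed their sample limits."""
    exceeded = {}
    for name, limit in SAMPLE_THRESHOLDS.items():
        try:
            value = float(metrics.get(name, "nan"))
        except ValueError:
            continue  # nvidia-smi may report "[Not Supported]" on some GPUs
        if value > limit:
            exceeded[name] = (value, limit)
    return exceeded

# Example: a host running hot and near full utilization.
print(violations({"utilization.gpu": "97", "temperature.gpu": "83", "power.draw": "250"}))
```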