Adaptive Thresholds in OpManager

Adaptive thresholds help users receive more relevant alerts by dynamically adjusting the threshold values of critical monitors using OpManager's Machine Learning-based predictive algorithms. This eliminates the need to decide thresholds manually: OpManager fully automates the process of studying complex datasets and arriving at feasible threshold values for each monitor.

Here's how OpManager's Adaptive Thresholds help simplify the process of determining threshold values:

  1. Once Adaptive Thresholds are enabled, OpManager collects the necessary performance data from all the supported monitors and feeds it into our predictive algorithms.
  2. These algorithms read the patterns in the recorded data and generate relevant threshold values, taking every recorded value and pattern into consideration.
  3. Once OpManager has at least three days of data for the concerned monitor(s), it finalizes the data pattern and starts applying the forecast threshold values to the relevant monitors.

When Adaptive Thresholds are enabled, OpManager asks the user for "deviation values", which determine how far the polled value can deviate from the forecast before an alert is raised. Because OpManager has three alert levels, three deviation values are collected as percentages, in increasing order: Attention, Trouble and Critical.
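The ordering rule for deviation values can be sketched as a small check. This is only an illustration: the function name `validate_deviations` is hypothetical, and treating the values as strictly increasing and positive is an assumption, since the text above only says they are entered "in increasing order".

```python
def validate_deviations(attention, trouble, critical):
    """Check three deviation values before they are used.

    Hypothetical helper, not part of OpManager. Assumes the values must
    be positive and strictly increasing (Attention < Trouble < Critical).
    """
    values = (attention, trouble, critical)
    if any(v <= 0 for v in values):
        raise ValueError("deviation values must be positive")
    if not (attention < trouble < critical):
        raise ValueError(
            "deviation values must increase from Attention to Critical"
        )
    return values


# Values in increasing order are accepted unchanged.
print(validate_deviations(5, 8, 15))  # -> (5, 8, 15)
```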

How are thresholds calculated in Adaptive Threshold mode?

For each hour, OpManager's predictive algorithms provide a Forecast value based on previously observed data patterns and behavior, and the deviation values configured by the user are applied based on that value. For example, consider the following deviation values.

Attention | Trouble | Critical
5         | 8       | 15

If the forecast value for the CPU utilization of a device is 34 for the first hour of the day (0:00 - 1:00), then the value at which an alert with criticality "Attention" is raised would be 34 + 5 = 39 (Forecast + Attention deviation). The Trouble and Critical values are calculated in the same way every hour. For five consecutive hours with different forecast values, the calculated values would be as follows:

Hour of day | Forecast value | Attention value | Trouble value | Critical value
0:00 - 1:00 | 34             | 39              | 42            | 49
1:00 - 2:00 | 36             | 41              | 44            | 51
2:00 - 3:00 | 44             | 49              | 52            | 59
3:00 - 4:00 | 58             | 63              | 66            | 73
4:00 - 5:00 | 54             | 59              | 62            | 69
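The arithmetic in the table can be reproduced with a short sketch. The names `Deviations`, `hourly_thresholds` and `severity` are hypothetical and do not come from OpManager; the forecast values are simply the ones from the example above. The `>=` comparison used to pick an alert level is an assumption, since the text does not say whether a polled value equal to a threshold raises the alert.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deviations:
    """User-configured deviation values (hypothetical container)."""
    attention: int
    trouble: int
    critical: int


def hourly_thresholds(forecast, dev):
    """Add each deviation to the hour's forecast, as in the example table."""
    return {
        "attention": forecast + dev.attention,
        "trouble": forecast + dev.trouble,
        "critical": forecast + dev.critical,
    }


def severity(polled, forecast, dev):
    """Return the highest alert level reached by a polled value.

    Assumption: an alert is raised when the polled value is greater
    than or equal to the calculated threshold.
    """
    thresholds = hourly_thresholds(forecast, dev)
    for level in ("critical", "trouble", "attention"):
        if polled >= thresholds[level]:
            return level
    return "clear"


dev = Deviations(attention=5, trouble=8, critical=15)

# Forecast values for the five hours shown in the table.
for forecast in (34, 36, 44, 58, 54):
    print(forecast, hourly_thresholds(forecast, dev))

print(hourly_thresholds(34, dev))
# -> {'attention': 39, 'trouble': 42, 'critical': 49}
```

Under these assumptions, a CPU utilization of 43 polled during the first hour (forecast 34) would fall in the Trouble range, because it is at or above 42 but below 49.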

Enabling Adaptive Thresholds

Before enabling the Adaptive Thresholds option, note that:

  1. This feature is currently available only for CPU Utilization, Memory Utilization and Response Time monitors in OpManager. We will be progressively rolling out support for other monitors soon.
  2. OpManager will require a minimum of three days of performance data to successfully establish data patterns and implement a model. If you are adding a new device and want to start monitoring it right away, you can use the manual thresholds during this period.
  3. The Adaptive Thresholds feature must first be enabled globally before it becomes available as an option on the other pages. If it is disabled globally, only manual thresholds can be configured throughout OpManager. Also, once it is enabled globally, all devices discovered with the supported monitors will have Adaptive Thresholds enabled for those monitors by default.
  4. Whenever Adaptive Thresholds is disabled for a monitor at any level, the threshold values for that monitor revert to its last configured manual threshold values (if any were previously configured).
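The notes above can be read as a simple decision rule, sketched below. This is not OpManager's implementation: the function `threshold_mode` and its parameters are hypothetical, and the sketch assumes that manual thresholds (when configured) stay in effect until three days of data are available.

```python
# Monitors that support Adaptive Thresholds, per the list above.
SUPPORTED_MONITORS = {"CPU Utilization", "Memory Utilization", "Response Time"}
MIN_DAYS_OF_DATA = 3


def threshold_mode(monitor, global_enabled, monitor_enabled,
                   days_of_data, has_manual):
    """Return which thresholds would apply: 'adaptive', 'manual' or 'none'.

    Hypothetical helper that only illustrates the notes above.
    """
    adaptive_selected = (
        monitor in SUPPORTED_MONITORS
        and global_enabled      # must be enabled globally first
        and monitor_enabled     # and not disabled for this monitor
    )
    if adaptive_selected and days_of_data >= MIN_DAYS_OF_DATA:
        return "adaptive"
    # Assumption: fall back to manual thresholds while the model is
    # still learning, or when Adaptive Thresholds is disabled.
    if has_manual:
        return "manual"
    return "none"


# A newly added device with one day of data still relies on manual thresholds.
print(threshold_mode("CPU Utilization", True, True, 1, True))  # -> manual
```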

Adaptive Thresholds for the applicable monitors (CPU Utilization, Memory Utilization and Response Time) can be enabled globally across OpManager from Settings -> Monitoring -> Monitor Settings. Navigate to this page, select the "Adaptive Thresholds" tab, enable the checkbox and click 'Save'. To enable only some monitor types at the global level, use the switch next to each monitor type to turn on the ones you need and leave the others turned off.

Setting Adaptive Thresholds in OpManager

Once Adaptive Thresholds has been enabled globally, it can be controlled at various levels based on your requirements:

  1. Enabling on a monitor level across OpManager:

    • You can enable Adaptive Thresholds for a particular monitor used across OpManager. Go to the Performance Monitors page under Settings -> Monitoring, locate the monitor you wish to enable it for, and click Edit.
    • Enable the Adaptive Thresholds option and click OK to save it.
  2. Enabling via Device Templates:

    • You can also enable Adaptive Thresholds for monitors from Device Templates, in a process similar to configuring it on a monitor level as above.
    • Go to Settings -> Configuration -> Device Templates, select the suitable template and click on any of the supported monitors to enable Adaptive Thresholds. Once done, click OK to save your changes.
    • To apply this change directly to devices under the template, click Save and Associate. Select the devices you want to apply the change to and click Associate and Overwrite.
    • If you want to apply this threshold change only to devices that will be discovered in the future, just click 'Save'.
  3. Enabling locally on a device level (Device Snapshot):

    • This method is useful when Adaptive Thresholds needs to be enabled or disabled for only a few devices.
    • Go to the Device Snapshot page of the device(s), navigate to any of the supported monitors, click Edit and enable the Adaptive Thresholds option.
    • Click Save to apply the changes to your monitor(s). OpManager will start forecasting threshold values once there is enough data for the algorithms to use (a minimum of three days).