Telemetry | ManageEngine DEX Manager Plus

Why Visibility matters in Digital Experience Monitoring

Today, nearly every task we do at work—whether collaborating, creating, or communicating—happens in a digital workspace. Devices have become an extension of the modern employee, and their performance directly shapes how work gets done. Just as we prioritize regular health checkups for ourselves, it’s critical to routinely assess the health of our endpoints.

But with a growing fleet of devices spread across geographies and hybrid work environments, manual checkups are no longer practical—or scalable. IT teams need a better way to understand how devices perform and how they impact user experience.

That’s where digital experience monitoring begins—with visibility. By collecting and analyzing real-time endpoint telemetry, IT gains deep insight into key performance indicators like CPU usage, memory consumption, boot time, crash frequency, and more. This visibility lays the foundation for identifying experience issues early, reducing support tickets, and ultimately improving employee productivity.

Visibility starts with telemetry data

You can’t improve what you can’t measure. Telemetry is the foundational layer of digital experience monitoring: it transforms invisible device behavior into actionable insight. By continuously collecting real-time data points such as CPU usage, memory load, boot times, and system crashes, IT teams gain deep operational awareness of every endpoint, regardless of where it’s located.

But telemetry is more than just monitoring. It enables proactive IT by surfacing early warning signals before they snowball into user complaints or productivity issues. With the right data in place, teams can detect friction, automate responses, and ensure smooth performance—all before the employee even notices. In short, telemetry turns visibility into control, making it the backbone of modern endpoint monitoring and experience management.

How DEX Manager Plus collects and uses Endpoint Telemetry

DEX Manager Plus uses a lightweight agent that sits on end-user devices and operates silently in the background without impacting performance. This agent continuously collects high-fidelity telemetry data from every managed endpoint, whether on-site, remote, or hybrid, giving IT teams a real-time pulse on the employee experience. The agent collects data around the clock, even when the device is offline. Critical and alert-related data is then posted to the server for further analysis.

The lightweight agent continuously captures a rich stream of telemetry data that directly influences user productivity, device health, and digital experience quality. This telemetry can be broadly classified into two categories:

Built-in endpoint metrics monitored out-of-the-box
Custom telemetry collected using user-defined data collectors

Let’s explore each in detail:

Built-in Endpoint Metrics

DEX Manager Plus tracks a curated set of high-impact metrics that provide deep visibility into how well endpoints perform and how they affect end-user experience. These metrics are grouped into four key categories:

Application Reliability: Identifies app-related issues like crashes
Device Performance: Monitors CPU, memory, GPU, and disk usage to ensure smooth and responsive operation
Device Reliability: Tracks hardware health, battery condition, warranty status, and system stability
Device Responsiveness: Measures user-facing delays like boot time, logon duration, and input lag

These foundational metrics help IT teams spot issues early, prioritize support, and optimize endpoint experience across the workforce.

The table below lists the monitored metrics and their impact on experience. Most metrics have configurable thresholds that admins can set to identify system degradation, so the table also suggests a best-practice threshold for each metric that IT teams can use as a starting point on their experience management journey:

Category	Metric Monitored	Impact on Experience	Best Practice Threshold (Alert If)
Application Reliability	Application Crash Events	Crashing applications interrupt work and reduce user trust in IT	All application crash events are monitored
Device Performance	Free Disk Space	Low disk space causes slowness, failed updates, and app crashes	Free disk space is less than 10 GB
	Free Disk Space (OS Drive)	OS instability and failed operations due to lack of system drive space	Free OS drive space is less than 10 GB
	CPU Usage	High CPU leads to slow response times and unresponsive apps	CPU usage exceeds 70% for 5-10 minutes
	Memory Usage	High memory usage causes lags, freezes, and app crashes	Memory usage exceeds 50% for 5 minutes
	Memory Swap Rate	Indicates system is using disk instead of RAM, leading to performance dips	Swap rate exceeds 5000 pages for 10 minutes
	Memory Swap Size	Excessive swap size signals memory overuse and degraded speed	Swap size exceeds 75% for 10 minutes
	CPU Interrupt	High interrupts may indicate hardware faults or driver issues	Interrupts exceed 2% of CPU for 5 minutes
	GPU Usage	High GPU load may slow down graphics-intensive apps, video calls, or design tools	GPU usage exceeds 75% for 10 minutes
	Disk Queue Length	Long disk queue length causes delays in read/write operations	Average queue length exceeds 1 for 10 minutes
Device Reliability	Battery Health	Poor battery health reduces portability and increases user frustration	Battery health is less than 25–30% (approximately 70–75% wear)
	Warranty	Out-of-warranty devices carry repair risks and cost implications	Warranty expires in 30-60 days
	Device Age	Older devices typically underperform newer ones and are prone to failure	Device age exceeds 3-5 years
	Hard Reset	Frequent hard resets may point to deeper system issues or user frustration	All hard resets are monitored; alert if more than 2 hard resets occur within a 7-day period
	System Crash	System crashes result in data loss and disrupted productivity	All system crashes are monitored
Device Responsiveness	Boot Time	Long boot times cause delays at the start of the workday	Boot time exceeds 60 seconds
	Extended Logon Time	Slow logons hinder user access and readiness to work	Logon time exceeds 60 seconds
	Max Input Delay	High input delay leads to laggy user interactions and frustration	Input delay exceeds 500 ms for 5-10 minutes
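
The hard reset rule in the table (alert if more than 2 hard resets occur within a 7-day period) is a sliding-window count. The following is a minimal Python sketch of that logic, assuming the reset timestamps are already available as datetime values. The function name and parameters are hypothetical illustrations and are not part of DEX Manager Plus.

```python
from datetime import datetime, timedelta


def exceeds_reset_limit(reset_times, limit=2, window=timedelta(days=7)):
    """Return True if more than `limit` events fall inside any span
    shorter than `window`. Events exactly `window` apart are treated
    as belonging to different windows (an assumption of this sketch)."""
    times = sorted(reset_times)
    start = 0
    for end, current in enumerate(times):
        # Move the window start forward until it is within `window`
        # of the event currently being examined.
        while current - times[start] >= window:
            start += 1
        if end - start + 1 > limit:
            return True
    return False


# Three resets in six days breach the rule; three spread over
# nine days (at most two per 7-day span) do not.
clustered = [datetime(2024, 1, 1), datetime(2024, 1, 3), datetime(2024, 1, 6)]
spread = [datetime(2024, 1, 1), datetime(2024, 1, 5), datetime(2024, 1, 9)]
print(exceeds_reset_limit(clustered))  # True
print(exceeds_reset_limit(spread))     # False
```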

Premises for the best practice thresholds

CPU, memory, and disk usage thresholds near 85–90% are widely used as industry defaults because they flag real-world performance issues without generating alert noise.
Disk space warnings at less than 10 GB or less than 10% free prevent common failure modes while still leaving room for normal operating overhead.
Memory alerts, especially available RAM under 10%, signal impending swapping and slowdowns.
Durations matter—sustained usage over a window is much more meaningful than brief spikes.
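
To illustrate why durations matter, the sketch below distinguishes a brief spike from a sustained breach. It is a minimal Python example, assuming samples arrive as (timestamp in seconds, value) pairs sorted by time. The function name, sample format, and values are assumptions made for illustration and do not describe how the DEX Manager Plus agent evaluates thresholds internally.

```python
def sustained_breach(samples, threshold, duration_s):
    """Return True if the value stays above `threshold` without
    interruption for at least `duration_s` seconds.

    `samples` is a list of (timestamp_seconds, value) pairs sorted
    by timestamp."""
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # first sample of a new breach
            if timestamp - breach_start >= duration_s:
                return True
        else:
            breach_start = None  # any dip below the threshold resets the clock
    return False


# One sample per minute; threshold of 70% sustained for 5 minutes.
spiky = [(0, 95), (60, 40), (120, 95), (180, 30), (240, 95), (300, 35)]
steady = [(minute * 60, 80) for minute in range(6)]
print(sustained_breach(spiky, threshold=70, duration_s=300))   # False
print(sustained_breach(steady, threshold=70, duration_s=300))  # True
```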

Custom Telemetry with User-Defined Data Collectors

While built-in telemetry covers a broad spectrum of critical device signals, every organization has unique needs based on its environment, workflows, and employee tools. That’s where custom telemetry comes in.

With user-defined data collectors, IT teams can extend monitoring by defining and collecting custom metrics tailored to their business. Whether it's tracking the availability of connected devices and POS peripherals or pulling details of enterprise apps that affect end-user productivity, DEX Manager Plus allows IT to create lightweight data collectors using PowerShell or prebuilt templates. For example, a collector can:

Monitor custom hardware sensors
Query application-specific logs or counters
Check service health for internal tools
Track latency or responsiveness for business-critical operations
Pull registry values, WMI data, or command output

Collected data can be fed into a detection and remediation workflow, enabling correlation with core telemetry, alerting, and automated remediation. This gives IT full control over experience monitoring and leaves no blind spots, even in complex or legacy setups.
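
As an illustration of the kind of logic a collector for latency or service health might contain, here is a minimal Python sketch. DEX Manager Plus collectors are created with PowerShell or prebuilt templates, so this is not collector code for the product: the function names, record fields, and JSON output shape are all hypothetical and only show the general pattern of measuring an operation and emitting a structured record.

```python
import json
import time


def collect_latency(name, operation):
    """Run `operation` once and return a record with its outcome and
    latency in milliseconds. Field names are hypothetical."""
    started = time.perf_counter()
    try:
        operation()
        ok = True
    except Exception:
        ok = False  # a failed check is still reported, with its latency
    latency_ms = (time.perf_counter() - started) * 1000.0
    return {"metric": name, "ok": ok, "latency_ms": round(latency_ms, 2)}


def to_payload(record):
    """Serialize a record as JSON for a downstream workflow to consume."""
    return json.dumps(record, sort_keys=True)


def fake_pos_check():
    """Stand-in for a real check, such as querying a POS peripheral."""
    time.sleep(0.01)


print(to_payload(collect_latency("pos_peripheral_check", fake_pos_check)))
```

A record like this could then be compared against a threshold in the same way as the built-in metrics.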

In essence, custom collectors bridge the gap between standard metrics and your unique digital environment, helping you go beyond out-of-the-box monitoring to achieve true experience observability.