Exchange Server Monitoring


Exchange Server Versions Supported: Exchange 2019, Exchange 2016, Exchange 2013, Exchange 2010, Exchange 2007, Exchange 2003 and older versions.

Prerequisites for monitoring Exchange Server: Exchange Server can be monitored only if Applications Manager is running on a Windows system. Refer to the Prerequisites section.

Using the REST API to add a new Exchange server monitor: Click here

Attributes Monitored: Refer to Exchange Server Parameters to learn more about the attributes monitored.

To create an Exchange Server Monitor, follow these steps:

  1. Click on New Monitor link. Choose Exchange Server under Mail Servers.
  2. Enter a Display name for the new monitor that you're going to add.
  3. Enter the IP Address or hostname of the host in which the Exchange Server is running.
  4. Select Exchange Server Version - Exchange 2019, Exchange 2016, Exchange 2013, Exchange 2010, Exchange 2007, Exchange 2003 or below.
  5. Select the Server Role to be monitored.
  6. Choose the Exchange Server Services you want to monitor from the list.
  7. Select the Exchange Server Services you want to monitor.
  8. Provide the authentication details such as User Name, Domain Name, and Password for the system on which the Exchange server is running.
  9. You can enter your own credentials or select preconfigured credentials from the Credentials Manager. If you enter your own credentials, specify the username and password for the monitor.
  10. Choose the Mode of Monitoring - Powershell or WMI. By default, the connectionURI will be detected. If necessary, it can be customized.
    Know more about using the Powershell option
  11. Enable the Use CredSSP Authentication option only if you want to fetch Exchange Queues metrics for non-mailbox roles in versions 2010 and above where the Applications Manager and Exchange server are in different domains. Know more about using CredSSP authentication
  12. Specify the timeout value in seconds. The value should be greater than 120 seconds. This setting applies only to the Powershell mode of monitoring.
  13. Enter the polling interval time in minutes.
  14. If you are adding a new monitor from an Admin Server, select a Managed Server.
  15. Choose the Monitor Group from the combo box to which you want to associate the Monitor (optional). You can choose multiple groups to associate your monitor.
  16. Click Add Monitor(s). This discovers the monitor on the network and starts monitoring it.
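
The constraints stated in the steps above can be checked before the form is submitted. The following is a minimal sketch, not part of Applications Manager: the `ExchangeMonitorSettings` class, its field names, and the `validate_settings` helper are hypothetical and only mirror the rules given in the steps (a display name and host are required, the mode is Powershell or WMI, and the timeout must exceed 120 seconds in Powershell mode).

```python
from dataclasses import dataclass


@dataclass
class ExchangeMonitorSettings:
    # Hypothetical container for the values entered in the New Monitor form.
    display_name: str
    host: str                      # IP address or hostname of the Exchange host
    mode: str                      # "Powershell" or "WMI"
    timeout_seconds: int           # only meaningful in Powershell mode
    polling_interval_minutes: int


def validate_settings(settings):
    """Return a list of problems; an empty list means the settings look valid."""
    problems = []
    if not settings.display_name.strip():
        problems.append("display name is required")
    if not settings.host.strip():
        problems.append("host is required")
    if settings.mode not in ("Powershell", "WMI"):
        problems.append("mode must be Powershell or WMI")
    # The timeout rule applies only to the Powershell mode of monitoring.
    if settings.mode == "Powershell" and settings.timeout_seconds <= 120:
        problems.append("timeout must be greater than 120 seconds in Powershell mode")
    if settings.polling_interval_minutes <= 0:
        problems.append("polling interval must be a positive number of minutes")
    return problems
```

For example, a Powershell-mode monitor with a timeout of exactly 120 seconds is reported as invalid, while the same timeout is ignored in WMI mode.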

Know more about Exchange Server's component-specific performance counters that Applications Manager monitors.

Note:
Exchange Server can be monitored only if Applications Manager is running on a Windows system. Also, the Exchange Server monitor works only if WMI is enabled on the remote machine on which Exchange Server is running.
Exchange Monitoring now supports data collection in two ways:

  • WMI - For users who have not installed or do not require Powershell.
    In the new monitor page, choose the WMI Mode of Monitoring.
    Mailbox and Database Statistics for the Mailbox Server Role are not available in this mode.
  • Powershell:
    In the new monitor page, choose the Powershell Mode of Monitoring and provide the connectionURI. To use Powershell for data collection, make sure the steps to enable Powershell remoting have been followed.
    If you have not modified any ports or the connectionURI, no customization is needed; the default connectionURI value is used.
    Use CredSSP Authentication needs to be enabled only for fetching Exchange Queues in non-mailbox roles in versions 2010 and above where the Applications Manager and Exchange server are in different domains. Click here for the steps to enable CredSSP
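
The CredSSP condition described above combines four facts about the setup. As a minimal sketch (the function name and parameters are hypothetical, chosen only to illustrate the rule as worded in this note), the decision can be written as:

```python
def needs_credssp(version_year, is_mailbox_role, same_domain, wants_queue_metrics):
    """Return True when Use CredSSP Authentication should be enabled.

    Mirrors the rule in the note: CredSSP is needed only to fetch Exchange
    Queues metrics, for non-mailbox roles, on Exchange 2010 and above, when
    Applications Manager and the Exchange server are in different domains.
    """
    return (
        wants_queue_metrics
        and not is_mailbox_role
        and version_year >= 2010
        and not same_domain
    )
```

In every other combination, for example a Mailbox role server or a server in the same domain as Applications Manager, the option can stay disabled.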

Applications Manager lets you effectively monitor the different versions of your Exchange Server and report on performance, availability, and the working of its server roles. You can collect Exchange component-specific performance counters in one central location, detect issues, send alerts and thus prevent possible service outages or configuration problems. Users can proactively manage Exchange servers and identify issues before they become critical.

Applications Manager gathers data related to each of your Exchange Server Roles:

  • Mailbox Server Role - Monitor your mailbox and public folder databases and diagnose issues pertaining to all related messaging data.
  • Client Access Server Role - Monitor overall client access like ActiveSync, .NET, OWA, Web Services connections and hardware performance.
  • Unified Messaging Server Role - Track integrated dial-in access performance and monitor e-mail, voicemail, fax, calendar information and contacts.
  • Hub Transport Server Role - Monitor the mail flow, routing, and delivery within the Exchange organization and identify disk performance bottlenecks.
  • Edge Transport Server Role - Monitor EdgeSync services, Active Directory Application Mode, SMTP connection authentication, transport queue databases and logs.

Applications Manager supports monitoring of counters relevant to the following server roles:

Exchange Server - Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. Click Exchange Server under the Mail Servers table. The Exchange Server bulk configuration view is displayed, organized into three tabs:

  • The Availability tab gives the availability history for the past 24 hours or 30 days.
  • The Performance tab gives the health status and events for the past 24 hours or 30 days.
  • The List view enables you to perform bulk admin configurations.
Mailbox Role Counters
Attribute name | Description | Exchange Server Version: 2007, 2010, 2013, 2016, 2019
POP & IMAP Connections
Current POP Connections The total number of POP connections opened since the computer was last started.
Current IMAP Connections The total number of IMAP connections opened since the computer was last started.
SMTP Connections
Inbound Connections The current number of inbound connections.
Outbound Connections The current number of outbound connections.
Messages Sent Per Second The rate at which outbound messages are being sent.
Messages Received Per Second The rate at which inbound messages are being received.
Latency Requirements Counters
Database PageFault Stalls Per Second The rate of page faults that can't be serviced because there are no pages available for allocation from the database cache. This counter should be 0 on production servers.
Database I/O Reads Average Latency The average time, in ms, to read from the database file. The average value should be below 200 ms. Maximum values shouldn't be higher than 1,000 ms.
Database I/O Writes Average Latency The average time, in ms, to write to the database file. This latency should be less than the MSExchange Database\I/O Database Reads (Recovery) Average Latency when battery-backed write caching is utilized.
I/O Log Reads Per Second The number of times data was read from a log file. Specific to log replay and database recovery operations.
I/O Log Writes Per Second The number of times a log buffer was written to the active log file. Specific to log replay and database recovery operations.
Log Record Stalls Per Second The number of log records that can't be added to the log buffers per second because the log buffers are full. The average value should be below 10 per second. Maximum values shouldn't be higher than 100 per second.
Log Threads Waiting The number of threads waiting to complete an update of the database by writing their data to the log. The average value should be less than 10 threads waiting.
Message Queuing Counters
Mailbox Messages Queued For Submission The current number of submitted messages not yet processed by the transport layer. The threshold value should be below 50 at all times. Shouldn't be sustained for more than 15 minutes.
Public Messages Queued For Submission The current number of submitted messages not yet processed by the transport layer. The threshold value should be less than 20 at all times.
Information Store RPC Processing Counters
IS RPC Requests The number of client requests currently being processed by the RPC Client Access service.
IS RPC Averaged Latency The latency, in ms, averaged for the past 1,024 packets.
Cache Statistics
Number Of Cache Active Connections The number of active connections in all data connection pools created for a specific PowerPivot service application instance.
Number Of Cache Idle Connections The number of idle connections in all data connection pools created for a specific PowerPivot service application instance.
Number Of Cache Connections The number of connections to the server stored in the cache.
Cache Total Capacity The size of the cached server connection pool.
RPC Client Throttling & Client Activity Counters
Client RPC Average Latency RPC Average Latency is server RPC latency, in ms, averaged for the past 1,024 packets. The threshold value should be less than 50 ms on average for each client.
RPC Client BackOff Per Second The rate that the server notifies the client to back-off. Higher values may indicate that the server may be incurring a higher load resulting in an increase in overall averaged RPC latencies, causing client throttling to occur.
Client: RPCs Failed Per Second The client-reported rate of failed RPCs since the store was started.
Client: RPCs Failed The client-reported number of failed RPCs since the store was started.
Content Indexing Counters
Percentage Processor Time of indexing The amount of processor time being consumed to update content indexing within the store process. Full crawls increase overall processing time, but should never exceed overall store CPU capacity.
Average Document Indexing Time The average, in ms, of how long it takes to index documents. The threshold value should be less than 30 seconds at all times.
Full Crawl Mode Status This counter is used to determine if a full crawl is occurring for any specified database. Possible values are:
  • 1 - going through a full crawl
  • 0 - not going through a full crawl
Average Latency of RPCs used to Obtain Content The average latency, in ms, of the most recent RPCs to the Information Store service. These RPCs are used to get content for the filter daemon for the specified database.
Crawler Mailboxes Remaining Any value of 1 or higher indicates that the mailboxes in the database are being crawled. When the crawl is completed, the value is set to 0.
Client-Related Search Counters
Slow Find Row Rate The rate at which the slower FindRow needs to be used in the mailbox store. The threshold value should be no more than 10 for any specific mailbox store. Higher values indicate applications are crawling or searching mailboxes, which is affecting server performance.
Search Task Rate The number of search tasks created per second. The threshold value should be less than 10 at all times.
Slow QP Threads The number of query processor threads currently running queries that aren't optimized. The threshold value should be less than 10 at all times.
Slow Search Threads The number of search threads currently running queries that aren't optimized. The threshold value should be less than 10 at all times.
Database Counters
Log Bytes Writes Per Second The rate of bytes written to the log. The threshold value should be less than 10,000,000 at all times.
Database Cache Percent Hit The percentage of database file page requests fulfilled by the database cache without causing a file operation. If this percentage is too low, the database cache size may be too small. The threshold value should be over 90% for companies with majority online mode clients. The threshold value should be over 99% for companies with majority cached mode clients. If the hit ratio is less than these numbers, the database cache may be insufficient.
Database Cache Size in MB The amount of system memory, in megabytes (MB), used by the database cache manager to hold commonly used information from the database files to prevent file operations. Maximum value is RAM-2GB. Use this counter along with store private bytes to determine if there are store memory leaks.
Version Buckets Allocated The total number of version buckets allocated. The threshold value should be less than 12,000. The default maximum number of version buckets is 16,384.
Log Threads Waiting The number of threads waiting for their data to be written to the log to complete an update of the database. If this number is too high, the log may be a bottleneck. The threshold value should be less than 10 on average. Regular spikes concurrent with log record stall spikes indicate that the transaction log disks are a bottleneck.
Log Generation Check Point Depth Represents the amount of work in the log file count that needs to be redone or undone to the database files if the process fails. The threshold value should be below 500 at all times for the Mailbox server role. A healthy server should indicate between 20 and 30 for each database instance. If checkpoint depth increases continually for a sustained period, this indicates either a long-running transaction, or a bottleneck.
Database I/O Reads Average Latency The average length of time, in ms, per database read operation. The threshold value should be 20 ms on average.
Database I/O Writes Average Latency The average length of time, in ms, per database write operation. The threshold value should be 50 ms on average. Maximum values of up to 100 ms are acceptable if not accompanied by database page fault stalls.
Mailbox Assistant Counters
Percentage Processor Time of Mailbox Assistant The percentage of processor time consumed by the Mailbox Assistants. The threshold value should be less than 5% of overall CPU capacity.
Average Event Processing Time in Seconds The average processing time of the events chosen. The threshold value should be less than 2 at all times.
Events in Queue The number of events in the in-memory queue waiting to be processed by the assistants. The threshold value should be a low value at all times. High values may indicate a performance bottleneck.
Events Polled Per Second The number of events polled per second. Determines current load statistics for this counter.
Mailboxes Processed Per Second The rate of mailboxes processed by time-based assistants per second. Determines current load statistics for this counter.
Resource Booking Counters
Average Resource Booking Processing Time The average time to process an event in the Resource Booking Attendant. High values may indicate a performance bottleneck.
Requests Failed in Resource Booking The total number of failures that occurred while the Resource Booking Attendant was processing events. The threshold value should be 0 at all times.
Calendar Attendant Counters
Average Calendar Attendant Processing Time The average time to process an event in the Calendar Attendant. High values may indicate a performance bottleneck.
Requests Failed in Calendar Attendant The total number of failures that occurred while the Calendar Attendant was processing events. The threshold value should be 0 at all times.
Store Client Request Counters
RPC Latency Average The average latency, in ms, of RPC requests. The average is calculated over all RPCs since exrpc32 was loaded. The threshold value should be less than 100 ms at all times.
ROP Requests Outstanding The total number of outstanding remote operations requests. Used for determining current load.
RPC Requests Outstanding The current number of outstanding RPC requests.
RPC Requests Sent Per Second The number of RPC requests sent per second. Used for determining current load.
Percentage RPC Requests Failed The percentage of failed requests in the total number of RPC requests. Failed means the sum of failed with error code plus failed with exception. The threshold value should be less than 1 at all times.
Percentage RPC Slow Requests The percentage of slow RPC requests among all RPC requests. A slow RPC request is one that has taken more than 500 ms. The threshold value should be less than 1 at all times.
HUB Servers in Retry The number of Hub Transport servers in retry mode. The threshold value should be 0 at all times.
Successful Submission Per Second The number of successful mail submissions per second.
Failed Submission Per Second The number of failed submissions per second.
Temporary Submission Failures Per Second The number of temporary submission failures per second.
RPC Operations Per Second The current number of RPC operations occurring per second.
Information Store Counters
Client: RPC Operations Per Second The number of RPC operations per second for each client type connection.
JET Log Records Per Second The rate that database log records are generated while processing requests for the client. Used to determine current load.
JET Pages Read Per Second The rate that database pages are read from disk while processing requests for the client. Used to determine current load.
Directory Access: LDAP Reads Per Second The current rate that LDAP reads occur while processing requests for the client. Used to determine the current LDAP read rate per protocol.
Directory Access: LDAP Writes Per Second The current rate that LDAP writes occur while processing requests for the client. Used to determine the current LDAP write rate per protocol.
Messages Delivered Per Second The rate that messages are delivered to all recipients. Indicates current message delivery rate to the store.
Messages Sent Per Second The rate that messages are sent to transport. Used to determine current messages sent to transport.
Messages Submitted Per Second The rate that messages are submitted by clients. Used to determine current rate that messages are being submitted by clients.
User Count of IS The number of users connected to the information store. Used to determine current user load.
Replication Receive Queue Size The number of replication messages waiting to be processed.
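
Several of the Mailbox role counters above come with a recommended threshold. The following minimal sketch shows how a collected sample could be compared against a few of them. The `THRESHOLDS` table and the `find_violations` helper are hypothetical and not part of Applications Manager; the limits are copied from the descriptions above, and the cache hit limit assumes mostly online mode clients (use 99 for mostly cached mode clients).

```python
import operator

# Counter name -> (comparison that must hold, limit), taken from the table above.
THRESHOLDS = {
    "Database PageFault Stalls Per Second": (operator.eq, 0),
    "Log Record Stalls Per Second": (operator.lt, 10),
    "Log Threads Waiting": (operator.lt, 10),
    "Mailbox Messages Queued For Submission": (operator.lt, 50),
    "Database Cache Percent Hit": (operator.gt, 90),   # assumes online mode clients
    "Version Buckets Allocated": (operator.lt, 12000),
    "Log Generation Check Point Depth": (operator.lt, 500),
}


def find_violations(sample):
    """Return the names of counters in `sample` that break their threshold.

    Counters missing from the sample are skipped rather than reported.
    """
    violations = []
    for name, (is_ok, limit) in THRESHOLDS.items():
        if name in sample and not is_ok(sample[name], limit):
            violations.append(name)
    return violations
```

A sample such as `{"Database Cache Percent Hit": 85.0, "Log Threads Waiting": 3}` would report only the cache hit counter, since 85% is below the 90% recommendation while 3 waiting threads is within limits.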

 

Mailbox and Database Statistics Counters
Attribute name | Description | Exchange Server Version: 2007, 2010, 2013, 2016, 2019
Top Mailboxes By Size
Mailbox User Name Username for Exchange Mailbox User.
Total Item Size Specifies used mailbox size, in MB.
Item Count Specifies number of items in mailbox.
Inactive Mailbox Users
Mailbox User Name Username for Exchange Mailbox User.
Last Logon Time Time of Last Login by Mailbox User.
Database Statistics (2010, 2013, 2016 & 2019)
Master Type Specifies if the Mailbox Database is part of a DAG/ Server.
DAG Name / Server The Database Availability Group (DAG) name. A DAG allows you to replicate the database where your mail is stored across any number of servers.
Database Name Name of the mailbox database.
Mount Status The mount status of the mailbox stores on the server. Mailboxes contained in unmounted mailbox stores cannot receive incoming messages.
Mailbox Count The total number of mailboxes that reside in all mailbox stores and public folders.
Database Size (GB) The size of the mailbox database.
Available New Mailbox Space (GB) This represents the unallocated storage capacity within the mailbox database that's available for storing incoming emails. It's created when users delete emails or mailboxes, but the database hasn't reclaimed that space for future use.
Last Full Backup Detailed information about backups performed on storage groups on the connected server.
Circular Logging Enabled Specifies whether or not circular logging is enabled. Circular Logging saves storage on your Exchange Server by preventing transaction logs from building up on the Server.
Days Since Last Full Backup Indicates the number of days passed since the last full backup operation was performed.
Database Statistics (2007)
Database Name Name of the mailbox database.
Mount Status The mount status of the mailbox stores on the server. Mailboxes contained in unmounted mailbox stores cannot receive incoming messages.
Mailbox Count The total number of mailboxes that reside in all mailbox stores and public folders.
Storage Group The exchange storage group name. A Storage Group is a grouping of one or more Mailbox Databases along with Log Files and a Checkpoint File. Not available for versions later than 2007.
Database Availability Groups
Database Availability Group Name The name of the database availability group.
DAG Members The database availability group members.
DAG Members Count The number of members in the particular database availability group.
Database Copy Statistics
Database Name The name of the mailbox database.
Database Status The current status of the database. Possible Statuses are:
  • Failed - The mailbox database copy is in a Failed state because it is not suspended, and it is not able to copy or replay log files. While in a Failed state and not suspended, the system will periodically check whether the problem that caused the copy status to change to Failed has been resolved. After the system has detected that the problem is resolved, and provided there are no other issues, the copy status will automatically change to Healthy.
  • Seeding - The mailbox database copy is being seeded, the content index for the mailbox database copy is being seeded, or both are being seeded. Upon successful completion of seeding, the copy status should change to Initializing.
  • SeedingSource - The mailbox database copy is being used as a source for a database copy seeding operation.
  • Suspended - The mailbox database copy is in a Suspended state as a result of an administrator manually suspending the database copy by running the Suspend-MailboxDatabaseCopy cmdlet.
  • Healthy - The mailbox database copy is successfully copying and replaying log files, or it has successfully copied and replayed all available log files.
  • ServiceDown - The Microsoft Exchange Replication service is not available or running on the server that hosts the mailbox database copy.
  • Initializing - The mailbox database copy will be in an Initializing state when a database copy has been created, when the Microsoft Exchange Replication service is starting or has just been started, and during transitions from Suspended, ServiceDown, Failed, Seeding, SinglePageRestore, LostWrite, or Disconnected to another state. While in this state, the system is verifying that the database and log stream are in a consistent state. In most cases, the copy status will remain in the Initializing state for about 15 seconds, but in all cases, it should generally not be in this state for longer than 30 seconds.
  • Resynchronizing - The mailbox database copy and its log files are being compared with the active copy of the database to check for any divergence between the two copies. The copy status will remain in this state until any divergence is detected and resolved.
  • Mounted - The active copy is online and accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Mounted.
  • Dismounted - The active copy is offline and not accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Dismounted.
  • Mounting - The active copy is coming online and not yet accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Mounting.
  • Dismounting - The active copy is going offline and terminating client connections. Only the active copy of the mailbox database copy can have a copy status of Dismounting.
  • DisconnectedAndHealthy - The mailbox database copy is no longer connected to the active database copy, and it was in the Healthy state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy.
  • DisconnectedAndResynchronizing - The mailbox database copy is no longer connected to the active database copy, and it was in the Resynchronizing state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy.
  • FailedAndSuspended - The Failed and Suspended states have been set simultaneously by the system because a failure was detected, and because resolution of the failure explicitly requires administrator intervention. An example is if the system detects unrecoverable divergence between the active mailbox database and a database copy. Unlike the Failed state, the system will not periodically check whether the problem has been resolved, and automatically recover. Instead, an administrator must intervene to resolve the underlying cause of the failure before the database copy can be transitioned to a healthy state.
  • SinglePageRestore - This state indicates that a single page restore operation is occurring on the mailbox database copy.
Based on these values, we want the Status attribute to be either Mounted (true for the server where the database is mounted) or Healthy (for the servers that hold a copy of it). For the ContentIndexState attribute, we want it to be always Healthy.
Copy Queue Length The Copy Queue Length shows the number of transaction log files waiting to be copied to the passive copy log file folder. A copy is not considered complete until it has been checked for corruption.
Content Index State The state of Microsoft Exchange Server content indexes:
  • Crawling - Database is in the process of indexing database content. Depending on the size of the database, this process could take some time to complete.
  • Disabled - Indexing for the database has been disabled by an administrator.
  • Failed - An error has occurred causing the content index to fail.
  • FailedAndSuspended - The Failed and Suspended states have been set simultaneously by the system because a failure was detected, and because resolution of the failure explicitly requires administrator intervention.
  • Healthy - This indicates the Content Index is up to date and has not detected any issues. This is the only state in which a failover (automatic process) to a specific database copy can occur.
  • Seeding - A database copy is in the process of updating its Content Index from another database copy.
  • Suspended - The Suspended status occurs if an administrator manually pauses or suspends it from receiving updates from the active copy. This might be done to update a failed Content Index or to perform troubleshooting for other issues.
Latest Full Backup Time The time of the last full backup of the mailbox database.
Active Copy Shows "True" if the database copy is active and "False" if it is passive.
Days Since Latest Full Backup Indicates the number of days passed since the latest full backup operation was performed.
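
The expected values described above for Database Status and Content Index State can be expressed as a small check. This is a minimal sketch with hypothetical function names; it assumes the status strings are exactly those listed above and that "days since" counts whole elapsed days.

```python
from datetime import datetime


def copy_is_healthy(status, content_index_state, is_active_copy):
    """Apply the rule above: the active copy should be Mounted, passive
    copies should be Healthy, and the content index should always be Healthy."""
    expected_status = "Mounted" if is_active_copy else "Healthy"
    return status == expected_status and content_index_state == "Healthy"


def days_since_full_backup(last_full_backup, now):
    """Whole days elapsed between the last full backup and `now`."""
    return (now - last_full_backup).days
```

For instance, a passive copy reporting `Healthy` with a content index in the `Crawling` state would not pass this check, because the content index is expected to be `Healthy` at all times.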

 

Client Access Server Role Counters
Attribute name | Description | Exchange Server Version: 2007, 2010, 2013, 2016, 2019
Outlook Web Access Counters
Current Users The number of users currently logged on to Outlook Web Access. This value monitors the number of unique active user sessions, so that users are only removed from this counter after they log off or their session times out. Determines current user load.
Outlook Requests Persec The number of Outlook requests processed each second. Determines current user load.
Average Search Time The average time elapsed while waiting for a search to complete.
Searches timed out The number of Outlook Web Access searches that timed out.
Average response time The average time (in milliseconds) that elapsed between the beginning and end of an OEH or ASPX request. Used to determine the latency that a client is experiencing. The threshold value should be less than 100 ms at all times. Higher values may indicate high user load or higher than normal CPU time.
ASP.NET Counters
Application Restarts The number of times the application has been restarted during the Web server's lifetime.
Worker Process Restarts The number of times a worker process has restarted on the computer.
Request Wait Time The number of ms the most recent request was waiting in the queue.
Availability Service Counters
Availability Requests sec The number of requests in the application request queue per second. The request can be only for free/busy information or include suggestions. One request may contain multiple mailboxes. Determines the rate at which Availability service requests are occurring.
Average Time to Process a Free Busy Request The average time to process a free/busy request in seconds. A single request may contain multiple mailboxes.
Current requests The number of HTTP requests waiting to be assigned to a thread.
Mailbox Session Hits Per Second The number of hits in a mailbox session.
Public Folder Queries Per Second Public Folder Queries per second is the number of mailboxes for which free busy information is requested from the public folders per second.
Public Folder Request Failures Per Second The number of public folder free busy requests failed per second.
ActiveSync Service Counters
ActiveSync Requests Per second The number of HTTP requests received from clients per second. (Average of 50-100.)
Ping Commands Pending The number of ping commands currently pending on the server. Ping Commands Pending are the number of hanging requests, which should be almost equal to the number of Direct Push and hanging sync users.
Sync Commands Pending The number of sync commands currently pending on the server. Sync Commands Pending are the number of hanging requests, which should be almost equal to the number of Direct Push and hanging sync users.
Requests Queued The number of HTTP requests queued in a thread.
CAS OAB Download Counters
Download Task Queued The number of OAB download tasks queued since the File Distribution service started. The threshold value should be 0 at all times.
Download Tasks Completed The number of OAB download tasks completed since the File Distribution service started. The default value is every 480 minutes or 8 hours. The threshold value should be less than or equal to 3 per day.
WebService Counters
Current Connections The current number of connections established with the Web service. Determines current user load.
Connection Attempts Per Second The rate that connections to the Web service are being attempted. Determines current user load.
Current ISAPI Extension Requests The rate that Internet Server API (ISAPI) extension requests are received by the Web service. Determines current user load.
Other Request Methods Per Second The rate HTTP requests are made that don't use the OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, MOVE, COPY, MKCOL, PROPFIND, PROPPATCH, SEARCH, LOCK, or UNLOCK methods. Determines current user load.
Requests Per Second The number of requests processed each second. Determines current user load.
Completed requests Per Second The number of requests completed each second. Determines current user load.
Autodiscover Counters
Autodiscover Requests Per Second The number of Autodiscover service requests processed each second. Determines current user load.
Unified Messaging Counters
Percentage of failed mailbox connection attempts over the last hour The percentage of mailbox connection attempts that failed in the last hour. The threshold value should be less than 5%.
Percentage of inbound calls rejected by the Unified Messaging service over the last hour The percentage of inbound calls that were rejected by the Microsoft Exchange Unified Messaging service over the last hour.
Percentage of inbound calls rejected by the Unified Messaging worker process over the last hour The percentage of inbound calls that were rejected by the UM worker process over the last hour.
Percentage of messages successfully processed over the last hour The percentage of messages that were successfully processed by the Microsoft Exchange Unified Messaging service over the last hour.
Percentage of partner voice message transcription failures over the last hour The percentage of voice messages for which transcription failed in the last hour.
Unified Messaging calls disconnected on irrecoverable internal error The number of calls disconnected after an internal system error occurred.
Unified Messaging calls disconnected by user failure The total number of calls disconnected after too many user entry failures.
Unified Messaging current calls The number of calls that are currently connected to the UM server.
Unified Messaging total calls per second Total Calls per Second is the number of new calls that have arrived in the last second.
Unified Messaging user response latency User Response Latency is the average response time, in milliseconds, for the system to respond to a user request. This average is calculated over the last 25 calls. This counter is limited to calls that require significant processing.
Exchange Control Panel Counters
Explicit Sign-On Outbound Proxy Requests - Average Response Time The average time (in ms) that requests sent to a secondary Client Access server took to complete during the sampling period. The threshold value should be under 6,000 ms.
Requests - Average Response Time The average time (in ms) the Exchange Control Panel took to respond to a request during the sampling period. The threshold value should be under 6,000 ms.
ASP.NET Request Failures Per Second The number of failures per second detected by ASP.NET in the Exchange Control Panel.
Powershell Runspaces - Average Active Time The average time (in seconds) that a Windows PowerShell runspace stays active while executing cmdlets in the Exchange Control Panel during the sampling period.
Powershell Runspaces Per Second The number of Windows PowerShell runspaces created per second in the Exchange Control Panel.
RBAC sessions Per Second The number of RBAC sessions loaded per second in the Exchange Control Panel.

 

Hub/Edge Transport Role Counters
Attribute name    Description    Exchange Server Version: 2007, 2010, 2013, 2016, 2019
Transport Database Counters (HUB,EDGE)
IO Log Writes Per Second The rate of log file write operations completed. Determines the current load.
IO Log Reads Per Second The rate of log file read operations completed. Determines the current load.
Log Generation Checkpoint Depth The amount of work (in count of log files) that needs to be redone or undone to the database files if the process fails.
Version Buckets Allocated Total number of version buckets allocated. Shows the default backpressure values as listed in the edgetransport.exe.config file.
IO Database Reads Per Second The rate of database read operations completed. Determines the current load.
IO Database Writes Per Second The rate of database write operations completed. Determines the current load.
Log Record Stalls Per Second The number of log records that can't be added to the log buffers per second because they are full. If this counter is nonzero most of the time, the log buffer size may be a bottleneck.
Log Threads Waiting The number of threads waiting for their data to be written to the log to complete an update of the database. If this number is too high, the log may be a bottleneck.
Transport Dumpster Counters (HUB,EDGE)
Dumpster Size The total size (in bytes) of mail items currently in the transport dumpster on this server.
Dumpster Inserts Per Second The rate at which items are inserted into the transport dumpster on this server. Determines the current rate of transport dumpster inserts.
Dumpster Item Count The total number of mail items currently in the transport dumpster on this server. Shows the current number of items being held in the transport dumpster.
Dumpster Deletes Per Second The rate at which items are deleted from the transport dumpster on this server. Determines the current rate of transport dumpster deletions.
Transport Queue Length Counters (HUB,EDGE)
Aggregate Delivery Queue Length All Queues The number of messages queued for delivery in all queues. The value should stay below 3,000 and should never exceed 5,000.
Active Remote Delivery Queue Length The number of messages in the active remote delivery queues. The threshold should be less than 250.
Active Mailbox Delivery Queue Length The number of messages in the active mailbox queues. The threshold should be less than 250.
Submission Queue Length The number of messages in the submission queue. The threshold should be less than 100. If sustained high values occur, investigate Active Directory and Mailbox servers for bottlenecks or performance-related issues.
Active Non-SMTP Delivery Queue Length The number of messages in the drop directory used by a Foreign connector. The threshold should be less than 250.
Retry Mailbox Delivery Queue Length The number of messages in a retry state attempting to deliver a message to a remote mailbox. The threshold should be less than 100.
Retry Non-SMTP Delivery Queue Length The number of messages in a retry state in the non-SMTP gateway delivery queues. The threshold should be less than 100.
Retry Remote Delivery Queue Length The number of messages in a retry state in the remote delivery queues. The threshold should be less than 100.
Unreachable Queue Length The number of messages in the Unreachable queue. The threshold should be less than 100.
Largest Delivery Queue Length The number of messages in the largest delivery queues. The threshold value should be less than 200 for the Edge Transport and Hub Transport server roles.
Poison Queue Length The number of messages in the poison message queue. The threshold value should be 0 at all times.
Transport Load Assessment Counters (HUB,EDGE)
Messages Submitted Per Second The number of messages queued in the Submission queue per second. Determines current load. Compare values to historical baselines.
Messages Completed Delivery Per Second The number of messages delivered per second. Determines current load. Compare values to historical baselines.
Inbound Local Delivery Calls Per Second The number of local delivery attempts per second. Determines current load. Compare values to historical baselines.
Average Bytes Per Message The average number of message bytes per inbound message received. Determines sizes of messages being received for an SMTP receive connector.
Messages Received Per Second The number of messages received by the SMTP server each second. Determines current load. Compare values to historical baselines.
Messages Sent Per Second The number of messages sent by the SMTP send connector each second. Determines current load.
Items Queued for Delivery Per Second The number of messages queued for delivery per second. Determines current load.
Inbound Message Delivery Attempts Per Second The number of attempts for delivering transport mail items per second. Determines current load. Compare values to historical baselines.
Messages Queued for Delivery Per Second The number of messages queued for delivery per second. Determines current load. Compare values to historical baselines.
Edge Sync Counters (EDGE)
Total topology updates The total number of Exchange topology updates found by EdgeSync.
Exchange servers total Total number of Exchange Servers found by EdgeSync.
Edge servers total Total number of Edge Transport servers found by EdgeSync.
Hub transport servers total Total number of Hub Transport servers found by EdgeSync.
Edge servers leased total Total number of Edge Transport servers leased by EdgeSync.
Edge objects added Per Second The rate of Edge objects added per second by EdgeSync.
Edge objects deleted Per Second The rate of Edge objects deleted per second by EdgeSync.
Edge objects updated Per Second The rate of Edge objects updated per second by EdgeSync.
Scan jobs completed successfully total The total number of scan jobs completed successfully by EdgeSync.
Scan jobs failed because could not extend lock total The total number of EdgeSync scan jobs that failed because EdgeSync could not extend its lease of an Edge Transport server.
Scan jobs failed because of directory error total The total number of EdgeSync directory errors.
Scan jobs failed because could not lock total The total number of EdgeSync scan jobs that failed because EdgeSync could not obtain a lock.
Source objects scanned Per Second The rate of Active Directory objects scanned per second by EdgeSync.
Target objects scanned Per Second The rate of Edge objects scanned per second by EdgeSync.
Recipient Filter Agent Counters (EDGE)
Recipients rejected by recipient validation Per Second The number of recipients rejected by recipient validation per second.
Recipients rejected by block list Per Second The number of recipients rejected by the block list per second.
Sender Filter Agent Counters (EDGE)
Messages filtered by sender filter Per Second The number of messages filtered by the Sender Filter agent per second.
DNS queries Per Second The number of DNS queries per second performed by the Sender ID agent.
Attachment Filtering Counters (EDGE)
Messages attachment filtered The number of messages that were blocked, stripped of attachments, or silently deleted (as per configuration) by the attachment filtering agent.
Messages filtered Per Second The number of messages per second that the attachment filtering agent blocked, stripped of attachments, or silently deleted. If this rate rises greatly beyond what is “normal” for the Exchange server, it may indicate that the organization is being flooded with malicious e-mail.
Content Filter Agent Counters (EDGE)
Messages deleted The total number of messages that were deleted by Content Filter Agent.
Messages quarantined The total number of messages that were quarantined by Content Filter Agent.
Messages rejected The total number of messages that were rejected by Content Filter Agent.
Messages that bypassed scanning The total number of messages that bypass scanning.
Messages scanned per second The number of messages scanned per second.
Active Directory Application Mode (ADAM) Counters (EDGE)
LDAP Searches Per Sec The number of Lightweight Directory Access Protocol (LDAP) search requests issued per second. Used to determine the current LDAP search rate.
LDAP Writes Per Sec The rate at which LDAP clients perform write operations.
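
The thresholds listed under Transport Queue Length Counters in this table can be checked programmatically once the counter values have been collected. The following is a minimal sketch in Python, not part of Applications Manager: the dictionary keys reuse the attribute names from the table as plain labels, the function name `find_breaches` is hypothetical, and the sample values are invented for illustration.

```python
# Upper limits taken from the Transport Queue Length counters in this table.
# The keys are illustrative labels, not official performance counter paths.
QUEUE_LIMITS = {
    "Aggregate Delivery Queue Length All Queues": 3000,
    "Active Remote Delivery Queue Length": 250,
    "Active Mailbox Delivery Queue Length": 250,
    "Submission Queue Length": 100,
    "Active Non-SMTP Delivery Queue Length": 250,
    "Retry Mailbox Delivery Queue Length": 100,
    "Retry Non-SMTP Delivery Queue Length": 100,
    "Retry Remote Delivery Queue Length": 100,
    "Unreachable Queue Length": 100,
    "Largest Delivery Queue Length": 200,
}


def find_breaches(sample):
    """Return the counters in `sample` whose value is not below its limit."""
    breaches = {}
    for name, value in sample.items():
        if name == "Poison Queue Length":
            # The poison queue is expected to stay at 0 at all times.
            if value != 0:
                breaches[name] = value
        elif name in QUEUE_LIMITS and value >= QUEUE_LIMITS[name]:
            breaches[name] = value
    return breaches


# Hypothetical sample values, for illustration only.
sample = {
    "Submission Queue Length": 140,
    "Unreachable Queue Length": 12,
    "Poison Queue Length": 1,
}
print(find_breaches(sample))
```

With the sample above, the submission queue and the poison queue are reported, while the unreachable queue is within its limit. A single reading above a limit is less significant than a sustained one, so in practice such a check would be applied over several polling intervals.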

 

Edge Transport Role Counters
Attribute name    Description    Exchange Server Version: 2007, 2010, 2013, 2016, 2019
Exchange-Agents
Agent Name The name of the agent to which the counters apply.
Average Agent Processing Time in sec The average agent processing time in seconds per event. The threshold value should be less than 20 seconds.
Total Agent Invocations The total number of invocations since the last restart. Used to gauge the current invocation rate.
Message Hygiene Counters
Average Scan Time Average time taken to perform scanning of mailboxes as part of the message hygiene using scheduled and on-demand scans. A high value could indicate a bottleneck in scanning.
Scan Requests Rejected Per Second The number of scan requests in this application pool’s queue that were rejected.
Scan Requests Being Scanned The number of scan requests that are currently being scanned.
Scan Requests Processing Time Per Request The amount of time spent in processing a request.
Antimalware Processing Time Per Request The amount of time taken for the Antimalware engine to process an item.
Unhealthy Antimalware engines Antimalware engines with errors in engine functioning.
Scan Requests Fatal Errors Indicates what percentage of scan requests submitted encountered errors that prevented the processing of those scan requests.
Scan Request Timeouts The number of scan requests that timed out in the last minute.
Scan Processes Running The number of scan processes currently running.
Scan Time Per Request The scan time per request.
Scan Requests Processed Per Second The number of scan requests processed per second.
Scan Request Wait Time Per Request The time for which a scan request waits in the internal queue.
Scan Requests Queued The number of scan requests that are currently in the internal queue.
Scan Requests Submitted Per Second The number of scan requests submitted per second, including requests accepted and rejected by the scanning system.
SafetyNet Counters
Resubmission Latency Average time spent to resubmit each message when processing a Safety Net resubmit request.
SafetyNet Resubmission Rate The number of messages resubmitted from Safety Net. Resubmit requests are generally triggered by HA but can also be requested manually via New-ResubmitRequest.
SafetyNet Resubmit Request rate The number of resubmit requests that were made to Safety Net during the sampling period.
Shadow SafetyNet Resubmission Rate Total number of messages resubmitted from Shadow Safety Net. Shadow resubmit requests occur when a primary Safety Net server cannot be reached for several hours.
Shadow SafetyNet Resubmit Request rate The number of shadow resubmit requests that were made to Shadow Safety Net during the sampling period.
Resubmit Request rate Total number of resubmit requests encountered per resubmit request state.

 

Unified Messaging Server Role Counters
Attribute name    Description    Exchange Server Version: 2007, 2010
Unified Messaging Counters
Percentage of failed mailbox connection attempts over the last hour The percentage of mailbox connection attempts that failed in the last hour.
Percentage of inbound calls rejected by the um service over the last hour The percentage of inbound calls that were rejected by the Microsoft Exchange Unified Messaging service over the last hour.
Percentage of inbound calls rejected by the um worker process over the last hour The percentage of inbound calls that were rejected by the UM worker process over the last hour.
Percentage of messages successfully processed over the last hour The percentage of messages that were successfully processed by the Microsoft Exchange Unified Messaging service over the last hour.
Percentage of partner voice message transcription failures over the last hour The percentage of voice messages for which transcription failed in the last hour.
Directory access failures The number of times that attempts to access Active Directory failed.
Calls disconnected on irrecoverable internal error The number of calls disconnected after an internal system error occurred.
Operations over six seconds The number of all UM operations that took more than six seconds to complete. This is the time during which a caller was waiting for UM to respond.
Calls disconnected by callers during um audio hourglass The number of calls during which the caller disconnected while Unified Messaging was playing the audio hourglass tones.
Total inbound calls rejected by the um service The total number of inbound calls that were rejected by the Microsoft Exchange Unified Messaging Service since the service was started.
Total inbound calls rejected by the um worker process The total number of inbound calls that were rejected by the UM Worker process since the service was started.
Call answer queued messages The number of messages created and not yet submitted for delivery.
Direct access failures The number of times that attempts to access Active Directory failed.
Hub transport access failures The number of times that attempts to access a Hub Transport server failed. This number is only incremented if all Hub Transport servers were unavailable.
Unhandled exceptions Per Second The number of calls that were disconnected after an internal system error occurred in the last second.
Queued ocs user event notifications The number of notifications that have been created and not yet submitted for delivery. Represents the number of missed call notifications that have been generated in the Office Communications Server environment and have not been submitted for delivery.
Mailbox server access failures The number of times the system could not access a Mailbox server.

 

Overview - Common Counters
Attribute name    Description    Exchange Server Version: 2007, 2010, 2013, 2016, 2019
Server Component States
Component Name Specifies the Exchange Component name.
State Specifies the Exchange Component Status.
Availability Specifies the availability of the Exchange Component.
Exchange Services
Service Name Specifies the Exchange Agent Service name.
Status Specifies the Exchange Agent Service Status.
Availability Specifies the availability of the Exchange Agent Service.
AD Access Domain Controllers
Domain Controller The name of the domain controller.
LDAP Read Time The time in milliseconds that an LDAP read request takes to be fulfilled. The average value should be under 50 milliseconds. Maximum values should not exceed 100 milliseconds.
LDAP Search Time The time in milliseconds that it takes a Lightweight Directory Access Protocol (LDAP) search request to be fulfilled. The threshold values should be below 50 ms. Maximum values should not be higher than 100 ms.
LDAP searches timed-out per Minute The number of LDAP searches that returned LDAP_Timeout during the last minute. The threshold value should be below 10 at all times for all roles. Higher values may indicate issues with Active Directory resources.
Long-running LDAP operations Per Minute The number of LDAP operations on this domain controller per minute that took longer than the specified threshold. The default threshold is 15 seconds. The value should be below 50 per minute. Higher values may indicate issues with Active Directory resources.
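
The LDAP Read Time and LDAP Search Time guidance in this table combines two conditions: the average should stay under 50 ms and no value should exceed 100 ms. The following Python sketch shows one way to evaluate a series of collected samples against both conditions. The function name `ldap_latency_ok` and the sample values are hypothetical and are not part of Applications Manager.

```python
from statistics import mean

AVERAGE_LIMIT_MS = 50   # the average should stay below this value
MAXIMUM_LIMIT_MS = 100  # no single sample should exceed this value


def ldap_latency_ok(samples_ms):
    """Return True when a series of LDAP times meets both stated limits."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    average_ok = mean(samples_ms) < AVERAGE_LIMIT_MS
    maximum_ok = max(samples_ms) <= MAXIMUM_LIMIT_MS
    return average_ok and maximum_ok


# Hypothetical samples in milliseconds, for illustration only.
print(ldap_latency_ok([20, 35, 60, 40]))   # average 38.75, maximum 60
print(ldap_latency_ok([20, 30, 130, 10]))  # average 47.5, maximum 130
```

The second series fails even though its average is under 50 ms, because one sample exceeds the 100 ms maximum.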

 

Host Performance Counters
Attribute name    Description    Exchange Server Version: 2007, 2010, 2013, 2016, 2019
Disk Utilization
Percent Free Space The percentage of free space on the disk.
Disk Reads Per Second Number of disk reads per second on the physical disk. This counter should be well under the maximum capacity for the disk device.
Disk Writes Per Second Number of disk writes per second on the physical disk. This counter should be well under the maximum capacity for the disk device.
Memory Utilization
Used Memory Space The memory space used by the server.
Free Physical Memory The amount of free physical memory available.
Used Memory Percent The percentage of memory space used by the server.
Total Visible Memory Size Total amount of physical memory available to the operating system.
Exchange Domain Controllers Connectivity Counters
Cache Hits Per Second The number of object-found-in-cache events per second.
Cache Misses Per Second The number of object-not-found-in-cache events per second.
LDAP Searches Per Second The number of LDAP search requests issued by a process per second.
Outstanding Asynchronous Reads The number of outstanding LDAP read requests.
Memory Pages
Pages Input/sec The rate at which pages are read from disk to resolve hard page faults.
Pages Output/sec The rate at which pages are written to disk to free up space in physical memory. A high rate of pages output might indicate a memory shortage.
Total Pages/sec The rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays.
Page Reads/sec The number of read operations, without regard to the number of pages retrieved in each operation.
Page Writes/sec The number of write operations, without regard to the number of pages written in each operation.
Transition Pages Repurposed/sec The rate at which transition cache pages were reused for a different purpose.
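
The Memory Utilization attributes in this table are related by simple arithmetic. The Python sketch below shows how Used Memory Space and Used Memory Percent could be derived from Total Visible Memory Size and Free Physical Memory. This derivation is an assumption made for illustration; the documentation does not state how Applications Manager computes these values. The function names and the sample figures (in kilobytes) are hypothetical.

```python
def used_memory(total_visible, free_physical):
    """Used memory, assuming used = total visible - free physical."""
    return total_visible - free_physical


def used_memory_percent(total_visible, free_physical):
    """Used memory as a percentage of the total visible memory."""
    if total_visible <= 0:
        raise ValueError("total visible memory must be positive")
    return 100.0 * used_memory(total_visible, free_physical) / total_visible


# Hypothetical host with 16 GB visible and 4 GB free, expressed in KB.
total_kb = 16 * 1024 * 1024
free_kb = 4 * 1024 * 1024
print(used_memory(total_kb, free_kb))          # 12582912
print(used_memory_percent(total_kb, free_kb))  # 75.0
```

Both inputs must use the same unit for the percentage to be meaningful.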

 

Queues
Attribute name    Description    Exchange Server Version: 2007, 2010, 2013, 2016, 2019
Exchange Queues *
Queue Name The identity of the queue in the form of <Server>\<Queue>.
Status The current queue status. A queue can have one of the following status values: Active, Connecting, Suspended, Ready, or Retry.
Message Count The number of messages in the queue.
Velocity The drain rate of the queue, calculated by subtracting the value of Incoming Rate from the value of Outgoing Rate.
Delivery Type Represents how the Transport service intends to transmit the message to the next hop, which could be the ultimate destination of the message or an intermediate hop along the way. The value External indicates that the next hop for the queue is outside the Exchange organization. The value Internal indicates that the next hop for the queue is inside the Exchange organization. Possible values: Internal, External, Undefined.
Next Hop Domain The next hop to which messages in the current queue are sent. For delivery queues, the value of this field is effectively the name of the queue. The value of NextHopDomain isn't always a domain name. For example, the value could be the name of the target Active Directory site or database availability group (DAG).

* Metrics for Exchange Queues are mapped under Settings → Performance Polling → Optimize Data Collection.
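
The Velocity attribute described above is the difference between the outgoing and incoming rates of a queue. The following Python sketch works through that calculation. The function names and the sample rates are hypothetical; only the formula (Outgoing Rate minus Incoming Rate) comes from the attribute description.

```python
def queue_velocity(incoming_rate, outgoing_rate):
    """Drain rate of a queue: Outgoing Rate minus Incoming Rate."""
    return outgoing_rate - incoming_rate


def describe(velocity):
    """Interpret the sign of the velocity."""
    if velocity > 0:
        return "draining"  # messages leave faster than they arrive
    if velocity < 0:
        return "growing"   # messages arrive faster than they leave
    return "steady"


# Hypothetical rates, for illustration only.
v = queue_velocity(incoming_rate=4, outgoing_rate=10)
print(v, describe(v))  # 6 draining
v = queue_velocity(incoming_rate=9, outgoing_rate=2)
print(v, describe(v))  # -7 growing
```

A negative velocity that persists across polling intervals means that the Message Count of the queue keeps increasing.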

 

Exchange 2003 Performance Counters
Attribute name    Description
SMTP Connections
Inbound Connections The number of connections that are currently inbound.
Outbound Connections The number of connections that are currently outbound.
SMTP statistics
Local Retry Queue Length The number of messages in the local retry queue.
Remote Retry Queue Length The number of messages that were in the retry queue for remote delivery.
Remote Queue Length The number of messages that were in the remote queue.
Messages Pending Routing The number of messages that were categorized but not routed.
Messages in Local Delivery The number of messages that were currently being processed by a server event sink for local delivery.
Currently Undeliverable Messages The number of messages that were reported as currently undeliverable by routing.
Categorizer Queue Length The number of messages in the categorizer queue waiting to be categorized.
POP & IMAP Connections
Current POP Connections The total number of POP connections opened since the computer was last started.
Current IMAP Connections The total number of IMAP connections opened since the computer was last started.
Information Store Mailbox statistics
Receive Queue Size The number of messages in the mailbox's receive queue. The threshold value should be below 500 at all times.
Send Queue Size The number of messages in the mailbox's send queue. In a server with no mail-enabled mailbox, it should be below 10. Otherwise, it should be below 500 at all times.
Active Client Logons The active client logons to Mailbox Stores during the specified time period.
Client Logons The client logons to Mailbox stores during the specified time period.
Logon Operations Per Min The number of logon operations to Mailbox stores per minute.
Message Recipients Delivered Per min The message recipients delivery rate. System Monitor data should match the Exchange Load Generator predicted value for message received rate.
Messages Delivered Per min The message delivery rate.
Messages Sent Per min The message sent rate.
Mailbox Used Space The amount of space used by the Mailbox.
Information Store Public Folder statistics
Send Queue Size The number of messages in the public store's send queue.
Receive Queue Size The number of messages in the public store's receive queue.
Active Client Logons The active client logons to Public Folder Stores during the specified time period.
Client Logons The client logons to Public Folder Stores during the specified time period.
Logon Operations Per Min The number of logon operations to Public Folders per minute.
Messages Delivered Per min The message delivery rate.
Message Recipients Delivered Per min The message recipients delivery rate. System Monitor data should match the Exchange Load Generator predicted value for message received rate.
Messages Sent Per minute The message sent rate.
Messages Submitted Per minute The message submission rate.
Public Folders Used Space The amount of space used by Public Folders.
Information Store Connections & Users
Information Store Active Connection Count The number of connections that have shown some activity in the last 10 minutes.
Information Store Connection Count The number of client processes connected to the information store.
Information Store Active User Count The number of user connections that have shown some activity in the last 10 minutes.
MTA statistics
MTA Work Queue Length The number of messages in the MTA work queue. This indicates the number of messages not yet processed to completion by the MTA.
MTA Message Bytes Per Min The rate at which message bytes are processed.
MTA TCP/IP Received Bytes Per Min The rate at which bytes are received over a TCP/IP connection.
MTA TCP/IP Transmit Bytes Per Min The rate at which bytes are transmitted over a TCP/IP connection.
MTA Total Recipients Queued The maximum number of recipients permitted in the MTA queues.
MTA Work Queue Bytes The total volume of messages (in MB) stored in the message transfer agent (MTA).
Information Store statistics
Current Pending Local Delivery The number of messages currently pending in the MTA queue.
Current messages from MSExchangeMTA The number of messages currently in transit from MSExchangeMTA to the Exchange store.
Current messages to MSExchangeMTA The number of messages currently in transit to MSExchangeMTA from the Exchange store.
Messages Received Per Min The number of messages received by the SMTP server each minute.
Messages Sent Per Min The number of messages sent by the SMTP server each minute.
HSOT Cache Hits The number of objects found in cache events per second.
Directory & Event Service statistics
Pending Replication Synchronizations The number of directory synchronizations that are queued for this server. This counter helps identify replication backlogs - the higher the number, the larger the backlog.
Remaining Replication Updates The number of directory synchronizations remaining. This counter helps identify replication backlogs - the higher the number, the larger the backlog.
Notify Queue The queue of store notifications waiting to be processed.
Address Lists Queue Length The number of entries in the Address List queue.
Message Transfer Agent Connections
MTA Queue Length The number of outstanding messages queued for transfer.
MTA Queued Bytes The total volume of message content (in MB) that is stored in the queue of the Message Transfer Agent.
MTA Current Inbound Associations The number of inbound (remotely initiated) associations between the MTA and the connected MTA. MTAs can open multiple associations, if additional transfer throughput is necessary.
MTA Current Outbound Associations The number of outbound (locally initiated) associations between the MTA and the connected MTA. MTAs can open multiple associations if additional transfer throughput is necessary.