Microsoft Skype for Business Server Monitoring


Overview

Microsoft Skype for Business is a unified communications application that gives corporate users instant messaging (IM), audio and video calls, online meetings, availability information, sharing capabilities, and other collaboration tools in a single, easy-to-use program. Each server running Skype for Business runs one or more server roles. A server role is a defined set of Skype for Business Server functionalities provided by that server.

Applications Manager lets you monitor your Skype for Business Server, collect metrics pertaining to its server roles and performance counters in one central location, detect issues, and send alerts, helping you prevent service outages and configuration problems. Users can proactively manage their Skype for Business servers and identify issues before they become critical.

Applications Manager gathers data related to the following Server Roles:

  • Front End Server - Monitor the Registrar, User Services, SIP-related KPIs (peers, protocol, responses), Storage services, and MCU performance, and diagnose issues pertaining to all mobility-related data.
  • A/V Conferencing Server - Monitor the overall performance and functionality of the Conferencing server in your deployment.
  • Edge Server - Track client communication over SIP, requests, and messages.
  • Mediation Server - Monitor call failures between proxies and gateways as well as media connectivity checks.

Creating a new Microsoft Skype for Business monitor

Supported Versions: Microsoft Lync 2013 and Skype for Business 2015.

Prerequisites for monitoring Microsoft Skype for Business metrics: To monitor a Microsoft Skype for Business Server, the user must have "Administrator" privileges and WMI access enabled for that server.

A new Microsoft Skype for Business monitor can also be added using the REST API.

To create a new Microsoft Skype for Business monitor, follow the steps given below:

  1. Click the New Monitor link.
  2. Select Microsoft Skype for Business.
  3. Enter the Display Name of the monitor.
  4. Enter the Hostname of the host where the Microsoft Skype for Business Server is running.
  5. Enter your own credentials or select preconfigured credentials from the Credentials Manager. If you enter your own credentials, specify the username and password for this monitor.
  6. Select the Enable Kerberos Authentication checkbox if you want to monitor the Microsoft Skype for Business Server through Kerberos authentication.
  7. Select the Server Roles to be monitored from the drop-down menu.
  8. Set the Poll interval.
  9. If you are adding a new monitor from an Admin Server, select a Managed Server.
  10. (Optional) From the combo box, choose the Monitor Group with which you want to associate the Microsoft Skype for Business server. You can associate the monitor with multiple groups.
  11. Click Add Monitor(s). This discovers the Microsoft Skype for Business server on the network and starts monitoring it.
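
The fields collected in the steps above can be modelled as a small configuration object, which may be useful for checking input before it is entered in the form. The sketch below is illustrative only: the class name, field names, role labels, default poll interval, and validation rules are all assumptions, and none of them reflect the actual parameter names used by Applications Manager or its REST API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Role labels taken from the server roles listed in this document.
KNOWN_ROLES = {"Front End", "A/V Conferencing", "Edge", "Mediation"}


@dataclass
class SkypeMonitorConfig:
    """Hypothetical container for the New Monitor form fields."""
    display_name: str
    hostname: str
    username: str
    password: str
    server_roles: List[str]
    poll_interval_minutes: int = 5        # assumed default, not from the product
    kerberos: bool = False
    monitor_groups: List[str] = field(default_factory=list)  # optional
    managed_server: Optional[str] = None  # only relevant on an Admin Server

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the input looks usable."""
        problems = []
        if not self.display_name.strip():
            problems.append("display name is required")
        if not self.hostname.strip():
            problems.append("hostname is required")
        if not self.username or not self.password:
            problems.append("username and password are required")
        if not self.server_roles:
            problems.append("select at least one server role")
        unknown = [r for r in self.server_roles if r not in KNOWN_ROLES]
        if unknown:
            problems.append("unknown server roles: " + ", ".join(unknown))
        if self.poll_interval_minutes <= 0:
            problems.append("poll interval must be positive")
        return problems
```

Calling `validate()` on a filled-in `SkypeMonitorConfig` returns an empty list when every required field is present. Credentials selected from the Credentials Manager are not modelled here.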

Microsoft Skype for Business Server - Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. Click Microsoft Skype for Business under the Middleware/Portals category. The Microsoft Skype for Business Server bulk configuration view is displayed, divided into three tabs:

  • The Availability tab gives the availability history for the past 24 hours or 30 days.
  • The Performance tab gives the health status and events for the past 24 hours or 30 days.
  • The List view enables you to perform bulk admin configurations.

Applications Manager supports monitoring of counters relevant to the Microsoft Skype for Business server roles under the following tabs:

Performance Overview

Attribute name: Description
System Statistics
Available Physical Memory: The amount of available physical memory in MB.
Web Components
ASP.NET Apps v2.0 - Requests Rejected: The number of requests rejected because the request queue was full for ASP.NET Apps v2.0.
ASP.NET Apps v4.0 - Requests Rejected: The number of requests rejected because the request queue was full for ASP.NET Apps v4.0.
Join Launcher Service Failures: The number of join failures.
Failed File Requests Per Sec: The per-second rate of failed address book file requests.
Failed Search Requests Per Sec: The per-second rate of failed address book search requests.
Failed Validate Cert Calls to other Cert Authprovider: The number of failed validate cert calls to the cert auth provider.
Timed out Active Directory Requests Per Sec: The per-second rate of timed-out Active Directory requests.
Failed Get Locations Requests: The per-second rate of failed Get Locations requests.
HTTP 5xx Responses Per Sec: The per-second rate of responses with an HTTP 5xx code.
Microsoft Skype for Business Server Services
Display Name: Name of the Microsoft Skype for Business service.
Start Mode: The startup type of the service.
State: The current state of the service.
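
One way to use the service attributes above is to flag services that are set to start automatically but are not running. The sketch below assumes each service is represented as a dictionary keyed by the three attribute names, and that Start Mode and State carry the `Auto` and `Running` values reported by the WMI `Win32_Service` class. The function name and the sample service names are made up for illustration.

```python
from typing import Dict, List


def stopped_auto_services(services: List[Dict[str, str]]) -> List[str]:
    """Return display names of services with Start Mode 'Auto' that are not running."""
    names = []
    for svc in services:
        auto = svc.get("Start Mode", "").lower() == "auto"
        running = svc.get("State", "").lower() == "running"
        if auto and not running:
            names.append(svc.get("Display Name", "<unknown>"))
    return names


# Made-up sample data; real values would come from the monitor's service list.
sample = [
    {"Display Name": "Example Service A", "Start Mode": "Auto", "State": "Running"},
    {"Display Name": "Example Service B", "Start Mode": "Auto", "State": "Stopped"},
    {"Display Name": "Example Service C", "Start Mode": "Manual", "State": "Stopped"},
]
print(stopped_auto_services(sample))
```

Manually started services are ignored on purpose, since a stopped state is expected for them.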

Front End

Attribute name: Description
Registrar Module
REG DB Store - Queue Latency: The average time a request is held in the request queue to the RTC database.
REG DB Store - Sproc Latency: The average time it takes to execute a stored procedure (sproc) call against the RTC database.
REG DB Store - Throttled Requests Per Sec: The number of requests per second that were rejected with a retry because the database queue latency was high.
User Services Module
DB Store - Queue Latency: The average time a request is held in the request queue to the RTCDyn database.
DB Store - Sproc Latency: The average time it takes to execute a sproc call against the RTCDyn database.
DB Store - Throttled Requests Per Sec: The number of requests per second that were rejected with a retry because the database queue latency was high.
Shared User Services Module
Shared DB Store - Queue Latency: The average time a request is held in the request queue to the RTC Shared database.
Shared DB Store - Sproc Latency: The average time it takes to execute a sproc call against the RTC Shared database, in ms.
Shared DB Store - Throttled Requests Per Sec: The number of requests per second that were rejected with a retry because the database queue latency was high.
SIP Peers
Authentication System Errors Per Sec: The per-second rate of authentication failures caused by system errors (due to low memory conditions or other causes).
Average Outgoing Queue Delay: The average outgoing queue delay in seconds.
XmppFederation Failure IMDNs Sent Per Sec: The number of failure IMDNs sent per second.
Connections Active: The number of connections that are currently established and active.
TLS Connections Active: The number of currently established and active connections that are authenticated using the Transport Layer Security (TLS) protocol.
Sends Outstanding: The number of messages that are waiting in the outgoing queues.
Average Flow Control Delay: The average delay time due to messages waiting in the outgoing queue.
Incoming Requests Per Sec: The number of requests received per second by the server.
Incoming Responses Per Sec: The number of responses received per second by the server.
Outgoing Requests Per Sec: The number of requests going out per second from the server.
Outgoing Responses Per Sec: The number of responses going out per second from the server.
SIP Protocol
Outgoing Messages Per Sec: The number of messages sent per second.
Incoming Responses Dropped Per Sec: The per-second rate of incoming responses dropped because they could not be processed (due to bad headers, insufficient routing information, or server resource allocation failure).
Average Event Processing Time: The average time to process a SIP transaction or dialog state change event, in seconds.
Average Incoming Message Processing Time: The average time (in seconds) it takes to process an incoming message.
Average Number of Active Worker Threads: The average number of active worker threads.
Events in Processing: The number of SIP transaction or dialog state change events that are currently being processed.
Incoming Messages Per Sec: The number of messages received per second.
Events Processed Per Sec: The number of SIP transaction or dialog state change events that were delivered for processing per second.
Messages in Server: The number of messages that are currently being processed by the server.
SIP Responses
Incoming 503 Responses Per Sec: The total number of incoming 503 responses per second.
Local 500 Responses Per Sec: The total number of 500 responses generated by the server per second.
Local 504 Responses Per Sec: The total number of 504 responses generated by the server per second.
SIP Load Management
Incoming Messages Timed Out: The number of incoming messages currently being held by the server for processing for more than the maximum tracking interval.
Average Holding Time For Incoming Messages: The average time that the server held the incoming messages currently being processed.
Page File Usage: The percentage of available page file space currently in use by the server process.
Routing Apps
Primary Registrar Timeouts: The number of requests for which the primary registrar timed out.
Backup Registrar Timeouts: The number of requests for which the backup registrar timed out.
Number of Incoming Failure Responses: The number of times an Emergency Call failure response was received from the gateway.
Storage Service
Skype for Business Storage Service Stale Queue Items: The current number of Storage Service queue items that are not owned and were last attempted a long time ago.
Dataloss Events with State Change: The total number of data loss events with state change.
Dataloss Events without State Change: The total number of data loss events without state change.
Failures of Replication Operations Sent to other Replicas Per Sec: The per-second rate of replication operation failures.
Server Connected to Fabric Pool Manager: Indicates whether the server is connected to the fabric pool manager.
MCU Health and Performance
ASMCU - Health State: The current health of the MCU (Multi-point Control Unit) responsible for Application Sharing.
  • 0 = Normal.
  • 1 = Loaded.
  • 2 = Full.
  • 3 = Unavailable.
AVMCU - Health State: The current health of the MCU (Multi-point Control Unit) responsible for Audio/Video support.
  • 0 = Normal.
  • 1 = Loaded.
  • 2 = Full.
  • 3 = Unavailable.
DATAMCU - Health State: The current health of the MCU (Multi-point Control Unit) responsible for data.
  • 0 = Normal.
  • 1 = Loaded.
  • 2 = Full.
  • 3 = Unavailable.
IMMCU - Health State: The current health of the MCU (Multi-point Control Unit) responsible for instant messaging.
  • 0 = Normal.
  • 1 = Loaded.
  • 2 = Full.
  • 3 = Unavailable.
IMMCU Statistics
Throttled SIP Connections: The number of throttled SIP connections.
Active Conferences: The number of conferences that are currently active.
Connected Users: The number of users who are connected across all conferences.
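
The Health State codes listed for the four MCUs (0 to 3) can be decoded with a small lookup when the raw values are processed outside the product, for example in a report script. This is a minimal sketch: the enum and function names are hypothetical, and only the code-to-label mapping comes from this document.

```python
from enum import IntEnum


class McuHealthState(IntEnum):
    """Health State codes as listed for ASMCU, AVMCU, DATAMCU, and IMMCU."""
    NORMAL = 0
    LOADED = 1
    FULL = 2
    UNAVAILABLE = 3


def describe_mcu_health(mcu: str, raw_value: int) -> str:
    """Turn a raw Health State value into a readable label."""
    try:
        state = McuHealthState(raw_value)
    except ValueError:
        # Values outside 0-3 are not documented, so report them as unknown.
        return f"{mcu}: unknown health state {raw_value}"
    return f"{mcu}: {state.name.capitalize()}"


print(describe_mcu_health("AVMCU", 2))
```

Because `IntEnum` members compare as integers, a check such as `state >= McuHealthState.FULL` can be used to treat Full and Unavailable as alert conditions, if that matches your alerting policy.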

Mobility

Attribute name: Description
Mobility Health
Push Notification Requests Failed Per Sec: The per-second rate of failed push notifications.
Push Notification Requests Throttled Per Sec: The per-second rate of throttled push notifications.
Requests Failed Per Sec: The per-second rate of failed requests.
Requests Rejected Per Sec: The per-second rate of rejected requests.

Conferencing

Attribute name: Description
Conferencing Statistics
CAA Incomplete Calls Per Sec (only for Lync Server): The per-second rate of incomplete calls to Conferencing Attendant. This includes calls disconnected by the user and by the system due to an invalid conference ID, passcode, etc.
Allocation Latency: The average time (in milliseconds) taken to complete a full MCU allocation request.
Create Conference Latency: The average time (in milliseconds) taken to complete a create conference call.

Edge Server

Attribute name: Description
Edge Server Statistics
Bad Requests Received Per Sec: The number of bad requests received per second.
SIP Above Limit Connections Dropped Access (Proxies Only) (only for Lync Server): The total number of connections that were dropped because the limit on the number of incoming connections from a federated partner or clearing house was exceeded.
SIP Sends Timed Out Per Sec: The number of sends dropped per second because they stayed in the outgoing (send) queue for too long.
SIP Flow controlled Connections: The number of connections that are currently being flow-controlled (no socket receives are posted).
SIP Incoming Requests Dropped Per Sec: The per-second rate of incoming requests dropped because they could not be processed (due to bad headers, insufficient routing information, or server resource allocation failure).
Average Incoming Message Processing Time: The average time (in seconds) it takes to process an incoming message.

Mediation Server

Attribute name: Description
Mediation Server Statistics
Total Failed Calls Caused by Unexpected Interaction from the Proxy: The number of calls that failed because of unexpected interaction from the proxy.
Total Failed Calls Caused by Unexpected Interaction from a Gateway: The number of calls that failed because of unexpected interaction from the gateway.
Load Call Failure Index: A scaled index between 0 and 100 that reflects all call failures due to heavy load.
Candidates Missing: The number of times the media stack does not have media relay candidates.
Media Connectivity Check Failures: The number of media connectivity check failures.