Self-Monitoring


Self-monitoring helps you detect issues across the components of Applications Manager's own services and tracks their health and performance so that monitoring continues uninterrupted. When a problem occurs, you receive the critical details you need to drill down to the root cause and prevent possible service outages.

Currently, Applications Manager periodically checks the health of the following components: the Applications Manager server, the database, the JVM, and load-specific performance attributes.

Diagnostic Detail Configuration

You can modify the poll interval, consecutive poll count, and threshold value of each attribute as follows:

  • Under the Admin tab, click Self Monitoring under Tools.
  • The Diagnostic Details table is displayed with a description of each diagnostic.
  • Click the edit icon to configure the Poll Interval, Consecutive Polls, and Threshold Value.
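
The three editable values work together: a breach has to persist for every poll in the consecutive run before an alert is raised, so the effective detection window is the poll interval multiplied by the consecutive poll count. The sketch below illustrates that arithmetic with the defaults quoted in the attribute descriptions on this page. The class and field names are hypothetical and are not part of Applications Manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticSetting:
    """Hypothetical container for the three editable values of one diagnostic."""
    poll_interval_minutes: int
    consecutive_polls: int
    threshold_percent: int

    @property
    def detection_window_minutes(self) -> int:
        # The breach must hold for every poll in the run, so the window
        # is simply interval x count.
        return self.poll_interval_minutes * self.consecutive_polls


# Default values as quoted in the attribute descriptions on this page.
DEFAULTS = {
    "CPU Usage": DiagnosticSetting(5, 3, 90),
    "Disk Usage": DiagnosticSetting(60, 1, 90),
    "JVM Thread Blocked": DiagnosticSetting(5, 3, 50),
}

for name, setting in DEFAULTS.items():
    print(
        f"{name}: alert after {setting.detection_window_minutes} min "
        f"above {setting.threshold_percent}%"
    )
```

Lowering the consecutive poll count shortens the detection window but makes alerts more sensitive to short spikes.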

Diagnostics Alerts

Diagnostics alerts and their current status are displayed in a band at the top of the Applications Manager window. You can also view them under the Alarms tab: click the Diagnostics Alert button to see the list of diagnostic alerts with their status, time of generation, and description. Click an alert message to view its message history and add comments.

  • All users with the ADMIN role receive mail notifications whenever a problem is raised and when it is cleared or discarded.
  • When a problem is detected, it is shown in the Error state [red].
  • When corrective action is taken, manually or automatically, the Error state moves to the Clear state [green].

The supported attributes are categorized and described below:

Server Monitoring

Attribute name: Description
CPU Usage: Monitors the CPU utilization of the server running Applications Manager. By default, an alert is raised when CPU usage exceeds the threshold value of 90% for the last 15 minutes (polling interval of 5 minutes, consecutive polls count of 3). The alert lists the top 10 processes consuming the most CPU.
Memory Usage: Monitors the memory utilization of the server running Applications Manager (APM). By default, an alert is raised when memory usage exceeds the threshold value of 90% for the last 15 minutes (polling interval of 5 minutes, consecutive polls count of 3). The alert lists the top 10 processes consuming the most memory.
Disk Usage: Monitors the utilization of the disk where APM is installed. By default, an alert is raised when disk usage exceeds the threshold value of 90% for the last 60 minutes (polling interval of 60 minutes, consecutive polls count of 1).
Disk I/O Usage: Monitors the busy time of the physical disk where APM is installed. By default, an alert is raised when disk busy time exceeds the threshold value of 90% for the last 15 minutes (polling interval of 5 minutes, consecutive polls count of 3).
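
The "threshold exceeded for N consecutive polls" rule used by these attributes can be sketched as a small sliding-window check. This is only an illustration of the rule as described here, not Applications Manager's actual implementation; the class and method names are hypothetical.

```python
from collections import deque


class ConsecutiveBreachDetector:
    """Signal an alert only when the last N polls all exceed the threshold."""

    def __init__(self, threshold_percent: float, consecutive_polls: int):
        self.threshold_percent = threshold_percent
        self.consecutive_polls = consecutive_polls
        # Keep only the most recent N readings; older ones drop off automatically.
        self._recent = deque(maxlen=consecutive_polls)

    def record(self, value_percent: float) -> bool:
        """Store one poll result and return True if an alert should be raised."""
        self._recent.append(value_percent)
        return (
            len(self._recent) == self.consecutive_polls
            and all(v > self.threshold_percent for v in self._recent)
        )


# CPU Usage defaults: threshold 90%, 3 consecutive polls (one poll every 5 minutes).
detector = ConsecutiveBreachDetector(threshold_percent=90, consecutive_polls=3)
results = [detector.record(v) for v in [95, 97, 85, 92, 93, 96]]
print(results)
```

In this run, the single reading of 85 resets the streak, so only the final poll, which completes three consecutive readings above 90, signals an alert.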

Database Monitoring (currently supported for MSSQL only)

Attribute name: Description
Database Status: Monitors database connectivity.
  • No alert can be raised for database connectivity, because the database connection itself is lost. The diagnostic message is written to the logs ( /logs/diagnostics/selfdiagnostics.txt).
  • This attribute is not shown on the Diagnostic configuration details page.
  • By default, a log entry is made when the database has been down for 2 minutes.
Database File Size: Monitors the database file size. By default, an alert is raised if the file size exceeds 90% of the total size.
Database Log Size: Monitors the database log size. By default, an alert is raised if the log size exceeds 90% of the total size.
Note: If the total size of the database file and log is infinite in MSSQL v12 and above, the total size of the disk where the database is installed is used instead, and an alert is raised if the used size of the disk exceeds the threshold (90% by default).
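
The fallback described in the note can be sketched as follows: when no size limit exists, usage is measured against the disk rather than the file's maximum size. The function and parameter names are hypothetical, and representing an unlimited size as `None` or infinity is an assumption made for illustration.

```python
import math
from typing import Optional


def size_usage_percent(
    used_mb: float,
    max_mb: Optional[float],
    disk_used_mb: float,
    disk_total_mb: float,
) -> float:
    """Return usage as a percentage, falling back to disk usage when uncapped."""
    if max_mb is None or math.isinf(max_mb):
        # No size limit: measure against the disk where the database is installed.
        return 100.0 * disk_used_mb / disk_total_mb
    return 100.0 * used_mb / max_mb


def should_alert(usage_percent: float, threshold_percent: float = 90.0) -> bool:
    """Alert when usage exceeds the threshold (90% by default)."""
    return usage_percent > threshold_percent


# Capped file: 950 MB used of a 1000 MB limit.
capped = size_usage_percent(950, 1000, disk_used_mb=400, disk_total_mb=500)
# Uncapped file: falls back to the disk, 400 MB used of 500 MB.
uncapped = size_usage_percent(950, None, disk_used_mb=400, disk_total_mb=500)
print(capped, should_alert(capped))
print(uncapped, should_alert(uncapped))
```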

JVM Monitoring

Attribute name: Description
JVM Memory Usage: Monitors JVM memory usage. By default, an alert is raised when JVM memory usage exceeds the threshold value of 90% for the last 15 minutes (polling interval of 5 minutes, consecutive polls count of 3).
JVM Thread Blocked: Monitors blocked JVM threads. By default, an alert is raised when blocked JVM threads exceed the threshold value of 50% for the last 15 minutes (polling interval of 5 minutes, consecutive polls count of 3).

Load-specific performance attributes

Attribute name: Description
Polling Delay: Based on a load factor calculation. Applications Manager takes the top 50 server monitors that spend the most time on data collection and checks the values polled in the last 1 hour against each monitor's polling interval. If the polled values amount to less than 70% of what the polling interval calls for, an alert reports a polling delay for that particular monitor. This check runs 1 hour after the build has started.
Polling Stops: Raises an alert when polling has stopped for the past 1 hour for a particular monitor. This check runs 1 hour after the build has started.
Syncing Delay: By default, raises an alert in the Admin Server when a particular managed server has not synced data for the past 30 minutes.
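
One way to read the Polling Delay and Polling Stops checks is as a comparison between the polls a monitor completed in the last hour and the polls its interval calls for. The sketch below follows that reading. It is an assumption about the 70% rule, not the product's documented algorithm, and all names are hypothetical.

```python
def expected_polls(polling_interval_minutes: int, window_minutes: int = 60) -> int:
    """Number of polls that should complete within the window."""
    return window_minutes // polling_interval_minutes


def polling_status(
    actual_polls: int,
    polling_interval_minutes: int,
    min_ratio: float = 0.70,
) -> str:
    """Classify one monitor's polling over the last hour."""
    expected = expected_polls(polling_interval_minutes)
    if expected == 0:
        # Interval is longer than the window, so this check cannot judge it.
        return "Not evaluated"
    if actual_polls == 0:
        return "Polling Stops"
    if actual_polls / expected < min_ratio:
        return "Polling Delay"
    return "OK"


def is_sync_delayed(minutes_since_last_sync: float, limit_minutes: float = 30) -> bool:
    """Syncing Delay: the managed server has not synced within the limit."""
    return minutes_since_last_sync > limit_minutes


# A monitor with a 5-minute interval should complete 12 polls per hour.
print(polling_status(actual_polls=8, polling_interval_minutes=5))
print(polling_status(actual_polls=9, polling_interval_minutes=5))
print(polling_status(actual_polls=0, polling_interval_minutes=5))
```

With a 5-minute interval, 8 completed polls out of 12 is about 67% and counts as a delay, while 9 out of 12 is 75% and passes.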
diagnosticconfig.properties

This properties file is used to add diagnostics entries to the AM_DIAGNOSTICS_CONF table.