The terms incident and problem sound similar but play distinct roles in ITSM. An incident is an unplanned interruption to a service or to a component of the IT infrastructure. For example, a login failure in an application caused by a faulty code update is an incident, and the process of restoring the app to normal operation is incident management.
Problem management, by contrast, focuses on identifying the underlying cause of one or more incidents and eliminating it so that the incidents do not recur. For example, if an IT team observes a trend of recurring application slowdown incidents, that trend qualifies as a problem. The team digs into the application logs and finds frequent database deadlocks and query timeout errors during peak usage hours.
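The log review described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log format, the keyword patterns, and the names `count_errors_by_hour` and `SAMPLE_LOG` are hypothetical, and a real team would adapt them to its own logging setup.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed log format: "YYYY-MM-DD HH:MM:SS LEVEL message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")

# Assumed keywords; real database drivers word these errors differently.
ERROR_PATTERNS = {
    "deadlock": re.compile(r"deadlock", re.IGNORECASE),
    "query_timeout": re.compile(r"query timeout", re.IGNORECASE),
}


def count_errors_by_hour(lines):
    """Count deadlock and query timeout entries per (hour, error kind)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed format
        timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)
        for kind, pattern in ERROR_PATTERNS.items():
            if pattern.search(message):
                counts[(timestamp.hour, kind)] += 1
    return counts


# Made-up sample data, for illustration only.
SAMPLE_LOG = [
    "2024-05-06 09:15:02 INFO Report generated in 1.2s",
    "2024-05-06 14:03:11 ERROR Deadlock detected while updating orders",
    "2024-05-06 14:07:45 ERROR Query timeout after 30s in reporting module",
    "2024-05-06 14:21:30 ERROR Deadlock detected while updating orders",
    "2024-05-06 15:02:09 ERROR Query timeout after 30s in reporting module",
]

if __name__ == "__main__":
    for (hour, kind), n in sorted(count_errors_by_hour(SAMPLE_LOG).items()):
        print(f"{hour:02d}:00  {kind:<14} {n}")
```

Grouping by hour makes it easy to see whether the errors cluster around peak usage, which is the kind of trend that turns a series of incidents into a problem record.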
After the initial analysis, the team rules out network latency as the primary cause. A thorough root cause analysis (RCA), using methods such as the five whys and fishbone diagrams, then shows that the SQL queries in a newly built reporting module are inefficiently written. The team deploys a permanent fix by optimizing the queries and using materialized views for frequently accessed data, thereby reducing the database load that the reporting module generates during peak usage hours.
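To make the materialized view idea concrete, here is a minimal Python sketch using the standard `sqlite3` module. SQLite has no native materialized views, so the example emulates one with a precomputed summary table that is refreshed on demand. In a database such as PostgreSQL, the same idea would use `CREATE MATERIALIZED VIEW` and `REFRESH MATERIALIZED VIEW`. The `orders` and `sales_by_region` tables and all function names are hypothetical and are not taken from the scenario above.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            region TEXT NOT NULL,
            amount REAL NOT NULL
        );
        INSERT INTO orders (region, amount) VALUES
            ('north', 100.0), ('north', 50.0), ('south', 75.0);
        -- Summary table standing in for a materialized view.
        CREATE TABLE sales_by_region (
            region TEXT PRIMARY KEY,
            total REAL NOT NULL,
            order_count INTEGER NOT NULL
        );
        """
    )
    return conn


def refresh_sales_by_region(conn):
    """Recompute the summary once, e.g. on a schedule outside peak hours."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM sales_by_region")
        conn.execute(
            """
            INSERT INTO sales_by_region (region, total, order_count)
            SELECT region, SUM(amount), COUNT(*)
            FROM orders
            GROUP BY region
            """
        )


def report(conn):
    """Read the precomputed rows instead of aggregating orders on each request."""
    return conn.execute(
        "SELECT region, total, order_count FROM sales_by_region ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    refresh_sales_by_region(conn)
    print(report(conn))
```

The trade-off is freshness: rows added to `orders` after a refresh do not appear in the report until the next refresh, so the refresh schedule has to match how current the reports need to be.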