Problem Management is concerned with identifying the real underlying causes of incidents, and with either providing workarounds to restore service or implementing a permanent solution to prevent future recurrences. Problem Management also works proactively to prevent problems from occurring.
The goal of Problem Management is to minimize the adverse impact of Incidents and Problems on the business that are caused by errors within the IT Infrastructure, and to prevent recurrence of Incidents related to these errors.
In order to achieve this goal we need to get to the root cause of incidents and then initiate actions to improve or correct the situation. This process involves both reactive and proactive aspects. The reactive aspect is concerned with solving problems in response to one or more Incidents. Proactive Problem Management is concerned with identifying and solving Problems and Known Errors before Incidents occur in the first place.
Problem Management differs from incident management in that its main goal is to detect the underlying causes of incidents and then to resolve them and prevent their recurrence. This goal can conflict directly with incident management, whose goal is to restore service to the customer as quickly as possible rather than to search for a permanent solution.
A Problem is an unknown underlying root cause of one or more incidents.
A Known Error is a Problem that has been successfully diagnosed and for which a workaround has been identified.
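The two definitions above can be pictured as a single record that changes state. The following is only a minimal sketch in Python, not a prescribed data model: the class name, its fields, the incident identifiers, and the sample root cause and workaround are all hypothetical and chosen purely for illustration. It assumes that a Problem becomes a Known Error once both a root cause and a workaround have been recorded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Problem:
    """Hypothetical Problem record; every field name here is an assumption."""
    summary: str
    incident_ids: List[str] = field(default_factory=list)
    root_cause: Optional[str] = None   # stays None until diagnosis succeeds
    workaround: Optional[str] = None   # stays None until one is identified

    def link_incident(self, incident_id: str) -> None:
        # One Problem may underlie many Incidents; ignore duplicate links.
        if incident_id not in self.incident_ids:
            self.incident_ids.append(incident_id)

    @property
    def is_known_error(self) -> bool:
        # Known Error = diagnosed root cause plus an identified workaround.
        return self.root_cause is not None and self.workaround is not None


# Illustrative data only; none of these values come from a real system.
problem = Problem("Intermittent login failures")
problem.link_incident("INC-101")
problem.link_incident("INC-107")
print(problem.is_known_error)   # False: root cause still unknown

problem.root_cause = "Expired certificate on the authentication server"
print(problem.is_known_error)   # False: diagnosed, but no workaround yet

problem.workaround = "Route logins through the standby server"
print(problem.is_known_error)   # True: diagnosed and workaround identified
```

Keeping the Known Error as a derived state of the Problem record, rather than as a separate object, means the linked incidents stay attached throughout diagnosis. A real tool may well model this differently.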
Although Problem Management is not just for bugs, error handling is its prime activity. This in turn leads on to processes such as release management and change management, which control the testing and issue of corrected code. Again, this process is the inevitable second step in implementing a Help Desk (at least until code generates no errors, or QA discovers them all).