Problem Management Process Flow
How does Problem Management work? ITIL Problem Management is about more than just resolving Incidents; it takes into account the entire life cycle of a Problem. The Problem Management process flow can be structured to manage Problems that are first reported as Incidents by users or service desk technicians (via a self-service portal, by telephone, by email or in person), as well as potential Problems that ITSM personnel or monitoring technology detect before any Incident occurs. The scope of the Problem Management process flow includes:
1) Problem Detection
Problems can be detected in a variety of ways, including an Incident report, ongoing Incident analysis, automated detection by an event management tool, or supplier notification. A Problem is commonly detected when the cause of one or more Incidents reported to the service desk is unknown. The service desk may have resolved the Incident but, being unsure of the underlying root cause, expects that it may occur again and therefore creates a Problem record. In other cases, it may be clear to the service desk that a reported Incident is associated with a Problem. If this Problem has already been recorded (a Known Problem), the Incident can be linked to the existing Problem record. If the Problem has not been recorded, a Problem record should be created immediately to help assure service performance.
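The link-or-create decision described above can be sketched in a few lines of Python. This is only an illustration: the `Problem` class, the `detect_problem` function and the idea of matching on a normalized suspected cause are assumptions made for the example, not part of ITIL or of any particular ITSM tool.

```python
from dataclasses import dataclass, field
from itertools import count

_problem_ids = count(1)  # hypothetical ID generator for new Problem records


@dataclass
class Problem:
    problem_id: int
    summary: str
    incident_ids: list = field(default_factory=list)


def detect_problem(incident_id, suspected_cause, known_problems):
    """Link an Incident to a Known Problem, or create a new Problem record."""
    # Assumption: two Incidents belong to the same Problem when their
    # suspected cause matches after trimming and lower-casing.
    key = suspected_cause.strip().lower()
    problem = known_problems.get(key)
    if problem is None:
        # No Known Problem yet: create the record immediately.
        problem = Problem(next(_problem_ids), suspected_cause)
        known_problems[key] = problem
    problem.incident_ids.append(incident_id)
    return problem


known = {}
first = detect_problem("INC-1", "Mail server disk full", known)
second = detect_problem("INC-2", "mail server disk full ", known)
```

Both Incidents end up linked to the same Problem record, which is what allows Incident counts to feed into prioritization later.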
2) Problem Logging
In order to maintain a complete historical record, all Problems, regardless of the method used to identify and report them to the service desk, must be logged with all relevant details, including date/time, user information, description, related Configuration Item from the CMDB, associated Incidents, resolution details and closure information.
- Categorization - Once a Problem is logged, all appropriate categories must be selected so that it can be properly assigned and escalated, and so that Problem frequencies and trends can be monitored.
- Prioritization - Assigning priority is critical in determining how and when the Problem will be handled by staff. Priority is determined by impact and urgency. Impact reflects the effect on the business and can be gauged from the number of associated Incidents, which provides insight into the number of affected users. Urgency reflects how quickly a resolution is required.
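As a rough illustration of how impact and urgency combine into a priority, the sketch below uses a simple additive matrix. The three-level scales, the Incident-count thresholds and the formula `impact + urgency - 1` are assumptions chosen for the example; each organization defines its own matrix.

```python
# Hypothetical three-level scales; 1 is the most severe level.
IMPACT = {"high": 1, "medium": 2, "low": 3}
URGENCY = {"high": 1, "medium": 2, "low": 3}


def impact_from_incidents(incident_count):
    """Estimate impact from the number of associated Incidents.

    The thresholds (10 and 3) are illustrative assumptions only.
    """
    if incident_count >= 10:
        return "high"
    if incident_count >= 3:
        return "medium"
    return "low"


def priority(impact, urgency):
    """Return a priority from 1 (handle first) to 5 (handle last)."""
    return IMPACT[impact] + URGENCY[urgency] - 1


# A Problem with 4 linked Incidents that needs a fast resolution.
p = priority(impact_from_incidents(4), "high")
```

With these assumed scales, a medium-impact, high-urgency Problem lands at priority 2, just below the most critical level.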
3) Investigation and Diagnosis
An investigation into the root cause of the Problem will take place, scoped according to the impact, severity and urgency of the Problem in question. Common investigation techniques include reviewing the Known Error Database (KEDB) to find matching Problems and resolutions, and recreating the failure to determine the cause.
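A KEDB review amounts to matching the symptoms of the current Problem against recorded Known Errors. The sketch below assumes the KEDB is a plain list of dictionaries and uses naive keyword overlap as the matching rule; real ITSM tools typically offer richer search, and the field names and `search_kedb` function here are hypothetical.

```python
def tokenize(text):
    """Split text into a set of lower-case words."""
    return set(text.lower().split())


def search_kedb(symptom, kedb, min_overlap=2):
    """Return Known Error entries sharing at least min_overlap words with the symptom."""
    words = tokenize(symptom)
    scored = []
    for entry in kedb:
        overlap = len(words & tokenize(entry["symptom"]))
        if overlap >= min_overlap:
            scored.append((overlap, entry))
    # Best matches first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]


# Hypothetical KEDB contents for illustration.
kedb = [
    {"id": "KE-1",
     "symptom": "VPN client drops connection after sleep",
     "workaround": "Restart the VPN client"},
    {"id": "KE-2",
     "symptom": "Printer queue stuck after driver update",
     "workaround": "Clear the print spooler"},
]

matches = search_kedb("vpn connection drops on laptop", kedb)
```

When a match is found, its recorded workaround can be offered to the affected user while the root cause investigation continues.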
4) Workaround
In some situations it is possible to provide a temporary fix or workaround to the user experiencing the Incident related to the Problem. However, it’s important to seek a permanent resolution to the underlying error detected by Problem Management.
5) Create Known Error Record
Once the investigation and diagnosis are complete, it’s important to create a Known Error record. If future Incidents or Problems arise, the investigating service desk technician can then identify the error and provide a resolution more quickly using the Known Error Database (KEDB) and the associated workaround(s).
6) Resolution
Once a solution has been identified, it can be implemented using the standard change procedure and tested to confirm service recovery. However, if a normal change is required, an associated Request for Change (RFC) must be raised and approved before the resolution is applied to the Problem.
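The difference between the two change routes can be expressed as a simple gate. The sketch below is a minimal illustration, assuming a hypothetical `Fix` record with a `change_type` field and an `rfc_approved` flag; it is not how any specific change management tool models this.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    description: str
    change_type: str            # "standard" or "normal"
    rfc_approved: bool = False  # only meaningful for normal changes


def can_implement(fix):
    """Return True when the fix may be applied to the Problem."""
    if fix.change_type == "standard":
        # Standard changes follow the standard change procedure directly.
        return True
    if fix.change_type == "normal":
        # Normal changes need a raised and approved RFC first.
        return fix.rfc_approved
    raise ValueError(f"unknown change type: {fix.change_type!r}")


standard_fix = Fix("Rotate log files", "standard")
pending_fix = Fix("Replace mail server", "normal")
approved_fix = Fix("Replace mail server", "normal", rfc_approved=True)
```

Only `pending_fix` is blocked here, because its RFC has not yet been approved.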
7) Closure
Following confirmation that the Error has been resolved, the Problem and any associated Incidents can be closed. The service desk technician should ensure that the initial classification details are accurate for future reference and reporting.
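The closure rule, that nothing is closed until the fix has been confirmed, can be sketched as follows. The dictionary layout and the `close_problem` function are assumptions made for the example.

```python
def close_problem(problem, fix_confirmed):
    """Close a Problem and its associated Incidents once the fix is confirmed."""
    if not fix_confirmed:
        # Closure is only allowed after the resolution has been verified.
        raise ValueError("cannot close a Problem before the fix is confirmed")
    problem["status"] = "closed"
    for incident in problem["incidents"]:
        incident["status"] = "closed"
    return problem


# Hypothetical Problem record with two linked Incidents.
problem = {
    "id": "PRB-1",
    "status": "resolved",
    "incidents": [
        {"id": "INC-1", "status": "open"},
        {"id": "INC-2", "status": "open"},
    ],
}

close_problem(problem, fix_confirmed=True)
```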
- Major Problem Review - What counts as a major Problem is defined by an organization’s business impact analysis (BIA) and risk assessment (RA), which determine response and priority (impact, urgency and severity of the Problem). The goal of a major Problem review is to continually improve the Problem Management process for responding to major business issues. A review may identify things done correctly, things done incorrectly, what can be improved, additional risks, how to prevent recurrence and the nature of any third party’s responsibility. This review should not live in a silo; it should be shared with team members as part of training and awareness sessions.
- Problem Control and Error Control - In some situations, the terms Problem Control and Error Control may be used during the Problem Management life cycle. Problem Control can be incorporated into the investigation phase with the goal of finding the root cause of the Problem and turning it into a Known Error. This helps the service desk technician provide temporary workarounds to the user. Error Control, on the other hand, is part of the resolution phase with the goal of converting Known Errors into solutions and removing them from the Known Error Database (KEDB) when necessary.