The release of the latest version of ITIL – ITIL 4 – has refreshed the IT service management (ITSM) best practice framework’s incident management guidance. While much of it is fundamentally unchanged – in that the existing best practice is still best practice – there’s also new guidance to consider and adopt as appropriate.
To help you to understand the impact of ITIL 4 guidance on your incident management capabilities, this blog looks at the latest ITIL incident management best practice before sharing six tips for better IT support.
The ITIL incident management basics
As you’d expect, the ITIL definition of incident management is pretty similar to previous versions (after all, an incident is still an incident):
“The purpose of the incident management practice is to minimize the negative impact of incidents by restoring normal service operation as quickly as possible.”
Source: AXELOS, “ITIL Foundation: ITIL 4 Edition” (2019)
Major incidents are also called out as potentially requiring a separate process, as are information security incidents.
There’s still the concept of first-time resolutions at the service desk, and of escalations. But ITIL also recognizes the rise in adoption, and benefits, of “swarming” for incident management – there’s more on this shortly.
The detailed ITIL 4 incident management best practice guidance is now available as a PDF download via a My ITIL subscription rather than, as previously, hard copy and digital ITIL publications.
ITIL 4’s incident management best practice guidance
Each of the ITIL 4 management practices (the PDF downloads) follows a standard content layout. For incident management, this is:
General information – including the purpose and description for, and key terms and concepts in, incident management
The scope of incident management – including mentions of the related ITIL management practices such as service desk, service request management, problem management, change enablement, and continual improvement
Practice success factors for incident management – detecting incidents early, resolving incidents quickly and efficiently, incident prioritization, and continually improving the incident management approaches
The key incident management metrics – with the offered metrics aligning with the practice success factors
Incident management’s value stream contribution and key processes – incident handling and resolution, and periodic incident review
Organizational aspects – the roles, competencies, and responsibilities for incident management
The information technology for incident management – information exchange and automation and tooling
Partners and suppliers – how third parties are involved in the incident management practice.
An introduction to swarming for incident management
The concept of swarming is based on the “swarm intelligence” shown by “social insects” such as bees, where the collective intelligence exceeds that of individuals. Swarming has been successfully applied elsewhere, for example for problem-solving in the aerospace industry and in healthcare organizations for root cause analysis after serious patient incidents. And now it’s being applied to IT support, instead of the traditional tiered-support model.
With swarming, incident management is collaboration-based and there are no tiered support groups. Hence there’s no escalation between support groups. Instead, a single person owns an issue through to resolution, with the right people (to help resolve the issue) brought in to assist as needed.
The Consortium for Service Innovation – which you might recognize from its involvement with Knowledge-Centered Service (KCS) – calls this Intelligent Swarming and has been heavily involved in its development. It states that:
“While not appropriate for all support environments, Intelligent Swarming is most effective when solving new, complex problems. The goal is to get the issue to the best resource (or resources) who can resolve the issue on the first touch. And then, if the person who owns the issue needs help, swarming facilitates finding the best resource to assist. The right people work on new issues, together, as quickly as possible. Collaboration between the right knowledge workers leads to faster and more creative resolutions, as well as rapid skills development. In contrast to the escalation model, where the person who first works the issue never learns the resolution if the issue is escalated, the swarming model proposes the owner of the issue retain ownership until the issue is resolved – even if they need help in resolving it. In this way, they will learn the resolution for every issue they work.”
6 quick tips for better incident management
Ensure that everyone involved in incident management understands the motivation for it. For example, in your organization, are IT support staff focused on fixing the IT, or on getting people (employees and customers) back to doing what they need to do as quickly as possible?
Recognize the importance of knowledge management to incident management (and IT support in general). There’s also the need to invest in the people change, processes, and technology that will truly make knowledge sharing part of your day-to-day incident management operations.
Ensure that your IT support staffing levels are appropriate. Erlang C calculators offer a statistical method for doing this – for both assessing the suitability of the status quo and undertaking what-if analyses based on potential future workload changes.
Automate the high-volume, low-value incident management tasks, for example password resets and ticket triage. This not only speeds up resolutions, saves costs, and offers a better customer experience, it also takes the pressure off IT support staff. Self-service capabilities should also be considered as a people-centric frontend to some of the automation capabilities.
Ensure that incident management metrics drive the right behaviors. This includes understanding how different measures can cause conflicting behaviors. For example, first-contact resolution targets versus the ticket handling time – with one metric making calls longer and the other making them shorter.
Assess the applicability of swarming for certain types of incidents. While swarming might seem a big jump away from decades-old incident management best practice, it will likely have a number of beneficial applications within your organization. So, investigate the possibilities and trial swarming in a select group of incident-type scenarios.
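To make the staffing tip above more concrete, here is a minimal Python sketch of the kind of calculation an Erlang C calculator performs. It is an illustration rather than a definitive staffing tool: the function names are hypothetical, the 80%-within-20-seconds target is only an example figure, and the Erlang C model itself assumes random (Poisson) arrivals, exponentially distributed handling times, and callers who never abandon the queue. Real service desks will deviate from all three.

```python
import math


def erlang_c_wait_probability(agents: int, offered_load: float) -> float:
    """Probability that an arriving contact has to queue (Erlang C).

    offered_load is in erlangs: arrival rate multiplied by average handle time.
    """
    if agents <= offered_load:
        return 1.0  # the queue is unstable, so everyone ends up waiting
    # Build Erlang B with the standard recurrence, then convert it to Erlang C.
    blocking = 1.0
    for n in range(1, agents + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    utilization = offered_load / agents
    return blocking / (1.0 - utilization * (1.0 - blocking))


def service_level(agents: int, contacts_per_hour: float,
                  avg_handle_secs: float, target_secs: float) -> float:
    """Fraction of contacts answered within target_secs."""
    load = contacts_per_hour * avg_handle_secs / 3600.0
    if agents <= load:
        return 0.0
    wait_probability = erlang_c_wait_probability(agents, load)
    return 1.0 - wait_probability * math.exp(
        -(agents - load) * target_secs / avg_handle_secs
    )


def agents_needed(contacts_per_hour: float, avg_handle_secs: float,
                  target_secs: float = 20.0, target_level: float = 0.8) -> int:
    """Smallest number of agents that meets the service-level target."""
    load = contacts_per_hour * avg_handle_secs / 3600.0
    agents = max(1, math.floor(load) + 1)  # start just above the offered load
    while service_level(agents, contacts_per_hour,
                        avg_handle_secs, target_secs) < target_level:
        agents += 1
    return agents


if __name__ == "__main__":
    # What-if analysis: current workload versus a 25% increase in contacts.
    for volume in (100, 125):
        print(volume, "contacts/hour ->", agents_needed(volume, 360.0), "agents")
```

Running the same function against today’s contact volume and a projected one is the what-if analysis mentioned in the tip. Treat the output as a starting point for discussion: it ignores breaks, shrinkage, and multi-channel working.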
What else would you call out from ITIL 4 as an important change for incident management? Plus, what other incident management tips do you have that would help others? Please let me know in the comments.