If a tree falls in a forest and no one is around to hear it, does it make a sound?

In my article about incidents, I wrote:

“Raise an incident when you think a postmortem would be useful for what you are seeing.”

This also includes raising an incident for a near miss. A near miss is an event that could have caused an incident, outage, security breach, or service degradation, but ultimately did not. Near misses can be a great source of learning, and reducing near misses can reduce actual incidents. Making them visible can also be an integral part of making your organisation more resilient to failures.

Running near misses through your regular incident process works well. In my previous article about incident priorities, I left out a crucial incident priority: the one for near misses. As such, I usually recommend that organisations also include a priority 4 (for near misses) in their incident priority matrix. This is what the revamped matrix would look like:

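| Impact \ Severity | None/Near Miss | Low        | Medium     | High       |
|-------------------|----------------|------------|------------|------------|
| None/Near Miss    | Priority 4     | Priority 4 | Priority 4 | Priority 4 |
| Low               | Priority 4     | Priority 3 | Priority 3 | Priority 2 |
| Medium            | Priority 4     | Priority 3 | Priority 2 | Priority 1 |
| High              | Priority 4     | Priority 2 | Priority 1 | Priority 1 |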
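
If you want to encode the matrix in tooling, it could be sketched as a simple lookup. This is only an illustration: the names `Level`, `PRIORITY_MATRIX`, and `priority_for` are hypothetical and not taken from any particular incident tool.

```python
from enum import IntEnum


class Level(IntEnum):
    """Shared scale for both impact and severity (hypothetical naming)."""
    NONE = 0  # none / near miss
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Rows are impact, columns are severity, values are the priority number.
# The values mirror the matrix in this article.
PRIORITY_MATRIX = [
    # severity: NONE LOW MEDIUM HIGH
    [4, 4, 4, 4],  # impact NONE / near miss
    [4, 3, 3, 2],  # impact LOW
    [4, 3, 2, 1],  # impact MEDIUM
    [4, 2, 1, 1],  # impact HIGH
]


def priority_for(impact: Level, severity: Level) -> int:
    """Look up the incident priority for an impact/severity pair."""
    return PRIORITY_MATRIX[impact][severity]


if __name__ == "__main__":
    # A near miss lands on priority 4, whatever the other axis says.
    print(priority_for(Level.NONE, Level.HIGH))
    print(priority_for(Level.MEDIUM, Level.HIGH))
```

Keeping the matrix as plain data makes it easy to review alongside the written version and to adjust if your organisation draws the lines differently.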

I usually recommend only paging/alerting when customers are actually impacted. As such, near misses are usually not paged for at all. Instead, they are discovered through dashboards, metrics, and logs, or simply brought up when someone realises they were close to making a mistake. The latter obviously requires a high degree of psychological safety.
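
As a rough sketch of that routing rule, assuming your alerting pipeline can tell whether customers are affected: the `Signal` type, its fields, and the `route` function below are hypothetical names for illustration, not part of any real alerting product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """A detected anomaly (hypothetical shape, for illustration only)."""
    name: str
    customers_impacted: bool


def route(signal: Signal) -> str:
    """Page only on customer impact; everything else is reviewed later."""
    if signal.customers_impacted:
        return "page"
    # Near misses stay visible on dashboards and in logs, without waking anyone.
    return "dashboard"


if __name__ == "__main__":
    print(route(Signal("checkout errors", customers_impacted=True)))
    print(route(Signal("disk nearly full", customers_impacted=False)))
```

The point of the split is that a near miss still leaves a trace someone can pick up and raise as a priority 4 incident, it just does not interrupt the on-call engineer.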
