A lightweight on-call handoff template
If you run on-call for a team, your job is to transfer risk, not to write a novel. Handoffs fail when they’re long, vague, or both.
This template takes about 10 minutes to fill in and can save the next responder hours of confusion.
The template
1) Known risks this week
List the top 1 to 3 risks that might wake someone up:
- a flaky job
- a fragile integration
- a recent risky deploy
Keep it short and blunt.
2) Recent changes that matter
What changed that could cause a surprise?
- deploys that touched core paths
- config changes
- dependency upgrades
If nothing changed, say “none.”
3) Active investigations
If there’s an open issue, include:
- the issue link
- current hypothesis
- next action
- who to page
4) Escalation map
Who can help if it gets weird?
- primary engineer on the area
- backup contact
- external vendor contact if needed
5) Quick links
Only include links that will be used in the first five minutes:
- primary dashboard
- runbook
- alert policy
A short example
- Known risks: payments retry queue is spiky during EU peak.
- Recent changes: checkout rate limiter updated Tuesday.
- Active investigations: issue #123, hypothesis is cache churn; next action is to compare node latency.
- Escalation: page Alex for payments, Priya for infra.
- Quick links: payments dashboard, checkout runbook.
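If you keep handoffs in a repo or a bot, the five sections above can be sketched as a small data structure. This is a minimal sketch, not a prescribed tool: the class names `Handoff` and `Investigation`, the field names, and the `render` method are all hypothetical, and the "who to page" value for issue #123 is an assumption, since the example above does not state it.

```python
from dataclasses import dataclass, field


@dataclass
class Investigation:
    # Mirrors the four bullets under "Active investigations".
    issue_link: str
    hypothesis: str
    next_action: str
    who_to_page: str


@dataclass
class Handoff:
    known_risks: list[str]
    recent_changes: list[str] = field(default_factory=list)
    investigations: list[Investigation] = field(default_factory=list)
    escalation: dict[str, str] = field(default_factory=dict)  # area -> person
    quick_links: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Section 1 asks for the top 1 to 3 risks, so reject anything else.
        if not 1 <= len(self.known_risks) <= 3:
            raise ValueError("list 1 to 3 known risks")

    def render(self) -> str:
        lines = [
            "Known risks: " + "; ".join(self.known_risks),
            # Section 2: say "none" explicitly when nothing changed.
            "Recent changes: " + ("; ".join(self.recent_changes) or "none"),
        ]
        for inv in self.investigations:
            lines.append(
                f"Active investigation: {inv.issue_link}, "
                f"hypothesis: {inv.hypothesis}; "
                f"next action: {inv.next_action}; page: {inv.who_to_page}"
            )
        for area, person in self.escalation.items():
            lines.append(f"Escalation: page {person} for {area}")
        lines.append("Quick links: " + ", ".join(self.quick_links))
        return "\n".join(lines)


# The short example from above, expressed with the hypothetical classes.
handoff = Handoff(
    known_risks=["payments retry queue is spiky during EU peak"],
    recent_changes=["checkout rate limiter updated Tuesday"],
    investigations=[
        # "Alex" as the pager target here is an assumption for illustration.
        Investigation("issue #123", "cache churn", "compare node latency", "Alex")
    ],
    escalation={"payments": "Alex", "infra": "Priya"},
    quick_links=["payments dashboard", "checkout runbook"],
)
print(handoff.render())
```

The only rule enforced is the 1 to 3 risk limit, because that is the one most likely to erode over time. Everything else stays free text so the handoff remains quick to write.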
Why this works
It’s short enough to complete and specific enough to be useful. The next responder starts with context, not a blank screen.
If you want to go deeper, add a weekly retro. But keep the handoff clean.
How we run this at DawnOps
We fill in the handoff from real signals rather than guesswork:
- a short list of known risks from repeat questions and recent escalations
- the exact dashboards and runbooks used in the last incident
- named owners so paging is unambiguous