DawnOps

A lightweight on-call handoff template

If you run on-call for a team, your job at handoff is to transfer risk, not to write a novel. Handoffs fail when they’re long, vague, or both.

This template takes about 10 minutes to fill in and can save the next responder hours of confusion.

The template

1) Known risks this week

List the top 1 to 3 risks that might wake someone up:

  • a flaky job
  • a fragile integration
  • a recent risky deploy

Keep it short and blunt.

2) Recent changes that matter

What changed that could cause a surprise?

  • deploys that touched core paths
  • config changes
  • dependency upgrades

If nothing changed, say “none.”

3) Active investigations

If there’s an open issue, include:

  • the issue link
  • current hypothesis
  • next action
  • who to page

4) Escalation map

Who can help if it gets weird?

  • primary engineer on the area
  • backup contact
  • external vendor contact if needed

5) Quick links

Only include links that will be used in the first five minutes:

  • primary dashboard
  • runbook
  • alert policy

A short example

  • Known risks: payments retry queue is spiky during EU peak.
  • Recent changes: checkout rate limiter updated Tuesday.
  • Active investigations: issue #123, hypothesis is cache churn; next action is to compare node latency.
  • Escalation: page Alex for payments, Priya for infra.
  • Quick links: payments dashboard, checkout runbook.
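
For teams that keep handoffs in a repo or post them from a script, the template could be captured as a small data structure. The following is a minimal sketch, not a DawnOps tool: the class names, field names, and the two checks in problems() are assumptions made for illustration. It reads "top 1 to 3" as a hard cap of three risks and writes "none" for any empty section, which extends the rule the template states only for recent changes. The values come from the example above, except who_to_page, which the example does not give and is filled in hypothetically.

```python
from dataclasses import dataclass, field


@dataclass
class Investigation:
    issue_link: str
    hypothesis: str
    next_action: str
    who_to_page: str


@dataclass
class Handoff:
    known_risks: list[str]
    recent_changes: list[str] = field(default_factory=list)
    investigations: list[Investigation] = field(default_factory=list)
    escalation: dict[str, str] = field(default_factory=dict)  # area -> person to page
    quick_links: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return template violations; an empty list means the note is ready."""
        found = []
        # Assumption: "top 1 to 3" is read as a hard cap of three risks.
        if len(self.known_risks) > 3:
            found.append("trim known risks to the top 3")
        # Assumption: a handoff without a named contact is incomplete.
        if not self.escalation:
            found.append("name at least one person to page")
        return found

    def render(self) -> str:
        """Render one line per section, writing "none" for empty sections."""
        investigations = [
            f"{i.issue_link}, hypothesis is {i.hypothesis}; "
            f"next action is {i.next_action}; page {i.who_to_page}"
            for i in self.investigations
        ]
        escalation = [f"page {who} for {area}" for area, who in self.escalation.items()]
        sections = [
            ("Known risks", self.known_risks),
            ("Recent changes", self.recent_changes),
            ("Active investigations", investigations),
            ("Escalation", escalation),
            ("Quick links", self.quick_links),
        ]
        lines = []
        for label, items in sections:
            body = "; ".join(items) if items else "none"
            lines.append(f"{label}: {body}")
        return "\n".join(lines)


note = Handoff(
    known_risks=["payments retry queue is spiky during EU peak"],
    recent_changes=["checkout rate limiter updated Tuesday"],
    investigations=[
        Investigation(
            issue_link="issue #123",
            hypothesis="cache churn",
            next_action="to compare node latency",
            who_to_page="Priya",  # assumption: the example names no pager for this issue
        )
    ],
    escalation={"payments": "Alex", "infra": "Priya"},
    quick_links=["payments dashboard", "checkout runbook"],
)
print(note.problems())
print(note.render())
```

Running the checks before posting keeps the note honest about its own rules: a fourth risk or a missing contact shows up as a problem to fix, not as a longer handoff.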

Why this works

It’s short enough to complete and specific enough to be useful. The next responder starts with context, not a blank screen.

If you want to go deeper, add a weekly retro. But keep the handoff clean.

How we run this at DawnOps

We keep the handoff aligned to real signals:

  • a short list of known risks from repeat questions and recent escalations
  • the exact dashboards and runbooks used in the last incident
  • named owners so paging is unambiguous
