A practical on-call ramp for new engineers
Most teams either ramp new engineers too fast (and burn them out) or too slow (and pay for it in bottlenecks).
A good on-call ramp is not a time-based checklist. It’s a competency-based progression: engineers advance when they demonstrate a skill, not when a set number of weeks has passed.
What new engineers actually need
They need reps with:
- your service architecture
- your observability conventions
- your common failure modes
- your mitigations and guardrails
- your comms habits
They do not need 40 pages of docs.
A 4-stage ramp that works
Stage 1: Shadowing with intent
- pick 2–3 representative incidents (real postmortems)
- replay them using dashboards/logs
- ask: “what would you do next, and why?”
Stage 2: Guided simulations
- short exercises (20–40 minutes)
- clear objectives (stabilize an SLO, reduce lag, restore throughput)
- a mentor watching decision-making, not typing speed
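One way to keep guided simulations repeatable is to write each one down as a small structured record. The sketch below is only an illustration: the `Simulation` class, its field names, and the example drill are hypothetical, and the 20 to 40 minute check simply mirrors the guideline above.

```python
from dataclasses import dataclass, field


@dataclass
class Simulation:
    """One guided exercise, ideally derived from a real postmortem."""
    name: str
    objective: str                  # e.g. "reduce lag" or "restore throughput"
    time_limit_minutes: int
    # What the mentor watches for: decisions, not typing speed.
    mentor_checklist: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return the reasons this exercise is not ready to run (empty if none)."""
        issues = []
        if not 20 <= self.time_limit_minutes <= 40:
            issues.append("time limit should be 20 to 40 minutes")
        if not self.objective.strip():
            issues.append("objective is missing")
        if not self.mentor_checklist:
            issues.append("mentor checklist is empty")
        return issues


# Hypothetical example drill; the name and checklist items are made up.
drill = Simulation(
    name="queue-backlog-replay",
    objective="reduce lag",
    time_limit_minutes=30,
    mentor_checklist=[
        "states a hypothesis before acting",
        "checks the dashboard before changing anything",
    ],
)
```

Calling `drill.problems()` before scheduling a session is a cheap check that the exercise has a clear objective and that the mentor knows what to observe.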
Stage 3: Assisted on-call
- the new engineer is primary on low-risk incidents
- a senior is explicitly “co-pilot”
- comms templates and runbooks are required, not optional
Stage 4: Independent on-call with periodic drills
- quarterly simulations to prevent skill atrophy
- new failure modes added over time
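The four stages above can be gated on demonstrated competencies rather than elapsed time. The following is a minimal sketch under assumptions: the competency names in `STAGE_REQUIREMENTS` and the `ready_to_advance` helper are hypothetical placeholders, and each team would substitute its own list.

```python
# Hypothetical competencies required to leave each stage.
# Stage 4 is the final stage, so it has no exit requirements.
STAGE_REQUIREMENTS = {
    1: {"explains_service_architecture", "reads_dashboards_and_logs"},
    2: {"meets_simulation_objectives", "chooses_safe_mitigations"},
    3: {"leads_low_risk_incident", "uses_comms_templates_and_runbooks"},
}


def ready_to_advance(stage: int, demonstrated: set[str]) -> tuple[bool, list[str]]:
    """Return (ready, missing) for an engineer currently in `stage`.

    Readiness depends only on demonstrated competencies; time spent in
    the stage is deliberately not an input.
    """
    if stage not in STAGE_REQUIREMENTS:
        raise ValueError(f"no exit requirements defined for stage {stage}")
    missing = sorted(STAGE_REQUIREMENTS[stage] - demonstrated)
    return (not missing, missing)
```

Returning the missing competencies alongside the yes/no answer gives the mentor a concrete list of what to practice next.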
What managers should measure
- trend in time-to-diagnose and time-to-mitigate
- number of “stuck” moments per incident
- runbook reliance (good) vs improvisation (risky)
- confidence without complacency
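The first two measures in this list are straightforward to track from incident records. As a rough sketch, the trend can be summarized as the least-squares slope across an engineer's incidents in chronological order. The `trend_slope` helper and the sample numbers below are illustrative assumptions, not real data.

```python
def trend_slope(values: list[float]) -> float:
    """Least-squares slope of `values` against their position (0, 1, 2, ...).

    A negative slope means the measure is shrinking from one incident to
    the next; a positive slope means it is growing.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two incidents to compute a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Made-up sample: minutes to mitigate and "stuck" moments across four
# consecutive incidents for one engineer.
minutes_to_mitigate = [50, 40, 30, 20]
stuck_moments = [3, 3, 2, 1]

mitigation_trend = trend_slope(minutes_to_mitigate)  # minutes per incident
stuck_trend = trend_slope(stuck_moments)             # stuck moments per incident
```

With only a handful of incidents the slope is noisy, so it works better as a conversation starter than as a pass/fail score.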
The goal is not to create heroes. It’s to create a reliable baseline of competence across the team.
If you want to shorten your ramp without increasing risk, start by turning two real postmortems into two repeatable simulations.