handbook
An Operator's Handbook
Notes from the invisible profession
03:47:12 UTC CRITICAL — bgp.session.down
03:47:13 UTC
checking failover path...
03:47:14 UTC
BFD session active, rerouting
03:48:34 UTC RESOLVED — 48/48 peers established
A Operator's
Handbook
Handbook
Notes from the invisible profession
First edition
Uptime: ongoing
"The infrastructure does not care who maintains it. But someone must, and that someone learns to love the silence — not because it is peaceful, but because it means the thing they built is still breathing." — from a post-incident Slack thread, 4:12 a.m., never acknowledged
A book about what operators actually do — the 3 a.m. pages, the meetings where someone proposes cutting the redundancy that prevented the last outage, the weight of a green dashboard that nobody will ever praise you for keeping green.
Contents
Part I: First light
- The first page
- The map is not the territory
- The weight of green
Part II: The cycle
- Not this again
- Screaming into the void
- The ones who leave
Part III: The language
- The incident report
- Blameless
Part IV: The long haul
- The thing you built
- Passing the pager
Part V: And still
- Go fuck yourself
- Moments of clarity
From the margins
"You'll learn what normal looks like." — your first manager, week one, the only useful thing she said
"The map is not the territory, but the operator who has walked the territory is the map." — scrawled on a whiteboard in the NOC, erased by facilities during a weekend deep clean
"All systems nominal." — the most expensive sentence in operations, delivered free of charge, every fifteen minutes
"The document exists. The tickets exist. The post-mortems exist. The problem was never a lack of documentation. The problem is that organizations don't read — they rediscover." — annotation in a margin of a design doc, author unknown, last modified 2021, last viewed 2021