handbook

An Operator's Handbook

Notes from the invisible profession

03:47:12 UTC CRITICAL — bgp.session.down
03:47:13 UTC checking failover path...
03:47:14 UTC BFD session active, rerouting
03:48:34 UTC RESOLVED — 48/48 peers established
A Operator's
Handbook

Notes from the invisible profession

First edition
Uptime: ongoing

"The infrastructure does not care who maintains it. But someone must, and that someone learns to love the silence — not because it is peaceful, but because it means the thing they built is still breathing." — from a post-incident Slack thread, 4:12 a.m., never acknowledged

A book about what operators actually do — the 3 a.m. pages, the meetings where someone proposes cutting the redundancy that prevented the last outage, the weight of a green dashboard that nobody will ever praise you for keeping green.

Contents

Part I: First light
  • The first page
  • The map is not the territory
  • The weight of green
Part II: The cycle
  • Not this again
  • Screaming into the void
  • The ones who leave
Part III: The language
  • The incident report
  • Blameless
Part IV: The long haul
  • The thing you built
  • Passing the pager
Part V: And still
  • Go fuck yourself
  • Moments of clarity
Read the PDF →

From the margins

"You'll learn what normal looks like." — your first manager, week one, the only useful thing she said
"The map is not the territory, but the operator who has walked the territory is the map." — scrawled on a whiteboard in the NOC, erased by facilities during a weekend deep clean
"All systems nominal." — the most expensive sentence in operations, delivered free of charge, every fifteen minutes
"The document exists. The tickets exist. The post-mortems exist. The problem was never a lack of documentation. The problem is that organizations don't read — they rediscover." — annotation in a margin of a design doc, author unknown, last modified 2021, last viewed 2021