Every few months, something shiny shows up. A new database that “scales infinitely.” A framework that promises to solve distributed consensus in 10 lines. A storage engine written in Rust that’s “blazingly fast.”
And every time, I have the same thought: will this thing still be maintained when I need to debug it at 2 AM on a Saturday?
The Case for Boring
The most reliable systems I’ve built — and more importantly, the ones I’ve operated — are boring. Not boring as in “nobody thought about them.” Boring as in the choices were so obvious they didn’t generate any debate.
Postgres over the latest NoSQL flavor. A simple REST API over a GraphQL layer nobody asked for. Cron jobs over a custom scheduler. A monolith before microservices.
These aren’t sexy choices. But here’s what they buy you:
- Debugging is straightforward. Google your error and there are 400 Stack Overflow answers from 2014.
- Onboarding is fast. New engineers don’t need a PhD in your custom abstraction layer.
- Failure modes are well-understood. Postgres connection pooling issues? There’s a playbook. Your bespoke event mesh melting down? Good luck.
A Real Example
I once worked on a system that processed millions of messages a day. The original design had four message queues chained together, each with its own retry logic, dead-letter handling, and serialization format.
It was impressive on a whiteboard.
In production, debugging a single failed message meant jumping across four dashboards, correlating timestamps, and praying the logs hadn’t rotated.
We rewrote it with a single Kafka topic and a consumer that wrote to Postgres. One queue, one database, one log stream. The whole pipeline became something a new hire could trace end-to-end in 30 minutes.
Was it less “architecturally elegant”? Sure. But incidents that used to take hours to debug now took minutes.
When Boring Gets Hard
The tricky part is that boring takes discipline. It’s easy to say “just use Postgres” when you’re starting fresh. It’s harder when someone brings a compelling argument for a new technology in a design review.
Here’s my mental filter:
- Will someone besides me operate this at 3 AM? If yes, pick the thing they already know.
- Does this solve a problem we actually have, or one we might have? Premature optimization isn’t just about code — it applies to architecture too.
- How many people in the world have debugged this in production? The size of that community is directly proportional to how fast you’ll recover from outages.
Boring Scales
There’s a misconception that boring tech doesn’t scale. That’s backwards. Boring tech has already scaled — that’s why it’s boring. Postgres handles more production traffic globally than most hyped databases ever will. Kafka has been battle-tested at LinkedIn, Netflix, and thousands of companies that never wrote blog posts about it.
The systems that survive aren’t the ones that trended on Hacker News. They’re the ones that kept running quietly while everything around them caught fire.
Boring systems don’t get applause.
But they don’t page you at 3 AM either.