The Oncall Paradox: When AI Writes the Code but You're Still On the Hook

AI is generating more of your codebase every sprint. Your oncall rotation hasn't changed. Neither has your pager. But the nature of the incidents? Completely different. Here's why the oncall paradox is quietly burning out the engineers who thought AI would make their jobs easier.

Published May 26, 2026 · ~4,100 words · The Clearing

It was 2:47am when Marcus's pager went off. Production database latency had spiked. Response times were climbing. His phone showed seventeen Slack messages in the #incidents channel. By the time he got to his laptop, his hands were already shaking.

Marcus had been oncall for this service for three years. He'd written most of the original architecture. But over the past eight months, AI tools had accelerated feature development so much that the codebase had essentially been rewritten three times over. He'd reviewed the PRs. He'd approved them. He hadn't written them.

At 3:30am, staring at a stack trace he couldn't fully follow, Marcus realized he didn't know this system anymore. Not really. Not well enough to debug it alone at 3am. He reached for ChatGPT. The AI helped him form hypotheses. It helped him narrow down the search space. But it couldn't tell him which engineer had introduced the subtle race condition in the new async handler, because it didn't know either.

He fixed the incident at 4:15am. He went to bed. He didn't sleep.

Three weeks later, Marcus quit. Not because the job was hard. Because the job had become impossible in a new way he didn't have words for.

He calls it the oncall paradox: AI generates the code. You're still responsible for what it does at 3am.

What the Paradox Actually Is

The oncall paradox has three layers that compound on each other:

Layer 1: Ownership without authorship. The classic engineering contract was clear: if you build it, you own it. You wrote the code. You understand it deeply. When it breaks, you can fix it. This was the psychological foundation of oncall responsibility — a form of legitimate authority earned through authorship.

AI severs that link. Your team ships features faster because AI writes them. But you're still on the oncall rotation for the resulting system. You have ownership without the authorship that used to justify that ownership. At 2am, you're responsible for understanding a system you didn't write — and the AI that helped write it can't answer your questions about why it's failing.

Layer 2: Incidents without learning. In classical oncall, incidents were uncomfortable but educationally valuable. Debugging a system you'd built yourself reinforced your mental model. You'd make a mistake, trace it back, understand the failure mode, and update your intuition. The loop closed.

AI-assisted incidents break that loop. When AI generates the code, your debugging is often a search problem — finding which AI-generated component is misbehaving — rather than a reasoning problem. You resolve the incident. But you don't always learn why it happened, because the failure mode was embedded in AI-generated logic you don't have an intuitive model for. You fix the symptom. The underlying pattern stays invisible.

Layer 3: Cognitive debt that accrues at night. The combined result is a specific form of cognitive debt that manifests most painfully during incidents. Your mental model of the system is shallower than it looks. Your confidence in your ability to handle production failures is misaligned with reality. And that gap between your confidence and your actual understanding is largest at 3am, when deliberate System 2 reasoning is compromised and you're falling back on System 1 pattern matching rather than deep comprehension.

The Asymmetry Nobody Talks About

Here's what's strange about the oncall paradox: the people most likely to experience it are senior engineers. Not junior engineers. Seniors.

Junior engineers are oncall for systems they also don't fully understand — but that's always been true for juniors. They escalate. They learn. The system has always had a gradient of expertise.

Senior engineers are the ones who used to have the deepest ownership. They architected the systems. They understood every layer. But when AI accelerates the codebase faster than any single person can absorb, seniors are suddenly oncall for systems that have been extensively modified — often without their direct authorship — by AI tools operating at a pace they couldn't track.

This is a genuinely new asymmetry:

Junior engineers are oncall for systems they don't fully understand → normal, expected, part of growth
Senior engineers are oncall for systems they don't fully understand → new, caused by AI-accelerated churn, much harder to admit or escalate

Seniors can't easily say "I don't understand this part of our system" without signaling a kind of incompetence that's traditionally been associated with underperforming engineers. So they absorb the anxiety. They live with the quiet dread of the next page. They tell themselves they'll read the code more carefully. They never do, because there's always more AI-generated code to review.

What Oncall Looks Like in the AI Era

To understand the practical dimensions of the paradox, it helps to contrast traditional oncall with AI-era oncall:

👤 Traditional Oncall

Incidents rooted in code you wrote or reviewed
Failure modes map to design decisions you understand
Debugging reinforces mental models
Ownership is earned and maintained through authorship
Senior engineers have the deepest system knowledge
Oncall is stressful but generates learning

🤖 AI-Era Oncall

Incidents in AI-generated code you reviewed but didn't write
Failure modes may be in AI-generated logic with no design rationale
Debugging is often search, not reasoning
Ownership is assigned regardless of authorship
System knowledge decays faster than it can be maintained
Oncall is stressful and generates anxiety without learning

The second list describes a worse oncall experience by almost any measure, and it's the one that more and more senior engineers are living in.

The Three Diagnostic Signals

How do you know if you're experiencing the oncall paradox? These three signals appear reliably in engineers who are.

📞 The DREAD Score

You check your oncall schedule and feel disproportionate dread. Not the normal "I hope nothing breaks" energy — a specific anxiety tied to the sense that if something breaks, you won't be able to fix it without AI's help, and the AI's help won't be enough.

🔍 Surface-Level Debugging

You find yourself applying fixes rather than understanding root causes. You identify the failing component, patch it, and close the incident — but you couldn't explain why it failed in a way that would let you prevent recurrence.

📱 Escalation Hesitation

You delay paging the next level because you're not sure the issue warrants it — but the real reason is that you're not sure you can articulate what's happening well enough to escalate cleanly. You're embarrassed to admit you don't understand a system you own.

If two of these three resonate, the oncall paradox is probably affecting you. If all three are present, it's significantly impacting your wellbeing and your team's safety.

Who Falls Fastest

The oncall paradox doesn't affect all engineers equally. Three profiles experience it most acutely:

The mid-career architect (6-12 years): You built the original system. You know its bones. But AI has accelerated the codebase past what you can maintain in your head. You're functionally oncall for a system that's been significantly rewritten without you. Your mental model is outdated in ways you can't easily see.

The velocity-obsessed team lead: You measure success by shipping. You greenlit AI-assisted development across your team. You approved the PRs. And now you're oncall for a system that's fundamentally different from the one you thought you were shipping. The irony is that your success at adopting AI tooling is partially responsible for the complexity you're now managing.

The solo oncall senior: In smaller organizations, senior engineers often carry the oncall burden alone. There's no escalation above you. If something breaks in AI-generated code that you didn't write or fully review, there's nowhere to go except AI assistants and documentation that you may have helped write but can no longer interpret. The isolation amplifies the anxiety.

The Compounding Math

The paradox gets worse over time, not better. Here's why:

AI-generated code tends to have a specific characteristic: it's functional but not always intentional. It solves the stated problem. It doesn't always encode the design rationale, the edge cases considered, the constraints that shaped the solution. When you wrote code, your commits told stories. When AI writes code, the commits are clean — and silent.

This means that each sprint, the gap between "code in production" and "understanding in your head" tends to widen, not narrow. The more AI accelerates development, the larger the ownership gap grows. At some point, the gap becomes large enough that oncall becomes a form of performance anxiety rather than engineering problem-solving.

Engineers who experience this consistently describe a pattern: they start treating oncall as something to survive rather than something to learn from. They become reactive. They stop doing the deep debugging that would close the knowledge gap. They lean harder on AI during incidents — which helps them close incidents faster but doesn't help them understand the system better. The loop accelerates in the wrong direction.

The Framework: Owning What You Didn't Author

The oncall paradox doesn't have a clean solution — but it has a framework for managing it. The goal isn't to eliminate AI from the development process (that's not happening) but to maintain enough genuine understanding of production systems that oncall remains survivable and educational.

Map the Ownership Gap

Once a month, identify which parts of your production system were primarily authored by AI in the last 90 days. This isn't about blame — it's about awareness. You need to know where your mental model is most likely to be inaccurate.
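
One way to make this monthly mapping concrete is a small script over your merge history. The sketch below is illustrative only: the `Change` record and its `ai_authored` flag are hypothetical and assume your team labels AI-authored changes somehow (a PR label, for example). Changed lines are a crude proxy for "primarily authored by AI".

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Change:
    component: str      # e.g. a service or top-level directory
    merged_on: date
    lines_changed: int
    ai_authored: bool   # hypothetical flag; assumes your team labels AI-authored changes


def ownership_gap(changes, today, window_days=90):
    """Return {component: fraction of changed lines that were AI-authored} within the window."""
    cutoff = today - timedelta(days=window_days)
    total = defaultdict(int)
    ai = defaultdict(int)
    for c in changes:
        if c.merged_on < cutoff:
            continue  # older than the window, not part of this month's map
        total[c.component] += c.lines_changed
        if c.ai_authored:
            ai[c.component] += c.lines_changed
    return {name: ai[name] / total[name] for name in total if total[name] > 0}


# Made-up sample data for illustration.
changes = [
    Change("billing", date(2026, 5, 1), 300, True),
    Change("billing", date(2026, 5, 10), 100, False),
    Change("auth", date(2026, 4, 20), 50, False),
    Change("billing", date(2025, 12, 1), 900, True),  # outside the 90-day window
]
gap = ownership_gap(changes, today=date(2026, 5, 26))
for name, share in sorted(gap.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.0%} AI-authored")
```

Components at the top of the output are where your mental model is most likely to be stale, which makes them reasonable first targets for deliberate exploration.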

Do a Post-Incident Understanding Audit

Every incident is a learning opportunity — but only if you close the loop. After any incident in AI-authored code, spend 30 minutes tracing the failure mode to its origin. Not to fix it (it's already fixed) but to understand it. Write a one-paragraph explanation and share it with the team.

Audit Before You Ship, Not Just After

AI-generated features should come with a mandatory understanding review before they go to production. This doesn't mean re-reading every line. It means a 20-minute session in which the author (human or AI) explains how the new code behaves to whoever will be oncall for it.

Track Understanding, Not Just Incidents

Add a metric to your team's health tracking: "Oncall Confidence Score" (1-5, self-assessed, monthly). If average confidence is declining while incident frequency is stable, that's the oncall paradox. You can't fix what you don't measure.
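
The check described above (confidence falling while incident counts hold steady) can be sketched in a few lines. This is a minimal illustration, not a validated metric: the function name, the two thresholds, and the choice to compare only the first and last month are all assumptions made for the example.

```python
from statistics import mean


def paradox_signal(confidence_by_month, incidents_by_month,
                   min_confidence_drop=0.5, max_incident_change=0.2):
    """Flag declining self-assessed confidence alongside roughly stable incident counts.

    confidence_by_month: list of lists of 1-5 scores, oldest month first.
    incidents_by_month:  incident counts for the same months.
    Both thresholds are illustrative assumptions; tune them for your team.
    """
    if len(confidence_by_month) < 2 or len(confidence_by_month) != len(incidents_by_month):
        raise ValueError("need at least two months of aligned data")
    averages = [mean(scores) for scores in confidence_by_month]
    confidence_drop = averages[0] - averages[-1]
    first, last = incidents_by_month[0], incidents_by_month[-1]
    incident_change = abs(last - first) / max(first, 1)  # relative change, guarded against zero
    return confidence_drop >= min_confidence_drop and incident_change <= max_incident_change


# Three months of made-up survey results for a three-person rotation.
confidence = [[4, 4, 5], [4, 3, 4], [3, 3, 3]]
incidents = [10, 11, 10]
print(paradox_signal(confidence, incidents))
```

Comparing only the endpoints keeps the sketch short. With more months of data, a fitted trend would be less sensitive to one unusual survey.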

Protect Deliberate Practice Time

Schedule 2 hours per week for intentional system exploration: not bug fixing, not code review, not AI-assisted development. Just you and the codebase, understanding something you don't currently understand. This is the most direct antidote to the ownership gap.

The Organizational Problem

The oncall paradox isn't just an individual experience — it's an organizational design failure that AI tooling has exposed.

Traditional oncall models were built around the assumption that the people oncall had authored or directly understood the systems they were responsible for. That assumption is increasingly false, and organizations haven't updated their models to match.

What needs to change at the team and organizational level:

Oncall assignment should account for authorship. If a feature was primarily authored by AI tools and reviewed by someone who didn't write the core logic, that person shouldn't be primary oncall for that feature unless they've done a genuine understanding review.
Incident reviews need an "understanding" dimension. Currently, most post-mortems focus on what failed and how to prevent recurrence. They should also ask: did the oncall engineer understand this part of the system? If not, why not?
The velocity metric needs a complexity offset. Shipping faster with AI shouldn't just be measured in features per sprint — it should also be measured in ownership debt accrued. If you're generating ownership debt faster than engineers can absorb it, you're building toward an oncall crisis.
Psychological safety around "I don't understand this" needs to be explicit. This is especially important for senior engineers who feel they can't admit knowledge gaps without appearing incompetent. Managers need to create explicit space for this admission — and reward engineers who surface ownership gaps proactively.
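
The "complexity offset" in the third point above could be tracked as a running ledger next to the usual velocity numbers. The sketch below is one hypothetical way to do that. The dictionary keys are invented for illustration, and counting lines is only a rough stand-in for how much understanding a change demands.

```python
def ownership_debt(sprints):
    """Running total of AI-authored lines shipped without an understanding review.

    Each sprint is a dict with hypothetical keys:
      'ai_lines_shipped'  - AI-authored lines that reached production
      'ai_lines_reviewed' - AI-authored lines covered by an understanding review
    """
    debt = 0
    history = []
    for sprint in sprints:
        # Reviews pay debt down, but it never drops below zero.
        debt = max(0, debt + sprint["ai_lines_shipped"] - sprint["ai_lines_reviewed"])
        history.append(debt)
    return history


# Made-up numbers for three sprints.
sprints = [
    {"ai_lines_shipped": 2000, "ai_lines_reviewed": 500},
    {"ai_lines_shipped": 1500, "ai_lines_reviewed": 500},
    {"ai_lines_shipped": 1000, "ai_lines_reviewed": 1200},
]
print(ownership_debt(sprints))
```

A ledger that rises sprint after sprint is the signal the text describes: debt is accruing faster than engineers can absorb it.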

The organizations that figure this out first will have a significant retention advantage. The engineers who are quietly suffering through the oncall paradox are, disproportionately, the most experienced engineers on their teams. They're also the hardest to replace.

The Quiet Cost

Marcus isn't unusual. The engineers who experience the oncall paradox most acutely rarely talk about it openly. There's shame attached to it — the sense that you should be able to understand your own systems, that senior engineers don't get confused, that admitting you can't debug your own production service is a kind of failure.

But the oncall paradox isn't a failure of individual engineers. It's an emergent property of a system that generates code faster than humans can maintain genuine understanding of it. It affects competent, experienced, careful engineers. It affects the people who are best equipped to handle complex incidents — except when those incidents occur in code they didn't write.

The quiet cost is engineers who stop sleeping well. Engineers who check their phones compulsively during off hours. Engineers who start looking for exits not because they're burnt out in the classical sense, but because they've lost the feeling of legitimate authority over the systems they're responsible for.

The fix isn't to use less AI. The fix is to be honest about what AI-generated code costs in terms of system ownership — and to redesign oncall practices, team structures, and individual habits to account for that cost.

Your pager will keep going off. The code it asks you to debug will keep being written by something other than you. The question is whether you and your organization have been honest enough about that gap to actually manage it — or whether you're just waiting for the next 2:47am page to tell you something is wrong.

Frequently Asked Questions

Is the oncall paradox different from general burnout?

Yes, but they're related. Classical burnout is typically caused by overwork, chronic stress, and emotional exhaustion. The oncall paradox is more specific: it's driven by a mismatch between responsibility and understanding. You own a system you don't fully comprehend, and that gap manifests most painfully during incidents. The anxiety is epistemic (about knowledge) rather than purely energetic (about hours worked). Both are serious, but they call for different interventions.

Should I stop using AI coding tools to avoid this?

No — that's not a viable path, and it's not the right fix. AI coding tools provide real productivity benefits. The solution is to build practices around AI-assisted development that account for the ownership gap. The framework above (map the gap, audit understanding, schedule deliberate practice) is designed to let you keep using AI while maintaining genuine system ownership. The goal is sustainable velocity, not velocity maximization at the cost of oncall wellbeing.

I'm a senior engineer and I'm experiencing this. Am I alone?

You're not alone, but you might feel like you are — and that's part of the problem. Senior engineers are socialized not to admit knowledge gaps. Admitting you don't understand a system you own feels like admitting incompetence. It isn't. It's an honest assessment of a genuine mismatch. The engineers experiencing the oncall paradox most acutely are often the most experienced, most conscientious engineers on their teams. That's not a coincidence — those are exactly the engineers who care most about genuine understanding, and who feel the gap most painfully when it exists.

How do I bring this up with my manager?

Frame it as a team-level risk, not a personal limitation. The framing that works best: "I've noticed that as we've accelerated development with AI tools, our oncall load has changed in character. More incidents are occurring in code that wasn't directly authored by the oncall engineer. I think this is a pattern we should track and design around." This is a legitimate engineering risk, and one you can back with data once you start tracking it. It positions you as identifying a systemic problem rather than admitting a personal failure.

Does this affect SRE and platform teams differently?

Yes, and often more acutely. SRE and platform teams typically own the infrastructure that AI-generated application code runs on. They're oncall for increasingly complex systems built on increasingly abstract foundations, while application developers generate code with AI at a pace that platform teams can't fully absorb or review. Platform engineers are often the ones catching the cascading failures from AI-generated code that doesn't account for rate limits, fallback behavior, or dependency failure modes. The paradox is especially sharp for teams that own shared infrastructure.
