June 1, 2026 · 14 min read

Your coding agent goes quiet at 3am. That's the moment that matters.

They were built for the joyful part. Nobody built for the operational reality senior engineers actually live in.

Hand-drawn incident response diagram titled 'Daniel's Night'. At the top, a SEV 1 incident asks, 'Why is checkout failing?'. Five information sources surround a stressed engineer sitting at a laptop: Slack, Docs/Runbooks, Jira, Datadog, and Logs. Each source contains partial or incomplete information, such as unclear answers in Slack, outdated runbooks, missing Jira links, excessive Datadog alerts, and difficult-to-scan logs. Dashed arrows connect the tools in a loop around the engineer, illustrating constant context switching. Callouts highlight problems including partial context, outdated information, missing links, signal buried in noise, and manual digging. At the bottom, the engineer asks, 'Which one is right?', leading to the conclusion: 'Still no trusted answer — More time. More risk. More frustration.' The diagram visualizes how engineers hop between tools during incidents but still struggle to reach a reliable answer.

Most tools being built show up when you're writing new code. None of them show up when you're trying to save production at 3am. The gap is not an accident, and it is about to become the most expensive thing in your engineering org.

It's 3:14 am, and Daniel's (name changed) phone is doing something unusual: instead of the polite buzz of a Slack mention, it is sounding the incessant alarm of a page that has already gone unacknowledged once. He's the senior on call this week, as he is one week every month.

Checkout latency is through the roof with the error rate rising. He's awake, laptop open, and the clock that matters — the one that turns into refunded orders and a Monday-morning conversation with the VP — has already started.

Here is what the next ninety minutes actually look like. Not the sanitised retro version, the real one.

He starts where everyone does: "what changed?" The question sounds simple but is not, because the answer is scattered across six tools that don't talk to each other. He checks the deployment dashboard; three services shipped in the last few hours. He scrolls the team Slack, two hundred messages deep, looking for any mention of someone touching the payments path. He pulls up the company's internal search (they pay for a good one) and types "checkout retry timeout", which returns a wall of documents: a runbook from 2023, two design docs, and a half-finished Confluence page. The search technically worked, but it did not answer the question. He still has to open each one, skim it, decide whether it is current, and jot down what matters before opening the next. He is synthesising by hand at 3 am on about three hours of sleep.

Thirty minutes in, he has yet to fix anything. He is still assembling a picture of his own system.

He's not slow; this is just the job. Industry analyses of incident response estimate that 40% to 60% of the time is lost reconstructing context across tools, with engineers burning 15 to 25 minutes gathering scattered information before investigation begins. A typical outage involves 5 minutes on Slack, 10 minutes checking recent commits, 5 minutes reviewing dashboards, and only then, 25 minutes in, finally understanding that a specific deployment caused the spike. The technical fix, once you know what to do, is often trivial. Getting to the point where you know what to do is where the night goes.

Eventually Daniel narrows it down through the shape of the time series: latency creeping up in a sawtooth that smells like retries stacking up. Now he needs the data to confirm it, which is not just a different tool but a different mode of work entirely. He is writing ad hoc queries against the metrics backend, massaging time windows, and building a throwaway script to pull the numbers he needs, because nobody built the dashboard in advance. Terminal, browser, metrics tool, back to terminal. Every tool switch breaks the flow and forces him to reorient.

He finds it. Someone raised the retry cap in one of those afternoon deploys. There was a reason the cap was where it was — there always is — but it lives in a code-review comment from eight months ago and in the head of the staff engineer who is, mercifully for her, asleep. Daniel makes the call, ships the fix, watches the graph normalise, and posts the all-clear in the channel.
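To see why a raised retry cap can turn a struggling dependency into an outage, here is a rough back-of-the-envelope sketch. It assumes each attempt fails independently with the same probability, which real systems rarely honour exactly. The function name, the failure rate, and the raised cap of 5 are illustrative assumptions; only the original cap of 2 comes from this story.

```python
def expected_attempts(failure_prob: float, retry_cap: int) -> float:
    """Expected calls per request when each attempt fails independently
    with probability failure_prob and the client retries at most
    retry_cap times (so at most retry_cap + 1 attempts in total)."""
    if not 0.0 <= failure_prob <= 1.0:
        raise ValueError("failure_prob must be between 0 and 1")
    if retry_cap < 0:
        raise ValueError("retry_cap must be non-negative")
    # Attempt k (0-based) only happens if the previous k attempts all failed.
    return sum(failure_prob ** k for k in range(retry_cap + 1))


# Hypothetical numbers: a dependency failing 90% of the time.
for cap in (2, 5):
    print(cap, round(expected_attempts(0.9, cap), 2))
```

Under these assumptions, a cap of 2 sends about 2.7 calls per request and a cap of 5 sends about 4.7, roughly 70% more traffic aimed at the component that is already failing.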

Then comes the part he hates the most.

He has to write it all down. The postmortem (the timeline, the root cause, the decisions he made, and the follow-ups) exists so that the next person who hits this does not have to re-run his entire night. At 5 am, he also knows that he is going to write a thin, joyless version of it because he is exhausted, and that three months from now someone will hit something adjacent and have to start their own 3 am from scratch. How would they find this postmortem anyway?

Here's the part worth sitting with.

At no point during that entire night did any of the AI tooling his company had invested in have anything to offer him.

The coding agents that can scaffold a service in seconds? Silent. The coworker agents drafting PRDs and summarising meetings? Nowhere.

The entire promise of "AI is transforming software development" was absent during the ninety minutes when Daniel's job was the hardest, the stakes were the highest, and the company was actively losing money. The agents are present for the joyful part of writing happy-path code and are absent for the operational reality that senior engineers spend most of their lives in.

That should bother you more than it probably does.

This was never a coding problem

Let's replay Daniel's night and ask at each step, "What was actually missing?"

It wasn't the ability to write code. He can write code in his sleep, nearly literally. What was missing, every single time, was knowledge that existed somewhere in the organisation but not where he needed it, when he needed it.

What changed in the last four hours? That existed in the deploy logs and the PRs. What the retry cap was set to, and why? That existed in a review thread. Whether anyone had seen this shape before? That existed in a past incident that no one could surface. The fix existed the moment he reconstructed enough context to see it.

Every step of the incident was a knowledge problem wearing an incident's clothes. The search tool didn't fail because it was bad; it failed because finding a pile of documents to read is not the same as answering a question. At 3 am the synthesis tax is paid in the most expensive currency: a senior engineer's attention, under pressure, against the clock.

This is the quiet truth about senior engineers that no org chart shows: the most experienced people function as the company's living knowledge layer. The reason the team "asks Daniel" is that Daniel has, in his head, the mapped, reconciled, current, sourced understanding of how things actually work and why. This is the stuff that was decided in a meeting, argued out in a review, learnt the hard way during a previous outage, and never written down anywhere a machine or a newcomer could reach. He is the knowledge base. That is wonderful only right up until he's asleep, or on vacation, or has left the company and taken the only copy with him.

This is not a small or exotic problem; it is the median engineering day. Multiple studies converge on the finding that developers spend only about 16% of their time actually writing application code, with the rest going to operational and supportive work: monitoring, maintenance, coordination, and the endless hunt for context. More than 60% of them report spending 30 minutes or more a day just searching for answers, and one in four spend at least an hour. The real bottleneck was never writing code. It was knowing.

AI got dramatically better at the 16% while the other 84% barely moved

This is where it gets complicated for anyone betting their roadmap on "AI will fix engineering velocity".

The coding harnesses have gotten really good very fast, aside from the occasional stub and hallucinated API. Yet in a controlled trial, 16 experienced developers working in codebases they knew well were measurably slower when using AI tools (19% slower) and, more tellingly, believed that they had been about 20% faster. The people running the study and the developers themselves had both predicted a speedup. The reality was the opposite, and the participants just couldn't feel it.

Hold that finding honestly, with caveats: it was a small study on mature codebases with early-2025 tooling. The gap has likely narrowed as models, coding harnesses, and workflows have matured. The point is the perception gap: engineers will feel faster whether or not they really are, which means the feeling is not a metric you can run an engineering org on.

Stepping back from coding, the larger pattern shows up. Enterprise analysis of why agent-assisted work fails keeps arriving at the same culprit: not raw model capability, but missing context and planning. The agent didn't know the constraint, how the org functions, or the thing that was never explicitly written down or prompted. The capability curve went vertical, but the curve for team-specific operational knowledge didn't move at all. The retry rationale, the services that must be deployed together, the operational gotchas: they are still trapped in the minds of people like Daniel and scattered across tools.

The gains therefore stall at a very specific place: where knowledge lives in someone's head. Which is exactly where Daniel was standing at 3:14 am.

It gets worse as you adopt more agents

The instinct is: "so we'll point the agent at the operational work too." Good instinct. But notice what happens to the knowledge gap as you add agents.

Every agent is a brand-new hire with no memory. It has never sat in your architecture review. It doesn't know that payments and inventory must deploy together, or that the "temporary" rate limiter from last spring is now load-bearing. So every agent, every time, re-asks the same questions that new engineers ask, and the answers still live in the minds of your seniors. You haven't reduced the demand on Daniel for context; you have multiplied it by the number of things asking him and pointed them all at the same undocumented bottleneck.

Meanwhile, the underlying decay continues. The tribal understanding erodes with every departure and every reorg. New engineers, humans and agents alike, ramp slowly because onboarding is the transfer of exactly this unwritten knowledge, and that transfer doesn't scale by hiring more people or spinning up more agents. With the share of work spent on toil rising to 30%, its first increase in five years, the load is going up while the knowledge to handle it is spread ever thinner.

The more agentic the engineering org becomes, the more unwritten, unreconciled, person-trapped knowledge becomes the bottleneck for everything. You can buy more capability, but you cannot buy back the context your team never wrote down.

What good teams are starting to do about it

The teams getting ahead are treating their working knowledge as a real infrastructure layer — not a byproduct that happens to accumulate in the minds of people. Here are a few moves:

Capture the why, not just the what. A merged PR records what changed. The reasoning of why the retry cap is and must stay at 2 usually evaporates into a review thread and then human memory. The decisions that cause incidents are the ones whose rationale was never written where the next person would look.

Put the constraints where the work happens. A rule that lives in a Confluence page nobody opens during an incident may as well not exist. Knowledge has to surface at the moment of action (when the diff is being written, when the change is being reviewed, when the page is firing), not in a wiki that you'd need to remember to consult.

Treat context as a first-class artefact. The same reconciled answer that a senior would give (current, sourced, and specific to your system) is what teams need to build and maintain deliberately.

Close the loop from incidents back to knowledge. The postmortem Daniel hates writing at 5 am exists to feed exactly this layer. The tragedy is that it's a manual tax paid by an exhausted human, when most of what it needs to capture (timeline, deploys, decisions, threads) already exists in the systems that watched it happen.

The last point is the tell. The reason on-call is a knowledge problem is also the reason it is solvable — everything the next responder needs was already produced and recorded somewhere. It just needed to be reconciled into one current answer at the moment it mattered.
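The first two moves, capturing the why and surfacing it where the work happens, can be sketched together in a few lines. This is a minimal illustration, not a description of any particular product: the `Constraint` record, the `check_change` helper, the config key, and the placeholder rationale text are all hypothetical. Only the cap of 2 comes from Daniel's story.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    key: str        # config key the rule protects (hypothetical name)
    max_value: int  # highest value considered safe
    rationale: str  # the "why", stored next to the rule itself
    source: str     # where the decision was made, so it can be verified


def check_change(constraints, key, new_value):
    """Return human-readable warnings for a proposed config change."""
    warnings = []
    for c in constraints:
        if c.key == key and new_value > c.max_value:
            warnings.append(
                f"{key}={new_value} exceeds limit {c.max_value}: "
                f"{c.rationale} (source: {c.source})"
            )
    return warnings


# Placeholder rationale and source; the real ones live in the review thread.
RULES = [
    Constraint(
        key="checkout.retry_cap",
        max_value=2,
        rationale="placeholder: reason recorded at review time",
        source="placeholder: link to the review thread",
    )
]

for warning in check_change(RULES, "checkout.retry_cap", 5):
    print(warning)
```

Run as a review-time or pre-merge check, something of this shape would have put the eight-month-old reasoning in front of whoever raised the cap, instead of leaving it in a thread nobody reread.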

The shape of the fix

Picture Daniel's night with one thing changed.

The page fires. Before Daniel opens his laptop, an agent queries: "What changed in checkout in the last 4 hours?" and "What are the hard constraints in checkout?" Before he opens the six tools, the reconciled answers are waiting for him: the three deploys, the retry-cap change, the reason that cap existed, the owner who set it, the one prior incident with a similar sawtooth shape, and the rollback step that resolved it. With sources, so he can verify before trusting.

Not a pile of documents to synthesise at 3 am — the synthesis already done, drawn continuously from across tools: repo, Slack, deploys, incident history, anything else.
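The first of those questions, "what changed in checkout in the last 4 hours?", is mechanically simple once deploy events sit in one place with their sources attached. A minimal sketch, assuming a hypothetical `Deploy` record and an in-memory list; a real system would pull these from the deploy tooling, and none of the names or sample entries below reflect an actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Deploy:
    service: str
    deployed_at: datetime
    summary: str
    source: str  # reference the responder can open to verify the claim


def recent_changes(deploys, service, now, window=timedelta(hours=4)):
    """Deploys to `service` within `window` before `now`, newest first."""
    cutoff = now - window
    hits = [
        d for d in deploys
        if d.service == service and cutoff <= d.deployed_at <= now
    ]
    return sorted(hits, key=lambda d: d.deployed_at, reverse=True)


# Illustrative data only.
page_time = datetime(2026, 6, 1, 3, 14, tzinfo=timezone.utc)
history = [
    Deploy("checkout", page_time - timedelta(hours=11), "retry cap raised", "placeholder-pr-link"),
    Deploy("checkout", page_time - timedelta(hours=2), "copy tweak", "placeholder-pr-link"),
    Deploy("inventory", page_time - timedelta(hours=1), "index rebuild", "placeholder-pr-link"),
]

for d in recent_changes(history, "checkout", page_time, window=timedelta(hours=12)):
    print(d.deployed_at.isoformat(), d.summary, d.source)
```

The filter is the easy part. The hard part, and the point of the passage above, is having the events, the rationale, and the prior incidents reconciled into one place before the page fires.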

He still makes the call. A human still decides the intervention, because judgement is the actual job. But he makes it in five minutes instead of fifty — fully informed, without paying the context tax on the front end, and without paying it again on the back end, because the layer that answered him is built from the same sources that record what he just did. So the next person inherits tonight instead of repeating it.

That's the bet behind Ardelio: a continuously-current, reconciled knowledge surface. Your team's real working knowledge, with sources, wired into the moment of action through the coding agents engineers already use. Not another place to search, but the answer that a staff engineer would give, available when the staff engineer is asleep. It runs read-only in your environment; the knowledge stays with you. The coding agents act like they've read every review thread, and an on-call engineer is never alone at 3 am.

If you want to see what's under the hood (why reconciliation beats search-that-hands-you-a-haystack, why a system that answers is a fundamentally different thing from one that finds, and how it plugs into your stack), let's talk.

For now, the takeaway is this: the AI wave optimised the hour of the engineer's day that hurt the least. The other seven hours — the searching, the context-switching, the 3 am reconstruction, the postmortem that nobody wants to write — are waiting. They were never coding problems. They were knowledge problems all along.

(written with insights from real user interviews with 'human' engineers)

Ardelio is built for the way engineering teams actually work, not just the way they write code. Get early access →

Sources

Sherlocks.ai, "How to Reduce MTTR in 2026: From Alert to Root Cause in Minutes" (February 2026). https://www.sherlocks.ai/how-to/reduce-mttr-in-2026-from-alert-to-root-cause-in-minutes — Note: Sherlocks.ai is a vendor in this space; the 40–60% figure is widely corroborated by incident.io and other incident management platforms, but should be treated as an industry estimate rather than an independent academic finding.
OpsBrief, "How to Reduce MTTR: A Complete Guide" (January 2026). https://opsbrief.io/blog/how-to-reduce-mttr-a-complete-guide-to-cutting-incident-response-time-by-70-percent
Phoenix Incidents, "Context Switching: Why It Slows Incident Resolution and How to Fix It." https://phoenixincidents.com/blog/context-switching-why-it-slows-incident-resolution
Atlassian, "State of Developer Experience 2025," based on IDC survey of 3,500 developers and managers worldwide. Reported in Atlassian's blog (January 2026): https://www.atlassian.com/blog/development/how-tech-leaders-can-turn-ai-hype-into-real-team-productivity and ShiftMag: https://shiftmag.dev/the-number-one-productivity-killer-for-devs-finding-information-5700/
Ibid. "50% of developers report losing 10 or more hours per week to non-coding tasks driven by poor information access, fragmented tools, and constant context switching."
Stack Overflow, "Your developers deserve better: Insights from the 2024 Developer Survey" (August 2024). https://stackoverflow.co/internal/resources/your-developers-deserve-better-insights-from-the-2024-developer-survey/ — "More than 60% of respondents spend 30 minutes or more a day searching for solutions, with one in four devs spending at least 60 minutes looking for answers."
METR, "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity" (July 10, 2025). https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/ — RCT with 16 experienced developers; 19% slower with AI tools; developers believed they were 20% faster; had predicted 24% speedup beforehand. The confidence interval was wide (−26% to +9%), the study involved mature codebases with strict quality standards, and the tooling has since evolved — all of which the authors themselves flag as caveats.
METR study ibid; also corroborated by the 2025 Stack Overflow Developer Survey, where the top frustration for 66% of developers using AI tools was code that was "almost right, but not quite." Reported in: https://medium.com/@tuguidragos/the-surprising-truth-ai-made-developers-19-slower-and-how-to-actually-use-cursor-copilot-right-7913fee34f51
Catchpoint, "The SRE Report 2025" (January 13, 2025), based on 301 SRE professionals surveyed July–August 2024. https://finance.yahoo.com/news/sre-report-2025-highlighting-critical-133000036.html — "After five years of steady decline, the median reported percentage of work spent on toil has increased to 30% from 25% in 2024." Also synthesised in Runframe's "State of Incident Management 2026" (March 2026), drawing on 20+ industry reports and 25+ team interviews: https://runframe.io/blog/state-of-incident-management-2025