Artificial Intelligence

AI agents and bad productivity metrics

By primereports · February 23, 2026 · 7 min read


Here’s a little bit of snark from developer John Crickett on X:

Software engineers: Context switching kills productivity. Also software engineers: I’m now managing 19 AI agents and doing 1,800 commits a day.

Crickett’s quip lands perfectly because it is not actually a joke. It’s a preview of the next management fad, wherein we replace one bad productivity proxy (lines of code) with an even worse one (agent output), then act surprised when quality collapses.

And yes, I know, nobody is doing 1,800 meaningful commits. But that’s the point. The metric is already being gamed, and agents make gaming effortless. If your organization starts celebrating “commit velocity” in the agent era, you are not measuring productivity. You are measuring how quickly your team can manufacture liability.

The great promise of generative artificial intelligence was that it would finally clear our backlogs. Coding agents would churn out boilerplate at superhuman speeds, and teams would finally ship exactly what the business wants. The reality, as we settle into 2026, is far more uncomfortable. Artificial intelligence is not going to save developer productivity because writing code was never the bottleneck in software engineering. The true bottleneck is validation. Integration. Deep system understanding. Generating code without a rigorous validation framework is not engineering. It is simply mass-producing technical debt.

So what do we change?

Thinking correctly about code

First, as I argued recently, we need to stop thinking about code as an asset in isolation. Every single line of code is surface area that must be secured, observed, maintained, and stitched into everything around it. As such, making code cheaper to write doesn’t reduce the total amount of work but instead increases it because you end up manufacturing more liability per hour.

For years, we treated developers like highly paid Jira ticket translators. The assumption was that you could take a well-defined requirement, convert it to syntax, and ship it. Crickett rightfully points out that if this is all you are doing, then you are absolutely replaceable. A machine can do basic translation, and a machine is perfectly happy to do it all day without complaining.

What a machine cannot do, however, is understand critical business context. AI cannot feel the financial cost of a compliance mistake or look at a customer workflow and instinctively recognize that the underlying requirement is fundamentally wrong. For this we need people, and we need people to thoughtfully consider exactly what they want AI to do.

Crickett frames this transition as a necessary move toward spec-driven development. He’s right, but we need to be incredibly clear about what a specification means in the agent era. It’s not one more Jira ticket but, rather, a set of constraints tight enough to ensure an LLM can’t escape them. In other words, it’s an executable definition of done, backed entirely by tests, API contracts, and strict production signals. This is the exact type of foundational work we have underinvested in for decades because it doesn’t look like raw output; it looks like process. You know, that “boring stuff” that slows you down.
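To make that concrete, here is a minimal sketch of what an "executable definition of done" can look like. The endpoint and its business rules (`apply_refund`, the over-refund and zero-refund constraints) are hypothetical, invented for illustration: the point is that the spec is a set of assertions an agent's code must pass, not a prose ticket it can creatively reinterpret.

```python
# A spec as an executable contract: any implementation of `apply_refund`
# (human- or agent-written) must satisfy these constraints before it is "done".
# The function and its rules are hypothetical, for illustration only.

def apply_refund(order_total, refund_amount):
    """Candidate implementation -- in practice, this is what the agent produces."""
    if refund_amount <= 0:
        raise ValueError("refund must be positive")
    if refund_amount > order_total:
        raise ValueError("refund cannot exceed order total")
    return round(order_total - refund_amount, 2)

def test_spec():
    # Constraint 1: refunds can never exceed the original charge.
    try:
        apply_refund(50.00, 60.00)
        assert False, "over-refund must be rejected"
    except ValueError:
        pass
    # Constraint 2: zero and negative refunds are rejected.
    try:
        apply_refund(50.00, 0)
        assert False, "zero refund must be rejected"
    except ValueError:
        pass
    # Constraint 3: remaining balances stay exact to the cent.
    assert apply_refund(50.00, 19.99) == 30.01

test_spec()
```

An LLM can game a vague ticket; it cannot game an assertion that runs in CI. That is the sense in which the constraints must be "tight enough that the agent can't escape them."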

You can see the friction playing out in real time just by looking at the comments on Crickett’s post. You’ll find people desperately trying to square the circle of agentic development. One commenter tries to reframe the chaos by calling it architecture versus engineering. Another insists that managing 19 agents is actually orchestrating, not context switching. A third bluntly states that running more than five agents simultaneously starts to look like vibe coding, which is merely a polite phrase for gambling with production systems. They are all highlighting the core issue: You haven’t eliminated the work. You’ve just moved it from implementation to supervision and review.

The more you parallelize your code generation, the more “review debt” you create.

Observability to the rescue

This is where Charity Majors, the co-founder and CTO of Honeycomb, becomes frustrated. Majors has argued for years that you can’t really know if code works until you run it in production, under real load, with real users, and real failure modes. When you use AI agents, the burden of development shifts entirely from writing to validating. Humans are notoriously bad at validating code merely by reading large pull requests. We validate systems by observing their behavior in the wild.

Now take that idea one step further into the agent era. For decades, one of the most common debugging techniques was entirely social. A production alert goes off. You look at the version control history, find the person who wrote the code, ask them what they were trying to accomplish, and reconstruct the architectural intent. But what happens to that workflow when no one actually wrote the code? What happens when a human merely skimmed a 3,000-line agent-generated pull request, hit merge, and moved on to the next ticket? When an incident happens, where is the deep knowledge that used to live inside the author?

This is precisely why rich observability is not a nice-to-have feature in the agent era. It’s the only viable substitute for the missing human. In the agent era, we need instrumentation that captures intent and business outcomes, not just generic logs that say something happened. We need distributed traces and high-cardinality events rich enough that we can answer exactly what changed, what it affected, and why it failed. Otherwise, we’re attempting to operate a black box built by another black box.
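What does an event that "captures intent" look like in practice? Here is a stdlib-only sketch of a single wide, structured event. All of the field names (`user_id`, `code_author`, `pr_id`, and so on) are assumptions for illustration; in production you would emit these through a tracing library, but the shape is the point: one rich event per unit of work, carrying enough context to answer questions the absent author can't.

```python
import json
import time
import uuid

def emit_event(**fields):
    """Emit one wide, high-cardinality event per unit of work.
    Rich fields (user id, code authorship, PR id) let you later ask
    'what changed, what did it affect, why did it fail' -- without
    having to find a human who remembers writing the code."""
    event = {
        "timestamp": time.time(),
        "trace_id": uuid.uuid4().hex,  # correlates this event with a distributed trace
        **fields,
    }
    print(json.dumps(event))
    return event

# One event carries intent and context, not just "something happened".
evt = emit_event(
    service="payments",
    operation="apply_refund",
    outcome="error",
    error="refund cannot exceed order total",
    user_id="u_8841",        # high-cardinality: one value per user, and that's fine
    code_author="agent",     # was this path human-written or agent-generated?
    pr_id="pr_3021",         # which (possibly merely skimmed) PR shipped it
    duration_ms=12.4,
)
```

A generic log line says an error occurred. An event like this says which operation failed, for whom, under which PR, written by what: the questions you would otherwise have asked the author.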

Majors also offers essential operational advice: Deploy freezes are a complete hack. The common human instinct when change feels risky is to stop deploying. But if you keep merging agent-generated code while not deploying it, you’re simply batching risk, not reducing it. When you finally execute a deploy, you’ll have absolutely no idea which specific AI hallucination just took down your payment gateway. So if you want to freeze anything, freeze merges. Better yet, make the merge and the deploy feel like one singular atomic action. The faster that loop runs, the less variance you have, and the easier it is to pinpoint exactly what broke.
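The arithmetic behind "batching risk" is simple enough to write down. This toy sketch (change names are placeholders) just counts suspects: when a deploy fails, every change it contained is a suspect, so a merge freeze that batches N changes into one deploy gives you N suspects instead of one.

```python
def suspects_on_failure(changes_in_deploy):
    """When a deploy breaks production, every change shipped in that
    deploy is a suspect until proven innocent."""
    return len(changes_in_deploy)

changes = ["a", "b", "c", "d", "e"]  # hypothetical merged changes

# Merge-and-deploy as one atomic loop: five deploys, one suspect each.
one_at_a_time = [suspects_on_failure([c]) for c in changes]

# A deploy freeze that keeps merging: one big deploy, five suspects at once.
after_freeze = suspects_on_failure(changes)

assert one_at_a_time == [1, 1, 1, 1, 1]
assert after_freeze == 5
```

Five suspects means bisecting under incident pressure; one suspect means a rollback and a known culprit. That is the variance Majors is telling you to squeeze out of the loop.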

Golden paths are the way

The fix for this impending chaos is not to rely on heroic engineers. As Majors points out, resilient engineering requires a commitment to platform engineering and golden paths (something I’ve also argued). Such golden paths make the right behavior incredibly easy and the wrong behavior incredibly hard. The most productive teams of the next decade will not be the ones with the most freedom to use whatever framework an agent suggests, but instead those that operate safely inside the best constraints.

So how do you measure success in the agentic era?

The metrics that matter are still the boring ones because they measure actual business outcomes. The DORA metrics remain the best sanity check we have because they tie delivery speed directly to system stability. They measure deployment frequency, lead time for changes, change failure rate, and time to restore service. None of those metrics cares about the number of commits your agents produced today. They only care about whether your system can absorb change without breaking.
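For readers who want to see the four DORA metrics as arithmetic rather than slideware, here is a small sketch computing them from a deploy log. The log entries are fabricated sample data; lead time here is measured per deployed change, averaged.

```python
from datetime import datetime, timedelta

# Hypothetical deploy log:
# (deployed_at, lead_time_for_the_change, caused_failure, minutes_to_restore)
deploys = [
    (datetime(2026, 2, 16), timedelta(hours=4),  False, 0),
    (datetime(2026, 2, 17), timedelta(hours=30), True,  45),
    (datetime(2026, 2, 18), timedelta(hours=6),  False, 0),
    (datetime(2026, 2, 20), timedelta(hours=2),  False, 0),
]

days = (deploys[-1][0] - deploys[0][0]).days or 1

# Deployment frequency: how often change reaches production.
deployment_frequency = len(deploys) / days

# Lead time for changes: commit-to-production latency, averaged.
lead_time = sum((d[1] for d in deploys), timedelta()) / len(deploys)

# Change failure rate: share of deploys that degraded production.
change_failure_rate = sum(d[2] for d in deploys) / len(deploys)

# Time to restore service: mean minutes to recover from failed deploys.
failed = [d for d in deploys if d[2]]
time_to_restore = sum(d[3] for d in failed) / len(failed) if failed else 0

print(f"{deployment_frequency:.2f} deploys/day, {lead_time} avg lead time, "
      f"{change_failure_rate:.0%} change failure rate, "
      f"{time_to_restore:.0f} min to restore")
```

Notice what is absent: commit counts, lines of code, number of agents under management. Every input is a delivery or recovery outcome the business can actually feel.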

So, yes, use coding agents. Use them aggressively! But don’t confuse code generation with productivity. Productivity is what happens after code generation, when code is constrained, validated, observed, deployed, rolled back, and understood. That’s the key to enterprise safety and developer productivity.
