The AI Inventory Trap: A Framework Built on a Premise Nobody Verified

A Medium article argues AI has made engineers so fast that downstream stations can't absorb the throughput, so the gains accumulate as queue depth. The framework is well-constructed. The premise is unsupported. Without the premise, there isn't much left to do with it.

Eric Bowman’s “The AI Inventory Trap” argues that AI has made engineering so much faster that downstream stations — code review, QA, security audit, product evaluation — now form the binding constraint on software delivery [1]. The prescription that follows is Theory of Constraints applied to a software org: cap upstream work-in-process, find where the constraint has actually moved, point AI at that station, and reshape the team around the new bottleneck.

I read it twice, and the second reading is where the discomfort sets in. The argument is internally coherent and operationally specific. What it never does is answer the prior question: how much faster has AI actually made engineering at the org level? The article opens with “AI made your engineers faster — so why does everything feel slower?” and treats the antecedent as a given. The available research, when you go looking, doesn’t support taking it for granted. Once that premise wobbles, much of the framework wobbles with it.

This isn’t strictly StorageMath territory; there’s no storage vendor here and no benchmark to inspect. But the diagnostic move is the same one we apply to vendor whitepapers — a confident framework, an authoritative tone, and load-bearing claims that turn out to be unsourced. The clothing changes; the epistemic problem doesn’t.

What the article assumes, and what we actually know

The load-bearing claim is that AI has dramatically increased engineering throughput at the org level. Not at the keystroke level, where the effect is real if modest. At the level where it would create the kind of upstream surplus the framework needs to do its work. Bowman writes that “features that took sprints now take days” and stacks five mistakes and five prescriptions on top of the assertion.

The article cites no internal data, no developer survey, no DORA report, no industry benchmark, no controlled study, and no team by name. The single quantified observation is that the author has seen “organizations with over a hundred initiatives in flight where fewer than 10% reach meaningful validation per quarter.” That number would be interesting if we knew the methodology, the sample, the industries it covers, and what the same number was before AI tools existed. The article reports none of those things.

I’m not making a stronger claim than this: the article’s empirical foundation is one unsourced statistic, used to support an organizational-redesign prescription. A serious version of the argument would either lead with the data or admit it’s a hypothesis.

What the actual evidence looks like

Independent research on AI’s effect on engineering productivity is genuinely messy, which is part of why the article’s confidence sits wrong with me. The honest summary is “we don’t know yet, and what we do know is mixed.”

The closest thing to a controlled measurement is METR’s July 2025 study, “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity” [2]. METR randomized real engineers working on real issues in mature open-source codebases. The headline finding was that developers using AI tools were 19% slower than developers without them. The same developers, asked afterward, self-reported being 20% faster. The clock and the participants disagreed by about 40 percentage points.

A single study isn’t a verdict, and METR is careful about the limits of their setup. Open-source maintenance is not greenfield startup work; the AI tools have moved on since the study window; the developer pool was experienced and may have had less to gain from autocomplete than a junior would. I take all of that seriously. But METR is the strongest published evidence we have, and it points away from the article’s premise rather than toward it. Google’s 2024 DORA report, the largest public dataset on software delivery performance, found that AI adoption correlated with small individual productivity gains but with reduced team-level throughput and stability [3]. GitHub’s own research on Copilot, which has every commercial reason to skew favorable, reports task-completion speedups in the 26-55% range on benchmark exercises [4], not the order-of-magnitude shift Bowman’s framework requires.

What that adds up to, for me, is a picture where AI helps with the keystroke-level parts of programming, may or may not help at the team level, and almost certainly has not transformed end-to-end delivery throughput by enough to create the inventory pressure the article describes. If you’ve personally seen the effect Bowman is talking about, I’d be interested in your data. As of mid-2026, the public record doesn’t show it.

Even on its own terms, the factory metaphor leaks

Suppose for the sake of argument that the productivity claim is right and engineering really has gotten dramatically faster. The framework still has problems, because Theory of Constraints was built for factories, and the analogy to software has known failure modes that the article doesn’t address.

Goldratt’s WIP-limit argument depends on inventory carrying cost. In a factory, work-in-process inventory ties up cash, occupies floor space, accumulates damage, becomes obsolete, and keeps the line from running a different mix of products. That’s why uncapped WIP destroys money, and why throttling upstream to match downstream is worth the cost of idle capacity.

A pull request waiting on review doesn’t tie up cash. A feature flag dark-launched to one percent doesn’t take up floor space. A merged-but-not-released change doesn’t rot in a warehouse. The “inventory” is bytes in a git repository, and the carrying cost rounds to nothing. When the central economic argument for limiting WIP — that idle inventory destroys value faster than idle capacity — goes away, the prescription “cap upstream, even if engineers idle” becomes a much harder sell. I don’t think it’s wrong in principle, but it stops being the obviously-correct trade and starts being a judgment call that depends on factors the article doesn’t engage with.
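To see how much work the carrying-cost assumption is doing, here is a deliberately crude cost model of the trade. Every figure in it is invented for illustration; the point is the shape of the comparison, not the numbers:

```python
# Weekly cost of a WIP policy: the holding cost of in-flight work plus the
# price of deliberately idled upstream capacity. All figures are invented.

def weekly_cost(avg_wip: float, carrying_cost_per_item: float,
                idle_engineers: float, cost_per_idle_engineer: float) -> float:
    """Total weekly cost: holding cost of WIP plus cost of throttled capacity."""
    return avg_wip * carrying_cost_per_item + idle_engineers * cost_per_idle_engineer

# Factory-like carrying cost: WIP ties up cash, floor space, and shelf life.
print(weekly_cost(200, 50, 0, 0))    # uncapped: 10000/week in holding cost
print(weekly_cost(40, 50, 2, 800))   # capped:    3600/week; the cap pays for itself

# Software-like carrying cost: the queue is bytes in a repository.
print(weekly_cost(200, 1, 0, 0))     # uncapped:   200/week
print(weekly_cost(40, 1, 2, 800))    # capped:    1640/week; the cap costs more than it saves
```

With factory numbers the cap is obviously right; with software numbers it obviously isn't; a real org's carrying cost sits somewhere between, which is what makes it a judgment call rather than a law.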

The other failure mode is that Bowman’s framework treats constraints as stationary, the way a factory station is. Engineering speeds up, so QA becomes the binding constraint, so we direct AI there next, and so on. But the “stations” Bowman names — code review, QA, security review, product evaluation — are themselves knowledge work. The same AI tools that accelerated coding are now generating tests, summarizing PRs, triaging security findings, and drafting evaluation rubrics. In practice, the constraint moves faster than the redesign cycle. By the time you’ve reorganized the company around “QA is the new bottleneck,” QA isn’t the bottleneck anymore.

The math the article skips

The diagnostic Bowman recommends — “rising lead time despite falling cycle time” — needs arithmetic to interpret, and the article doesn’t supply any.

Here is that arithmetic, with illustrative numbers of my own. Take a team that, pre-AI, ships five validated changes a week at a ten-day idea-to-impact lead time; by Little’s Law (WIP = throughput × lead time) it carries about ten items of work-in-process. AI speeds up coding, the team starts more work, throughput doubles to ten changes a week, and the swollen queues stretch lead time to fourteen days, which puts average WIP at twenty-eight items, nearly triple.

If the metric is “elapsed time from idea to impact for any given idea,” that team is four days slower per item. If the metric is “rate at which proven impact reaches users,” it has lapped the field. The article never confronts the distinction, which is the one that actually tells you whether the org got better or worse. Picking lead-time-per-ticket as the system measure produces the framework’s preferred verdict almost regardless of what’s happening underneath. That’s the move I want to flag, more than the framework itself.
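The same example as a sketch you can poke at. Every number is an assumption chosen for illustration (a week is five working days; the throughput and WIP figures are the ones above), not data from the article or from any real team:

```python
# Little's Law: WIP = throughput x lead time, so lead time = WIP / throughput.
# Illustrative numbers only; a "week" here is five working days.

def lead_time_days(wip: float, throughput_per_week: float) -> float:
    """Average idea-to-impact lead time, in working days, implied by Little's Law."""
    return wip / throughput_per_week * 5

before = lead_time_days(wip=10, throughput_per_week=5)    # 10.0 days per item
after = lead_time_days(wip=28, throughput_per_week=10)    # 14.0 days per item

print(f"per-item lead time: {before:.0f} -> {after:.0f} days (worse by 4 days)")
print("validated throughput: 5 -> 10 changes/week (better by 2x)")
```

Both print statements are true of the same team at the same moment. Which one counts as “the system got slower” depends entirely on which number you decide to call the system.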

A second thing the framework assumes is that the set of work to be done is fixed — same SKU, faster. Software doesn’t work that way. Some of what AI does is make existing work cheaper to produce. Some of it changes what’s worth doing at all: tickets that weren’t worth a week of engineering become worth an afternoon, internal tools that were never going to ship get built, and the denominator silently shifts under “lead time per ticket.” Whether that’s good or bad depends on the work. Either way, the per-item metric stops measuring the system.
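One way to see the denominator shift is with three lines of invented numbers: the big tickets get slower, the per-ticket average gets dramatically better, and neither fact is visible from the other.

```python
# Mix shift under "lead time per ticket" (invented numbers). AI makes small
# tickets cheap enough to be worth doing, so they enter the denominator.
import statistics

before = [10, 10, 10, 10, 10]   # five big tickets, lead time in days
after = [12] * 5 + [1] * 15     # same big tickets, slower in the queue, plus
                                # fifteen newly-worthwhile small ones

print(statistics.mean(before))  # 10.0 days per ticket
print(statistics.mean(after))   # 3.75 days per ticket: a 2.7x "improvement",
                                # while every big ticket got two days worse
```

Run the mix the other way, toward fewer and bigger tickets, and the same metric collapses while the system improves. In both directions the average is tracking the mix, not the system.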

A test for organizational frameworks

Here’s a test I find useful when a framework arrives with this much confidence: imagine running it for six months. What outcome would convince you it was wrong?

For the AI Inventory Trap, the outcomes are: rising lead time means you’re in the trap; falling lead time means the framework worked; growing queues are inventory accumulating; shrinking queues are WIP limits succeeding; the team feeling slower confirms the diagnosis; the team feeling faster while the metrics disagree confirms the perception gap the diagnosis predicts. There is no observable outcome that would tell a reader “this isn’t happening at my org.” That’s not a property of a theory. It’s a property of a vocabulary kit.
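To make that concrete, here is the article’s diagnostic logic reduced to a sketch. This is my paraphrase, not a quote, and the parameter names are invented; the property to notice is that no combination of inputs returns a disconfirming verdict:

```python
# The framework's diagnostics as a decision procedure (my paraphrase of the
# article's logic, with invented parameter names). Every branch confirms.

def diagnose(lead_time_rising: bool, queues_growing: bool,
             team_feels_slower: bool) -> str:
    if lead_time_rising:
        return "you are in the inventory trap"
    if queues_growing:
        return "inventory is accumulating; cap WIP"
    if team_feels_slower:
        return "diagnosis confirmed"
    # Lead time falling, queues shrinking, team feeling fast:
    return "the framework worked"  # no path returns "this isn't happening"
```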

I’m not claiming this is unique to the AI Inventory Trap. Most organizational frameworks share it, including some I’ve found useful in practice. But it is a reason to read this one with calibrated skepticism rather than as a how-to guide.

The diagnosis is backwards

There’s a more useful reading of Bowman’s observation, which the article never reaches.

If queues are building up after AI adoption, the queues aren’t AI’s fault. They’re the visible artifact of process and decision-making that was always slow, hidden until now by slow engineering. The internal-pipeline stations Bowman lists — code review, QA, security audit, product validation, change advisory — were never optimized for throughput. They were calibrated to the speed of the humans feeding them, which used to be the rate limit. Now that engineering can produce more reviewable units per day, the gates show their actual capacity, and that capacity is constrained by calendar invites, weekly cadences, sprint planning, “let’s circle back next week,” and the political weight of every approval. AI didn’t break the pipeline. It exposed the pipeline.
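A piece of textbook queueing arithmetic makes “the gates show their actual capacity” concrete. This is the standard M/M/1 waiting-time formula with rates I invented for illustration, not a model of any real pipeline:

```python
# Mean time in system for an M/M/1 queue is 1 / (mu - lambda), where mu is
# the gate's service rate and lambda the arrival rate. Rates are invented:
# a review gate that can clear 10 PRs per day.

def days_at_gate(arrivals_per_day: float, reviews_per_day: float) -> float:
    """Mean days a PR spends at the gate (queueing plus review), per M/M/1."""
    if arrivals_per_day >= reviews_per_day:
        return float("inf")  # past saturation the queue grows without bound
    return 1.0 / (reviews_per_day - arrivals_per_day)

for arrivals in (5, 8, 9, 9.5, 10):
    print(f"{arrivals:>4} PRs/day -> {days_at_gate(arrivals, 10):.1f} days at the gate")
# 5 -> 0.2, 8 -> 0.5, 9 -> 1.0, 9.5 -> 2.0, 10 -> inf
```

The delay isn’t linear in upstream speed. A gate running at half capacity barely notices a faster upstream; the same gate at ninety percent utilization melts. That’s why the queues could look fine for years and then suddenly not: the gates didn’t change, their utilization did.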

When the queues fill, the framework’s instinct is to throttle the engineers back to the queue’s pace. That preserves the comfort of the existing process and asks nothing of the people running it. The opposite move is the one the AI-native companies are actually making: keep the engineering velocity, and rebuild the gates to handle it. Use AI for code review, regenerate the test suite, automate the security checks that tooling can actually cover, and reserve human attention for the calls that genuinely require it. Throttling the engineer to fit the calendar of the security review is solving the wrong problem.

The clearest evidence this is the right reading is that the companies building AI are also the companies most aggressively using it internally, and they are not idling their engineers to honor downstream capacity. They’re shipping constantly, posting record revenue, and treating the friction Bowman describes as something to remove rather than ration around. If the prescription were correct, those companies should be drowning in their own inventory. They aren’t. They’re among the most valuable software businesses ever assembled. That isn’t a coincidence, and it isn’t survivor bias; it’s what happens when an organization actually adapts to a faster tool instead of throttling the tool to fit the slower organization.

The other thing Bowman’s framing gets wrong is that it assumes a speed-versus-quality tradeoff: more upstream output means more pressure on review and security, so quality must suffer unless we throttle the engineers. That’s a false dichotomy. AI applied across the whole pipeline doesn’t just produce more code; it produces better code, more thorough reviews, more comprehensive tests, deeper security analysis, and a clearer end-to-end picture of what the product is actually supposed to do. Software built this way becomes more stable and more secure in the process, not less. The evidence is already showing up in public CVE databases, where AI-assisted research is finding bugs that human-only review missed for years, in code that had been audited multiple times. The win isn’t shipping more units faster. It’s transforming how an organization delivers what the market actually needs: more quality, more stability, more predictability, and a fuller understanding of the product scope. Throttling the engineer to “protect quality” operates on a model of the work that AI has already made obsolete.

We’re not in a debate about whether AI is good or bad for software. We’re in the figuring-out phase, where the people doing the work are learning how to use the tool correctly. Frameworks that tell organizations to slow down and protect their old processes are noise from the same phase. They will look quaint in five years.

The longer arc is bigger than any of this. AI is going to end up in every home and on every desk, the way the phone did — eventually so common that it stops being remarkable. The global economy is going to be stronger for it. But the strength is conditional on adoption, and adoption is the part that’s on us. Every organization, every team, every pipeline has to figure out how to use this tool well. The companies that take that figuring-out work seriously will define the next decade of software. The ones that publish frameworks for not changing will be a footnote.

The deeper point this rebuttal has been circling is that the productivity debate isn’t really the interesting question. The interesting question is which organizations are going to make the transition and which aren’t. The rate of change is faster than it was; the rate of human adaptation is what it has always been, and living inside that mismatch is the human condition. The orgs that figure out how to leverage and master this tool will be the AI-age companies — not because they had better engineers, but because they were willing to confront their own dogma, retire process that had stopped earning its keep, and rebuild the pipeline around the new pace. The orgs that don’t will keep producing frameworks like this one, dressed up as analysis but functioning as permission to leave everything alone.

There’s no free version of this. Removing process is harder than installing a WIP limit, because process is usually attached to someone’s territory or to the incident that created it. AI value isn’t a side-effect of better tools; it’s a side-effect of the human work those tools force, work most organizations have not yet done. Theory of Constraints offers a vocabulary for keeping the old gates, throttling the new tools, and pretending the org is keeping up. That’s a comfortable lie. The honest move is to confront the dogma the queues are exposing, and then to actually update the parts of the company that aren’t moving at the new pace.

AI will help humanity to a degree that’s hard to overstate, on the condition that humanity is willing to update the parts of itself that don’t move at AI’s pace. That’s the work. The framework lets people skip it.

On the prose itself

A note about style, because the subject matter invites it. The article carries fingerprints of heavy AI assistance: high em-dash density, parallel triadic structures, antithetical sentence pairs (“the dashboards show speed; the outcomes show inventory”), a five-mistakes-then-five-fixes scaffold of identical-altitude items, and — most diagnostically — the absence of grounding. In roughly 2,000 words on AI changing engineering throughput, the article names no company, no team, no tool, no incident, no benchmark, and no number aside from the one unsourced figure.

This isn’t a gotcha. AI-assisted writing is a normal part of how text gets drafted now, including text I’ve written. The relevant point is that the absence of grounding is also why the argument doesn’t hold up. Drafts generated from prior frameworks tend to read like the framework rather than like the territory, and the territory is where you’d expect to find the data the article needs.

I’d be careless not to flag the same risk in this rebuttal. I drafted it, sat with it, asked Claude for a second pass, and rewrote sections where I caught myself doing exactly the things I was calling out — the parallel triads, the antithetical pairs, the listicle altitude. Some of it remains; it’s hard to scrub a draft entirely clean of its assistance. Treat anything I publish, including this, with the same skeptical reading you’d apply to anything else.

Why this matters outside of storage

StorageMath exists to call out vendor whitepapers that argue without data. The AI Inventory Trap isn’t a vendor whitepaper, but it commits the same epistemic moves: a strong empirical claim presented without a source, a prescriptive framework built on top of the unverified claim, diagnostics that fit any outcome, and prose that signals confidence rather than evidence. The hygiene we’d apply to a VAST data-reduction whitepaper applies here too: where’s the data, what’s the methodology, what outcome would prove this wrong?

But the bigger reason to push back on this kind of framework isn’t epistemic. It’s that it gives organizations a sophisticated-sounding excuse to not change. Theory of Constraints is real and useful where it applies, which is to systems with stationary bottlenecks and costly inventory. Software delivery in the AI era isn’t one of those systems, and the framework’s prescription — throttle the new tool to honor the old process — is a way to keep the org comfortable while the rest of the industry moves on.

The right question isn’t whether AI is making engineers more productive than QA can absorb. The right question is whether your organization is going to transform how it delivers what the market needs — with more quality, more stability, more predictability, and a fuller end-to-end understanding of the product scope — or whether it’s going to install WIP limits and call that a strategy. The companies that figure this out won’t be debating the productivity question; they’ll be too busy shipping software that’s better and more stable than what came before. That’s the inventory trap that actually matters, and it has nothing to do with engineers being too productive.


References

[1] Bowman, E., “The AI Inventory Trap: Why Faster Upstream Makes You Slower End-to-End,” Medium, March 2026. https://medium.com/@ebowman/the-ai-inventory-trap-why-faster-upstream-makes-you-slower-end-to-end-e087ed6e6ab8

[2] METR, “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity,” July 2025. https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

[3] Google Cloud / DORA, “Accelerate State of DevOps Report 2024,” October 2024. https://cloud.google.com/devops/state-of-devops/

[4] GitHub Research, “Quantifying GitHub Copilot’s impact on developer productivity and happiness,” September 2022. https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/