How we measure the hours saved

Every time figure on this site is either measured from real timestamps or an honest, labelled estimate, and we tell you which. Here is exactly how we draw that line, so you can trust the numbers.

The internet is full of “AI made us 10× faster.” Almost none of it shows its working. We think a consultancy that sells time recovered has to be answerable for every number it prints, so here’s our method, including the parts that are estimates.

The rule is simple: every figure on this site is either measured or estimated, and we always say which.

The three tiers we compare

Most teams aren’t choosing between “no AI” and “AI.” They’re already somewhere on a ladder. We compare three rungs:

Basic AI: chatbots in a browser. They help draft and explain, but a human still does all the connective work: triage, integration, review, shipping.
Agentic AI: a coding or task agent does real work, but a person drives every hand-off: writing each prompt, reviewing each diff, running tests, opening and merging PRs, deploying.
Advanced AI: the workflows we build, where the hand-offs themselves are automated and guarded. People spend their time on judgement and approvals.

The interesting gap for most teams isn’t Basic-vs-nothing. It’s how much time is still on the table once you already use AI, which is why our headline comparisons measure Advanced against Basic AI, not against a pretend “fully manual” team.

Where most companies actually are

The three tiers aren’t evenly populated. They’re a funnel, and the public data shows how steep it is:

~70% of organizations use generative AI in at least one business function (and ~88% use AI of some kind), but for most, that means chatbots and assistants. (Stanford HAI, AI Index 2026.)
~23% are scaling agentic AI somewhere in the business, with agent use still in the single digits within any given function. (McKinsey, The State of AI, 2025–26; Deloitte’s State of AI in the Enterprise 2026 independently lands on the same ~23%.)
~1% describe their AI rollout as “mature”: fundamentally changing how work is done and driving real business outcomes. (McKinsey, 2025.) Independent 2026 figures agree the top tier stays in the low single digits: Deloitte reports that only ~5% of companies expect to have fully integrated agentic AI as a core part of operations, even by 2027.

That top ~1% is the tier we build. “Become one of the 1%” isn’t a slogan. It’s roughly where the numbers sit.

Two kinds of numbers

Measured: facts from systems we operate

When a number describes the Advanced tier, it comes from real timestamps on real systems we run:

The pull-request figures in “Inside 452 AI-built PRs” are computed from GitHub’s created and merged timestamps across every merged PR, not a hand-picked sample.
The build of this site is read straight from its git history.

These are facts. We anonymise them (timings and diff sizes only, never a client’s product or data), but we don’t round them in our favour.
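
For readers who want to see what “computed from timestamps” looks like in practice, here is a minimal Python sketch of an open → merge median. It is an illustration under stated assumptions, not the exact script behind our figures: the field names created_at and merged_at follow GitHub’s REST API for pull requests, the function name is hypothetical, and the sample records are invented purely to make the example runnable.

```python
from datetime import datetime
from statistics import median


def open_to_merge_hours(prs):
    """Return the hours between creation and merge for every merged PR.

    `prs` is a list of dicts with ISO 8601 `created_at` and `merged_at`
    values (assumed field names, as in GitHub's REST API).
    """
    hours = []
    for pr in prs:
        if pr.get("merged_at") is None:
            continue  # never merged (still open, or closed unmerged)
        # Swap a trailing "Z" for an explicit offset so fromisoformat
        # also works on Python versions older than 3.11.
        created = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
        merged = datetime.fromisoformat(pr["merged_at"].replace("Z", "+00:00"))
        hours.append((merged - created).total_seconds() / 3600)
    return hours


# Invented sample data, for illustration only.
sample_prs = [
    {"created_at": "2025-01-06T09:00:00Z", "merged_at": "2025-01-06T10:30:00Z"},
    {"created_at": "2025-01-06T11:00:00Z", "merged_at": "2025-01-06T14:00:00Z"},
    {"created_at": "2025-01-07T08:00:00Z", "merged_at": None},
    {"created_at": "2025-01-07T09:00:00Z", "merged_at": "2025-01-07T09:30:00Z"},
]

durations = open_to_merge_hours(sample_prs)
median_hours = median(durations)
print(f"{len(durations)} merged PRs, median open-to-merge: {median_hours:.1f} h")
```

The sketch includes every merged PR and skips only those that never merged, which is the point of “not a hand-picked sample”: the median is taken over the whole population rather than a chosen subset.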

Estimated: honest baselines, clearly labelled

The Basic and Agentic numbers are estimates: how long the same work takes at a lower level of AI maturity. We build them by doing the work both ways and timing it, and by drawing on years of having done it the manual way. They’re genuinely our best honest guess. And because they are estimates, we label them.

That’s why the workflow comparison on our homepage carries the words “illustrative ranges” in plain sight. It’s a model of where the hours go, not a measurement of your team.

The lines we refuse to blur

“Open → merge” is not “time to build.” Our fastest, headline-friendly number (the ~1.8-hour median from a PR opening to merging) measures the review-and-merge loop, not how long a feature took to design and build. We say so every time we use it.
Illustrative is never dressed up as measured. If a figure is modelled, it says so. If it’s measured, we’ll tell you the source.
No invented social proof. We have no fabricated testimonials, logos, or star ratings. The testimonials section and its review structured data on this site render only when a real client has approved a real quote. Until then, there’s simply nothing there.

How your audit turns estimates into your numbers

Our public figures are about our systems. Your audit produces yours, and they’re all measured:

Baseline. We map a target workflow and time it as it runs today: frequency × duration × people involved, including reviewers and approvers. That’s your real “before.”
After. We build the AI-assisted version and measure it the same way, against the same workflow.
Delta. The difference is the recovered hours: both sides measured in your tools, on your work, with the method above applied to your numbers instead of ours.

No projections dressed as results. A before and an after, both real.
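
The arithmetic behind the three steps can be sketched in a few lines of Python. This is a simplified model, assuming a weekly period and treating duration as hours per person per run; the class, function, and every number below are hypothetical, chosen only to show how frequency × duration × people turns into a delta.

```python
from dataclasses import dataclass


@dataclass
class WorkflowTiming:
    """One measured snapshot of a workflow (hypothetical structure)."""
    runs_per_week: float   # frequency
    hours_per_run: float   # duration, per person involved
    people_involved: int   # includes reviewers and approvers

    def person_hours_per_week(self) -> float:
        # frequency x duration x people involved
        return self.runs_per_week * self.hours_per_run * self.people_involved


def recovered_hours(before: WorkflowTiming, after: WorkflowTiming) -> float:
    """Delta: measured 'before' minus measured 'after', in person-hours per week."""
    return before.person_hours_per_week() - after.person_hours_per_week()


# Invented figures, for illustration only.
baseline = WorkflowTiming(runs_per_week=20, hours_per_run=1.5, people_involved=3)
after = WorkflowTiming(runs_per_week=20, hours_per_run=0.5, people_involved=2)

delta = recovered_hours(baseline, after)
print(f"Before: {baseline.person_hours_per_week():.0f} person-hours/week")
print(f"After:  {after.person_hours_per_week():.0f} person-hours/week")
print(f"Recovered: {delta:.0f} person-hours/week")
```

Both snapshots use the same structure on purpose: the delta only means something when the before and the after are measured the same way, against the same workflow.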

If that’s the kind of rigour you want behind an AI rollout, book a strategy call and we’ll start with where your hours actually go.