The five levels of AI, and where the money actually shows up

I build my own AI tools on weekends, and I have run large commercial operations for 27 years. The return is far easier to see in the small tools, and that gap is the whole story.

The most expensive question in business right now is five words long: did the AI pay off?

Bain put a version of it to 951 companies this year. The most common savings target was 11 to 20 percent. Around 40 percent of companies landed under 10 percent. Nine in ten plan to raise their AI budgets anyway. MIT ran the same question past 300 enterprise deployments and found that 95 percent of GenAI pilots delivered zero measurable P&L impact, while the same generic tools kept producing value in the hands of individual employees. Gartner saw it coming: it predicted that at least 30 percent of GenAI projects would be abandoned after proof of concept by the end of 2025, and listed unclear business value among the leading causes.

Hold those findings side by side and a strange picture forms. The technology works. People at their desks pull value out of it every day. Companies pour billions into the same technology and struggle to show anything for it. Somewhere between one person and ten thousand, the return stops being visible. Unclear business value is a visibility problem before it is anything else. The company loses sight of the number first and the money second.

I know both ends of this picture. I have spent 20+ years in commercial roles, with the budget cycles, the steering committees, and the post-mortems that come with them. On Sunday evenings I build my own AI tools, small ones, for my own work. The weekend tools prove themselves before dinner. The big projects I am shown take a quarter of arguing to prove anything at all. Same technology, opposite experience of the return. The pattern that explains both fits on one ladder, and it has changed how I judge every AI project put in front of me.

The ladder, and the one thing that sets each rung

The ladder runs from the smallest scope of AI use to the largest. I rank the rungs by a single question: when the work is done, can one person look at one number and know it worked?

That visibility rests on three things. One owner who can see the whole loop. A baseline stable enough to measure against. A feedback loop short enough that the result arrives before the world changes. Hold all three and the return is obvious. Lose them and it goes dark, however real the value underneath.

The variable that sets the levels apart has less to do with scope than with ownership: the number of separate P&L owners who can each claim the win. One owner over five connected tools still has perfect line of sight. Two budget owners over a single shared system are already arguing about whose number moved. Connected systems add friction. Divided ownership splits the credit and scatters the data, and split credit is what turns a real result into an unprovable one. Watch the ownership multiply as we climb, and watch the line of sight fade with it.

Figure: The five levels, from one person at a desk to the whole enterprise reaching past its own walls.

Level 1: personal

One person, one task, no handoff. This is my Sunday. I drop 10 pages of meeting notes into Claude and get back a one-page brief in my own structure. I hand it my inbox and it sorts the morning into what needs me and what can wait. I give it a rough argument and it comes back as a clean board update I can edit in ten minutes rather than build from scratch in two hours.

The return here is real, and you feel it long before you could count it. I save the evening I would have lost to formatting, and I walk in prepared. None of it shows up as a line in anyone's budget, which is exactly why personal AI use is the easiest thing in the world to adopt and the hardest to put a number on. I should be honest about the trap inside that, because felt also means unproven. Some of the time I save quietly refills with more work, and I have never actually measured the rest. The confidence is genuine even though I am holding no receipt for it. Boundaries crossed: none. Owners who could claim it: one, me.

Level 2: the small-business operating system

One owner wires several processes together and lets AI run the loop between them. Picture a beauty salon owner. She connects her booking tool, her messages, and her invoicing. The system chases the no-shows, sends a rebooking reminder three weeks after a cut, asks happy clients for a review, and drops a short numbers summary in her inbox every Monday. She built it herself over a few weekends, the way I build mine.

Now the return lands somewhere she can see it. The chair that used to be empty on a Tuesday is booked. She let the outsourced bookkeeper go. She has an admin afternoon back every week. She does not need an attribution model, because she can read the result on this month's till. The cost side is as clean as the return: she built it for the price of a few weekends, with no integration program and no consultants to pay before the first result arrives. It is cheap to run and easy to see, on both halves of the ratio. This is the clearest ROI in the whole AI economy, and it sits at the small end where almost no one is looking. Boundaries crossed: several processes, but one head still sees the whole loop, so a decision takes a minute rather than a meeting.

Level 3: one function at scale

AI inside a single department of a larger company. Move up to a digital-marketing team in a mid-size firm. AI drafts and tests campaign variants, scores the inbound leads, and routes the good ones to sales within the hour. The team is a dozen people, but it is one function, working mostly out of one system, reporting into one budget line.

The return is still visible, with a caveat starting to creep in. Cost per lead drops, the campaign cycle shortens, the pipeline fills faster. But the credit begins to travel. Marketing generated the lead, sales closed it, and the two will read the same number two different ways. The return is attributable with effort, where one rung down it was simply obvious. This is the easiest place a large company can realistically win, which is why I point enterprise leaders here: one department in, and well short of the top. Boundaries crossed: many people, but one function and mostly one data system. Owners who could claim it: still essentially one.

Level 4: cross-functional

AI spans two or more functions. Connect sales to the warehouse. Stock reorders react to live sell-through, so the fast movers never run out and cash stops sitting in slow inventory. Genuinely valuable, and a real step up from anything below it.

And here the line of sight goes quiet. The working capital freed up is real, and three functions can each tell an honest story about why. Sales points to its demand signal, operations to its execution, finance to the model it sponsored. The data lives in systems that were never built to talk to each other. No single person owns the result, so no single person can prove it. The value is at least as large at Level 4 as at Level 3; what dropped away is the ability to see it.

That return can still be recovered; it is simply expensive to see. Holdout regions, staged rollouts, and a clean before-and-after on working-capital days recover most of the picture, and they cost real time and money to run. That measurement bill is part of the honest ROI, and Level 4 is the first rung where you have to budget for proof and not only for the build. Boundaries crossed: two functions or more, split systems, and now genuinely split ownership.
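
The holdout comparison comes down to simple arithmetic, and a short Python sketch makes it concrete. It assumes a hypothetical metric, working-capital days, measured before and after the rollout in regions that got the system and in holdout regions that did not. Every figure is made up for illustration, and the function name diff_in_diff is an assumption of the sketch. It subtracts the holdout's change from the treated regions' change, so that whatever moved both groups is not credited to the AI.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, holdout_before, holdout_after):
    """Change in the treated regions minus change in the holdout regions."""
    treated_change = mean(treated_after) - mean(treated_before)
    holdout_change = mean(holdout_after) - mean(holdout_before)
    return treated_change - holdout_change


# Illustrative, made-up working-capital days per region (not real data).
treated_before = [62.0, 58.0, 60.0]
treated_after = [50.0, 49.0, 51.0]
holdout_before = [61.0, 59.0]
holdout_after = [58.0, 56.0]

effect = diff_in_diff(treated_before, treated_after, holdout_before, holdout_after)

# A negative number means fewer days of working capital tied up.
print(f"Estimated effect: {effect:+.1f} working-capital days")
```

With these invented numbers the treated regions improve by ten days and the holdout by three, so only seven days are attributed to the system. Running the holdout at all is the measurement bill: those regions wait for the benefit while you collect the proof.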

Level 5: enterprise with external sources

AI reaches past the company walls. It pulls in customer feedback, market signals, and what is selling and being returned, and feeds that back into a product while it is still being shaped. A consumer brand tuning a formulation or a feature against live demand is operating up here. This is the prize everyone points to when they talk about AI transforming a business.

The prize is the biggest on the ladder, and the proof is the hardest of all to assemble. By the time the product lands, the market has moved, competitors have reacted, and a dozen internal decisions have layered on top of whatever the AI contributed. The gain is large and it is real. Pulling it cleanly out of everything else that happened in the same quarter takes a measurement effort most companies never staff. This is the quiet story behind the headline from MIT's 2025 study of 300 AI deployments: that 95 percent of enterprise generative-AI pilots never move the P&L. The same study found the clearest returns sitting in narrow back-office automation, while the budgets pour into broad, cross-cutting bets where the value is real and nobody can hold it up. Boundaries crossed: the most of any level, every internal function plus the outside world.

The shape this makes

Now line the five levels up and plot two things at once, not one. Plot the size of the prize, and plot how clearly you can see it. The prize climbs the whole way up; a Level 5 win dwarfs a Level 1 one. The clarity runs the other way, high at the small end and fading as you rise. The two lines cross low on the ladder, around Level 2.

Figure: The prize climbs the whole way up. The clarity runs the other way. They cross around Level 2.

That crossing is the entire point. Around Level 2 you get the rare combination of a return that is both real and fully legible. Climb above it and the value keeps growing while your ability to prove it falls away. The clarity of your ROI peaks early, at the small business, and the prize and the proof pull apart as the boundaries multiply.

Figure: Prize size and clarity of return, level by level.

What this means if you run something big

If you sit where I do, the temptation is to chase the big, invisible prize anyway, because the prize is where the money is. That is how most enterprise AI strategy already behaves, and it is how a year disappears into arguing about whether the flagship program worked while the budget patience runs out.

There is a sharper way to hold it. Sort any AI project on two axes, the value at stake and the visibility of the return, and four boxes fall out. High value and high visibility is the Level 2 sweet spot, and you do it first. High visibility and low value is busywork that demos well and moves nothing, the easiest trap to fund. Low value and low visibility you simply kill. High value with low visibility is the Level 4 and Level 5 prize, the work you cannot avoid because it is where the business is actually going.
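
The four boxes above can be sketched in a few lines of Python. This is a minimal illustration and not a scoring method. The 0-to-1 scores, the 0.5 cutoff, the example projects, and the names Project and triage are all assumptions made for the sketch. Where the cutoff sits is a judgment call, and how you score visibility matters more than the code.

```python
from dataclasses import dataclass

# Hypothetical cutoff: a score at or above it counts as "high".
THRESHOLD = 0.5


@dataclass
class Project:
    name: str
    value: float       # value at stake, on an illustrative 0.0 to 1.0 scale
    visibility: float  # how clearly one owner can see the return, 0.0 to 1.0


def triage(project: Project, threshold: float = THRESHOLD) -> str:
    """Place a project in one of the four boxes."""
    high_value = project.value >= threshold
    high_visibility = project.visibility >= threshold
    if high_value and high_visibility:
        return "do first"
    if high_visibility:
        return "busywork"
    if high_value:
        return "the prize"
    return "kill"


# Made-up portfolio, scored by hand for illustration only.
portfolio = [
    Project("lead scoring in marketing", value=0.7, visibility=0.8),
    Project("slide-deck generator", value=0.2, visibility=0.9),
    Project("sales-to-warehouse reordering", value=0.9, visibility=0.3),
    Project("chatbot nobody asked for", value=0.1, visibility=0.2),
]

for p in portfolio:
    print(f"{p.name}: {triage(p)}")
```

The order of the checks mirrors the argument: a project only reaches "the prize" box after failing the visibility test, which is the cue to bank a visible win elsewhere before funding it.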

Here is the move the sweet spot is pointing at, and it is easy to get backwards: you win small in order to go big. You prove the number where one owner can see it, then you spend that proof, the credibility and the budget it earns, on the bets whose returns will stay murky for a year. A banked, visible win is the thing that buys you the room to fund the unprovable ones. The beauty salon owner's advantage, one head seeing the whole loop, can be engineered inside a 50,000-person firm, but it has to be designed in on purpose, because scale strips it out by default. That is the real job of a use-case audit: find the place where the return will be visible, win there first, and use it to pay for the prize you cannot yet measure.

How this differs from a maturity model

You have seen ladders like this before, and they all point the arrow up and to the right, toward more sophistication. Those models get capability right; it really does grow as you climb. The part worth flipping is the instruction they share, which is to head for the top. Ask a different question, where can you actually see the money, and the prescription turns over. The advanced end is where the return goes to hide, while the small end keeps it in plain sight. Start where you can see it, then climb with proof in your pocket.

What is coming

Over the next few Sundays I am going to take these one rung at a time and go well past a single example. Next week, Level 1, where the return is easiest to feel: the actual build, the use cases worth starting with, and how to stand one up this month without writing a line of code. Then the small-business operating system. Afterwards, the single function. Then the enterprise, where I will bring the research on why the return goes dark at scale and what a leader does about it.

For this week, one thing to carry into Monday. Before you back any AI project, ask where its return will show up and who will be able to see it. If the answer is one person looking at one number, you are near the sweet spot. If the answer is a committee, get your proof somewhere smaller first, then come back for it.