The story is the same across thirty different companies. A Big-4 firm wins the AI work. A six-month engagement is scoped. The opening conversation is hopeful. By month three, executive sponsors are reviewing a slide deck of opportunities. By month six, they are reviewing a final deck of recommendations and a pilot proposal. At no point in the engagement did anything write to a system of record. The customer is left with strategic clarity, a portfolio of next-phase initiatives, and the option to either commit another half-million dollars or quietly let the program lapse. Most quietly let it lapse.
This is not a failure of execution. The consultants on these engagements are smart, the methodology is rigorous, and the analysis is genuinely useful for the work it was built to do. The deck is not bad. The deck is the deliverable, by design.
Who is in the room on a Big-4 AI engagement
To see why, walk through the org chart on a typical AI engagement at one of the Big-4 firms. There is a partner who owns the relationship. A principal who runs the engagement. An engagement manager who keeps the workstreams aligned. Four to six consultants, some of whom carry the title "AI specialist," who do the analysis. Behind them, an offshore delivery centre that nominally produces working artifacts. Twelve to twenty people in the relationship, billable, on a six-month clock.
What is missing from the engagement team
Notice what is missing. There is no engineer in the customer's environment with permission to write to the customer's systems. The on-site team is staffed by management consultants whose toolkit is interviews, frameworks, and presentations. The offshore team can produce code, but they do so from a delivery centre with limited environment access and a separate priority queue, often working from a different time zone with handoffs every twelve hours. The bridge between "we know what the agent should do" and "the agent is running in production on the customer's stack" is owned by no one in the room.
The six-month sequence: deck, deck, deck, deck
Here is what actually happens over the six months. The first phase is discovery: stakeholder interviews, current-state process mapping, opportunity identification. The output is a deck. The second phase is prioritisation: opportunity sizing, business cases, a portfolio view. The output is a deck. The third phase is design: future-state target operating model, technology selection, an implementation roadmap. The output is a deck. The fourth phase is pilot definition: scope, success criteria, governance structure. The output is a deck. The fifth phase, if it survives the budget review, is the pilot itself, built by the offshore team against a thin-sliced version of the problem. There is a demo. The engagement closes.
What is in production on day 181, in the great majority of cases, is nothing. Some recommendations have been accepted. Some governance committees have been stood up. Some budget has been allocated for a future phase. McKinsey's own published number is that roughly seventy percent of transformations fail to meet their objectives. The Big-4 firms know this; they cite it themselves in pitches for the next round.
Why the model produces decks instead of software
Why the model produces this outcome is not mysterious. The firms that deliver AI engagements at this scale are professional services firms. Their P&L is built on billable hours of senior consultants doing analysis, not engineering. When the engagement requires engineering, it gets routed to a delivery centre where the unit economics are different, the access is different, and the institutional priority is different. The two halves of the work are structurally separate inside the firm because they are commercially separate. The senior consultants get rewarded for selling the next phase. The offshore engineers get rewarded for landing the code that the consultants specified. Nobody in the structure is rewarded for ensuring that the agent is running in the customer's production environment six months from now.
How the Big-4 firms are restructuring
The Big-4 firms are aware of this and are restructuring under that pressure. McKinsey now reports twenty-five percent of projects priced against outcomes rather than time. Lilli, the firm's internal AI tool, is in daily use by seventy percent of consultants. Five thousand support roles have been eliminated, replaced by AI doing the work those roles used to do. EY has hired 61,000 technologists since 2023, fifteen percent of the workforce, and is openly exploring what it calls service-as-a-software. PwC cut graduate hiring by thirty percent over three years. The firms are not blind to the model breaking; they are restructuring around it as fast as a leveraged services P&L can be restructured, which is to say, slowly. As Hywel Ball, the former chair of EY UK, put it in a recent interview: "the bigger you are, the slower change can be; even small process tweaks can take months or years."
The customer who needs AI in production this quarter cannot wait for the firms to finish restructuring. The structural reason the deliverable is a deck is that the people in the room write decks. They are very good at it. They have been trained for fifteen years to produce them. Asking them to ship software is asking the wrong people to do the wrong job inside the wrong commercial structure. There is no individual fault in this. The fault is in expecting a model designed for one job to do a different one.
The structural reason the deliverable is not software is that software requires a different workflow, a different access pattern, a different unit of work, a different review process, and a different person sitting next to the system administrator at the moment of deployment. None of that fits inside a Big-4 engagement model, and that is fine, because Big-4 engagements were not designed to do it. The model serves a board needing strategy, an executive team needing alignment, a planning function needing an opportunity portfolio. These are real needs and the model serves them well. What the model does not serve is a CFO who needs the accounts-payable queue processed with one fewer person on the team this quarter.
The right response is not to complain about the consultants. The model serves the customer it was built for. The right response is to recognise that AI delivery to operating businesses is a different job that requires a different vehicle, and to stop being surprised when the existing vehicles do not deliver it. The next post starts to define what that vehicle has to look like, and where the foundational case for it was made (in 1990, by a man named Michael Hammer, in the pages of Harvard Business Review).