Perspective · AI Operations
Agentic AI is here. The harder question is how.
Agentic AI — systems that take multi-step action rather than answering one prompt at a time — has moved from research lab to production stack in under two years. The question has shifted. Leaders are no longer asking whether to adopt. They're asking why most adoption isn't showing up in the P&L.
Peter Matysiak · Berlin · April 2026 · 8 min
A tool that outgrew its tool metaphor.
Agentic AI is no longer a research novelty. Gartner has named it the top strategic technology trend two years running. McKinsey's 2024 State of AI survey reports that around 72% of organisations use AI in at least one business function, and that gen-AI regular use has roughly doubled in a single year. Board-level adoption is happening. Financial impact is uneven.
The gap between adoption and impact is not an argument against agents. It is an argument about what "adoption" actually means. An agent bolted onto an existing process is a copilot. A process redesigned around what agents do reliably is a new operating model. The first saves minutes. The second moves margin.
Most AI spend today is stuck at augmentation.
In the augmentation pattern, the agent sits next to a human and helps with specific tasks inside an existing role: drafting an email, summarising a document, producing a chart. It shortens a step. It doesn't change the job. The ROI ceiling is modest — a few percentage points of productivity against a licence fee and some change-management friction.
In the autonomy pattern, the agent runs the process end-to-end. Humans move from doing the work to supervising it — setting scope, reviewing exceptions, auditing outputs, stepping in when the system is wrong. The job shifts from manual piloting to air-traffic control. The ROI ceiling is an order of magnitude higher — but it requires rethinking the role, the process, and the supervision layer. That is operating-model work, not tool rollout.
Recent Harvard Business Review coverage (February 2026) on Agent Management makes the same point in different words: the discipline that matters is workforce management, translated to a workforce that happens to be software.
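What that management layer asks of the organisation can be shown in a few lines of code. The sketch below is illustrative only: AgentResult, the confidence floor, and the routing labels are assumptions invented for this example, not a reference implementation.

```python
from dataclasses import dataclass, field

# Hypothetical shape of one agent-completed case. The names are
# illustrative; any real system will have its own result type.
@dataclass
class AgentResult:
    output: str
    confidence: float  # self-reported, 0.0 to 1.0
    flags: list[str] = field(default_factory=list)  # validation rules tripped

CONFIDENCE_FLOOR = 0.85  # tune against audited samples, not intuition

def supervise(result: AgentResult) -> str:
    """Route one case: auto-approve, or escalate to the human queue."""
    if result.flags:
        return "escalate:policy"          # hard rules always win
    if result.confidence < CONFIDENCE_FLOOR:
        return "escalate:low_confidence"  # unsure cases go to a human
    return "approve"                      # humans see these only in audit samples
```

The design choice that matters is the default: nothing reaches "approve" without passing rules the humans wrote, and everything the agent is unsure about lands in a human queue.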
The same argument, thirty years ago.
Lean manufacturing has two histories. One is the story of cost cutting: reduce inventory, reduce slack, reduce headcount, capture a one-time margin bump, move on. That history ended badly in many places — thin buffers that broke under the first shock, teams that stopped improving because there was nothing left to cut.
The other is the story Womack, Jones, and Roos told in The Machine That Changed the World: Lean as an operating system. Flow, standard work, a visible problem-solving culture, reinvestment of freed capacity into quality and new product. That history compounded for decades.
Agentic AI has the same bifurcation. Deployed as a cost-cutting tool, it delivers a short-lived saving and leaves the operating system untouched. Deployed as an operating-model change, it enables work that wasn't possible before and frees capacity for growth. The playbook is not new. The technology is.
Where agents work, and where they don't.
One of the clearest empirical pieces on human-AI collaboration comes from Harvard Business School and BCG (Dell'Acqua et al., Navigating the Jagged Technological Frontier, 2023). Consultants given access to GPT-4 produced work rated roughly 40% higher in quality on tasks inside the model's capability frontier, and were 19 percentage points less likely to reach a correct solution on tasks outside it. Same people, same tool, different problems.
The operator's job is to draw that line honestly for your own organisation. In broad terms, agents are good at research and synthesis, structured transformation of data, orchestrating other tools, drafting, classifying, extracting from messy inputs, and following a well-documented process. They are weak at context you haven't written down, long-horizon judgement, noticing when they're confused, and distinguishing a plausible-sounding wrong answer from a correct one.
The hardest skill is distrusting fluent, competent language.
(An observation from directing AI agents hands-on.)
Fluency is not correctness. A reliable agent is one that hands off when it should, not one that never stops talking.
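One way to operationalise that distrust is to refuse fluent prose outright. The sketch below assumes a hypothetical contract in which the agent must return structured JSON with evidence and an explicit option to abstain; anything else is treated as a handoff, not an answer. The field names are invented for illustration.

```python
import json

# Hypothetical output contract: an answer, supporting evidence, and an
# explicit abstain field. Fluent free text that ignores the contract
# is treated as a failure mode, never as a success.
REQUIRED_KEYS = {"answer", "evidence", "abstained"}

def parse_verdict(raw: str) -> dict:
    """Accept only structured, evidence-backed answers; all else hands off."""
    try:
        verdict = json.loads(raw)
    except json.JSONDecodeError:
        return {"abstained": True, "reason": "unparseable output"}
    if not isinstance(verdict, dict) or not REQUIRED_KEYS <= verdict.keys():
        return {"abstained": True, "reason": "contract not followed"}
    if verdict["abstained"] or not verdict["evidence"]:
        return {"abstained": True, "reason": "no evidence offered"}
    return verdict  # structured, sourced, explicit about its own limits
```

The point is not the JSON. It is that the cheap path for the agent is abstention, and the expensive path is a confident claim it has to back up.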
Two paths, both honest.
Agentic AI earns its keep along two routes, each requiring operating-model change.
Capability expansion. People do work they couldn't do before. A single analyst running what used to need a team. A field engineer who can interpret a technical document in another language in real time. A Mittelstand firm that can run the kind of customer research that previously took a consulting firm a month. This path looks like growth, not savings.
Capacity release. Freed-up time is redeployed to higher-value work: new products, quality improvement, customer development. This is the Lean playbook applied to knowledge work. If released capacity is harvested only as headcount reduction, the organisation gets a one-off saving and stops. If it is reinvested in growth, the gains compound.
Neither route shows up on the P&L unless the surrounding process and role boundary change. A copilot layered on top of an unchanged process returns unchanged margin. This is the common pattern behind the adoption-vs-impact gap McKinsey and others keep reporting.
A sequence that actually works.
Six moves, in order. Skipping any one of them is the usual failure pattern.
1. Literacy. Leadership and the first cohort of practitioners understand what agents do and do not do on your data, not in a vendor demo. No governance decision is sound without this.
2. Use-case mapping. Walk the value chain. Surface work that is repetitive, error-prone, time-intensive, and data-rich. Prioritise by impact and readiness, not novelty. Ten candidates; three to start.
3. Tooling. Choose with reversibility in mind. Avoid lock-in that doesn't buy you anything. Build the internal muscle to swap the engine later; models change faster than business processes.
4. Operating model. Redesign the role boundary. Agents own what they do reliably. Humans supervise, review exceptions, and escalate. Role descriptions, RACI, and success metrics change at the same time as the tool, not afterwards.
5. Governance. Data protection, security, audit trail, escalation paths. EU AI Act compliance is not optional for most operational use cases, and is materially easier when designed in than bolted on.
6. Measurement. Token cost, adoption rate, quality outcomes, escalation frequency. Stop what isn't working. Scale what is. This is an ongoing line item, not a project closure; a sketch of such a scorecard follows this list.
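As a sketch of what "ongoing line item" can look like, the four metrics in move six can be tracked per use case and tied to an explicit stop/scale rule. The structure and thresholds below are assumptions for illustration, not benchmarks.

```python
from dataclasses import dataclass

# Hypothetical per-use-case scorecard built on the four metrics in move six.
@dataclass
class UseCaseScorecard:
    name: str
    token_cost_eur: float     # model spend this review period
    adoption_rate: float      # share of eligible cases routed to the agent
    quality_pass_rate: float  # share of audited outputs meeting the bar
    escalation_rate: float    # share of cases handed back to humans

def review(card: UseCaseScorecard) -> str:
    """Crude stop/scale rule; every threshold here must be tuned locally."""
    if card.quality_pass_rate < 0.90 or card.escalation_rate > 0.30:
        return f"{card.name}: stop or redesign"
    if card.adoption_rate > 0.60:
        return f"{card.name}: scale"
    return f"{card.name}: keep measuring"
```

Token cost sits on the card rather than in the rule: it prices every per-case ROI claim instead of gating the decision.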
Multiply, in four depths.
The Foundation workshop covers literacy and the first two or three use cases — the minimum needed to make a sound next decision. The Builder tier develops internal capability so the team can author its own tools. The Strategist tier moves the conversation to portfolio thinking, business cases, and governance. Bespoke engagements run implementation alongside your team, typically in partnership with internal engineering and operations.
None of these tiers tries to own the organisation's AI adoption. The aim is to leave internal capability behind, matching the built-in-exit principle that applies across the rest of this practice.
Where these arguments come from.
A short, opinionated reading list. Every one of these is worth reading in full. None of them should be read uncritically.
Talk it through?
A thirty-minute call usually produces at least one decision that moves the conversation forward, even when the answer is "not yet" or "not with this practice". The most useful calls are the ones where a leader can describe the trigger event in a sentence: a stuck pilot, a vendor decision, a use-case shortlist that won't land.