Perspective · AI Operations
Agentic AI is here. The harder question is how.
Agentic AI — systems that take multi-step action rather than answering one prompt at a time — has moved from research lab to production stack in under two years. The question has shifted. Leaders are no longer asking whether to adopt. They're asking why most adoption isn't showing up in the P&L.
Peter Matysiak · Berlin · April 2026 · 8 min
A tool that outgrew its tool metaphor.
Agentic AI is no longer a research novelty. Gartner has named it the top strategic technology trend two years running. McKinsey's 2024 State of AI survey reports that around 72% of organisations use AI in at least one business function, and that gen-AI regular use has roughly doubled in a single year. Board-level adoption is happening. Financial impact is uneven.
The gap between adoption and impact is not an argument against agents. It is an argument about what "adoption" actually means. An agent bolted onto an existing process is a copilot. A process redesigned around what agents do reliably is a new operating model. The first saves minutes. The second moves margin.
Most AI spend today is stuck at augmentation.
In the augmentation pattern, the agent sits next to a human and helps with specific tasks inside an existing role: drafting an email, summarising a document, producing a chart. It shortens a step. It doesn't change the job. The ROI ceiling is modest — a few percentage points of productivity against a licence fee and some change-management friction.
In the autonomy pattern, the agent runs the process end-to-end. Humans move from doing the work to supervising it — setting scope, reviewing exceptions, auditing outputs, stepping in when the system is wrong. The job shifts from manual piloting to air-traffic control. The ROI ceiling is an order of magnitude higher — but it requires rethinking the role, the process, and the supervision layer. That is operating-model work, not tool rollout.
Recent Harvard Business Review coverage (February 2026) on Agent Management makes the same point in different words: the discipline that matters is workforce management, translated to a workforce that happens to be software.
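What that management layer asks of the organisation can be shown in a few lines of code. The sketch below is illustrative only: AgentResult, the confidence floor, and the routing labels are assumptions invented for this example, not a reference implementation.

```python
from dataclasses import dataclass, field

# Hypothetical shape of one agent-completed case. The names are
# illustrative; any real system will have its own result type.
@dataclass
class AgentResult:
    output: str
    confidence: float  # self-reported, 0.0 to 1.0
    flags: list[str] = field(default_factory=list)  # validation rules tripped

CONFIDENCE_FLOOR = 0.85  # tune against audited samples, not intuition

def supervise(result: AgentResult) -> str:
    """Route one case: auto-approve, or escalate to the human queue."""
    if result.flags:
        return "escalate:policy"          # hard rules always win
    if result.confidence < CONFIDENCE_FLOOR:
        return "escalate:low_confidence"  # unsure cases go to a human
    return "approve"                      # humans see these only in audit samples
```

The design choice that matters is the default: nothing reaches "approve" without passing rules the humans wrote, and everything the agent is unsure about lands in a human queue.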
The same argument, thirty years ago.
Lean manufacturing has two histories. One is the story of cost cutting: reduce inventory, reduce slack, reduce headcount, capture a one-time margin bump, move on. That history ended badly in many places — thin buffers that broke under the first shock, teams that stopped improving because there was nothing left to cut.
The other is the story Womack, Jones, and Roos told in The Machine That Changed the World: Lean as an operating system. Flow, standard work, a visible problem-solving culture, reinvestment of freed capacity into quality and new product. That history compounded for decades.
Agentic AI has the same bifurcation. Deployed as a cost-cutting tool, it delivers a short-lived saving and leaves the operating system untouched. Deployed as an operating-model change, it enables work that wasn't possible before and frees capacity for growth. The playbook is not new. The technology is.
Where agents work, and where they don't.
One of the clearest empirical pieces on human-AI collaboration comes from Harvard Business School and BCG (Dell'Acqua et al., Navigating the Jagged Technological Frontier, 2023). Consultants given access to GPT-4 produced work rated roughly 40% higher in quality on tasks inside the model's capability frontier, and were 19 percentage points less likely to reach a correct solution on tasks outside it. Same people, same tool, different problems.
The operator's job is to draw that line honestly for your own organisation. In broad terms, agents are good at research and synthesis, structured transformation of data, orchestrating other tools, drafting, classifying, extracting from messy inputs, and following a well-documented process. They are weak at context you haven't written down, long-horizon judgement, noticing when they're confused, and distinguishing a plausible-sounding wrong answer from a correct one.
The hardest skill is distrusting fluent, competent language.
(An observation from directing AI agents hands-on.)
Fluency is not correctness. A reliable agent is one that hands off when it should, not one that never stops talking.
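One way to operationalise that distrust is to refuse fluent prose outright. The sketch below assumes a hypothetical contract in which the agent must return structured JSON with evidence and an explicit option to abstain; anything else is treated as a handoff, not an answer. The field names are invented for illustration.

```python
import json

# Hypothetical output contract: an answer, supporting evidence, and an
# explicit abstain field. Fluent free text that ignores the contract
# is treated as a failure mode, never as a success.
REQUIRED_KEYS = {"answer", "evidence", "abstained"}

def parse_verdict(raw: str) -> dict:
    """Accept only structured, evidence-backed answers; all else hands off."""
    try:
        verdict = json.loads(raw)
    except json.JSONDecodeError:
        return {"abstained": True, "reason": "unparseable output"}
    if not isinstance(verdict, dict) or not REQUIRED_KEYS <= verdict.keys():
        return {"abstained": True, "reason": "contract not followed"}
    if verdict["abstained"] or not verdict["evidence"]:
        return {"abstained": True, "reason": "no evidence offered"}
    return verdict  # structured, sourced, explicit about its own limits
```

The point is not the JSON. It is that the cheap path for the agent is abstention, and the expensive path is a confident claim it has to back up.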
Two paths, both honest.
Agentic AI earns its keep along two routes, each requiring operating-model change.
Capability expansion. People do work they couldn't do before. A single analyst running what used to need a team. A field engineer who can interpret a technical document in another language in real time. A Mittelstand firm that can run the kind of customer research that previously took a consulting firm a month. This path looks like growth, not savings.
Capacity release. Freed-up time is redeployed to higher-value work: new products, quality improvement, customer development. This is the Lean playbook applied to knowledge work. If released capacity is harvested only as headcount reduction, the organisation gets a one-off saving and stops. If it is reinvested in growth, the gains compound.
Neither route shows up on the P&L unless the surrounding process and role boundary change. A copilot layered on top of an unchanged process returns unchanged margin. This is the common pattern behind the adoption-vs-impact gap McKinsey and others keep reporting.
A sequence that actually works.
Six moves, in order. Skipping any one of them is the usual failure pattern.
1. Literacy. Leadership and the first cohort of practitioners understand what agents do and do not do on your data, not in a vendor demo. No governance decision is sound without this.
2. Use-case mapping. Walk the value chain. Surface work that is repetitive, error-prone, time-intensive, and data-rich. Prioritise by impact and readiness, not novelty. Ten candidates; three to start.
3. Tooling. Choose with reversibility in mind. Avoid lock-in that doesn't buy you anything. Build the internal muscle to swap the engine later; models change faster than business processes.
4. Operating model. Redesign the role boundary. Agents own what they do reliably. Humans supervise, review exceptions, and escalate. Role descriptions, RACI, and success metrics change at the same time as the tool, not afterwards.
5. Governance. Data protection, security, audit trail, escalation paths. EU AI Act compliance is not optional for most operational use cases, and is materially easier when designed in than bolted on.
6. Measurement. Token cost, adoption rate, quality outcomes, escalation frequency. Stop what isn't working. Scale what is. This is an ongoing line item, not a project closure; a sketch of such a scorecard follows this list.
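As a sketch of what "ongoing line item" can look like, the four metrics in move six can be tracked per use case and tied to an explicit stop/scale rule. The structure and thresholds below are assumptions for illustration, not benchmarks.

```python
from dataclasses import dataclass

# Hypothetical per-use-case scorecard built on the four metrics in move six.
@dataclass
class UseCaseScorecard:
    name: str
    token_cost_eur: float     # model spend this review period
    adoption_rate: float      # share of eligible cases routed to the agent
    quality_pass_rate: float  # share of audited outputs meeting the bar
    escalation_rate: float    # share of cases handed back to humans

def review(card: UseCaseScorecard) -> str:
    """Crude stop/scale rule; every threshold here must be tuned locally."""
    if card.quality_pass_rate < 0.90 or card.escalation_rate > 0.30:
        return f"{card.name}: stop or redesign"
    if card.adoption_rate > 0.60:
        return f"{card.name}: scale"
    return f"{card.name}: keep measuring"
```

Token cost sits on the card rather than in the rule: it prices every per-case ROI claim instead of gating the decision.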
Multiply, in four depths.
The Foundation workshop covers literacy and the first two or three use cases — the minimum needed to make a sound next decision. The Builder tier develops internal capability so the team can author its own tools. The Strategist tier moves the conversation to portfolio thinking, business cases, and governance. Bespoke engagements run implementation alongside your team, typically in partnership with internal engineering and operations.
None of these tiers tries to own the organisation's AI adoption. The aim is to leave internal capability behind, matching the built-in-exit principle that applies across the rest of this practice.
Where these arguments come from.
A short, opinionated reading list. Every one of these is worth reading in full. None of them should be read uncritically.
Talk it through?
A thirty-minute call usually produces at least one decision that moves the conversation forward, even when the answer is "not yet" or "not with this practice". The most useful calls are the ones where a leader can describe the trigger event in a sentence: a stuck pilot, a vendor decision, a use-case shortlist that won't land.