Spend enough time around enterprise AI and you notice the failures don't look like the ones you were warned about. The machine isn't too dumb. It's often startlingly capable. It fails the way a brilliant new hire fails on the first morning — not for lack of intelligence, but for lack of traction. It knows what to do and has no firm place to stand while doing it.
The clearest way I've found to explain this is a boot.
A boot has four parts that matter. The sole is rigid and load-bearing — the part in contact with the ground, carrying your weight whether you're standing still or running flat out. The upper is the soft, flexible part that wraps your foot and moves with you. The seam is the stitched line that joins the upper to the sole, so the flexible part and the rigid part are actually one boot and not two pieces sitting in a box. And the laces are what shape how the two work together: they apply just enough tension that when you push off, the force transfers cleanly through the seam to the sole instead of your foot sliding around inside. A strong sole and a well-made upper still fail without a seam to join them and laces to shape them. The foot slips, friction builds, and you go over — not because any single part was weak, but because nothing joined them or shaped how they worked together.
Your business is the same shape, whether or not anyone ever drew it that way.
The sole is the set of systems that carry your weight: the records that have to be right, the rules that keep you compliant, the transactions that can't be quietly undone once they happen. These are deliberately rigid, and they should be — you do not want them improvising. The upper is AI: adaptive, fast, fluent with ambiguity, able to read a messy situation and decide what to do next. That flexibility is exactly what you want from it, and exactly why it cannot be trusted to carry the load alone.
Which leaves the two parts nobody puts on the slide — and which turn out to be where every real failure actually lives. The seams and the laces.
The seam is where AI connects to the systems that carry weight: the fixed points where its reasoning is allowed to touch a real record, a real price, a real transaction. The laces are the rules that shape what it can commit to once connected — how far its authority runs, what it's allowed to promise, where it has to stop and ask. Get those two wrong and it does not matter how smart the AI is. The headlines everyone remembers from 2024 were not intelligence failures. Every one of them was a seam-or-lace failure, and the three most-shared of them happen to map exactly onto the three ways the structure can break.
Three failures, none of them about intelligence
Take the dealership whose assistant agreed to sell a brand-new SUV for a single dollar. People online simply told it to agree with everything they said and to treat its reply as a binding offer — and it cheerfully did. The reflex was to laugh at the AI. But the AI was never the problem. It had no connection to pricing, to inventory, to anything that could actually authorize a sale. It could talk about cars all day and could not sell one, because there was no seam — nothing joining its words to the systems that govern a real transaction. There was no boot at all. Just an upper, flopping around, attached to nothing. Within two days the dealership took it down.
Now take the airline. Its assistant told a grieving customer he could claim a bereavement refund after he'd flown. That wasn't the policy. When the customer came to collect, the airline refused — and a tribunal made them pay, ruling that a company is responsible for what its assistant tells a customer, whether the words come from a static page or a chatbot. Here the connection existed; the assistant could reach real policy. What was missing was any limit on what it could promise. It made a binding commitment the business never authorized. The seam was there. The lacing was too loose. The boot existed, and it didn't hold.
Then the opposite failure, which is just as instructive. A delivery company's assistant was locked down so hard, so afraid of saying anything wrong, that it became useless for its actual job — until a frustrated customer discovered it would happily write a poem about how terrible the company was, swearing included. Everyone shared the poem. The lesson underneath the laughter: the lacing was cranked so tight the thing couldn't do the one job it was there to do. A boot laced that hard is one you can't walk in.
One company had no seam. One had a seam but loose laces. One was laced so tight the boot was unwearable. Three different failures — and not one of them was about whether the AI was clever enough. All three were about the structure around the intelligence: where it was allowed to connect, and what it was allowed to commit to.
Why this is the pattern, not a run of bad luck
It would be comfortable to file those three away as freak embarrassments, the kind of thing that happens to other people's companies. The aggregate numbers say otherwise. The industry analysts who track this put the failure rate of ambitious AI efforts at a level most boards would find alarming. Gartner expects more than 40% of agentic AI projects to be cancelled by the end of 2027, undone by runaway cost, unclear value, or inadequate controls. McKinsey's field research finds that fewer than one in ten AI use cases that get piloted ever make it past the pilot. A 2025 survey of enterprises found that the large majority needed to upgrade their existing systems before AI could be deployed against them at all, and that the most-cited obstacle was not model quality but security and control.
Read those numbers through the boot and they stop being a mystery. The pilots dazzle because a demo only has to show the upper moving. Production is where the upper has to transfer load to the sole through seams and laces that mostly were never built. The thing that looked finished in the demo was a single part held up by hand. The thing that fails in production is the absence of everything that was supposed to be around it. The failures cluster because the missing piece is the same piece every time, and it is never the intelligence.
What a boot that works actually looks like
It's worth drawing the success case as plainly as the failures, because it is far less dramatic and that is exactly the point. Picture the same customer-service job the airline and the delivery company got wrong — handling a refund — built correctly.
The connections are deliberate and few. The assistant can read the order history, so it never asks the customer for information the company already has. It can see the policy that governs the request. It can reach the system that would actually move the money. Each of those is a seam, chosen on purpose, and nothing outside that small set is within reach.
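For readers who want to see what "deliberate and few" could mean in practice, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Seams class, the three connection names, and the ORDERS stand-in data are hypothetical and do not describe any particular product. The point it tries to show is only that the list of connections is written down in one place, and that anything not on the list fails loudly.

```python
from typing import Callable, Dict, List


class SeamError(PermissionError):
    """Raised when the assistant reaches for a system outside its allowlist."""


class Seams:
    """An explicit, deliberately short list of systems the assistant may touch."""

    def __init__(self) -> None:
        self._connections: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, handler: Callable[..., object]) -> None:
        # Adding a seam is a visible, reviewable act, not something that accumulates.
        self._connections[name] = handler

    def call(self, name: str, *args, **kwargs):
        # Anything that was never registered is simply out of reach.
        if name not in self._connections:
            raise SeamError(f"no seam named {name!r}")
        return self._connections[name](*args, **kwargs)

    def names(self) -> List[str]:
        return sorted(self._connections)


# Hypothetical stand-in for a real order system.
ORDERS = {"A100": {"amount": 42.50, "days_since_delivery": 5, "product": "standard"}}

seams = Seams()
seams.register("read_order_history", lambda order_id: ORDERS[order_id])
seams.register("read_refund_policy", lambda: {"max_amount": 100.0, "max_days": 30})
seams.register(
    "issue_refund",
    lambda order_id, amount: {"order_id": order_id, "refunded": amount},
)
```

In this sketch, seams.call("read_order_history", "A100") returns the stored order, while seams.call("change_price", "A100", 1.0) raises SeamError, because nobody ever chose to build that connection.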
The authority is shaped to match. The assistant can approve a refund on its own when the amount, the timing, and the product fall inside limits the business set in advance. Outside those limits it does not improvise and it does not stall — it hands the case to a person, with the context already assembled. Every decision it makes is logged in a form an auditor could follow later: what it did, and which rule and which record told it to. That's the lacing. It is unglamorous, it never trends, and it is the entire difference between an assistant that compounds value and one that becomes a lawsuit.
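The lacing described above can be sketched the same way. This is an illustration under stated assumptions, not a reference design: the RefundLimits values, the field names, and the decide_refund function are all hypothetical. It tries to show three things from the paragraph: limits set in advance, a handoff to a person outside those limits, and a log entry that records which rules and which record drove each decision.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class RefundLimits:
    # Hypothetical limits; a real business would set its own in advance.
    max_amount: float = 100.0
    max_days_since_delivery: int = 30
    eligible_products: frozenset = frozenset({"standard"})


@dataclass(frozen=True)
class RefundRequest:
    order_id: str
    amount: float
    days_since_delivery: int
    product: str


@dataclass
class AuditLog:
    entries: List[dict] = field(default_factory=list)

    def record(self, **entry) -> None:
        self.entries.append(entry)


def decide_refund(req: RefundRequest, limits: RefundLimits, log: AuditLog) -> str:
    """Approve only inside the preset limits; otherwise hand off to a person."""
    checks = {
        "amount": req.amount <= limits.max_amount,
        "timing": req.days_since_delivery <= limits.max_days_since_delivery,
        "product": req.product in limits.eligible_products,
    }
    failed = [rule for rule, passed in checks.items() if not passed]
    decision = "approve" if not failed else "escalate_to_human"
    # Log what was decided, against which record, and which rules applied.
    log.record(
        order_id=req.order_id,
        decision=decision,
        rules_checked=list(checks),
        failed_rules=failed,
    )
    return decision


log = AuditLog()
limits = RefundLimits()
small = decide_refund(RefundRequest("A100", 42.50, 5, "standard"), limits, log)
large = decide_refund(RefundRequest("A101", 250.00, 5, "standard"), limits, log)
```

Here the first request falls inside every limit and is approved, and the second exceeds the amount limit and is escalated. Neither path lets the assistant improvise, and both leave an entry an auditor could follow.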
Notice what is not on that list. Nothing about a cleverer model. Nothing about better phrasing. The boot that works and the boot that ends up in the headlines can run the identical intelligence inside the identical upper. What separates them is whether anyone built the seams and laces, and that is a decision made in a boardroom long before it is a problem on a customer-service transcript.
How this should change the way you buy
The instinct in most leadership teams is to ask whether the AI is smart enough, accurate enough, advanced enough. In other words, to go shopping for a better upper. But a better upper doesn't fix a boot with no laces. The questions that actually predict whether this works are quieter and more structural, and any executive can ask them without a single technical term.
Where, exactly, does this connect to the systems that carry our weight, and is that list of connections deliberate, or did it accumulate? What is it allowed to commit to on its own, and where does it have to stop and ask? When it's wrong (and it will sometimes be wrong), does the mistake stay inside a boundary we drew on purpose, or does it become a promise we're legally bound to keep? Can we reconstruct, after the fact, why it did what it did? Those questions interrogate the boot, not the upper. They are answerable in plain language, and a vendor who can only answer the capability questions (how smart, how advanced) is selling you an upper and letting you assume the rest of the boot comes with it.
Governance, in other words, is not about how the AI thinks. It's about what it can commit to. You will never fully control the reasoning — that's the nature of the thing, the same way you can't dictate every motion of a foot mid-stride. What you can control is the boot: the seams that decide where intelligence is allowed to touch reality, and the laces that decide how much it can promise once it gets there. Regulate the interfaces and the authority, not the intelligence, and you are governing the part you can actually hold.
Start tight, then loosen on purpose
There is one operating principle that follows directly from all of this, and it runs against the grain of how most AI gets rolled out. Start with the laces tight, and loosen them deliberately. It is far easier to grant an assistant more authority once you've watched it earn the trust than it is to claw back a commitment it should never have been able to make. The airline did not get to un-promise the refund. The dealership did not get to retract the offer until after it had become a public spectacle. Authority you extend slowly is a decision. Authority you discover the system already had is an incident.
The most public version of this lesson came from the company that put AI on the front line of customer service, announced it was doing the work of seven hundred people, and a year later said plainly that cost had become the dominant factor and the quality had dropped — and began rehiring. That was not a story about AI being incapable. It was a story about lacing calibrated wrong on the first pass and corrected on the second: loosen everything at once in pursuit of savings, watch quality erode at the edges where the volume hides, then tighten selectively until the assistant handles the routine and the humans hold the hard cases. The correction is the lesson. It is what calibrating a boot looks like when you do it in public.
You don't need a public reversal to know whether your own lacing is drifting. The warning signs are legible from the executive floor. Customers complaining that they were promised something the company won't honor is the loose-lace signal. An assistant that can't perform its basic function, so people route around it, is the over-tight signal. Metrics that look excellent at launch and erode as volume climbs and edge cases accumulate are the calibration signal. And humans spending their days correcting the AI's decisions means the structure isn't transferring load. It's manufacturing work. None of those require a technical audit to spot. They show up in complaints, in escalation rates, in the quiet sense that the thing is creating as much cleanup as it saves.
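The four signals above can be read off numbers most support teams already collect. The sketch below is one hypothetical way to do that in Python. The WeeklyStats fields, the signal names, and every threshold are assumptions chosen for illustration, not benchmarks; a real team would set its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WeeklyStats:
    conversations: int
    unhonored_promise_complaints: int  # "you told me X and now won't honor it"
    routed_around: int                 # customers who bypassed the assistant
    decisions: int
    decisions_corrected: int           # decisions a human had to fix afterwards
    resolution_rate: float             # this week's rate, between 0 and 1
    launch_resolution_rate: float      # the same metric at launch


def lacing_signals(
    s: WeeklyStats,
    *,
    promise_rate: float = 0.01,    # assumed threshold
    bypass_rate: float = 0.30,     # assumed threshold
    erosion: float = 0.10,         # assumed threshold
    correction_rate: float = 0.20, # assumed threshold
) -> List[str]:
    """Return the warning signals that this week's numbers trip."""
    conversations = max(s.conversations, 1)  # avoid dividing by zero
    decisions = max(s.decisions, 1)
    signals = []
    if s.unhonored_promise_complaints / conversations > promise_rate:
        signals.append("loose_laces")
    if s.routed_around / conversations > bypass_rate:
        signals.append("over_tight")
    if s.launch_resolution_rate - s.resolution_rate > erosion:
        signals.append("calibration_drift")
    if s.decisions_corrected / decisions > correction_rate:
        signals.append("manufacturing_work")
    return signals


week = WeeklyStats(
    conversations=1000,
    unhonored_promise_complaints=25,
    routed_around=100,
    decisions=800,
    decisions_corrected=240,
    resolution_rate=0.80,
    launch_resolution_rate=0.85,
)
flags = lacing_signals(week)
```

With these made-up numbers, the week trips the loose-lace and manufacturing-work signals and none of the others. The useful part is not the thresholds but the habit of checking the same four things every week.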
The question to take into the room
The companies struggling with AI right now mostly believe they have an intelligence problem, and they're out shopping for a smarter model. The ones who have made it work figured out something less flattering and far more useful. They didn't need smarter AI. They needed better boots — and they built them before they let the AI take a single step that touched a customer.
Where is your AI allowed to make a promise you'd be legally bound to keep — and who, exactly, decided where that line falls?
