Silent Semantic Drift -- Inter-Agent Series #3

The Most Dangerous Agent Failure Looks Exactly Like Success.

May 20, 2026

The most expensive agent failure of the next several years won’t look like a failure at all. It’ll look like success — same green dashboard, same 200 OK, same tidy log entry on both sides of the wire.

Except the two sides have just agreed to two different things, and nobody will know for three weeks.

That failure has no name in your incident vocabulary. It’s not in your CISO’s playbook. It doesn’t pattern-match to anything your QA team has tested for. When it surfaces — at delivery, at reconciliation, at the lawyer’s office — the audit trail will swear nothing went wrong.

I’ve been calling this silent semantic drift. I think you may need the term soon.

What it is

Silent semantic drift is the gradual, undetected divergence between what two or more agents believe they have agreed upon over the course of a multi-turn exchange.

It requires no bad actor. It produces no error message. It is the natural consequence of two or more systems exchanging natural language while each side resolves that language privately, against contexts the other side cannot see.

The failure isn’t happening inside either agent — it’s happening between them, and that’s the thing existing tools were not built to see.

It also lives in one specific place: the interaction phase of the full-cycle agentic experience, the long, ambiguous middle that sits between encounter and settlement — the back-and-forth where a quote becomes an order and terms get pinned down (or don’t). The phase your current stack is mostly silent about.

Why silent

Two weeks ago I argued that a 200 OK is a false positive for alignment. Drift is what fills that gap. Each individual exchange parses as fine — headers well-formed, schemas valid, confirmations coming back in the affirmative. Watch it on a console and every turn looks healthy.

The conventional debugging instinct (something must be erroring; let me find the error) fails here, because nothing is erroring. The transaction completes, settlement fires, the dashboard stays green. Most teams I talk to don’t have this failure mode on their list yet.

Why semantic

The drift isn’t in the bytes. It’s in the meaning.

Consider a request like “send the standard concentration.” Both agents parse the sentence. The word standard, however, gets resolved on each side against a different reference: one agent’s product catalog says one thing; the other’s says another. Each side is internally consistent. Neither is hallucinating. The disagreement is entirely about what real-world referent the words are pointing at — and there is no shared shelf the two agents can point at to disambiguate.
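
A toy sketch makes the mechanism concrete. Everything in it is hypothetical: the two catalogs, the SKU names, and the `resolve` helper are invented for illustration and don't come from any real system. Both agents receive the same well-formed message and both report success, yet each has bound "standard" to a different product.

```python
# Hypothetical catalogs: each agent privately maps "standard" to its own SKU.
BUYER_CATALOG = {"standard": {"sku": "RGT-10MM", "concentration_mM": 10}}
SELLER_CATALOG = {"standard": {"sku": "RGT-50MM", "concentration_mM": 50}}


def resolve(message: dict, catalog: dict) -> dict:
    """Resolve the natural-language term against one side's private catalog."""
    item = catalog[message["concentration"]]
    return {
        "status": "ok",
        "sku": item["sku"],
        "concentration_mM": item["concentration_mM"],
    }


# The message on the wire is identical for both sides and parses cleanly.
message = {"action": "order", "concentration": "standard", "quantity": 4}

buyer_view = resolve(message, BUYER_CATALOG)
seller_view = resolve(message, SELLER_CATALOG)

both_ok = buyer_view["status"] == seller_view["status"] == "ok"
same_referent = buyer_view["sku"] == seller_view["sku"]

print(f"both sides report ok: {both_ok}")
print(f"same product meant:   {same_referent}")
```

Nothing in either agent's own view can reveal the problem: the mismatch only exists when the two resolved views are placed side by side, which neither side is in a position to do.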

This is what psycholinguists since Herbert Clark have called the common ground problem: two parties communicating without a continuously verified shared frame of reference will diverge in their interpretations. The active process by which humans maintain that shared frame has its own name — grounding — and it runs in the background of every successful conversation, mostly through small confirming behaviors: eye contact, nods, repetition, “so what you’re saying is…” Almost none of that crosses the wire between agents, by design.

JSON, as I put it last time, is a courier, not a referee. Silent semantic drift is what happens when nobody refs.

Why drift

A single semantic mismatch on a single turn isn’t, by itself, catastrophic. The danger is what happens across turns. Each subsequent message is interpreted through the already-divergent frame, which means clarifications intended to resolve the divergence often deepen it. The buyer’s agent asks about delivery timeline; the seller’s agent quotes a timeline for the wrong SKU; the buyer’s agent confirms; the divergence is now load-bearing for the rest of the transaction. By turn six, the agents aren’t negotiating the same deal. By turn ten, the gap is wide enough that you’d see it in seconds — if anyone were looking.
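
The accumulation can be pictured with a toy loop. This is a sketch under invented assumptions, not a model of any real negotiation: the SKUs, the per-product facts, and the three questions are all hypothetical. The seller's agent answers each question for the product it has in mind, the buyer's agent confirms, and every confirmed term quietly attaches to the wrong product on the buyer's side.

```python
# Hypothetical per-SKU facts, known only on the seller's side.
SELLER_FACTS = {
    "RGT-10MM": {"lead_time_days": 3, "unit_price": 40, "cold_chain": False},
    "RGT-50MM": {"lead_time_days": 14, "unit_price": 95, "cold_chain": True},
}

buyer_sku = "RGT-10MM"   # what the buyer's agent believes "standard" means
seller_sku = "RGT-50MM"  # what the seller's agent believes "standard" means

agreed_terms = {}    # the shared transcript: every entry reads as confirmed
misattributed = []   # terms the buyer has attached to the wrong product

questions = ["lead_time_days", "unit_price", "cold_chain"]
for turn, question in enumerate(questions, start=1):
    answer = SELLER_FACTS[seller_sku][question]  # seller answers for its own SKU
    agreed_terms[question] = answer              # buyer confirms; nothing errors
    if answer != SELLER_FACTS[buyer_sku][question]:
        misattributed.append(question)           # true for 50MM, recorded against 10MM
    print(
        f"turn {turn}: {len(agreed_terms)} terms agreed, "
        f"{len(misattributed)} resting on the wrong SKU"
    )
```

In this toy run every clarifying question adds another term that depends on the original mismatch, so the transcript looks more settled with each turn while the two sides move further apart.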

Researchers studying multi-agent LLM systems have started cataloguing this empirically. A 2025 taxonomy effort annotated more than 1,600 failure traces across seven multi-agent frameworks and identified inter-agent misalignment as one of three dominant failure categories — and concluded these were fundamental design flaws, not artifacts of specific systems. Separately, benchmarking work has shown that LLM agents negotiating across model families produce measurably worse deal outcomes than agents within the same family.

The classical analogy is the bullwhip effect in supply chains, where small upstream misreadings amplify into large downstream distortions. The shape is similar; the substance isn’t. Bullwhip distortions are about quantities — how much, how many, how soon. Drift is about qualities — what standard refers to, what the order covers, what delivered counts as. Same compounding curve, different axis.

To be precise about what I’m naming: I’m carving drift out as a specific subset of what the literature calls inter-agent misalignment. The broader category covers any breakdown of coordination between agents. Drift is the narrower, sharper case — two well-behaved agents converging on what reads as agreement and silently meaning different things by it. The literature gives us the family. The cross-org transaction case is what makes this particular member of the family lethal.

Drift compounds.

What it isn’t

A definition is only as useful as what it excludes. Silent semantic drift is not:

Hallucination. A single agent fabricating content is a one-sided failure, visible to a knowledgeable counterparty. Drift is bilateral and invisible to both sides.
Specification gaming. An agent finding a loophole to satisfy the letter of an instruction while violating its intent. Drift involves no loophole-seeking; both agents are well-behaved.
Misalignment, in the AI-safety sense. A divergence between an agent’s goals and its principal’s goals. Drift can occur between two perfectly aligned agents whose principals’ goals are entirely compatible.
Concept drift / data drift, in the classical ML sense. Statistical distribution shifts that degrade a model’s performance over time. Silent semantic drift is conversational, not statistical — it lives between two systems, not inside one.

If you take one thing from that list, it’s this: every failure mode above is intra-system — something going wrong inside one agent. Silent semantic drift is inter-system — the failure I flagged earlier as happening between agents, not inside either of them.

That’s why the existing toolkit doesn’t catch it. Your guardrails, your evals, your red-teaming — almost all of it is built on the assumption that the failure is happening inside an agent. You don’t fix it by hardening either side. You fix it by adding something to the space between them — a space your current stack treats as a transport problem when it’s really a negotiation problem your logs aren’t even recording.
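
As a rough sketch of what could sit in that space, assume each side can emit a structured read-back of what it thinks the agreed words refer to. The `ResolvedTerms` schema and the `grounding_mismatches` helper are invented for illustration; they are not a real protocol or an established method. The only point is that the comparison happens between the agents, before settlement, rather than inside either one.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ResolvedTerms:
    """What one side believes the agreed words refer to (hypothetical schema)."""
    sku: str
    concentration_mM: int
    quantity: int


def grounding_mismatches(ours: ResolvedTerms, theirs: ResolvedTerms) -> dict:
    """Return the fields on which the two read-backs differ."""
    a, b = asdict(ours), asdict(theirs)
    return {field: (a[field], b[field]) for field in a if a[field] != b[field]}


# Hypothetical read-backs from each side of the same "agreed" order.
buyer = ResolvedTerms(sku="RGT-10MM", concentration_mM=10, quantity=4)
seller = ResolvedTerms(sku="RGT-50MM", concentration_mM=50, quantity=4)

diff = grounding_mismatches(buyer, seller)
if diff:
    print("hold settlement; referents differ:", diff)
else:
    print("read-backs match; safe to settle")
```

This is the machine equivalent of "so what you're saying is…": the quantity matches, the product does not, and the mismatch is visible before anything ships.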

Why now

Most production agent transactions today are intra-firm or single-step. Drift exists, but the operator owns both sides of the wire and can paper over the failure with internal controls. The picture changes the moment agents start transacting across organizational boundaries, in chains, with no shared coordinator. That second category — cross-org, multi-turn, autonomous — is what a non-trivial slice of 2026 enterprise stacks will look like. Most teams I talk to are six months from being there and don’t yet have the vocabulary for what’ll go wrong.

And what goes wrong won’t be a typo to fix in a database. By the time drift surfaces, you’re untangling a physical shipment that’s already in transit, a payment that’s already cleared, and a counterparty relationship that’s already taken the hit. And good luck assigning liability — when both audit logs say success and the outcome is failure, your insurance carrier and your general counsel have nothing to anchor against.

If you can’t describe the failure mode to your board, you can’t insure against it. Right now, drift is the uninsurable line item in your 2026 budget. Drift isn’t a DevOps problem. It’s a boardroom problem dressed as one.

This is also why drift matters for anyone betting on autonomous AI as an investment thesis. The pitch is margin expansion through automation — agents transacting across organizational boundaries without the human-in-the-loop tax. Drift is the ceiling on that pitch. If companies can’t trust what their agents converge on with counterparties, they will keep humans in the loop forever, and the margin story dies. The trust layer isn’t a nice-to-have on top of the agent stack. It’s the precondition for the returns the stack was supposed to generate.

When teams reach cross-org, multi-turn, autonomous transactions, drift will be the failure mode of the interaction phase. And the interaction phase is the one their existing infrastructure doesn’t touch.

Your turn

Quick gut check before you go.

You have to pick one to be the failure mode in your next high-value cross-org agent transaction. Which do you take?

A. Your agent hallucinates and offers terms you’d never approve. The counterparty rejects, your monitoring screams, you debug it, you ship a fix in a sprint.

B. Your agent and theirs reach textbook-perfect agreement. Three weeks later, the goods arrive wrong, the contract reads fine, and nobody can point to where it broke.

Reply with A or B. Most engineers I ask reach for A on instinct, then change their answer halfway through typing it. I want to know how the split lands across this list.

In two weeks: a day in the life of a bad transaction — the reagent-ordering scenario, walked stage by stage. With your replies from post 2 woven in.

Ming's Substack
