Rogue Agents and the Receipt Problem

Why every autonomous agent action needs a tamper-evident trail.

The setup — agents are leaving the sandbox

A year ago, “agent” mostly meant a chatbot with a system prompt. Now agents call APIs, query databases, sign documents, file appeals, draft legal language, route customer data, and increasingly take financial action. The blast radius keeps growing. The accountability layer hasn’t kept up.

When an agent does the wrong thing, three questions get asked at once: what did it do, when did it do it, and who can prove that’s actually what happened? The current answer in most stacks is “check the logs.” That answer no longer holds.

What “rogue” actually means

Three failure modes are worth naming separately, because they’re often collapsed into one.

Drift. The agent did the right thing yesterday. Today, the prompt got tweaked, the model rolled forward, or a new tool joined its toolbelt. The output looks the same until it doesn’t, and nothing announces the change. By the time anyone notices, the questions are what changed, when it changed, and who saw the change.

Hallucination at the action layer. Most discussion of LLM hallucination is about wrong text. The harder problem is wrong action — the agent calls the right API but with the wrong recipient, the wrong dollar amount, the wrong document attached. Text-layer hallucination is embarrassing. Action-layer hallucination is expensive and sometimes illegal.

Repudiation. The agent took an action. Six months later — under audit, in litigation, during a regulator’s inquiry — no one can prove what it did, when, on whose authority, against what input. The operator’s logs are server-side and mutable. The vendor’s logs are gone after retention windows expire. The counterparty has its own version of events. Whose record do you trust?

Each failure mode has the same root cause: the actions left no receipt anyone outside the operator can verify.

The receipt problem

Logs are not receipts.

A log is operator-controlled, server-side, and mutable. It’s a record the operator chose to keep, in the format the operator chose, on infrastructure the operator controls. A regulator, an auditor, a counterparty, or a court has no way to know it wasn’t edited last Tuesday. In most cases, they have no way to know it existed in its current form before the moment they asked for it.

This isn’t a paranoid framing. It’s the standard skepticism applied to any record produced and held by the party that benefits from it. Banks figured this out two centuries ago and built independent ledgers. Court systems figured it out and built clerks of court. Healthcare figured it out and built audit-trail standards.

Agentic systems haven’t figured it out yet. Most operators are still in the “trust our logs” phase. That phase ends the first time a regulated agent gets challenged and the operator can’t produce a record that survives third-party scrutiny.

Knox — what was built and why

Knox is a cryptographic anchoring substrate for agent actions, built by Bonis Systems.

When an agent does something (calls a tool, processes a document, makes a decision, updates a record), Knox hashes the event and links it into a hash chain. On a schedule, the chain’s tip is anchored to the Bitcoin blockchain via OpenTimestamps. The anchor is a public, third-party-verifiable proof that the chain existed in exactly that state no later than a specific Bitcoin block, and anyone with access to the Bitcoin blockchain can check it.
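The hashing and linking step can be sketched in a few lines. This is a minimal illustration, not Knox's actual format: the canonical JSON encoding, the all-zero genesis value, SHA-256 as the link function, and the names `event_digest`, `extend_chain`, and `chain_tip` are all assumptions made for the example.

```python
import hashlib
import json

GENESIS = b"\x00" * 32  # assumed starting value for an empty chain


def event_digest(event: dict) -> bytes:
    """Hash an event using a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).digest()


def extend_chain(prev_link: bytes, event: dict) -> bytes:
    """Return the next link: SHA-256 over the previous link followed by the event digest."""
    return hashlib.sha256(prev_link + event_digest(event)).digest()


def chain_tip(events: list[dict]) -> bytes:
    """Fold all events, in order, into a single 32-byte tip."""
    tip = GENESIS
    for event in events:
        tip = extend_chain(tip, event)
    return tip


# Hypothetical events; the field names are illustrative only.
events = [
    {"seq": 1, "action": "tool_call", "tool": "lookup_claim"},
    {"seq": 2, "action": "document_processed", "doc": "appeal.pdf"},
]
tip = chain_tip(events)
print(tip.hex())  # the digest a scheduled job would submit for timestamping
```

Because every link commits to the one before it, the 32-byte tip commits to the whole ordered history. That is why anchoring only the tip is enough.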

A few things follow from that.

The operator can’t rewrite history without breaking math the world can check. Changing a past event changes every hash after it, so the chain no longer verifies. Backfilling an event after the fact yields a chain tip that no longer matches the one already anchored in Bitcoin. Verification doesn’t depend on trusting Bonis.
Receipts are portable. Anyone — auditor, regulator, counterparty, court — can take a Knox anchor and verify it independently using the OpenTimestamps protocol and a Bitcoin node. No Bonis API call required to confirm.
The same primitive runs across surfaces. Marketplace activity, healthcare appeal evidence, multi-vendor legal cross-validation receipts, deal-event provenance — one ledger, multiple verticals, same verification math.
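To make the tamper-evidence claim concrete, here is a hedged sketch of the check a third party could run. It assumes a simple SHA-256 hash chain over canonical JSON; Knox's real event format is not described here. The functions `chain_tip` and `matches_anchor` and the sample events are hypothetical. The `anchored_tip` value stands in for the digest a verifier would recover from the timestamp proof.

```python
import hashlib
import hmac
import json


def chain_tip(events: list[dict]) -> bytes:
    """Recompute the chain tip from scratch using SHA-256 links (assumed scheme)."""
    tip = b"\x00" * 32
    for event in events:
        encoded = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
        tip = hashlib.sha256(tip + hashlib.sha256(encoded).digest()).digest()
    return tip


def matches_anchor(events: list[dict], anchored_tip: bytes) -> bool:
    """True only if the presented events reproduce the tip that was anchored."""
    return hmac.compare_digest(chain_tip(events), anchored_tip)


original = [
    {"seq": 1, "action": "payment", "amount_usd": 100},
    {"seq": 2, "action": "notify", "recipient": "ops"},
]
anchored_tip = chain_tip(original)  # stands in for the digest recovered from the anchor

# Two ways to rewrite history: edit a past event, or slip a new one in.
edited = [dict(original[0], amount_usd=1000), original[1]]
backfilled = [original[0], {"seq": 1.5, "action": "approval"}, original[1]]

print(matches_anchor(original, anchored_tip))    # True
print(matches_anchor(edited, anchored_tip))      # False
print(matches_anchor(backfilled, anchored_tip))  # False
```

The verifier needs only the events and the anchored digest. Nothing in the check calls back to the operator.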

Bonis Systems also shipped a CLI, bonis-attest inspect-anchors, so anyone can parse and verify Knox-anchored events from the command line without a Bonis server in the loop. The OpenTimestamps tree parsing is pure Swift, and the RIPEMD-160 implementation passes the reference known-answer vectors published by the algorithm’s designers, including the one-million-“a” test. The verification path doesn’t depend on Bonis being online, or even existing.
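For readers unfamiliar with what "parsing the tree" involves: an OpenTimestamps proof is, roughly, a list of operations that turn your digest into a value committed in a Bitcoin block. The sketch below is a simplified model of that replay, not the OpenTimestamps file format and not the bonis-attest implementation. It supports only three operations (real proofs allow more, RIPEMD-160 among them). The `replay` function, the `Step` type, and the sample inputs are hypothetical.

```python
import hashlib

# Each step is (operation, argument). The argument is ignored for hash operations.
Step = tuple[str, bytes]


def replay(digest: bytes, steps: list[Step]) -> bytes:
    """Apply the proof steps to a digest and return the resulting commitment."""
    value = digest
    for op, arg in steps:
        if op == "append":
            value = value + arg
        elif op == "prepend":
            value = arg + value
        elif op == "sha256":
            value = hashlib.sha256(value).digest()
        else:
            raise ValueError(f"unsupported operation: {op}")
    return value


# A toy proof: the chain tip is combined with one sibling digest, then hashed.
chain_tip = hashlib.sha256(b"example chain tip").digest()
sibling = hashlib.sha256(b"someone else's timestamp request").digest()
steps = [("append", sibling), ("sha256", b"")]

# Stands in for the value a verifier would read from the Bitcoin block named in the proof.
expected_commitment = hashlib.sha256(chain_tip + sibling).digest()
print(replay(chain_tip, steps) == expected_commitment)  # True
```

Every step is deterministic and public, so the replay can run on any machine. The only external input is the block data, which comes from Bitcoin rather than from the issuer.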

That’s the point. A receipt that requires the issuer to vouch for it isn’t a receipt. It’s a promise.

The bar going forward

If your agent can take an action that matters, it should leave a record that survives the agent, the operator, and the platform.

That’s the bar. Knox is one answer. The bar is the more important thing.

It applies whether you’re regulated or not. It applies whether you’re using a frontier model or a local one. It applies whether your agent is calling a payments API or just deciding which document to surface to a human. The question is the same: when something goes sideways — and at scale, something always does — can you prove what happened, in a form a skeptical third party will accept?

If the answer is “we’ll check the logs,” the answer is no.

Bonis built Knox because that answer needed to change. The receipt problem is the problem. Tamper-evident, third-party-verifiable, vendor-agnostic anchoring is the shape of the answer.

The agents aren’t going to slow down. The accountability layer needs to catch up.