
Each Action Allowed. The Sequence Wasn't.

Banks figured this out decades ago. Credit card authorisation is per-transaction and deterministic: when the card swipes, the network has a few hundred milliseconds to render an approve, decline, or refer decision against the policy and account state in force at that moment. Fraud investigation is a different job. It runs separately, post-hoc, across the transaction stream. It looks at sequences of authorised transactions and produces findings: cards that should now be flagged, accounts that warrant additional verification, patterns that should change policy. The investigation does not rewrite past authorisations. It creates new evidence that future authorisations can require.

The temptation in autonomous AI governance is to reproduce the per-transaction half of that architecture without building the investigation half alongside it. An agent reads a restricted database. It summarises the records through a model and writes the summary to memory. It publishes the result to a broadly accessible dashboard. Each action is individually authorised; each receipt shows ALLOW. Together, they produce an effect nobody would have authorised if asked about it directly. The individual decisions are correct; the sequence they form is not. The breach is in the aggregate.

This is a class of governance failure for autonomous AI systems that does not look like a failure in any single evaluation. The response has to sit on a separate plane from single-action enforcement. Banks did this by separating authorisation from investigation. Autonomous AI governance has to do the same thing, and the trap on offer is the one banks were tempted by: folding fraud detection into the authorisation decision. That fold breaks the one property of the enforcement layer the system cannot afford to give up.

The fix is structural: the sequence-level question gets its own plane, one that reads the evidence chain after evaluation, finds the patterns the per-action layer cannot see, and feeds its findings back into governance through a declared input rather than a silent override.

Two Questions, Two Planes

The first question is whether this specific action, attempted by this agent under its current delegation, is permitted by the policy in force. The evaluation runs at the action boundary against policy and delegation at the moment of execution, and produces a determinate result: ALLOW, DENY, or ESCALATE. The evidence it leaves is a receipt naming the action, the delegation, the policy, and the decision. That receipt is enough to reconstruct the decision later.
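A minimal sketch of what such an evaluator might look like, in Python. The policy and delegation shapes (a `rules` mapping, a `scopes` set) and every name here are assumptions made for illustration, not a description of any particular product. The point it shows is that the function reads only its arguments and returns a receipt naming the action, the delegation, the policy, and the decision.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(str, Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    ESCALATE = "ESCALATE"


@dataclass(frozen=True)
class Receipt:
    action: str
    resource: str
    delegation_id: str
    policy_version: str
    decision: Decision


def evaluate(action: str, resource: str, delegation: dict, policy: dict) -> Receipt:
    """Pure: no clock, no history, no globals. Only the arguments are read."""
    key = f"{action}:{resource}"
    if key not in delegation["scopes"]:
        # The agent was never delegated this action on this resource.
        decision = Decision.DENY
    else:
        # Hypothetical policy shape: anything not listed is denied.
        decision = Decision(policy["rules"].get(key, "DENY"))
    return Receipt(action, resource, delegation["id"], policy["version"], decision)


policy = {
    "version": "p1",
    "rules": {"read:customers_db": "ALLOW", "publish:dashboard": "ESCALATE"},
}
delegation = {"id": "d1", "scopes": {"read:customers_db", "publish:dashboard"}}

receipt = evaluate("read", "customers_db", delegation, policy)
```

Because the receipt carries the delegation id and the policy version, anyone holding those two documents can re-run `evaluate` and compare the result with what was recorded.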

The second question is whether the sequence the agent has been building is one the operator would still authorise if shown the whole sequence directly. It cannot be answered from any single action because the answer does not depend on any single action. It depends on the chain. The natural place to put it is inside the per-action evaluation loop, where the system already has the action and the policy in scope. That is the place that breaks the architecture.

Two categories of event come up under the second question. A violation is the limiting case: a single action breached policy and the evaluator produced a DENY. Per-action enforcement handles it; nothing about the sequence matters. An anomaly covers the rest: every action is individually permitted and every receipt shows ALLOW, but the pattern across them is wrong, whether through too many actions of one type, an unusual flow of data between resources, or a gradual departure from a baseline that no single action made suspicious.

Two categories, two planes. Per-action enforcement handles violations because the breach is in a single action and the evaluation is deterministic. A sequence-analysis plane handles anomalies because the judgement is across a sequence and may be wrong.

Why Folding the Planes Together Breaks Audit

There is a strong temptation, once teams have recognised that sequence patterns matter, to feed pattern detection back into the per-action evaluator: let the evaluator see the history along with the current action, and let a detected pattern flip an ALLOW into a DENY. It works in the limited sense that the agent gets stopped when the system thinks the pattern is bad. I take a hard line against it because it also breaks the property that makes per-action enforcement worth having.

A pure evaluation function takes the action, the policy, and the delegation as its inputs, computes a decision, and produces a receipt. Run it again with the same inputs; the answer is the same. That property is what lets the auditor, months later and with no access to the original system, reconstruct the decision and verify it was correct. If the function also depends on the history of decisions the system has rendered in the past, and that history is mutable, the same inputs no longer guarantee the same answer. The auditor’s reconstruction may disagree with the production decision, and there is no way to tell which one was right.
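The difference can be made concrete with a small Python sketch. Both evaluators below are hypothetical, and the "three prior reads" threshold is an invented stand-in for whatever pattern rule a team might be tempted to fold in. The history-dependent version gives one answer in production and another on replay, because the history it consulted is not among the recorded inputs.

```python
def evaluate_pure(action, policy, delegation):
    """Decision depends only on the recorded inputs."""
    permitted = action in policy["allowed"] and action in delegation["scopes"]
    return "ALLOW" if permitted else "DENY"


def evaluate_with_history(action, policy, delegation, history):
    """Anti-pattern: the decision also depends on mutable runtime state."""
    if history.count(action) >= 3:  # invented threshold, for illustration only
        return "DENY"
    return evaluate_pure(action, policy, delegation)


policy = {"allowed": ["read"]}
delegation = {"scopes": ["read"]}

# Live state at decision time. It is not part of the receipt.
live_history = ["read", "read", "read"]

# What production decided, and what an auditor reconstructs months later
# from the action, policy, and delegation alone.
production = evaluate_with_history("read", policy, delegation, live_history)
reconstructed = evaluate_with_history("read", policy, delegation, [])

# The pure evaluator gives the same answer on both occasions.
pure_first = evaluate_pure("read", policy, delegation)
pure_replay = evaluate_pure("read", policy, delegation)
```

Here `production` and `reconstructed` differ, and nothing in the receipt says which one reflects the state that actually existed. The pure pair cannot diverge.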

Replay is not a nice-to-have. It is the architectural reason governance survives audit and regulatory inspection, and the reason an auditor can sign off on whether the decision was correct rather than trust that it was. Banks would not accept a transaction authorisation system whose decisions cannot be reconstructed after the fact. Autonomous AI governance has to meet the same bar.

Pattern detection that silently overrides ALLOWs gives that property up. Once it is gone, the system is back to the position governance was trying to escape: actions whose authorisation depends on the trustworthiness of the system that took them.

The Second Plane Reads the Evidence Chain

Behavioural pattern analysis belongs on a second plane, structurally outside the evaluation function. The first plane, per-action enforcement, keeps doing what it has always done: evaluate each action at the action boundary, render an ALLOW, DENY, or ESCALATE, and add the receipt to the evidence chain.

The second plane reads that chain without sitting inside the evaluation loop or depending on hidden runtime state. It cannot override an ALLOW, and it cannot rewrite a past decision. Its job is narrower: examine sequences after the fact and emit a separate artefact, a bounded signal that summarises the evidence consulted, describes the pattern found, and states the claim future evaluations may be allowed to require.

The Signal Becomes a Policy Fact

The signal is a governance artefact, not a runtime instruction. It cannot rewrite the past; the ALLOWs that have already been issued stay issued. What it does instead is create a new governance event, with its own evidence record, that downstream processes can act on: an investigation, an alert to a human, a policy review. The architectural move is that the signal becomes an input future evaluations can require, once it is named in policy. The sequence finding becomes a policy fact when policy declares it one. The next evaluation can demand it, and the receipt can point back to the evidence record that produced the fact.

Take the sequence from the opening. Each step is authorised, each receipt shows ALLOW, and the per-action evaluator did everything correctly. A sequence-analysis process, reading the chain after the fact, sees what the per-action evaluator could not: a data flow from a restricted source to an unrestricted destination, accomplished through a sequence of individually correct decisions. It emits a signal naming the receipts and describing the flow.
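A rough sketch of such a process in Python follows. Everything in it is an assumption made for illustration: the receipt fields, the resource labels in `RESTRICTED` and `UNRESTRICTED`, and the detection rule itself, which crudely treats anything an agent does after a restricted read as possibly carrying that data. A real analyser would be more careful and could still be wrong, which is why it emits a signal rather than a decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    receipt_id: str
    agent: str
    action: str
    resource: str
    decision: str


@dataclass(frozen=True)
class Signal:
    pattern: str     # description of what was found
    evidence: tuple  # ids of the receipts consulted
    claim: str       # what future evaluations may be allowed to require


RESTRICTED = {"customers_db"}        # hypothetical classification labels
UNRESTRICTED = {"public_dashboard"}


def find_boundary_flows(chain):
    """Read the chain in order after the fact. Never mutates it."""
    signals = []
    tainted = {}  # agent -> receipt ids since its first restricted read
    for r in chain:
        if r.decision != "ALLOW":
            continue
        if r.action == "read" and r.resource in RESTRICTED:
            tainted.setdefault(r.agent, []).append(r.receipt_id)
        elif r.agent in tainted:
            tainted[r.agent].append(r.receipt_id)
            if r.action == "publish" and r.resource in UNRESTRICTED:
                signals.append(Signal(
                    pattern="restricted-to-unrestricted data flow",
                    evidence=tuple(tainted.pop(r.agent)),
                    claim="data-boundary-reviewed",
                ))
    return signals


chain = (
    Receipt("r1", "agent-7", "read", "customers_db", "ALLOW"),
    Receipt("r2", "agent-7", "write", "agent_memory", "ALLOW"),
    Receipt("r3", "agent-7", "publish", "public_dashboard", "ALLOW"),
    Receipt("r4", "agent-9", "publish", "public_dashboard", "ALLOW"),
)
signals = find_boundary_flows(chain)
```

The chain is a tuple of frozen records, so the analyser has no way to alter a past decision; its only output is the list of signals.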

The next time the agent attempts the first step of that sequence, policy can require the assertion that this sequence is consistent with the data-boundary constraint. If the assertion is missing or stale, per-action enforcement denies the action, not because the action is unauthorised in isolation, but because policy now requires an assertion that is not there. The detection still informs enforcement, but as a declared input rather than a hidden side-channel. Three properties hold: replay still works, audit still works, and the per-action evaluator stays pure.
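One way this could look in code, sketched in Python under stated assumptions: the `requires` and `max_age_s` policy keys, the `Assertion` record, and the claim name are all hypothetical, and delegation is left out to keep the example short. The assertions and the clock reading are passed in as explicit arguments, so they can be recorded alongside the receipt and the decision can be replayed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    claim: str
    evidence_record: str  # id of the evidence record that produced the fact
    issued_at: int        # epoch seconds


def evaluate(action, policy, assertions, now):
    """Pure: assertions and the clock reading are declared inputs."""
    if action not in policy["allowed"]:
        return {"decision": "DENY", "reason": "not permitted", "evidence": None}
    required = policy["requires"].get(action)
    if required is None:
        return {"decision": "ALLOW", "reason": "permitted", "evidence": None}
    fresh = [
        a for a in assertions
        if a.claim == required and now - a.issued_at <= policy["max_age_s"]
    ]
    if not fresh:
        return {"decision": "DENY", "reason": "assertion missing or stale",
                "evidence": None}
    newest = max(fresh, key=lambda a: a.issued_at)
    # The receipt points back at the record that produced the policy fact.
    return {"decision": "ALLOW", "reason": "assertion present",
            "evidence": newest.evidence_record}


policy = {
    "allowed": {"read:customers_db"},
    "requires": {"read:customers_db": "data-boundary-reviewed"},
    "max_age_s": 3600,
}
now = 1_700_000_000
fresh = Assertion("data-boundary-reviewed", "ev-42", now - 60)
stale = Assertion("data-boundary-reviewed", "ev-17", now - 7200)

missing_result = evaluate("read:customers_db", policy, (), now)
stale_result = evaluate("read:customers_db", policy, (stale,), now)
fresh_result = evaluate("read:customers_db", policy, (stale, fresh), now)
```

The same call with the same four arguments always returns the same result, so an auditor holding the recorded inputs can reproduce either the DENY or the ALLOW.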

Two Different Jobs

Per-action enforcement and sequence analysis are two different jobs. Enforcement decides what may execute next and produces the evidence chain. Sequence analysis reads the chain, finds the patterns enforcement cannot see, and emits the assertions enforcement can later require.

The architecture is the same separation banks have run for decades. Authorisation is per-transaction and deterministic. Investigation is post-hoc and probabilistic. The bridge is a signal future authorisations can require, never a silent override of a past one.

Systems that fold the two planes together look rigorous and feel coherent. They are answering a sequence-level question with single-action machinery, and the failures they miss are the ones that surface first in the incident report.