Why Everyone Is Trying to Govern Behaviour Instead of Actions

Over the last few days, I have had the same conversation repeated in different forms. Multiple discussions. Different starting points. Different proposals for how governance of autonomous AI systems should work. And in every case, the proposal drifted — naturally, confidently, almost gravitationally — away from the one place governance can actually be enforced.

Not toward something wrong. Toward something real and valuable. But toward something that is not enforcement. Every conversation ended in the same structural position: a sophisticated framework for evaluating something other than the action at the point of execution.

The pattern is not a coincidence. It is the industry’s default mode. And understanding why it happens is the first step toward building systems that do not repeat it.

Multiple conversations, one drift

Each begins with a different concern. Each proposes a different solution. All share the same structural characteristic: they move governance away from the action boundary.

The most common position is process alignment. The argument is clear and well-reasoned: governance should ensure that the agent follows the right process. Not just that individual actions are permitted, but that the sequence of actions over time conforms to an expected workflow. The agent should read the policy before acting. It should consult the right sources. It should follow the prescribed steps in the prescribed order.

This is behaviour governance. It evaluates the agent’s trajectory — the shape of its activity over time. Did it follow the right process? Did it take the expected path?

The drift is subtle. Process alignment evaluates how the agent behaves. It does not evaluate whether a specific action, at a specific moment, is authorised to execute. An agent can follow the correct process perfectly and still produce an unauthorised action at step seven. An agent can deviate from the expected process entirely and still produce only authorised actions. The process and the action are correlated but not coupled. Governing the process does not govern the action.

Another recurring position is state validity. The argument is equally coherent: governance should ensure the integrity and provenance of the data the system operates on. Before an action executes, the system should verify that its inputs are valid, that the data has not been tampered with, that the state of the world the action depends on is trustworthy.

This is data governance. It evaluates the environment the action operates in. Is the state valid? Is the provenance clean? Are the inputs trustworthy?

The drift is different but structurally identical. State validity evaluates the world around the action, not the action itself. A system operating on perfectly valid, provenance-verified data can still execute an unauthorised action against that data. The data’s integrity does not determine the action’s authority. Validating the input does not govern the operation.

A third position is output scoring. The argument is pragmatic: governance should evaluate the quality and safety of what the system produces. Score the output. Classify it. Run it through a safety evaluator. If the output is acceptable, the system is governed.

This is content evaluation. It evaluates results. Did the system produce something harmful? Something inaccurate? Something that violates a policy constraint?

The drift here is the most visible. Output scoring evaluates what the system has already done. The action has executed. The consequence has been produced. The evaluation happens after the boundary, not at it. A system that scores its outputs is performing quality assurance. It is not performing governance. The distinction is temporal: governance prevents. Scoring reviews.

The subtlest position is epistemic soundness. The argument is the most intellectually sophisticated: governance should ensure that the reasoning behind a decision is admissible. The evidence should be sound. The inference should be valid. The chain of reasoning from inputs to decision should withstand scrutiny.

This is epistemics. It evaluates the justification for the action. Is the reasoning defensible? Is the evidence sufficient? Would the decision survive challenge?

The drift is the hardest to see because it feels the most like governance. Surely, if the reasoning is sound, the action is justified? But epistemic soundness evaluates whether an action should be taken. It does not determine whether it may be taken. Justification and authorisation are not the same thing. A perfectly reasoned, epistemically sound action that falls outside the agent’s delegation scope is still unauthorised. The quality of the reasoning does not expand the scope of the delegation.

Different positions. Different disciplines. Each evaluates something real. None stands at the point where the action is about to produce a consequence and determines whether it may proceed. Everything upstream can improve the decision. Only the boundary controls the outcome.

Why the drift happens

The action boundary is uncomfortable. It is narrow. It is binary. It asks a single question — is this specific action, by this specific actor, under this specific delegation, at this specific time, permitted by this specific policy? — and it answers with one of three words: ALLOW, DENY, or ESCALATE. There is no nuance. There is no context. There is no consideration of whether the action is wise, well-reasoned, or part of a good process. There is only: is it authorised?
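That single question can be made concrete. The sketch below is illustrative only — `Action`, `Delegation`, and the specific fields are hypothetical names invented for this example, not any real system's API — but it shows the essential property: the evaluation is a pure function of the action, the delegation, and nothing else.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    ESCALATE = "ESCALATE"

@dataclass(frozen=True)
class Action:
    actor: str
    operation: str
    resource: str
    timestamp: int  # seconds since epoch

@dataclass(frozen=True)
class Delegation:
    actor: str
    operations: frozenset  # operations this actor may perform
    resources: frozenset   # resources the actor may touch
    not_after: int         # delegation expiry, seconds since epoch

def evaluate(action: Action, delegation: Delegation) -> Decision:
    """Pure function: the same action and delegation always yield
    the same decision. No interpretation, no nuance."""
    if action.actor != delegation.actor:
        return Decision.DENY  # wrong actor: refuse outright
    if action.timestamp > delegation.not_after:
        return Decision.DENY  # delegation has expired
    if action.operation not in delegation.operations:
        return Decision.ESCALATE  # outside scope: route to a human
    if action.resource not in delegation.resources:
        return Decision.ESCALATE
    return Decision.ALLOW
```

The choice of which failures deny and which escalate is itself policy; the point is only that the gate answers with one of three words and nothing more.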

Every other framing feels more substantial. Process alignment considers the whole workflow, not just one action. State validity considers the entire data environment. Output scoring considers the full consequence. Epistemic soundness considers the complete reasoning chain. Each of these is broader, deeper, and more intellectually engaging than a binary gate on a single action.

The drift happens because governance feels like it should be complex. Autonomous systems are complex. The problems they create are complex. The regulatory environment is complex. It seems natural that governance should match that complexity — that it should be a rich, multi-dimensional evaluation that considers process, state, outputs, and reasoning together.

But enforcement is not complex. Enforcement is a gate. A gate is open or closed. The sophistication of a governance system is not in the gate. It is in everything that informs the decision to open or close it. The policy is sophisticated. The delegation model is sophisticated. The evidence chain is sophisticated. The gate itself is simple. It must be simple. A gate that requires interpretation is a gate that can be bypassed.

The industry drifts toward complexity because complexity is where the interesting problems are. But the interesting problems are not the enforcement problem. They are the problems that surround enforcement — the problems that make enforcement decisions better, more informed, more precise. Those problems are upstream. And upstream is not the boundary.

What each position gets right

This is not a critique of these positions. Each identifies a real problem and proposes a real solution.

Process alignment matters. An agent that follows the right workflow is more likely to produce good outcomes than one that does not. Understanding and governing the shape of agent behaviour over time is valuable work. It reduces the probability that an agent will reach a state where it attempts unauthorised actions. It is a form of risk reduction.

State validity matters. A system operating on corrupted, tampered, or unverified data is more dangerous than one operating on clean data. Data provenance and integrity are foundational to trustworthy systems. Validating the state of the world before acting on it is sound engineering.

Output scoring matters. Evaluating what a system produces — catching harmful, inaccurate, or policy-violating outputs — is necessary. Post-execution review is how organisations detect problems that pre-execution controls did not catch. It is the feedback loop that improves the system over time.

Epistemic soundness matters. Decisions backed by sound reasoning and sufficient evidence are better than decisions that are not. Ensuring the admissibility of the reasoning chain is how organisations build confidence that their systems are making defensible choices.

All are real. All are valuable. All make governance better.

None of them make governance exist.

The strongest objection

The sharpest challenge to this framing comes from state validity — and it deserves a direct answer.

If the data is wrong — corrupted, stale, tampered with — then even a correctly authorised action can produce a bad outcome. An agent operating within its delegation, evaluated against the right policy, producing a clean receipt — and still causing harm because the state it acted on was degraded. State matters. This is true.

But governance does not validate truth. Governance is not in the business of determining whether the world is correct. That is an epistemological problem, and it is unbounded. No system can guarantee that its inputs are true. Any system that requires truth as a precondition for action will either block indefinitely or quietly assume what it cannot verify.

What governance can do — and must do — is enforce that actions can only execute when explicit, deterministic constraints on state are satisfied. A required signature is present. A freshness bound has not expired. A mandatory approval has been recorded. A data source has been attested within a defined window. These are not claims about the world being accurate. They are conditions that must hold before the gate opens.

This is qualified state under explicit constraints. The policy does not ask whether the data is true. It asks whether the conditions for acting on it have been met. Was the attestation provided? Is it within bounds? Has the required verification been recorded?

These constraints are part of the policy. They are deterministic. They are replayable. An auditor can reconstruct why the gate opened — not because the data was ultimately correct, but because the conditions for action were satisfied at the time of evaluation. The decision does not depend on whether the world was right. It depends on whether the preconditions for acting in it were met.
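A sketch of what such a precondition check might look like, assuming hypothetical names (`StateEvidence`, the field names, and the freshness bound are all invented for illustration): the function never asks whether the data is true, only whether the recorded conditions hold at evaluation time.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class StateEvidence:
    signature_present: bool    # was the required signature supplied?
    approved_by: Optional[str] # recorded approver, if any
    attested_at: float         # when the data source was last attested

# An explicit constraint the policy encodes; the value is illustrative.
MAX_ATTESTATION_AGE = 300.0  # seconds: the freshness bound

def state_constraints_met(ev: StateEvidence, now: float) -> bool:
    """Checks conditions, not truth: was the attestation provided,
    is it within bounds, has the required approval been recorded?
    Deterministic and replayable given the same evidence and clock."""
    if not ev.signature_present:
        return False
    if ev.approved_by is None:
        return False  # mandatory approval not recorded
    if now - ev.attested_at > MAX_ATTESTATION_AGE:
        return False  # attestation has gone stale
    return True
```

Because `now` is passed in rather than read from a wall clock, an auditor can replay the evaluation exactly as it happened.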

A system can be wrong and still be contained. A system without a boundary cannot be contained at all.

This resolves the objection without compromising the boundary’s properties. State validity remains important — it informs the constraints that policy encodes. But the enforcement decision remains deterministic, evidence-producing, and independent of truth claims. Governance does not need the world to be correct. It needs the conditions for action to be explicit, verifiable, and met.

Upstream and boundary

These positions share a structural characteristic that is easy to miss because each one, taken on its own terms, is coherent and complete. The characteristic is this: all of them operate upstream of the enforcement boundary, or downstream of it, but not at it.

Each improves the quality of what reaches the boundary. Each reduces the probability that an unauthorised action will be attempted. None controls whether it executes.

Everything upstream can improve the decision. Only the boundary controls the outcome.

A system with excellent process alignment, validated state, scored outputs, and sound epistemics — but no enforcement at the action boundary — is a system that cannot prevent a single unauthorised action from executing. Every upstream improvement makes it less likely that an unauthorised action will be attempted. None makes it impossible for one to succeed.

This is the distinction between risk reduction and enforcement. Risk reduction lowers the probability of a bad outcome. Enforcement prevents the bad outcome from occurring. Both are necessary. They are not interchangeable.

An organisation that has invested heavily in upstream disciplines and not at all in action-boundary enforcement has built a sophisticated system for making governance decisions — and no mechanism for enforcing them. The decisions are well-informed. The boundary does not exist. The well-informed decisions float above the execution layer, advising but not constraining. Every action executes regardless of what the upstream evaluation concluded.

This is not hypothetical. It is the current state of most autonomous AI deployments. Organisations have invested heavily in content safety, behavioural monitoring, data validation, and reasoning evaluation. The investment is genuine. The capability is real. And the action boundary — the point where a specific action is about to produce a specific consequence — is undefended.

The gravitational pull

Why does every conversation end in the same place? Why do different people, with different backgrounds, proposing different solutions, all drift away from the action boundary?

Because the action boundary is the least intellectually interesting part of the system. It is the narrowest, most constrained, most mechanical component. It does not reason. It does not analyse patterns. It does not evaluate quality. It takes a single action, evaluates it against a policy and a delegation, and produces a decision. The decision is deterministic. The evaluation is a pure function. There is no room for interpretation, no space for nuance, no opportunity for sophisticated analysis.

Every other component is more interesting. Upstream problems involve behaviour patterns, cryptographic verification, provenance chains, classifiers, safety evaluators, formal reasoning, evidence standards. These are where the intellectual challenge is.

The action boundary involves a lookup. Is this action, by this actor, under this delegation, permitted? Yes or no.

Engineers, architects, and researchers are drawn to the interesting problems, and the interesting problems are upstream: that is where the novel research is, where the conference papers are. The boundary is boring. The primitives are well understood — access control, capability checking, policy evaluation.

What is not well understood is that the boundary is the only component that matters at the moment of consequence. Everything upstream improves the probability that the right action reaches the boundary. Only the boundary determines whether the action executes.

The gravitational pull is toward sophistication. Sophistication lives upstream. Enforcement lives at the boundary. The industry follows the sophistication.

The test

The distinction is testable.

Take any governance system. Ask one question: can it prevent a specific action from executing, and prove that it did?

If yes — if the system can evaluate a proposed action against policy and delegation before execution, produce a deterministic decision, and create a tamper-evident record of that decision — it has enforcement. It governs at the action boundary.

If no — if the system evaluates processes, validates state, scores outputs, or analyses reasoning, but cannot prevent an unauthorised action from executing at the moment it is proposed — it does not have enforcement. It has analysis.

Analysis is valuable. It makes enforcement better. It is not enforcement.

Process analysis that cannot block an action is workflow monitoring. State validation that cannot block an action is data quality. Output scoring that cannot block an action is quality assurance. Epistemic evaluation that cannot block an action is decision support.

Each of these is a legitimate discipline. Each produces real value. None governs the action.

A system that cannot prevent an action is not governing it. It is advising on it, analysing it, scoring it, or monitoring it. These are useful activities. They are not governance. Governance requires enforcement. Enforcement requires a boundary. The boundary requires a gate. The gate must be closed by default. The gate must open only on explicit authorisation. And the gate must produce evidence that the decision was made.
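The evidence requirement can also be made concrete. One common construction — sketched here with invented names, not any particular system's format — is a hash-chained decision log: each record commits to the hash of the record before it, so any alteration breaks the chain and an auditor can verify it by replay.

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel hash for the first record

def record_decision(log: list, action: dict, decision: str) -> dict:
    """Append a tamper-evident record of a gate decision. Each entry
    commits to the previous entry's hash, so editing any earlier
    record invalidates every hash that follows it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"action": action, "decision": decision, "prev": prev_hash}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    entry = {**body, "hash": digest}
    log.append(entry)
    return entry

def chain_intact(log: list) -> bool:
    """An auditor's replay: recompute every hash and every link."""
    prev = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("action", "decision", "prev")}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if entry["prev"] != prev or recomputed != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

The log does not make the decisions better. It makes them provable — which is the half of enforcement that analysis alone cannot supply.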

The industry’s next move

The industry will continue to build upstream. It should. Upstream investment makes governance systems better, more informed, more precise.

All of this work is necessary. None of it is sufficient.

The question is not whether upstream investment is valuable. It is whether upstream investment is mistaken for enforcement. When an organisation reports that it has governance over its autonomous AI systems because it monitors processes, validates data, scores outputs, and evaluates reasoning — but has no mechanism to prevent an unauthorised action from executing — it has made exactly this mistake.

The drift from actions to behaviour is the industry’s default. It is the path of least resistance, the path of greatest intellectual interest, and the path that produces the most impressive-looking governance architectures. It is also the path that leaves the action boundary undefended.

The boundary is where consequence is produced. Governance that does not reach the boundary does not govern consequence. It governs everything except the thing that matters.