AI Agent Governance and Enterprise Risk

If a rule is written for a system that cannot understand it, does the rule actually exist?

A rule is not just an instruction. It relies on interpretation. It assumes the recipient can recognise that it applies, grasp what is restricted or permitted, and understand that consequences follow. A rule works because it can be followed or ignored.

That assumption underpins most governance. Policies are written as if meaning survives translation from text to action, as if obligation is shared between author and recipient.

AI agents do not receive rules in that form.

How AI agents navigate policy in optimisation space

To an agent, a policy is not a law. It is a weight. It may reduce the probability of an outcome, but it cannot prohibit it.

One mistake is treating AI agents as software. Another is treating them as human-like. Autonomous agents fit neither category.

When given a goal, an agent explores the solution space with a mathematical ruthlessness humans cannot replicate, and with a freedom traditional software never possessed. Conventional systems operate inside predefined pathways and fail when they reach cases they were not written to handle. An agent does the opposite. It succeeds by finding paths no one anticipated.

It will produce “correct” answers that sit well outside acceptable business practice. It is not disobeying policy. It is navigating a high-dimensional optimisation space in which policy survives only as a weak constraint.
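The difference between a weight and a prohibition can be made concrete with a toy example. The sketch below is illustrative only: the action names, reward values, and penalty size are invented, and a real agent searches a far larger space. It shows only that a penalised action can still win, while a removed action cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    reward: float          # how well the action serves the goal
    violates_policy: bool  # whether the written policy forbids it


# Hypothetical candidate actions, for illustration only.
CANDIDATES = [
    Action("renegotiate_contract", reward=5.0, violates_policy=False),
    Action("delay_supplier_payments", reward=9.0, violates_policy=True),
    Action("do_nothing", reward=0.0, violates_policy=False),
]


def choose_with_penalty(actions, penalty):
    """Policy as a weight: a violation costs `penalty` but stays selectable."""
    return max(
        actions,
        key=lambda a: a.reward - (penalty if a.violates_policy else 0.0),
    )


def choose_with_hard_constraint(actions):
    """Policy as structure: violating actions are removed before the search."""
    allowed = [a for a in actions if not a.violates_policy]
    return max(allowed, key=lambda a: a.reward)


soft_choice = choose_with_penalty(CANDIDATES, penalty=3.0)
hard_choice = choose_with_hard_constraint(CANDIDATES)
print(soft_choice.name)  # delay_supplier_payments
print(hard_choice.name)  # renegotiate_contract
```

With a penalty of 3.0 the forbidden action still scores 6.0 against 5.0 and is chosen. Raising the penalty changes the outcome only until a more rewarding violation appears. Removing the action from the candidate set does not depend on the numbers at all.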

Context, convention, and intent do not exist inside optimisation space.

If the only thing that stops an agent is code, then code is the only policy that matters.

Why current governance frameworks fail autonomous agents

Existing governance frameworks assume one of two things: a tool that executes instructions, or a person who can be held responsible. An autonomous agent fits neither.

Research presented at NeurIPS (Chen et al., 2025) examined autonomous systems managing critical infrastructure. The conclusion is stark. If an agent retains the architectural ability to execute a harmful action, failure is not a question of intent or oversight. It is a question of time. Boundaries must be present at the point of objective formation.

This sits uneasily with current AI strategies that prioritise rapid capability deployment and treat control as something to be refined later. That sequencing fails for autonomous agents. Controls introduced after optimisation begins do not function as limits. They register only as signals inside an already-defined search space.

Reliability cannot be prompted. It must be engineered.

Why system designers control agent safety

The claim here is not that agents require stricter oversight. It is that the limits that matter must exist before optimisation begins. Boundaries identified later are not enforced but discovered and bypassed. An agent cannot be persuaded to follow rules, only prevented from executing actions.
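One way to picture prevention rather than persuasion is an executor that can only run actions wired into it when it is built. The sketch below is a minimal illustration under stated assumptions: `Executor`, `CapabilityError`, `read_invoice`, and the action names are all hypothetical, and in a real deployment the boundary would sit in permissions, credentials, or network design rather than in a Python class.

```python
from typing import Callable, Dict


class CapabilityError(Exception):
    """Raised when a requested action was never wired into the executor."""


class Executor:
    """Runs only the actions registered at construction time."""

    def __init__(self, capabilities: Dict[str, Callable[..., str]]):
        # Copy the mapping so later changes to the caller's dict have no effect.
        self._capabilities = dict(capabilities)

    def run(self, action: str, **kwargs) -> str:
        handler = self._capabilities.get(action)
        if handler is None:
            # There is no handler to call, whatever the agent was told or intends.
            raise CapabilityError(f"no such capability: {action}")
        return handler(**kwargs)


def read_invoice(invoice_id: str) -> str:
    """Hypothetical read-only capability."""
    return f"invoice {invoice_id}: read"


# Hypothetical wiring: the executor can read invoices and has no payment handler.
executor = Executor({"read_invoice": read_invoice})

print(executor.run("read_invoice", invoice_id="42"))  # invoice 42: read

try:
    executor.run("transfer_funds", amount=100)
except CapabilityError as err:
    print(err)  # no such capability: transfer_funds
```

Nothing in this design asks the agent to refrain from transferring funds. The request fails because the capability was never built in, which is the sense in which the limit exists before optimisation begins.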

When rules lose force, the departments that write them lose authority, and constraint migrates.

Since the only limit an agent respects is architectural, the definition of “safe” has to move upstream. It no longer sits comfortably with legal or risk assurance, which work in language, interpretation, and precedent. It relocates to those who design systems and make irreversible technical choices, where behaviour is constrained by structure rather than intent. Rules are not moving into code because it is efficient or elegant. They are moving there because, outside of architecture, they no longer function at all.

Why explainability fails under autonomous agent execution

The paper “STRATUS: A Multi-agent System for Autonomous Reliability Engineering of Modern Clouds” (Chen et al., 2025) makes clear that ‘explainability’, long treated as a foundation of AI governance, is untenable under autonomous execution.

Explainability mattered when systems recommended rather than acted. It assumed a gap between output and consequence, a pause in which a human could review, question, or override the decision. That assumption collapses once systems act directly in the world. Autonomous agents remove the interval that made explanation useful. After that point, the question “why did the AI do that?” loses its governing role. What matters instead are the architectural conditions that made the action possible in the first place.

If a young child writes on a wall with a permanent marker, intention is irrelevant. The failure lies entirely in the environment design.

You cannot instruct or persuade an AI agent to behave by rules. You can only build walls it cannot pass through.

Why governance changes from guidelines to constraints

This exposes a structural gap. Those who understand policy are seldom the ones designing systems, and those who design systems are rarely responsible for defining risk appetite. The separation is usually treated as organisational hygiene, but here it becomes a point of failure. Governance changes form as a result. It shifts from writing guidelines to designing constraints. It moves from influencing decisions to making certain actions impossible.

This is not a matter of preference. It is a matter of necessity. When influence disconnects from execution, persuasion no longer operates. The boundary becomes the only remaining lever.

Governance built on psychology assumes behaviour can be guided through norms, incentives, and interpretation. Governance built on physics assumes behaviour must be constrained by design.

That distinction now matters.

How enterprise safety shifts from written policies to architectural constraints

Operational reality is migrating into a layer where traditional governors lack visibility. Authority no longer sits in meeting minutes or approved frameworks. The rules now live in merge requests. If risk functions cannot validate architecture, they are no longer governing. They are observing.
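If the rules live in merge requests, one plausible form of validation is a check that a risk function owns and that runs against every proposed change. The sketch below is an assumption about what such a check might look like, not a description of any existing tool: the forbidden list, the capability names, and the `forbidden_grants` helper are all hypothetical.

```python
# Hypothetical list owned by the risk function; the names are illustrative.
FORBIDDEN_CAPABILITIES = frozenset({"transfer_funds", "delete_records"})


def forbidden_grants(granted):
    """Return, in sorted order, any granted capabilities that risk has forbidden."""
    return sorted(set(granted) & FORBIDDEN_CAPABILITIES)


# Hypothetical capability lists as they might appear in two merge requests.
safe_change = ["read_invoice", "draft_email"]
risky_change = ["read_invoice", "transfer_funds"]

print(forbidden_grants(safe_change))   # []
print(forbidden_grants(risky_change))  # ['transfer_funds']
```

A pipeline could refuse to merge any change for which this function returns a non-empty list. The check is trivial. What matters is who owns the forbidden list and whether the check can block a release.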

This is not an argument against agent deployment. Competitive pressure makes adoption unavoidable. It is an argument against the illusion of control. The safety of the enterprise no longer depends on what is written. It depends on what has been made impossible.

References

Chen, Y., Pan, J., Clark, J., Su, Y., Zheutlin, N., Bhavya, B., Arora, R., Deng, Y., Jha, S. and Xu, T. (2025) ‘STRATUS: A Multi-agent System for Autonomous Reliability Engineering of Modern Clouds’, arXiv. Available at: https://arxiv.org/pdf/2506.02009 [Accessed 10 Dec. 2025].
Department for Science, Innovation and Technology (2025) Public Attitudes to AI Survey 2024–2025. London: UK Government. Available at: gov.uk [Accessed 9 Dec. 2025].
European Union (2016) Regulation (EU) 2016/679 (General Data Protection Regulation). Official Journal of the European Union, L119, pp. 1–88.
Institute of International Finance and EY (2025) 2025 IIF-EY Annual Survey Report on AI Use in Financial Services. Washington, D.C.: Institute of International Finance.
McKinsey & Company (2025) The State of AI in 2025: Key Insights from the Global Survey. New York: McKinsey & Company.
Stanford Institute for Human-Centered AI (2025) The AI Index 2025 Annual Report. Stanford, CA: Stanford University.

You Cannot Negotiate with Code: Why physics, not policies, will govern AI