Perspective

When your AI agent goes wrong, can you actually stop it?

Most AI governance products rely on the agent to cooperate with its own oversight. In 2026, that assumption is starting to break in public.

Deepika Sharma

Founder, Purogaly · May 2026 · 6 min read

In a Kiteworks survey of 225 enterprise leaders published this quarter, 60% admitted they could not terminate a misbehaving AI agent once it had started operating. Sixty-three percent said they could not enforce limits on what their agents were authorized to do.

A separate survey by Gravitee, covering more than 900 executives and technical practitioners, found that 88% of enterprises reported confirmed or suspected AI agent security incidents in 2026. In healthcare, that number climbs to 92.7%.

And four days before this article was written, Gartner published a prediction that by 2027, 40% of enterprises will demote or decommission their autonomous AI agents because of governance gaps that only became visible after a production incident.

These numbers describe the same problem from different angles. Enterprises are deploying agents faster than they can govern them. And when something goes wrong — the agent hallucinates a destructive action, follows a prompt injection, runs on an outdated policy, or simply does the wrong thing at machine speed — the governance product they bought turns out to be the wrong shape for the moment.

The architectural shortcut most products took

The dominant pattern in AI governance today is the SDK. A vendor publishes a library. The agent’s developer imports it. Before the agent does anything risky, it calls into the SDK and asks: is this allowed?

This is convenient to build and easy to demo. It also rests on an assumption that doesn’t hold up in production: that the agent will keep cooperating with its own oversight.

A governance system the agent can decide whether to honor is not governance. It is a hint.

An agent that gets jailbroken can skip the check. An agent running on an older policy version doesn’t know what was added last week. An agent that was misaligned at training never learned to call the check in the first place. A compromised dependency can patch the SDK out from underneath. None of these are exotic scenarios — they’re the everyday failure modes that produced the 88% incident rate.

The deeper issue is that an SDK lives inside the agent’s process. The thing being governed is also the thing running the governance. There is no independent check, and there is no place outside the agent where enforcement actually happens.
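To make the in-process problem concrete, here is a minimal Python sketch. The names `is_allowed` and `Agent` are hypothetical stand-ins, not any vendor's SDK. It only illustrates that when the check is an object inside the agent's own process, whatever controls that process can replace it.

```python
# Hypothetical in-process "SDK" check; not a real vendor API.
def is_allowed(action: str) -> bool:
    """Toy policy: destructive actions are denied."""
    return not action.startswith("delete_")


class Agent:
    """Toy agent that is expected to consult the check before acting."""

    def __init__(self, check=is_allowed):
        self.check = check
        self.executed = []

    def act(self, action: str) -> bool:
        if self.check(action):
            self.executed.append(action)
            return True
        return False


cooperative = Agent()
cooperative.act("delete_customer_record")   # denied, because the agent asked

# Anything that controls the process (a jailbreak, a patched dependency)
# can swap the check out: it is just another object in the same memory.
compromised = Agent(check=lambda action: True)
compromised.act("delete_customer_record")   # proceeds; nothing outside notices
```

The second agent never breaks a rule from its own point of view. The rule simply is not there any more, and no independent component is positioned to object.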

What enforcement at the network boundary changes

Consider the same problem solved in a different industry. Banks don’t ask customers to voluntarily check their own withdrawal limits. The limit lives at the bank. When a customer tries to withdraw too much, the bank’s system says no — not because the customer remembered to check, but because the check happens at the boundary the transaction has to cross.

The same principle applies to agent governance. If every agent action has to flow through a gateway before it reaches the systems that execute it, the gateway becomes the enforcement point. The agent can ask, hallucinate, or be coerced into anything; it cannot proceed until the gateway returns an answer.
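A rough sketch of that idea in Python follows. Everything here is an assumption made for illustration: the `Gateway` class, its `submit` and `publish_policy` methods, and the 200/403 response shape are hypothetical. In a real deployment the isolation comes from network topology and credentials; the sketch imitates it by letting only the gateway hold the handlers for the target systems.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Gateway:
    """Toy enforcement point: the only holder of the target-system handlers."""
    targets: Dict[str, Callable[[dict], str]]
    denied_actions: Set[str] = field(default_factory=set)
    policy_version: int = 1
    log: List[dict] = field(default_factory=list)

    def publish_policy(self, denied_actions: Set[str]) -> None:
        """Replace the policy; the agent is not redeployed or even told."""
        self.denied_actions = set(denied_actions)
        self.policy_version += 1

    def submit(self, agent_id: str, action: str, params: dict) -> dict:
        # Policy is read at decision time, so an update applies to the
        # very next request. Unknown actions are denied by default.
        allowed = action in self.targets and action not in self.denied_actions
        self.log.append({"agent_id": agent_id, "action": action,
                         "allowed": allowed,
                         "policy_version": self.policy_version})
        if not allowed:
            return {"status": 403, "reason": "denied_by_policy"}
        return {"status": 200, "result": self.targets[action](params)}


deleted_ids: List[str] = []


def delete_customer_record(params: dict) -> str:
    """Stand-in for the real system that executes the action."""
    deleted_ids.append(params["id"])
    return "deleted"


gateway = Gateway(targets={"delete_customer_record": delete_customer_record})
first = gateway.submit("agent_7b2d4f01", "delete_customer_record", {"id": "c1"})
gateway.publish_policy({"delete_customer_record"})
second = gateway.submit("agent_7b2d4f01", "delete_customer_record", {"id": "c2"})
```

The first request goes through, the second is refused under the new policy version, and both decisions land in the gateway's log regardless of what the agent believes the rules are.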

| Failure mode | SDK-based governance | Network-boundary governance |
| --- | --- | --- |
| Agent decides to skip the check | Action proceeds. No record. | Action never reaches the target. Refusal is recorded. |
| Agent runs on stale policy | Acts on yesterday’s rules. | Always evaluated against the current published policy. |
| Agent is jailbroken | Adversary controls the check. | Adversary cannot reach the enforcement point. |
| Need to suspend a specific agent | Update SDK, redeploy, hope. | Flip a switch. Next request returns 403. |

The last row is the one that the 60% Kiteworks statistic was really about. When a misbehaving agent is running, enterprises with SDK-based governance can’t actually stop it, because stopping it depends on the agent reading and obeying the new instruction. With a network-boundary enforcement point, suspension is a state change that the enforcement layer reads on every request. The next action returns a refusal whether the agent cooperates or not.
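Suspension as a state change can be sketched in a few lines of Python. The function names, the in-memory set, and the `send_email` action are illustrative assumptions, not a description of any particular product; a real enforcement layer would keep this state in a shared store.

```python
from datetime import datetime, timezone

suspended_agents = set()   # state owned by the enforcement layer, not the agent
refusals = []              # every refusal is recorded


def handle_request(agent_id: str, action: str) -> dict:
    """Check suspension state on every request, before anything else."""
    if agent_id in suspended_agents:
        refusals.append({
            "agent_id": agent_id,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return {"status": 403, "reason": "agent_suspended"}
    return {"status": 200}


def suspend(agent_id: str) -> None:
    """Operator action: a state change, with no redeploy and no agent cooperation."""
    suspended_agents.add(agent_id)


before = handle_request("agent_7b2d4f01", "send_email")
suspend("agent_7b2d4f01")
after = handle_request("agent_7b2d4f01", "send_email")
```

Nothing in the agent changes between the two calls. The only thing that changed is state the agent cannot read or write, which is the point.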

The shape of a real enforcement check looks something like this:

Example: an action gated at the boundary
action:           "delete_customer_record"
agent_id:         "agent_7b2d4f01"
policy_matched:   "dual_approval_pii_deletion"
status:           "PENDING"     // agent waits
required_decisions: 2
kill_switch_state: "inactive"   // checked on every request
chain_hash:       "7b2df04a...e1c2"

Note what the agent doesn’t get to control. It doesn’t decide whether to check the policy. It doesn’t decide whether to wait for the human review. It doesn’t decide whether the kill switch is on. All of those decisions live outside the agent — in the enforcement layer, where a human operator and the audit chain can both see them.
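For readers who want to see how fields like `required_decisions` and `chain_hash` might behave, here is a small Python sketch. It rests on assumptions the example above does not state: that the chain hash is a SHA-256 over the previous hash plus the new record, and that approvals are counted per distinct reviewer. The class and reviewer names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain


def chain_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the previous hash (tamper-evident log)."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


class PendingAction:
    """An action held at the boundary until enough humans approve it."""

    def __init__(self, action: str, agent_id: str, required_decisions: int):
        self.action = action
        self.agent_id = agent_id
        self.required_decisions = required_decisions
        self.approvers = set()
        self.status = "PENDING"
        self.chain = [GENESIS]

    def approve(self, reviewer: str) -> str:
        if self.status != "PENDING":
            return self.status
        self.approvers.add(reviewer)   # a set, so one reviewer counts once
        event = {"event": "approve", "reviewer": reviewer}
        self.chain.append(chain_hash(self.chain[-1], event))
        if len(self.approvers) >= self.required_decisions:
            self.status = "APPROVED"
        return self.status


pending = PendingAction("delete_customer_record", "agent_7b2d4f01",
                        required_decisions=2)
s1 = pending.approve("alice")
s2 = pending.approve("alice")   # same reviewer again: still pending
s3 = pending.approve("bob")     # second distinct reviewer: approved
```

Because each hash covers the one before it, editing or dropping an earlier event changes every later hash, which is what lets an outside auditor trust the sequence.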

Why this conversation got urgent in 2026

Two things changed this year that turned the architectural question into a regulatory one.

The first is the August 2, 2026 enforcement deadline for high-risk AI systems under the EU AI Act. After that date, regulators can issue fines of up to €15 million or 3% of global turnover for breaches of the high-risk system requirements, including failures of human oversight, traceability, and accuracy. For prohibited practices, the ceiling rises to €35 million or 7% of global turnover. Conformity assessments take six to twelve months. Organizations that haven’t started by mid-2026 will not finish in time.

The August deadline is structural, not symbolic

Finland became the first EU member state with full AI Act enforcement powers on December 22, 2025. Italy’s implementing Law 132/2025 entered into force in October 2025 with administrative fines up to €774,685 and disqualifying measures for up to one year. The Commission has explicitly rejected calls for blanket delays. The infrastructure to fine companies is being built now.

The second shift is the change in what enterprise procurement teams are asking. A year ago, the question on a vendor security questionnaire was whether you had AI governance. The 2026 version asks whether the governance is enforced at runtime, whether you can intervene during an incident, and whether the evidence would survive an external audit. The vendors that can answer those three questions well are starting to win the deals. The ones that can’t are losing them.

60% of enterprises cannot terminate a misbehaving AI agent once it starts operating. Most of them think they can.

· · ·

The questions that separate the two architectures

If you’re evaluating an AI governance product in the current cycle, these are the questions that distinguish a runtime-enforcement platform from an SDK with a dashboard. They take a few seconds to ask and tell you almost everything about the architecture.

Architecture diagnostic

Five questions to ask before you sign

01. If one of our agents is jailbroken and tries to skip your governance layer, what physically prevents it?
02. If we suspend an agent right now, how long until its next attempted action is refused, and is the refusal recorded?
03. When we update a policy, does an in-flight agent action get evaluated against the old version or the new one?
04. If our agent and your governance layer disagree about whether an action is allowed, who wins, and on what evidence?
05. Can you show me the network call where an agent action would be denied, with the response payload it would receive?

None of these are unfair questions. They’re the questions a regulator will eventually ask, framed in technical language now so you can answer them before the regulator does. If the vendor’s answers depend on the agent cooperating, the architecture is the agent’s, and the governance is whatever the agent decides to do that day.

The shift from SDKs to enforcement is the same shift the financial-services industry made when it stopped relying on individual systems to check their own limits and started running every transaction through a gateway. It made fraud detection possible. It made audit trails meaningful. It made the kill switch real.

AI agent governance is going through the same transition. The companies that figure this out in 2026 will quietly pull ahead of the ones still relying on cooperation.

Building or evaluating runtime AI governance?
Reach out at contact@purogaly.com.