Detection versus authorization: the two security models for AI agents

Detection and authorization are the two security models available for AI agent work, and they are routinely confused because both are sold under the same word: security. Each fits in one sentence. Detection observes what an agent does and alerts when behavior matches something bad. Authorization defines what an agent may do before it acts, enforces that boundary deterministically, and afterward verifies the evidence of what happened.

The distinction is old. Antivirus is detection; file permissions are authorization. An IDS is detection; a firewall rule is authorization. Every mature security stack runs both. What is new with agents is that the market default, detection first and authorization maybe later, gets the order exactly wrong.

What each model answers

Detection answers: does this behavior look like a known problem? It is probabilistic by nature. Its outputs are alerts with confidence levels, and its failure modes are the pair every security team knows: false positives that erode attention, and false negatives that erode everything else.

Authorization answers three different questions. What is this agent allowed to do? Did the boundary hold? And afterward: what exactly happened, under which policy? Its outputs are decisions and records: allowed, reviewed, blocked, each one attributable to a rule someone approved. Its failure mode is rigidity: a policy too narrow blocks legitimate work until someone widens it.
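A minimal sketch of that output shape, assuming a hypothetical policy format (the `Rule` fields, rule IDs, and tool names are illustrative, not Oktsec's actual schema). Every call maps to exactly one of the three decisions, and each decision carries the ID of the rule that produced it.

```python
from dataclasses import dataclass

# Hypothetical policy format, for illustration only.
@dataclass(frozen=True)
class Rule:
    rule_id: str
    tool: str       # exact tool name the rule applies to
    decision: str   # "allowed", "reviewed", or "blocked"

@dataclass(frozen=True)
class Decision:
    tool: str
    decision: str
    rule_id: str    # the approved rule this decision is attributable to

POLICY = [
    Rule("r1", "read_ticket", "allowed"),
    Rule("r2", "send_email", "reviewed"),
]

DEFAULT_DENY = "default-deny"

def authorize(tool: str, policy: list) -> Decision:
    """First matching rule wins; anything the policy does not list is blocked."""
    for rule in policy:
        if rule.tool == tool:
            return Decision(tool, rule.decision, rule.rule_id)
    return Decision(tool, "blocked", DEFAULT_DENY)
```

The rigidity failure mode shows up directly: a legitimate tool that nobody listed falls through to the default and stays blocked until someone adds a rule for it.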

Notice what the second list has that the first does not. Decisions and records are evidence. Alerts are not. When an auditor asks what your agents were allowed to do last quarter and what they actually did, a folder of anomaly alerts does not answer the question. A signed policy and the verified record of its application do.

Why agents break detection-first security

Detection works when malicious activity looks different from legitimate activity. Malware has signatures. Exploits have payloads. Anomalies have baselines to deviate from.

Agent attacks have none of that, because the attack and the workload are made of the same material. A prompt injection is text. The agent reading it was built to read text. The tool call it triggers is a legitimate tool call, with valid credentials, through an approved integration. There is no signature, because nothing in the traffic is malformed. The only thing wrong with the action is that nobody authorized it.

That sentence is the whole axis. If the defect is missing authorization, the fix is not better observation of unauthorized actions. The fix is authorization.

You cannot detect your way out of a problem whose definition is "this action was never allowed in the first place."
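The contrast can be sketched as a toy, under loud assumptions: the signature list, the agent and tool names, and the allowlist below are all hypothetical, and real detectors are far more elaborate than two regular expressions. The point is only that an injected tool call can be perfectly well-formed, so a pattern check has nothing to match, while an allowlist check fails it for the one reason that matters.

```python
import re

# Toy signatures, for illustration only.
SIGNATURES = [re.compile(r"<script>", re.I), re.compile(r"\.\./")]

# Hypothetical allowlist: (agent, tool) pairs that someone approved.
ALLOWED = {
    ("support-bot", "read_ticket"),
    ("support-bot", "reply_to_ticket"),
}

def looks_malicious(call: dict) -> bool:
    """Detection-style check: does anything in the call match a known pattern?"""
    text = " ".join(str(value) for value in call.values())
    return any(sig.search(text) for sig in SIGNATURES)

def is_authorized(call: dict) -> bool:
    """Authorization-style check: was this agent ever allowed to use this tool?"""
    return (call["agent"], call["tool"]) in ALLOWED

# A call triggered by injected text: valid agent, valid syntax, unapproved tool.
injected = {"agent": "support-bot", "tool": "export_customers", "args": "format=csv"}

print(looks_malicious(injected))  # nothing malformed to match
print(is_authorized(injected))    # never approved
```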

Where detection still earns its place

None of this makes detection useless for agents; it makes detection second. Inside an authorized boundary there is still work for it: an agent using an approved tool at an anomalous rate, a permitted query pattern that starts to look like exfiltration, drift in the behavior of a dependency. Detection is how you notice the problems your policy did not anticipate. It refines the boundary; it cannot replace it.

The order matters because each layer changes what the other sees. Detection on top of authorization inspects a small, well-defined space: only actions that were allowed. Detection instead of authorization inspects everything, which in agent systems means an unbounded stream of plausible-looking actions, at machine speed, with no ground truth about what should have been allowed.
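As one illustration of detection layered on top, here is a sketch of a rate check that only ever sees calls authorization already allowed. The class name, thresholds, and sliding-window approach are assumptions chosen for the example, not a description of any particular product.

```python
from collections import deque

class RateMonitor:
    """Flags an approved tool being called unusually often.

    It is fed only calls that were already allowed, so the space it
    inspects is bounded by the policy.
    """

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self.timestamps = deque()

    def observe(self, timestamp: float) -> bool:
        """Record one allowed call; return True if the rate looks anomalous."""
        self.timestamps.append(timestamp)
        # Drop calls that have fallen out of the sliding window.
        while self.timestamps and self.timestamps[0] <= timestamp - self.window:
            self.timestamps.popleft()
        return len(self.timestamps) > self.max_calls
```

An alert from a monitor like this is a prompt to tighten the policy, for example by adding a rate limit to the rule, rather than a substitute for having the rule.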

How to tell which model you are looking at

Vendor language blurs the axis, so ask questions the models answer differently.

Can it stop an action before it executes, based on a rule I wrote? Authorization can. Detection alerts during or after.
Is a model in the enforcement path? If an LLM decides what is allowed, you have probabilistic detection wearing authorization's clothes. Enforcement has to be deterministic to be auditable.
What is the artifact after an incident? Authorization produces the policy that was in force and the per-call record of decisions. Detection produces a timeline of alerts.
Who can verify it? If the answer requires trusting the vendor's classifier, it is detection. If a third party can re-check the decision against the written policy, it is authorization.
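The last question can be made concrete with a sketch of a re-check, under stated assumptions: the policy and record shapes are hypothetical, and an HMAC with a shared key stands in for a real signature scheme (a genuine third-party verifier would more likely use an asymmetric signature so it never holds the signing key). Because the decision function is deterministic, anyone holding the written policy can replay the records and find decisions that do not follow from it.

```python
import hashlib
import hmac
import json

def sign_policy(policy: dict, key: bytes) -> str:
    """HMAC over a canonical JSON encoding; a stand-in for a real signature."""
    payload = json.dumps(policy, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def decide(policy: dict, tool: str) -> str:
    """Deterministic: the same policy and tool always give the same decision."""
    return policy["rules"].get(tool, "blocked")

def recheck(policy: dict, signature: str, key: bytes, records: list) -> list:
    """Return the records whose decision does not follow from the policy."""
    if not hmac.compare_digest(sign_policy(policy, key), signature):
        raise ValueError("policy does not match its signature")
    return [r for r in records if decide(policy, r["tool"]) != r["decision"]]

key = b"demo-key"  # hypothetical; never hard-code real keys
policy = {"version": 1, "rules": {"read_ticket": "allowed", "send_email": "reviewed"}}
signature = sign_policy(policy, key)

records = [
    {"tool": "read_ticket", "decision": "allowed"},
    {"tool": "export_customers", "decision": "allowed"},  # not in the policy
]
mismatches = recheck(policy, signature, key, records)
```

Nothing comparable exists for a classifier verdict: there is no written rule to replay the input against, only the vendor's model.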

The takeaway

Detection and authorization are not competitors; they are layers with a correct order. For AI agent work, authorization comes first because the attacks are indistinguishable from the workload and because only authorization produces evidence a company can stand behind. This is the model Oktsec is built on: signed, deterministic policy decides before the action, and verified evidence proves what happened after. Detection is welcome on top. It just cannot be the foundation.

Oktsec Control runs the authorization loop for approved agent environments.
