Oktsec Assessment

Security assessment for modern software systems.

Oktsec reviews real software systems across application security, architecture, dependencies, CI/CD, automation, cloud integrations and AI-agent workflows. Each assessment produces reviewed findings, executable evidence, scoring and a practical plan for what to fix, monitor or bring under control next.


poc-run · mercury-cli · path-param traversal · confirmed

$ echo '{"webhookEndpointId":"x/../../account/<ID>/transactions",
  "amount":999.99,"recipientId":"attacker","paymentMethod":"ach"}' \
  | mercury --debug webhooks update
→ POST /api/v1/account/<ID>/transactions (money movement · operator's key)
CONFIRMED · webhooks command redirected to a payments endpoint

Fixed in the public PR · mercury-cli #67
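The traversal in this run depends on an identifier read from stdin reaching the request path unchecked. One way to close that class of bug is to treat every path parameter as a single segment and reject anything that could change the route. The sketch below is illustrative only: `safePathParam`, `webhookPath` and the `/api/v1/webhooks/` prefix are hypothetical names chosen for the example, not the code merged in the mercury-cli PR.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// errUnsafeParam is returned when a value cannot be used as a single path segment.
var errUnsafeParam = errors.New("unsafe path parameter")

// safePathParam (hypothetical helper) rejects values that could alter the
// request path, then percent-escapes what remains as one segment.
func safePathParam(v string) (string, error) {
	if v == "" || v == "." || v == ".." {
		return "", errUnsafeParam
	}
	// Separators, query/fragment markers and pre-encoded bytes are all refused.
	if strings.ContainsAny(v, "/\\?#%") {
		return "", errUnsafeParam
	}
	return url.PathEscape(v), nil
}

// webhookPath builds the path for an illustrative webhooks update call.
// The "/api/v1/webhooks/" prefix is an assumption made for this sketch.
func webhookPath(id string) (string, error) {
	seg, err := safePathParam(id)
	if err != nil {
		return "", fmt.Errorf("webhookEndpointId %q: %w", id, err)
	}
	return "/api/v1/webhooks/" + seg, nil
}

func main() {
	for _, id := range []string{"wh_123", "x/../../account/ID/transactions"} {
		p, err := webhookPath(id)
		fmt.Println(p, err)
	}
}
```

Rejecting the value outright, rather than silently normalizing it, keeps a hostile identifier from ever being sent under the operator's key.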

poc-run · stripe-cli · OpenAPI codegen · confirmed

spec.info.version = '"\n func init(){panic("RCE_VIA_OPENAPI_SPEC")} \n const _="'
$ go generate ./... # text/template, no escaping; format.Source accepts it
→ stripe_version_header.go now has func init(){panic(...)}
$ ./stripe --version → panic: RCE_VIA_OPENAPI_SPEC (every release binary)
CONFIRMED · reported to Stripe

Reported to Stripe · HackerOne #3709089 · CWE-94 · 8.0 HIGH
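The mechanism behind this finding can be reproduced in a few lines: `text/template` performs no escaping, so a spec value placed between quotes in a Go source template can close the string and add declarations, and `format.Source` accepts the result because it is valid Go. The sketch below is a stand-alone illustration, not stripe-cli's generator. The template text, the `render` and `declaresFunc` helpers and the `gen` package name are assumptions made for the example. It contrasts raw interpolation with emitting the value through `printf "%q"`, which produces a Go-quoted string literal.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
	"text/template"
)

// payload mirrors the shape of the spec value in the run above:
// it closes the string literal, declares init, and reopens a string.
const payload = "\"\nfunc init(){panic(\"RCE_VIA_OPENAPI_SPEC\")}\nconst _=\""

// unsafeTmpl interpolates the value raw between quotes (hypothetical template).
const unsafeTmpl = `package gen

const Version = "{{.}}"
`

// quotedTmpl lets printf %q produce the whole Go string literal.
const quotedTmpl = `package gen

const Version = {{printf "%q" .}}
`

// render executes the template and runs the output through format.Source,
// the same formatting step named in the run above.
func render(tmplText, version string) (string, error) {
	t, err := template.New("gen").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, version); err != nil {
		return "", err
	}
	out, err := format.Source(buf.Bytes())
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// declaresFunc reports whether the generated source declares any function.
func declaresFunc(src string) bool {
	f, err := parser.ParseFile(token.NewFileSet(), "gen.go", src, 0)
	if err != nil {
		return false
	}
	for _, d := range f.Decls {
		if _, ok := d.(*ast.FuncDecl); ok {
			return true
		}
	}
	return false
}

func main() {
	for _, tmplText := range []string{unsafeTmpl, quotedTmpl} {
		out, err := render(tmplText, payload)
		if err != nil {
			fmt.Println("render error:", err)
			continue
		}
		fmt.Println("declares a function:", declaresFunc(out))
	}
}
```

Checking the generated file's declarations with `go/parser`, rather than searching for a substring, matters here: the quoted output still contains the text of the payload, but only as data inside a string literal.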

Each proof-of-concept fires in a network-isolated, capability-stripped sandbox and confirms itself. Proven, not claimed.

Formal programs

Findings accepted by the programs that set the bar.

Confirmed reports to notable vendors and formal disclosure programs (VRP / MSRC), across agent tooling, CLIs and infrastructure.

Google (VRP · RCE): VRP submissions (a2a, gemini-cli) plus a confirmed RCE in ax.

Microsoft (MSRC): MSRC reports across AutoGen, TypeScript and Playwright.

Stripe (RCE · CWE-94): Template injection in the OpenAPI codegen, RCE in every release binary (stripe-cli). Reported to Stripe, HackerOne #3709089.

Cloudflare (Auth bypass): Revocation auth bypass in workers-oauth-provider.

AWS (Cred exfil): SigV4 credential exfiltration in mcp-proxy-for-aws.

Mercury (Traversal · public): Path-param URL traversal via stdin reroutes a webhooks update to a money-movement endpoint under the operator's key (mercury-cli).

What we review

Across modern software systems.

AppSec

Application and product security

Real application and product security across the surfaces your software exposes.

Architecture

Architecture and technical debt

Design, trust boundaries and the technical debt that quietly becomes security risk.

Supply chain

Dependencies, CI/CD and supply chain

Dependencies, build pipelines, automation and the provenance behind them.

AI agents

AI-agent workflows and automation

Our specialty: agent runtimes, MCP servers, tools and automation paths.

Classes found

Real exploits, across the stack.

From remote code execution to credential exfiltration, the findings span the full range of what actually breaks agent systems and developer tooling.

Vulnerability classes

Deep assessments

Sustained engagements, fixes shipped.

The strongest proof isn't a report; it's corrected code in production. These engagements ran over multiple rounds, with fixes merged as a direct result.

AI platform · R1-R6

Sustained assessment, 3 criticals corrected.

RCE via JS injection, a supply-chain path and a token leak, each with multiple PRs merged into production.

3 critical fixes merged

Funded AI project · R1-R13

Continuous assessment, ~17 findings.

Thirteen rounds of review with roughly seventeen findings and three waves of fixes shipped to production.

~17 findings · 3 fix waves

Client names withheld pending disclosure permission. Formal vendor-program findings above are reported through public disclosure channels.

The difference

Executable evidence, not opinions.

Most assessments stop at findings. Oktsec delivers evidence that runs. Every confirmed finding comes with a proof-of-concept that fires in an isolated sandbox and is verified server-side.

It isn't an AI saying "this looks like a vulnerability." It's a live proof-of-concept firing and exiting with a confirmation code.

finding-0142 · evidence · reproducible

class SSRF · severity high
# attempt without fix
poc → internal metadata reached
exit 1 · reproduced
# after patch
exit 0 · blocked
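The exit-code convention in this evidence panel (1 when the issue reproduces, 0 when the patch blocks it) can be sketched as a small harness. This is a simplified, network-free illustration rather than Oktsec's tooling. The `fetchGuard` type, the `allowAll` and `blockInternal` functions and the metadata address `169.254.169.254` are assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// fetchGuard decides whether an outbound URL may be requested.
type fetchGuard func(rawURL string) bool

// allowAll models the unpatched code path: no destination checks at all.
func allowAll(string) bool { return true }

// blockInternal models a patch that refuses loopback, private and
// link-local IP literals. Hostnames are refused outright in this sketch,
// since a real check would have to resolve and re-validate them.
func blockInternal(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	ip := net.ParseIP(u.Hostname())
	if ip == nil {
		return false
	}
	return !(ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast())
}

// runPoC follows the convention in the panel above:
// 1 means the metadata target was reachable, 0 means it was blocked.
func runPoC(allow fetchGuard) int {
	// Assumed target: a link-local cloud metadata address.
	const metadataURL = "http://169.254.169.254/latest/meta-data/"
	if allow(metadataURL) {
		return 1
	}
	return 0
}

func main() {
	fmt.Println("without fix:", runPoC(allowAll))
	fmt.Println("after patch:", runPoC(blockInternal))
}
```

Running the same proof-of-concept before and after the patch is what makes the evidence reproducible: the only thing that changes between the two results is the code under test.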

AI-agent specialty

Focused on the surfaces where agent risk becomes real.

We review the systems agents depend on when they move from chat into action: runtimes, tools, workflows, package paths and infrastructure-facing automation.

AI agents & agent runtimes

The execution surfaces where agents act on real systems.

MCP servers & tool integrations

Tool exposure, auth boundaries and call-abuse surfaces.

CLI & developer tools

Command surfaces, hooks and local execution paths.

GitHub Actions & CI/CD

Workflow permissions, secrets handling and pipeline trust.

Prompt-injection & tool-call abuse

Where untrusted input turns into privileged action.

Software supply-chain paths

Package installs, dependencies and provenance gaps.

Workflow

A scoped, isolated, evidence-backed run.

Each assessment follows a repeatable path: scope the target, run the research in an isolated environment, capture evidence, validate findings and deliver a report the team can act on.

Scoped target

Define the target, boundaries and rules of engagement.

Isolated run

Execute in an isolated sandbox, fully instrumented.

Executable PoC

Confirm each finding by firing the proof-of-concept to a confirmation code.

Findings

Triage and rank issues with reproducible context.

Patches

Propose concrete fixes, not just findings.

Provenance

Trace each finding to its evidence and origin.

Human review

A researcher reviews everything before delivery.

Delivery

A report your team can act on and verify.

Deliverables

What your team receives.

Each engagement ends with artifacts your team can review, reproduce and use to prioritize remediation.

Findings report with severity

Executable proof-of-concept

Reproduction steps & evidence

Proposed patches

Signed evidence bundle

Remediation guidance

Oktsec Control rollout recommendation

Book an assessment

Get proof your team can act on.

Tell us the target: an agent runtime, MCP surface, developer tool, CI workflow or automation-heavy codebase. We scope a focused audit and confirm every finding with an executable proof-of-concept.
