Oktsec reviews real software systems across application security, architecture, dependencies, CI/CD, automation, cloud integrations and AI-agent workflows. Each assessment produces reviewed findings, executable evidence, scoring and a practical plan for what to fix, monitor or bring under control next.
Fixed in the public PR · mercury-cli #67
Reported to Stripe · HackerOne #3709089 · CWE-94 · 8.0 HIGH
Each proof-of-concept fires in a network-isolated, capability-stripped sandbox and confirms itself. Proven, not claimed.
Confirmed reports to notable vendors and formal disclosure programs (VRP / MSRC), across agent tooling, CLIs and infrastructure.
Mercury · Traversal · public
Path-param URL traversal via stdin reroutes a webhooks update to a money-movement endpoint under the operator's key (mercury-cli).
Real application and product security across the surfaces your software exposes.
Design, trust boundaries and the technical debt that quietly becomes security risk.
Dependencies, build pipelines, automation and the provenance behind them.
Our specialty: agent runtimes, MCP servers, tools and automation paths.
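To illustrate the class of bug behind the path-parameter traversal finding listed above, here is a minimal sketch of how an unvalidated identifier can change which endpoint a request reaches. The base URL, endpoint names and function names are hypothetical and chosen for illustration; they are not taken from mercury-cli or any vendor API.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical base URL and endpoint names, for illustration only.
BASE = "https://api.example.test/v1/"

# Assumed identifier format: letters, digits, underscore and hyphen.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def build_url_unsafe(webhook_id: str) -> str:
    """Interpolate the identifier as-is. urljoin resolves dot segments,
    so '../' in the identifier walks out of the webhooks/ path."""
    return urljoin(BASE, f"webhooks/{webhook_id}")


def build_url_checked(webhook_id: str) -> str:
    """Reject anything that is not a plain identifier, then percent-encode
    it so it can only ever occupy a single path segment."""
    if not ID_PATTERN.fullmatch(webhook_id):
        raise ValueError(f"invalid webhook id: {webhook_id!r}")
    return urljoin(BASE, "webhooks/" + quote(webhook_id, safe=""))


if __name__ == "__main__":
    # The traversal payload lands on a different (hypothetical) endpoint.
    print(build_url_unsafe("../transfers"))
    # A well-formed identifier stays under webhooks/.
    print(build_url_checked("wh_123"))
```

In this sketch the unchecked builder turns `../transfers` into a URL under `/v1/transfers`, while the checked builder refuses the same input. An allowlist is used here because percent-encoding alone would still let a bare `..` through unchanged.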
We don't report "maybe." 250+ findings, each confirmed by a proof-of-concept that fires in an isolated sandbox.
From remote code execution to credential exfiltration, the findings span the full range of what actually breaks agent systems and developer tooling.
The strongest proof isn't a report; it's corrected code in production. These engagements ran over multiple rounds, with fixes merged as a direct result.
RCE via JS injection, a supply-chain path and a token leak, each with multiple PRs merged into production.
Thirteen rounds of review with roughly seventeen findings and three waves of fixes shipped to production.
Client names withheld pending disclosure permission. Formal vendor-program findings above are reported through public disclosure channels.
Most assessments stop at findings. Oktsec delivers evidence that runs. Every confirmed finding comes with a proof-of-concept that fires in an isolated sandbox and is verified server-side.
It isn't an AI saying "this looks like a vulnerability." It's a live proof-of-concept firing and exiting with a confirmation code.
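As a rough sketch of what "exiting with a confirmation code" can mean in practice, the snippet below runs a proof-of-concept as a child process and treats the finding as confirmed only when the process exits with an agreed code. The code value, function name and stand-in commands are assumptions for illustration, not Oktsec's actual harness, and the sketch does not provide any sandbox isolation by itself.

```python
import subprocess
import sys

# Hypothetical confirmation value; a real harness would define its own.
CONFIRMATION_CODE = 42


def poc_confirmed(command: list[str], timeout: float = 30.0) -> bool:
    """Run a proof-of-concept command and report whether it exited with
    the confirmation code. A timeout counts as not confirmed."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == CONFIRMATION_CODE


if __name__ == "__main__":
    # Stand-in commands: one exits with the confirmation code, one exits 0.
    firing = [sys.executable, "-c", f"import sys; sys.exit({CONFIRMATION_CODE})"]
    silent = [sys.executable, "-c", "pass"]
    print(poc_confirmed(firing))
    print(poc_confirmed(silent))
```

Using a dedicated non-zero code rather than exit status 0 means a proof-of-concept that merely runs without error is not mistaken for one that actually triggered the issue.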
We review the systems agents depend on when they move from chat into action: runtimes, tools, workflows, package paths and infrastructure-facing automation.
The execution surfaces where agents act on real systems.
Tool exposure, auth boundaries and call-abuse surfaces.
Command surfaces, hooks and local execution paths.
Workflow permissions, secrets handling and pipeline trust.
Where untrusted input turns into privileged action.
Package installs, dependencies and provenance gaps.
Each assessment follows a repeatable path: scope the target, run the research in an isolated environment, capture evidence, validate findings and deliver a report the team can act on.
Define the target, boundaries and rules of engagement.
Execute in an isolated sandbox, fully instrumented.
Confirm each finding by firing its proof-of-concept and checking that it exits with a confirmation code.
Triage and rank issues with reproducible context.
Propose concrete fixes, not just findings.
Trace each finding to its evidence and origin.
A researcher reviews everything before delivery.
A report your team can act on and verify.
Each engagement ends with artifacts your team can review, reproduce and use to prioritize remediation.
Tell us the target: an agent runtime, MCP surface, developer tool, CI workflow or automation-heavy codebase. We scope a focused audit and confirm every finding with an executable proof-of-concept.