Tech News

Show HN: We post-trained a model that pen tests instead of refusing

Why This Matters

This tool combines read-only security scanning and active penetration testing in a single CLI. It can attempt to reproduce the vulnerabilities it reports and writes the results to a markdown report, so developers and security professionals can separate confirmed findings from theoretical ones.

Key Takeaways

Two modes in one CLI. Security Scan reads the code. Pen Test attempts the exploits against systems you authorise.

Security Scan: read-only · self-serve
Pen Test: active · gated

Pick modules, set the agent’s permissions, optionally turn on exploit verification, run. Output is a markdown report — location, severity, cause, and fix direction for every finding it could ground in your code.
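As a rough illustration of the report described above, the sketch below models one finding with the four fields the article lists (location, severity, cause, fix direction) and renders a list of findings as markdown. The class name, function name, heading layout, and sample values are all assumptions made for illustration; the article does not show the actual report format.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One reported issue. Field names mirror the article; the type is hypothetical."""
    location: str       # e.g. a file path and line number
    severity: str       # e.g. "low", "medium", "high"
    cause: str          # why the code is vulnerable
    fix_direction: str  # what kind of change would address it


def render_report(findings):
    """Render findings as a markdown document (layout is an assumption)."""
    lines = ["# Security Scan Report", ""]
    for index, finding in enumerate(findings, start=1):
        lines += [
            f"## {index}. {finding.location}",
            f"- Severity: {finding.severity}",
            f"- Cause: {finding.cause}",
            f"- Fix direction: {finding.fix_direction}",
            "",
        ]
    return "\n".join(lines)


# Sample data invented for the example, not taken from a real scan.
example = Finding(
    location="app/db.py:42",
    severity="high",
    cause="SQL query built by string concatenation",
    fix_direction="use parameterised queries",
)
print(render_report([example]))
```

Keeping each finding as a structured record before rendering makes it straightforward to sort or filter by severity ahead of writing the markdown.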

Free install. The first run opens a quick Cosine sign-up — the same login that runs Cosine’s coding agent — and new accounts start with 2M free tokens.

$ cd path/to/your/repo
$ argusred
→ first run opens a Cosine sign-up — you start with 2M free tokens

Before the scan runs.

argusred v2.0.19 · Security Scan · setup

Scan Scope — 5 of 8 active
[×] Dependency Vulnerability Analysis
[×] Secret & Credential Detection
[×] SQL Injection / XSS Vectors
[ ] Authentication & Session Flows
[×] Input Validation & Sanitisation
[ ] CORS & CSP Misconfigurations
[ ] Cryptographic Weakness Scan
[×] File Permission & Access Controls

Exploit Verification
Optionally verify reported findings by attempting safe exploit reproduction after the initial report.
Exploit Verification  (•) Disabled  ( ) Docker  ( ) Live FS

Agent Permissions
Terminal Access   ( ) Enabled  (•) Disabled  ( ) Sandboxed
Network Requests  ( ) Enabled  (•) Disabled  ( ) Sandboxed
File Write        ( ) Enabled  ( ) Disabled  (•) Sandboxed
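The setup screen amounts to a small configuration: a set of active scan modules, one exploit-verification mode, and three agent permissions that each take one of three values. The sketch below models that state in Python to make the choices explicit. The type and field names are hypothetical; argusred's real configuration format is not shown in the article, and only the option labels and the selections are taken from the screen.

```python
from dataclasses import dataclass
from enum import Enum


class Verification(Enum):
    DISABLED = "Disabled"
    DOCKER = "Docker"
    LIVE_FS = "Live FS"


class Permission(Enum):
    ENABLED = "Enabled"
    DISABLED = "Disabled"
    SANDBOXED = "Sandboxed"


# Module labels copied from the setup screen.
ALL_MODULES = (
    "Dependency Vulnerability Analysis",
    "Secret & Credential Detection",
    "SQL Injection / XSS Vectors",
    "Authentication & Session Flows",
    "Input Validation & Sanitisation",
    "CORS & CSP Misconfigurations",
    "Cryptographic Weakness Scan",
    "File Permission & Access Controls",
)


@dataclass(frozen=True)
class ScanSetup:
    """Hypothetical model of the setup screen; defaults match the selections shown."""
    active_modules: frozenset
    exploit_verification: Verification = Verification.DISABLED
    terminal_access: Permission = Permission.DISABLED
    network_requests: Permission = Permission.DISABLED
    file_write: Permission = Permission.SANDBOXED

    def __post_init__(self):
        # Reject module names that are not on the setup screen.
        unknown = set(self.active_modules) - set(ALL_MODULES)
        if unknown:
            raise ValueError(f"unknown modules: {sorted(unknown)}")

    def scope_summary(self):
        return f"{len(self.active_modules)} of {len(ALL_MODULES)} active"


# The five modules ticked in the screen above.
shown_setup = ScanSetup(
    active_modules=frozenset({
        "Dependency Vulnerability Analysis",
        "Secret & Credential Detection",
        "SQL Injection / XSS Vectors",
        "Input Validation & Sanitisation",
        "File Permission & Access Controls",
    })
)
print(shown_setup.scope_summary())
```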

Verify the findings.

Don’t just report a vulnerability — prove it. Turn on Exploit Verification and the agent attempts a safe reproduction of each finding after the initial report, so what lands in front of you is confirmed, not theoretical.

Docker — reproduction runs inside an ephemeral, isolated container spun up from your repo. Nothing touches your host; the container is torn down when it finishes.
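To make the idea of an ephemeral, isolated container concrete, here is a minimal sketch that builds a `docker run` command line using standard Docker flags: `--rm` removes the container when it exits, `--network none` disables networking, and `--read-only` plus a `:ro` volume keep the container from writing to its root filesystem or to the mounted repo. This is not argusred's implementation. The function name, the image name, the `/src` mount point, and the decision to mount the repo rather than build an image from it are all assumptions.

```python
from pathlib import Path


def ephemeral_run_command(repo, image, command):
    """Build (but do not execute) a docker run invocation for a throwaway container.

    Hypothetical helper: only the docker flags themselves are real.
    """
    repo_path = Path(repo).resolve()
    return [
        "docker", "run",
        "--rm",                               # delete the container on exit
        "--network", "none",                  # no network access
        "--read-only",                        # read-only root filesystem
        "--volume", f"{repo_path}:/src:ro",   # repo mounted read-only
        "--workdir", "/src",
        image,
        *command,
    ]


# "repro-image:latest" and "poc.py" are placeholder names for illustration.
cmd = ephemeral_run_command(".", "repro-image:latest", ["python", "poc.py"])
print(" ".join(cmd))
```

On a host with Docker installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`; the sketch only builds the command so it can be inspected first.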
