Gauntlet — autonomous AI red-team

idle. select a target and run

This public demo runs deterministically, with a seeded attacker and a reproduced model run, so it is free and calls no paid API. The before-and-after is the same every time and reproducible with npm run eval. To run it live against a real model, clone the repo and run npm run dev:live, or email mthylinao@gmail.com.

ATTACK CONSOLE—

Nothing has run yet. Pick a target above and press Run Gauntlet.

An attacker fires OWASP-mapped probes, each streams here with a verdict, the scorecard lands a grade, then Apply Guard re-tests and the grade climbs.

OWASP LLM TOP 10 · SCORECARD

Awaiting scan results…