GAUNTLET/ AI red-team

Every chatbot has a few things
it’s not supposed to say.

Gauntlet goes after your bot the way a real attacker would and shows you exactly what slipped out. When something leaks, one click turns on a guard that shuts it down. Every attack maps to the OWASP LLM Top 10, the standard list of ways these systems break.

idle. select a target and run

This public demo runs deterministically, with a seeded attacker and a reproduced model run, so it is free and calls no paid API. The before-and-after is the same every time and reproducible with npm run eval. To run it live against a real model, clone the repo and run npm run dev:live, or email mthylinao@gmail.com.

ATTACK CONSOLE

Nothing has run yet. Pick a target above and press Run Gauntlet.

An attacker fires OWASP-mapped probes, each streams here with a verdict, the scorecard lands a grade, then Apply Guard re-tests and the grade climbs.

OWASP LLM TOP 10 · SCORECARD

Awaiting scan results…