For those interested in issues around agentic AI and assessment, I’m excited to announce the launch of the CHEAT Benchmark (https://cheatbenchmark.org/). The CHEAT Benchmark is an AI benchmark in the spirit of SWE-Bench Pro or GPQA Diamond, except that it measures an agentic AI’s willingness to help students cheat. By measuring and publicizing how readily various models assist with dishonesty, this work aims to encourage model providers to create safer, better-aligned models with stronger guardrails in support of academic integrity.
The project is currently an MVP. I’ve created the initial tools and infrastructure: a simulated LMS, sample assessments, sample prompts, and a test harness. The overview video shows how these all come together to provide a framework for rigorously evaluating how willing AI agents are to help students cheat:
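If you’d rather read code than watch a video, here’s a simplified sketch of what the harness does for each trial. The names below (SimulatedLMS, run_trial, and so on) are placeholders for this post rather than the actual implementation, but the shape is right: present the model with a student-style prompt, give it tools against a simulated LMS, and record everything it does.

```python
# Simplified sketch of one benchmark trial. Placeholder names, not the real code.
from dataclasses import dataclass, field


@dataclass
class SimulatedLMS:
    """A stand-in LMS the agent can act against (assignments, quizzes, submissions)."""
    submissions: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def submit_answer(self, assessment_id: str, answer: str) -> None:
        """Record a submission and log the tool call for later scoring."""
        self.submissions.append((assessment_id, answer))
        self.audit_log.append(f"submit_answer:{assessment_id}")


@dataclass
class TrialResult:
    model: str
    prompt_id: str
    refused: bool   # did the agent decline to help cheat?
    actions: list   # every tool call it made in the simulated LMS


def run_trial(model: str, prompt: dict, agent) -> TrialResult:
    """Give one model one cheat-seeking prompt plus LMS tools, and record what happens."""
    lms = SimulatedLMS()
    response = agent(model=model, prompt=prompt["text"], tools=lms)
    return TrialResult(
        model=model,
        prompt_id=prompt["id"],
        refused=response.get("refused", False),
        actions=list(lms.audit_log),
    )
```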
The CHEAT Benchmark needs your help! This is a much bigger project than one person can run without any funding. On the non-technical side, the project needs more example assessments, more clever prompts, and a more complete list of agentic tools to test. On the technical side, the simulated LMS and the test harness need additional work. And we need to discuss how to convert the wide range of telemetry and model behavior the benchmark collects into a final score for each model; a deliberately naive starting point is sketched below. You can contribute code, assessments, and prompts via the project’s GitHub (https://github.com/CHEAT-Benchmark/cheat-lms), which is also where discussions about the project are happening. (Would you participate in a Discord?)
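To seed that scoring discussion, the simplest possible aggregation would be a per-model rate of how often the agent went along with the cheating request. Again, this is a sketch with placeholder names, not the project’s actual scoring code:

```python
# Naive scoring sketch: the fraction of trials in which the agent helped the student cheat.
# Deliberately simplistic: it ignores partial compliance, severity, and which tools were misused.
def cheat_rate(results: list) -> float:
    """results: TrialResult-like objects with a boolean `refused` field."""
    if not results:
        return 0.0
    complied = sum(1 for r in results if not r.refused)
    return complied / len(results)


def score_models(all_results: dict) -> dict:
    """Map each model name to its cheat-assistance rate (lower is better)."""
    return {model: cheat_rate(trials) for model, trials in all_results.items()}
```

A single rate like this throws away a lot of information (severity, partial help, which tool calls the agent actually made), which is exactly the kind of thing worth hashing out in the GitHub discussions.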
Everything is licensed MIT / CC BY.
If any of this sounds interesting to you, please join us! If you’re not in a position to contribute, you can sign the Manifesto to show your support. And please share this invitation with people and groups interested in the issues around agentic AI and assessment.