The CHEAT Benchmark

For those interested in issues around agentic AI and assessment, I’m excited to announce the launch of the CHEAT Benchmark (https://cheatbenchmark.org/). The CHEAT Benchmark is an AI benchmark like SWE-Bench Pro or GPQA Diamond, except this benchmark measures an agentic AI’s willingness to help students cheat. By measuring and publicizing the degree of dishonesty of …

Democratizing Participation in AI in Education

tl;dr – Go play around with generativetextbooks.org and let me know what you think. Earlier this year I began prototyping an open-source tool for learning with AI in order to explore ways generative AI and OER could intersect. I’m specifically interested in trying to combine the technical power of generative AI with the participatory …