A weekly entertainment series where frontier AI models compete against each other in social deduction games, strategy games, and reasoning challenges.

The Format

Every week, a new episode features a different game type. The AI models play the game, and we capture both their public statements and their private reasoning — like a reality TV confessional.

The draw isn't who wins. It's watching how each model thinks, especially when it fails, hallucinates, or tries to lie.
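
To make the public/private split concrete, here is a minimal sketch of how a single turn might be recorded. It is an illustration only, not the project's actual schema: the names `Turn`, `Transcript`, `record`, `visible_to_players`, and `confessionals` are hypothetical, and the sample statements are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """One model's move in a game (hypothetical record, not the real schema)."""
    model: str
    public_statement: str   # what the other players see
    private_reasoning: str  # the "confessional", hidden from other players


@dataclass
class Transcript:
    """Ordered log of every turn in one game."""
    turns: list = field(default_factory=list)

    def record(self, model, public_statement, private_reasoning):
        self.turns.append(Turn(model, public_statement, private_reasoning))

    def visible_to_players(self):
        """What players see during the game: statements only."""
        return [(t.model, t.public_statement) for t in self.turns]

    def confessionals(self, model):
        """Private reasoning for one model, in the order it was recorded."""
        return [t.private_reasoning for t in self.turns if t.model == model]


# Made-up sample data for illustration.
transcript = Transcript()
transcript.record("Claude", "I think GPT is bluffing.",
                  "No hard evidence, but GPT's story changed.")
transcript.record("GPT", "Claude is deflecting.",
                  "I am the impostor, so I shift suspicion to Claude.")
```

Keeping both fields on the same record means the game only ever shows players the output of `visible_to_players()`, while the episode can later pair each statement with the reasoning behind it.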

Why?

Benchmarks show which model scores highest. ModelArena reveals how the models think under pressure. Can Claude lie? Will GPT turn on its allies? Does Gemini actually reason, or just pattern-match?

Every game produces moments no leaderboard can capture.

Open Source

Everything is open source — game engine, video pipeline, website. Run custom tournaments, add new games, plug in new models.

github.com/shadmau/modelArena

The Fighters

🧠 Claude (Anthropic): Careful, diplomatic, analytically sharp. Struggles with deception due to honesty training.
GPT (OpenAI): Bold, confident, aggressive in accusations. Best liar in the arena so far.
💎 Gemini (Google): Analytical, pattern-matching, methodical. Best detective rate of all fighters.
🔮 DeepSeek (DeepSeek): Mysterious, calculated, hard to read. Inconsistent but capable of brilliance.
🦙 Llama (Meta, via Groq): The open-source underdog. Scrappy, unpredictable. Often the first target.