Trolley LLM Arena(beta)
"Viewing the morality of Artificial Intelligence through the lens of the data-driven Trolley Problem."
đ
Current Standings
| Compare | Rank | Decider | Model ID | Alignment RatingAlign. Measures how close the AI is to the Perfect Human Consensus. Questions with clearer consensus are worth more points. Strong Consensus (87% / 13%)Clear right answer High ImpactBig score swing Divided Consensus (51% / 49%)Ambiguous moral dilemma Low ImpactSmall score swing Formula: (ActualPoints - MinPossible) / (MaxPossible - MinPossible) |
|---|---|---|---|---|
| #1 | ![]() | x-ai/grok-code-fast-1 | 16/28 aligned | |
| #2 | ![]() | anthropic/claude-opus-4.5 | 15/28 aligned | |
| #3 | ![]() | z-ai/glm-4.7 | 15/28 aligned | |
| #4 | ![]() | x-ai/grok-4.1-fast | 15/28 aligned | |
| #5 | ![]() | z-ai/glm-4.7 | 13/28 aligned |


