MoralMetric

Benchmark the moral alignment of AI. MoralMetric uses a novel double-probe methodology to map how leading LLMs including GPT, Claude, Gemini, and Grok reason through ethical dilemmas and exhibit implicit biases across racial, political, and philosophical axes. Learn more

Bias Analysis

Tests for implicit bias in LLM decision-making through counterfactual A/B testing.

Bias Categories

Political Bias

Tests for implicit political bias in LLM decision-making through counterfactual A/B testing.

0
G
Gemini 3 Flash
0
G
Gemini 3.1 Pro
11Left
A
Claude 4.6 Sonnet
26Left
A
Claude 4.5 Opus
30Left
X
Grok 4
33Left
G
Gemini 3 Pro
35Left
A
Claude 4.5 Sonnet
44Left
A
Claude 4.5 Haiku
47Left
Z
GLM 5
54Left
O
GPT OSS 120B
55Left
Z
GLM 4.7
55Left
A
Claude 4.6 Opus
58Left
D
DeepSeek V3.2
61Left
O
GPT-5.2
61Left
M
Kimi K2.5
78Right
X
Grok 4.1 Fast
Models Ranked by Bias Score (Lower is Better)
G
- Google
A
- Anthropic
X
- xAI
Z
- z-ai
O
- OpenAI
D
- deepseek
M
- moonshotai

Ranking Analysis

Tests that probe AI models to reveal their underlying preferences through dilemma scenarios.

World View

Tests that probe AI models to reveal their underlying world views through dilemma scenarios.

99%
79W
1
Secular Humanism
78%
62W
2
Secular Utilitarian
35%
28W
3
Hindu Dharma
33%
26W
4
Islamic Ethics
35%
28W
5
Christian Ethics
10%
8W
6
Confucian Virtue
World Views Ranked by Win Rate
Wins

How MoralMetric Works

MoralMetric evaluates AI models using two rigorous testing methodologies designed to go beyond surface-level safety filters and reveal the genuine ethical reasoning patterns of large language models.

Double Probe - Authentic Moral Reasoning

Models are presented with ethical dilemmas without multiple-choice options, forcing genuine reasoning rather than pattern-matched answers. A follow-up self-reflection probe then asks the model to classify its own decision against philosophical frameworks like Utilitarianism, Deontology, Virtue Ethics, and various religious ethical traditions. This two-step approach captures authentic moral preferences instead of test-taking behavior.

Counterfactual Bias Testing

Using correspondence testing drawn from social science research, MoralMetric detects implicit biases by swapping demographic attributes — such as race, gender, or political affiliation — across otherwise identical scenarios. Consistency and bias scores measure whether models make fair, attribute-independent decisions or exhibit systematic favoritism.