Blog
Conceptual explainers and essays on causal evaluation, LLM judges, and alignment.
Start Here
Your AI Metrics Are Lying to You
Why "You're absolutely right!" scored 9/10 but tanked user satisfaction by 18%. Zero math, just the core insight.
Read the flagship post →
Looking for theory?
Research papers with formal proofs and identification results
Looking for benchmark results?
Read the CJE empirical paper on arXiv
