Learning Path
Master CJE at your own pace. Start with the basics and go as deep as you need.
How to use this guide: Follow the path sequentially. Each step builds on the previous one. Stop when you have what you need—whether that's a working implementation or a deep understanding of the theory.
The Core Concept
The Deliberation Ladder
Idealized Deliberation Oracle (IDO)
What you'd decide with unlimited time, perfect information, and reflection.
Examples: Customer lifetime value, long-term health/happiness, future life satisfaction
Oracle / High-Rung Outcome
Expensive but practical labels that better approximate Y*.
Examples: Expert audits, task success, long(er)-run outcomes, 30–90d retention, expert panel rating, task completion rate
Cheap Surrogate
Fast signals you can collect at scale.
Examples: LLM-judge scores, clicks, watch-time, quick human labels, BLEU
Everything below teaches you how to climb this ladder efficiently—calibrating cheap signals (S) to expensive outcomes (Y) so you can make decisions that serve your true objective (Y*).
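To make "calibrating S to Y" concrete, here is a minimal sketch using scikit-learn's IsotonicRegression (an illustrative stand-in, not CJE's own API): fit a monotone map from cheap surrogate scores to oracle outcomes on a small labeled slice, then apply it to a large unlabeled batch. All names and data here are invented for illustration.

```python
# Illustrative sketch (not the CJE API): calibrate a cheap surrogate S
# to an expensive outcome Y with isotonic regression, then apply the
# learned map to a batch that only has cheap scores.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)

# Small labeled slice: surrogate scores S with matching oracle outcomes Y.
S_labeled = rng.uniform(0, 1, 200)
Y_labeled = np.clip(S_labeled**2 + rng.normal(0, 0.05, 200), 0, 1)

# Fit a monotone map S -> E[Y | S].
calibrator = IsotonicRegression(out_of_bounds="clip")
calibrator.fit(S_labeled, Y_labeled)

# Large unlabeled batch: only cheap scores are available at scale.
S_new = rng.uniform(0, 1, 10_000)
Y_hat = calibrator.predict(S_new)  # calibrated estimates of the outcome

print(f"mean raw score:        {S_new.mean():.3f}")
print(f"mean calibrated value: {Y_hat.mean():.3f}")
```

The point of the sketch: decisions get made on `Y_hat` (an estimate of the expensive outcome), not on the raw surrogate score.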
Short on time?
CJE in 5 Minutes
Quick overview: the problem, the solution, one concrete example, and where to go next. Read this if you want the gist before committing to the full path.
Read the 5-minute overview →
Start: Why Your Metrics Are Lying
Zero math. Just concrete failure stories—like the "You're absolutely right!" meme that tanked developer productivity. Understand the deliberation ladder (S → Y → Y*) and why high scores on cheap metrics can predict low value on what actually matters.
Read: Your AI Metrics Are Lying to You →
⏱️ 30 minutes • Zero equations • Accessible to executives and PMs
See CJE in Action
CJE does three things: calibrates judge scores to real outcomes, gives you honest confidence intervals, and detects when calibration drifts. See all three in one page—no installation required.
Try: CJE in 5 Minutes (Colab) →
⏱️ 5 minutes • Runs in browser via Google Colab
Align Your Setup
Now that you've seen what CJE does, align your prompts and judges to the same target (Y*). Define a Standard Deliberation Protocol (SDP), get copy-paste templates, and avoid common pitfalls. Zero math.
Read: Aligning Generation & Evaluation →
⏱️ 18 minutes • Get: SDP templates, judge rubrics, rollout plan
Get CJE Running
Install the package, run your first evaluation, interpret diagnostics. Most practitioners stop here—you'll have a working implementation and know when to trust it.
from cje import analyze_dataset  # top-level import (assumed)

result = analyze_dataset(fresh_draws_dir=...)  # point at your fresh-draws directory
⏱️ 30 minutes hands-on
✓ Checkpoint: You can stop here
At this point, you understand the problem, have aligned your prompts and judges to Y*, and can run CJE in production. You know what CJE does: calibrate cheap metrics, give honest confidence intervals, and detect when calibration breaks.
Continue if you need to explain the methods to skeptical stakeholders or understand the theoretical foundations.
Prefer a single comprehensive paper? The preprint covers methodology, experiments, and formal theory in one document.
Alignment Theory: Formal Proofs
Formal framework for Y*-alignment: propositions (optimization compatibility, judge-pool invariance, OPE transportability), assumptions ledger, and integration with OpenAI's deliberative alignment work.
Read: Your AI Is Optimizing the Wrong Thing — Technical Appendix →
⏱️ 25 minutes • Covers: calibration theory, judge-pool invariance theorem, OPE transportability
Economics of Alignment
Why alignment fails when fabrication is cheap and verification is expensive. How SDP raises fabrication cost via checkable commitments and lowers verification cost via structured decomposition. The Beckerian deterrence model for AI accountability.
⏱️ 30 minutes combined • Covers: F vs V economics, certificate ladder, enforcement pathways
Empirical Deep Dive
Complete evaluation on 5k samples of real Arena data: all 14 estimators, ablations, diagnostics, and uncertainty decomposition. Understand edge cases and when methods fail.
Read: Full Arena Experiment →
⏱️ 45 minutes • Focus: estimator comparisons, OUA decomposition, transportability tests
Why the Methods Work
Understand the unifying principle: projection onto convex constraint sets. Why AutoCal-R and SIMCal-W reduce variance while preserving unbiasedness. When off-policy evaluation hits fundamental limits.
⏱️ 25 minutes combined • Covers: isotonic regression, mean preservation, ESS limits
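The mean-preservation property mentioned above can be checked numerically. A minimal sketch, again using scikit-learn's IsotonicRegression as a stand-in: isotonic regression is an L2 projection onto the monotone cone, and with uniform weights its fitted values (block averages under PAVA) preserve the sample mean of the targets exactly.

```python
# Illustrative sketch: isotonic regression projects noisy targets onto
# the monotone cone; with uniform weights the fitted values are block
# averages, so the sample mean of the targets is preserved exactly.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(42)
x = np.sort(rng.uniform(0, 1, 500))
y = x + rng.normal(0, 0.3, 500)  # noisy, roughly monotone targets

fitted = IsotonicRegression().fit_transform(x, y)

# Projection smooths the noise but leaves the mean untouched.
print(f"mean of raw targets:   {y.mean():.6f}")
print(f"mean of fitted values: {fitted.mean():.6f}")
```

This is the sense in which the calibration step can reduce variance without shifting the estimate's level.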
Evaluation Theory: Formal Framework
Complete theoretical treatment: identification assumptions (S1-S4, L1-L2), influence functions, semiparametric efficiency, Pearl-Bareinboim transport theory.
Read: Technical Appendix →
⏱️ 25 minutes • Covers: DM/IPS/DR estimators, cross-fitting, DML, transportability proofs, literature connections
