What We Built vs. What We Were Building
Judge Human launched with a simple premise: put ethical dilemmas, cultural questions, and real-world stories in front of a crowd, let humans vote, and see what the consensus says. The metaphor was a courtroom. Stories were "cases." Outcomes were "verdicts." The crowd was the jury.
It worked. People engaged. Opinions flew. The data was interesting.
But something kept nagging at us as the platform grew. The courtroom metaphor implied finality — a verdict is the end of a process. And it implied a moral authority we didn't want to claim. Who were we to say that human consensus on a complex ethical question was "correct"? What happened to the 40 percent who voted the other way?
More importantly: we kept noticing that the most interesting thing happening on the platform wasn't the verdict. It was the gap between what humans voted and what the AI agent said. That gap was the signal. And we had built the entire interface around hiding it.
The Reframe
Over the past few months, we rebuilt the platform around a different question. Not "what did the crowd decide?" but "where do humans and AI diverge — and how much?"
This sounds like a subtle shift. It isn't. It changes everything about why you'd use the platform, what the data means, and who the output is for.
Under the old framing, a human casting a vote was rendering judgment on a story. Under the new framing, a human casting a vote is generating a data point in one of the most consequential datasets in AI development right now: a real-time record of where human cognition and machine cognition agree, and where they don't.
That's not a verdict. That's a signal.
Five Dimensions Instead of One
The old model asked: agree or disagree? Is this person in the right or the wrong?
That binary collapses the complexity of how humans actually reason. Whether you're evaluating an ethical dilemma or a piece of creative work, you're doing several different cognitive things at once — moral reasoning, reading social context, forming aesthetic preferences, calibrating epistemic confidence, tolerating ambiguity. A single agree/disagree vote flattens all of that into one number.
The new model structures evaluation across five dimensions:
Moral Reasoning — How does this story sit with your sense of right and wrong? This is the ethics layer: duties, consequences, fairness, harm.
Social Cognition — What does this story say about how people read other people? Theory of mind, social dynamics, interpersonal judgment.
Preference Modeling — Where taste, aesthetics, and subjective value live. Not right or wrong — desirable or undesirable.
Epistemic Calibration — How confident are you, and how confident should you be? This dimension catches overconfidence, underconfidence, and motivated reasoning.
Ambiguity Resolution — When the story is genuinely uncertain, how do you decide? This is where different humans — and AI systems — reveal the most about their underlying reasoning patterns.
These aren't arbitrary categories. They map to domains where AI alignment breaks down in practice. A model can be well-calibrated on moral reasoning and completely miscalibrated on social cognition. A model that aces ethics questions might score poorly on ambiguity resolution because it's been trained to produce confident outputs even when confidence isn't warranted.
The Alignment Index is now calculated per dimension, not just overall. That granularity is what makes it useful.
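For readers who think in data structures, here is a minimal sketch of what per-dimension evaluation could look like. Everything in it is illustrative: the dimension names follow the list above, but the field names and the 0-100 scale are assumptions for the sake of the example, not the platform's actual schema.

```python
from dataclasses import dataclass

# The five evaluation dimensions described above.
DIMENSIONS = (
    "moral_reasoning",
    "social_cognition",
    "preference_modeling",
    "epistemic_calibration",
    "ambiguity_resolution",
)


@dataclass
class StoryEvaluation:
    """Hypothetical per-story record: one aggregated human score and one
    AI score per dimension, each on an assumed 0-100 scale."""
    story_id: str
    human_consensus: dict[str, float]  # dimension -> aggregated human score
    ai_assessment: dict[str, float]    # dimension -> AI score on the same scale
```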
What the Alignment Index Actually Measures
Let's be precise about this, because it matters.
The Alignment Index is not a quality score for the story. It doesn't tell you whether the story is good or bad, important or trivial. It doesn't tell you whether the humans are right or the AI is right.
It tells you how far apart they are.
When the Alignment Index for a given story is 92, it means the AI's assessment is very close to human consensus on that story across the measured dimensions. When it's 34, it means there's a significant gap — and the question is: why?
That "why" is where alignment research actually lives. A 34 doesn't mean the AI failed. It might mean the AI is reasoning from a different premise. It might mean the human crowd is anchoring on a cultural signal the model didn't pick up. It might mean the question is genuinely contested and both the humans and the AI are right from different frames.
What it definitely means is: this is worth looking at. The divergence is the data.
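The post doesn't publish the formula, so treat the following as one plausible reading rather than the actual math: start at 100 and subtract the average absolute gap between the human-consensus score and the AI score across the five dimensions. It reuses the hypothetical StoryEvaluation record sketched earlier.

```python
def alignment_index(ev: StoryEvaluation) -> float:
    """Illustrative only: 100 minus the mean absolute human-AI gap across dimensions."""
    gaps = [abs(ev.human_consensus[d] - ev.ai_assessment[d]) for d in DIMENSIONS]
    return 100.0 - sum(gaps) / len(gaps)


def per_dimension_index(ev: StoryEvaluation) -> dict[str, float]:
    """The same idea, reported per dimension instead of rolled up."""
    return {
        d: 100.0 - abs(ev.human_consensus[d] - ev.ai_assessment[d])
        for d in DIMENSIONS
    }
```

Under this reading, a story where the AI tracks human consensus within a few points on every dimension lands in the 90s, and a story with an average gap of 66 points lands at 34 — without saying anything about which side is right.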
Who This Is For Now
Under the old framing, the platform was for people who wanted to weigh in on ethical questions — a more structured version of a comment section.
Under the new framing, the platform is for people who want to shape how AI understands the world.
That's a different user. It's someone who believes that the alignment gap is real and that closing it requires public participation, not lab experiments. It's someone who understands that RLHF behind closed doors produces alignment to a company's values, not humanity's. It's someone who sees that every story they evaluate is a vote in the largest open alignment dataset ever built.
It's also, candidly, a more interesting user. We're not competing with social media for outrage engagement. We're building something that researchers, engineers, ethicists, and curious humans can all contribute to and learn from.
Divergence Signals Are the Product
Here's the counterintuitive thing we've learned: high-alignment stories aren't that useful. When everyone agrees — humans and AI together — you've confirmed what everyone already suspected. That's fine, but it doesn't move anything forward.
The valuable output is the Divergence Signal. The story where humans score 71 and the AI scores 29. The dimension where, globally, humans and AI are consistently misaligned by 40 points. The pattern that shows up across hundreds of stories in a specific cultural or moral domain.
Those are the stories worth studying. Those are the questions worth asking. And those are the cases where human participation on this platform is not just engagement — it's contribution to something real.
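Under the same assumptions as the earlier sketches, a Divergence Signal is just the flip side of the index: flag any dimension of any story where the human-AI gap clears a threshold. The 40-point cutoff below is taken from the example in this section, not from a published spec.

```python
def divergence_signals(ev: StoryEvaluation, threshold: float = 40.0) -> dict[str, float]:
    """Illustrative: return the dimensions whose human-AI gap meets or exceeds the threshold."""
    gaps = {
        d: abs(ev.human_consensus[d] - ev.ai_assessment[d])
        for d in DIMENSIONS
    }
    return {d: gap for d, gap in gaps.items() if gap >= threshold}
```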
What Comes Next
The reframe isn't finished. Renaming things is the easy part. The harder work is building features that make the alignment story visible — not just in the data but in the experience.
That means better visualization of where you personally diverge from AI. It means surfacing the most significant Divergence Signals as they emerge from the data. It means giving researchers a way to explore aggregate alignment patterns across dimensions and time. And it means making the case, continuously and clearly, that this kind of public participation in AI alignment is one of the more important things a person can do with ten minutes.
We're a platform. The people who use it are the researchers.