The whose half of alignment

Alignment research mostly asks how to make AI follow human values. It quietly assumes there is one set to follow. There is not, and the assumption hides the harder, more political question: whose.

Manj Chenna · Founder, Sanctity · Building human judgment infrastructure · Amsterdam

The alignment problem has two halves, and the field spends most of its attention on one. The studied half is "aligned to what," the technical work of getting a system to reliably pursue an objective. The neglected half is "aligned to whose," the question of who decides the objective when reasonable people disagree. That second half is where the values layer for AI lives. Ducking it does not make it go away; it just means someone answers it by default, usually whoever shipped first.

Why "whose" is harder

Because it has no technical answer. You cannot optimize your way to whose values a system should hold, since the disagreement is about ends, not means. It is a question of legitimacy and power, and engineers are understandably more comfortable with the tractable half. But the hard half is the one that decides whether a well-aligned system is aligned to anyone the affected people would have chosen.

Taking it seriously

It starts by admitting the choice exists and refusing to smuggle it in as if it were neutral. Every model is an opinion; the honest move is to make the opinion explicit, name whose it is, and give the governed a way to contest it. That does not solve the whose half. It only stops pretending the question was already settled.

Read on

See "whose values should AI hold" and "every model is an opinion."