Diagnosis · Oversight

The rubber-stamp problem

A better model should make oversight easier. Instead it often makes it emptier. The more reliably the machine is right, the more reflexive the human's approval becomes, until the sign-off means nothing.

Manj Chenna · Founder, Sanctity · Building human judgment infrastructure · Amsterdam

Rubber-stamping is approving a machine's decision without changing it, and it is the quiet center of why human oversight is theater. The uncomfortable part is that it gets worse precisely as the technology gets better. When a model is wrong often, the human stays alert. When it is right almost always, staying alert stops paying off, and approval slides into reflex. Building real human judgment infrastructure means designing against that slide, not assuming a good model removes the need for it.

Why does it get worse as AI improves?

Because deference becomes rational. If the machine has been right a thousand times, doubting it on the thousand-and-first feels like wasted effort, and usually is. So the human learns to approve, and the skill of disagreeing fades from disuse. The rare error then arrives into a reviewer who is no longer equipped to catch it, and waves it through like all the rest. The better the model, the more dangerous the one mistake it does make, because nobody is watching for it anymore.

Is trusting a reliable model not just sensible?

On easy cases, yes. The honest counterpoint is real: most of the time the machine is right and approving is correct, and a system that forced theatrical scrutiny on every trivial call would waste everyone. The problem is not approval, it is unmeasured approval. If you cannot tell the difference between a human who agreed after looking and a human who never looked, you do not have oversight, you have a number near total approval and a hope.

What to do about it

Measure it. The Rubber-Stamp Rate turns the habit into a figure, and its companion, the Meaningful Override Rate, shows whether anyone is exercising command. Then design for a real no: meaningful human oversight means the reviewer has the time, the context, and the authority to disagree, and the system is built to expect that they sometimes will.

Read on

See the Rubber-Stamp Rate, the larger argument in Human Oversight Is Mostly Theater, and the failure underneath it, Accountability Inversion.