Metric · Quality of oversight
Rubber-Stamp Rate
Count the times a human was asked to approve a machine's decision and changed nothing. Divide by the times they were asked. The closer that fraction sits to 1, the more your oversight is a signature, not a judgment.
The Rubber-Stamp Rate is the share of decisions sent to a human that the human approves without changing anything. It is the simplest possible measure of oversight theater, and its bluntness is the feature. You do not need to read minds to compute it. You only need to count how often the human in the loop touches the outcome, and how often they just pass it through.
Of the decisions escalated to a human, the share approved with no change. A rate near total approval is a strong signal that the review is a formality.
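The count can be sketched in a few lines. This is a minimal illustration, assuming each escalated decision is logged with a boolean flag saying whether the reviewer changed the outcome; the record shape and the names `Review` and `rubber_stamp_rate` are hypothetical, chosen for this example rather than taken from any existing tool.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """One decision escalated to a human (hypothetical record shape)."""
    decision_id: str
    changed: bool  # True if the reviewer altered the outcome in any way


def rubber_stamp_rate(reviews):
    """Share of escalated decisions approved with no change.

    Returns None when nothing was escalated, because the rate is undefined.
    """
    total = len(reviews)
    if total == 0:
        return None
    unchanged = sum(1 for r in reviews if not r.changed)
    return unchanged / total


# Made-up sample: four escalations, one of which the reviewer changed.
sample = [
    Review("a1", changed=False),
    Review("a2", changed=False),
    Review("a3", changed=True),
    Review("a4", changed=False),
]
print(rubber_stamp_rate(sample))  # 3 of 4 unchanged -> 0.75
```

Only decisions that actually reached a human belong in the denominator; anything the machine resolved on its own is outside the metric.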
Why the name, and why it is fair
A rubber stamp is an old, exact image: an official mark applied without a fresh decision behind it. That is precisely the failure here: a human whose approval is reflexive. The rate does not assume bad faith. People rubber-stamp because the machine is usually right, the queue is long, and dissent is expensive. The number simply makes the habit visible, which is the first step to fixing it.
Read it with its opposite
On its own the Rubber-Stamp Rate can mislead: sometimes the machine really is right and approving is correct. That is why it travels with the Meaningful Override Rate, its mirror image. High rubber-stamping with near-zero meaningful overrides is the shape of theater. The two numbers together are hard to fake.
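Reading the pair side by side might look like the sketch below. It rests on an assumption this page does not spell out: that the Meaningful Override Rate is the share of escalated decisions where the human made a change judged meaningful. The `(changed, meaningful)` pair format, the function name `paired_rates`, and the sample log are all invented for illustration.

```python
def paired_rates(reviews):
    """Return (rubber_stamp_rate, meaningful_override_rate).

    `reviews` is a list of (changed, meaningful) boolean pairs, one per
    escalated decision. How "meaningful" gets judged is left to the caller;
    this is an assumed definition, not one given by the source.
    """
    total = len(reviews)
    if total == 0:
        return None
    stamped = sum(1 for changed, _ in reviews if not changed)
    overridden = sum(
        1 for changed, meaningful in reviews if changed and meaningful
    )
    return stamped / total, overridden / total


# Made-up log: 18 pass-throughs and 2 cosmetic edits, no meaningful overrides.
log = [(False, False)] * 18 + [(True, False)] * 2
print(paired_rates(log))  # (0.9, 0.0)
```

A high first number next to a second number of zero is the pattern the paragraph above calls theater. A high first number with a non-trivial second number suggests the reviewer does intervene when it matters.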
Read on
See its companion, the Meaningful Override Rate, and the larger argument in Human Oversight Is Mostly Theater.