Design · The stop

The kill switch nobody wants to need

Every serious system needs a way to stop. The trouble is that the stop control is the one feature nobody exercises, so it tends to exist on the diagram and fail in the moment, which is the only moment it matters.

Manj Chenna · Founder, Sanctity · Building human judgment infrastructure · Amsterdam

A kill switch is a way to halt an AI system, and one that someone has actually tested before the day they need it. The first half, having a way to halt, is easy. The second half, having tested it, is where it falls apart. Almost every system can claim a stop control. Far fewer have one that is reachable, fast, and proven to work under the conditions where you would reach for it. A stop you have never pulled is a hope, and hope is not part of human judgment infrastructure.

Why a stop matters

Because some failures unfold faster than a fix can be designed, and the only responsible move is to halt and reassess. An agent acting at scale can do a lot of damage in the time it takes to debug it. The kill switch buys the one thing you cannot otherwise get in a fast failure: a pause, so a human can take command.

Why most kill switches are theater

Because they are never tested, often not reachable by the people who would need them, and sometimes not even wired to actually stop the consequential part. A control that looks like a stop and has never stopped anything is worse than none, because it offers false comfort. A real one is exercised on a schedule, like a fire drill, so it works when it is not a drill.
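To make the fire-drill idea concrete, here is a minimal sketch in Python, assuming a single worker thread stands in for the agent. The names KillSwitch, run_agent, and drill are hypothetical and the timings are arbitrary; a real system would have its own stop path. The points it tries to illustrate are that the stop check sits directly in front of the consequential action, and that the drill pulls the real switch against a running worker and verifies that it halted in time.

```python
import threading
import time


class KillSwitch:
    """Hypothetical stop control: a shared flag the worker must check."""

    def __init__(self) -> None:
        self._stopped = threading.Event()

    def pull(self) -> None:
        self._stopped.set()

    def reset(self) -> None:
        self._stopped.clear()

    @property
    def pulled(self) -> bool:
        return self._stopped.is_set()


def run_agent(switch: KillSwitch, actions: list, step_seconds: float = 0.01) -> None:
    """Keep acting until the switch is pulled.

    The check guards the consequential step itself, not a side path.
    """
    while not switch.pulled:
        actions.append(time.monotonic())  # stand-in for a consequential action
        time.sleep(step_seconds)


def drill(switch: KillSwitch, max_halt_seconds: float = 0.5) -> bool:
    """Pull the switch on a running worker and report whether it really stopped."""
    actions: list = []
    worker = threading.Thread(target=run_agent, args=(switch, actions), daemon=True)
    worker.start()
    time.sleep(0.05)                       # let the worker do some real work first
    was_acting = len(actions) > 0          # a drill against an idle system proves nothing

    switch.pull()
    worker.join(timeout=max_halt_seconds)  # the stop has a time budget
    halted = not worker.is_alive()

    count_at_halt = len(actions)
    time.sleep(0.05)
    no_late_actions = len(actions) == count_at_halt

    switch.reset()                         # leave the control ready for real use
    return was_acting and halted and no_late_actions
```

Running drill(KillSwitch()) on a schedule, and treating a False result as an incident, is one way to keep the control from drifting into theater. The sketch does not cover reachability, meaning whether the people who would need the switch can actually pull it.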

Read on

See how to build oversight that holds and a recall for a decision system.