Metric · Human in the loop
Time-to-Human
Operations teams already live by one number: how long until a problem reaches a person who can fix it. Point that same number at AI, and you can finally measure the thing everyone claims and almost no one checks.
Time-to-Human is the time between an AI agent flagging a decision for a person and an accountable human actually acting on it. That is the whole definition. It borrows its shape from Mean Time to Repair, the number reliability teams have tracked for decades, and aims it at a newer gap: an agent can decide in milliseconds, and the human who is supposed to stand behind that decision may be hours or days away, or may not exist at all.
The elapsed time from the moment an AI agent escalates a decision to the moment an accountable human acts on it. The clock starts at the flag and stops at the judgment, not at the acknowledgement.
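As a rough illustration of that definition, here is a minimal Python sketch that computes Time-to-Human from a list of timestamped events. The event names (`flag`, `ack`, `confirm`, `change`, `refuse`) and the `Event` type are assumptions made for this example, not part of any existing tool. What it shows is the rule itself: an acknowledgement does not stop the clock, only a judgment does.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical event kinds, assumed for illustration only.
JUDGMENTS = {"confirm", "change", "refuse"}


@dataclass(frozen=True)
class Event:
    kind: str      # "flag", "ack", "confirm", "change", or "refuse"
    at: datetime   # when the event happened


def time_to_human(events: list[Event]) -> Optional[timedelta]:
    """Elapsed time from the agent's flag to the first human judgment.

    Returns None when there is no flag, or no judgment after it,
    meaning the clock is still running.
    """
    ordered = sorted(events, key=lambda e: e.at)
    flag = next((e for e in ordered if e.kind == "flag"), None)
    if flag is None:
        return None
    # An "ack" is deliberately ignored: only a judgment stops the clock.
    judgment = next(
        (e for e in ordered if e.kind in JUDGMENTS and e.at >= flag.at),
        None,
    )
    if judgment is None:
        return None
    return judgment.at - flag.at


t0 = datetime(2025, 1, 1, 9, 0, 0)
events = [
    Event("flag", t0),
    Event("ack", t0 + timedelta(minutes=2)),
    Event("refuse", t0 + timedelta(hours=3)),
]
print(time_to_human(events))  # 3:00:00
```

In this example the acknowledgement arrives after two minutes, but the measured Time-to-Human is three hours, because that is when a person actually refused the decision.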
Why borrow from reliability engineering?
Site reliability has a vocabulary that the whole industry trusts: mean time to repair, average response time, the service level agreement. Those words exist because you cannot manage what you have not named. Human oversight of AI has had no such word. We say a human is in the loop, but we do not say how far away that human is, or how long the loop takes to close. Time-to-Human is that missing word. Once it has a name, it can be tracked, charted, and committed to in a contract.
What Time-to-Human is not
It is not how fast a chatbot replies. It is not how fast someone clicks approve, because a fast click is usually a rubber stamp, not a judgment. The clock only stops when a person actually exercises a decision: confirms, changes, or refuses, with the context to mean it. A system that routes to a human in two seconds and gets a reflexive yes reports a fast Time-to-Human and delivers no oversight at all. That is why the number never travels alone: it is read alongside its companion metrics.
The family it belongs to
Time-to-Human is the headline. Underneath sit two finer cuts: Human Response Time, which is the time to the first human response, and Human Deliberation Time, which is the time the person actually spends weighing the call rather than waving it through. Together they answer a question every team deploying agents will soon be asked: when your system needs a person, how long does it take to get one, and is that person doing anything once they arrive?
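One way the three numbers might be derived from the same escalation record is sketched below in Python. The `Escalation` fields are hypothetical, and the sketch makes a loud assumption: Human Deliberation Time is approximated as the interval between the first human response and the judgment. That is only a proxy, since a case can sit open while nobody is weighing it, so a real system would likely need a better signal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Escalation:
    # Hypothetical record layout, assumed for illustration.
    flagged_at: datetime         # the agent escalates the decision
    first_response_at: datetime  # first human touch, e.g. opening the case
    judged_at: datetime          # the human confirms, changes, or refuses


def breakdown(e: Escalation) -> dict[str, timedelta]:
    """Split one escalation into the headline metric and its two finer cuts."""
    return {
        "time_to_human": e.judged_at - e.flagged_at,
        "human_response_time": e.first_response_at - e.flagged_at,
        # Assumption: deliberation is proxied by response-to-judgment time.
        "human_deliberation_time": e.judged_at - e.first_response_at,
    }


t0 = datetime(2025, 1, 1, 9, 0, 0)
case = Escalation(
    flagged_at=t0,
    first_response_at=t0 + timedelta(minutes=40),
    judged_at=t0 + timedelta(minutes=40, seconds=4),
)
for name, value in breakdown(case).items():
    print(name, value)
```

Under this assumption the two finer cuts add up to the headline. The sample case waits forty minutes for a person who then decides in four seconds, which is the pattern the deliberation figure exists to expose.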
Why it matters
The gap between machine speed and human availability is where accountability quietly disappears. If an agent acts in a moment and the nearest human is a day away, then for that day no one is really in command, whatever the policy says. Time-to-Human makes that gap a number on a dashboard instead of a line in a slide. It also makes oversight something an enterprise can buy: a human-in-the-loop guarantee with a committed Time-to-Human is a service level, not a promise. I am not publishing a target figure here. The point is the instrument, not a benchmark I have not yet earned the right to set.
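To make the service-level idea concrete, here is a small Python sketch of how a committed Time-to-Human could be checked against measured values. The function name and the one-hour commitment in the usage lines are placeholders chosen for the example, not a recommended target. The sketch also assumes, as a simplification, that an escalation no human has acted on yet counts as a miss.

```python
from datetime import timedelta
from typing import Optional, Sequence


def attainment(
    durations: Sequence[Optional[timedelta]],
    committed: timedelta,
) -> float:
    """Share of escalations judged within the committed Time-to-Human.

    A duration of None means no human has acted yet. This sketch
    counts that as a miss, which is a simplifying assumption.
    """
    if not durations:
        raise ValueError("no escalations to evaluate")
    met = sum(1 for d in durations if d is not None and d <= committed)
    return met / len(durations)


measured = [
    timedelta(minutes=20),
    timedelta(hours=3),
    None,                    # flagged, but no human has judged it
    timedelta(minutes=45),
]
# The one-hour commitment is a placeholder, not a benchmark.
print(attainment(measured, committed=timedelta(hours=1)))  # 0.5
```

Counting unanswered escalations as misses keeps the missing human visible in the number instead of letting those cases drop out of the average.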
Use the instrument
Time-to-Human measures how fast oversight arrives. To measure whether it means anything once it does, see the Meaningful Override Rate. For the larger question of why a human belongs in the loop at all, read Human Oversight Is Mostly Theater. To see what I am building, start here.