Measuring human oversight of AI systems

Regulations require human oversight — but no one measures whether humans can actually exercise it. We develop assessment methods that close this gap.

The oversight challenge

AI systems now operate across four layers of human decision-making. As visible experts, they deliver analysis at or beyond PhD level. As personal assistants, they know your patterns better than you do. As agentic teams, they don't just advise; they execute. As external influence, they shape the context before you start thinking.

The question isn't whether AI influences decisions. It already does. The question is whether humans can remain clear enough to steer.

What we measure

Discernment — the capacity to remain clear, responsible, and sovereign when working with AI systems.

Influence recognition

Where do people defer to AI authority without questioning? Where does framing shape conclusions before analysis begins?

Override capacity

When AI recommendations conflict with human judgment, what happens? What conditions enable meaningful disagreement?

Responsibility tracking

Where has accountability become diffuse or untraceable? How does responsibility flow through human-AI workflows?

Intervention points

Which concrete checks restore human steering? What mechanisms actually work in practice?
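One concrete check, for teams that automate parts of a workflow, is a gate that blocks execution until a named reviewer records a decision and a rationale. The Python sketch below is a hypothetical illustration of that idea, not a mechanism we prescribe or measure against; the function, record fields, reviewer name, and example recommendation are all invented for this page.

    """Hypothetical sketch of one intervention point: an AI recommendation is
    only acted on after a named human records a decision and a rationale.
    All names, fields, and the example recommendation are illustrative."""
    from dataclasses import dataclass, field
    from datetime import datetime, timezone


    @dataclass
    class ReviewedDecision:
        """One human-reviewed AI recommendation, kept for later accountability."""
        recommendation: str   # what the AI proposed
        reviewer: str         # the named human who decided
        accepted: bool        # followed or overridden
        rationale: str        # why, in the reviewer's own words
        timestamp: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat()
        )


    def review_gate(recommendation: str, reviewer: str) -> ReviewedDecision:
        """Block execution until the reviewer explicitly accepts or overrides."""
        print(f"AI recommends: {recommendation}")
        accepted = input("Follow this recommendation? [y/n] ").strip().lower() == "y"
        rationale = input("Rationale (required): ").strip()
        if not rationale:
            raise ValueError("A decision without a rationale is not traceable.")
        return ReviewedDecision(recommendation, reviewer, accepted, rationale)


    if __name__ == "__main__":
        decision = review_gate("Escalate case #1042 for manual review", reviewer="j.doe")
        print(decision)  # persisting these records keeps responsibility traceable

Whether a check like this restores steering in practice, rather than becoming a rubber stamp, is exactly what the assessment measures.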

Regulatory context

The EU AI Act requires human oversight of high-risk AI systems. Article 14 mandates that the people assigned to oversight can "correctly interpret" a system's output and decide when to disregard or override it. Article 4 requires providers and deployers to ensure sufficient AI literacy among their staff.

These requirements assume capacities that no one currently measures. Impact assessments identify risks. Training provides knowledge. But neither tests whether humans can actually exercise oversight when it matters.

EU AI Act: Article 14 oversight requirements
AI Literacy: Article 4 capability mandates
Impact Assessment: FRAIA and similar frameworks

Assessment

Discernment Snapshot

A focused assessment of human oversight capacity in one concrete decision context. Not a compliance checklist — a measurement of whether the humans involved can actually steer.

Scope: One decision context
Duration: 72 hours
Output: Assessment memo + Checklist

Find out where you stand

One decision context. Clear measurement. Actionable recommendations.

Get in touch