IDEA in depth — the four gates
IDEA follows an Engagement through four conceptual gates, taken in sequence. This page sets out, for each gate, the question it answers and the evidence an Accredited Assayer looks for before the Engagement may advance.
Is the right thing being built, against measurable success criteria?
The first gate asks whether the work has been defined well enough to be tested at all. An Engagement that cannot say what good looks like cannot be assayed, because there is nothing against which to measure it. Specification therefore precedes any assessment of the build.
An Assayer looks for a stated problem, a defined population of cases the system will act on, and success criteria expressed as numbers rather than intentions — a resolution-time target, an accuracy threshold, a tolerance for error and a statement of who bears it. The Assayer looks, too, for what the system is not for: the cases it must refuse, escalate, or leave to a person. Where success criteria are absent, vague, or written after the build, the gate does not pass.
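By way of illustration only, the evidence described above can be set down as a small record and checked for gaps. The sketch below is not part of the Standard: the names `Criterion`, `Specification` and `specification_gaps` are hypothetical, and the rule that criteria must be dated no later than the start of the build is an assumed reading of "written after the build".

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Criterion:
    """One success criterion stated as a number (hypothetical structure)."""
    name: str          # e.g. "accuracy" or "resolution_time_minutes"
    threshold: float   # the figure the system is to be measured against


@dataclass(frozen=True)
class Specification:
    """Illustrative record of what an Engagement says it is building."""
    problem: str
    population: str                     # the cases the system will act on
    criteria: Tuple[Criterion, ...]
    error_bearer: str                   # who bears the cost of an error
    out_of_scope: Tuple[str, ...]       # cases to refuse, escalate, or leave to a person
    criteria_fixed_on: Optional[date]   # when the criteria were written down
    build_started_on: Optional[date]    # when the build began


def specification_gaps(spec: Specification) -> List[str]:
    """Return the reasons this sketch would not pass; an empty list means none found."""
    gaps = []
    if not spec.problem.strip():
        gaps.append("no stated problem")
    if not spec.population.strip():
        gaps.append("no defined population of cases")
    if not spec.criteria:
        gaps.append("no numeric success criteria")
    if not spec.error_bearer.strip():
        gaps.append("no statement of who bears error")
    if not spec.out_of_scope:
        gaps.append("no statement of what the system is not for")
    if spec.criteria_fixed_on is None:
        gaps.append("criteria are undated")
    elif (spec.build_started_on is not None
          and spec.criteria_fixed_on > spec.build_started_on):
        # Assumption: "written after the build" is read as "dated after the build began".
        gaps.append("criteria written after the build began")
    return gaps
```

An empty list from `specification_gaps` means only that the sketch found nothing missing; whether the criteria are the right ones remains a matter for the Assayer's judgement.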
Is it being built correctly, on fit data, with governance in place?
The second gate asks whether the work has been built in such a way that the result can be relied upon. A system that meets its thresholds on a test set but was trained on unfit data, or built without a record of the decisions taken, is not a system a buyer can stand behind.
An Assayer examines the provenance and fitness of the data the system was built on — whether it represents the cases the system will meet, whether its use was permitted, and whether known gaps are documented. The Assayer looks for governance proportionate to the system’s consequence: change control, access control, a record of model and data versions, and a named owner. The question is not whether the work is sophisticated but whether it is accountable.
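One part of this examination, whether known gaps in the data are documented, lends itself to a simple check. The sketch below is an assumption about how such a check might be written, not a procedure the Standard prescribes; the function name `undocumented_gaps` and the idea of labelling cases by "case type" are hypothetical.

```python
from typing import Iterable, List


def undocumented_gaps(
    population_case_types: Iterable[str],
    data_case_types: Iterable[str],
    documented_gaps: Iterable[str],
) -> List[str]:
    """List case types the system will meet that the build data lacks
    and that nobody has recorded as a known gap."""
    # Case types in the defined population that never appear in the build data.
    missing = set(population_case_types) - set(data_case_types)
    # A gap that has been written down is known; only the rest are reported.
    return sorted(missing - set(documented_gaps))
```

A non-empty result points to cases the system will meet in use but was neither built on nor warned about, which is the kind of finding an Assayer would want explained.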
Does it meet the agreed thresholds, independently verified?
The third gate is where the independent test is struck. The success criteria fixed at Specification are now measured, by the Assayer rather than by the team that built the work, on cases the build has not been tuned against.
An Assayer assembles or reviews a held-out evaluation that reflects the real distribution of cases, including the difficult and the adversarial, and measures the system against the agreed thresholds. Calibration, stability under a shifting case mix, and behaviour at the edges — where the system should refuse or escalate — are tested explicitly. The Assayer records whether each threshold was met, and whether it was met on first submission or after the work was returned for correction. This is the gate at which an Engagement most often fails, and that is by design.
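The recording step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `Threshold` and `assay` are hypothetical names, the measured figures are taken to come from a held-out evaluation run elsewhere, and a threshold with no measurement is treated as not met.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Threshold:
    """An agreed threshold fixed at Specification (hypothetical structure)."""
    name: str
    limit: float
    higher_is_better: bool = True   # False for error rates, latencies and the like

    def met_by(self, measured: float) -> bool:
        if self.higher_is_better:
            return measured >= self.limit
        return measured <= self.limit


def assay(thresholds: Iterable[Threshold],
          measured: Dict[str, float],
          submission: int = 1) -> dict:
    """Record, per threshold, whether it was met, and on which submission."""
    results = {}
    for t in thresholds:
        # Assumption: a threshold that was never measured counts as not met.
        results[t.name] = t.name in measured and t.met_by(measured[t.name])
    return {
        "results": results,
        # With no thresholds at all there is nothing to pass.
        "passed": bool(results) and all(results.values()),
        "first_submission": submission == 1,
    }
```

Keeping the per-threshold results, and not only the overall verdict, preserves the distinction the Assayer is asked to record between work that passed on first submission and work that passed after correction.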
Does it continue to perform in production, over time?
The fourth gate asks whether the work goes on doing what it was certified to do once it is in use. A system that passed its acceptance test and then drifted, degraded, or was quietly changed has not held to the standard, however well it began.
An Assayer looks for monitoring that would detect drift in the system’s inputs and its performance, thresholds at which a human is alerted or the system is withdrawn, and a record that those thresholds are observed in practice rather than only declared. The Assayer looks for a route by which the Engagement returns for re-assay when the system, its data, or its purpose materially changes. A Hallmark is fixed at the date it is struck; the Operation gate is what keeps it honest thereafter.
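As one illustration of such monitoring, input drift can be summarised with a population stability index (PSI) and compared against an alert limit and a withdrawal limit. The Standard does not name PSI or any other measure; it is swapped in here as a common choice, and the function names, the small floor `eps`, and the limits passed to `operation_action` are all assumptions an Engagement would need to set for itself.

```python
import math
from typing import Sequence


def population_stability_index(expected_counts: Sequence[float],
                               actual_counts: Sequence[float],
                               eps: float = 1e-6) -> float:
    """PSI between a reference distribution and a production one, over shared bins."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both distributions must use the same bins")
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    if e_total <= 0 or a_total <= 0:
        raise ValueError("each distribution needs at least one observation")
    psi = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Floor each proportion at eps so an empty bin does not produce log(0).
        e_p = max(e / e_total, eps)
        a_p = max(a / a_total, eps)
        psi += (a_p - e_p) * math.log(a_p / e_p)
    return psi


def operation_action(drift: float, alert_at: float, withdraw_at: float) -> str:
    """Map a drift figure to an action; the limits are the Engagement's own."""
    if drift >= withdraw_at:
        return "withdraw"
    if drift >= alert_at:
        return "alert"
    return "ok"
```

A log of the value returned by `operation_action` at each check, alongside what was actually done, is one way of showing that the thresholds are observed in practice and not only declared.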
The four gates are the frame in which IDEA is read; in operation they are recorded as seven escrow gates, numbered nought to six. An Engagement that passes all of them is struck with a Hallmark and entered on the Register.