Across two universal banks and one specialty insurer, we have now taken generative AI systems into production through second-line model risk review. None has failed first review. The pattern is not subtle: documentation wins.
What second-line teams will accept
Model risk teams do not object to generative AI. They object to systems they cannot describe to their regulator. What separates an accepted system from a rejected one is the artefact set submitted with it.
Five artefacts pass review reliably: a written model description with intended use and limitations, a documented training and fine-tuning lineage, an evaluation harness with regression behaviour over time, an explicit residual-risk register, and an operating runbook with kill-switch criteria.
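One way to make the artefact set enforceable is a pre-submission gate that refuses to open a review until all five are referenced. The sketch below is illustrative only: the artefact keys and the function name `missing_artefacts` are our own hypothetical labels, not a regulatory or vendor schema, and a real gate would also check the content of each artefact, not just its presence.

```python
# Hypothetical keys mirroring the five artefacts listed above.
REQUIRED_ARTEFACTS = (
    "model_description",       # intended use and limitations
    "training_lineage",        # training and fine-tuning lineage
    "evaluation_harness",      # with regression behaviour over time
    "residual_risk_register",
    "operating_runbook",       # including kill-switch criteria
)


def missing_artefacts(submitted: dict[str, str]) -> list[str]:
    """Return the required artefacts that are absent or have a blank reference.

    `submitted` maps an artefact key to a document reference (path, URL, or ID).
    """
    return [
        name
        for name in REQUIRED_ARTEFACTS
        if not submitted.get(name, "").strip()
    ]


if __name__ == "__main__":
    pack = {
        "model_description": "docs/model-description.md",
        "training_lineage": "docs/lineage.md",
        "evaluation_harness": "",  # blank reference counts as missing
    }
    print(missing_artefacts(pack))
```

A check of this kind only shows that the pack is complete. Whether each artefact is adequate remains a judgement for the reviewers.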
What second-line teams reject
Three things consistently fail: black-box vendor stacks where lineage cannot be evidenced, evaluation harnesses that are not version-controlled or cannot reproduce historical scores, and rollouts where the kill-switch is theoretical rather than tested.
The third is the most common failure mode and the most embarrassing one. Kill-switches that have never been exercised do not exist.
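The distinction between a theoretical and a tested kill-switch can be expressed as a simple evidence check: a switch counts only if a drill has been run, succeeded, and is recent. The sketch below assumes a 90-day maximum drill age, which is our illustrative figure and not a regulatory requirement; the function and parameter names are likewise hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption for illustration: a drill older than 90 days no longer counts as evidence.
MAX_DRILL_AGE = timedelta(days=90)


def kill_switch_is_evidenced(
    last_drill: Optional[date],
    drill_succeeded: bool,
    today: date,
) -> bool:
    """Return True only if the kill-switch was exercised successfully and recently."""
    if last_drill is None:
        # Never exercised: treated as if the kill-switch does not exist.
        return False
    age = today - last_drill
    # A drill dated in the future is a data error, so it does not count either.
    return drill_succeeded and timedelta(0) <= age <= MAX_DRILL_AGE


if __name__ == "__main__":
    print(kill_switch_is_evidenced(None, True, date(2024, 6, 1)))
    print(kill_switch_is_evidenced(date(2024, 5, 1), True, date(2024, 6, 1)))
```

The design choice worth noting is that a missing drill date and a failed drill produce the same answer: in both cases there is nothing to show a reviewer.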
