Benchmarks test capability.
We test character.

Models have behavioral patterns that only show up under pressure, at the edges, and when conditions change. Clawbotomy finds them before your users do.