Topic
Production reliability
What separates a working demo from a system that runs unattended: how reliability is engineered, and what the agent-capability curve says is automatable now.
16 June 2026
What the agent-reliability curve says about which workflows are automatable now
A mid-2026 reading of the METR task-length curve, turned into a workflow-selection rule for operators deciding what to automate this quarter and what to wait on.
4 June 2026
How production reliability gets engineered, and why a demo is not evidence of it
A pilot at eighty percent on a clean slice tells you almost nothing about whether the workflow runs unattended on Monday. This piece covers the machinery that closes the gap, with named benchmark numbers.