New METR benchmarks put Claude Opus 4.6 at a 14.5-hour task horizon, with the task horizon of AI models now doubling roughly every four months.
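
To make the doubling figure concrete, here is a minimal sketch of what a four-month doubling time would imply if it held steady. It reads the claim as the task horizon itself doubling, and it assumes a constant exponential trend, which the benchmark summary above does not promise. The function name `projected_horizon_hours` and its parameters are hypothetical, chosen only for illustration.

```python
def projected_horizon_hours(
    start_hours: float,
    months_elapsed: float,
    doubling_months: float = 4.0,
) -> float:
    """Extrapolate a task horizon under a constant doubling time.

    Assumption: growth is purely exponential, so the horizon is
    start_hours * 2 ** (months_elapsed / doubling_months).
    """
    return start_hours * 2 ** (months_elapsed / doubling_months)


# Starting from the reported 14.5-hour horizon, step forward one
# doubling period at a time for a year.
for months in (0, 4, 8, 12):
    hours = projected_horizon_hours(14.5, months)
    print(f"{months:>2} months: {hours:.1f} hours")
```

Under those assumptions, three doublings in twelve months amount to an eightfold increase per year, taking 14.5 hours to 116 hours. This is an extrapolation of the stated rate, not a measured result.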