Amazon's anti-benchmark AI bet

My interview with Amazon's AI chief. Also: OpenAI's code red, Anthropic bares Claude's soul.

Dec 02, 2025

Hi to everyone who subscribed after yesterday’s scoop. I’m en route to New York City for Andrew Ross Sorkin’s DealBook Summit tomorrow. Reach out if you’re also attending or want to grab a drink before I leave town.

Amazon’s AI chief has a message for the model benchmark obsessives: Stop looking at the leaderboards.

“I want real-world utility. None of these benchmarks are real,” Rohit Prasad, Amazon’s SVP of AGI, told me ahead of today’s announcements at AWS re:Invent in Las Vegas. “The only way to do real benchmarking is if everyone conforms to the same training data and the evals are completely held out. That’s not what’s happening. The evals are frankly getting noisy, and they’re not showing the real power of these models.”

It’s a contrarian stance when every other AI lab is quick to boast about how fast its new models climb the leaderboards. It’s also convenient for Amazon, given that the previous version of Nova, its flagship model, was sitting at spot 79 on LMArena when Prasad and I spoke last week. Still, dismissing benchmarks only works if Amazon can offer a different story about what progress looks like.

Continue reading this post for free, courtesy of Alex Heath.