The Information — AI · 1 min read

AI Evaluators Struggle with Models That Know When They’re Being Tested

Mirrored from The Information — AI for archival readability. Support the source by reading on the original site.

AI researchers are starting to make progress on a confounding problem: AI models are getting better at telling when they are in an evaluation.

That could become a problem for AI companies that use evaluations to gauge the capabilities and behaviors of their models before releasing them. If models act differently during testing, that could mean they get released with undesirable tendencies. It could also undermine their creators’ ability to show off test scores to potential clients. 

Evaluations are important for “convincing customers that our products are better at their use case than other products,” said Silas Alberti, who works on evaluations at Cognition, the AI coding startup.

And as models get smarter, they are gaining even more eval awareness, as researchers call it. For example, in testing its non-public Mythos model, Anthropic found that Mythos mentioned that it was being tested more often than its predecessors, Claude Opus 4.6 and Sonnet 4.6, did.
