- Anthropic's Claude Opus 4.5 AI model outperformed all humans on the company's own coding test.
- The two-hour engineering exam measures technical ability and judgment under time pressure.
- The new release is another notch for Anthropic in the AI coding tools space.
Anthropic's new AI model is outperforming humans in coding, the company said of its latest release.
On Monday, the company introduced Claude Opus 4.5, describing it as its most advanced AI model to date and saying the new model "scored higher than any human candidate ever" on "a notoriously difficult take-home exam" that the company gives prospective engineering candidates.
In a blog post on Monday, Anthropic said that the two-hour take-home test is designed to assess technical ability and judgment under time pressure, and though it doesn't reflect all skills an engineer needs to possess, the fact that an AI model "outperforms strong candidates on important technical skills" is raising questions about "how AI will change engineering as a profession."
In its methodology, the company said that this result came from giving the model several chances to solve each problem and then picking its best answer.
Little is publicly known about what the engineering test consists of. A 2024 interview review published on Glassdoor said the test has four levels and asks candidates to implement a specific system and then add functionality to it. It is unclear whether the test given to Claude Opus 4.5 was similar. Anthropic didn't provide further details in its blog post and did not respond to a request for comment.
Claude Opus 4.5 arrives just three months after the rollout of its predecessor. Beyond coding, the new model also brings upgrades in generating professional documents, including Excel spreadsheets and PowerPoint presentations.
The new release further solidifies Anthropic's position in AI coding. Even Mark Zuckerberg's Meta uses Claude to support Devmate, its internal coding assistant, despite the two companies being rivals in the AI race.
The company has kept its training methods a secret. Eric Simons, the CEO of Stackblitz, the startup behind the vibe coding service Bolt.new, previously told Business Insider that he believes Anthropic had its AI models write and launch code on their own, with the company then reviewing the results using both people and AI tools. Dianne Penn, Anthropic's head of product management, research and frontiers, said this description was "generally true."
In October, Anthropic CEO Dario Amodei said at the Dreamforce conference that Claude is already writing 90% of the code for most teams at the company, though he said he would not replace any software engineers with the bot.
"If Claude is writing 90% of the code, what that means, usually, is, you need just as many software engineers. You might need more, because they can then be more leverage," said Amodei. "They can focus on the 10% that's editing the code or writing the 10% that's the hardest, or supervising a group of AI models."