Anthropic Reverses Controversial Policy on AI Model Usage Amid Backlash
Anthropic is backing down on a policy that would have secretly prevented competitors from using its new AI model, Claude Fable 5, to develop other AI models. The company changed course after the move sparked significant backlash from the AI research community.
“We are changing Fable 5’s safeguards for the groundbreaking LLM development to make it visible,” Anthropic said in a statement to WIRED. “We made the wrong compromise and apologize for not finding the right balance.”
Earlier this week, Anthropic released Claude Fable 5, a version of its latest AI model with additional security guardrails designed to prevent abuse. Some of the protections Anthropic opted for were not surprising: The company said it would redirect users who asked questions about cybersecurity, biology or chemistry to a less powerful AI model to reduce the likelihood that someone would use the advanced AI to carry out a cyberattack or build a bioweapon.
Controversial Degradation of Model Performance
But for researchers trying to use Claude Fable 5 for groundbreaking AI development, Anthropic outlined a different approach. The company would intentionally degrade the performance of the model in a way that is invisible to the user. The move would effectively sabotage researchers trying to use Claude to train competing AI models, something Anthropic explicitly forbids in its terms of service.
Anthropic now says there is a change of course and that Claude Fable 5’s safeguards for AI development will be visible to users. If the company suspects that a user is trying to use Claude to build a high-performance AI, it is advised that it will either deny the request or redirect the user to a less powerful model.
Backlash from the AI Research Community
Anthropic reversed the policy after receiving severe backlash from the AI research community. Anthropic has already taken steps to prevent competitors from using Claude to build closed and open source AI models. But critics say the silent degradation in model performance was a step too far for certain users. Claude’s coding agent has become a popular tool among developers, including those working on open source AI research projects, and researchers tell WIRED that the company’s latest policy could have led to a troubling future in which only a handful of leading AI labs could conduct advanced AI research.
Dean Ball, a senior fellow at the Foundation for American Innovation and a former White House adviser on AI, wrote in a post on In another post, he continued that the “secret sabotage” policy undermines Anthropic’s overall stance because it prevents AI researchers from collaborating on AI security.
“It felt like Anthropic was saying to the public, ‘We don’t trust anyone else to do AI research. We’re the only ones who have to do AI research,'” says Will Brown, head of research at open-source AI startup Prime Intellect. “It feels a bit like they’re starting to pull the ladder up behind them.”
Brown said the policy also left developers in the dark about whether they were violating Anthropic’s rules, since the company wouldn’t warn them if its security measures were triggered. He added that the restrictions could have had far-reaching consequences. He pointed, for example, to the growing ecosystem of third-party evaluation firms that test frontier models for safety, performance and reliability – work that could have been hampered if Anthropic had secretly downgraded its model.
For further details, you can read the full article Here.
“`

