Anthropic ditches its core safety promise in the middle of an AI red line fight with the Pentagon

Pictured is Anthropic cofounder and CEO Dario Amodei. (Chance Yeh/Getty Images via CNN Newsource)

By Clare Duffy, Lisa Eadicicco, CNN

(CNN) — Anthropic, a company founded by OpenAI exiles worried about the dangers of AI, is loosening its core safety principle in response to competition.

Instead of self-imposed guardrails constraining its development of AI models, Anthropic is adopting a nonbinding safety framework that it says can and will change.

In a blog post Tuesday outlining its new policy, Anthropic said shortcomings in its two-year-old Responsible Scaling Policy could hinder its ability to compete in a rapidly growing AI market.

The announcement is surprising, because Anthropic has described itself as the AI company with a “soul.” It also comes the same week that Anthropic is fighting a significant battle with the Pentagon over AI red lines.

The policy change is separate from and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter. Defense Secretary Pete Hegseth gave Anthropic CEO Dario Amodei an ultimatum on Tuesday: roll back the company’s AI safeguards or risk losing a $200 million Pentagon contract. The Pentagon also threatened to put Anthropic on what is effectively a government blacklist.

But the company said in its blog post that its previous safety policy was designed to build industry consensus around mitigating AI risks – guardrails that the industry blew through. Anthropic also noted its safety policy was out of step with Washington’s current anti-regulatory political climate.

Anthropic’s previous policy stipulated that it should pause training more powerful models if their capabilities outstripped the company’s ability to control them and ensure their safety — a measure that’s been removed in the new policy. Anthropic argued that responsible AI developers pausing growth while less careful actors plowed ahead could “result in a world that is less safe.”

As part of the new policy, Anthropic said it will separate its own safety plans from its recommendations for the AI industry.

Anthropic wrote that it had hoped its original safety principles “would encourage other AI companies to introduce similar policies. This is the idea of a ‘race to the top’ (the converse of a ‘race to the bottom’), in which different industry players are incentivized to improve, rather than weaken, their models’ safeguards and their overall safety posture.”

The company now suggests that hasn’t played out.

In a statement to CNN, an Anthropic spokesperson described the updated policy as “the strongest to date on the level of public accountability and transparency.”

“We’ve gone a significant step further from our prior policies by committing to publicly publish detailed reports at regular intervals on our plans to strengthen our risk mitigations, as well as the threat models and capabilities of all our models,” the statement said. “From the beginning, we’ve said the pace of AI and uncertainties in the field would require us to rapidly iterate and improve the policy.”

The new safety policy

Anthropic’s new safety policy includes a “Frontier Safety Roadmap” that outlines the company’s self-imposed guidelines and safeguards. But the company acknowledged the new framework is more flexible than its past policy.

“Rather than being hard commitments, these are public goals that we will openly grade our progress towards,” the company said in its blog post.

The change comes a day after Hegseth gave Amodei a Friday deadline to roll back the company’s AI safeguards.

Anthropic has concerns over two issues that it isn’t willing to drop, according to a source familiar with the company’s meeting with Hegseth: AI-controlled weapons and mass domestic surveillance of American citizens. The source said Anthropic believes AI is not reliable enough to operate weapons, and that no laws or regulations yet cover how AI could be used in mass surveillance.

AI researchers applauded Anthropic’s stance on social media on Tuesday and expressed concerns about the idea of AI being used for government surveillance.

The company has long positioned itself as the AI business that prioritizes safety. Anthropic has published research showing how its own AI models could be capable of blackmail under certain conditions. The company recently donated $20 million to Public First Action, a political group pushing for AI safeguards and education.

But the company has faced increasing pressure and competition from both the government and its rivals. Hegseth, for example, plans to invoke the Defense Production Act on Anthropic and designate the company a supply chain risk if it does not comply with the Pentagon’s demands, CNN reported on Tuesday. OpenAI and Anthropic have also been locked in a race to launch new enterprise AI tools in a bid to win the workplace.

Jared Kaplan, Anthropic’s chief science officer, suggested in an interview with Time that the change was driven more by safety considerations than by increased competition.

“We felt that it wouldn’t actually help anyone for us to stop training AI models,” Kaplan told the magazine. “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

CNN’s Hadas Gold contributed to this story.

This story has been updated with additional information.

The-CNN-Wire
™ & © 2026 Cable News Network, Inc., a Warner Bros. Discovery Company. All rights reserved.
