Next Anthropic Model Could Bring Catastrophic Risks
Anthropic believes there is a "substantial probability" that its next Claude model will require stricter safeguards against risks from chemical, biological, radiological, and nuclear weapons.
Anthropic has released Claude 3.7 Sonnet, a hybrid reasoning model that combines rapid response capabilities with an "extended thinking" mode for in-depth problem-solving in areas like math, coding, and physics. In the same release, Anthropic introduced Claude Code, a tool that automates software engineering tasks such as code search, editing, and testing, integrating directly with GitHub.
Most notable, however, was not any new feature or capability. It was Anthropic’s own evaluation of the model, and of upcoming ones, tucked away in the system card. There, the developers state that Claude 3.7 Sonnet has shown an increased ability to assist users in acquiring chemical, biological, and other dangerous materials.
How does Anthropic evaluate its models?
Anthropic’s model evaluation process is guided by its Responsible Scaling Policy (RSP). Before releasing a new model, the company conducts rigorous testing across key risk areas, including cybersecurity, autonomous capabilities, and potential misuse in high-risk domains like chemical and biological threats. Through automated assessments, red teaming, and human trials, testers attempt to probe the model’s limits and identify vulnerabilities.
With Claude 3.7 Sonnet, Anthropic has determined that existing safeguards, categorized as AI Safety Level 2 (ASL-2), are still sufficient to prevent misuse. However, internal testing revealed that the model is approaching ASL-3 thresholds, the highest safety level Anthropic has fully defined so far, which would trigger stricter security measures. In particular, evaluations of Claude 3.7 Sonnet’s ability to assist in high-risk areas such as cybersecurity and biological threat modeling “showed improved performance in all domains.”
“...based on what we observed in our recent CBRN testing, we believe there is a substantial probability that our next model may require ASL-3 safeguards,” the developers said, raising questions about the capabilities and threats posed by upcoming model releases.
What are CBRN risks from AI?
Chemical, Biological, Radiological, and Nuclear (CBRN) risks are hazards that arise from the development and use of chemical substances, biological agents, radiological materials, or nuclear weapons. CBRN risks encompass a range of dangers, from chemical poisons and infectious pathogens to radioactive contamination and nuclear detonations.
CBRN risks from advanced AI manifest through the potential misuse of AI systems to develop or deploy these hazardous agents. AI's capabilities in data analysis, pattern recognition, and simulation can be exploited to design novel chemical compounds, engineer pathogens, or autonomously control CBRN weapons. In this way, AI could lower the barrier to entry for malicious actors.
Although there have been no confirmed instances of AI directly contributing to the development or deployment of CBRN weapons, experiments have demonstrated that AI models can inadvertently generate information about how to create harmful materials.
In response to these emerging threats, policymakers and model developers have begun to act, with several pieces of legislation and voluntary industry measures aimed at mitigating the risks.



