‘The Anthropic episode demonstrates that corporate guardrails are not a substitute for governance’


Anthropic, an American Artificial Intelligence (AI) lab, has asked that three Chinese AI labs (DeepSeek, MoonshotAI, and MiniMax) be treated as national security threats. AI models from Anthropic and other American labs have also reportedly been used by the U.S. military in the attacks on Iran to fast-track the “kill chain” from target identification to legal approval and strike.

The Pentagon has labelled Anthropic a “supply chain risk”, a designation usually associated with foreign adversaries, after the company raised concerns about how its technology is being used in military operations. That decision is now being challenged in court. These developments, over the course of a few weeks, have serious implications for AI development and the national security calculus worldwide.

The issue

The Chinese AI labs have been accused of distilling frontier models from American AI companies. In a nutshell, this involves taking a stronger AI model’s outputs to teach a weaker model. The attacks were sophisticated and used deceptive techniques to mask the identity and intent of the distillers. Anthropic claims that this happened on an industrial scale — “16 million exchanges with Claude through approximately 24,000 fraudulent accounts, in violation of our terms of service and regional access restrictions”.
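
By way of illustration, here is a minimal sketch of what distillation looks like in practice: a script sends prompts to the stronger “teacher” model and records its answers as training data for a smaller “student” model. The function names, placeholder teacher, and file format below are illustrative assumptions, not details from the Anthropic report.

```python
# Minimal sketch of model distillation (hypothetical names throughout).
# A stronger "teacher" model is queried at scale; its answers become
# supervised training data for a smaller "student" model.

import json
from typing import Callable


def build_distillation_dataset(
    teacher_api: Callable[[str], str],  # returns the teacher's answer to a prompt
    prompts: list[str],
    path: str,
) -> None:
    """Query the teacher model and save prompt/answer pairs as JSON Lines,
    a common format for supervised fine-tuning data."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            answer = teacher_api(prompt)
            f.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")


if __name__ == "__main__":
    # Placeholder standing in for a real frontier model's API.
    fake_teacher = lambda p: f"(teacher's answer to: {p})"
    build_distillation_dataset(fake_teacher, ["What is distillation?"], "distill.jsonl")
    # A smaller student model would then be fine-tuned on distill.jsonl
    # so that it imitates the teacher's behaviour at far lower cost.
```

Done across millions of prompts, this is how a weaker model can approximate a stronger one without access to its weights or training data.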

Generative AI is often equated with nuclear technology, with the aim of containing its proliferation. However, it is a dual-use, general-purpose technology more comparable to semiconductors than to nuclear weapons. Unlike nuclear technologies, where governments drive research and development efforts, cutting-edge AI research happens in the private sector for civilian applications. It just so happens that the same technology also has military applications.

Nuclear non-proliferation works because fissile material is rare, controlled and traceable. The same is not true for mathematical AI models. That DeepSeek achieved performance comparable to frontier models at a fraction of the cost, after export controls were imposed, is proof that such restrictions are not effective. The nuclear narrative asks us to treat querying an AI model as equivalent to weapons proliferation.

Distilled models and guardrails

Anthropic’s argument that a distilled model will be used less responsibly rests on weak foundations. Models from frontier American AI labs such as Anthropic, OpenAI, Google and xAI could be used by the U.S. military for applications such as surveillance, cyberwarfare and lethal autonomous weapons systems. In fact, when Anthropic recently raised concerns about the kinds of uses its models were put to, it faced the threat of being removed from defence systems and designated a “supply chain risk”. Its rival, OpenAI, however, has accepted a permissive contract for military uses, highlighting a race to the bottom under competitive pressure to serve government clients. When their own models are being put to such uses, the argument that distilled models will not have guardrails collapses.

It is extremely hard to control the diffusion of such a technology, for many reasons. Talent mobility is hard to restrict: many of the researchers at Chinese labs were trained in U.S. universities or worked in U.S. companies. The restrictions on inputs such as semiconductors have been repeatedly circumvented and are now partially repealed. Distillation is one more vector that is even harder to restrict, as the Anthropic report acknowledges. Each new restriction invites workarounds. If distillation is seen as extremely risky, not allowing public access to these models should be an option to consider.

In the language of national security, these restrictions do not make the world safer. They make it harder for rivals to compete with dominant U.S. companies even on civilian applications. Input-based restrictions are ineffective and only cause collateral damage to innovation, scientific collaboration and widespread economic development. They effectively consolidate power in the hands of a few U.S. companies.

Equating distillation to industrial-scale intellectual property theft also seems unfair, given that frontier AI models are trained on the creative and intellectual output of millions of people who were not compensated and did not consent. The process of asking a model millions of questions and learning from its answers is arguably no more extractive than training that model on billions of web pages written by people who never consented to it.

The companies whose models were distilled are right to claim that their terms of service have been violated, and they can pursue measures to block such actors. However, they are also arguing for a coordinated response across the AI industry, cloud providers, and policymakers. Such a move would further entrench the market power of a handful of companies.

What is needed

As scary as it is, it seems inevitable that armed forces worldwide will integrate generative AI into military systems. The Anthropic episode demonstrates that corporate guardrails are not a substitute for governance: a company can be overridden, replaced, or pressured into compliance. What is needed instead are plurilateral commitments by states to responsible use, covering meaningful human control over lethal decisions, prohibitions on mass civilian surveillance, and auditable technical standards for such capabilities. These commitments must apply universally for them to be effective.

Bharath Reddy is an Associate Fellow with the Takshashila Institution

