Training a frontier AI model requires billions of dollars in compute power and massive, curated datasets. However, a parasitic economy has emerged in the industry’s underbelly, allowing competitors to bypass these costs through a technique known as “distillation.” This week, that shadow war burst into the open as Anthropic, the maker of the Claude chatbot, formally accused China’s DeepSeek and two other AI labs in the country, Moonshot and MiniMax, of launching industrial-scale campaigns to siphon the intelligence of its flagship model.

Distillation is effectively an intellectual property heist executed through an API. Instead of training a model from scratch, an operator queries a superior model—the “teacher”—millions of times, feeding the resulting high-quality answers into a smaller, cheaper “student” model. The student eventually learns to mimic the teacher’s reasoning without the operator ever paying the initial training costs. According to Anthropic, the three accused labs generated over 16 million exchanges with Claude, utilising a sprawling infrastructure of 24,000 fraudulent accounts to harvest its advanced coding and reasoning capabilities.
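The teacher/student loop described above can be sketched in a few lines of Python. This is purely illustrative, not any lab's actual pipeline: the `teacher` function is a stand-in for a paid API call to a frontier model, and the prompts are made up.

```python
# Illustrative sketch of distillation, not any lab's real pipeline.
# The "teacher" stands in for a frontier model reached over a paid API;
# the attacker never sees its weights, only its answers.

def teacher(prompt: str) -> str:
    # Placeholder for an expensive, high-quality model response.
    return f"high-quality answer to: {prompt}"

def harvest(prompts: list[str]) -> list[tuple[str, str]]:
    # Each (prompt, answer) pair becomes one training example
    # for the cheaper "student" model.
    return [(p, teacher(p)) for p in prompts]

# Scale is the whole game: the alleged campaigns ran millions of
# such exchanges across thousands of accounts.
dataset = harvest([f"coding question {i}" for i in range(1000)])
# A student model would then be fine-tuned on `dataset` to mimic
# the teacher's reasoning without paying its training cost.
```

The economics follow directly: the attacker pays only per-query API fees, while the teacher's owner absorbed the full training bill.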

For the victimised AI labs, this represents a catastrophic leak of proprietary value. To prevent it, companies impose strict rate limits and deploy blocking tools. However, the sheer scale of the attack on Claude highlights the critical role of a specific, often overlooked piece of internet infrastructure: commercial proxy service firms.

“Hydra clusters” and commercial proxy services

To distill a model effectively, an attacker needs volume. If millions of queries originate from a single server or data centre, security systems will instantly identify the anomaly and sever the connection. To evade detection, the accused labs allegedly utilised “hydra clusters”—massive networks of accounts routed through commercial residential proxy services. These services allow traffic to appear as if it is coming from millions of distinct, legitimate devices scattered across the globe.

This is where the connection to the broader cybercrime ecosystem becomes clear. A recent report by Google’s Threat Intelligence team on the disruption of “RSOCKS,” a massive residential proxy botnet, provides a grim look at the physical reality behind these services. While proxy firms often market themselves as legitimate tools for ad verification or SEO monitoring, their networks are frequently built on compromised hardware. The Google investigation revealed that millions of the IP addresses sold by such services belonged to hacked Internet-of-Things (IoT) devices like smart refrigerators, routers, and garage door openers. The owners of these devices were oblivious that their bandwidth was being used to train a foreign AI model.

For the model distiller, this infrastructure provides the ultimate camouflage. By rotating their API requests through these “zombie” residential networks, an attacker can make a million extraction attempts look like one query each from a million distinct households. To the AI company’s standard defense systems, which rely heavily on reputation scores associated with IP addresses, this traffic appears entirely organic. It mimics the chaotic, distributed nature of genuine human usage, effectively neutralizing IP banning as a defensive strategy.
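A toy model shows why per-IP defenses collapse here. The quota check below is a deliberately simplified stand-in for real network-level defenses (which also weigh reputation scores), and the addresses are invented; the point is that a single-origin flood is trivially flagged, while the same traffic rotated across a proxy pool is invisible.

```python
from collections import Counter

def flag_ips(requests: list[tuple[str, str]], quota: int = 100) -> set[str]:
    # Naive network-level defense: flag any source IP whose query
    # count exceeds a fixed per-IP quota. Real systems are richer,
    # but share the same blind spot.
    counts = Counter(ip for ip, _query in requests)
    return {ip for ip, n in counts.items() if n > quota}

# The same 10,000-query extraction run, two ways (addresses invented):
direct = [("203.0.113.7", f"q{i}") for i in range(10_000)]   # one server
proxied = [(f"ip-{i}", f"q{i}") for i in range(10_000)]      # one query per "household"

# The direct run trips the quota instantly; the proxied run never does,
# because no single address ever exceeds one query.
```

This is the "million extraction attempts as a million households" effect in miniature, and it is why the defensive focus has shifted from who is asking to what is being asked.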

The commoditization of these proxy services has lowered the technical bar for distillation attacks, fueling the rapid rise of competitors who can achieve near-SOTA performance at a fraction of the cost. The accusations against DeepSeek, Moonshot, and MiniMax suggest that this “catch-up” strategy has been institutionalized within parts of the Chinese tech sector, leveraging Western innovation to circumvent U.S. export controls on advanced semiconductors.

New ways to detect attacks

The dynamics of this shadow war are shifting. As detailed in Anthropic’s update, the company is moving away from network-level defenses—which proxies render obsolete—toward behavioral analysis. The defense team has developed a method to detect distillation not by looking at who is asking, but by analyzing what is being asked. The breakthrough relies on the statistical properties of the queries themselves.

To train a competent student model, an attacker cannot ask random questions. The queries must cover a specific, mathematically diverse range of topics to capture the full breadth of the teacher’s capabilities. This necessity creates a unique statistical signature. Anthropic’s new detection technique measures the “conditional probability” of the incoming prompts, essentially identifying when a stream of queries is too mathematically perfect to be human. While a human user’s interactions are erratic and topical, a distiller’s queries follow a distinctive pattern designed to maximize information gain per token.
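Anthropic has not published its detection method in detail, so the sketch below is only a simplified proxy for the idea: it scores how evenly a stream of queries covers its topics using normalised Shannon entropy, flagging streams whose coverage is, in the article's phrase, too mathematically perfect to be human. The topic labels and cutoff are invented for illustration.

```python
import math
from collections import Counter

def normalized_entropy(topics: list[str]) -> float:
    # Shannon entropy of the topic mix, scaled to [0, 1].
    # 1.0 means perfectly even coverage; human sessions score lower
    # because they cluster around a few interests.
    counts = Counter(topics)
    total = len(topics)
    h = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return h / math.log2(len(counts)) if len(counts) > 1 else 0.0

def looks_like_distillation(topics: list[str], cutoff: float = 0.95) -> bool:
    # Flag query streams whose topical coverage is suspiciously even.
    # The 0.95 cutoff is an arbitrary illustrative threshold.
    return normalized_entropy(topics) >= cutoff

# A human session: bursty and topical.
human = ["cooking"] * 40 + ["travel"] * 8 + ["tax law"] * 2
# A distiller's sweep: 25 topics hit with machine-like evenness.
sweep = [f"topic-{i % 25}" for i in range(50)]
```

Here `looks_like_distillation(sweep)` is true while `looks_like_distillation(human)` is not, regardless of which IP addresses the queries arrived from; that independence from the network layer is the whole point of the semantic approach.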

This pivot marks a significant maturation in the defense against IP theft. It suggests that the utility of commercial proxy networks in AI distillation may be nearing a plateau. If the detection logic occurs at the semantic level—analysing the text and intent rather than the connection source—masking the IP address becomes irrelevant. An attacker could route traffic through the most pristine residential proxies money can buy, but if the sequence of their questions betrays a training objective, the system can silently flag the account or feed it “poisoned” data to sabotage the student model.

Nevertheless, the market for commercial proxies remains robust. As AI labs deploy these statistical defenses, distillers will likely respond by introducing noise into their data collection, deliberately making their queries less efficient to mimic human randomness. The proxy firms, sitting in the middle of this flow, continue to profit from the insatiable demand for anonymity. For the AI industry, the challenge has evolved from a game of whack-a-mole with IP addresses to a deeper forensic analysis of intent, signaling that the protection of artificial intelligence assets now requires an understanding of the very mathematics that power them.

Published – February 25, 2026 08:28 am IST

