Anthropic has publicly accused three Chinese AI companies, DeepSeek, Moonshot, and MiniMax, of running industrial-scale campaigns to illicitly extract its Claude model's capabilities through more than 16 million fraudulent interactions. The company describes this as a national security threat requiring urgent government and industry action.
On X, Anthropic wrote:
We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax. These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models.
Distillation can be legitimate: AI labs use it to create smaller, cheaper models for their customers. But foreign labs that illicitly distill American models can remove safeguards, feeding model capabilities into their own military, intelligence, and surveillance systems.
These attacks are growing in intensity and sophistication. Addressing them will require rapid, coordinated action among industry players, policymakers, and the broader AI community.
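For context, distillation of the kind the post describes typically trains a small "student" model to match a larger "teacher" model's output distribution. A minimal sketch of the classic soft-label loss follows; the function names and the temperature value are illustrative, not a description of any lab's actual pipeline:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-label KL loss from classic knowledge distillation:
    the student is penalized for diverging from the teacher's
    softened output distribution. The T**2 factor keeps gradient
    magnitudes comparable across temperatures."""
    p = softmax(teacher_logits, T)  # teacher's softened distribution
    q = softmax(student_logits, T)  # student's softened distribution
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)))
    return (T ** 2) * kl
```

The loss is zero when the student already matches the teacher and grows as the distributions diverge, which is what makes large volumes of teacher outputs valuable as training signal.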
According to Anthropic's report, DeepSeek conducted over 150,000 exchanges, Moonshot over 3.4 million, and MiniMax over 13 million, all through approximately 24,000 fraudulent accounts. Each company employed distinct tactics: DeepSeek used synchronized traffic with shared payment methods and generated chain-of-thought training data by asking Claude to articulate its reasoning step by step. Moonshot operated hundreds of fraudulent accounts across multiple access pathways, while MiniMax's campaign was detected while still active; its operators pivoted within 24 hours of Anthropic releasing a new model.
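Turning chain-of-thought elicitation into training data generally means pairing each prompt with the teacher model's step-by-step output. The sketch below shows what one such supervised fine-tuning record might look like; the answer marker and JSONL schema are hypothetical, not any lab's actual format:

```python
import json

def build_cot_record(prompt, teacher_reply):
    """Assemble one fine-tuning record from a teacher model's reply.
    Assumes the reply separates reasoning from the final answer with
    a marker line; marker and schema are illustrative only."""
    reasoning, _, answer = teacher_reply.partition("\nFinal answer:")
    return {
        "prompt": prompt,
        "reasoning": reasoning.strip(),
        "answer": answer.strip(),
    }

records = [build_cot_record(
    "What is 17 * 24?",
    "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.\nFinal answer: 408",
)]
jsonl = "\n".join(json.dumps(r) for r in records)
```

At the scale alleged here, millions of such records would capture not just answers but the teacher's reasoning style, which is what makes chain-of-thought elicitation a distinct concern in the report.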
The labs used commercial proxy services with "hydra cluster" architectures that managed thousands of fraudulent accounts simultaneously, mixing distillation traffic with legitimate requests. Illicitly distilled models lack the original safeguards, which Anthropic warns could let authoritarian governments deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance. Anthropic also argues these attacks undermine export controls designed to maintain America's AI advantage.
Anthropic has built classifiers and behavioral fingerprinting systems to identify distillation patterns, created detection tools for coordinated accounts, and is developing product-level safeguards. Here is how the company proposes to counter the attacks:
We continue to invest heavily in defenses that make such distillation attacks harder to execute and easier to identify. These include:
Detection. We have built several classifiers and behavioral fingerprinting systems designed to identify distillation attack patterns in API traffic. This includes detection of chain-of-thought elicitation used to construct reasoning training data. We have also built detection tools for identifying coordinated activity across large numbers of accounts.
Intelligence sharing. We are sharing technical indicators with other AI labs, cloud providers, and relevant authorities. This provides a more holistic picture into the distillation landscape.
Access controls. We’ve strengthened verification for educational accounts, security research programs, and startup organizations—the pathways most commonly exploited for setting up fraudulent accounts.
Countermeasures. We are developing Product, API and model-level safeguards designed to reduce the efficacy of model outputs for illicit distillation, without degrading the experience for legitimate customers.

But no company can solve this alone. As we noted above, distillation attacks at this scale require a coordinated response across the AI industry, cloud providers, and policymakers. We are publishing this to make the evidence available to everyone with a stake in the outcome.
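The shared-payment-method signal attributed to DeepSeek, and the coordinated-account detection Anthropic lists above, can be illustrated with a toy clustering heuristic. Anthropic's actual classifiers are not public; the account IDs, fingerprints, and threshold below are invented for illustration:

```python
from collections import defaultdict

def flag_coordinated_accounts(events, min_cluster=3):
    """Toy heuristic in the spirit of coordinated-activity detection:
    group accounts by a shared payment fingerprint and flag any
    fingerprint backing an unusually large cluster of accounts.
    `events` is an iterable of (account_id, payment_fingerprint)."""
    by_payment = defaultdict(set)
    for account_id, payment_fp in events:
        by_payment[payment_fp].add(account_id)
    return {
        fp: sorted(accounts)
        for fp, accounts in by_payment.items()
        if len(accounts) >= min_cluster
    }

events = [
    ("acct-1", "card-A"), ("acct-2", "card-A"), ("acct-3", "card-A"),
    ("acct-4", "card-B"),
]
flags = flag_coordinated_accounts(events)
# → {"card-A": ["acct-1", "acct-2", "acct-3"]}
```

Real systems would combine many weak signals (timing, payment, proxy infrastructure, prompt similarity) rather than a single hard threshold, but the cluster-then-flag shape is the same.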
However, the company faces accusations of hypocrisy. Critics note that Anthropic itself scraped the public internet for training data, arguing that distillation by Chinese firms is not fundamentally different from Anthropic's own data collection practices. Anthropic, valued at $380 billion, also agreed in September to pay $1.5 billion to authors and publishers after a copyright infringement ruling.
OpenAI has made similar accusations to Congress, claiming Chinese firms use distillation techniques to “free-ride” on U.S. technologies.