By Abdul Wasay ⏐ 2 months ago ⏐ 2 min read
Reddit Takes Legal Action Against Perplexity AI, Other Data Scrapers

Reddit has launched a high-stakes legal battle against artificial intelligence firm Perplexity AI and three associated companies, Oxylabs UAB, AWMProxy, and SerpApi, accusing them of “industrial scale, unlawful data scraping” from its platform. The lawsuit, filed in a New York federal court, could have far-reaching implications for how social platforms protect user-generated content amid the AI boom.

According to the complaint, Perplexity and its partners allegedly extracted vast quantities of Reddit user data without authorization to train AI models. Reddit claims that the companies used sophisticated evasion tactics, including spoofing user agents and masking IP addresses, to bypass anti-bot protections. When blocked directly, they allegedly turned to scraping Google’s cached search results to continue harvesting Reddit content indirectly.

Reddit’s legal team likened the operation to “breaking into an armored truck because you can’t access the vault,” highlighting the company’s stance that its community-driven archive of human conversations is a proprietary and monetizable asset.

Chief Legal Officer Ben Lee stated that Reddit “supports responsible AI innovation, but not at the expense of consent, compensation, and transparency.”

Perplexity, a fast-growing AI search startup recently valued at over $20 billion, has denied wrongdoing, saying its mission is to “respectfully provide open access to knowledge.” The company has previously claimed that its data sourcing methods comply with fair use principles, arguing that publicly available web content should not require explicit licensing.

However, legal experts disagree with that position. Industry analysts told various outlets that Reddit’s lawsuit could set a precedent on whether public internet content can be used for commercial AI development without explicit permission. A ruling in Reddit’s favor could pressure other AI firms, including OpenAI, Anthropic, and Meta, to pay for data access or drastically change their training pipelines.

This lawsuit comes more than a year after Reddit inked data-licensing deals reportedly worth tens of millions of dollars with Google and OpenAI, highlighting a clear shift toward controlled, paid data partnerships. For Reddit, which went public in March 2024, user data is not just content but a key revenue source.

Regulators and lawmakers are also watching closely. In the United States and Europe, policymakers are increasingly questioning whether AI training on scraped content violates intellectual property or privacy laws. The Reddit-Perplexity case may become a defining test of “data ownership” in the AI economy.