Cloudflare has publicly accused artificial intelligence search startup Perplexity of deliberately sidestepping web restrictions designed to block its bots from accessing certain websites. According to a recent blog post by the internet infrastructure giant, Perplexity is allegedly disguising the identity of its web crawlers to bypass these controls, a move that has raised fresh concerns about the company’s web scraping practices.
Alleged evasion of website restrictions
Cloudflare, one of the largest internet infrastructure providers globally, claims it has received multiple complaints from website operators who discovered that Perplexity’s AI bots continued to access their content, despite being explicitly blocked. These blocks were implemented using the standard robots.txt file and Web Application Firewall (WAF) rules, which are common tools to control crawler access.
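To illustrate how a robots.txt block of this kind is meant to work, the following is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the example.com URL and the is_allowed helper are assumptions made for illustration and are not taken from Cloudflare's report. Compliance with robots.txt is voluntary, so the check only has any effect if the crawler chooses to run it and respect the answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows the two declared Perplexity
# user agents and allows everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    for agent in ("PerplexityBot", "Perplexity-User", "SomeOtherBot"):
        print(agent, is_allowed(agent, url))
```

A well-behaved crawler would perform a check like this before each fetch and skip any URL for which the answer is False.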
In response to the complaints, Cloudflare conducted its own tests by setting up websites with restrictions specifically targeting Perplexity’s bots. Initially, the startup’s crawlers identified themselves openly as “PerplexityBot” or “Perplexity-User.” However, once these identifiers were blocked, Cloudflare alleges that the bots began disguising themselves by changing their user agent to mimic Google Chrome running on macOS. According to Cloudflare, this tactic allowed the bots to masquerade as ordinary web users rather than automated crawlers.
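The weakness of a block that relies on the user agent alone can be sketched in a few lines of Python. The rule below is a hypothetical stand-in for a firewall rule, not Cloudflare's actual WAF syntax, and both user agent strings are illustrative examples rather than the exact strings Cloudflare says it observed.

```python
# Hypothetical rule: block any request whose User-Agent header contains
# one of the crawler names the site operator wants to keep out.
BLOCKED_TOKENS = ("perplexitybot", "perplexity-user")


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent string matches a blocked crawler name."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)


# Illustrative declared-crawler string (assumed format).
DECLARED_UA = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"

# Illustrative browser-style string resembling Chrome on macOS.
BROWSER_LIKE_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)

if __name__ == "__main__":
    print(is_blocked(DECLARED_UA))      # the declared crawler is caught
    print(is_blocked(BROWSER_LIKE_UA))  # the browser-style string is not
```

Because the User-Agent header is set by the client, a rule of this kind only stops crawlers that identify themselves honestly.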
Cloudflare also reports that Perplexity used a technique known as “IP rotation,” in which the bots switch between IP addresses that are not listed in the company’s official bot documentation. The bots are also said to switch between different autonomous system numbers (ASNs), the numerical identifiers for networks controlled by a single organisation, to further avoid detection. Cloudflare states that this behaviour was observed across tens of thousands of domains and involved millions of requests per day.
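A site operator can compare a request's source address with the ranges a crawler operator publishes. The sketch below uses Python's standard ipaddress module. The ranges are reserved documentation addresses standing in for a published list, not Perplexity's real ranges, and is_declared_ip is a hypothetical helper.

```python
import ipaddress

# Placeholder ranges (reserved for documentation) standing in for the
# IP ranges a crawler operator might publish for its bots.
DECLARED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_declared_ip(address: str) -> bool:
    """Return True if address falls inside one of the declared ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in DECLARED_RANGES)


if __name__ == "__main__":
    print(is_declared_ip("192.0.2.44"))   # inside a declared range
    print(is_declared_ip("203.0.113.9"))  # outside every declared range
```

Traffic arriving from addresses outside the published ranges cannot be tied to the crawler by an IP check alone, which is why rotating through unlisted addresses and networks makes attribution harder.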
Background and previous controversy
This is not the first time Perplexity has been criticised for questionable crawling practices. In 2024, the company faced backlash for reportedly accessing content behind paywalls and ignoring websites’ robots.txt directives. At the time, Perplexity CEO Aravind Srinivas attributed the activity to third-party crawlers used by the platform rather than the company’s own systems.
However, Cloudflare’s latest report implies that the company itself is now directly involved in evasive crawling behaviour. The blog post indicates that Perplexity’s bots are actively working around standard website protections in a systematic and large-scale manner.
In response to the findings, Cloudflare has removed Perplexity from its list of verified bots — a designation given to automated tools that comply with industry best practices. The company has also introduced new mechanisms to help website owners block what it calls Perplexity’s “stealth crawling” efforts.
Perplexity denies wrongdoing
Perplexity has denied the accusations. In a statement to The Verge, company spokesperson Jesse Dwyer dismissed Cloudflare’s findings as a “publicity stunt” and said, “There are a lot of misunderstandings in the blog post.” The startup has not yet issued a detailed technical rebuttal addressing Cloudflare’s specific claims.
Meanwhile, Cloudflare CEO Matthew Prince has expressed ongoing concerns about the broader implications of AI-driven web scraping. He recently described artificial intelligence as an “existential threat” to online publishers. Last month, Cloudflare began offering website owners the option to require AI companies to pay for access to their content and implemented default blocks on known AI crawlers.
As the debate continues, the incident highlights the growing tensions between content creators, infrastructure providers, and AI firms over how web content should be accessed and monetised in the age of machine learning.