Amazon reportedly investigating Perplexity AI after accusations it scrapes websites without consent

Wired previously found a Perplexity crawler that's been bypassing the Robots Exclusion Protocol.

Contributing Reporter

Updated 28 June 2024 at 8:26 pm·3-min read

Amazon Web Services has started an investigation to determine whether Perplexity AI is breaking its rules, according to Wired. To be precise, the company's cloud division is reportedly looking into allegations that the service is using a crawler, hosted on its servers, that ignores the Robots Exclusion Protocol. This protocol is a web standard under which developers put a robots.txt file on a domain containing instructions on whether bots can or can't access a particular page. Complying with those instructions is voluntary, but crawlers from reputable companies have generally respected them since web developers started implementing the standard in the '90s.
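For readers unfamiliar with how the protocol works in practice, here is a minimal sketch in Python of the check a compliant crawler would run before fetching a page. It uses the standard library's `urllib.robotparser`. The robots.txt rules, the bot names (`ExampleBot`, `OtherBot`) and the `example.com` URLs are hypothetical, chosen only for illustration; they are not taken from Wired's or any other publisher's actual file, and this is not a description of how any company's crawler is built.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (illustrative only, not any real site's file).
# It blocks one named bot from the whole site and blocks every other bot
# from a single directory.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The named bot is disallowed everywhere on the site.
print(is_allowed(parser, "ExampleBot", "https://example.com/articles/1"))  # False

# Any other bot falls under the wildcard rules.
print(is_allowed(parser, "OtherBot", "https://example.com/articles/1"))    # True
print(is_allowed(parser, "OtherBot", "https://example.com/private/page"))  # False
```

Nothing in the file enforces these answers. The check only has an effect if the crawler runs it and honors the result, which is why compliance is described as voluntary.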

In an earlier piece, Wired reported that it discovered a virtual machine that was bypassing its website's robots.txt instructions. That machine was hosted on an Amazon Web Services server using the IP address 44.221.181.252 that's "certainly operated by Perplexity." It reportedly visited other Condé Nast properties hundreds of times over the past three months to scrape their content, as well. The Guardian, Forbes and The New York Times had also detected it visiting their publications multiple times, Wired said. To confirm whether Perplexity truly was scraping its content, Wired entered headlines or short descriptions of its articles into the company's chatbot. The tool then responded with results that closely paraphrased its articles "with minimal attribution."

A recent Reuters report claimed that Perplexity isn't the only AI company that's bypassing robots.txt files to gather content used to train large language models. However, it seems that Wired only provided Amazon with information on Perplexity AI's crawler. "AWS’s terms of service prohibit abusive and illegal activities and our customers are responsible for complying with those terms," Amazon Web Services told us in a statement. "We routinely receive reports of alleged abuse from a variety of sources and engage our customers to understand those reports." The spokesperson added that the company's cloud division told Wired it was investigating the information the publication provided, as it does with all reports of potential violations.

Perplexity spokesperson Sara Platnick told Wired that the company has already responded to Amazon's inquiries and denied that its crawlers are bypassing the Robots Exclusion Protocol. "Our PerplexityBot — which runs on AWS — respects robots.txt, and we confirmed that Perplexity-controlled services are not crawling in any way that violates AWS Terms of Service," she said. Platnick told us that Amazon looked into Wired's media inquiry only as part of a standard protocol for investigating reports of abuse of its resources. The company had apparently not heard from Amazon about any type of investigation before Wired contacted Amazon. Platnick admitted to Wired, however, that PerplexityBot will ignore robots.txt when a user includes a specific URL in their chatbot inquiry.

Aravind Srinivas, the CEO of Perplexity, also previously denied that his company is "ignoring the Robot Exclusions Protocol and then lying about it." Srinivas did admit to Fast Company that Perplexity uses third-party web crawlers on top of its own, and that the bot Wired identified was one of them.

Update, June 28, 2024, 2:20PM ET: We have updated this post to add Perplexity's statement to Engadget.

Update, June 28, 2024, 8:27PM ET: We have updated this post to add a statement from Amazon Web Services.
