AI search engine accused of plagiarism announces publisher revenue-sharing plan

July 31, 2024:

Robot caught in a flashlight vector illustration

On Tuesday, AI-powered search engine Perplexity unveiled a new revenue-sharing program for publishers, marking a significant shift in its approach to third-party content use, reports CNBC. The move comes after plagiarism allegations from major media outlets, including Forbes, Wired, and Ars parent company Condé Nast. Perplexity, valued at over $1 billion, aims to compete with search giant Google.

“To further support the vital work of media organizations and online creators, we need to ensure publishers can thrive as Perplexity grows,” writes the company in a blog post announcing the problem. “That’s why we’re excited to announce the Perplexity Publishers Program and our first batch of partners: TIME, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune, and WordPress.com.”

Under the program, Perplexity will share a percentage of ad revenue with publishers when their content is cited in AI-generated answers. The revenue share applies on a per-article basis and potentially multiplies if articles from a single publisher are used in one response. Some content providers, such as WordPress.com, plan to pass some of that revenue on to content creators.

A press release from WordPress.com states that joining Perplexity’s Publishers Program allows WordPress.com content to appear in Perplexity’s “Keep Exploring” section on their Discover pages. “That means your articles will be included in their search index and your articles can be surfaced as an answer on their answer engine and Discover feed,” the blog company writes. “If your website is referenced in a Perplexity search result where the company earns advertising revenue, you’ll be eligible for revenue share.”

A screenshot of the Perplexity.ai website taken on July 30, 2024.
Enlarge / A screenshot of the Perplexity.ai website taken on July 30, 2024.

Benj Edwards

Dmitry Shevelenko, Perplexity’s chief business officer, told CNBC that the company began discussions with publishers in January, with program details solidified in early 2024. He reported strong initial interest, with over a dozen publishers reaching out within hours of the announcement.

As part of the program, publishers will also receive access to Perplexity APIs that can be used to create custom “answer engines” and “Enterprise Pro” accounts that provide “enhanced data privacy and security capabilities” for all employees of Publishers in the program for one year.

Accusations of plagiarism

The revenue-sharing announcement follows a rocky month for the AI startup. In mid-June, Forbes reported finding its content within Perplexity’s Pages tool with minimal attribution. Pages allows Perplexity users to curate content and share it with others. Ars Technica sister publication Wired later made similar claims, also noting suspicious traffic patterns from IP addresses likely linked to Perplexity that were ignoring robots.txt exclusions. Perplexity was also found to be manipulating its crawling bots’ ID string to get around website blocks.

As part of company policy, Ars Technica parent Condé Nast disallows AI-based content scrapers, and its CEO Roger Lynch testified in the US Senate earlier this year that generative AI has been built with “stolen goods.” Condé sent a cease-and-desist letter to Perplexity earlier this month.

But publisher trouble might not be Perplexity’s only problem. In some tests of the search we performed in February, Perplexity badly confabulated certain answers, even when citations were readily available. Since our initial tests, the accuracy of Perplexity’s results seems to have improved, but providing inaccurate answers (which also plagued Google’s AI Overviews search feature) is still a potential issue.

Compared to the free tier of service, Perplexity users who pay $20 per month can access more capable LLMs such as GPT-4o and Claude 3, so the quality and accuracy of the output can vary dramatically depending on whether a user subscribes or not. The addition of citations to every Perplexity answer allows users to check accuracy—if they take the time to do it.

The move by Perplexity occurs against a backdrop of tensions between AI companies and content creators. Some media outlets, such as The New York Times, have filed lawsuits against AI vendors like OpenAI and Microsoft, alleging copyright infringement in the training of large language models. OpenAI has struck media licensing deals with many publishers as a way to secure access to high-quality training data and avoid future lawsuits.

In this case, Perplexity is not using the licensed articles and content to train AI models but is seeking legal permission to reproduce content from publishers on its website.

Source link