Newspapers Sue OpenAI for Copyright Infringement and ‘Fake News’ Hallucinations
Starting last year, various rightsholders have filed lawsuits against companies that develop AI models.
The list of complainants includes record labels, book authors, visual artists, a chip maker, and news publications. These rightsholders all object to the presumed use of their work without proper compensation.
Keeping pace with the constant stream of legal paperwork is a challenge, but a complaint filed at a New York federal court yesterday deserves to be highlighted. In this case, eight major news publications are suing OpenAI and Microsoft for copyright infringement.
U.S. Newspapers Sue OpenAI and Microsoft
The New York Daily News, Chicago Tribune, Orlando Sentinel, Sun-Sentinel, Mercury News, Denver Post, Pioneer Press, and Orange County Register claim that the AI companies used their publications to train and develop ChatGPT models without obtaining permission.
In addition, ChatGPT can recall large parts of their copyright-protected articles, which effectively bypasses their paywalls. This has a direct effect on the newspapers’ revenues, they argue.
“Defendants are taking the Publishers’ work with impunity and are using the Publishers’ journalism to create GenAI products that undermine the Publishers’ core businesses by retransmitting ‘their content’—in some cases verbatim from the Publishers’ paywalled websites—to their readers.”
Training On and Reproducing Copyrighted Articles
The complaint alleges that the newspapers’ articles are prominent parts of the training material for OpenAI’s models. GPT-3, for example, has 175 billion parameters and was trained on datasets including ‘WebText2’ and ‘Common Crawl’, both of which contain material owned by the plaintiffs.
This alleged unauthorized use remains ongoing, the newspapers claim, and it will likely continue in the future.
“On information and belief, Microsoft and OpenAI are currently or will imminently commence making additional copies of the Publishers’ Works to train and/or fine-tune the next generation GPT-5 LLM,” the complaint adds.
The plaintiffs show that ChatGPT can reproduce content from copyrighted news articles when prompted. In addition, third-party services in the OpenAI store are specifically marketed to bypass their paywalls, they say.
These tools include a custom GPT called “Remove Paywall” and another tool, “News Summarizer”, which promises to “save on subscription costs” and “skip paywalls just using the link text or URL.”
OpenAI and Microsoft have previously argued that the use of copyrighted works to train their models falls under fair use. In addition, they called out the lack of specific copyright infringements by third parties.
This lawsuit is likely to trigger similar defenses, but copyright infringement allegations are just part of the newspapers’ complaint.
‘Fake News Hallucinations’
The newspapers are not only concerned by the unauthorized use of their works; they also allege that the AI tools cause commercial and competitive injury by spreading false claims.
The plaintiffs cite various examples where ChatGPT allegedly links dubious news reporting to their newspapers.
“As if plagiarizing the Publishers’ work were not enough, Defendants’ products are often subject to ‘hallucinations’ where those products malign the Publishers’ credibility by falsely attributing inaccurate reporting to the Publishers’ newspapers.
“Beyond just profiting from the theft of the Publishers’ content, Defendants are actively tarnishing the newspapers’ reputations and spreading dangerous disinformation.”
One example is the spurious claim that disinfectants can cure Covid. While many newspapers reported on these claims, they didn’t endorse them.
These hallucinations dilute and injure the reputation of the newspapers, the complaint alleges. This claim comes on top of the various copyright infringement accusations for which they request compensation.
Ultimately, the newspapers are not against Artificial Intelligence, but they do want OpenAI and Microsoft to pay for the content they use and, ideally, ensure that their reputations are not harmed in the process.
“This lawsuit is about how Microsoft and OpenAI are not entitled to use copyrighted newspaper content to build their new trillion-dollar enterprises, without paying for that content.
“As this lawsuit will demonstrate, Defendants must both obtain the Publishers’ consent to use their content and pay fair value for such use,” the newspapers conclude.
—
A copy of the complaint, filed by the newspapers at the U.S. District Court for the Southern District of New York, is available here (pdf)
TorrentFreak