The coming years will be filled with key court rulings determining whether developers of generative artificial intelligence systems have illegally used writings, designs, voices, images, works of art, comments or videos uploaded to the Internet by professionals or users: material used to train their algorithms without the permission of its creators, and in some cases protected by copyright. In that contest, OpenAI has scored the first victory.
A federal judge in New York has dismissed a lawsuit brought by the publishers of two independent media outlets, Raw Story and AlterNet, who alleged that the artificial intelligence company had made illegitimate use of their articles. They claimed that OpenAI stripped the author information and headlines from thousands of their news stories, along with the notices that they were protected by copyright, thereby allowing ChatGPT to replicate and redistribute them.
This activity, they alleged, was carried out without the express authorization of the original creators, and the profits went into OpenAI's own coffers. “When a user asks ChatGPT about a current event or the results of investigative journalism, ChatGPT will provide answers that mimic the copyrighted journalistic works that covered those events,” the publishers argued.
“ChatGPT does not have any independent knowledge of the information provided in its responses. Rather, to serve Defendants’ paying customers, ChatGPT repackages, among other materials, copyrighted journalistic work product developed by Plaintiffs and others at their expense,” they added.
The lawsuit sought compensation for the damage caused to the two outlets and a court order to remove their content from ChatGPT’s training sets. However, the judge has closed the door on that option for now, finding that the outlets had not demonstrated the harm ChatGPT caused them that would be needed to sustain the suit. “Without concrete harm, there is no standing,” she writes.
The judge explains that the sheer volume of information used to train ChatGPT makes it very unlikely that the program will reproduce content from the plaintiffs’ articles. “When a user enters a question in ChatGPT, the AI synthesizes the relevant information from its repository into a response. Given the amount of information contained in the repository, the likelihood of ChatGPT producing plagiarized content from one of the plaintiffs’ articles appears remote,” the ruling reads.
However, the ruling gives Raw Story and AlterNet another opportunity to try again with solid evidence of both the “substantial risk” that the current version of ChatGPT generates responses plagiarizing their articles and the “concrete harm” this causes to their business, while warning them that the judge is “skeptical” they can “claim recognizable damage” based on what has been seen so far.
The plaintiffs’ lawyer told Reuters that they are “confident” they will be able to “address the concerns identified by the court through an amended complaint.”
OpenAI, for its part, has stressed the legitimacy of how it trains ChatGPT. “We build our AI models using publicly available data, in a manner protected by fair use and related principles, and supported by long-standing and widely accepted legal precedents,” a spokesperson said in a statement sent to this outlet. Earlier this year the company acknowledged that it would have been “impossible” to train ChatGPT while respecting copyright.
Bigger rivals on the horizon
Raw Story and AlterNet are two independent, progressive American media outlets with similar audiences, though Raw Story's is somewhat larger in absolute numbers (about five million monthly readers). Raw Story acquired AlterNet in 2018, bringing the two platforms under a shared media company. John Byrne, founder of Raw Story, argued when the lawsuit was filed that “news organizations must stand up to OpenAI” for using “the hard work of journalists whose jobs are under siege.”
OpenAI faces other, larger legal battles than the one involving Raw Story and AlterNet, although this case could set a precedent for how judges assess the real harm that AI companies have caused rights holders. The biggest challenge from the media sector is surely the lawsuit filed against OpenAI itself and Microsoft by The New York Times.
The Times’ arguments are similar to those of Raw Story and AlterNet. The newspaper argues that OpenAI and Microsoft are profiting from its intellectual property by using it to generate content that competes with the original articles, with the potential to reduce the paper’s traffic and revenue. Although the lawsuit does not specify an exact figure, it seeks billions of dollars in compensation for this allegedly illicit use of its content.
In the complaint, the Times also asks the companies to stop using its articles in their training data and to destroy any AI models built with this material.
OpenAI is also being sued by the Authors Guild, the US authors’ organization whose members include writers such as George R. R. Martin (A Song of Ice and Fire). They have denounced the “systematic theft” of their works by the company behind ChatGPT, which is now valued at $157 billion after closing the largest venture capital financing round in history.
Agreements with other publishers
Perhaps to head off further legal problems, OpenAI has also signed important agreements with news publishers to access and use their content to train its artificial intelligence models. Among the most notable deals are one with the Associated Press (AP), one of the world’s most important news agencies, and another with News Corp, owner of hundreds of media outlets including The Wall Street Journal and The Times.
OpenAI also signed an exclusive partnership with the Spanish media group Prisa in March of this year, allowing ChatGPT to access content published by El País, Cinco Días, AS and El HuffPost. “Joining forces with OpenAI opens up new avenues for us to reach our audiences. Taking advantage of ChatGPT’s capabilities allows us to present our in-depth, quality journalism in innovative formats, reaching people looking for rigorous and independent content. This is a definitive step towards the future of news, where technology and human experience merge to enrich the reader’s experience,” said Carlos Núñez, executive president and CEO of Prisa.