Authors Sue OpenAI, Allege Their Books Were Used to Train ChatGPT Without Their Consent

Two authors filed a lawsuit against OpenAI alleging that their copyrighted books were used to train ChatGPT without their consent.
Paul Tremblay and Mona Awad claim that ChatGPT generates "very accurate summaries" of their works, according to the complaint.
They allege the summaries are "only possible" if ChatGPT was trained on their books, which would be a violation of copyright law.

Two authors filed a lawsuit against OpenAI last week alleging that their copyrighted books were used to train the company's artificial intelligence chatbot, ChatGPT, without their consent.

Paul Tremblay, author of "The Cabin at the End of the World," and Mona Awad, author of "Bunny" and "13 Ways of Looking at a Fat Girl," said ChatGPT generates "very accurate summaries" of their works, according to the complaint. They allege the summaries are "only possible" if ChatGPT was trained on their books, which would be a violation of copyright law.

OpenAI did not immediately respond to CNBC's request for comment. Lawyers for Tremblay and Awad also did not immediately respond.

ChatGPT automatically generates text based on written prompts in a way that's much more advanced and creative than the chatbots of Silicon Valley's past. The technology was developed by San Francisco-based OpenAI, a research company led by Sam Altman and backed by Microsoft.

The chatbot is trained on an enormous amount of text data. OpenAI doesn't reveal precisely what data was used to train ChatGPT, but the company has said it generally crawled the web and used archived books and Wikipedia.

The lawsuit, which was filed with a San Francisco federal court, alleges that "much" of the material in OpenAI's training data is based on copyrighted materials, including books by Tremblay and Awad. But proving exactly how and where ChatGPT gleaned this information, as well as whether the authors have suffered financial damages, could be a challenge.

The complaint references exhibits of the summaries that ChatGPT generated, and it notes that the chatbot gets some things wrong. Awad and Tremblay said that the rest of the summaries are accurate, however, which means "ChatGPT retains knowledge of particular works in the training dataset."

"At no point did ChatGPT reproduce any of the copyright management information Plaintiffs included with their published works," the complaint said.

By Ashley Capoot, CNBC • Published July 5, 2023 • Updated on July 5, 2023 at 2:28 pm
