Authors sue OpenAI over ChatGPT copyright: could they win?

Opinion | 11 July 2023

Dr Dilan Thampapillai

Associate Dean (Postgraduate Programs), and Associate Professor, Centre for Social Impact

Two authors allege unauthorised use of their books in a lawsuit against OpenAI, but they must prove potential economic loss, writes UNSW Business School's Dilan Thampapillai

Imagine you read a book. You commit details of the book to memory and ruminate on its ideas. Somebody then asks you a question about the book. You provide them with a written response. Would you be surprised if the book's author tried to sue you for copyright infringement? OpenAI is facing exactly this situation.

Authors Mona Awad (Bunny, 13 Ways of Looking at a Fat Girl) and Paul Tremblay (The Cabin at the End of the World) recently filed a lawsuit against OpenAI, claiming their books were used without their consent to train ChatGPT, its artificial intelligence (AI) software.

The Guardian reported that it is the first lawsuit against ChatGPT that concerns copyright. The only difference from the scenario I’ve outlined is that instead of a human reading a book, OpenAI is accused of allowing its AI program to copy a book to its internal database and train on it.

In the first lawsuit against ChatGPT alleging copyright infringement, OpenAI is accused of allowing its AI program to copy and train on a book in its internal database. Photo: Getty

What’s the lawsuit’s chance of success?

ChatGPT is built on a large language model (LLM). LLMs train on data in the form of written works to provide natural language responses to prompts.

The basis of the lawsuit is that OpenAI trained ChatGPT on the authors’ novels and that ChatGPT produced accurate summaries of their works when prompted. Notably, the lawsuit does not identify which parts of Awad and Tremblay’s novels have been unlawfully copied and reproduced in the summaries.

The lawsuit alleges OpenAI uses “shadow libraries” that illegally publish thousands of copyrighted works (using torrent systems). The claim is based on a 2020 paper by OpenAI that reveals 15 per cent of its training dataset comes from “two internet-based books corpora.”

But the lawsuit faces some immediate hurdles. The litigants will need to prove that OpenAI most likely copied their works. They will also need to demonstrate the likelihood of some economic loss. Crucially, copyright protection does not extend to ideas.

Copyright protection is limited to written expression. And though copying something to a database might be an act of infringement, that act alone is unlikely to cause significant harm to the economic interests of the authors. The real danger is that OpenAI can do some of the things human authors can do.

How does Australian law apply?

ChatGPT is just the first generation of what this technology looks like. No doubt, many authors (and other creative producers) are starting to wonder what will happen when ChatGPT and similar technologies evolve. Moore’s Law, the observation that the number of transistors on a chip (and so, loosely, the capacity of digital technology) doubles roughly every two years, suggests the rate of this development might be exponential.

What would happen if a similar claim was raised in Australia? Would our fair dealing laws step in and protect the development of technology – or would our law side with the authors?

The United States has the doctrine of fair use in its copyright laws. In the past, fair use has been used to draw a balance between new technologies and established copyright interests. The Sony video cassette recorder case is a famous example.

In the Sony case, a majority of the US Supreme Court permitted homeowners to record their favourite television shows and watch them later, so long as they didn’t keep the recordings. (By comparison, Australia didn’t legalise this until 2006.) Fair use also allowed the rap group 2 Live Crew to radically rework and parody Roy Orbison’s song Oh, Pretty Woman.

Australia has effectively put the essence of some fair use decisions into its Copyright Act. The Australian Copyright Act contains provisions on time-shifting and fair dealing for parody. Yet, Australia has repeatedly declined to house fair use within its law. Instead, we rely upon its unwieldy cousin, known as the doctrine of fair dealing. A claim like the one Mona Awad and Paul Tremblay are making against OpenAI would likely fail in Australia.

Copyright law was originally developed in a time of human-centric writing and copying and lacks consideration for the existence, infringement, and exceptions of AI-driven works. Photo: Unsplash

Ideas are not protected

Like US law, Australian law protects tangible expression but not ideas. People need to be free to use ideas in subsequent works.

Much the same logic should apply to large language models such as the one behind ChatGPT.

And a formidable barrier emerges in the bedrock ideas of copyright law. Copyright was conceived and refined in an era when writing and copying were done by human beings. This means the fundamental concepts within the law relating to subsistence (whether copyright exists in a work at all), infringement and exceptions are human-centric.

This is quite a mountain to climb in any copyright litigation. If a human actor has not committed an act of infringement, it might be hard to find another human liable – even though an author might feel aggrieved.

Nevertheless, the problem is that Australian law does not house an open-ended legal rule like fair use, which can draw a fine balance between technology and authors.

And we are yet to have the policy debate here about how we will manage the looming conflict between rapidly advancing technologies and authors who depend on their writing for their livelihoods.

The OpenAI litigation might well fail. But it is just the first salvo in a major AI-driven ground shift in copyright.
