Authors vs. ChatGPT: 17 authors say OpenAI stole their books to train ChatGPT

Let’s get this party started 🙂

A group of authors, including notable names like George R.R. Martin, John Grisham, Jodi Picoult, Michael Connelly, David Baldacci, and Scott Turow, are suing the makers of ChatGPT for intellectual property theft, specifically copyright infringement. OpenAI and other AI defendant companies claim that their use of training data, even if scraped from the internet, qualifies as “fair use” under U.S. copyright law.

I know @aleksander will have something to say about this topic, but I would love to hear how you guys feel about it.

I know we have some authors here @roberto @Jose_Briones @kirkmahoneyphd and I would LOVE to hear your thoughts.

Personally, I also consider myself a writer (I’ve been published in Polish Newsweek) and although I have STRONG feelings about ChatGPT (@aleksander knows how much I hate it), as a writer I believe every writer learns and evolves by reading others’ works, a process similar to what AI does by analyzing and generating content based on existing writings.

While I understand and sympathize with the writers’ concerns, I believe that the writers’ legal arguments may not be strong enough to win the case.

What do you guys think?

As you know, I cannot disclose the specific name I have assigned to ChatGPT here, but you can deduce it by simply swapping two letters within that name.

The meaning of “fair use” can be interpreted in various ways. All of my work, including my entire website, is copyrighted. I am okay with individuals using quotes from my work as long as they give proper credit. People can also use images from my books or work if they ask my permission, and I do not charge any fees for such use.

Respecting an author’s integrity goes beyond the scope of “fair use.” If you wish to use someone’s work, you should provide proper references, and seek permission directly if your use goes beyond that.

For my future works, I’ll include a specific mention of AI in the copyright notice.

@roberto Thanks for sharing your thoughts.
I think you & I are on the same page as far as ChatGPT goes (like I said, I’m not a fan). It’s not a creative engine: it basically regurgitates what it has “learned” by reading (scraping) text from the internet, which means it has learned from text written before generative AI existed, including copyrighted work. It puts stuff together, and many times it’s just plain WRONG.
However, that said, isn’t it true that the way every writer learns, evolves, gets inspired and improves their craft is by reading others’ works?

For example, a while ago, here in Poland, there was some discussion that J.K. Rowling borrowed the idea for Harry Potter from a Polish book written in 1933. “Kaytek the Wizard” (“Kajtuś Czarodziej” in Polish) is one of the lesser-known works of Janusz Korczak, but it is one of the most famous and beloved Polish children’s books. (There is a movie coming out.) Kaytek has often been compared to the hero of J.K. Rowling’s “Harry Potter” series, although Korczak’s book was written more than 60 years earlier.
“Kaytek the Wizard” and J.K. Rowling’s “Harry Potter” series both delve into the world of magic and the adventures of young wizards. Both stories center on a young boy discovering his magical abilities. Kaytek and Harry both come into their powers unexpectedly and must navigate the complexities of these newfound abilities. Both Kaytek and Harry grapple with the implications of their powers. Kaytek deals with the immediate and often chaotic consequences of his unrestrained magical abilities, while Harry grapples with the darker elements of his power, especially as he learns more about his connection to Voldemort. Both Kaytek and Harry feel the weight of being different from others. While Kaytek’s struggles arise primarily from the misuse of his powers, Harry’s stem from his status as “The Boy Who Lived” and the attention, both positive and negative, that it brings. Finally, even though the school setting is central in the Harry Potter series (with Hogwarts), and only a part of “Kaytek the Wizard”, both narratives explore the protagonists’ interactions with peers, teachers, and school challenges.

In essence, both “Kaytek the Wizard” and the “Harry Potter” series share themes related to magic, growth, and morality, but are definitely their own distinct works.
Was J.K. Rowling inspired by “Kaytek the Wizard”, and is that why she wrote Harry Potter? We can only speculate, as many people here in Poland have done.

However, even if she had been, would it be so bad?

The point is that the main legitimate argument the writers might have is that AI didn’t pay for or legally acquire the original source material, or didn’t properly cite the sources it used. However, when something inspires a writer, how should that be noted?

As much as I dislike ChatGPT, generative AI is doing what human writers do, drawing inspiration from existing works. The difference is in the commercial exploitation of that derived content.

However, in the case of Harry Potter & Kaytek the Wizard, it’s clear that the book that might have been inspired by the original Polish material is the one that became a commercial success.

I can only speak for myself. You can improve your work by reading other authors and gaining experience. However, the “human touch” that AI is unlikely to replicate is the emotional learning process, how it impacts you, and how it shapes your writing style. AI is and will continue to be valuable in specific domains. Still, I doubt its ability to replace writers and the unique human approach to conveying and processing information.

^ Brilliant!

@kirkmahoneyphd Cloning of voices & likenesses is DEFINITELY crossing the line.
Let the lawsuits come at Spotify. This is just too much.

To reply to @urszula: you said that we humans learn from reading books, which I agree with. However, we can’t reproduce the information or concepts we found in books in exactly the way the author wrote them, which is probably not the case for a Large Language Model like ChatGPT. We can control what information we share and what we don’t, but a computer just fetches information and, when asked about it, gives it to you exactly the way it learned it (unless it’s on the program’s blacklist, such as military information, to name one example). That’s where AI could lose!

In my second edition:

Substack also now has an interesting option:

@rochdirais I do agree that the way we process info & the way we share it is very different from how an LLM does it. PLUS, human brains have guardrails in a way that LLMs do not.
That said, as someone who grew up in the Napster/peer-to-peer file-sharing era, I can see a lot of similarities with the way AI will impact copyrighted work. Look at us now, 20 years later: everyone is streaming, but they’re paying for it. There definitely has to be some regulation, but to be honest, I’m not sure (and don’t know enough about it) to say what the best course of action is.