26 Comments

Excellent article and discussion. 👍🏽 I am hoping the parties on both sides can come up with a fair profit-sharing arrangement but that will take a lot of trust, which seems to be in short supply.

Thanks! I think trust goes out the window when we start suing each other, but even then, some sort of profit-sharing seems like a reasonable outcome here.

Canadian news outlets are currently suing OpenAI for copyright infringement in the amount of $20,000 per article. This could add up to billions in damages if they succeed.
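
Back-of-the-envelope, the statutory-damages math is easy to sketch. A minimal illustration (the article count below is an invented placeholder, not a figure from the lawsuit):

```python
# Hypothetical damages estimate at $20,000 statutory damages per article.
# The number of articles is a made-up assumption for illustration only.
DAMAGES_PER_ARTICLE = 20_000

def total_damages(num_articles: int) -> int:
    """Total claim at a flat per-article statutory rate."""
    return num_articles * DAMAGES_PER_ARTICLE

# At 100,000 scraped articles, the claim already reaches $2 billion.
print(f"${total_damages(100_000):,}")  # → $2,000,000,000
```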

When it comes to image copyrights, we work very closely with our legal department whenever we create a graphic for use in a film production. If we want to make our own version of a real police badge, we have to prove that our version is significantly different. We have to show all of our work, like a high school math problem. As creators, we take inspiration from other creators all the time; in fact, it's impossible to design in a vacuum. But we don't take inspiration from millions of sources.

I think this is the major shortfall of any legal challenge by artists. If Gen AI is stealing from everyone, how can a single or a small group of artists sue the Gen AI company? How can the courts determine which part of which artist's work was used in which part of a Gen AI creation? Even the AI experts don't know exactly how it works! It's an impossible situation where the loser is the artist.

Regarding "How can the courts determine which part of which artist's work was used": That's certainly part of the problem, and it may feel like the AI companies have 'hacked copyright' with their fancy 'learning' technology, carefully shrouded in secrecy. But what some of the lawsuits are arguing is that it is illegal by itself to collect copyrighted works for training purposes. If the courts agree, this would introduce lots of red tape to the AI process, as illustrated by @Nneoma Grace Agwu-Okoro elsewhere in the comments.

I'm curious though: when you create graphics for a film production, is AI image generation an acceptable tool? And if OpenAI wins all of these court cases, making AI art 'totally legal' (there's a better legal term, probably), would that make a difference for the art department?

The short answer is it depends. Graphics that we put on the screen are evaluated on a gradient from highly featured (as in an ECU) to deep, deep background, where a graphic might be just a blurry image. Different studios have different appetites for risk. Not only that, but not all training data used in LLMs are created equal. The largest stock library in the world, iStock, offers AI images that were supposedly trained on its licensed data.

Ah, of course! It's easy to forget there are in fact licensed alternatives.

(Also, I definitely knew ECU means Extreme Close-Up. Just like everyone reading this.)

LOL Nice catch! I often forget that not everyone knows Filmspeak.

I think the cases against OpenAI and other AI companies could go either way because whether we like it or not, generative AI is here to stay. Its pace of development is a whole other issue that will be determined by the outcome of the cases.

Now onto copyright infringement and my opinions:

- The creatives can argue that these AI companies should pay them for using their work because they're well-funded, for-profit companies, but think about the logistics of that for a minute. Who will they pay, and who will be left out? What's the method for calculating the fees owed to each creative? How will one work be valued over another (because IP is qualitative, not quantitative)? And so on.

- The AI companies can argue for fair use, transformative use, or even "research" (as someone in the comments pointed out), but this defense is far from a sure thing: fair use isn't a globally accepted defense to copyright infringement, it might not be accepted as a defense to every claim in the suit, and its success may hinge on whether the AI companies can prove "transformative use", whatever that truly means.

- Legal departments can argue that AI companies should meticulously note and report, in a public registry, every copyrighted work used as training data for their generative AI systems. That way there is complete transparency in the training process, making it as highly regulated as data privacy has now become. Whether or not this will please the creatives remains to be seen.

- Legal departments can also argue that these AI companies should use the principle of first sale, where they knowingly buy a copy of a copyrighted work and automatically receive the right to sell, display, or otherwise dispose of that particular copy, notwithstanding the interests of the copyright owner (just as movie and music streamers did at the birth of the streaming industry). But like the first point, the issue lies in the logistics. Unlike the music or movie industries, there are no unified bodies of copyright owners worldwide for the many subject matters and copyright-protected categories that exist today. Asking the AI companies to do this might prove a Herculean task.
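
The payout-logistics worry in the first point can be made concrete. Here is a minimal sketch of a pro-rata split of a licensing pool (all names, weights, and the pool amount are invented placeholders; real licensing schemes would be far messier):

```python
# Hypothetical pro-rata split of an AI licensing pool among creators.
# Weights (e.g., amount of each creator's work ingested) are invented
# placeholders; valuing one work over another is the hard, qualitative part.
def split_pool(pool: float, weights: dict[str, float]) -> dict[str, float]:
    """Divide `pool` among creators in proportion to their weights."""
    total = sum(weights.values())
    return {name: pool * w / total for name, w in weights.items()}

payouts = split_pool(60_000_000, {"creator_a": 3.0, "creator_b": 1.0})
print(payouts)  # {'creator_a': 45000000.0, 'creator_b': 15000000.0}
```

Even this trivial scheme begs the questions the bullet list raises: where do the weights come from, and who counts as a creator at all?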

These are just my opinions. I am paying rapt attention to these cases as they will have a global effect and I am interested to see which way the courts go.

Insightful article as always @P.Q. Rubin

Thanks, Nneoma! I feel like you just wrote the follow-up article I wish I could write.

The logistics of copyright holders' compensation will be complicated. But I don't know if that's a good argument against compensation. AI critics will say that it's simply the consequence of the AI companies' ruthlessness in stealing everyone's content. (Just like how a criminal can be ordered to compensate their victim, but cannot possibly compensate everyone fairly after having robbed a million people.)

I remember Reddit coming under fire for selling their users' collective data to Google for AI training purposes. Reddit reportedly got $60 million, none of which went to its users. Maybe the real problem is not logistics, but power balance?

Finally, I am intrigued by the "principle of first sale" you mentioned. If I understand correctly, this doctrine applies to physical copies (books, movies, CDs), but not digital copies or licenses. That would mean that OpenAI's (presumed) digital scraping is off the table, but something like Google Books has stronger legal footing. How interesting!

@P.Q. Rubin Great overview. Sam Altman recently donated millions to Trump's inauguration fund, which shapes my thoughts on how these lawsuits will go, since Trump will be in the White House soon. I think everybody knows Altman is in suck-up mode right now; all you have to do is listen to him. @bruce landay made an interesting point, and I do believe that OpenAI and other AI companies should pay creators to train their models. They certainly don't lack the money, given the billions they will be making in the coming years.

Sam Altman handing out donations (some would say bribes) to politicians is unfortunately not exceptional. Even if it doesn’t get him out of these lawsuits, it ensures favorable legislation in the long run. This dynamic should be interesting, given that there are also significant financial interests on the other side. Let's not forget: copyright law itself is often seen as a product of corporate interests.

This is a very good overview. My thoughts on this are that we're in for a protracted legal battle. While AI companies may plead "fair use", content creators still have rights protected by law. But if the AI companies can prove that they generate content through "researching" their data trove, and then refining that into "original" content, then that could be a different angle altogether.

Thank you!

One thing I mostly left out (because I have trouble understanding it) is "transformative use", which is apparently a type of fair use. When you write "researching", are you referring to the same thing, or is "research" a distinct type of fair use?

I was using “research” as a distinct type of fair use; however, I think the proper term would be transformative use. If the AI companies can prove that their platforms add a unique purpose to the original work, they just might get away with it. How they would do that though is another issue.

I see, thanks!

As an author, I would like to see the AI organizations lose the lawsuits and pay for training their models. These are well-funded, for-profit companies, and they need to take responsibility for the data they are vacuuming up.

Thanks for a good overview!

Thanks, Bruce! Surely, OpenAI has the funds to compensate all parties involved and continue its highly lucrative business. However, if a loss in court means no more scraping without permission, that sounds to me like a real problem competition-wise.

I had totally forgotten about Napster

That's exactly how the record industry likes it 😉

But yes, it's been a while, and the only reason I brought up Napster is because I couldn't think of a more recent case where a tech company got in legal trouble and ended operations because of it.

My instinct is that the AI organizations will lose.

Really? I think it's the opposite.

The reason I think so is that these AI machines write stories. Writing is an act of creation, where something emerges from nothing. AI cannot do that, which means its stories already existed in some form.

This is a profound issue that goes beyond copyright infringement. It may be more accurate to say that this is a good reason the AI companies *should* lose.

Good response, Robert, but I look at it like this (just my 2 cents): AI has a much bigger combination of words in its memory than a "creator" does, and it understands how to combine those words at greater speed and in more varied combinations than a "creator" can. So if I tell it to write me a story about Robert switching off the light and running into an alien in his garage, that story was created by the AI, not by me or whoever first wrote a sentence about switching off the light, plus a story of running into an alien; it is absolutely brand new. However, when I was responding to you, I was looking more at the angle of AI organizations (with major big bucks) going up against old laws that never fathomed the number of ways AI can help us today, and even more in the future as it learns more "words".

Comment deleted (Dec 21)

Thanks Al, much appreciated!

I'm curious how you as a photographer look at unauthorized scraping. You make the comparison: "If I take a photography student into a museum...", but there are some important differences from the artist's perspective: in a museum, you get control over the way your art is displayed, and you get to set the admission price and other conditions ('no pictures, just buy a reproduction in the museum shop!').

Scrapers work differently. To stick to the comparison: they break into all museums at night, they copy everything they can find, they remove name tags and signatures, and then they feed all of their copies into their magical machine that makes them more money than any photographer has ever made. Don't you think artists need some form of protection, or compensation at the very least?

Comment deleted (Dec 22)

I see where you're coming from, and how the AI's automated process can feel like a creative process. But I still think there's a difference between a student's homage and a machine's mass scraping. Maybe we can agree that AI's "inspiration" is a bit more... industrial-scale than the average art student's?
