— Kevin B Korb, 30 Mar 2024

There have been numerous lawsuits brought against generative AI companies by holders of copyright in creative works, claiming that generative AI infringes the copyright of material used to train those AIs. Some of those suits may succeed and some may fail. Furthermore, some that fail may well come to be widely acknowledged as having deserved to succeed. I will return to that last point at the end: it suggests, perhaps unsurprisingly, that new technology requires new legislation. In the meantime, though, I want to argue that the underlying logic of many complaining copyright holders is flawed: namely, the claim that even though generative AI companies have paid for access to the copyrighted material, they still have no right to use it in training without explicit permission. The courts may end up disagreeing with me — indeed, in recent times US courts in particular have shown a distinct inclination to legislate on their own — but I think existing copyright law clearly cuts against that argument.

Note that I’m not saying that no copyright holder has a legitimate case to make under existing copyright law. There are two kinds of complaint that I address here. The first claims that some works of art or authorship produced by AI are so close to an original as to count as an illegitimate copy. These are the kinds of claims that have been made by creators since IP law was invented; indeed, providing protection against such infringement is the primary purpose of those laws. The second claims that using original works, whether access is paid for or not, to train AIs that then go on, not to copy those works, but to produce works that are not recognizable as copies, already violates copyright law. This kind of claim is novel and, I will argue, unsupported by existing copyright law. There are now multiple lawsuits making this kind of claim, however (see, e.g., The Intercept, Raw Story and Alternet Sue OpenAI and New York Times Sues OpenAI and Microsoft).

There is some gray area between these two complaints. The boundary between infringement and fair use of existing works will likely always be litigated and always be adjusted. Existing IP law protects against derivative works that constitute infringement, wherever that line is drawn. But where people argue that the protection extends beyond what counts as infringement, that is overreach. For example,

In a case filed in late 2022, Andersen v. Stability AI et al., three artists formed a class to sue multiple generative AI platforms on the basis of the AI using their original works without license to train their AI in their styles, allowing users to generate works that may be insufficiently transformative from their existing, protected works, and, as a result, would be unauthorized derivative works. (Harvard Business Review, 7 Apr, 2023)

The judgment of whether particular generated works are insufficiently transformative is a normal one in IP law; the judgment that some of them may end up being insufficiently transformative — that is, that the AI merely has the ability to infringe — is irrelevant to existing IP law, but nevertheless appears to be the motivating thought behind many complaints.

The New York Times in particular claims that OpenAI’s use of its reporting as training inputs “seek[s] to free-ride on the Times’s massive investment in its journalism by using it to build substitutive products without permission or payment.” (Guardian, 23 Dec 2023) Authors suing OpenAI have much the same motivation: “To preserve our literature, authors must have the ability to control if and how their works are used by generative AI.” (Guardian, 20 Sep 2023) The fear is not that GPT or other AIs will duplicate their training inputs, or produce modifications of them close enough to the originals to count as infringements in court, but that the AIs will end up putting them out of their jobs by producing non-infringing works which perform the same roles in society that authors, musicians, newspaper publishers, etc. currently play. In other words, they fear for their incomes from future works, not the theft of their existing works.

The analogical case with humans would be where established authors, for example, sued up-and-coming authors for having learned their craft, in part, by reading the works of established authors. The new authors may well, through their reading and study, become better than the last generation of authors and so displace some of their sales. And they could not have done so without first reading those established authors. The same is true of painters who start out by copying copyrighted paintings to learn their craft. Needless to say, the same is true of every genre and form of creative work. If copyright law really prohibited learning from existing copyrighted works, short of directly infringing on them, then the creative arts would cease. That was never the intent, nor the practice, of existing copyright law.

Exactly the same reasoning applies to OpenAI and other developers of large language models and to the training of those models on existing copyrighted works. If existing copyright law can be used to stop them from training and becoming adept in their particular generative niches, then not only will this kind of AI cease to develop, but lawyers may well bring similar actions against new human artists, stopping them from creating new artwork. This whole effort is ill conceived and should not (but may) succeed.

Regardless, there is a real problem, newly arisen with generative AI, that does need to be addressed. Human artists may slowly replace a prior generation of artists, but not, in general, by erasing their ability to earn an income; rather, by a natural replacement of the old with the new. Generative AI genuinely threatens to sweep away existing creative content in an unrestricted wave of similar, if non-infringing, new content. Much of the problem is just the unrestricted volume of similar work that can be rapidly created. It is a threat not only to mundane work, such as copy editing and report generation, and those who produce it, but also to script writers, film directors, symphonic orchestras, painters, and so on. It is manifest that something needs to be done.

I suggest the answer, however, lies not in twisting existing copyright law into performing new functions badly, but in writing new laws that directly address the new problems.

New Law

I think the need for new copyright law is clear. As the Guardian has put it (28 Feb 2024):

The wave of lawsuits reflects a media industry-wide concern that generative AI will compete with established publishers as a source of information for internet users, while further sapping advertising revenues and undermining the quality of online news.

That’s a legitimate concern. Or, consider Aboriginal Australian painters who produce works in a particular range of styles and whose works bring high prices. Generative AIs can be trained to imitate them so well, and to produce similar works en masse, that their livelihoods might be destroyed. Society has an interest in protecting these kinds of businesses. While generative AI producers have a right to develop and apply their AI, creating new businesses or enhancing performance in old businesses, existing and future artists have a similar right to continue, and to continue to develop, their own businesses. Governments have the difficult, but necessary, task of balancing these two needs in new legislation. The time for prevarication on regulating new technology is well and truly over.

Current copyright law, being focused entirely on the similarity between specific end-products of generative AI and the original works they were trained on, can only cover these cases through an interventionist court that deliberately distorts the law. What is needed is a general debate in society about what protections are required or desirable, followed by new laws that achieve those ends without also killing off promising new technology. The expectation that courts will do the heavy legislative work misunderstands the role of the courts and, if met, will only lead to bad, unwritten IP law.