AI and Copyright Law: What We Know
At the moment, works created solely by artificial intelligence — even if produced from a text prompt written by a human — are not protected by copyright.
When it comes to training AI models, however, the use of copyrighted materials is fair game for now. That’s because of the fair use doctrine, which permits the use of copyrighted material under certain conditions without the owner’s permission. But pending lawsuits could change this.
What Is AI-Generated Content?
AI-generated content refers to written text, video, code, audio and other media produced by generative AI tools. These tools are trained on large amounts of data, allowing them to create relevant outputs in response to a word, phrase, question or other kind of input.
Generative AI has significantly altered the way we live, work and create in a short amount of time. As a result, the deluge of AI-generated text, images and music — and the process used to create them — has prompted a series of complicated legal questions. And they are challenging our understanding of ownership, fairness and the very nature of creativity itself.
Can AI Art Be Copyrighted?
It has long been the posture of the U.S. Copyright Office that there is no copyright protection for works created by non-humans, including machines. Therefore, the product of a generative AI model cannot be copyrighted.
The root of this issue lies in the way generative AI systems are trained. Like most other machine learning models, they work by identifying and replicating patterns in data. So, in order to generate an output like a written sentence or picture, a model must first learn from the real work of actual humans.
If an AI image generator produces art that resembles the work of Georgia O’Keeffe, for example, that means it had to be trained using the actual art of Georgia O’Keeffe. Similarly, for an AI content generator to write in the style of Toni Morrison, it has to be trained with words written by Toni Morrison.
Legally, these AI systems — including image generators, AI music generators and chatbots like ChatGPT — cannot be considered the author of the material they produce. Their outputs are simply a culmination of human-made work, much of which has been scraped from the internet and is copyright-protected in one way or another.
So, how do we reconcile the rapidly evolving artificial intelligence industry with the knotty particulars of U.S. copyright law? That is something creatives, companies, courts and the United States government are trying to figure out.
Lines Get Blurry When Humans and AI Collaborate
Creative work that is the result of a collaboration between a human and machine, which is often the case with AI-generated creations, is a complicated matter.
“If a machine and a human work together, but you can separate what each of them has done, then [copyright] will only focus on the human part,” Daniel Gervais, a professor at Vanderbilt Law School, told Built In. He mainly focuses on intellectual property law, and has written extensively on how it relates to artificial intelligence.
If the human and machine’s contributions are more intertwined, a work’s eligibility for copyright depends on how much control or influence the human author had on the machine’s outputs. “It really needs to be an authorial kind of contribution,” Gervais said. “In that case, the fact that you worked with a machine would not exclude copyright protection.”
This threshold was put to the test in September 2022, when the U.S. Copyright Office made history by granting the first known registration of a work produced with the help of text-to-image generator Midjourney: a graphic novel called Zarya of the Dawn. The 18-page narrative had all the trappings of a typical comic book, including characters, dialogue and plenty of images. The images were generated using Midjourney, while the text was written by the book’s author, Kristina Kashtanova.
Just a few months later, the office reconsidered its decision and wound up partially canceling the work’s copyright registration, claiming in a letter to Kashtanova’s attorney that it had “non-human authorship” that had not been taken into account. The book’s text, as well as the “selection, coordination, and arrangement” of its “written and visual elements,” remained protected. The images themselves did not, though, because they were “not the product of human authorship,” but rather of text prompts that generated unpredictable outputs based on Midjourney’s training data. The office also deemed whatever editing Kashtanova did to the images “too minor and imperceptible to supply the necessary creativity for copyright protection.”
Since then, the office has released a more sweeping policy statement to address all AI-human creative collaborations moving forward, a response to what it sees as new trends in registration activity. The document essentially doubles down on the office’s stance on Zarya of the Dawn, reiterating that the term “author” does not extend to non-humans, including machines. It also states that if a human simply types in a prompt and a machine generates complex written, visual or musical works in response, the “traditional elements of authorship” have been executed by AI, a non-human. The resulting work is therefore not protected by copyright.
Federal courts have also affirmed the U.S. Copyright Office’s position that AI-created artwork cannot be copyrighted. In August 2023, a judge in the U.S. District Court for the District of Columbia sided with the agency against computer scientist Stephen Thaler, who was seeking copyright protection for an image created by AI software. At the time, Thaler’s attorney told Bloomberg Law that they intended to appeal the case.
Lawsuits Surge in the Wake of Generative AI
Some creators and companies believe their content has been stolen by generative AI companies, and are now seeking to strip these companies of the protective shield of fair use in a series of pending lawsuits.
Company Lawsuits
Getty Images is suing Stability AI (the company behind Stable Diffusion) for copying and processing millions of images that are protected by copyright, as well as their associated metadata owned by Getty Images, without getting permission or providing compensation.
TikTok settled a lawsuit in 2021 with voice actress Bev Standing, who claimed the company used her voice without permission for its text-to-speech feature.
The New York Times joined the legal struggle as well near the end of 2023, suing OpenAI and Microsoft for using millions of NYT articles to train AI models. According to NPR’s sources, the Times is concerned that generative AI tools will repurpose its reporting and display it to readers who would otherwise visit its site. If courts find that OpenAI illegally used Times articles to train its models, OpenAI could be forced to destroy its LLM dataset and rebuild it from scratch.
Major music publishers like Universal Music are targeting Anthropic, claiming the company illegally trained its chatbot Claude on copyrighted song lyrics. The publishers also called for greater “guardrails” to ensure Anthropic doesn’t copy song lyrics moving forward.
Class-Action Lawsuits
Artists Sarah Andersen, Kelly McKernan and Karla Ortiz have filed a class-action copyright infringement lawsuit against Stability AI and Midjourney, both of which use Stable Diffusion to generate their images. The suit claims that these artists’ work was wrongfully used to train Stable Diffusion, and that the images generated in the style of those artists directly compete with their own work, an important point in the matter of fair use.
“Until now, when a purchaser seeks a new image ‘in the style’ of a given artist, they must pay to commission or license an original image from that artist. Now, those purchasers can use the artist’s works contained in Stable Diffusion along with the artist’s name to generate new works in the artist’s style without compensating the artist at all,” the complaint reads. “The harm to artists is not hypothetical — works generated by AI image products ‘in the style’ of a particular artist are already sold on the internet, siphoning commissions from the artists themselves.”
Writers have also tried to bring class-action lawsuits against top AI companies. Nonfiction writers Nicholas Basbanes and Nicholas Gage have sued OpenAI and Microsoft, claiming the companies “simply stole” content from their works and must compensate them. The pair of writers want to represent a class of writers that could number in the tens of thousands. This comes on the heels of fiction writers suing OpenAI and seeking to establish a class-action lawsuit in late 2023.
Questions Around Authorship and Competition
The U.S. Copyright Office’s stance on excluding machines from being considered authors could throw a wrench in the Stable Diffusion lawsuit and many others, according to Rob Heverly, an associate professor at Albany Law School who specializes in the intersection of technology and law.
“In order for there to be infringement, there has to be an author. So, if there isn’t an author, I don’t know that there can be infringement,” Heverly told Built In. “If we’re not going to hold the technology maker liable for the technology itself, then the creator of the output is the AI. But we’ve already said they’re not an author. So if they’re not an author then they can’t create an infringing work.”
Yet, amid all these lawsuits against AI companies, the scope of fair use in generative AI may come down to a Supreme Court decision on a 1981 photograph of rock musician Prince and a re-creation of the photograph made by pop artist Andy Warhol a few years later. The Court ruled against the Andy Warhol Foundation, concluding that the use at issue was not transformative enough to qualify as fair use.
A key factor in the Supreme Court’s ruling was the commercial usage of the altered photo and how it targeted the same market as that of the original photo. This could bleed over into the generative AI space as companies claim AI-generated content directly competes with the original content AI tools are trained on. Still, the fair use doctrine’s place in the ongoing legal saga of generative AI is up in the air as the courts continue to untangle a web of lawsuits.
Creators and Companies Alike Take Action
While generative AI hangs in legal limbo, creators are still worried about their work or style being used to train generators without permission or compensation.
“The large majority of independent artists make their living through commissioned works. And it is absolutely essential for them to keep posting samples of their art,” Ben Zhao, a computer science professor at the University of Chicago, told Built In. But the websites where they post their work are being scraped so that AI models can learn and then mimic that particular style. “Artists are literally being replaced by models that have been trained on their own work.”
To help, Zhao and his team designed a new tool called Glaze, which aims to prevent AI models from being able to learn a particular artist’s style. If an artist wants to put a creation online without the threat of an image generator copying their style, they can simply upload it to Glaze first and choose an art style different from their own. The software then makes mathematical changes to the artist’s work on a pixel level so that it looks different to a computer. To the human eye, the Glaze-d image looks no different from the original, but an AI model will read it as something completely different, rendering it useless as an effective piece of training data.
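The idea of a pixel-level change that stays too small for a person to notice can be illustrated with a short sketch. This is not Glaze's actual method: Glaze, as described above, computes its changes so that a model reads the image as a different art style, whereas the hypothetical perturb_image function below simply adds bounded random noise. The function names, the epsilon limit and the tiny grayscale "image" are all assumptions made for illustration.

```python
import random


def perturb_image(pixels, epsilon=2, seed=0):
    """Return a copy of a grayscale image (rows of 0-255 ints) in which
    every pixel is shifted by at most `epsilon` intensity levels.

    Illustrative only: random noise stands in for the optimized,
    style-targeted changes a tool like Glaze would compute.
    """
    rng = random.Random(seed)
    perturbed = []
    for row in pixels:
        new_row = []
        for value in row:
            delta = rng.randint(-epsilon, epsilon)
            # Clamp so the result is still a valid 8-bit pixel value.
            new_row.append(min(255, max(0, value + delta)))
        perturbed.append(new_row)
    return perturbed


def max_pixel_change(original, perturbed):
    """Largest absolute per-pixel difference between two same-sized images."""
    return max(
        abs(a - b)
        for row_o, row_p in zip(original, perturbed)
        for a, b in zip(row_o, row_p)
    )


# A made-up 2x3 grayscale image, used only to exercise the functions.
original = [[0, 64, 128], [192, 255, 30]]
cloaked = perturb_image(original, epsilon=2)
print("largest change per pixel:", max_pixel_change(original, cloaked))
```

A shift of a couple of intensity levels out of 255 is hard for a viewer to see, which is the property the article describes. What makes a real cloaking tool effective is how those small changes are chosen, and that part is not reproduced here.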
Elsewhere, other companies are taking a more proactive approach. Shutterstock, a stock imagery site that was “critical” to the training of OpenAI’s DALL-E, according to OpenAI CEO Sam Altman, has gone so far as to pay content creators if their work is used in the development of generative AI models.
And Shutterstock isn’t alone. Generative AI startup Bria trains its models exclusively on what it calls “responsibly sourced” data sets, and it pays royalties to artists and stock image providers when their creations have been used to generate an image. “We pay back a royalty according to the output,” co-founder and CEO Yair Adato explained. “So if somebody generates a specific art in the style of the artist, then the artist will have the right to say how much money he wants on this synthetic creation. And then we will split the revenue.”
The Future of AI Copyright
If the use of creators’ work in generative AI models continues to go unchecked, many experts in this space believe it could spell big trouble, not only for the human creators themselves but for the technology too.
“When these AI models start to hurt the very people who generate the data that it feeds on — the artists — it’s destroying its own future,” Zhao said. “So really, when you think about it, it is in the best interest of AI models and model creators to help preserve these industries. So that there is a sustainable cycle of creativity and improvement for the models.”
In the U.S., much of this preservation will be incumbent on the courts, where creators and companies are duking it out right now. Looking ahead, the level at which U.S. courts protect and measure human-made inputs in generative AI models could be reminiscent of what we’ve seen globally, particularly in other Western nations.
The United Kingdom is one of only a handful of countries to offer copyright protection for works generated solely by a computer. The European Union, which has a much more preemptive approach to legislation than the U.S., is in the process of drafting a sweeping AI Act that will address a lot of the concerns with generative AI. And it already has a legislative framework for text and data mining that allows only nonprofits and universities to freely scrape the internet without consent — not companies.
If it is ultimately determined that AI companies have infringed on certain creators’ copyrighted work, it could mean a lot more lawsuits in the coming years — and a potentially expensive penalty for the companies at fault.
“One thing you have to know about copyright law is, for infringement of one thing only — it could be a text, an image, a song — you can ask the court for $150,000,” Gervais said. “So imagine the people who are scraping millions and millions of works.”
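To get a sense of the scale Gervais is describing, the arithmetic can be sketched in a few lines. The $150,000 figure comes from his quote and is treated here as a per-work ceiling, not as what a court would actually award. The function name and the work counts are hypothetical, chosen only to show how quickly the total grows.

```python
# Per-work figure taken from the quote above; treated as an upper bound.
MAX_DAMAGES_PER_WORK = 150_000


def max_statutory_exposure(works_infringed, per_work_cap=MAX_DAMAGES_PER_WORK):
    """Upper bound on damages if every infringed work drew the maximum award."""
    if works_infringed < 0:
        raise ValueError("works_infringed must be non-negative")
    return works_infringed * per_work_cap


# Hypothetical work counts, for illustration only.
for count in (1, 1_000, 1_000_000):
    print(f"{count:>9,} works -> up to ${max_statutory_exposure(count):,}")
```

At one million works, the ceiling comes to $150 billion. Actual awards depend on the court and on the facts of each case, so this is a worst-case bound and not a prediction.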
Can AI content be copyrighted?
No — AI content and any works created solely by AI cannot be copyrighted in the United States.
Does generative AI violate copyright laws?
It depends — generative AI may violate copyright laws when the program has access to a copyright owner’s works and is generating outputs that are “substantially similar” to the copyright owner’s existing works, according to the Congressional Research Service. However, there is no federal legal consensus for determining substantial similarity.
Training generative AI models using copyrighted materials is protected under certain conditions by the fair use doctrine of the U.S. copyright statute.
Can AI be sued for copyright?
Companies that develop and are responsible for AI systems can potentially be sued for copyright infringement. Several AI companies are already being sued for allegedly using copyrighted works without permission to train AI models or generate AI content.
Why can’t AI content be copyrighted?
For a work to be copyrighted, it needs a human creator. AI-generated content can’t be copyrighted because it isn’t considered to be the work of a human creator.
How do you avoid copyright infringement in AI?
When using generative AI tools, review any terms of service, license agreements or contracts. These clarify what a tool’s intended purpose is and whether any content created with it can be used for commercial purposes.
Source : https://builtin.com/artificial-intelligence/ai-copyright
Author :
Date : 2023-04-12 21:01:20