Legal Frontiers: Navigating the Complex Landscape of Generative AI Regulation

Since the release of ChatGPT on November 30, 2022, the widespread adoption of artificial intelligence (AI) has reshaped classrooms and workplaces. Whether used for searching for information, prompting creativity, or even solving homework, AI is on a fast track to becoming an integral part of our lives. However, as AI technologies permeate more areas, they have also brought forth complex legal challenges, particularly in intellectual property and antitrust law. While the current framework of copyright law sufficiently governs whether AI-generated works can receive copyright protection, the recent and ongoing explosion of AI lawsuits raises important questions about copyright infringement disputes and fair competition within the generative AI space.

The AI industry currently faces a multitude of lawsuits over the use of copyrighted materials in AI training. Notably, OpenAI faces claims from the Authors Guild and individual authors such as Paul Tremblay, Michael Chabon, and Sarah Silverman; Chabon and Silverman, among others, have also filed infringement claims against Meta Platforms; a class action lawsuit has been proposed against Alphabet Inc., Midjourney, and Stability AI; and Stability AI is also currently involved in a copyright infringement case filed by Getty Images. [1]

The landmark case Thaler v. Perlmutter (2023) takes the first step in defining the scope of copyright protection for AI-generated content. Decided on August 18, 2023, by the United States District Court for the District of Columbia, the case centers on the eligibility of AI-generated works for copyright protection. The Court held that: “[by] its plain text, the 1976 Act thus requires a copyrightable work to have an originator with the capacity for intellectual, creative, or artistic labor. Must that originator be a human being to claim copyright protection? The answer is yes.” [2] This ruling is pivotal, as it sets a precedent that AI-generated works must contain human creative contributions in order to qualify for copyright protection under the current legal framework.

While copyright law sufficiently dictates the copyright status of works generated solely by AI, it encounters challenges when addressing copyright infringement claims involving large language models (LLMs), a type of generative AI technology that focuses on understanding and generating human language. These models rely on copyrighted materials in their training: large datasets are fed into a machine-learning model, which allows the model to learn from the patterns and information contained within the data. This process raises the question of whether the use of copyrighted content for training constitutes fair use or establishes grounds for infringement claims.

Prior to the explosion of AI-related lawsuits, the US Patent and Trademark Office (USPTO) issued a report in October 2020 that investigated whether existing statutory language and related case law are adequate to govern such scenarios. Its language suggests that using substantial portions of copyrighted works can be considered copyright infringement even if the material is used for “non-expressive” purposes, asserting that “the ingestion of copyrighted works for purposes of machine learning will almost by definition involve the reproduction of entire works or substantial portions thereof.” [3] However, this “ingestion” use may be eligible for an exception to the reproduction right through the fair use doctrine in section 107 of the Copyright Act, 17 U.S.C. § 107, which is aimed at promoting greater freedom of expression. [4]

Whether the learning process of AI algorithms constitutes copyright infringement (as it involves ingesting copyrighted works) must be determined on a case-by-case basis. In the ongoing debate, rights holders argue that AI trainers should compensate authors, as a business expense, whenever an AI program uses their copyrighted materials, while opponents suggest that the content used for AI training and testing should automatically be considered fair use because unrestricted AI development promotes innovation. [5]

OpenAI LP, one of the major developers in the AI space, argues that its process of training AI systems constitutes fair use under current law. OpenAI presented its case for the legality of its AI training program by citing Campbell v. Acuff-Rose Music (1994), where the Supreme Court underscored the importance of “transformative use” in determining fair use, suggesting that a use that adds new expression or meaning to the original work is more likely to be considered fair. [6] Transformative uses support the goal of copyright in promoting creativity and innovation. In the context of AI training, OpenAI argues that this principle strongly supports a finding of fair use because, while the copyrighted works were primarily created for human consumption, their reproduction and digitization in AI training fundamentally repurposes them for the new purpose of creating AI systems. [7] OpenAI adds that this “conclusion is strengthened by reference to existing analogous case law holding that the reproduction of copyrighted works as one step in the process of computational data analysis is a fair use of those works.” [8] However, OpenAI recognizes that the output generated by AI programs may result in copyright infringement but argues that those instances should be evaluated individually, on a case-specific basis, in a manner similar to human-generated works. [9]

On December 27, 2023, The New York Times filed a lawsuit against OpenAI and Microsoft in the U.S. District Court for the Southern District of New York, claiming that the product of their respective AI programs competes with and threatens the Times’ ability to provide news services. [10] The New York Times case differs from Campbell v. Acuff-Rose Music in that the complaint from the Times cites OpenAI’s inclusion of a disproportionate amount of material from the Times in its dataset used for AI training. The complaint states that “the most highly weighted dataset in GPT-3, Common Crawl, is a ‘copy of the Internet’... [the] domain www.nytimes.com is the most highly represented proprietary source.” [11] The Times asserts that “by OpenAI’s own admission, high-quality content, including content from The Times, was more important and valuable for training the GPT models as compared to content taken from other, lower-quality sources.” [12]

Additionally, the Times challenged OpenAI and Microsoft’s reliance on the “fair use” doctrine in the Copyright Act. The Times argued that because the AI programs can summarize articles that are typically accessible only through a Times subscription, their output can have negative commercial impacts on the news publication: “Defendants’ unlawful conduct threatens to divert readers, including current and potential subscribers, away from The Times, thereby reducing the subscription, advertising, licensing, and affiliate revenues that fund The Times’s ability to continue producing its current level of groundbreaking journalism.” [13] The Times’ allegations contest OpenAI’s invocation of the “fair use” doctrine by asserting that the fourth factor (the effect of the use upon the potential market for or value of the copyrighted work) weighs against OpenAI, as the generative AI outputs have an adverse effect on the value of New York Times articles.

Notably, on February 26, 2024, OpenAI moved to dismiss Count I (Direct Infringement) in part, to the extent it rests on conduct that occurred more than three years ago; it also moved to dismiss in full Count IV (Contributory Infringement), Count V (Copyright Management Information Removal), and Count VI (Unfair Competition by Misappropriation) on the grounds of the Copyright Act. [14] However the court resolves these arguments, the outcome of this lawsuit will set precedent for the valuation and legal use of copyrighted materials in AI training and will be crucial in guiding the direction of future AI development. OpenAI had previously emphasized the importance of copyrighted works for innovation in AI, stating: “[it] would be impossible to train today’s leading AI models without copyrighted materials… Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.” [15]

While the New York Times lawsuit underscores the complexities of using copyrighted materials in AI development, other media companies have opted to collaborate with AI firms, seeking a balanced solution that both respects copyright and promotes technological progress. In December 2023, Axel Springer announced a global partnership with OpenAI. [16] Axel Springer is a media company based in Berlin, Germany, whose brands include Politico and Business Insider, among other European media outlets. The collaboration involves using Axel Springer media content to train LLMs and providing ChatGPT users with summaries that carry proper attribution and links to the full news articles. Additionally, the collaboration will generate substantial revenue for Axel Springer and support its ventures in incorporating AI into its approach to journalism. A similar agreement has also been established between OpenAI and the Associated Press. [17]

While this emerging trend of partnerships between AI and media firms aims to provide a mutually beneficial outcome by ensuring copyright protection and fostering innovation, the solidification of this norm may also raise antitrust concerns in the generative AI space. Because of the exorbitant financial resources necessary to lawfully acquire data for LLM training, between $1 million and $5 million annually, [18] AI startups are compelled to collaborate with established tech giants that either already own large datasets or have the financial power to defend against copyright infringement allegations. In 2023, tech giants Microsoft, Google, and Amazon collectively invested over $19 billion in key generative AI companies, primarily OpenAI, Anthropic, and Inflection. [19] The dependency of generative AI companies on established tech giants for protection against potential infringement cases stifles the diversity of products in the market and limits the direction of AI innovation by consolidating power and influence in the generative AI space among a few major players.

The potential for monopolistic practices in the generative AI space can pose serious risks to competition, diversity, and accessibility within the sector. Aside from tech companies’ integration of AI models in their own products and search engines, their influence over AI companies may also steer the research and development of this technology towards the direction of greatest profitability rather than public interest, which was the founding mission of the nonprofit OpenAI. [20]

The Federal Trade Commission (FTC) shares this concern and, on January 25, 2024, launched an inquiry under the FTC Act into the partnerships between AI startups and tech investors. [21] Section 6(b) of the FTC Act authorizes the Commission to require companies to file special reports providing information about their business. The compulsory orders were issued to five companies: Amazon, Google, Microsoft, Anthropic, and OpenAI, requesting information on “certain investments in or partnerships with Artificial Intelligence developers and the potential impact of such partnerships and investments on competition.” [22] The investigation aims not only to identify applications of AI that are harmful to consumers, but also to examine whether market leaders, potentially including OpenAI and Anthropic with the support of tech giants Amazon, Google, and Microsoft, are exploiting the emerging generative AI market. If the review determines that these partnerships foster anti-competitive practices, it could lead to formal investigations and enforcement actions, including mandates to block certain agreements or to alter investment strategies.

The impact of such actions on AI innovation remains unclear. On one hand, stricter regulatory scrutiny may promote a healthier competitive environment, thereby encouraging more innovation; on the other hand, it can slow down innovation if companies become more cautious in their investments in the AI space. Regardless, the FTC's investigation will likely promote greater transparency from large tech companies when entering into partnerships with or making investments in AI startups.

The unfolding AI landscape, marked by developing lawsuits and investigations, sets the stage for profound changes in copyright and antitrust regulations as they pertain to AI. While the potential outcomes of these legal battles remain unclear, they will nevertheless influence how AI companies collaborate with both news organizations and technology companies in the future. The growing trend of strategic collaborations between AI firms and tech giants is partially driven by the need to share costly data resources and to navigate a complex copyright landscape marked by infringement lawsuits from authors and news media outlets such as the Times. While some partnerships may be mutually beneficial, others raise significant antitrust concerns by concentrating market power and limiting competition. Given the attitudes and converging interests of news media outlets, tech giants, and regulatory agencies, the AI field is poised for drastic transformations in both its technological and legal dimensions, and New York Times v. OpenAI can shed light on this new area.
Edited by Stella Tallmon

[1] Zirpoli, Christopher T. “Generative Artificial Intelligence and Copyright Law.” CRS Reports, September 29, 2023. https://crsreports.congress.gov/product/pdf/LSB/LSB10922.

[2] Thaler v. Perlmutter, Case 1:22-cv-01564-BAH, 1 (D.D.C., 2023). https://www.copyright.gov/ai/docs/district-court-decision-affirming-refusal-of-registration.pdf

[3] “Public Views on Artificial Intelligence and Intellectual Property Policy.” USPTO AI Report, October 7, 2020. https://www.uspto.gov/sites/default/files/documents/USPTO_AI-Report_2020-10-07.pdf.

[4] “U.S. Copyright Office Fair Use Index.” US Copyright Office. Accessed April 13, 2024. https://www.copyright.gov/fair-use/.

[5] Zirpoli, Christopher T. “Generative Artificial Intelligence and Copyright Law.” CRS Reports, September 29, 2023. https://crsreports.congress.gov/product/pdf/LSB/LSB10922.

[6] Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994).

[7] “Before the United States Patent and Trademark Office: Comment Regarding Request for Comments on Intellectual Property Protection for Artificial Intelligence Innovation.” Docket No. PTO–C–2019–0038, October 30, 2019. https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-58141.pdf.

[8] “Before the United States Patent and Trademark Office: Comment Regarding Request for Comments on Intellectual Property Protection for Artificial Intelligence Innovation.” Docket No. PTO–C–2019–0038, October 30, 2019. https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-58141.pdf.

[9] “Before the United States Patent and Trademark Office: Comment Regarding Request for Comments on Intellectual Property Protection for Artificial Intelligence Innovation.” Docket No. PTO–C–2019–0038, October 30, 2019. https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-58141.pdf.

[10] The New York Times v. Microsoft Corporation, 1:23-cv-11195, (S.D.N.Y. Dec 27, 2023) ECF No. 1. https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf

[11] The New York Times v. Microsoft Corporation, 1:23-cv-11195, (S.D.N.Y. Dec 27, 2023) ECF No. 1. https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf

[12] The New York Times v. Microsoft Corporation, 1:23-cv-11195, (S.D.N.Y. Dec 27, 2023) ECF No. 1. https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf.

[13] The New York Times v. Microsoft Corporation, 1:23-cv-11195, (S.D.N.Y. Dec 27, 2023) ECF No. 1. https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf.

[14] The New York Times Company v. Microsoft Corporation, 1:23-cv-11195, (S.D.N.Y. Feb 26, 2024) ECF No. 52.

[15] “OpenAI—Written Evidence (LLM0113).” House of Lords Communications and Digital Select Committee Inquiry: Large language models, December 5, 2023. https://committees.parliament.uk/writtenevidence/126981/pdf

[16] Sisani, Adib, and Julia Sommerfeld. “Axel Springer and OpenAI Partner to Deepen Beneficial Use of AI in Journalism.” Axel Springer, December 13, 2023. https://www.axelspringer.com/en/ax-press-release/axel-springer-and-openai-partner-to-deepen-beneficial-use-of-ai-in-journalism

[17] Easton, Lauren, and Niko Felix. “AP, OpenAI Agree to Share Select News Content and Technology in New Collaboration.” The Associated Press, July 13, 2023. https://www.ap.org/media-center/press-releases/2023/ap-open-ai-agree-to-share-select-news-content-and-technology-in-new-collaboration/.

[18] Sapienza, Brandon. “OpenAI Offers $1m-$5M per Year to License News: Information.” Bloomberg Law, January 4, 2024. https://news.bloomberglaw.com/artificial-intelligence/openai-offers-1m-5m-per-year-to-license-news-information.

[19] Nylen, Leah. “Alphabet, Amazon, Microsoft Face FTC Inquiry on OpenAI, Anthropic Partnerships.” Bloomberg.com, January 25, 2024. https://www.bloomberg.com/news/articles/2024-01-25/alphabet-amazon-anthropic-microsoft-openai-get-ftc-inquiry-lrthp0es?embedded-checkout=true.

[20] Brockman, Greg, Ilya Sutskever, and OpenAI. “Introducing OpenAI.” Introducing OpenAI, December 11, 2015. https://openai.com/blog/introducing-openai.

[21] Graham, Victoria. “FTC Launches Inquiry into Generative AI Investments and Partnerships.” FTC Office of Technology, January 25, 2024. https://www.ftc.gov/news-events/news/press-releases/2024/01/ftc-launches-inquiry-generative-ai-investments-partnerships.

[22] Federal Trade Commission, “Order to File a Special Report.” AI Investments 6(b) order and resolution, January 24, 2024. https://www.ftc.gov/system/files/ftc_gov/pdf/P246201_AI_Investments_6(b)_Order_and_Resolution.pdf

Michelle Lian