OpenAI's Counter to The New York Times Allegations and the Larger Legal Landscape
OpenAI has taken a decisive stance in response to a lawsuit filed by The New York Times, which alleges the unauthorized use of the newspaper's articles to train OpenAI's Large Language Models (LLMs). The company has addressed the accusations in a detailed letter, shedding light on the intricacies of the dispute and providing insight into its data usage practices.
OpenAI begins by refuting The New York Times's claims, asserting that the publication fabricated prompts to induce verbatim regurgitation of text from NYT articles. The company notes that the regurgitated passages often came from years-old articles already scattered across multiple third-party websites. Notably, OpenAI contends that the prompts were intentionally manipulated, whether through specific instructions to the model or by cherry-picking examples from numerous attempts.
OpenAI adds, with evident surprise and disappointment, that it first learned of the lawsuit by reading about it in The New York Times. The company emphasizes that, contrary to the NYT's assertions, the newspaper's content did not significantly contribute to the training of existing models and would not be impactful for future training.
The letter also addresses The New York Times's claim that regurgitation occurred during the two parties' collaboration. OpenAI asserts that, despite its commitment to investigate and rectify any issues, the NYT failed to provide examples when asked about such occurrences. The company underscores its dedication to addressing regurgitation concerns, citing the immediate removal of a ChatGPT feature in July after it was found to reproduce content unintentionally.
OpenAI then delves into broader issues, discussing its licensing deals with news organizations such as the Associated Press, Axel Springer, the American Journalism Project, and NYU. The company asserts that training AI models on publicly available internet materials falls within the doctrine of fair use, citing long-standing precedents and emphasizing its role in fostering innovation and maintaining U.S. competitiveness.
However, OpenAI acknowledges the importance of respecting individuals' rights and provides an opt-out option for those who prefer not to have their data used for training AI models. Notably, The New York Times exercised this option in August 2023, as highlighted by OpenAI.
In a larger context, OpenAI faces broader legal challenges: another lawsuit has been filed by two authors claiming unauthorized use of their published work in training AI models. This legal landscape underscores the evolving dynamics and ethical considerations surrounding the use of data in advancing artificial intelligence.
OpenAI concludes by emphasizing its commitment to ethical practices and its role as a responsible participant in the AI industry, showcasing its opt-out process for publishers and the continuous efforts to address concerns raised by collaborators and individuals alike.