Federal Court Orders OpenAI to Produce Millions of Anonymized ChatGPT Conversations in Copyright Litigation

NEW YORK, NY – A U.S. federal magistrate judge in Manhattan has ruled that OpenAI must hand over approximately 20 million anonymized user chat logs to news publishers pursuing copyright claims, rejecting the company’s arguments that doing so would unduly compromise user privacy and burden its operations.
The discovery order arises from a copyright infringement suit originally filed by The New York Times and later consolidated with similar claims from other media organizations, which assert that OpenAI trained its large language models on copyrighted news content without permission.
The dispute dates back to mid-2025, when plaintiffs first sought access to OpenAI’s internal chat logs as evidence to test whether ChatGPT outputs reproduced proprietary content.
Magistrate Judge Ona T. Wang issued a production order in early November 2025, directing OpenAI to turn over a sample of 20 million consumer ChatGPT conversations, de-identified to remove direct personal identifiers.
OpenAI sought to block or stay that order, arguing in court filings that users’ private conversations would still face significant privacy risks and that most of the logs were irrelevant to the plaintiffs’ claims. The company also proposed alternative methods, such as running specific keyword searches to locate relevant material, rather than transferring raw data.
In early December 2025, the judge denied OpenAI’s request for reconsideration, reaffirming the earlier directive and concluding that the production was proportional to the needs of the case and that the data could be protected under existing safeguards and anonymization procedures.
The logs in question are drawn from user interactions spanning approximately December 2022 through November 2024 and represent a random or statistically valid sample of conversations retained by OpenAI’s systems.
While the records do not include enterprise accounts, subscription business logs, or API customer data, each selected log contains a full sequence of prompts and model responses – potentially tens of millions of individual entries.
In the court’s view, protective orders and anonymization mitigate the privacy concerns, and the logs’ relevance to the case warrants production.
OpenAI has appealed the magistrate’s production order to a district court judge, arguing that it is overly broad and requires disclosing chats unrelated to the dispute. Company representatives have described the requirement as an intrusion that may violate longstanding privacy expectations between users and the platform.
Plaintiffs, including The New York Times and others, contend that access to user logs is critical to proving their allegations that the AI system outputs text that is substantially similar to copyrighted material, which they argue undermines OpenAI’s reliance on a “fair use” defense.
Beyond this specific case, the decision has broader implications for how courts balance user privacy against evidentiary needs in litigation involving artificial intelligence and data-driven technologies. Legal analysts say it could influence future discovery disputes in similar copyright and data governance litigation.
Under the current schedule, OpenAI is expected to begin providing the de-identified logs to plaintiffs once anonymization is complete unless the company secures an emergency stay from an appellate court. The ongoing litigation is expected to proceed through 2026, with the discovery phase playing a central role in shaping arguments on both sides.

Key Facts and Details

| Item | Detail |
|---|---|
| Case Type | Federal copyright infringement litigation |
| Primary Defendant | OpenAI |
| Lead Plaintiff | The New York Times (with other publishers in related actions) |
| Court | U.S. District Court, Southern District of New York |
| Presiding Magistrate Judge | Ona T. Wang |
| Ruling Issued | November 2025 (order reaffirmed December 2025) |
| Discovery Ordered | Approximately 20 million anonymized ChatGPT user conversations |
| Time Period Covered | Roughly December 2022 – November 2024 |
| Data Scope | Consumer ChatGPT logs only (no enterprise, API, or business accounts) |
| Privacy Safeguards | De-identification and protective order |
| Current Status | OpenAI appealing discovery order |

Timeline of Key Developments

| Date | Event |
|---|---|
| Late 2023 | Major publishers, led by The New York Times, file copyright claims against OpenAI |
| Mid-2025 | Plaintiffs request ChatGPT logs to evaluate alleged copyrighted output |
| Nov. 2025 | Magistrate judge orders OpenAI to produce 20M anonymized chat logs |
| Nov.–Dec. 2025 | OpenAI seeks reconsideration and stay of the order |
| Dec. 2025 | Court denies reconsideration, reaffirming discovery directive |
| Jan. 2026 | Appeal pending before district judge; production preparations underway |