HD-INC-002

Legal services · United States · 2023 · Hallucination & fabrication

Mata v. Avianca: the lawyer who cited six cases that did not exist and asked ChatGPT to confirm them

By Ellie Harris · Filed 1 March 2023

Alleged: Levidow, Levidow & Oberman P.C. developed or deployed the AI system implicated in this incident. Details are drawn from public reports; parties are presumed innocent of any wrongdoing not established by an official finding.

What happened

On 27 or 28 August 2019, Roberto Mata was on an overnight Avianca flight from El Salvador to John F. Kennedy Airport in New York when a metal serving cart struck his knee. In 2022 he sued Avianca in New York state court, and the airline removed the case to the United States District Court for the Southern District of New York. Avianca then moved to dismiss, arguing the claim was time-barred under the Montreal Convention. Mata’s lawyer was Steven A. Schwartz of the New York firm Levidow, Levidow & Oberman, a personal-injury attorney admitted to the New York bar for more than three decades. He drafted an affidavit opposing the dismissal, which was filed in March 2023. The affidavit cited six federal decisions in support of his theory that a bankruptcy stay tolled the limitations period: Varghese v. China Southern Airlines, Martinez v. Delta Airlines, Shaboon v. EgyptAir, Petersen v. Iran Air, Estate of Durden v. KLM Royal Dutch Airlines, and Miller v. United Airlines.

Avianca’s counsel could not find any of the cases. The court’s own staff could not find them. Schwartz had used ChatGPT to do the research. He had then asked ChatGPT itself whether the cases were real. ChatGPT had confirmed they were. None of them existed.

On 22 June 2023, Judge P. Kevin Castel imposed Rule 11 sanctions on Schwartz, on co-counsel Peter LoDuca (who had signed the affidavit because of the case’s admission requirements but had no role in the research and later swore he had “no reason to doubt the sincerity” of Schwartz’s work), and on the firm itself. The total sanction was USD 5,000. Castel also ordered the lawyers to send a copy of the sanctions order and the affidavit to Mata himself, and separately to each of the federal judges whose names had been falsely associated with the fabricated cases. The forty-six-page sanctions opinion is now the founding document of the AI-in-court literature and is cited in nearly every later case involving AI-fabricated submissions.

Schwartz’s defence was that he had not understood ChatGPT could fabricate. He said he believed it was operating like a search engine connected to a real database of cases. Castel found subjective bad faith on a conscious-avoidance theory. A reasonable lawyer would have checked the citations against a real legal database before filing, regardless of where the citations came from.

What an auditable version would have shown

Mata is the case that taught the United States legal profession the difference between retrieved facts and generated facts. Schwartz believed ChatGPT was retrieving from a corpus of real decisions. The model was producing plausible-sounding text. When Schwartz asked the model whether Varghese v. China Southern Airlines was real, the model’s answer came from the same generative process that had produced the citation in the first place. Asking the model to verify the model was a closed loop.

An auditable record of the research session would have tagged each citation with its source at the moment of generation. A real retrieved citation comes from a connected legal database such as Westlaw, LexisNexis, or PACER and carries that source’s identifier. A model-generated citation has no external referent at all. The audit log makes the distinction automatically, and the distinction shows up in the lawyer’s writing tool, on the reviewer’s checklist, and, if the document ever reaches a courtroom, in the record before the judge.

With that record in place, Schwartz’s question to ChatGPT could not have produced a misleading answer because the verification would have been routed away from the model entirely. The legal database would have returned no match for Varghese. The citation would have been flagged. The brief would not have been filed in the form it was.
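The source tagging described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system any firm runs: `SourceType`, `TaggedCitation`, `tag_citation`, and `needs_review` are hypothetical names, and `database_hit` stands in for whatever a connected legal database would actually return.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class SourceType(Enum):
    RETRIEVED = "retrieved"  # came back from a connected legal database
    GENERATED = "generated"  # produced by the model, no external referent


@dataclass(frozen=True)
class TaggedCitation:
    text: str
    source_type: SourceType
    source_name: Optional[str] = None  # name of the database, if any
    source_id: Optional[str] = None    # that database's own identifier, if any


def tag_citation(
    text: str, database_hit: Optional[Tuple[str, str]] = None
) -> TaggedCitation:
    """Tag a citation at the moment it enters the research session.

    database_hit is a hypothetical (source_name, source_id) pair. Passing
    nothing means no database supplied the citation, so it is tagged as
    model-generated.
    """
    if database_hit is None:
        return TaggedCitation(text, SourceType.GENERATED)
    source_name, source_id = database_hit
    return TaggedCitation(text, SourceType.RETRIEVED, source_name, source_id)


def needs_review(citations: List[TaggedCitation]) -> List[TaggedCitation]:
    """Return every citation that has no external referent."""
    return [c for c in citations if c.source_type is SourceType.GENERATED]


# Usage: the database name and identifier below are placeholders.
session = [
    tag_citation(
        "Placeholder v. Placeholder",
        database_hit=("example-database", "DOC-0001"),
    ),
    tag_citation("Varghese v. China Southern Airlines"),
]
for citation in needs_review(session):
    print("UNVERIFIED:", citation.text)
```

The point of the design is that the tag is assigned once, when the citation appears, and travels with it. Nothing downstream has to guess where a citation came from.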

Where the gap was

The technology to ground citations against real databases had existed for decades by March 2023. Westlaw and LexisNexis had been doing it since the early 1980s. Schwartz did not use those tools. He used a general-purpose chatbot trained on text from the web, which had learned the surface form of case citations without having any reliable connection to which cases were real.

The gap was a lawyer treating a general-purpose AI tool as if it were a connected legal research system. The dedicated legal AI category has since adjusted. The major legal AI tools (Harvey, Spellbook, Lexis+ AI, and Westlaw Precision AI) are built to ground their output against verified case databases and to surface the citation source. Within that category, the problem is understood and largely addressed. The bigger exposure now sits with any lawyer in a hurry who reaches past those tools for a general-purpose chatbot, where the architecture that produced Schwartz’s brief is still the default.

The repetition continues. The Australian cases catalogued elsewhere in this library all involve the same architecture: a lawyer using a general-purpose chatbot for legal research, the chatbot fabricating citations, the lawyer not verifying them before filing. The technology has moved on. The workflow inside many firms has not moved with it. As of mid-2026, public trackers record more than 1,200 cases worldwide in which courts have flagged AI-fabricated submissions between 2023 and 2026, with sanctions imposed in hundreds of them.

What governance should have looked like

When anyone asks an AI tool whether a citation is real, the question gets routed to a real legal database, never back to the model. The model is allowed to suggest citations. The verification step lives outside the model entirely and is non-negotiable before any document leaves the firm.
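A minimal sketch of that routing rule follows. The class names `CitationVerifier`, `VerificationGate`, and `ConductRecord`, and the field `model_consulted_for_verification`, are the ones this report uses; the bodies are assumptions written for illustration and are not taken from the open-source repository. The `lookup` callable is a hypothetical stand-in for a query against a real legal database.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ConductRecord:
    citation: str
    found_in_database: bool
    model_consulted_for_verification: bool


class CitationVerifier:
    """Answers "does this case exist?" from a database, never from a model.

    lookup is a hypothetical callable that returns True when the database
    has a match for the citation.
    """

    def __init__(self, lookup: Callable[[str], bool]) -> None:
        self._lookup = lookup

    def exists(self, citation: str) -> bool:
        return bool(self._lookup(citation))


class VerificationGate:
    """Routes every verification question to the database-backed verifier."""

    def __init__(self, verifier: CitationVerifier) -> None:
        self._verifier = verifier

    def verify(self, citation: str) -> ConductRecord:
        # No model is called on this path, so the flag is recorded as False.
        return ConductRecord(
            citation=citation,
            found_in_database=self._verifier.exists(citation),
            model_consulted_for_verification=False,
        )

    def release(self, citations: List[str]) -> List[ConductRecord]:
        """Refuse to release a document while any citation is unverified."""
        records = [self.verify(c) for c in citations]
        missing = [r.citation for r in records if not r.found_in_database]
        if missing:
            raise ValueError(f"unverified citations: {missing}")
        return records


# Usage: an in-memory set stands in for the real database.
known_cases = {"Placeholder v. Placeholder"}
gate = VerificationGate(CitationVerifier(lambda c: c in known_cases))
record = gate.verify("Varghese v. China Southern Airlines")
print(record.found_in_database, record.model_consulted_for_verification)
# prints: False False
```

In this sketch the gate holds no reference to a model at all, so there is no code path by which the model could be asked to vouch for its own output.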

The decisive field is model_consulted_for_verification. In Mata, that flag would have been set to True, because Schwartz asked the model itself whether the citations were real. With this pattern in place, the flag is always False. The model proposes citations. A real database disposes of the question of whether they exist.

The verification gate is one layer. Schwartz’s firm had several others available. The first is mandatory database verification of every cited case before any brief leaves the firm, treated as a workflow step rather than an assumption. The second is source-type display in the writing environment, so a lawyer can see at a glance which sentences are AI-generated and which are retrieved from a real index. The third is AI disclosure on filings, which most US federal courts and several state and Australian courts have since required by standing order. The fourth is junior-attorney citation review as a named workflow stage, separate from substantive review, so the verification work is not folded into “drafting” and lost. None of these is exotic. They are documented practice in any firm that has read the Mata sanctions opinion. The cumulative cost of implementing them is less than the cost of a single Rule 11 sanction and the bar referral that often follows.

The reference implementation of CitationVerifier, VerificationGate, and ConductRecord is open source. It lives at github.com/saffronandindia/headlights-oss, Apache 2.0 licensed, free for any firm to install. The repository is public now.

Sources

The record

An auditable system would have produced a signed, tamper-evident record the moment this happened: what the system did, the version that did it, the basis it acted on, and the action taken. Levidow, Levidow & Oberman P.C. could have produced it on demand.

This is the record the system as deployed did not produce in a signed, auditable form.

What this teaches

Capture what happened when it happens

What the system did, the version that did it, the basis it acted on, and the action taken, recorded at the moment, not reconstructed after.

Sign it, so no one has to trust the record-keeper

A tamper-evident entry. Edit it later and the signature breaks. The record does not ask for the benefit of the doubt.
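As a rough illustration of "edit it later and the signature breaks", the sketch below signs a record with an HMAC from Python's standard library. The function names and the record fields are assumptions made for this example. An HMAC relies on a shared secret key, so it only demonstrates tamper evidence; a record that third parties can check without that key would need a public-key signature scheme instead.

```python
import hashlib
import hmac
import json


def canonical(entry: dict) -> bytes:
    """Serialise the entry deterministically so equal entries sign equally."""
    return json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(entry: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the canonical entry."""
    return hmac.new(key, canonical(entry), hashlib.sha256).hexdigest()


def verify(entry: dict, signature: str, key: bytes) -> bool:
    """Check the signature using a constant-time comparison."""
    return hmac.compare_digest(sign(entry, key), signature)


# Usage: the key and every field value here are placeholders.
key = b"demo-key-not-for-production"
entry = {
    "what_the_system_did": "proposed a citation",
    "version": "example-model-version",
    "basis": "generated text, no database match",
    "action_taken": "citation flagged before filing",
}
signature = sign(entry, key)
print(verify(entry, signature, key))  # True: the entry is untouched

entry["basis"] = "retrieved from a database"  # a later edit to the record
print(verify(entry, signature, key))  # False: the signature no longer matches
```

Sorting the keys before serialising matters here: without a canonical form, two logically identical records could produce different signatures.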

Make it verifiable by anyone

A court, a regulator, a customer's lawyer can check the record themselves, without taking the company, or us, at our word.

Also in the library

HD-INC-001 · Air Canada chatbot promised a bereavement refund policy that did not exist · Aviation · 2022

HD-INC-003 · Michael Cohen gave his lawyer fake case citations he had got from Google Bard, and his lawyer filed them in a federal court · Legal services · 2023

HD-INC-005 · Cursor's AI support bot, signing emails as "Sam", invented a single-device subscription policy that never existed, and developers cancelled · Technology · 2025

Headlights summarises publicly reported AI incidents. All summaries are independently written, attributed to their original sources, and intended for research and educational purposes. Allegations are identified as such until established through official findings.

Last reviewed June 2026. This report is based on the sources listed above and reflects information available at the time of review; later developments may not be captured. Where a person is described as charged with or alleged to have done something, that allegation is unproven unless a conviction or a court or regulatory finding is stated. Headlights publishes journalism and commentary, not legal advice.
