Lawletter Vol 48 No. 3
ChatGPT, LLMs, and Legal Research
Brett R. Turner—Senior Attorney
What Is ChatGPT? What Are LLMs?
ChatGPT is one particular brand of a large language model, or LLM. LLMs are a recent technological advance in how computers and humans communicate with one another. In one direction, LLMs parse plain-language instructions and convert them into language which a computer can understand. In the opposite direction, LLMs allow computers to translate their output into ordinary language for humans, including not only sentences but also entire written products, such as memos or briefs.
More specifically, LLMs work by starting with certain words (the prompt) and predicting, one word at a time, which words are most likely to follow, based on patterns of word association in their training material. An algorithm then assembles the chosen words into a product written in ordinary human language.
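The word-association idea can be illustrated with a toy sketch. The code below is not how ChatGPT is actually built (real LLMs use neural networks trained on vast corpora); it is only a minimal, hypothetical model that counts which word most often follows each word in a tiny "training" text, then generates output by chaining the most common successors. Even this crude version shows the key point: the output is fluent-looking but is assembled from associations, with no check on whether it is true.

```python
from collections import Counter, defaultdict

# Tiny "training" corpus for the illustration.
corpus = ("the court held that the motion was denied and "
          "the court held that the appeal was dismissed").split()

# Count which word follows each word in the corpus.
successors = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    successors[current_word][next_word] += 1

def generate(prompt_word, length=6):
    """Starting from a prompt word, repeatedly emit the most
    common successor -- pure word association, no understanding."""
    words = [prompt_word]
    for _ in range(length):
        options = successors.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))  # fluent-sounding, but no one checked it for truth
```

A real LLM works with probabilities over billions of learned associations rather than simple counts, but the underlying operation is the same in kind: choosing plausible next words, not verifying facts.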
ChatGPT has been analogized to the automatic chat bots found on many support websites. The software begins with a prompt, scans through a specific list of documents, and produces the content of those materials in ordinary human language.
The Cloud in the Silver Lining: False Authority
While ChatGPT is good at communicating with humans, at its core it is only a communications device. It finds words which are commonly associated with other words, but it does not actually understand the prompt or think for itself. It only reformats and reproduces words and sentences found in the training materials.
ChatGPT is capable of producing a written document which looks, to the casual eye at least, like a memorandum or brief. Upon careful examination, however, the document is nothing of the sort. ChatGPT produces relevant sentences, cites authority in support of those sentences, and even provides copies of the cases. But the citations are very frequently false, and the cases provided were never actually decided by any court. Rather, they too are products of ChatGPT.
The Consequences of False Authority: Sanctions
Because ChatGPT makes up its own law, attorneys using the software blindly have found themselves in serious trouble. In Mata v. Avianca, Inc., No. 22‑cv‑1461 (PKC), 2023 U.S. Dist. LEXIS 108263 (S.D.N.Y. June 22, 2023), an attorney submitted to the court a brief written by ChatGPT. Upon request by the court, the attorney then provided copies of cases. Both the citations and the cases were invented entirely by the software. To make the situation worse, the attorney did not immediately admit ChatGPT's involvement, but rather he "did not begin to dribble out the truth" until almost three months after the brief was submitted. Id. at *3. The attorney and his law firm were found jointly liable for $5,000 in Rule 11 sanctions.
The sanctions in Mata may be only the tip of the iceberg. The attorney in that case could at least claim that he had no reason to suspect that ChatGPT would make up case law. The judge stated openly that he would have been more sympathetic if the attorney had been quicker to admit the use of ChatGPT. But ChatGPT has received a good amount of attention: as of this writing, Lexis shows that the term "ChatGPT" has been cited in no fewer than 260 law-related secondary sources. It is much less likely today that an attorney can plausibly claim to be unaware of ChatGPT's limitations, and ignorance will probably become even less defensible as the limitations of LLMs get more public attention. So the sanctions for using ChatGPT without supervision are likely to increase in the future.
Relying upon Mata, a federal judge in Texas issued a standing order that "[a]ll attorneys appearing before the Court must file on the docket a certificate attesting either that no portion of the filing was drafted by generative artificial intelligence (such as Chat GPT . . .) or that any language drafted by generative artificial intelligence was checked for accuracy, using print reporters or traditional legal databases, by a human being." https://www.txnd.uscourts.gov/judge/judge-brantley-starr.
The problems with ChatGPT are not by any means limited to case law. A federal bankruptcy judge asked ChatGPT to "prepare an essay about the evolution of social media and its impact on creating personas and marketing products." In re Vital Pharm., 652 B.R. 392, 398 n.12 (Bankr. S.D. Fla. 2023). The essay cited five sources. None of them actually existed. The judge gave up on ChatGPT and did his own research.
Note also that the use of ChatGPT may be easily detected. Among the more enthusiastic users of the software are college undergraduates, eager to submit the best possible essays to their professors with the least possible expenditure of time and effort. An assistant professor of history at North Carolina State, Bret Devereaux, reports that he has already received papers written by ChatGPT. But he also states that "ChatGPT responses seem to be actually pretty easy to spot once you know how to look for the limitations built into the system." Bret Devereaux, Collections: On ChatGPT (Feb. 17, 2023), https://acoup.blog/2023/02/17/collections-on-chatgpt/. If college professors can spot ChatGPT essays quickly, it seems likely that trial judges will be able to spot ChatGPT briefs quickly as well.
The Present Reality
Existing case law suggests a bottom-line lesson: do not ever, ever use ChatGPT or any other LLM to write any form of brief or memorandum without extremely close supervision. It may be possible to obtain some value from the software if the sources are checked closely. But the Florida bankruptcy judge in Vital Pharmaceuticals found that citation of false sources was so common that, to fix the problem, he had to do the same work he would have done to write the product himself in the first place. It is, therefore, questionable whether, at present, there is much value in using ChatGPT itself to draft any court document.
To say that LLMs have very limited usefulness at present is not to say that they will never be useful at all. The software is a capable assembler of words. What is needed, above all, is a way to tell the software not to make up authority. This may be harder than it looks, because the software essentially looks for words that are used together in its source material. It is only a sophisticated word-association machine; it does not actually think or reason. And a sentence strung together from words that are commonly used together is not necessarily true. "Prior" may be commonly used with "case law," and "real" may be commonly used with "problem," but the sentence those associations suggest, that prior case law is a real problem, is not at all true. Indeed, experts have suggested that it may be impossible to completely prevent any LLM from relying on false authority. See Lucas Mearian, AI Deep Fakes, Mistakes, and Biases May Be Unavoidable, but Controllable, Computerworld (May 4, 2023), https://www.computerworld.com/article/3695508/ai-deep-fakes-mistakes-and-biases-may-be-unavoidable-but-controllable.html.
More generally, the main path to better software for helping attorneys make arguments may lie elsewhere. ChatGPT helps computers and humans communicate, but it does nothing to help computers to find helpful case law. Rather, ChatGPT avoids the issue by making up its own case law. By doing so, ChatGPT is running directly away from the core problem of legal research—finding real, actual, existing case law which helps an attorney make an argument.
At the National Legal Research Group, we have been using cutting-edge digital research tools for many years. Computers are an essential tool which our research attorneys use every day to find cases for our clients. But even the best computer software for legal research is no substitute for the reasoning and writing ability of the human brain. We are following developments in the LLM field closely, and we are committed to using computers in the most efficient and effective way possible. At present, however, LLMs in general and ChatGPT in particular, when used for legal research, create much more risk than benefit.