You paste a question into an AI assistant and get back two confident paragraphs: a clear explanation, a plausible statistic, and a specific citation — author name, title, year, page number. The prose is fluent, the structure is logical, and nothing about it looks invented. But the citation does not exist, the statistic has no traceable origin, and the confident tone was generated the same way the words were: one token at a time, based on what sounds right. By the end of this article you will be able to apply a three-step protocol to any AI-generated factual claim before you use it — and you will understand why the model's apparent confidence is not evidence of anything.

The Fluent Witness with No Memory

A large language model does not remember facts. It does not consult a database, retrieve a stored document, or check a claim against a source before writing it. What it does is predict the most plausible sequence of words given the prompt and everything it was trained on. When that prediction produces a sentence about a court ruling or a statistic or a study, it sounds authoritative because authoritative-sounding sentences were common in its training data. The fluency is real. The memory is not.

This matters for one non-obvious reason: AI confidence carries zero evidential weight. A model that says "studies confirm" or "according to the SEC filing" is not reporting what it found — it is generating text that fits the pattern of how verified claims are typically phrased. There is no internal experience of certainty behind those words. There is no internal experience of anything. The confidence is a stylistic feature, not a signal.

The practical consequence is that right and wrong answers look identical. When an AI gets something right, it is not because it checked; the correct sentence was simply the most plausible continuation. When it gets something wrong, it phrases the wrong thing with exactly the same fluency and conviction. Nothing in the model's output alone lets you distinguish between the two.

The Three-Step Verification Protocol

The protocol has three steps and one prerequisite. The prerequisite: do not ask the AI to verify its own claim. Asking the model "are you sure?" or "can you give me the source?" does not constitute verification — it produces more output from the same system that generated the original claim. A model asked whether a fabricated case exists will often confirm that it exists, and may add further plausible-sounding details. This is not correction; it is elaboration.

Step 1 — Separate reasoning from factual claims

Read the AI's output and draw a hard line between two categories. The first is reasoning: logical inferences, frameworks, analogies, structural explanations of how something works. The second is factual claims: specific names, dates, figures, citations, case numbers, regulatory rulings, study findings. The reasoning can be evaluated on its internal logic and may be genuinely useful. The factual claims cannot be accepted on the basis of the output alone — each one requires independent verification.
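As a rough illustration of that first pass, the sketch below flags sentences that contain checkable specifics. The patterns and the function names (`needs_verification`, `triage`) are assumptions made for this example, not part of any tool. The heuristic will both over-flag and under-flag, so it can only queue sentences for your attention; drawing the line remains a human judgment.

```python
import re

# Hypothetical heuristic patterns: each one suggests a sentence contains
# a specific that could be checked against a primary source.
FACT_PATTERNS = [
    re.compile(r"\d"),        # any digit: dates, figures, page numbers
    re.compile(r"\bv\.\s"),   # case names such as "Mata v. Avianca"
    re.compile(r"\b(according to|study|studies|ruling|filing|et al\.)",
               re.IGNORECASE),
]


def needs_verification(sentence: str) -> bool:
    """Return True if the sentence matches any factual-claim pattern."""
    return any(pattern.search(sentence) for pattern in FACT_PATTERNS)


def triage(sentences: list[str]) -> tuple[list[str], list[str]]:
    """Split sentences into (factual claims to verify, reasoning to evaluate)."""
    factual = [s for s in sentences if needs_verification(s)]
    reasoning = [s for s in sentences if not needs_verification(s)]
    return factual, reasoning


if __name__ == "__main__":
    output = [
        "Diversification reduces exposure to any single issuer.",
        "A 2019 study found a 12% drop in returns.",
        "In Mata v. Avianca the court sanctioned the attorneys.",
    ]
    factual, reasoning = triage(output)
    print("Verify independently:", factual)
    print("Evaluate on logic:", reasoning)
```

A sentence that names a person or an institution without any digits or keywords would slip past these patterns, which is why the output is a starting queue rather than a verdict.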

Step 2 — Find the primary source independently

For each factual claim you intend to use, locate the primary source through your own search — not by asking the AI for it. A primary source is the original document: the court opinion, the academic paper, the regulatory filing, the official data release. If the AI cited a specific case, search the relevant legal database directly. If it cited a study, search PubMed, SSRN, or the relevant institutional repository. If you cannot locate the primary source through independent search, the claim is unverified regardless of how precisely it was stated.

Step 3 — Confirm it says what the AI said it says

Once you have the primary source in hand, read the relevant passage. Confirm that the source actually says what the AI attributed to it. This step exists because of a specific failure mode: AI output sometimes cites a real source but misrepresents its content, quotes it inaccurately, or attributes a finding to a paper that reached the opposite conclusion. A real citation does not guarantee an accurate characterization of what that citation contains.
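One narrow part of Step 3, checking a direct quotation, can be sketched mechanically. The helper below is a minimal example and its names (`normalize`, `quote_appears_in_source`) are hypothetical. It assumes you have already pasted the relevant passage of the primary source as plain text. It catches altered or invented quotations only; it cannot detect a paraphrase that reverses the source's conclusion, so reading the passage is still required.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse runs of whitespace."""
    text = (text.replace("\u201c", '"').replace("\u201d", '"')
                .replace("\u2018", "'").replace("\u2019", "'"))
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_appears_in_source(quote: str, source_text: str) -> bool:
    """Return True only if the quoted words appear verbatim in the source
    after normalization (case, quote style, and spacing are ignored)."""
    return normalize(quote) in normalize(source_text)


if __name__ == "__main__":
    # Illustrative passage, not taken from any real document.
    passage = "The panel found  no significant\neffect on returns."
    print(quote_appears_in_source("no significant effect", passage))
    print(quote_appears_in_source("a significant effect on returns", passage))
```

The second call shows why this matters: dropping a single word ("no") produces a quotation that sounds like the source and says the opposite.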

Only after Step 3 is complete — primary source located independently, content confirmed — can the factual claim be used in a decision.
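If you track claims in a notes file or spreadsheet, the gating rule above can be written down explicitly. The following is a minimal sketch; the `Claim` record, the `Status` values, and `usable_in_decision` are illustrative names, not an existing library. The two flags are set by you after doing the work; the code only enforces the order.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNVERIFIED = "unverified"        # no primary source located yet
    SOURCE_FOUND = "source_found"    # Step 2 done, Step 3 still pending
    CONFIRMED = "confirmed"          # Steps 2 and 3 both complete


@dataclass
class Claim:
    text: str
    source_located_independently: bool = False  # Step 2, set by the reviewer
    content_confirmed: bool = False             # Step 3, set by the reviewer

    @property
    def status(self) -> Status:
        # Step 3 cannot count unless Step 2 was completed first.
        if not self.source_located_independently:
            return Status.UNVERIFIED
        if not self.content_confirmed:
            return Status.SOURCE_FOUND
        return Status.CONFIRMED


def usable_in_decision(claim: Claim) -> bool:
    """A factual claim is usable only once both steps are complete."""
    return claim.status is Status.CONFIRMED


if __name__ == "__main__":
    claim = Claim("The court sanctioned the attorneys $5,000.")
    print(claim.status, usable_in_decision(claim))
    claim.source_located_independently = True
    claim.content_confirmed = True
    print(claim.status, usable_in_decision(claim))
```

Note that marking `content_confirmed` without `source_located_independently` still yields an unverified claim: a confirmation against a source the AI handed you does not satisfy Step 2.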

Citation Laundering

When a fabricated or mischaracterized claim acquires a real-looking source attached to it, it becomes harder to challenge, easier to forward, and capable of moving through a research or decision process unchallenged. This is citation laundering: a claim that lacks real evidentiary support appears to have it because it arrives dressed in the format of verified information.

Citation laundering is particularly dangerous in financial research contexts, where a specific figure or a regulatory ruling can alter a position thesis entirely. A well-formed false claim that gets forwarded without Step 2 or Step 3 being run can find its way into a document, a brief, a recommendation, or a portfolio decision. The fluency that made it convincing is the same fluency that makes it hard to flag in review.

What Happened in a Manhattan Courtroom in 2023

In Mata v. Avianca, Inc. (US District Court, Southern District of New York; 678 F. Supp. 3d 443), attorneys Steven A. Schwartz and Peter LoDuca of Levidow, Levidow & Oberman P.C. filed a legal brief containing six case citations that did not exist. All six were fabricated by ChatGPT. The cases had realistic case numbers, plausible court names, and invented quoted passages written in the style of genuine judicial opinions.

When one attorney asked ChatGPT directly whether the cited cases were real, the model affirmed that they were genuine and asserted that they could be found in Westlaw and LexisNexis. One of the invented cases was styled "Varghese v. China Southern Airlines." It does not exist. On June 22, 2023, Judge P. Kevin Castel sanctioned the two attorneys and their firm $5,000 jointly and severally, and the case became one of the most widely cited legal examples of AI hallucination with real professional consequences.

The attorneys' error was not that they used AI. It was that they accepted AI output as a substitute for independent source verification. Asking the model to confirm its own citations is not Step 2. It is the prerequisite violated: more output from the same system that produced the claim.

What AI Is Genuinely Useful For

Nothing in this article argues against using AI tools in research or decision-support workflows. The argument is narrower: AI-generated factual claims require independent verification before use. AI-generated reasoning — how to think about a problem, what questions to ask, how to structure an analysis — can be evaluated on its internal logic without a primary source, because the value is in the framework, not in a verifiable assertion about the world.

Concretely: AI works well as a tool for brainstorming, for explaining concepts you can then verify, for producing structural outlines that you fill with independently sourced material, and for generating a first draft of reasoning that a human then stress-tests. These are activities where the output is checkable by other means. Factual claims (the specific, the named, the cited) are not in that category. The protocol above applies to all of them.

Speed Run Simulator Exercise: Verify the Research Card

Open Abu Terminal and navigate to the Speed Run. In the simulator's research-card format, you will be shown a market claim written in fluent, confident prose — the kind of output a well-prompted AI assistant would produce. The card includes a specific supporting citation: a plausible author, title, year, and finding.

Your task is not to decide whether the claim sounds right. Your task is to run the three-step protocol. First, classify what in the card is a factual claim versus a reasoning structure. Second, attempt to locate the cited primary source independently — the simulator provides a search surface to try. Third, if you find the source, confirm whether the content matches what the card attributed to it.

After the drill, Abu marks the claim as verified or unverified and shows what the correct source actually says — or confirms that no such source exists. The behavioral pattern the drill is designed to surface: the pull toward accepting a well-formed claim without running Step 2. That pull is strongest when the claim supports a view you already hold, when you are under time pressure, and when the prose is unusually polished. Notice each of those conditions when they appear.

Run the drill three times in a single session. On the third repetition, compare your verification behavior to the first. The debrief asks whether your process stayed consistent or whether familiarity with the drill's format led you to take shortcuts on the later runs.


AI Output Verification: Treat Confidence as Zero Evidence