The Reasoning Reconstruction Illusion in AI

Heraclitus wrote that no one steps into the same river twice. Generative AI systems behave in a similar way. A paragraph is produced under a particular configuration of the model, the prompt, and the moment, and once the text appears, that configuration has already moved on. You can read the paragraph again. You cannot re-enter the process that wrote it.

There is a grammar to this that organisations have not yet absorbed. You can say the model was accurate, meaning that a specific output, from a specific prompt, under specific conditions, on a specific date, checked against a specific standard, was correct. You cannot say the model is accurate, because the next output is a different event under different conditions, and the system has no memory of being right last time and no obligation to be right next time. The difference between "was" and "is" contains the entire problem. Governance systems are being built around "is." The technology only ever produces "was."

Running a prompt again will produce a different generation, and asking the model to explain the paragraph will produce another generation entirely.
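A toy sketch can make the point concrete, assuming a model behaves in effect like a sampler over possible next words. The vocabulary, the weights, and the function name below are invented for illustration; no real model or API is involved.

```python
import random

# Hypothetical next-word distribution; the words and weights are invented.
NEXT_WORD = {
    "risk": 0.40,
    "exposure": 0.35,
    "uncertainty": 0.25,
}

def generate(prompt: str, rng: random.Random, length: int = 5) -> str:
    """Sample `length` words. Each call is a fresh draw, not a lookup."""
    words = list(NEXT_WORD)
    weights = list(NEXT_WORD.values())
    sampled = rng.choices(words, weights=weights, k=length)
    return prompt + " " + " ".join(sampled)

prompt = "The main concern is"
first = generate(prompt, random.Random(1))
second = generate(prompt, random.Random(2))  # same prompt, different moment
replay = generate(prompt, random.Random(1))  # only possible if the random state was kept
```

In this toy, `second` will usually differ from `first`, and `replay` matches `first` only because the random state was deliberately preserved. Even then, the matching text says nothing about why those words were chosen. Whether a deployed system permits such a replay is a separate question this sketch does not settle.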

The earlier pieces in this sequence described how that becomes a problem. The Single-Path Illusion examined how AI narrows the options before anyone notices. Residual Logic described how the reasoning it embeds persists because the draft arrives too polished to invite reconstruction. The Signature Fiction followed the document to the point of signing, where the governance bargain assumes the signer can defend how they got there. This piece is about what happens when someone finally asks.

When scrutiny arrives, the instinct is to ask the system to explain the document. Why was this risk emphasised? Why was that scenario excluded? Why this framing rather than another? The model will answer, and it will produce a fluent explanation of its "reasoning."

But the explanatory reasoning is itself another generation, one that did not exist when the paragraph was written and that was created in response to a new prompt under new conditions.

The reasoning was invented after the fact.

When checking becomes invention

The Signature Fiction used the Mata v. Avianca case to illustrate what happens when AI-drafted citations go unverified. There is a second layer to that case. When attorney Steven Schwartz began to doubt the citations ChatGPT had produced, he did exactly what this article describes: he asked the system to explain itself. He prompted ChatGPT to confirm whether the cases were real. It assured him they could be found on Westlaw and LexisNexis. He then asked it to produce the opinion documents, and it generated those too, complete with realistic reasoning and judicial language. The fabricated opinions were then filed with the court, attached to an affidavit.

Every step of the verification was another generation.

The lawyer asked for reconstruction, received fabrication, and could not easily tell the difference because the fabrication took the form of reasoning. Each time he asked the system to confirm its earlier work, it simply generated a new answer that pointed in the same direction. It was not checking; it was guessing again.

What the audit file actually contains

The organisation now has three things: the signed document, the prompt trail that produced it, and the explanation generated afterwards. Taken together, they feel like a reconstruction of reasoning, but they are not.

The reasoning that survived review was never stored in a form that could be audited, and the signature on the document assumed a chain of accountability that could be walked back through. The explanation produced under scrutiny looks like that chain, reads like that chain, and satisfies the immediate demand for one. But it was manufactured in the moment of asking, and the model that produced the original may since have been updated. If it has, the version of the system that wrote the document no longer exists.

When better models make it worse

Research from Anthropic (Lanham et al., July 2023) found that as language models become larger and more capable, they produce less faithful reasoning on most of the tasks studied. On that evidence, the gap between what the model does and what it says it did can widen as models scale. The common assumption is that newer, more powerful models will close the reconstruction problem. The evidence suggests they may be making it worse, because the outputs become more fluent, more plausible, and less tethered to the computational process that actually produced them.

Organisations are building accountability systems around the idea that reasoning can be retrieved, while the AI systems they are using produce reasoning that can only be re-generated.

There is a mathematical dimension to this that is worth stating plainly. Cynthia Rudin, a computer scientist at Duke, argued in Nature Machine Intelligence that a fully faithful explanation of a black-box model would have to equal the model itself. Applied to a language model, that means reproducing every calculation it performed, across millions of parameters and billions of operations. At that point, the explanation would be as complex as the model. It would just be the model running again. The moment you simplify that into something a human can read, you have removed detail, and the detail you removed is the reasoning you were trying to explain. The explanation is either faithful and incomprehensible, or comprehensible and unfaithful. That is not a trade-off that better engineering resolves. It is the trade-off.
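The trade-off can be seen in miniature. The sketch below is a toy illustration of the argument as stated above, not Rudin's own formalism: `model` is an invented stand-in for an opaque decision function, and `explanation` is an invented one-line account of it that a person could read.

```python
from itertools import product

def model(x: int, y: int, z: int) -> bool:
    """Stand-in for an opaque model: a decision with interacting terms."""
    return (3 * x - 2 * y + x * z - y * z) > 1

def explanation(x: int, y: int, z: int) -> bool:
    """A readable one-line account: 'approve when x exceeds y'."""
    return x > y

def fidelity(candidate) -> float:
    """Share of inputs on which the candidate agrees with the model."""
    inputs = list(product(range(4), repeat=3))
    agree = sum(model(*p) == candidate(*p) for p in inputs)
    return agree / len(inputs)

readable_fidelity = fidelity(explanation)  # legible, but not fully faithful
exact_fidelity = fidelity(model)           # fully faithful, but it is just the model again
```

Here the readable rule agrees with the model on 56 of 64 inputs, a fidelity of 0.875. The cases it gets wrong are exactly the ones where the interacting terms matter. The only candidate that reaches 1.0 in this toy is the model itself.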

What governance requires is the ability to retrieve the reasoning that produced the text, and what the system can produce is only more text.

The instinct, once this is clear, will be to reach for process.

The immediate pressure will be to treat this as a documentation problem. If the reasoning cannot be reconstructed, perhaps it can be captured during generation instead: lock in the context, preserve the intermediate states, record the configuration at the moment of output. That instinct assumes the problem is one of storage, and it may not be. The reasoning organisations need to retrieve may not exist in a form that can be stored at all.
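A minimal sketch shows what such capture might look like, and what it leaves out. The schema, field names, and model identifiers below are hypothetical; the article does not prescribe a record format.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass(frozen=True)
class GenerationRecord:
    """What can be written down about one generation event (hypothetical schema)."""
    prompt: str
    output: str
    model_version: str   # assumed identifier; no real versioning scheme is implied
    temperature: float
    generated_at: str    # ISO 8601 timestamp supplied by the caller
    explains: Optional[str] = None  # digest of an earlier record, for post-hoc explanations

    def digest(self) -> str:
        """Stable SHA-256 fingerprint of the record's contents."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

# Invented example data: the original draft and a later explanation of it.
original = GenerationRecord(
    prompt="Draft the risk section.",
    output="Liquidity risk is the primary concern.",
    model_version="model-2025-01",
    temperature=0.7,
    generated_at="2025-01-10T09:00:00Z",
)

explanation = GenerationRecord(
    prompt="Why did you emphasise liquidity risk?",
    output="Because liquidity events were most material.",
    model_version="model-2025-06",
    temperature=0.7,
    generated_at="2025-06-02T14:30:00Z",
    explains=original.digest(),
)
```

The record fixes the inputs and outputs of each event and makes plain that the explanation is a second event, here produced by a different model version at a later date. No field holds the reasoning itself. Everything stored is text that went in or text that came out.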

What happens when preservation turns out to be the wrong metaphor entirely is where this goes next.

References

Curlin, J.H., IV (2025) ChatGPT Didn't Write This . . . or Did It? The Emergence of Generative AI in the Legal Field and Lessons from Mata v. Avianca. Arkansas Law Review. Available at: scholarworks.uark.edu [Accessed 7 Mar. 2026]. A law review examination of the ethical pitfalls and lessons from the landmark Mata v. Avianca case.
Lanham, T. et al. (2023) Measuring Faithfulness in Chain-of-Thought Reasoning. arXiv. Available at: arxiv.org/abs/2307.13702 [Accessed 7 Mar. 2026]. Investigates whether chain-of-thought reasoning faithfully explains a model's actual reasoning process.
Lyu, Q., Apidianaki, M. and Callison-Burch, C. (2024) Towards Faithful Model Explanation in NLP: A Survey. Computational Linguistics, MIT Press. Available at: direct.mit.edu [Accessed 7 Mar. 2026]. Reviews over 110 explanation methods through the lens of faithfulness.
Yeo, W.J. et al. (2024) How Interpretable are Reasoning Explanations from Prompting Large Language Models? arXiv. Available at: arxiv.org/abs/2402.11863 [Accessed 7 Mar. 2026]. Evaluates interpretability of LLM reasoning explanations across faithfulness, robustness, and utility.
Rudin, C. (2019) Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nature Machine Intelligence, 1, pp. 206–215. Available at: doi.org/10.1038/s42256-019-0048-x [Accessed 7 Mar. 2026]. Argues that a fully faithful explanation of a black box would have to equal the model itself.
