Legal AI risk is usually described too narrowly.
The common warning is familiar: do not put confidential client information into a consumer chatbot. That warning is correct. It is also incomplete.
Two recent cases make the larger problem visible from different angles.
United States v. Heppner put one side of the problem into the legal record: what can happen when legal work flows through a consumer AI system without the confidentiality boundaries lawyers expect.
Fortis Advisors LLC v. Krafton, Inc., decided by the Delaware Court of Chancery on March 16, 2026, puts another side into view. In a high-stakes commercial dispute over a $250 million earnout, the court's opinion describes Krafton's CEO consulting ChatGPT about a "no-deal" response strategy and later admitting at trial that he had deleted specific, relevant ChatGPT logs.
Those are not the same case. They do not stand for the same doctrinal point.
But together they point to the same conclusion.
Legal AI confidentiality cannot be evaluated apart from system design.
Heppner makes the confidentiality problem visible
Heppner matters because it forced a question many lawyers still try to avoid.
What happens when legal work runs through a consumer AI system that sits outside the controlled workflow the profession normally expects?
That is not just a model question. It is a system question.
Where does the data go? What terms govern it? What confidentiality expectations survive the transfer? What workflow boundary exists between draft generation and actual legal use? Can the lawyer show what was used, what was reviewed, and what was approved?
Those questions are easy to blur when the interface looks simple. A chat box feels informal. The legal consequences are not.
Heppner made that visible.
Fortis makes the discovery problem visible
Fortis should get the attention of a different audience.
Heppner could still be misread by some corporate legal teams as a case about a criminal defendant, a consumer tool, and an unusual fact pattern. That would be the wrong read, but it is an easy one.
Fortis is harder to wave away.
This is a Delaware Court of Chancery case. It involves a major commercial dispute. It involves executive decision-making in the middle of an earnout fight. And the opinion does not treat the AI chat as irrelevant side noise. It treats it as part of the factual record.
The court recounts that Krafton's CEO consulted ChatGPT while trying to develop a strategy around the earnout dispute. The opinion also notes that he admitted at trial he had deleted specific, relevant ChatGPT logs.
That matters.
Not because every AI interaction will become trial evidence. Not because every deleted chat will be recoverable in the same way. But because the case makes a broader point unavoidable:
consumer-style AI interactions can become part of the evidentiary story in consequential business litigation.
That is not a hypothetical compliance concern. It is now a visible litigation fact pattern.
The buyer question has changed
For legal buyers, the first question should no longer be just which model the vendor uses.
The harder and more important questions are architectural:
- Where is the data stored?
- Is the workspace tenant-bounded?
- Are customer inputs used for model training?
- What is retained?
- What is logged?
- What can be exported?
- What can be deleted?
- What reaches the model?
- What remains tied to the underlying record?
- What review boundary exists before work leaves the system?
Those are not secondary implementation details. They are part of the legal-risk posture of the product.
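One way to keep those answers from staying abstract is to record them as structured data during diligence. Here is a minimal sketch in TypeScript; the field names are illustrative assumptions, not any standard questionnaire:

```ts
// Hypothetical vendor-assessment record. Field names are illustrative
// assumptions, not a standard due-diligence schema.
interface VendorConfidentialityPosture {
  dataResidency: string;                // where the data is stored
  tenantBoundedWorkspace: boolean;      // is the workspace tenant-bounded?
  inputsUsedForTraining: boolean;       // are customer inputs used for model training?
  retention: string;                    // what is retained, and for how long
  logging: string;                      // what is logged
  exportSupported: boolean;             // can work and logs be exported?
  deletionSupported: boolean;           // can data be deleted on request?
  modelInputScope: string;              // what actually reaches the model
  tiedToUnderlyingRecord: boolean;      // does work stay tied to the record?
  reviewBoundaryBeforeRelease: boolean; // is there a review gate before work leaves?
}

// Example entry for a hypothetical vendor.
const vendorA: VendorConfidentialityPosture = {
  dataResidency: "US-only, provider-managed",
  tenantBoundedWorkspace: true,
  inputsUsedForTraining: false,
  retention: "30-day rolling deletion of prompts",
  logging: "full interaction log, tenant-visible",
  exportSupported: true,
  deletionSupported: true,
  modelInputScope: "matter-scoped retrieval only",
  tiedToUnderlyingRecord: true,
  reviewBoundaryBeforeRelease: true,
};
```

The value of the exercise is that every field maps to an architectural decision the vendor has already made, whether or not the sales deck mentions it.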
A tool that produces polished output in a loose chat surface is not equivalent to a tool that prepares work inside a bounded legal workflow.
The output may look similar.
The confidentiality posture is not.
The useful framework is a spectrum
The legal market too often talks about AI tools as if they all belong to one category.
They do not.
There is a real spectrum of confidentiality and discovery risk.
At one end is the consumer chatbot: broad, convenient, weakly bounded, often detached from the matter record, and prone to creating loose conversational artifacts that may not map cleanly to legal supervision.
In the middle are enterprise chat layers that improve terms, access controls, and provider posture but still behave like general-purpose chat surfaces.
Further along are API-based legal workflows with bounded retrieval, controlled storage, tenant-scoped records, and explicit review states.
At the far end are tightly controlled systems built for legal work from the start: systems where the work stays tied to the relevant record, context is scoped deliberately, outputs move through approval states, and supervision is built into the workflow rather than added after the fact.
Those are not branding variations.
They are different risk categories.
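One way to see that the spectrum is categorical rather than cosmetic is to sketch the categories as distinct types with distinct posture attributes. The names here are illustrative assumptions, not an industry taxonomy:

```ts
// Illustrative taxonomy only; category and attribute names are assumptions.
type RiskCategory =
  | "consumer_chatbot"       // broad, convenient, weakly bounded
  | "enterprise_chat_layer"  // better terms and controls, still chat-shaped
  | "api_legal_workflow"     // bounded retrieval, tenant-scoped records
  | "purpose_built_system";  // record-tied context, approval states, built-in supervision

interface RiskProfile {
  category: RiskCategory;
  boundedRetrieval: boolean;     // scoped context vs. open-ended loading
  tiedToMatterRecord: boolean;   // does work stay linked to the underlying record?
  explicitReviewStates: boolean; // draft / pending review / approved
  supervisionLegible: boolean;   // can a supervising lawyer reconstruct what happened?
}
```

Two products can look alike in a demo and still differ on every one of those attributes.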
What confidentiality-aware legal AI actually looks like
A legal AI system designed for real professional use should not start from the assumption that freeform conversation is the natural operating model.
It should start from the opposite direction.
- What is the record?
- What information is relevant to this task?
- What should be excluded?
- What is draft?
- What is pending review?
- What is approved?
- What must not leave the system before human judgment is applied?
That leads to a different product shape.
A serious legal AI system should have bounded retrieval rather than open-ended context loading. It should keep work tied to a tenant-scoped workspace rather than a generic conversation thread. It should preserve review and approval states rather than collapsing everything into "the model answered." It should support supervision by making the workflow legible. And it should be clear about provider boundaries, retention posture, and whether user inputs are used to train underlying models.
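As a sketch, that review boundary can be expressed as a small state machine: work moves from draft to pending review to approved, and nothing is exportable until a named human approves it. The state names and function shapes here are assumptions for illustration, not a reference implementation:

```ts
// Minimal review-gated workflow sketch. State names are hypothetical.
type WorkState = "draft" | "pending_review" | "approved";

interface WorkItem {
  id: string;
  matterId: string; // stays tied to the underlying matter record
  tenantId: string; // tenant-scoped workspace, not a generic thread
  state: WorkState;
  content: string;
}

// The only legal transitions. A reviewer can send work back to draft.
const TRANSITIONS: Record<WorkState, WorkState[]> = {
  draft: ["pending_review"],
  pending_review: ["approved", "draft"],
  approved: [],
};

function advance(item: WorkItem, next: WorkState, reviewerId?: string): WorkItem {
  if (!TRANSITIONS[item.state].includes(next)) {
    throw new Error(`Illegal transition: ${item.state} -> ${next}`);
  }
  if (next === "approved" && !reviewerId) {
    throw new Error("Approval requires an identified human reviewer");
  }
  return { ...item, state: next };
}

// Nothing leaves the system before human judgment is applied.
function exportWork(item: WorkItem): string {
  if (item.state !== "approved") {
    throw new Error("Only approved work may leave the system");
  }
  return item.content;
}
```

The load-bearing piece is the export gate: the system itself refuses to release unapproved work.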
That is what legal confidentiality looks like as system design.
Not merely as a policy PDF.
Why this matters now
Heppner and Fortis do not say the same thing.
Heppner makes the confidentiality problem harder to ignore.
Fortis makes the discoverability problem harder to ignore.
Together they make a broader point.
Legal AI tools are not interchangeable. The difference between a consumer chat surface and a system built for legal workflow is not just product taste. It is part of the privilege, confidentiality, supervision, and litigation-risk posture of the work.
This is why architecture independence matters. It is why bounded retrieval matters. It is why state transitions matter. It is why review gates matter. And it is why the legal market should stop treating model selection as the whole legal-AI question.
The question is not only whether the system can generate useful output.
The question is what kind of system it is.
That is not just a product choice.
It is a legal-risk choice.
FlowCounsel builds AI-enabled software for legal teams. FlowLawyers is the consumer-facing legal help platform with attorney discovery, legal aid routing, state-specific legal information, and document tools. Neither provides legal advice. Attorney supervision of legal AI output is required.
Sources
- Fortis Advisors LLC v. Krafton, Inc., C.A. No. 2025-0805-LWW (Del. Ch. Mar. 16, 2026)
- What ABA 512 and Heppner Together Require From Legal AI Systems