
AI-Generated DICOM Studies in Research: Innovation Needs Provenance, Governance, and Real-World Discipline

Why healthcare should identify truly AI-generated studies clearly — while not confusing de-identified or privacy-protected originals with synthetic imaging. And why the bias conversation in imaging AI still isn't asking the right questions.

[Illustration: scales of justice weighing real clinical MRI and CT imaging against an AI-generated brain and data infrastructure, representing the governance tension between synthetic DICOM studies and authentic clinical data.]

The governance question is not whether AI-generated images look convincing. It is whether they are governed well enough to deserve trust.

There is a difference between using AI to help interpret medical imaging and using AI-generated DICOM studies as though they are interchangeable with real clinical data.

That difference matters.

Over the last year, the conversation around synthetic and AI-generated medical imaging has become more visible across radiology, healthcare AI, and governance circles. The emerging view is not a blanket rejection of synthetic imaging — but it is also far from a blank check. The direction of the discussion is cautious: synthetic data may have a place in augmentation, rare-condition modeling, privacy-preserving development, and selected research use cases, but it should not be treated as a quiet substitute for authentic clinical imaging without rigorous oversight, validation, and traceability.

I understand the attraction.

Synthetic imaging appears to solve several difficult problems at once. It can expand limited datasets. It can help model rare findings. It can reduce some privacy barriers. It can support early experimentation when access to real data is constrained. But none of that changes the core concern: if the source data is biased, incomplete, or structurally narrow, the synthetic output can carry those same weaknesses forward while looking broader, cleaner, and more convincing than it really is. Realism can mask representativeness — and that is one of the more dangerous forms of bias in clinical AI development.

Where the Danger Begins: Inherited Bias in Synthetic Data

A synthetic DICOM study does not come from nowhere. It inherits assumptions from the original dataset — from how the cases were selected, how labels were assigned, which sites and vendors were represented, which scanner generations were included, and which clinical realities were left out entirely.

If the originating data underrepresents certain patient populations, care settings, acquisition conditions, or equipment generations, the generated data may reinforce those same gaps while making them harder to detect. Synthetic imaging risks creating the illusion of completeness without delivering the reality of representativeness.

A 2025 systematic review on radiology AI generalizability found that model performance often shifts meaningfully across institutions, scanner generations, and protocol variations — which is exactly why external validation on real-world, multisite data matters so much. That finding has a direct implication for synthetic data: if the training distribution is narrow, augmenting from that same narrow distribution does not solve the generalizability problem. It extends it.

The Core Risk

Synthetic imaging can create the illusion of completeness without delivering the reality of representativeness. If source data underrepresents certain populations, care settings, or equipment generations, generated data inherits those same gaps — while looking cleaner, broader, and more authoritative than it actually is.

Modality-Generation Bias: The Conversation Healthcare Isn't Having

This is why I believe the bias conversation in imaging AI is still too narrow.

We rightly talk about demographic bias, labeling bias, and institutional bias. But we should also be talking far more seriously about modality-generation bias — the systematic underrepresentation of older imaging equipment in AI training and validation datasets.

If training and validation data comes mostly from the newest CT, MRI, ultrasound, mammography, or other imaging platforms, we are not training on healthcare as it actually exists. We are training on the best-equipped slice of healthcare. Real imaging comes from academic centers, community hospitals, rural sites, outpatient imaging centers, safety-net systems, and facilities that still operate older scanners, older software versions, and less standardized acquisition environments.

If we are serious about reducing bias, the answer is not to build beautiful datasets from only the newest modalities. The answer is to build datasets that reflect the actual clinical landscape — including facilities still using older generations of equipment. Otherwise, we risk training AI for premium environments and deploying it across a healthcare system that is far more heterogeneous. That is not just a technical problem. It is a governance problem, and it is an equity problem.
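One practical way to surface this gap is to audit the acquisition metadata that DICOM headers already carry before any training begins. The sketch below is a minimal illustration using the open-source pydicom library; the directory path is hypothetical, and treating Manufacturer, Manufacturer's Model Name, and Software Versions as a rough proxy for equipment generation is my own simplification rather than an established convention.

```python
# Minimal sketch: measure how a candidate training set is distributed across
# scanner vendors, models, and software versions. The directory path is
# hypothetical, and these three attributes are only a rough proxy for
# "equipment generation"; a real audit would also consider site type and protocol.
from collections import Counter
from pathlib import Path

import pydicom

def audit_equipment_mix(dicom_dir: str) -> Counter:
    """Count instances per (manufacturer, model, software version) triple."""
    mix = Counter()
    for path in Path(dicom_dir).rglob("*.dcm"):
        ds = pydicom.dcmread(path, stop_before_pixels=True)
        mix[(
            ds.get("Manufacturer", "UNKNOWN"),
            ds.get("ManufacturerModelName", "UNKNOWN"),
            str(ds.get("SoftwareVersions", "UNKNOWN")),
        )] += 1
    return mix

if __name__ == "__main__":
    for (vendor, model, version), count in audit_equipment_mix("./training_set").most_common():
        print(f"{count:6d}  {vendor} | {model} | {version}")
```

If the resulting table is dominated by a single vendor's newest platform, that is modality-generation bias made visible before a single model is trained.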

"A model trained only on premium equipment is not trained on healthcare as it exists. It is trained on healthcare as the best-resourced organizations experience it."

De-Identification Is Not Synthetic Generation — and the Distinction Is Critical

One distinction needs to be made very clearly, because it has meaningful implications for both governance and research integrity.

A study should not be treated as AI-generated simply because it has been de-identified, had PHI removed from DICOM tags, or been altered for privacy protection — such as MRI defacing to reduce facial-recognition risk. Those are better understood as privacy-preserving transformations of an original clinical acquisition, not the creation of a synthetic or AI-generated study.

That distinction is essential.

A real patient study that has had identifiers removed under 45 CFR § 164.514(b) remains rooted in an actual clinical acquisition. A brain MRI that has been defaced for research privacy is still a real MRI — even if it has been modified to protect the patient. That is fundamentally different from an image or series that was wholly generated by a generative AI model, or materially synthesized rather than acquired directly from the patient in the ordinary imaging workflow.
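The difference is easy to see in code. The sketch below, a minimal illustration using pydicom, blanks a handful of identifying DICOM attributes and records that de-identification happened; the file names are hypothetical, and a real workflow would follow a full de-identification profile such as DICOM PS3.15 Annex E rather than the few tags shown here. The point is that the pixel data, and the underlying acquisition, remain untouched.

```python
# Illustrative only: a privacy-preserving transformation of a real study.
# NOT a complete de-identification profile; production workflows must follow
# DICOM PS3.15 Annex E / 45 CFR 164.514 and handle many more elements.
import pydicom

def blank_basic_identifiers(ds):
    for keyword in ("PatientName", "PatientID", "PatientBirthDate",
                    "OtherPatientIDs", "AccessionNumber", "InstitutionName"):
        if hasattr(ds, keyword):
            setattr(ds, keyword, "")
    # Record what was done: this remains a real acquisition, just protected.
    ds.PatientIdentityRemoved = "YES"
    ds.DeidentificationMethod = "Basic tag blanking (illustrative only)"
    return ds  # Pixel data untouched: the image was still acquired from a patient.

ds = pydicom.dcmread("real_brain_mri.dcm")   # hypothetical input file
ds = blank_basic_identifiers(ds)
ds.save_as("real_brain_mri_deid.dcm")        # de-identified, not synthetic
```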

If we fail to separate those concepts, we create confusion in exactly the wrong place. We blur the line between legitimate privacy protection and synthetic image generation. We risk making good de-identification practices sound suspicious, while at the same time failing to identify truly AI-generated content with the clarity it deserves. That is a governance failure — not a technical oversight.

The Case for Explicit DICOM Provenance Standards

This is why I believe healthcare should look seriously at whether DICOM standards need a more explicit, machine-readable framework for identifying imaging provenance.

DICOM already contains mechanisms that support derived-image description and source-image referencing. DICOM's AI guidance points toward the use of derived image identification rather than pretending algorithm-produced content is original acquisition. But as AI-generated imaging becomes more realistic, existing metadata conventions may not be sufficient to provide a simple, universal, portable signal of what an imaging object actually is.
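A pipeline can already inspect those mechanisms today. The following is a minimal sketch, assuming pydicom and an arbitrary incoming file (the file name is hypothetical), that reads Image Type, whose first value distinguishes ORIGINAL from DERIVED pixel data, along with Derivation Description and Source Image Sequence.

```python
# Minimal sketch: inspect provenance-related attributes DICOM already defines.
# The file name is hypothetical.
import pydicom

ds = pydicom.dcmread("incoming_object.dcm", stop_before_pixels=True)

image_type = list(ds.get("ImageType", []))
pixel_origin = image_type[0] if image_type else "UNKNOWN"   # ORIGINAL or DERIVED

print(f"Image Type:             {image_type}")
print(f"Pixel data origin:      {pixel_origin}")
print(f"Derivation Description: {ds.get('DerivationDescription', '<absent>')}")
print(f"Referenced sources:     {len(ds.get('SourceImageSequence', []))} image(s)")
```

Useful, but voluntary: nothing in that metadata forces a generative pipeline to declare itself, which is exactly the gap a more explicit framework would need to close.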

In March 2026, RSNA reported research showing that AI-generated radiographs were realistic enough to fool radiologists and AI detection systems alike. That is not a technical curiosity. It is a warning about trust, authenticity, research contamination, fraud risk, and the integrity of the imaging record. Once AI-generated imaging becomes realistic enough to deceive experts, provenance can no longer be treated as optional.

A Four-State Provenance Framework

The answer should not be an overly simplistic yes-or-no label that marks every modified image as "AI-generated." That would be a mistake. What healthcare actually needs is a more precise provenance framework that can distinguish at least four different states:

  • Original Acquired Study: directly acquired from a patient examination. Unmodified. Full clinical provenance intact.
  • Privacy-Protected Original: a real acquisition with de-identification, PHI removal, or MRI defacing applied. Clinically grounded. Not synthetic.
  • AI-Derived Object: based on a real acquired study but materially processed or transformed by an AI algorithm. Partially synthetic.
  • Fully AI-Generated Study: wholly created by a generative model. No underlying patient acquisition. Requires the highest provenance scrutiny.

That is a much more useful standard than collapsing everything into a single bucket, and it aligns with the direction in which WHO's guidance on AI for health is already pointing, with its emphasis on transparency, accountability, safety, and human oversight across the full AI lifecycle.
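To show how such a framework could be made machine-readable rather than prose-only, here is a sketch of the four states as a simple enumeration with a deliberately naive classification rule. The rule itself, including the idea of a GENERATIVE marker in Image Type, is my own illustration and not an existing DICOM mechanism.

```python
# Sketch only: the four provenance states as an enum plus a naive classifier.
# The classification rules and the "GENERATIVE" marker are hypothetical
# conventions for illustration, not part of the DICOM standard.
from enum import Enum

from pydicom.dataset import Dataset

class ImagingProvenance(Enum):
    ORIGINAL_ACQUIRED = "original acquired study"
    PRIVACY_PROTECTED_ORIGINAL = "privacy-protected original"
    AI_DERIVED = "AI-derived object"
    FULLY_AI_GENERATED = "fully AI-generated study"

def classify(ds: Dataset) -> ImagingProvenance:
    image_type = [str(v).upper() for v in ds.get("ImageType", [])]
    derivation = str(ds.get("DerivationDescription", "")).upper()
    looks_generated = "GENERATIVE" in image_type or "SYNTHETIC" in derivation
    if looks_generated and not hasattr(ds, "SourceImageSequence"):
        return ImagingProvenance.FULLY_AI_GENERATED        # no patient acquisition behind it
    if "DERIVED" in image_type and looks_generated:
        return ImagingProvenance.AI_DERIVED                # real study, materially transformed
    if str(ds.get("PatientIdentityRemoved", "NO")).upper() == "YES":
        return ImagingProvenance.PRIVACY_PROTECTED_ORIGINAL
    return ImagingProvenance.ORIGINAL_ACQUIRED
```

A real standard would need to be far more rigorous than this, but the point stands: four distinct states, each machine-readable, rather than one ambiguous flag.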

What Governance for Synthetic Imaging Actually Requires

I have spent enough time around imaging workflows, healthcare data movement, and operational reality to know that healthcare does not run on theory. It runs on messy, heterogeneous, imperfect reality — different scanners, different budgets, different software versions, different upgrade cycles, different protocols, different patient populations, different constraints. That is the real environment our research should reflect.

So when I see enthusiasm around AI-generated DICOM studies, my question is not whether the images look convincing. My question is whether they are governed well enough to deserve trust. To earn that trust, synthetic DICOM studies used in research need to clear a meaningful bar:

Governance Requirements for Synthetic DICOM Studies
  • Explicit labeling — machine-readable DICOM metadata identifying the study as AI-generated, not dependent on visual inspection alone (a minimal sketch follows this list).
  • Source documentation — formal disclosure of the source dataset's scope: demographics, institution types, scanner generations, vendors, and protocol variability.
  • Traceable provenance — documented generation history including model architecture, training data lineage, and known limitations inherited from source data.
  • External real-world validation — performance tested against genuine, multisite, multivendor datasets spanning both newer and older-generation modalities in active clinical use.
  • Intended-use specificity — explicit documentation of which research or development use cases the synthetic data is and is not appropriate for.
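On the first requirement, explicit labeling, here is a hedged sketch of what stamping a wholly AI-generated object might look like before it enters a research pipeline. The GENERATIVE Image Type value and the private block layout are illustrative conventions of my own, not a published standard, and every attribute value shown is a placeholder.

```python
# Sketch: stamp a fully AI-generated object with explicit, machine-readable
# provenance. The "GENERATIVE" value and the private block layout are
# illustrative conventions, not a published standard.
import pydicom

def label_as_ai_generated(ds, model_name, training_data_note):
    # Value 1 of Image Type should not claim ORIGINAL for synthesized pixels.
    ds.ImageType = ["DERIVED", "SECONDARY", "GENERATIVE"]
    ds.DerivationDescription = f"Fully AI-generated by {model_name}"[:1024]
    # Hypothetical private block carrying provenance details for downstream tools.
    block = ds.private_block(0x00B1, "EXAMPLE SYNTHETIC PROVENANCE", create=True)
    block.add_new(0x01, "LO", model_name)           # generator identity
    block.add_new(0x02, "LT", training_data_note)   # source-data scope and known limits
    return ds

ds = pydicom.dcmread("synthetic_chest_ct.dcm")      # hypothetical generated object
ds = label_as_ai_generated(ds, "example-diffusion-v2",
                           "Single-vendor source data, 2020+ scanners, no pediatric cases")
ds.save_as("synthetic_chest_ct_labeled.dcm")
```

However the convention is ultimately standardized, the requirement is the same: the label travels inside the object itself, so provenance survives every hand-off the study makes.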

Where I Land

I am not against synthetic medical imaging. I am against treating it as a shortcut around the hard work of governance.

AI-generated DICOM studies may have a legitimate role in augmentation, simulation, privacy-preserving development, and selected research use cases. But once they enter research pipelines, the burden of proof should go up — not down. And a real study does not become AI-generated simply because PHI was removed, DICOM identifiers were stripped, or an MRI was defaced for privacy.

Removing PHI is not the same as generating an image.
Defacing an MRI for privacy is not the same as creating a synthetic study.
And if DICOM evolves to address provenance more explicitly, it should help the industry tell those truths clearly.

Because a model trained only on premium equipment is not trained on healthcare as it exists. A synthetic study without traceable provenance is not a study that deserves automatic trust. And a governance model that fails to distinguish privacy-protected originals from truly AI-generated content is not a model built for clarity.

It is confusion dressed up as progress.


Ready to Build a Provenance-First Data Architecture?

Radiant AI Health Data is built around the principle that imaging data governance starts before AI ever touches a study. We'd welcome the conversation.
