Radiology AI Startup Guide to Clinical Training Data
You built a radiology AI model. Now you need real DICOM data to make it work.
Most radiology AI startups follow the same trajectory: build a promising model on public datasets (ChestX-ray14, the RSNA challenge datasets, MIMIC-CXR), achieve strong benchmark scores, then hit a wall when they need real clinical validation data from actual hospital workflows. The data exists (millions of scans across NHS trusts and private imaging centres), but it is locked behind governance that blocks export.
This guide covers your actual options, ranked by what radiology AI startups are using in 2026.
Option A: Rapha Protocol — train on hospital data without export
Submit your model through the Rapha secure API. It runs inside a hospital's edge appliance, within an Intel SGX or TDX trusted execution environment, and trains directly against DICOM data in the PACS. You receive trained weights, metrics, and a Polygon proof receipt. No data export. No data use agreement (DUA) negotiation. Settlement is per job, in USDC.
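The submit-and-verify flow described above can be sketched in a few lines. This is an illustration only, not the Rapha API: the `TrainingJob` fields, the `build_job_payload` and `weights_match_receipt` helpers, and the `weights_sha256` receipt field are all hypothetical names chosen for the example. It assumes the job is described by a small JSON payload and that the proof receipt carries a SHA-256 digest of the returned weights, which the guide does not confirm.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainingJob:
    """Hypothetical job description; field names are illustrative only."""
    model_sha256: str  # digest of the model artifact being submitted
    modality: str      # DICOM modality code, e.g. "DX", "CT", "MG"
    task: str          # free-text task label
    max_epochs: int    # training budget requested for the job


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_job_payload(model_bytes: bytes, modality: str,
                      task: str, max_epochs: int) -> str:
    """Serialise a job description to JSON with stable key order."""
    job = TrainingJob(sha256_hex(model_bytes), modality, task, max_epochs)
    return json.dumps(asdict(job), sort_keys=True)


def weights_match_receipt(weights: bytes, receipt: dict) -> bool:
    """Check returned weights against the digest recorded in a receipt.

    Assumes (hypothetically) that the receipt has a "weights_sha256" field.
    A missing or non-string field is treated as a mismatch.
    """
    expected = receipt.get("weights_sha256")
    if not isinstance(expected, str):
        return False
    actual = sha256_hex(weights)
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(actual.encode(), expected.lower().encode())
```

In this sketch, `build_job_payload(model_bytes, "DX", "chest_xray_classification", 5)` would produce the job description to submit, and `weights_match_receipt(weights, receipt)` would be called on the returned weights before they are used. How a receipt is actually anchored on Polygon, and what it contains, is outside what this guide specifies.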
"We tried negotiating a DUA with an NHS trust for 6 months. Their legal team kept asking for new clauses. We needed access to 5,000 chest X-rays for validation. With Rapha, we submitted the job, it trained over a weekend, and we had our validated model by Monday. The governance difference is night and day."
Option B: Academic partnership — 12-18 month timeline
Partner with a university hospital. Co-author papers. Share IP. Data access is limited to the specific research protocol and cannot be used for commercial product development without renegotiation.
Option C: Public datasets only — limited ceiling
ChestX-ray14, MIMIC-CXR, PadChest, VinDr-CXR. Good for prototyping. They do not capture real-world deployment conditions, so a model trained only on them is likely to underperform on actual hospital data.
Which radiology AI use cases work best?
- Chest X-ray classification — Detect pneumonia, pneumothorax, pleural effusion, nodules. Most NHS trusts have 100K+ chest X-rays in PACS.
- CT brain haemorrhage detection — Train on real emergency department CT scans. Time-critical model — needs diverse real-world data.
- Mammography screening AI — NHS breast screening programme generates 2M+ mammograms annually. Your model can train on actual screening data without export.
- MSK fracture detection — Train on real trauma X-rays from A&E departments. Public MSK datasets are small and staged.
Status: private alpha. Data access depends on configured hospital nodes matching your imaging modality. Institutional approval is required for production deployment.