How to train AI on real clinical data without moving patient data
Short answer: do not download hospital data. Route the model into the clinical data boundary. That is the core Rapha Protocol thesis: compute moves, patient data stays.
Rapha Protocol is building private-alpha infrastructure for AI teams that need real clinical signal without creating a raw-PHI export problem. The system is designed to send approved model workloads into controlled edge nodes, enforce policy before execution, return approved artifacts, and anchor proof metadata for auditability.
Problem
Clinical AI needs real data, but the data cannot simply leave.
Useful healthcare models need real clinical signal: EHR notes, labs, outcomes, radiology workflows, device telemetry, and patient-generated health data. The blocker is not model architecture. The blocker is governance. Hospitals cannot hand raw records to every AI company that wants better training data.
The usual cloud training pattern creates the wrong risk surface: copy sensitive data out, centralize it, then try to control the blast radius. Rapha Protocol inverts the path: the model goes to the controlled environment, and the data stays where it already lives.
What Rapha Protocol does
Rapha Protocol routes AI workloads into controlled clinical edge nodes.
An AI team defines the training job, dataset intent, model artifact, output rules, and compute budget. Rapha Protocol is designed to route that workload into an on-prem or controlled edge node where the clinical data already lives. The raw dataset remains inside the hospital or institution-controlled boundary.
The goal is simple: give AI researchers access to real clinical learning signal while giving hospitals a defensible technical boundary. The researcher receives approved outputs such as trained weights, metrics, hashes, and receipts. They do not receive raw patient records.
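As a rough illustration of the job declaration described above, the sketch below models the fields an AI team would submit and checks them before anything is routed. The class name, field names, and output-kind labels are assumptions made for this example, not Rapha Protocol's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass

# Output kinds a researcher may receive, per the text above.
# The string labels themselves are illustrative.
APPROVED_OUTPUT_KINDS = frozenset({"weights", "metrics", "hashes", "receipt"})


@dataclass(frozen=True)
class TrainingJob:
    """Hypothetical job declaration submitted by an AI team."""

    model_artifact: str      # reference to the model to train
    cohort_intent: str       # which cohort the job wants to learn from
    output_rules: frozenset  # output kinds the job asks to get back
    compute_budget: float    # spending cap for the run, in an agreed unit

    def problems(self) -> list[str]:
        """Return every reason the declaration is unacceptable (empty = well-formed)."""
        found = []
        if not self.model_artifact.strip():
            found.append("model_artifact is required")
        if not self.cohort_intent.strip():
            found.append("cohort_intent is required")
        if not self.output_rules:
            found.append("output_rules must name at least one output kind")
        unapproved = sorted(set(self.output_rules) - APPROVED_OUTPUT_KINDS)
        if unapproved:
            found.append("unapproved output kinds: " + ", ".join(unapproved))
        if self.compute_budget <= 0:
            found.append("compute_budget must be positive")
        return found


job = TrainingJob(
    model_artifact="sha256:<model digest>",
    cohort_intent="adult inpatient labs",
    output_rules=frozenset({"weights", "metrics"}),
    compute_budget=500.0,
)
print(job.problems())  # an empty list means the declaration passed these checks
```

Collecting all problems rather than stopping at the first one gives the submitting team a complete list to fix in one pass; a job with any problem would simply not be routed.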
Security model
Policy checks happen before the model touches data.
The edge runtime is built around fail-closed controls. An Open Policy Agent (OPA) policy checks the workload. Dataset manifests must allowlist the selected cohort. Dataset mounts are expected to be read-only. The training runtime refuses to execute when the trusted execution environment (TEE) posture, dataset path, trainer command, or required output artifacts are invalid.
This matters because the dangerous failure mode is silent exfiltration. Rapha Protocol treats network access, output files, logs, and trainer behavior as security boundaries. If a configured node cannot prove the required runtime posture, training should stop instead of guessing.
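The fail-closed idea can be sketched as a preflight gate that runs before the trainer starts. Everything here is hypothetical: the `NodeState` fields stand in for results the real runtime would obtain from OPA, the dataset manifest, the mount table, and TEE attestation, and none of the names come from the Rapha codebase.

```python
from dataclasses import dataclass, replace


class PreflightError(RuntimeError):
    """Raised when any precondition is not positively confirmed."""


@dataclass(frozen=True)
class NodeState:
    """Hypothetical snapshot of what the edge node has verified so far."""

    policy_allows: bool              # stand-in for the result of an OPA query
    cohort_id: str                   # cohort the job selected
    manifest_allowlist: frozenset    # cohorts the dataset manifest permits
    dataset_path_mounted: bool       # dataset path exists inside the boundary
    dataset_read_only: bool          # the mount is read-only
    tee_verified: bool               # TEE posture was verified
    trainer_command: tuple           # argv for the trainer process
    output_artifacts_declared: bool  # required output artifacts are configured


def preflight(state: NodeState) -> None:
    """Raise PreflightError unless every check is explicitly True."""
    checks = [
        (state.policy_allows, "policy did not approve the workload"),
        (state.cohort_id in state.manifest_allowlist, "cohort is not allowlisted"),
        (state.dataset_path_mounted, "dataset path is not mounted"),
        (state.dataset_read_only, "dataset mount is not read-only"),
        (state.tee_verified, "TEE posture is not verified"),
        (len(state.trainer_command) > 0, "trainer command is empty"),
        (state.output_artifacts_declared, "required output artifacts are missing"),
    ]
    for passed, reason in checks:
        # Fail closed: None or any value other than True counts as a failure.
        if passed is not True:
            raise PreflightError(reason)


ready = NodeState(
    policy_allows=True,
    cohort_id="cohort-a",
    manifest_allowlist=frozenset({"cohort-a"}),
    dataset_path_mounted=True,
    dataset_read_only=True,
    tee_verified=True,
    trainer_command=("python", "train.py"),
    output_artifacts_declared=True,
)
preflight(ready)  # returns None: all checks were positively confirmed

try:
    preflight(replace(ready, tee_verified=False))
except PreflightError as err:
    print("refused:", err)
```

The comparison `passed is not True` is deliberate: an unknown or missing result is treated the same as a denial, so the runtime stops instead of guessing.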
Output model
The output is a model artifact and receipt, not a data dump.
A successful job should return only approved outputs: trained weights, metrics, hashes, cryptographic receipt metadata, and settlement references. Raw PHI, DICOM exports, FHIR bundles, Apple Health samples, and genetic data should not be sent to the AI company, Polygon, IPFS, Vercel, Render, or any general-purpose web surface.
That is the practical difference between a data marketplace and compute-to-data infrastructure. Rapha Protocol is not trying to sell raw patient files. It is building the control plane for training against clinical data while the data wall remains intact.
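One way to picture "approved outputs only" is an exact-name allowlist applied to the job's output directory before anything leaves the node. The file names below are invented for illustration, and a name check alone does not inspect file contents; it only shows the shape of the gate.

```python
# Hypothetical file names for the approved output kinds named above.
APPROVED_OUTPUT_FILES = frozenset(
    {"weights.bin", "metrics.json", "hashes.json", "receipt.json"}
)


class OutputRejected(RuntimeError):
    """Raised when a job produced anything outside the approved set."""


def approve_outputs(paths):
    """Return the paths unchanged if all are approved; otherwise reject the whole job.

    Matching is exact, so nested paths such as "sub/metrics.json" are rejected too.
    """
    unapproved = sorted(p for p in paths if p not in APPROVED_OUTPUT_FILES)
    if unapproved:
        raise OutputRejected("unapproved outputs: " + ", ".join(unapproved))
    return list(paths)


print(approve_outputs(["weights.bin", "metrics.json", "receipt.json"]))

try:
    approve_outputs(["metrics.json", "export.dcm"])
except OutputRejected as err:
    print("job rejected:", err)
```

Rejecting the whole job, rather than silently dropping the unexpected file, keeps an unapproved output visible as a policy failure instead of hiding it.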
Audit and settlement
Proof metadata can be anchored, but proof is not a clinical approval.
Rapha Protocol uses public proof surfaces to make execution claims auditable. Hashes, receipts, event metadata, and settlement references can be anchored on Polygon mainnet. The clearing-vault settlement path requires trusted-attestor verification before funds can be released against a training job.
A mainnet proof receipt proves transaction inclusion and a cryptographic commitment. It does not prove model safety, clinical validity, regulatory clearance, HIPAA compliance, de-identification, or hospital approval by itself. Those still require contracts, security review, privacy review, and institutional signoff.
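To make "cryptographic commitment" concrete, here is a minimal sketch that hashes a receipt and compares it to an anchored value. The receipt fields, the canonical-JSON encoding, and the use of SHA-256 are all assumptions for illustration; the source does not specify Rapha Protocol's receipt schema or hash function.

```python
import hashlib
import json


def receipt_commitment(receipt: dict) -> str:
    """Hash a receipt deterministically: sorted keys, compact separators, UTF-8."""
    canonical = json.dumps(receipt, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_anchor(receipt: dict, anchored_hex: str) -> bool:
    """Check that a receipt hashes to the anchored value (a '0x' prefix is accepted)."""
    expected = anchored_hex.lower()
    if expected.startswith("0x"):
        expected = expected[2:]
    return receipt_commitment(receipt) == expected


# Hypothetical receipt: it carries hashes and identifiers only, never records.
receipt = {
    "job_id": "job-001",
    "weights_sha256": "<digest of trained weights>",
    "metrics_sha256": "<digest of metrics file>",
}
digest = receipt_commitment(receipt)
print(matches_anchor(receipt, "0x" + digest))

tampered = dict(receipt, job_id="job-002")
print(matches_anchor(tampered, digest))
```

A match shows only that the receipt is byte-for-byte the one that was committed. It carries no information about whether the model is safe, valid, or approved, which is why the non-technical reviews listed above remain necessary.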
How it works
The workflow in five steps.
- Declare the job: the AI developer submits model artifact, cohort intent, output policy, and budget.
- Authenticate access: developer credentials and proof-session state are handled server-side and never entrusted to the browser.
- Run at the edge: the workload executes beside local records under OPA policy and runtime checks.
- Return approved artifacts: the researcher receives trained weights, metrics, hashes, and receipt metadata.
- Verify and settle: trusted proof material can support settlement after the required attestation path is satisfied.
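The five steps above can be sketched as a strictly ordered workflow in which settlement is additionally gated on attestation. The step names and the `JobWorkflow` class are hypothetical and exist only to show the ordering and the gate.

```python
from enum import IntEnum


class Step(IntEnum):
    DECLARED = 1
    AUTHENTICATED = 2
    EXECUTED_AT_EDGE = 3
    ARTIFACTS_RETURNED = 4
    SETTLED = 5


class JobWorkflow:
    """Tracks one job through the five steps; steps cannot be skipped or repeated."""

    def __init__(self) -> None:
        self.completed = []

    def advance(self, step: Step, *, attestation_verified: bool = False) -> None:
        if len(self.completed) == len(Step):
            raise ValueError("workflow is already settled")
        expected = Step(len(self.completed) + 1)
        if step is not expected:
            raise ValueError(f"expected {expected.name}, got {step.name}")
        # Settlement requires the attestation path to be satisfied first.
        if step is Step.SETTLED and attestation_verified is not True:
            raise PermissionError("settlement requires verified attestation")
        self.completed.append(step)


workflow = JobWorkflow()
for step in (
    Step.DECLARED,
    Step.AUTHENTICATED,
    Step.EXECUTED_AT_EDGE,
    Step.ARTIFACTS_RETURNED,
):
    workflow.advance(step)

try:
    workflow.advance(Step.SETTLED)  # no attestation supplied
except PermissionError as err:
    print("blocked:", err)

workflow.advance(Step.SETTLED, attestation_verified=True)
print([s.name for s in workflow.completed])
```

Keeping the attestation flag as an explicit keyword argument that defaults to `False` means a caller has to opt in to settlement; forgetting the argument blocks the release rather than allowing it.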
What is live, what is private alpha
- Live website and early-access intake: AI teams can register at /early-access.
- Secure Compute Console: live mode calls a configured enterprise-node API when available; demo mode is explicitly non-PHI and non-settleable.
- Enterprise node runtime: production runs require a valid trainer command, dataset manifest, OPA approval, mounted dataset path, artifact output, and verified TEE posture.
- Synthetic Render demo path: uses synthetic fixture data only. It is useful for smoke testing, not for clinical claims or USDC settlement.
- Production hospital onboarding: requires institutional approval, contracts, privacy/security review, and hardware attestation evidence.
Mainnet proof surface
Rapha Protocol has a public Polygon mainnet proof receipt and deployed settlement surfaces. This is a proof and audit surface, not a claim that production hospital PHI has been processed:
Proof receipt: 0xfadab8cc5e6bdb531d7ddfd64fd2a325a5dabda1c0f1eb7a21f05d15c618f9a0
Contract: 0xB27704CA8A01Bc151181D1d53E2F0eF11B39B32F
What this does and does not prove
The receipt proves that a public cryptographic commitment exists on Polygon mainnet. It does not prove clinical validity, model safety, regulatory clearance, de-identification, or healthcare compliance by itself.
Important: Rapha Protocol is private-alpha software. Public demos must not receive real PHI, DICOM exports, FHIR bundles, Apple Health exports, genetic data, private keys, seed phrases, or regulated production data. Production use requires written agreements, security review, privacy review, institutional approval, applicable BAA/DPA analysis, and verified hardware attestation.