Federated Learning vs Compute-to-Data for Healthcare AI
Two approaches to training without centralising data
When AI researchers need clinical data and institutions cannot export it, two architectural patterns emerge: federated learning and compute-to-data. Both aim to train models without centralising raw patient records. They differ fundamentally in what leaves the institution and what stays.
Federated learning: gradients leave, risk follows
In federated learning, a central server coordinates training across multiple sites. Each site trains a local model on its own data, then sends gradients or weight updates back to the central server. The server aggregates these updates and distributes the revised global model. Raw data stays local, at least in principle.
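The round described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a linear model with squared-error loss and the simplest variant of the scheme, in which each site sends one gradient per round and the server averages them weighted by sample count (often called FedSGD). The names `local_gradient` and `fed_round` are hypothetical and do not come from any FL framework.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Dataset = Sequence[Tuple[Vector, float]]  # (features, target) pairs held by one site


def local_gradient(weights: Vector, data: Dataset) -> Vector:
    """Mean squared-error gradient of a linear model on one site's data."""
    grad = [0.0] * len(weights)
    for x, y in data:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(data)
    return grad


def fed_round(weights: Vector, sites: Sequence[Dataset], lr: float = 0.1) -> Vector:
    """One synchronous round: each site sends a gradient, the server averages."""
    total = sum(len(d) for d in sites)
    agg = [0.0] * len(weights)
    for data in sites:
        g = local_gradient(weights, data)      # this vector is what leaves the site
        for i, gi in enumerate(g):
            agg[i] += gi * len(data) / total   # weight each site by its sample count
    return [w - lr * a for w, a in zip(weights, agg)]


# Two hypothetical sites; all three records are fit exactly by weights [2.0, -1.0].
site_a = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
site_b = [([1.0, 1.0], 1.0)]
w = [0.0, 0.0]
for _ in range(300):
    w = fed_round(w, [site_a, site_b])
```

Production frameworks typically run several local steps per round and send weight deltas rather than a single gradient, but the data flow is the same: a per-site update vector crosses the institutional boundary on every round.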
The problem: gradients leak information. Published attacks show that shared gradients can be used to reconstruct training data or to infer which records were in it:
- Deep Leakage from Gradients (Zhu et al., 2019) — reconstructed images from gradient updates with pixel-level fidelity.
- Gradient inversion attacks on text (Deng et al., 2021) — recovered training sentences from language model gradients.
- Membership inference — determining whether a specific record was in the training set from gradient patterns.
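To see why gradients are not a safe substitute for raw data, consider the simplest case. For a single fully connected layer with a bias, trained on one record, the weight gradient is the outer product of the output error and the input, and the bias gradient is the output error itself, so dividing one by the other returns the input. The sketch below demonstrates this analytic special case in plain Python. It is not the optimisation-based attack of Zhu et al., and the function names and feature values are illustrative assumptions.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def layer_gradients(W: Matrix, b: Vector, x: Vector, target: Vector) -> Tuple[Matrix, Vector]:
    """Gradients of 0.5 * ||Wx + b - target||^2 w.r.t. W and b for one record."""
    out = [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    delta = [o - t for o, t in zip(out, target)]       # dL/d(out), which equals dL/db
    grad_W = [[d * xj for xj in x] for d in delta]     # outer product of delta and x
    return grad_W, delta


def recover_input(grad_W: Matrix, grad_b: Vector) -> Vector:
    """Reconstruct the input record from the shared gradients alone."""
    # Pick the output unit with the largest error signal for numerical stability.
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    if grad_b[i] == 0.0:
        raise ValueError("zero error signal: this gradient reveals nothing about x")
    return [g / grad_b[i] for g in grad_W[i]]


# Hypothetical record: three made-up numeric features.
record = [63.0, 1.0, 7.2]
W = [[0.1, -0.2, 0.05], [0.0, 0.3, -0.1]]
b = [0.0, 0.1]
grad_W, grad_b = layer_gradients(W, b, record, target=[1.0, 0.0])
leaked = recover_input(grad_W, grad_b)   # matches `record` up to rounding
```

With larger batches the per-record gradients are summed, so this exact division no longer works; the attacks cited above use optimisation to handle those harder settings.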
Federated learning also introduces coordination complexity: in synchronous schemes the participating sites must be online for each round, network links must be reliable, and heterogeneous data distributions across sites (non-IID data) can slow or degrade convergence.
Compute-to-data: full training at the edge, weights only leave
With compute-to-data, the entire training job executes inside the data holder's environment. The researcher submits a model artifact, dataset intent, hyperparameters, and budget. An edge appliance runs the complete training loop locally against the clinical dataset. Only trained model weights (not gradients, not intermediate states) leave the boundary.
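The shape of such a job can be sketched as a small data structure plus an output gate. The source does not specify a submission schema, so everything here is a hypothetical illustration: the `TrainingJob` fields simply mirror the four items named above, and `releasable` assumes that final weights are written as `.safetensors` files and that anything else (gradients, logs, checkpoints in other formats) must stay inside the boundary.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption for this sketch: only this weights-only file format may leave the site.
ALLOWED_SUFFIXES = (".safetensors",)


@dataclass(frozen=True)
class TrainingJob:
    """Hypothetical job request mirroring the four submitted items."""
    model_artifact: str                                   # reference to the model code or image
    dataset_intent: str                                   # which dataset the job is meant to use
    hyperparameters: Dict[str, float] = field(default_factory=dict)
    budget_usdc: float = 0.0                              # spending cap for this job


def releasable(outputs: List[str]) -> List[str]:
    """Keep only files that look like final weights; drop everything else."""
    return [path for path in outputs if path.endswith(ALLOWED_SUFFIXES)]


job = TrainingJob(
    model_artifact="example-model-v1",        # placeholder identifier
    dataset_intent="example-cohort",          # placeholder identifier
    hyperparameters={"lr": 3e-4, "epochs": 5},
    budget_usdc=50.0,
)
kept = releasable(["model.safetensors", "grads_step_10.pt", "train.log"])
```

A filename filter is only a stand-in for real output validation, which would also need to inspect file contents; the point of the sketch is that the release decision is made once per job, on the final artifact, rather than on every training round.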
Key differences:
- No gradient transmission: per-round gradients never leave the site, which removes gradient inversion as an attack vector. Leakage through the released weights remains a separate risk.
- No coordination dependency: each training job is independent, with no multi-site synchronisation.
- No aggregation-induced convergence loss: the job trains on the site's complete dataset with a standard optimiser, rather than averaging partial updates from sites with differing distributions.
- Hardware-enforced isolation: an Intel SGX enclave or TDX trust domain, a kernel air-gap, and an OPA policy gate create a verifiable execution boundary.
- Per-job settlement: escrow and payment happen per training job, not per gradient update cycle.
When each approach makes sense
Federated learning may be appropriate for multi-site studies where no single site has enough data, for non-sensitive data shared under research agreements, and for scenarios where the model architecture requires gradient-level coordination.
Compute-to-data is better suited to single-site clinical datasets large enough for independent training, PHI-sensitive data where gradient leakage is unacceptable, regulated healthcare environments with strict data export controls, and commercial arrangements where the researcher pays per training job.
Rapha Protocol vs. NVIDIA FLARE / OpenFL / PySyft
Rapha Protocol is not a federated learning framework. It does not implement FL aggregation, gradient encryption, or multi-party computation. It is a compute orchestration, attestation, and settlement layer that enables full training at the edge. The training runtime itself can use any ML framework (PyTorch, TensorFlow, JAX). The protocol handles authentication, policy enforcement, hardware attestation, job routing, network isolation, output validation, proof generation, and USDC settlement on Polygon mainnet.
Rapha Protocol does not claim that trained weights are automatically privacy-preserving. Additional leakage testing, differential privacy, and minimum cohort thresholds should be evaluated per deployment. This page compares architectural approaches, not regulatory outcomes.