Does Federated Learning Keep Patient Data Private?
Short answer: No, not on its own. Federated learning transmits gradients or weight updates, and published attacks can invert them to reconstruct training data.
Federated learning is marketed as a privacy solution: "data stays local, only model updates are shared." This framing omits a critical finding from peer-reviewed research: model gradients can contain enough information to reconstruct training data, in some settings with high fidelity.
"I've reproduced gradient inversion attacks on medical imaging data. Using the Deep Leakage from Gradients technique (Zhu et al., 2019), I can reconstruct chest X-rays from federated learning gradient updates with recognizable diagnostic features. If your federated learning system transmits gradients from a model training on patient data, those gradients are functionally equivalent to transmitting the data itself — just in a compressed, lossy format that the research community has repeatedly shown can be inverted. The 'privacy by staying local' claim is technically true for raw data, but misleading when applied to the actual information flow in FL systems."
Key research on gradient leakage
- Zhu et al. (2019), Deep Leakage from Gradients: Demonstrated reconstruction of training images from shared gradients, in many cases with pixel-level fidelity. Images from MNIST, CIFAR-100, SVHN, and LFW (faces) were successfully reconstructed.
- Geiping et al. (2020), Inverting Gradients: Extended gradient inversion to ImageNet-scale images and deep ResNet architectures. Showed that some images remain recoverable even when gradients are averaged over a batch.
- Deng et al. (2021), TAG: Gradient Attack on Transformer-based Language Models: Recovered training text from the gradients of Transformer language models. Directly relevant to clinical NLP on patient notes.
- Wei et al. (2020), A Framework for Evaluating Gradient Leakage Attacks in Federated Learning: Systematic evaluation arguing that gradient leakage is a fundamental limitation rather than an implementation bug: it exists because gradients encode the relationship between the data and the model parameters.
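To make that last point concrete, here is a minimal NumPy sketch of the simplest possible case. It is not taken from any of the papers above. It assumes a single fully connected layer with a bias, trained with softmax cross-entropy on one example. In that setting each row of the weight gradient is the input scaled by the matching bias gradient, so a single division recovers the input. The names `client_gradients`, `recover_input`, and `x_private`, and the 64-value input, are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum()

def client_gradients(W, b, x, label):
    """Gradients of softmax cross-entropy for one example on a linear layer."""
    p = softmax(W @ x + b)
    dz = p.copy()
    dz[label] -= 1.0                 # dL/dz = p - onehot(label)
    return np.outer(dz, x), dz       # dL/dW = dz x^T, dL/db = dz

def recover_input(grad_W, grad_b):
    """Row i of dL/dW equals dL/db[i] * x, so dividing gives x back."""
    i = np.argmax(np.abs(grad_b))    # use the row with the largest bias gradient
    return grad_W[i] / grad_b[i]

rng = np.random.default_rng(0)
x_private = rng.random(64)           # stand-in for a flattened 8x8 image (assumption)
W = 0.1 * rng.normal(size=(10, 64))  # hypothetical current model weights
b = np.zeros(10)

# What the client would send in a gradient-sharing round.
grad_W, grad_b = client_gradients(W, b, x_private, label=3)

# What an observer of that update can compute.
x_rec = recover_input(grad_W, grad_b)
print(np.allclose(x_rec, x_private))
```

Deep networks and batched updates do not admit this closed-form shortcut, which is why the attacks listed above rely on iterative optimization instead. The sketch only shows why a gradient is not an opaque summary of the data.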
The alternative: compute-to-data with hardware TEE
Rapha Protocol runs the full training job locally at the hospital, inside a hardware TEE (an Intel SGX enclave or TDX trust domain). No gradients are transmitted, and no intermediate states leave the institution. Only the trained model weights exit, after training completes. Gradient inversion over the network is not possible in this design because gradients never cross the network boundary.
Community Q&A
"Adding DP noise to gradients can reduce inversion fidelity, but it also degrades model quality — especially on rare classes and minority populations, which are exactly the cases clinical AI needs to capture. There's no free lunch: you trade privacy for utility. With hardware TEE-based compute-to-data, you get both."
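The trade-off described in this answer can be illustrated with a rough sketch. It assumes the common clip-then-add-Gaussian-noise recipe and the same single-layer setting where the input equals a weight-gradient row divided by the bias gradient. The function names, the clip norm of 1.0, and the noise multipliers are hypothetical. The code does no privacy accounting, so it says nothing about any particular epsilon.

```python
import numpy as np

def privatize(update, clip_norm, noise_multiplier, rng):
    """Clip the update to L2 norm <= clip_norm, then add Gaussian noise."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

def inversion_error(noise_multiplier, seed=0, n_in=64, n_out=10):
    """Mean squared error of the analytic linear-layer inversion on a noised update."""
    rng = np.random.default_rng(seed)
    x = rng.random(n_in)             # hypothetical private input
    dz = rng.normal(size=n_out)      # hypothetical dL/dz (also equals dL/db)
    update = np.concatenate([np.outer(dz, x).ravel(), dz])   # [dL/dW, dL/db]

    noisy = privatize(update, clip_norm=1.0,
                      noise_multiplier=noise_multiplier, rng=rng)
    grad_W = noisy[: n_out * n_in].reshape(n_out, n_in)
    grad_b = noisy[n_out * n_in:]

    i = np.argmax(np.abs(grad_b))
    x_rec = grad_W[i] / grad_b[i]
    return float(np.mean((x_rec - x) ** 2))

# With zero noise, clipping only rescales the update, so inversion still works.
for sigma in (0.0, 0.01, 0.1, 1.0):
    print(sigma, inversion_error(sigma))
```

The same noise that spoils the reconstruction is also added to the signal the model learns from, which is the utility cost the answer refers to.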
"Secure aggregation prevents the central server from seeing individual gradients, but it doesn't eliminate gradient leakage. The updates are still transmitted, just masked or encrypted so that only their sum can be decoded. The aggregator still receives and processes that sum, which is itself a gradient over the participants' data. And secure aggregation adds computational overhead that scales with the number of participants."
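For readers unfamiliar with secure aggregation, the toy sketch below shows the pairwise-masking idea behind it. Each pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel in the sum. This is an illustration under loud assumptions: real protocols use key agreement, modular integer arithmetic, and dropout recovery, none of which appear here, and the Gaussian masks over floats are not cryptographically secure. The function `pairwise_masks` and the 4-client, 8-dimension setup are hypothetical.

```python
import numpy as np

def pairwise_masks(num_clients, dim, seed=0):
    """For each pair i < j, draw one shared mask; i adds it and j subtracts it."""
    rng = np.random.default_rng(seed)
    masks = np.zeros((num_clients, dim))
    for i in range(num_clients):
        for j in range(i + 1, num_clients):   # n*(n-1)/2 pairs in total
            m = rng.normal(size=dim)
            masks[i] += m
            masks[j] -= m
    return masks

rng = np.random.default_rng(1)
updates = rng.normal(size=(4, 8))      # hypothetical per-client gradient updates
masked = updates + pairwise_masks(4, 8)

# The server only ever sees `masked`; individual rows look unrelated to `updates`.
server_sum = masked.sum(axis=0)

# The masks cancel, so the server still learns the exact sum of all updates.
print(np.allclose(server_sum, updates.sum(axis=0)))
```

Two things are visible even in this toy version: the number of pairwise masks grows quadratically with the number of clients, and the server ends up holding the exact aggregate update.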