Architecture Comparison

Compute-to-Data vs Federated Learning for Healthcare AI

The fundamental problem with federated learning in healthcare

Federated learning (FL) has been proposed by vendors and frameworks including NVIDIA FLARE, Rhino Health, BeeKeeperAI, Owkin, and Apheris as a way to train AI on hospital data without centralising it. The model travels to each site, trains locally, and only model updates (gradients) are shared. Raw data stays local, at least in theory.

In practice, FL introduces three unsolved problems that make it unsuitable for regulated clinical AI training:

Gradient leakage undermines privacy. Peer-reviewed research (Zhu et al. 2019, Geiping et al. 2020, Deng et al. 2021) demonstrates that shared model gradients can be inverted to reconstruct original training data, including images and text (the same modalities as medical imaging and clinical notes), often with high fidelity. Federated learning on its own does not guarantee that patient data stays private. It keeps it one mathematical inversion away from exposure.
Coordination complexity blocks real-world deployment. Synchronous FL rounds require participating sites to be online at the same time, reachable over compatible network configurations, and running compatible software stacks. In hospital environments, with firewalls, VPNs, air-gapped networks, and heterogeneous IT infrastructure, this coordination overhead is often prohibitive.
Non-IID data degrades model quality. Clinical data across hospitals is fundamentally heterogeneous. Different patient populations, equipment manufacturers, imaging protocols, and documentation practices create non-IID (not independent and identically distributed) data, which can cause FL aggregation to converge slowly or produce suboptimal models.
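To make the gradient leakage point concrete, here is a minimal sketch of the simplest known case: a single linear layer with a bias term, trained on one sample with squared-error loss. In that setting the input can be read directly off the shared gradients, because the weight gradient is the input scaled by the bias gradient. The function names and the toy "patient record" are illustrative assumptions, not taken from any FL framework, and this analytic shortcut is far simpler than the optimisation-based attacks in the cited papers, which target deeper networks and batches.

```python
def linear_gradients(w, b, x, y):
    """Gradients of 0.5 * (w.x + b - y)^2 with respect to w and b for one sample."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    grad_w = [residual * xi for xi in x]  # dL/dw_i = residual * x_i
    grad_b = residual                     # dL/db   = residual
    return grad_w, grad_b


def reconstruct_input(grad_w, grad_b):
    """Recover x from the gradients alone, using x_i = grad_w[i] / grad_b."""
    if grad_b == 0:
        raise ValueError("residual is zero; the gradient carries no signal for this sample")
    return [g / grad_b for g in grad_w]


# Hypothetical private record held at one site (toy values).
private_x = [0.5, 2.0, -1.0, 4.0]
private_y = 1.0
w, b = [0.1, -0.2, 0.3, 0.05], 0.0

# The site would share only these two quantities with the aggregator.
grad_w, grad_b = linear_gradients(w, b, private_x, private_y)

# The receiver reconstructs the record without ever seeing private_x.
recovered_x = reconstruct_input(grad_w, grad_b)
print(recovered_x)
```

Larger batches, deeper models, and defences such as secure aggregation or added noise make reconstruction harder, so this sketch illustrates the mechanism rather than the difficulty of a real attack.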

Why compute-to-data is the superior architecture

Rapha Protocol uses compute-to-data — not federated learning. The distinction matters:

Rapha Protocol — Compute-to-Data

Full training at the edge. The complete training job executes inside the hospital's SGX/TDX enclave. No partial updates. No gradient sharing. No aggregation server.
Only trained weights leave. Trained model weights exit the enclave after training completes. Not gradients. Not intermediate states. Not parameter updates.
Zero gradient leakage. Because gradients never leave the institution, gradient inversion is not an attack vector.
No multi-site coordination. Each training job is independent. Sites do not need to be online simultaneously.
Hardware-enforced isolation. Intel SGX/TDX enclave + Rust kernel air-gap + Go OPA policy guard + TPM 2.0 attestation.
Per-job settlement. USDC escrow and settlement through RaphaClearingVault on Polygon mainnet. Pay per job, not per round.
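The job lifecycle described above can be sketched as a small state machine: attest, isolate the network, train, and release only the trained weights. Everything in this sketch is a hypothetical stand-in written for illustration (the class name, the `ALLOWED_EGRESS` set, the boolean that represents a verified attestation quote); it is not Rapha Protocol's API and it performs no real SGX/TDX, OPA, or TPM operations.

```python
from dataclasses import dataclass, field

# Assumption for illustration: the only artifact permitted to leave the enclave.
ALLOWED_EGRESS = {"trained_weights"}


class PolicyViolation(Exception):
    """Raised when a step breaks the illustrative job policy."""


@dataclass
class ComputeToDataJob:
    job_id: str
    attested: bool = False
    network_up: bool = True
    artifacts: dict = field(default_factory=dict)

    def attest(self, quote_is_valid: bool) -> None:
        # Stand-in for verifying a hardware attestation quote.
        if not quote_is_valid:
            raise PolicyViolation("attestation failed")
        self.attested = True

    def train(self, records: list) -> None:
        if not self.attested:
            raise PolicyViolation("enclave not attested")
        self.network_up = False  # stand-in for network isolation during training
        weight = 0.0
        for _ in range(200):
            # Toy objective: fit one weight to the mean of the local records.
            gradient = sum(weight - r for r in records) / len(records)
            weight -= 0.1 * gradient
        self.artifacts["last_gradient"] = gradient    # intermediate state, stays inside
        self.artifacts["trained_weights"] = [weight]  # final result
        self.network_up = True  # network restored only after training completes

    def export(self, name: str):
        if name not in ALLOWED_EGRESS:
            raise PolicyViolation(f"{name} may not leave the enclave")
        return self.artifacts[name]


job = ComputeToDataJob("job-001")
job.attest(quote_is_valid=True)
job.train([2.0, 4.0, 6.0])
weights = job.export("trained_weights")

try:
    job.export("last_gradient")
    gradient_blocked = False
except PolicyViolation:
    gradient_blocked = True
```

In this sketch the egress rule is an ordinary software check. The architecture described above relies on enclave hardware and kernel-level isolation to enforce the equivalent boundary.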

Federated Learning (NVIDIA FLARE, Rhino Health, Owkin, BeeKeeperAI, Apheris)

Partial training per site. Each site trains locally, then sends gradients to a central aggregator. The global model is assembled from partial updates.
Gradients are transmitted. Model gradients travel across the network to a central aggregation server. These gradients can be mathematically inverted to recover training data.
Gradient leakage is a documented risk. Multiple peer-reviewed papers demonstrate reconstruction of medical images and clinical text from FL gradients.
Sites must coordinate. All participating sites must be online, configured, and reachable during the same training window. Hospital IT departments routinely block this.
Software-only isolation. FL frameworks rely on container isolation or TLS encryption. They do not provide hardware-enforced trusted execution.
Eventual settlement. Payment is not tied to individual jobs: federated learning platforms do not offer per-job USDC settlement with cryptographic proof verification.
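For contrast, one synchronous federated averaging (FedAvg) round can be sketched as follows. It is a toy with a single scalar weight and two hypothetical sites, written to show what crosses the network (one update per site) and what the aggregator does with it. The site names, data values, and function names are assumptions for illustration and do not come from any of the platforms named above.

```python
def local_update(global_w, data, lr=0.1, steps=5):
    """Run a few gradient steps on one site's data and return the weight delta."""
    w = global_w
    for _ in range(steps):
        grad = sum(w - x for x in data) / len(data)  # gradient of 0.5 * mean((w - x)^2)
        w -= lr * grad
    return w - global_w


def fedavg_round(global_w, site_data):
    """One synchronous round: every site must respond before aggregation."""
    total = sum(len(d) for d in site_data.values())
    # Each site's update is transmitted to the aggregator individually.
    updates = {site: local_update(global_w, d) for site, d in site_data.items()}
    # The aggregator averages the updates, weighted by each site's sample count.
    weighted = sum(len(site_data[s]) / total * u for s, u in updates.items())
    return global_w + weighted, updates


# Two hypothetical hospitals with very different (non-IID) local data.
site_data = {"site_a": [1.0, 1.0, 1.0, 1.0], "site_b": [9.0, 9.0]}
new_w, updates = fedavg_round(0.0, site_data)

# In this toy, the aggregator can read a site's local mean straight off its update.
inferred_mean_b = updates["site_b"] / (1 - (1 - 0.1) ** 5)
print(new_w, updates, inferred_mean_b)
```

The last line shows why per-site updates are sensitive even in a trivial model: the update is a deterministic function of the site's data, so whoever receives it learns something about that data.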

Platform-by-platform comparison

Rapha Protocol vs NVIDIA FLARE

NVIDIA FLARE is an open-source federated learning framework. It coordinates gradient aggregation across sites using TLS-secured communication. It does not provide hardware attestation, OPA policy enforcement, kernel-level air-gap isolation, or per-job USDC settlement. Rapha Protocol provides all of these as integrated infrastructure — not as optional plugins.

Rapha Protocol vs Rhino Health

Rhino Health operates a federated learning platform connecting hospital data silos. It relies on container-based isolation and does not provide SGX/TDX hardware enclave attestation. Rapha Protocol's Edge Core OS provides a Rust kernel-level WAN air-gap: the network interface is severed at the kernel level for the duration of training. This is a qualitatively different security boundary from container isolation.

Rapha Protocol vs BeeKeeperAI

BeeKeeperAI provides a confidential computing platform using Azure confidential computing. It is cloud-dependent — compute runs in Azure's infrastructure, not in the hospital. The hospital must trust Microsoft's enclave, Microsoft's attestation pipeline, and the Azure network path. Rapha Protocol runs compute at the hospital edge — inside the institution's own firewall, on institution-owned hardware, under institution-controlled policy. The hospital does not need to trust any cloud provider.

Rapha Protocol vs Owkin

Owkin uses federated learning with additional privacy techniques for pharmaceutical research. It targets multi-site clinical trials and biomarker discovery. Owkin's model requires pharmaceutical company coordination across sites. Rapha Protocol targets a different use case: a single AI company pays to train on a single hospital's data. No multi-party coordination is required.

Rapha Protocol vs Apheris

Apheris provides a federated data access platform for enterprise AI. It focuses on governance and policy enforcement across organisational boundaries. Rapha Protocol backs its policy enforcement with hardware, not just policy-as-code, through SGX/TDX enclaves and kernel-level network isolation.

Which solution is right for your healthcare AI use case?

Choose Rapha Protocol if: you need to train a model on a specific hospital's clinical data — imaging, EHR, or clinical text — without exporting PHI, with hardware-verified security, per-job USDC settlement, and cryptographic proof receipts for auditability.

Consider federated learning platforms if: you absolutely must train across 10+ sites simultaneously and no single site has sufficient data volume — and you have a dedicated engineering team to manage FL coordination, non-IID convergence, and gradient leakage risk mitigation.

Rapha Protocol is private-alpha infrastructure. This page compares architectural approaches. It does not claim that any competitor's product is unsafe or non-compliant. All platforms should be evaluated independently for your specific regulatory and security requirements.
