Federated learning, differential privacy, secure computation, and the privacy-utility tradeoff
Privacy regulations (GDPR, CCPA, HIPAA) and user expectations increasingly demand that AI systems handle sensitive data responsibly. Yet training ML models often requires large datasets—and that data might contain personal information you'd rather not expose. Privacy computing techniques enable useful computation on sensitive data while providing mathematical guarantees about what cannot be learned.
This article covers the main privacy-preserving technologies: federated learning, differential privacy, secure multi-party computation, and homomorphic encryption.
Federated learning trains models across distributed datasets without centralizing the data. The key insight: move the model to the data, not the data to the model.
1. SERVER: Broadcast model to participating devices
2. DEVICES: Train the model on locally held data
3. DEVICES: Send model updates (not data) to server
4. SERVER: Aggregate updates (FedAvg or variants)
5. SERVER: Update shared model
6. Repeat until convergence
Google uses federated learning for keyboard prediction on Android—your phone trains locally on your typing patterns and only shares model updates, not your keystrokes.
The basic federated algorithm:
Server aggregation step (FedAvg):
w_{k+1} = Σ_i (n_i / n) × w_i
Where:
n_i = number of samples on device i
n = total samples across all devices
w_i = model update from device i
This weighted average combines updates in proportion to each device's dataset size, as in the sketch below.
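A minimal NumPy sketch of this aggregation step (function and variable names are illustrative, not from any particular framework):

```python
import numpy as np

def fedavg(updates, sample_counts):
    """Weighted average of client model updates (FedAvg).

    updates: list of parameter vectors w_i, one per device
    sample_counts: list of n_i, the number of samples on device i
    """
    n = sum(sample_counts)  # total samples across all devices
    # w_{k+1} = sum_i (n_i / n) * w_i
    return sum((n_i / n) * w_i for w_i, n_i in zip(updates, sample_counts))

# Three devices with different amounts of data
updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
counts = [10, 30, 60]
print(fedavg(updates, counts))  # devices with more data get more weight
```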
Model updates are not the same as no information. Research has shown that gradients can leak training data: gradient inversion attacks can reconstruct individual training examples from shared updates, and membership inference attacks can reveal whether a specific record was in a device's training set.
Solution: Combine federated learning with differential privacy or secure aggregation.
Differential privacy provides mathematical guarantees that individual data points cannot be identified—even with full access to the model and outputs.
A mechanism M provides (ε, δ)-differential privacy if for any
adjacent datasets D and D' (differing in one record) and any
set of outputs O:
Pr[M(D) ∈ O] ≤ e^ε × Pr[M(D') ∈ O] + δ
Where:
ε = privacy budget (smaller = more private)
δ = probability of violating ε guarantee
In simpler terms: removing or changing any single record doesn't significantly change the probability of any output.
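One concrete mechanism satisfying this definition is randomized response, a standard warm-up example (not covered above): each user reports the truth with probability 3/4 and the opposite with probability 1/4, giving ε = ln(3) ≈ 1.1. A quick demonstration:

```python
import math
import random

def randomized_response(true_answer: bool, p: float = 0.75) -> bool:
    """Report the true answer with probability p, the flipped answer otherwise."""
    return true_answer if random.random() < p else not true_answer

# Worst-case output probability ratio: p / (1 - p) = 3, so epsilon = ln(3)
epsilon = math.log(0.75 / 0.25)
print(f"epsilon = {epsilon:.3f}")  # ~1.099

# The analyst can still estimate the population rate from many noised answers
answers = [randomized_response(True) for _ in range(100_000)]
observed = sum(answers) / len(answers)  # ~0.75 when everyone's truth is "True"
estimated = (observed - 0.25) / 0.5     # debias: E[observed] = 0.25 + 0.5 * rate
print(f"estimated rate ~ {estimated:.2f}")
```

No individual answer can be trusted, yet aggregate statistics remain accurate, which is the privacy-utility tradeoff in miniature.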
Add noise to the output calibrated to the sensitivity of the query:
M(D) = f(D) + Laplace(Δf / ε)
Where:
f(D) = query result (e.g., count, mean)
Δf = sensitivity (max change from removing one record)
ε = privacy parameter
Laplace(b) = noise drawn from a Laplace distribution with scale b
For a count query, sensitivity is 1 (removing one person changes count by at most 1). For a sum, sensitivity is the maximum value any individual can contribute.
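A minimal sketch of the Laplace mechanism for a count query (function names are illustrative):

```python
import numpy as np

def laplace_count(data, predicate, epsilon):
    """epsilon-DP count: true count plus Laplace(sensitivity / epsilon) noise.

    The sensitivity of a count is 1: one record changes it by at most 1.
    """
    true_count = sum(1 for x in data if predicate(x))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

ages = [23, 35, 41, 29, 64, 51, 38]
# How many people are over 40? True answer: 3; the released answer is noised.
print(laplace_count(ages, lambda a: a > 40, epsilon=0.5))
```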
For vector-valued queries or long sequences of queries, the Gaussian mechanism is often used instead:
M(D) = f(D) + N(0, σ²)
Where σ² ≥ 2 × ln(1.25/δ) × (Δf)² / ε²  (valid for ε < 1, with Δf measured in the L2 norm)
The Gaussian mechanism provides the relaxed (ε, δ)-DP guarantee rather than the pure ε-DP of the Laplace mechanism, but it composes more gracefully and calibrates noise to L2 rather than L1 sensitivity.
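The same calibration in code (an illustrative helper, not a hardened implementation):

```python
import numpy as np

def gaussian_mechanism(value, l2_sensitivity, epsilon, delta):
    """(epsilon, delta)-DP release of a numeric query result (epsilon < 1)."""
    # sigma >= sqrt(2 * ln(1.25 / delta)) * sensitivity / epsilon
    sigma = np.sqrt(2 * np.log(1.25 / delta)) * l2_sensitivity / epsilon
    return value + np.random.normal(0.0, sigma)

print(gaussian_mechanism(42.0, l2_sensitivity=1.0, epsilon=0.5, delta=1e-5))
```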
Privacy budget (ε) depletes with each query. Under sequential composition, answering k queries with budgets ε_1, ..., ε_k consumes ε_1 + ... + ε_k in total; once the budget is spent, further queries would weaken the guarantee.
Apple uses local differential privacy for device analytics: data such as emoji usage frequencies and QuickType suggestions is noised on the device before it is ever sent to Apple's servers.
DP-SGD (Differentially Private Stochastic Gradient Descent) trains neural networks with differential privacy guarantees:
For each batch:
1. Compute per-example gradients
2. Clip each gradient to a maximum L2 norm C
3. Add Gaussian noise with standard deviation σ × C (the noise multiplier σ is chosen via privacy accounting)
4. Average and apply the update
Result: The trained model satisfies an (ε, δ)-DP guarantee, with ε tracked across steps by the privacy accountant
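A toy NumPy sketch of one such step (hyperparameter values are illustrative; production implementations such as Opacus or TensorFlow Privacy also perform the privacy accounting):

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.1, lr=0.1):
    """One DP-SGD step: clip per-example gradients, sum, add noise, average."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds C = clip_norm
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # Gaussian noise with std sigma * C; sigma drives the (epsilon, delta) spend
    total = total + np.random.normal(0.0, noise_multiplier * clip_norm,
                                     size=total.shape)
    return params - lr * total / len(per_example_grads)

params = np.zeros(3)
batch_grads = [np.random.randn(3) for _ in range(32)]  # one gradient per example
params = dp_sgd_step(params, batch_grads)
print(params)
```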
Google and Microsoft have published differentially private trained models, and Google has deployed DP federated training for Gboard's next-word prediction.
Secure Multi-Party Computation (MPC) enables multiple parties to compute a function on their combined inputs without revealing inputs to each other.
The classic illustration is Yao's Millionaires' Problem: two millionaires want to know who is richer without revealing their wealth.
Solution: Use garbled circuits and oblivious transfer
Result: Both learn only who is richer, and nothing else about the other's wealth
Split a secret into shares distributed to parties:
Shamir's Secret Sharing:
- Split secret into n shares
- Any t shares can reconstruct secret
- Fewer than t shares reveal nothing
Example: (2,3) sharing of the secret 42 using the line f(x) = 42 − 25x
Share 1: (1, 17)
Share 2: (2, -8)
Share 3: (3, -33)
Any 2 shares reconstruct the line; the secret is f(0) = 42. (Real implementations work over a finite field rather than the integers.)
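A compact implementation of (t, n) Shamir sharing over a prime field (the small prime is for readability; real systems use large primes):

```python
import random

P = 2**13 - 1  # small prime field for illustration

def share(secret, t, n):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    poly = lambda x: sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (x_i, y_i) in enumerate(shares):
        num, den = 1, 1
        for j, (x_j, _) in enumerate(shares):
            if i != j:
                num = (num * -x_j) % P
                den = (den * (x_i - x_j)) % P
        secret = (secret + y_i * num * pow(den, -1, P)) % P
    return secret

shares = share(42, t=2, n=3)
print(reconstruct(shares[:2]))  # any 2 of the 3 shares recover 42
```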
| Protocol | Setting | Security | Performance |
|---|---|---|---|
| Garbled Circuits | 2-party | Semi-honest or malicious | Moderate; constant rounds |
| Secret Sharing (e.g., BGW) | n-party | Varies by protocol | Good for arithmetic circuits |
| GMW (Goldreich-Micali-Wigderson) | n-party | Semi-honest | Boolean circuits; rounds grow with depth |
| SPDZ | n-party | Malicious | Slower, but actively secure |
Google open-sourced Private Join and Compute, a library for private set intersection with aggregation: two parties can compute statistics (such as a sum) over the intersection of their datasets without revealing the intersection itself.
Homomorphic encryption (HE) allows computation on encrypted data. You can encrypt data, perform computations, and decrypt results—without ever seeing the plaintext.
Classical encryption: to compute f(x), the server must first decrypt, so it sees the plaintext
HE: Dec(Eval_f(Enc(x))) = f(x), where Eval_f works entirely on ciphertexts
Order of operations:
1. Client encrypts data: Enc(x)
2. Server computes on the ciphertext, producing Enc(f(x))
3. Client decrypts: Dec(Enc(f(x))) = f(x)
Server never sees x or f(x)
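A toy demonstration of this encrypt-evaluate-decrypt flow using the Paillier cryptosystem, which is additively homomorphic (a partially homomorphic scheme, simpler than the fully homomorphic schemes in the table below; the primes here are insecurely small, for illustration only):

```python
import math
import random

# Toy Paillier keypair (insecurely small primes, for illustration only)
p, q = 1_000_003, 1_000_033
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
g = n + 1

def encrypt(m):
    r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    # L(x) = (x - 1) // n; m = L(c^lam mod n^2) * mu mod n
    L = lambda x: (x - 1) // n
    mu = pow(L(pow(g, lam, n2)), -1, n)
    return (L(pow(c, lam, n2)) * mu) % n

# The server multiplies ciphertexts to add the plaintexts, never seeing them:
c = (encrypt(20) * encrypt(22)) % n2  # Enc(20) * Enc(22) = Enc(20 + 22)
print(decrypt(c))  # 42
```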
| Type | Operations | Performance | Use Case |
|---|---|---|---|
| Somewhat (SHE) | Limited (+, -, ×) | Moderate | Specific applications |
| Leveled FHE | Bounded circuit depth | Slow | Research |
| FHEW/TFHE | Arbitrary (bootstrapping) | Very slow | Research |
| CKKS (approximate) | Addition, multiplication | Moderate | Machine learning |
CKKS (Cheon-Kim-Kim-Song) is the scheme most practical for ML:
CKKS properties:
- Encrypted vectors of real numbers
- Addition and multiplication work
- Results are approximate (rounding errors)
- Rescaling needed after multiplications
ML operations possible:
- Matrix multiplication and convolution
- Activation functions (via polynomial approximation)
- Polynomial evaluation
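Since comparisons and exponentials are not natively available under CKKS, activations are replaced by low-degree polynomial fits. A quick sketch of approximating sigmoid with a degree-3 polynomial (pure NumPy, illustrative):

```python
import numpy as np

# Fit a degree-3 polynomial to sigmoid over the range we expect activations in
x = np.linspace(-5, 5, 1000)
sigmoid = 1 / (1 + np.exp(-x))
coeffs = np.polyfit(x, sigmoid, deg=3)

# The polynomial uses only + and x, so it can be evaluated on CKKS ciphertexts
approx = np.polyval(coeffs, x)
print(f"max error on [-5, 5]: {np.max(np.abs(approx - sigmoid)):.4f}")
```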
Microsoft SEAL, IBM HElib, and PALISADE are popular HE libraries.
HE is slow: typically 10,000-100,000x slower than the equivalent plaintext computation.
Practical systems therefore combine HE with other techniques: for example, evaluating a network's linear layers homomorphically while handling non-linear activations with MPC, or reserving HE for small, latency-tolerant subcomputations.
More privacy requires more noise or less information, which reduces model utility. This tradeoff is fundamental.
| Technique | Privacy Guarantee | Utility Impact | Computational Cost |
|---|---|---|---|
| Federated Learning | None alone | Minimal | Moderate |
| Federated + DP | Strong | Moderate | Moderate |
| Centralized DP | Strong | Moderate to High | Low |
| MPC | Strong | Minimal | High |
| HE | Strong | Minimal | Very High |
Production privacy systems often combine multiple approaches:
Federated Learning + Differential Privacy + Secure Aggregation
1. Federated: Model trained on distributed devices
2. DP: Each update clipped and noised
3. Secure Aggregation: Server sees only aggregate, not individual updates
Result: Privacy guarantees hold even against the server itself
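A toy sketch of the pairwise-masking idea behind secure aggregation (real protocols, such as Bonawitz et al.'s, derive the masks from key agreement and handle client dropouts; here the masks are simply shared random vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim = 3, 4
updates = [rng.normal(size=dim) for _ in range(n_clients)]

# Each pair (i, j), i < j, agrees on a random mask; i adds it, j subtracts it
masks = {(i, j): rng.normal(size=dim)
         for i in range(n_clients) for j in range(i + 1, n_clients)}

def masked_update(i):
    m = updates[i].copy()
    for (a, b), mask in masks.items():
        if a == i:
            m += mask
        elif b == i:
            m -= mask
    return m

# The server sees only masked updates; the pairwise masks cancel in the sum
server_sum = sum(masked_update(i) for i in range(n_clients))
print(np.allclose(server_sum, sum(updates)))  # True
```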
Privacy computing techniques have matured significantly. Federated learning enables model training across distributed data. Differential privacy provides mathematical guarantees. MPC enables secure joint computation. Homomorphic encryption allows computation on encrypted data.
Each technique involves tradeoffs between privacy strength, utility, and computational cost. The right approach depends on the use case: federated learning (plus DP) when data must stay on user devices, centralized DP when a trusted curator holds the data, MPC when a few organizations need to compute jointly, and HE when an untrusted server must operate on encrypted inputs.
The field is advancing rapidly. As techniques mature and computation becomes cheaper, expect privacy-preserving ML to become increasingly practical.