
FEDERATED LEARNING & ENCRYPTION

Undergraduate thesis exploring privacy-utility trade-offs in federated learning using homomorphic encryption, differential privacy, and a novel sequential hybrid approach.

PyTorch · Opacus · Pyfhel · Docker · CIFAR-10

Overview

This undergraduate thesis systematically evaluates software-based privacy mechanisms in federated learning. We implemented and compared five FL configurations — baseline, homomorphic encryption (HE), differential privacy (DP), standard hybrid (simultaneous HE+DP), and a novel sequential hybrid approach (DP during training, HE during aggregation) — on CIFAR-10 image classification. The sequential hybrid is our key contribution: by temporally separating privacy mechanisms, it prevents error compounding and improves learning stability over the standard simultaneous application. All experiments were containerized with Docker for reproducibility.
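The core idea of the sequential hybrid can be sketched as two temporally separated phases. The snippet below is an illustrative NumPy simulation, not the thesis implementation: the clipping bound and noise multiplier are assumed values, and the plain-text averaging in `he_aggregate` stands in for the homomorphic (CKKS) addition a real deployment would perform.

```python
import numpy as np

# Sequential hybrid sketch: DP noise is injected during local training,
# and encryption is applied only at aggregation time, so the two privacy
# mechanisms never perturb the same step (illustrative values throughout).

CLIP_NORM = 1.0   # DP clipping bound C (assumed)
NOISE_MULT = 0.8  # Gaussian noise multiplier sigma (assumed)

def dp_local_update(update, rng):
    """Phase 1 (training): clip the client update, then add Gaussian noise."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, CLIP_NORM / (norm + 1e-12))
    return clipped + rng.normal(0.0, NOISE_MULT * CLIP_NORM, size=update.shape)

def he_aggregate(updates):
    """Phase 2 (aggregation): average the already-noised updates.

    A real deployment would encrypt each update with CKKS (e.g. via Pyfhel)
    and sum ciphertexts; this plain-text mean is a stand-in for that
    homomorphic addition.
    """
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
client_updates = [rng.normal(size=4) for _ in range(10)]
noisy = [dp_local_update(u, rng) for u in client_updates]  # DP during training
global_update = he_aggregate(noisy)                        # HE during aggregation
```

Because noise is added before encryption rather than alongside it, the two error sources do not compound within a single aggregation step.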

Key Results

  • Baseline Accuracy: 68.34%
  • HE Accuracy: 57.90%
  • DP Accuracy: 35.05%
  • Sequential Hybrid Accuracy: 33.51%
  • HE non-IID Retention: 92.1%
  • HE Communication Reduction: 94%

Methodology

Built a modular FL framework in PyTorch with a client-server architecture using FedAvg. Implemented DP via Opacus (gradient clipping + calibrated Gaussian noise) and HE via Pyfhel (CKKS scheme with parameter quantization and chunking). Evaluated across IID and non-IID (Dirichlet α=0.5, α=0.1) data distributions with 10 clients. Privacy was measured through membership inference attacks (threshold-based and shadow model approaches). Resource-overhead analysis tracked computation time, communication costs, and memory usage.
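The FedAvg rule at the center of the framework is a dataset-size-weighted average of client parameters. A minimal pure-Python sketch (the thesis applies the same rule to PyTorch `state_dict`s; the parameter names and toy values here are illustrative):

```python
# Minimal FedAvg aggregation over client parameter dicts, weighted by
# local dataset size: w = sum_k (n_k / n) * w_k.

def fedavg(client_params, client_sizes):
    """Weighted average of per-client parameter dicts."""
    total = sum(client_sizes)
    keys = client_params[0].keys()
    return {
        k: sum(p[k] * n for p, n in zip(client_params, client_sizes)) / total
        for k in keys
    }

# Two clients holding 30 and 10 samples respectively.
avg = fedavg(
    [{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 4.0}],
    [30, 10],
)
# avg["w"] = (1.0*30 + 3.0*10) / 40 = 1.5
```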

What We Built

  • Five FL configurations evaluated under consistent experimental conditions
  • Novel sequential hybrid approach — DP during training, HE during aggregation
  • Comprehensive MIA-based privacy evaluation (threshold + shadow model)
  • IID and non-IID data distribution analysis (Dirichlet α=0.5, α=0.1)
  • Full resource overhead analysis: computation, communication, memory
  • Dockerized reproducible experimental framework
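The threshold-based MIA in the privacy evaluation can be sketched as follows: samples whose loss falls below a threshold are predicted to be training members, exploiting the fact that models tend to have lower loss on memorized training data. The losses and threshold below are toy values, not thesis measurements.

```python
# Threshold-based membership inference attack (sketch).

def threshold_mia(losses, threshold):
    """Predict membership: True (member) if loss is below the threshold."""
    return [loss < threshold for loss in losses]

def attack_accuracy(member_losses, nonmember_losses, threshold):
    """Fraction of correct member/non-member predictions."""
    preds = threshold_mia(member_losses + nonmember_losses, threshold)
    labels = [True] * len(member_losses) + [False] * len(nonmember_losses)
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)

# Members tend to have lower loss; one non-member overlaps the member range.
acc = attack_accuracy(
    member_losses=[0.1, 0.2, 0.4, 0.3],
    nonmember_losses=[0.9, 1.1, 0.4, 0.5],
    threshold=0.45,
)
# 7 of 8 predictions correct -> acc = 0.875
```

An attack accuracy near 0.5 indicates the model leaks little membership information; values well above 0.5 indicate leakage that DP is meant to suppress.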

Challenges

  • Balancing computational overhead of CKKS encryption with training efficiency on CIFAR-10 CNNs
  • Achieving meaningful privacy guarantees while maintaining model utility across five distinct configurations
  • Running concurrent Lambda-style experiments locally to simulate distributed FL clients
  • Calibrating differential privacy noise multipliers to find the privacy-utility sweet spot
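One source of the CKKS overhead mentioned above is that a ciphertext packs a fixed number of slots, so a flattened model must be split into slot-sized, zero-padded chunks before encryption. A sketch of that chunking step (the slot count and helper names are illustrative; real CKKS slot counts are powers of two such as 4096 or 8192):

```python
# Chunking model parameters into fixed-size vectors for CKKS encryption.

SLOTS = 4  # illustrative; real CKKS contexts use e.g. 4096 or 8192 slots

def chunk_params(flat, slots=SLOTS):
    """Split a flat parameter list into fixed-size, zero-padded chunks."""
    chunks = [flat[i:i + slots] for i in range(0, len(flat), slots)]
    chunks[-1] = chunks[-1] + [0.0] * (slots - len(chunks[-1]))
    return chunks

def unchunk_params(chunks, length):
    """Invert chunking after decryption, dropping the zero padding."""
    return [x for c in chunks for x in c][:length]

params = [0.5, -1.2, 3.3, 0.0, 7.1, 2.2]
chunks = chunk_params(params)                    # 2 chunks of 4 slots each
restored = unchunk_params(chunks, len(params))   # round-trips to the original
```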

Outcomes

  • Quantified a clear inverse relationship between privacy protection and model accuracy across all five approaches
  • Demonstrated the sequential hybrid improves learning stability over standard simultaneous application
  • Showed HE retains 92.1% of its accuracy under highly non-IID conditions, while hybrid approaches retain only 56–66%
  • Published as an IEEE conference paper: "Privacy-Utility Trade-offs in Federated Learning for 6G Networks"
