Federated Learning: Definition and Examples

Federated Learning is an AI model training technique in which data remains on users' local devices; only model parameters are shared and aggregated on a central server.

Full definition

Federated Learning is a decentralized approach to training machine learning models. Unlike traditional methods that require centralizing all data on a single server, this technique trains a model on data distributed across many devices or organizations (smartphones, hospitals, companies) without ever transferring the data itself.

The process works in several steps: a global model is sent to each participant, who trains it locally on their own data. Only the resulting model updates (new weights or weight deltas, never the raw data) are sent back to the central server, which aggregates them to improve the global model. This cycle repeats until convergence. The best-known aggregation algorithm is FedAvg (Federated Averaging), introduced by McMahan et al. at Google (preprint in 2016, published in 2017), which averages the participants' weights in proportion to the number of training examples each one holds.
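The cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: the model is a toy one-dimensional linear regression, and the names local_update, fed_avg, and run_rounds, as well as the learning rate and epoch count, are hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

Weights = List[float]                      # [slope, intercept] of a toy linear model
Dataset = Sequence[Tuple[float, float]]    # (x, y) pairs held privately by one client


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Train y = w*x + b on one client's data with plain gradient descent."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def fed_avg(client_weights: Sequence[Weights],
            client_sizes: Sequence[int]) -> Weights:
    """Average client weights, weighting each client by its number of examples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(global_weights: Weights, clients: Sequence[Dataset],
               rounds: int = 100) -> Weights:
    """Repeat the send / train locally / aggregate cycle for a fixed number of rounds."""
    for _ in range(rounds):
        # Each client starts from the current global model; only weights come back.
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = fed_avg(updates, sizes)
    return global_weights
```

With two clients whose points all lie on y = 2x + 1, for example [(0, 1), (1, 3), (2, 5)] and [(-1, -1), (0.5, 2)], calling run_rounds([0.0, 0.0], clients) should bring the global weights close to [2.0, 1.0], even though neither client ever sees the other's data.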

The major benefit of Federated Learning is privacy protection. Sensitive data (medical records, personal messages, financial data) stays on the owner's device. This helps meet regulatory requirements such as the GDPR in Europe while still drawing on large volumes of diverse data.

This technique is not without challenges: participants' data is often heterogeneous (non-IID), network connections can be unstable, and measures must be taken against poisoning attacks where a malicious participant tries to corrupt the global model. Complementary techniques such as differential privacy and homomorphic encryption are often combined with Federated Learning to strengthen privacy guarantees.
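As a rough illustration of how differential privacy is typically layered on top of federated updates, the sketch below clips each update to a maximum L2 norm and then adds Gaussian noise before it would be sent. The function names, the max_norm and noise_multiplier parameters, and the idea of passing in a random.Random instance are assumptions made for this example. The code does not compute an actual privacy budget; a real deployment would need a proper privacy accountant to do that.

```python
import math
import random
from typing import List


def clip_update(update: List[float], max_norm: float) -> List[float]:
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    if norm <= max_norm:
        return list(update)
    scale = max_norm / norm
    return [v * scale for v in update]


def privatize_update(update: List[float], max_norm: float,
                     noise_multiplier: float, rng: random.Random) -> List[float]:
    """Clip the update, then add Gaussian noise proportional to the clipping bound."""
    clipped = clip_update(update, max_norm)
    # Noise scale is tied to max_norm, the most any single update can contribute.
    sigma = noise_multiplier * max_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]
```

Clipping bounds how much any one participant can shift the aggregate, which also limits the influence of a single poisoned update; the noise then masks what remains of an individual contribution. A larger noise_multiplier gives stronger masking at the cost of model accuracy.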

Etymology

The term "Federated Learning" was introduced by Google in 2016 in a research paper by McMahan et al. The word "federated" refers to a federation, i.e., a union of autonomous entities that collaborate toward a common goal while maintaining their independence — here, each device or organization retains control of its data while contributing to a shared model.

Concrete examples

Predictive keyboard on smartphone

Explain how Google uses Federated Learning in Gboard to improve word suggestions without collecting users' messages.

Multi-hospital medical research

Design a Federated Learning architecture allowing 5 hospitals to collaborate to train a tumor detection model without sharing patient data.

Bank fraud detection

How could several banks use Federated Learning to train a common fraud detection model while respecting banking secrecy? Detail the steps and precautions.

Practical usage

In prompt engineering, understanding Federated Learning allows you to formulate precise questions about decentralized model training and data protection. You can ask an AI to design federated architectures, compare aggregation algorithms (FedAvg, FedProx), or evaluate trade-offs between model performance and privacy. It is a key concept for any project involving sensitive data or data distributed across multiple organizations.

Related concepts

Differential Privacy, Distributed Machine Learning, Edge Computing, Transfer Learning

FAQ

What is the difference between Federated Learning and traditional centralized training?
In traditional training, all data is sent to a central server to train the model. In Federated Learning, data remains on local devices: only the updated model parameters are shared. This preserves data privacy while enabling collaborative learning.
Does Federated Learning fully guarantee data privacy?
Not on its own. Although raw data is not shared, the exchanged updates can sometimes be exploited to reconstruct part of the training data. That's why Federated Learning is often combined with complementary techniques like differential privacy (adding noise to the updates) or homomorphic encryption to strengthen privacy guarantees.
What are the main use cases of Federated Learning today?
The most common use cases include improving predictive keyboards on smartphones (Google Gboard, Apple), collaborative medical research between hospitals, fraud detection among financial institutions, and optimizing models on IoT devices. Any situation where data is sensitive, regulated, or too voluminous to be centralized is a good candidate.
