P

Adapter Tuning: Definition and Examples

Efficient fine-tuning technique that involves inserting small trainable modules (called adapters) into a pre-trained language model, without modifying its original weights.

Full definition

Adapter Tuning is a method for adapting language models that relies on inserting small additional neural networks — adapters — between the existing layers of a pre-trained model. During training, only the parameters of these adapters are updated, while the original model's weights remain frozen. This allows adapting a model to a specific task by training only a tiny fraction of its total parameters.

This approach was popularized by the work of Houlsby et al. (2019) at Google Research. The central idea is that a massive model like GPT or BERT already contains a rich representation of language, and that adding a lightweight adaptation layer is sufficient to specialize it. Each adapter typically consists of a projection to a reduced dimension (bottleneck), a non-linear activation function, and then a projection back to the original dimension.

The main advantage of Adapter Tuning is its resource efficiency. Whereas traditional fine-tuning requires storing a complete copy of the model for each task, Adapter Tuning only requires storing the small added modules, typically less than 5% of the model's parameters. This makes it possible to deploy a single base model with multiple interchangeable adapter sets depending on the task.

Adapter Tuning is part of a broader family of techniques called PEFT (Parameter-Efficient Fine-Tuning), which also includes LoRA, Prefix Tuning, and Prompt Tuning. These methods all address the same challenge: how to customize increasingly large models without exploding computational and storage costs.

Etymology

The term "adapter" comes from English and refers to a device that makes a system compatible with a use different from the one for which it was designed. In machine learning, the concept was borrowed to describe these lightweight modules that "adapt" a general-purpose model to a specific task. The term "tuning" refers to the process of adjusting parameters.

Concrete examples

Adapt a general-purpose language model for sentiment classification in customer reviews, without modifying the base model.

You are a model specialized in sentiment analysis thanks to an adapter trained on French customer reviews. Classify the sentiment of this text as positive, negative, or neutral: "The product is okay but the delivery was catastrophic."

Use multiple adapters on the same base model to handle different languages or business domains.

Activate the 'legal-fr' adapter to analyze this contract. Identify potentially abusive clauses and explain them in plain language.

Explain to a technical team why choose Adapter Tuning over full fine-tuning for a project with limited resources.

Practical usage

In prompt engineering, understanding Adapter Tuning helps better choose your model adaptation strategy. If you need to specialize an LLM for a specific domain (medical, legal, technical) with a limited budget, Adapter Tuning is often the best compromise between performance and cost. Practically, libraries like Hugging Face PEFT allow implementing this technique in a few lines of code on open source models.

Related concepts

LoRA (Low-Rank Adaptation)Fine-TuningPEFT (Parameter-Efficient Fine-Tuning)Transfer Learning

FAQ

What is the difference between Adapter Tuning and LoRA?
Adapter Tuning inserts new modules (additional layers) between the existing layers of the model, while LoRA modifies existing weights via low-rank matrices without adding new layers. LoRA is generally lighter at inference because it does not add latency, whereas adapters introduce additional computations at each layer.
How many parameters are trained with Adapter Tuning?
Typically between 1% and 5% of the total model parameters, depending on the bottleneck size chosen for the adapters. For example, on a 7 billion parameter model, only 70 to 350 million parameters are trained, significantly reducing GPU memory requirements and computation time.
Can Adapter Tuning be used with models like GPT-4 or Claude?
No, Adapter Tuning requires access to the internal layers of the model, which is only possible with open source models (LLaMA, Mistral, etc.). For proprietary models like GPT-4 or Claude, one rather uses fine-tuning via API when available, or prompt engineering techniques to adapt the model's behavior without modifying its weights.

See also

How to use this prompt

  1. Copy the prompt with the button above.
  2. Paste it into ChatGPT, Claude or your favorite AI assistant.
  3. Replace the bracketed variables with your details, then refine the result.

About Prompt Guide

Prompt Guide is a free library of 2500+ ready-to-use prompts for ChatGPT, Claude and other AIs, with guides to learn prompting and tools to build and optimize your own prompts.

More definitions

Get new prompts every week

Join our newsletter.