P-Tuning: Definition and Examples
P-Tuning is a technique for adapting large language models by optimizing continuous embeddings ("learnable prompts") inserted into the model's input, without modifying the model's weights.
Full definition
P-Tuning (the "P" stands for "prompt") is a parameter-efficient fine-tuning method that adapts a large language model (LLM) to a specific task while leaving its billions of pretrained parameters frozen. Instead of updating the neural network weights, it adds learnable continuous vectors, called "soft prompts" or "virtual tokens", directly to the input sequence. Only these vectors (and, in P-Tuning v1, the small network that generates them) are optimized via backpropagation during training.
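As a minimal sketch of this idea, the toy example below uses plain Python and a hypothetical frozen "model": a tiny embedding table plus a fixed linear scoring head over the mean-pooled sequence. Both are assumptions made for illustration, not a real LLM. It prepends three trainable vectors to the input and updates only those vectors by gradient descent on a squared error.

```python
EMBED_DIM = 4           # toy embedding width (assumption)
NUM_VIRTUAL_TOKENS = 3  # toy soft-prompt length (assumption)

# Frozen "model": never modified during training.
FROZEN_EMBEDDINGS = {
    "great": [0.9, 0.1, 0.0, 0.2],
    "awful": [-0.8, 0.2, 0.1, -0.1],
    "movie": [0.0, 0.5, -0.3, 0.1],
}
FROZEN_HEAD = [0.5, -0.25, 0.75, 1.0]


def forward(soft_prompt, tokens):
    """Prepend the soft prompt, mean-pool the sequence, apply the frozen head."""
    sequence = soft_prompt + [FROZEN_EMBEDDINGS[t] for t in tokens]
    pooled = [sum(column) / len(sequence) for column in zip(*sequence)]
    return sum(w * x for w, x in zip(FROZEN_HEAD, pooled))


def train_step(soft_prompt, tokens, target, lr=0.5):
    """One gradient step on squared error; only the soft prompt is updated."""
    seq_len = len(soft_prompt) + len(tokens)
    error = forward(soft_prompt, tokens) - target
    # d(loss)/d(prompt[i][j]) = 2 * error * head[j] / seq_len
    for vector in soft_prompt:
        for j in range(EMBED_DIM):
            vector[j] -= lr * 2 * error * FROZEN_HEAD[j] / seq_len
    return error ** 2  # loss measured before this update


soft_prompt = [[0.0] * EMBED_DIM for _ in range(NUM_VIRTUAL_TOKENS)]
losses = [train_step(soft_prompt, ["great", "movie"], target=1.0) for _ in range(50)]
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.2e}")
```

In a real setup the gradient would come from automatic differentiation through the whole frozen network, but the division of roles is the same: the model's weights stay fixed and only the prompt vectors move.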
Unlike classic prompt engineering, where instructions are written manually in natural language, P-Tuning works in the model's embedding space. The virtual tokens do not correspond to any real word: they are numerical representations that the optimization algorithm adjusts to maximize performance on the target task. Depending on the model size and the task, this approach can come close to the performance of full fine-tuning while training only a tiny fraction of the parameters.
There are two main versions of this technique. P-Tuning v1, introduced by Liu et al. in 2021, uses a small prompt encoder built around an LSTM network to generate the embeddings for the learnable prompts, which are inserted at the input only. P-Tuning v2, published later the same year, extends the concept by inserting learnable prefixes at every layer of the transformer (not just at the input), significantly improving performance on complex tasks such as natural language understanding and entity extraction.
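To give a rough sense of scale for the two variants, the sketch below counts trainable prompt values. The model dimensions are hypothetical, the v1 count ignores the prompt encoder's own weights, and the v2 count assumes each layer stores one key prefix and one value prefix, as prefix-style implementations commonly do.

```python
def trainable_prompt_values(num_virtual_tokens, hidden_size, num_layers=None):
    """Count trainable prompt values.

    num_layers=None: prompts at the input only (v1-style, encoder not counted).
    num_layers=N:    a key prefix and a value prefix at each of N layers
                     (v2-style, an assumption about the storage layout).
    """
    per_layer = num_virtual_tokens * hidden_size
    if num_layers is None:
        return per_layer
    return 2 * num_layers * per_layer


# Hypothetical model dimensions, chosen only for illustration.
BASE_MODEL_PARAMS = 7_000_000_000
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
NUM_VIRTUAL_TOKENS = 20

input_only = trainable_prompt_values(NUM_VIRTUAL_TOKENS, HIDDEN_SIZE)
deep = trainable_prompt_values(NUM_VIRTUAL_TOKENS, HIDDEN_SIZE, NUM_LAYERS)

for label, count in [("input only", input_only), ("every layer", deep)]:
    share = count / BASE_MODEL_PARAMS
    print(f"{label}: {count:,} values, {share:.4%} of the base model")
```

Under these assumptions both variants stay well under 0.1% of the base model's parameter count, with the per-layer variant training 64 times more values than the input-only one.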
P-Tuning belongs to the broader family of PEFT (Parameter-Efficient Fine-Tuning) methods, alongside LoRA and Prefix Tuning. Its main advantage is that it allows the same model instance to serve multiple different tasks: simply load the set of soft prompts corresponding to each task, without duplicating the entire model in memory.
Etymology
The name "P-Tuning" combines the concepts of prompt (the instruction given to the model) and tuning (adjustment, optimization). It was introduced in the paper "GPT Understands, Too" by Xiao Liu et al. (Tsinghua University) in 2021, specifically to denote the optimization of continuous prompts in the embedding space. The similarly named "Prompt Tuning" usually refers to a closely related but separate method.
Concrete examples
Sentiment classification on customer reviews without retraining the entire model
Train soft prompts of 20 virtual tokens on a dataset of annotated reviews, then insert them before each review to be classified to obtain a positive/negative prediction.
Named entity extraction in medical documents with a generalist model
Use P-Tuning v2 with learnable prefixes at every layer of the transformer to specialize a generalist model for medical entity recognition (medications, symptoms, pathologies).
Multi-task deployment on a single inference server
Store a set of soft prompts per task (summarization, translation, Q&A) and dynamically load the correct set at inference time, while the base model remains shared in GPU memory.
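The multi-task deployment pattern can be sketched as a small registry that keeps one soft prompt per task and prepends the right one to each request. The class name SoftPromptRegistry, its methods, and the toy vectors are all hypothetical; a real server would hold tensors on the GPU and pass the result to the shared base model.

```python
class SoftPromptRegistry:
    """Keeps one soft prompt per task next to a single shared base model."""

    def __init__(self, num_virtual_tokens, hidden_size):
        self.num_virtual_tokens = num_virtual_tokens
        self.hidden_size = hidden_size
        self._prompts = {}

    def register(self, task, prompt):
        """Store a trained soft prompt after checking its shape."""
        if len(prompt) != self.num_virtual_tokens:
            raise ValueError(f"expected {self.num_virtual_tokens} virtual tokens")
        if any(len(vector) != self.hidden_size for vector in prompt):
            raise ValueError(f"expected vectors of width {self.hidden_size}")
        self._prompts[task] = [list(vector) for vector in prompt]

    def build_inputs(self, task, token_embeddings):
        """Return the task's soft prompt followed by the request's embeddings."""
        return self._prompts[task] + token_embeddings


# Toy values standing in for trained prompts and embedded user text.
registry = SoftPromptRegistry(num_virtual_tokens=2, hidden_size=3)
registry.register("summarization", [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
registry.register("translation", [[-0.1, 0.0, 0.1], [0.2, -0.2, 0.0]])

request = [[1.0, 1.0, 1.0]]  # one embedded input token
summarize_inputs = registry.build_inputs("summarization", request)
translate_inputs = registry.build_inputs("translation", request)
```

Switching tasks only changes which small list of vectors is prepended; the request embeddings and the base model are the same for both calls.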
Practical usage
In practice, P-Tuning is particularly useful when you need to adapt an LLM to a business task but lack the GPU resources for full fine-tuning. Libraries such as PEFT from Hugging Face let you set it up in just a few lines of code: P-Tuning is configured through PromptEncoderConfig, and per-layer prefixes in the style of P-Tuning v2 through PrefixTuningConfig. It is also well suited to multi-task production deployments where GPU memory is a critical constraint.
FAQ
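What is the difference between P-Tuning and classic prompt engineering?
Prompt engineering means writing instructions by hand in natural language. P-Tuning instead learns continuous vectors in the model's embedding space through backpropagation, and these vectors do not correspond to any real word.
What is the difference between P-Tuning v1 and P-Tuning v2?
P-Tuning v1 inserts learnable prompts at the input only and generates them with a small LSTM network. P-Tuning v2 inserts learnable prefixes at every layer of the transformer, which improves results on complex tasks.
Is P-Tuning as performant as full fine-tuning?
It can come close to full fine-tuning while training only a tiny fraction of the parameters, and P-Tuning v2 narrows the gap on complex tasks such as entity extraction.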