Top P: Definition and Examples
Top P, also known as nucleus sampling, is a generation parameter that controls the diversity of AI responses by limiting token selection to the most probable ones whose cumulative probability reaches a threshold P.
Full definition
Top P (or nucleus sampling) is a sampling method used during text generation by language models. Rather than considering the entire vocabulary at each generation step, the model keeps only the smallest set of most probable tokens whose cumulative probability reaches the defined threshold P. For example, with a Top P of 0.9, tokens are added in order of decreasing probability until their combined probability reaches 90%. The next token is then sampled from this subset, with the retained probabilities renormalized so that they sum to 1.
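The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular library: the function names `top_p_filter` and `sample_top_p` and the token probabilities are made up for the example.

```python
import random

def top_p_filter(probs, p):
    """Keep the most probable tokens until their cumulative probability
    reaches p, then renormalize the kept probabilities to sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in nucleus)
    return {token: prob / total for token, prob in nucleus}

def sample_top_p(probs, p, rng=random):
    """Draw one token from the nucleus, weighted by renormalized probability."""
    nucleus = top_p_filter(probs, p)
    return rng.choices(list(nucleus), weights=list(nucleus.values()), k=1)[0]

# Hypothetical next-token distribution (illustrative values only).
probs = {"Tower": 0.5, "of": 0.25, "is": 0.125, "was": 0.0625, "and": 0.0625}

nucleus = top_p_filter(probs, 0.9)
print(list(nucleus))             # tokens that survive the filter
print(sample_top_p(probs, 0.9))  # one token drawn from that subset
```

With these values the first three tokens only add up to 0.875, so a fourth token is needed to reach the 0.9 threshold, and the least probable token is dropped.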
This mechanism offers a major advantage over Top K, which always keeps a fixed number K of candidates: the size of the nucleus adapts dynamically to context. If the model is very confident about the next word (e.g., after "the Eiffel"), the nucleus will be small and contain only a few candidates like "Tower" or "of". Conversely, in a more open context (e.g., the beginning of a story), the nucleus naturally expands to include more creative possibilities.
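To make the contrast with Top K concrete, the sketch below counts how many tokens survive each filter for two made-up distributions, one peaked and one flat. The helper names and all probability values are assumptions chosen for illustration.

```python
def nucleus_size(probs, p):
    """Number of tokens kept by Top P for a list of probabilities."""
    cumulative = 0.0
    for count, prob in enumerate(sorted(probs, reverse=True), start=1):
        cumulative += prob
        if cumulative >= p:
            return count
    return len(probs)

def top_k_size(probs, k):
    """Number of tokens kept by Top K: always k, capped by vocabulary size."""
    return min(k, len(probs))

# Hypothetical distributions (illustrative values only).
confident = [0.75, 0.1875, 0.03125, 0.03125]  # model is nearly sure
open_ended = [0.125] * 8                      # many equally plausible tokens

for name, dist in (("confident", confident), ("open_ended", open_ended)):
    print(name, "top_p=0.9 keeps", nucleus_size(dist, 0.9),
          "| top_k=4 keeps", top_k_size(dist, 4))
```

Top K keeps four candidates in both cases, while the Top P nucleus holds two tokens for the peaked distribution and all eight for the flat one.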
Top P is set between 0 and 1. A value close to 0 makes the model almost deterministic by keeping only the most probable token(s). A value of 1 disables filtering and considers the entire vocabulary. Common values range between 0.7 and 0.95 depending on use: lower for factual tasks, higher for creative tasks.
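The two extremes can be checked on a toy distribution. In this sketch, the `nucleus` helper and the token probabilities are hypothetical and serve only to show which tokens remain for a very small P, an intermediate P, and P = 1.

```python
def nucleus(probs, p):
    """Return the tokens kept by Top P, most probable first."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append(token)
        cumulative += prob
        if cumulative >= p:
            break
    return kept

# Hypothetical next-token distribution (illustrative values only).
probs = {"the": 0.5, "a": 0.25, "one": 0.125, "this": 0.125}

for p in (0.01, 0.8, 1.0):
    print(p, nucleus(probs, p))
```

A threshold of 0.01 leaves only the single most probable token, which makes the choice deterministic, while a threshold of 1.0 keeps the full vocabulary.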
Top P also interacts with temperature. In most implementations, temperature reshapes the probability distribution first, and Top P filtering is then applied to the result. In practice, it is commonly recommended to adjust one or the other rather than both at once, so that the effect on model behavior stays predictable.
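The interaction can be illustrated with a small sketch that applies temperature to raw scores (logits) and then measures the Top P nucleus. It assumes the common ordering in which temperature is applied first. The logits and function names are invented for the example.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nucleus_size(probs, p):
    """Number of tokens kept by Top P for a list of probabilities."""
    cumulative = 0.0
    for count, prob in enumerate(sorted(probs, reverse=True), start=1):
        cumulative += prob
        if cumulative >= p:
            return count
    return len(probs)

# Hypothetical logits for five candidate tokens (illustrative values only).
logits = [2.0, 1.0, 0.5, 0.0, -1.0]

sizes = {t: nucleus_size(softmax_with_temperature(logits, t), 0.9)
         for t in (0.5, 1.0)}
print(sizes)
```

With the same Top P of 0.9, the lower temperature concentrates probability on the top tokens and the nucleus shrinks. This is one reason changing both settings at once makes the outcome harder to predict.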
Etymology
The term "Top P" comes from English, where P represents the cumulative probability threshold. The method was formalized under the name "nucleus sampling" in the research paper "The Curious Case of Neural Text Degeneration" by Holtzman et al. in 2019. The word "nucleus" refers to the subset of selected tokens, considered the core of the probability distribution.
Concrete examples
Creative writing with high diversity
Write an original story about a robot that discovers emotions. [Top P = 0.95, Temperature = 0.8]
Factual response requiring precision
Explain how photosynthesis works in simple terms. [Top P = 0.4, Temperature = 0.3]
Code generation with a balance between creativity and reliability
Write a Python function that sorts a list of dictionaries by key. [Top P = 0.7, Temperature = 0.5]
Practical usage
In prompt engineering, Top P is typically set via the model API parameters. For tasks requiring precision and coherence (summaries, data extraction, code), use a Top P between 0.3 and 0.6. For creative tasks (brainstorming, writing, idea generation), increase it to between 0.8 and 0.95 to get more varied and surprising responses.
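One way to apply these ranges is to keep them in a small lookup and attach the chosen value to each request. The sketch below is hypothetical: the `TOP_P_RANGES` table, the helper names, and the shape of the request dictionary are assumptions, and the parameter name `top_p` should be checked against the documentation of the API actually in use.

```python
# Ranges taken from the guidance above; the task labels are hypothetical.
TOP_P_RANGES = {
    "precise": (0.3, 0.6),    # summaries, data extraction, code
    "creative": (0.8, 0.95),  # brainstorming, writing, idea generation
}

def pick_top_p(task_kind):
    """Return the midpoint of the suggested range for a task kind."""
    low, high = TOP_P_RANGES[task_kind]
    return low + (high - low) / 2

def build_request(prompt, task_kind):
    """Assemble a generic request payload (illustrative shape only)."""
    top_p = pick_top_p(task_kind)
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    return {"prompt": prompt, "top_p": top_p}

request = build_request("Summarize this report in three bullet points.", "precise")
print(request)
```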