Beam Search: Definition and Examples
Beam Search is a decoding algorithm used by language models to generate text by exploring several candidate sequences in parallel in order to find a high-probability output.
Full definition
Beam Search is a heuristic decoding strategy used when generating text with language models. Instead of selecting only the most likely token at each step (the greedy approach), the algorithm maintains several partial hypotheses in parallel, called "beams", which lets it explore a wider part of the solution space.
The process is as follows: at each generation step, the model calculates the probabilities of all possible next tokens for each active beam. Every beam is extended with each candidate token, the extended sequences are scored by their cumulative probability (in practice, the sum of token log-probabilities), and only the k highest-scoring sequences are kept, where k is the "beam width". This parameter controls the trade-off between output quality and computational cost. A width of 1 is equivalent to greedy search, while an infinite width would correspond to exhaustive search.
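The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production decoder: the names beam_search, TOY_TABLE and toy_model are hypothetical, the "model" is a hand-written probability table rather than a real language model, and refinements such as length normalization are left out.

```python
import math
from typing import Callable, Dict, List, Tuple

# Assumption: a "model" is any function mapping a token prefix
# to a dict of next-token probabilities.
NextTokenFn = Callable[[Tuple[str, ...]], Dict[str, float]]


def beam_search(
    next_token_probs: NextTokenFn,
    beam_width: int,
    max_steps: int,
    eos: str = "<eos>",
) -> List[Tuple[Tuple[str, ...], float]]:
    """Return up to beam_width (sequence, log-probability) pairs, best first."""
    beams: List[Tuple[Tuple[str, ...], float]] = [((), 0.0)]
    for _ in range(max_steps):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:
                # Finished hypotheses are carried over unchanged.
                candidates.append((seq, score))
                continue
            for token, p in next_token_probs(seq).items():
                if p > 0.0:
                    # Cumulative score: sum of token log-probabilities.
                    candidates.append((seq + (token,), score + math.log(p)))
        # Keep only the k best sequences (k = beam width).
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if all(seq and seq[-1] == eos for seq, _ in beams):
            break
    return beams


# Hypothetical toy distribution in which the best first token ("a")
# does not lead to the best full sequence.
TOY_TABLE = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"z": 0.9, "w": 0.1},
}


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    # Any prefix not in the table ends the sequence.
    return TOY_TABLE.get(prefix, {"<eos>": 1.0})


greedy = beam_search(toy_model, beam_width=1, max_steps=3)
wide = beam_search(toy_model, beam_width=2, max_steps=3)
print(greedy[0][0])  # width 1 commits to "a" and ends on a, x
print(wide[0][0])    # width 2 keeps "b" alive and ends on b, z
```

With this table, a width of 1 follows the most likely first token and finishes with a sequence of probability 0.33, while a width of 2 also keeps the second-best prefix and finds a sequence of probability 0.36.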
In practice, Beam Search often produces more coherent and grammatically correct texts than greedy decoding, because it can recover from a locally attractive token whose continuations all turn out to be unlikely. However, it tends to generate more generic and repetitive texts than stochastic sampling methods like top-k or top-p sampling, because it systematically favors high-probability sequences.
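For contrast with the deterministic selection used by Beam Search, here is a minimal sketch of top-p (nucleus) sampling over a single next-token distribution. The function names and the example distribution are hypothetical; real decoders apply the same idea to model outputs at every step.

```python
import random
from typing import Dict


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of most likely tokens whose total
    probability reaches top_p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    total = 0.0
    for token, p in ranked:
        kept.append((token, p))
        total += p
        if total >= top_p:
            break
    return {token: p / total for token, p in kept}


def sample_token(probs: Dict[str, float], top_p: float, rng: random.Random) -> str:
    """Draw one token at random from the filtered distribution."""
    filtered = top_p_filter(probs, top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token distribution.
next_token = {"the": 0.5, "a": 0.3, "zebra": 0.15, "qux": 0.05}
rng = random.Random(0)
print([sample_token(next_token, 0.75, rng) for _ in range(5)])
```

Because the token is drawn at random from the kept set, repeated runs can produce different continuations, which is the source of the extra variety compared with Beam Search.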
This algorithm is mainly used in tasks where accuracy matters more than creativity, such as machine translation, text summarization, or speech recognition. In the context of prompt engineering, understanding Beam Search helps in choosing the right generation parameters for the desired goal.
Etymology
The term "Beam Search" comes from the analogy with a light beam that simultaneously illuminates several possible paths in a search tree. The algorithm was developed in the 1970s in the fields of artificial intelligence and operations research, before being widely adopted in natural language processing.
Concrete examples
Automatic translation where precision is essential
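Translate this legal text into English with high fidelity to the original meaning. (Beam width is a decoding setting, not something the prompt text controls: where the model or library exposes it, run this prompt with beam search and a beam width of 5.)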
Generating factual summaries from a document
Summarize this scientific article in 3 sentences. Prioritize factual accuracy over creativity.
Comparison of decoding strategies via an LLM API
Generate two versions of this response: one with temperature=0 (near-deterministic, close to greedy decoding, i.e. beam search with a width of 1) and one with temperature=0.9 (creative). Compare the results.
Practical usage
In prompt engineering, Beam Search matters mostly indirectly, through the generation parameters you set. Many hosted LLM APIs expose sampling parameters such as temperature and top-p rather than a beam width. Setting a low temperature (near 0) approximates greedy decoding, which is Beam Search with a width of 1: responses are precise and predictable, but only one hypothesis is explored. Use this approach for factual tasks (summaries, translations, data extraction) and prefer more creative sampling methods for brainstorming or narrative writing.
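To illustrate why a low temperature gives predictable output, the sketch below applies temperature scaling to a small set of made-up token scores (logits). The function name and the values are assumptions chosen for illustration; the formula itself is the standard softmax with the scores divided by the temperature.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw token scores into probabilities, sharpened or flattened
    by the temperature."""
    if temperature <= 0:
        # A temperature of exactly 0 is usually treated as plain argmax (greedy).
        raise ValueError("temperature must be > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.0]
print(softmax_with_temperature(logits, 0.1))  # almost all mass on the top token
print(softmax_with_temperature(logits, 0.9))  # mass spread across the tokens
```

As the temperature approaches 0, nearly all probability goes to the top-scoring token, so sampling behaves like greedy decoding; at higher temperatures the less likely tokens are chosen more often.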
FAQ
What is the difference between Beam Search and Greedy Search?
Why does Beam Search sometimes generate repetitive texts?
Can Beam Search be used with ChatGPT or Claude?