Beam Search: Definition and Examples

Beam Search is a decoding algorithm used by language models to generate text by exploring multiple candidate sequences in parallel in order to find a high-probability output.

Full definition

Beam Search is a heuristic decoding strategy used when generating text with language models. Instead of selecting only the most likely token at each step (the greedy approach), the algorithm maintains multiple partial hypotheses, called "beams", in parallel, which lets it explore a wider part of the solution space.

The process is as follows: at each generation step, the model calculates the probabilities of all possible next tokens for each active beam. Every extended sequence is scored by its cumulative probability (in practice, the sum of log-probabilities), and only the k highest-scoring sequences are retained, where k is the "beam width". This parameter controls the trade-off between output quality and computational cost. A width of 1 is equivalent to greedy search, while an infinite width would correspond to exhaustive search.
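The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production decoder: the toy model `TOY_MODEL`, the helper `next_token_probs`, and the function `beam_search` are hypothetical names invented for this example, and the probabilities are made up.

```python
import math

# Hypothetical toy "language model": maps a prefix (tuple of tokens) to
# next-token probabilities. A real model would return a distribution over
# its whole vocabulary.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.55, "dog": 0.45},
    ("a",): {"cat": 0.9, "dog": 0.1},
}


def next_token_probs(prefix):
    # Prefixes the toy model does not know simply end the sequence.
    return TOY_MODEL.get(prefix, {"<eos>": 1.0})


def beam_search(k, max_steps=3):
    # Each beam is a pair: (cumulative log-probability, token tuple).
    beams = [(0.0, ())]
    for _ in range(max_steps):
        candidates = []
        for score, seq in beams:
            if seq and seq[-1] == "<eos>":
                # Finished sequences are carried over unchanged.
                candidates.append((score, seq))
                continue
            for token, p in next_token_probs(seq).items():
                candidates.append((score + math.log(p), seq + (token,)))
        # Keep only the k highest-scoring sequences (the beam width).
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return beams


for width in (1, 2):
    score, seq = beam_search(width)[0]
    print(width, " ".join(seq), round(math.exp(score), 2))
```

With this toy model, width 1 (greedy) commits to "the" because it is the most likely first token and ends at "the cat" with probability 0.33, while width 2 keeps "a" alive and finds "a cat" with probability 0.36.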

In practice, Beam Search tends to produce more coherent and grammatically correct texts than greedy decoding, because it can recover from a token that looks best locally but leads to a low-probability continuation. However, it tends to generate more generic and repetitive texts than stochastic sampling methods like top-k or top-p sampling, because it systematically favors high-probability sequences.

This algorithm is mainly used in tasks where accuracy matters more than creativity, such as machine translation, text summarization, or speech recognition. In the context of prompt engineering, understanding Beam Search helps in choosing the right generation parameters for the desired goal.

Etymology

The term "Beam Search" comes from the analogy with a light beam that simultaneously illuminates several possible paths in a search tree. The algorithm was developed in the 1970s in the fields of artificial intelligence and operations research, before being widely adopted in natural language processing.

Concrete examples

Automatic translation where precision is essential

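Translate this legal text into English with high fidelity to the original meaning. (Where the framework allows it, run generation with beam search and a beam width of 5 to maximize accuracy; this is a decoding setting, not something the prompt text can switch on.)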

Generating factual summaries from a document

Summarize this scientific article in 3 sentences. Prioritize factual accuracy over creativity.

Comparison of decoding strategies via an LLM API

Generate two versions of this response: one with temperature=0 (near-deterministic, comparable to greedy decoding, i.e. beam search with width 1) and one with temperature=0.9 (creative). Compare the results.

Practical usage

In prompt engineering, the decoding strategy influences your results through the generation parameters you set. Most hosted chat APIs do not expose Beam Search itself, but setting a low temperature (near 0) approximates greedy decoding, the width-1 special case of Beam Search: precise and predictable responses. Use this approach for factual tasks (summaries, translations, data extraction) and prefer more creative sampling methods for brainstorming or narrative writing.
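To see why a low temperature makes output more predictable, here is a small sketch of temperature scaling applied to a softmax. The function name `softmax_with_temperature` and the example logits are assumptions chosen for illustration; real APIs apply this step internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    # Divide the logits by the temperature before normalizing. Subtracting
    # the maximum keeps exp() numerically stable.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (1.0, 0.5, 0.1):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

As the temperature approaches 0, almost all probability mass moves to the single most likely token, which is the token greedy decoding (beam width 1) would pick. A wider beam behaves differently: it keeps several sequences alive rather than sharpening one distribution.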

Related concepts

Greedy Decoding, Top-k Sampling, Top-p Sampling (Nucleus Sampling), Temperature

FAQ

What is the difference between Beam Search and Greedy Search?

Greedy Search selects only the most likely token at each step, while Beam Search maintains multiple candidates in parallel (based on the beam width). Beam Search thus explores a wider space and generally produces higher-quality results, at the cost of greater computation time.

Why does Beam Search sometimes generate repetitive texts?

Beam Search favors high-probability sequences, which can lead to repetitive loops because repeating an already generated segment remains statistically likely. To counter this, repetition penalties or techniques like n-gram blocking are often used, which forbid repeating sequences of words already produced.
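As a rough illustration of n-gram blocking, the sketch below computes which next tokens would repeat an n-gram that already appears in the generated sequence. The function name `banned_next_tokens` is hypothetical; a decoder would typically set the probability of these tokens to zero before choosing the next step.

```python
def banned_next_tokens(tokens, n):
    # Return the set of tokens that, if generated next, would complete an
    # n-gram already present in `tokens`.
    if n < 1 or len(tokens) < n:
        return set()
    # The last n-1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - n + 1:])
    banned = set()
    for i in range(len(tokens) - n + 1):
        ngram = tuple(tokens[i:i + n])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


generated = ["the", "cat", "sat", "on", "the"]
print(banned_next_tokens(generated, 2))
```

Here the bigram "the cat" has already been produced and the sequence currently ends in "the", so "cat" is the only blocked token.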

Can Beam Search be used with ChatGPT or Claude?

The ChatGPT and Claude APIs do not directly offer a "beam search" parameter. However, by setting the temperature to 0, you obtain near-deterministic greedy decoding, which corresponds to a Beam Search with width 1. For finer control, some open-source frameworks like Hugging Face Transformers allow explicit configuration of Beam Search with the desired width.
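With Hugging Face Transformers, beam search is configured through arguments to `generate`. The sketch below is one possible setup, not a recommended configuration: the parameter values, the helper name `translate_with_beams`, and the choice of the `t5-small` checkpoint are assumptions for illustration, and running it requires the `transformers`, `torch`, and `sentencepiece` packages plus a model download.

```python
# Assumed settings for illustration; tune them for your own task.
BEAM_SEARCH_KWARGS = {
    "num_beams": 5,             # beam width
    "do_sample": False,         # plain beam search, no sampling
    "early_stopping": True,     # stop once enough finished candidates exist
    "no_repeat_ngram_size": 3,  # n-gram blocking against repetition
    "max_new_tokens": 60,
}


def translate_with_beams(text, model_name="t5-small"):
    # Imported lazily so the settings above can be inspected without
    # installing the library or downloading a model.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, **BEAM_SEARCH_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call might look like `translate_with_beams("translate English to German: The contract is binding.")`; the task prefix is how T5 checkpoints are usually prompted. Setting `num_beams` to 1 falls back to greedy decoding.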



