Stop Sequence: Definition and Examples

A stop sequence is a predefined string of characters that tells the language model to stop generating text as soon as it produces it.

Full definition

A stop sequence is a control mechanism used during the inference of a language model. It consists of one or more character strings defined by the user in the API call parameters. When the model generates one of these sequences in its response, generation immediately stops, and the text produced up to that point is returned as the result.

This mechanism is essential for structuring LLM outputs. Without stop sequences, a model may continue generating text beyond what is useful: it could invent additional dialogue turns, add unrequested examples, or simply produce redundant content. Stop sequences act as a clear and predictable end signal, allowing precise and well-delimited responses.

In practice, stop sequences are particularly useful in scenarios where the output format is structured. For example, in a multi-turn chat system, you can use 'Human:' as a stop sequence to prevent the model from simulating the user's response. In a data extraction context, you can define a delimiter like '---' so that the model stops after producing a single information block.

Most language model APIs (OpenAI, Anthropic, Mistral, etc.) allow specifying multiple stop sequences simultaneously, typically up to four. The stop sequence itself is not included in the final response, which simplifies post-processing of the generated text.

Etymology

The term comes from classic computer science vocabulary. 'Stop' (arrêt) and 'sequence' (séquence de caractères) literally describe a sequence of characters that triggers the halt of a process. The concept already existed in serial communication protocols and parsers before being adopted by the field of language models.

Concrete examples

Prevent a chatbot from simulating user responses

API parameter: stop=["Human:", "User:"] — The model stops as soon as it attempts to generate a new human turn of speech.

Extract a single answer in a structured format

Prompt: "Give the name of the country and nothing else."
Stop sequence: ["\n"] — The model stops after the first line, avoiding any superfluous explanation.

Generate code function by function

Stop sequence: ["def ", "class "] — The model generates a single Python function and stops before starting another one.

Practical usage

Use stop sequences to precisely delimit your model's output and avoid generating superfluous content. Define them based on the expected format: a line break for a single-line response, a specific delimiter for structured content, or a role marker for multi-turn conversations. Combine them with the max_tokens parameter for full control over the length and form of the response.

Related concepts

TokenMax TokensTemperatureTop-p (Nucleus Sampling)

FAQ

What is the difference between a stop sequence and the max_tokens parameter?

The max_tokens parameter sets an absolute limit on the number of tokens generated, regardless of the relevance of the content. The stop sequence, on the other hand, stops generation based on the content produced. The two mechanisms are complementary: max_tokens acts as a safety net, while the stop sequence allows a semantically relevant stop.

Is the stop sequence included in the model's response?

No, in the vast majority of APIs (OpenAI, Anthropic, Mistral), the stop sequence that triggered the stop is not included in the response text. However, the API usually indicates in its metadata that generation stopped due to a stop sequence (finish_reason: "stop") rather than due to token exhaustion.

How many stop sequences can be defined at once?

It depends on the API used. OpenAI allows up to 4 stop sequences per request. Anthropic's API (Claude) also accepts multiple stop sequences. It is recommended to check the documentation of the specific API you are using for the exact limit.

How to use this prompt

Copy the prompt with the button above.
Paste it into ChatGPT, Claude or your favorite AI assistant.
Replace the bracketed variables with your details, then refine the result.

About Prompt Guide

Prompt Guide is a free library of 2500+ ready-to-use prompts for ChatGPT, Claude and other AIs, with guides to learn prompting and tools to build and optimize your own prompts.

Prompt library Learn prompting Prompt builder Prompt optimizer

More definitions

Streaming: Definition and Examples

Streaming is a technique for transmitting AI model responses in real time, token by token, rather than waiting for the complete generation before

Structured Output: Definition and Examples

A structured output is a response generated by an AI model in a predefined, machine-readable data format such as JSON, XML, or YAML.

Superintelligence: Definition and Examples

Superintelligence refers to a form of artificial intelligence that would vastly surpass human cognitive abilities in all domains, including

Supervised Learning: Definition and Examples

Supervised learning is an artificial intelligence method where a model learns from labeled data, i.e., examples whose output is known.

Synthetic Data: Definition and Examples

Synthetic data is artificially generated data created by algorithms or AI models, designed to replicate the statistical characteristics...

Synthetic Media: Definition and Examples

Synthetic media refers to any content — text, image, audio, or video — generated or manipulated by artificial intelligence algorithms, particularly through

Get new prompts every week

Join our newsletter.