Stop Sequence: Definition and Examples
A stop sequence is a predefined string of characters that tells the language model to stop generating text as soon as it produces it.
Full definition
A stop sequence is a control mechanism used during the inference of a language model. It consists of one or more character strings defined by the user in the API call parameters. When the model generates one of these sequences in its response, generation immediately stops, and the text produced up to that point is returned as the result.
This mechanism is essential for structuring LLM outputs. Without stop sequences, a model may continue generating text beyond what is useful: it could invent additional dialogue turns, add unrequested examples, or simply produce redundant content. Stop sequences act as a clear and predictable end signal, allowing precise and well-delimited responses.
In practice, stop sequences are particularly useful in scenarios where the output format is structured. For example, in a multi-turn chat system, you can use 'Human:' as a stop sequence to prevent the model from simulating the user's response. In a data extraction context, you can define a delimiter like '---' so that the model stops after producing a single information block.
Most language model APIs (OpenAI, Anthropic, Mistral, etc.) allow specifying multiple stop sequences simultaneously, typically up to four. The stop sequence itself is not included in the final response, which simplifies post-processing of the generated text.
Etymology
The term comes from classic computer science vocabulary. 'Stop' (arrêt) and 'sequence' (séquence de caractères) literally describe a sequence of characters that triggers the halt of a process. The concept already existed in serial communication protocols and parsers before being adopted by the field of language models.
Concrete examples
Prevent a chatbot from simulating user responses
API parameter: stop=["Human:", "User:"] — The model stops as soon as it attempts to generate a new human turn of speech.
Extract a single answer in a structured format
Prompt: "Give the name of the country and nothing else."
Stop sequence: ["\n"] — The model stops after the first line, avoiding any superfluous explanation.
Generate code function by function
Stop sequence: ["def ", "class "] — The model generates a single Python function and stops before starting another one.
Practical usage
Use stop sequences to precisely delimit your model's output and avoid generating superfluous content. Define them based on the expected format: a line break for a single-line response, a specific delimiter for structured content, or a role marker for multi-turn conversations. Combine them with the max_tokens parameter for full control over the length and form of the response.
Related concepts
FAQ
What is the difference between a stop sequence and the max_tokens parameter?
Is the stop sequence included in the model's response?
How many stop sequences can be defined at once?
See also
How to use this prompt
- Copy the prompt with the button above.
- Paste it into ChatGPT, Claude or your favorite AI assistant.
- Replace the bracketed variables with your details, then refine the result.
About Prompt Guide
Prompt Guide is a free library of 2500+ ready-to-use prompts for ChatGPT, Claude and other AIs, with guides to learn prompting and tools to build and optimize your own prompts.
More definitions
Streaming: Definition and Examples
Streaming is a technique for transmitting AI model responses in real time, token by token, rather than waiting for the complete generation before
Synthetic Media: Definition and Examples
Synthetic media refers to any content — text, image, audio, or video — generated or manipulated by artificial intelligence algorithms, particularly through
System Prompt: Definition and Examples
The system prompt is an initial hidden instruction, defined by the developer, that configures the behavior, tone, and limits of an AI model before
Temperature (AI): Definition and Examples
Temperature is a parameter that controls the degree of randomness and creativity in AI responses.
Test Time Compute: Definition and Examples
Test Time Compute refers to the computing power used by an AI model during inference (response generation), as opposed to the resources consumed during training.
Text Classification: Definition and Examples
Text classification is a natural language processing (NLP) technique that assigns one or more categories to a given text.
Get new prompts every week
Join our newsletter.