Automatic Prompt Engineer: Definition and Examples
Method for automatic prompt optimization in which a language model generates, evaluates, and refines candidate instructions itself, with the aim of maximizing response quality without manual prompt tuning.
Full definition
Automatic Prompt Engineer (APE) is a technique introduced in 2022 by researchers at the University of Toronto, the Vector Institute, and the University of Waterloo. It automates the search for an effective prompt for a given task. Instead of a human manually drafting and adjusting instructions, the system generates a set of candidate instructions, tests them on examples, and selects the most effective formulation.
The process works in three main steps. First, an LLM generates several instruction variants from a task description or from input-output examples. Next, each candidate is evaluated on a validation dataset by measuring the quality of the responses it produces. Finally, the best-scoring prompts are selected and, optionally, refined through successive iterations until the scores stop improving. The result is the best formulation found by the search, not a guaranteed optimum.
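The three steps can be sketched as a small search loop. This is a simplified illustration, not the paper's exact algorithm: `ape_search`, `toy_propose`, `toy_resample`, `toy_model`, and `toy_score` are hypothetical names, and the toy functions are deterministic stand-ins for what would be LLM calls in practice.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input, expected output)

def ape_search(
    propose: Callable[[List[Example], int], List[str]],
    resample: Callable[[str], str],
    score: Callable[[str, List[Example]], float],
    examples: List[Example],
    n_candidates: int = 3,
    keep: int = 2,
    rounds: int = 2,
) -> Tuple[str, float]:
    """Generate candidates, score them, keep the best, and resample around them."""
    candidates = propose(examples, n_candidates)           # step 1: generate
    for _ in range(rounds):
        ranked = sorted(candidates,
                        key=lambda c: score(c, examples),  # step 2: evaluate
                        reverse=True)
        best = ranked[:keep]                               # step 3: select
        candidates = best + [resample(c) for c in best]    # optional refinement
    winner = max(candidates, key=lambda c: score(c, examples))
    return winner, score(winner, examples)

# --- Toy stand-ins (assumptions): replace each with a real LLM call. ---
def toy_propose(examples: List[Example], n: int) -> List[str]:
    pool = ["Repeat the word.", "Write the word in uppercase.", "Reverse the word."]
    return pool[:n]

def toy_resample(instruction: str) -> str:
    # A real system would ask an LLM for a paraphrase of the instruction.
    return instruction + " Be concise."

def toy_model(instruction: str, text: str) -> str:
    # Pretends to follow the instruction; only "uppercase" changes the output.
    return text.upper() if "upper" in instruction.lower() else text

def toy_score(instruction: str, examples: List[Example]) -> float:
    hits = sum(toy_model(instruction, x) == y for x, y in examples)
    return hits / len(examples)

examples = [("cat", "CAT"), ("dog", "DOG")]
winner, best_score = ape_search(toy_propose, toy_resample, toy_score, examples)
print(winner, best_score)
```

Passing the generator, resampler, and scorer as functions keeps the loop independent of any particular model or metric, so the same skeleton can be reused across tasks.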
This approach frames prompt design as a black-box search problem: the prompt becomes a variable to optimize rather than a fixed parameter. In the original paper's experiments, automatically generated instructions matched or outperformed human-written ones on most of the tasks tested, and the method also found a zero-shot chain-of-thought prompt that scored higher than the widely used 'Let's think step by step.'
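Treating the prompt as a variable can be written as an optimization objective. The notation here is chosen for illustration: \rho is a candidate instruction drawn from a pool \mathcal{P}, \mathcal{D}_{\text{val}} is the validation set of input-output pairs (Q, A), \mathcal{M}(\rho, Q) is the model's response to input Q under instruction \rho, and f is a per-example score such as exact match.

```latex
\rho^{*} \;=\; \arg\max_{\rho \in \mathcal{P}} \;
\frac{1}{|\mathcal{D}_{\text{val}}|}
\sum_{(Q, A) \in \mathcal{D}_{\text{val}}}
f\bigl(\mathcal{M}(\rho, Q),\, A\bigr)
```

Because \mathcal{P} is a finite sample of candidates proposed by an LLM, the maximum is taken over that sample only, which is why the result is the best prompt found and not a global optimum.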
APE paves the way for scalable prompt engineering, where instruction optimization can be applied systematically to hundreds of tasks without mobilizing human expertise for each one. It moves prompt engineering from a craft toward a reproducible engineering process.
Etymology
The term comes from the paper 'Large Language Models Are Human-Level Prompt Engineers' (Zhou et al., 2022). It combines 'Automatic' (without human intervention), 'Prompt' (the instruction given to the model), and 'Engineer' (one who designs and optimizes). The acronym APE is a humorous nod to the English word for the primate, suggesting that even a non-human process can excel at this task.
Concrete examples
Optimizing a sentiment classification prompt
Generate 10 different instructions to classify the sentiment of a customer review as positive, negative, or neutral. For each instruction, test it on these 20 examples and return the one that achieves the best accuracy score.
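Outside a chat window, the same selection can be made by measuring accuracy programmatically. The sketch below assumes a small labeled set of reviews; `accuracy`, `rank_instructions`, and `toy_classify` are hypothetical names, and `toy_classify` is a keyword-based stand-in for a real LLM call that would receive the instruction and the review.

```python
from typing import Callable, List, Tuple

Review = Tuple[str, str]  # (review text, gold label)
Classifier = Callable[[str, str], str]  # (instruction, review text) -> label

def accuracy(instruction: str, reviews: List[Review], classify: Classifier) -> float:
    """Fraction of reviews whose predicted label equals the gold label."""
    hits = sum(
        classify(instruction, text).strip().lower() == gold
        for text, gold in reviews
    )
    return hits / len(reviews)

def rank_instructions(
    instructions: List[str], reviews: List[Review], classify: Classifier
) -> List[Tuple[str, float]]:
    """Score every candidate instruction and sort from best to worst."""
    scored = [(ins, accuracy(ins, reviews, classify)) for ins in instructions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def toy_classify(instruction: str, text: str) -> str:
    # Stand-in for an LLM: it only predicts "neutral" if the instruction mentions it.
    lowered = text.lower()
    if "great" in lowered or "love" in lowered:
        return "positive"
    if "broken" in lowered or "terrible" in lowered:
        return "negative"
    return "neutral" if "neutral" in instruction.lower() else "positive"

reviews = [
    ("Great product, I love it", "positive"),
    ("Arrived broken", "negative"),
    ("It is a phone case", "neutral"),
    ("Terrible support", "negative"),
]
instructions = [
    "Is this review good or bad?",
    "Label the review as positive, negative, or neutral.",
]
ranked = rank_instructions(instructions, reviews, toy_classify)
for instruction, acc in ranked:
    print(f"{acc:.2f}  {instruction}")
```

With a real model, the 20 labeled examples mentioned in the prompt would take the place of `reviews`, and the top entry of `ranked` is the instruction to keep.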
Automatic improvement of an existing prompt
Here is my current prompt: 'Summarize this text.' Generate 5 improved variants of this instruction, then evaluate each variant on the following 3 texts by rating the summary quality from 1 to 10.
Finding the best prompt for a specialized translation task
I need to translate medical documents from French to English. Propose 8 different instruction formulations, varying the level of context given, terminology constraints, and output format. Indicate which one would be most suitable and why.
Practical usage
To apply APE concretely, start by clearly defining your task and preparing a small set of examples with expected outputs. Then ask the LLM to generate several prompt variants, test each on your examples, and keep the formulation that produces the best results. This approach is particularly useful when you need to optimize prompts at scale or when manual adjustments have plateaued.