Automatic Prompt Engineer: Definition and Examples

A method for automatic prompt optimization in which a language model generates, evaluates, and refines the instructions it is given, in order to maximize response quality without manual human intervention.

Full definition

Automatic Prompt Engineer (APE) is a technique introduced in 2022 by researchers at the University of Toronto, the Vector Institute, and the University of Waterloo. It automates the search for an effective prompt for a given task. Instead of a human manually drafting and adjusting instructions, the system generates a set of candidate instructions, tests them on examples, and selects the most effective formulation.

The process works in three main steps. First, an LLM generates several instruction variants from a task description or from input-output examples. Next, each candidate is evaluated on a validation dataset by measuring the quality of the responses it produces. Finally, the best-scoring prompts are selected and, optionally, refined over successive iterations by generating close variants of them and scoring again, until the score stops improving.
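The three steps above can be sketched as a small search loop. This is a minimal illustration, not the paper's implementation: the names `rank`, `ape_search`, `propose`, `score`, and `refine` are hypothetical, and the three callables stand in for whatever LLM calls and metric you actually use.

```python
from typing import Callable, Optional


def rank(candidates: list[str], score: Callable[[str], float]) -> list[tuple[str, float]]:
    """Score each distinct candidate once and sort best-first."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    scored = [(c, score(c)) for c in unique]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def ape_search(
    propose: Callable[[int], list[str]],   # assumption: asks an LLM for n instructions
    score: Callable[[str], float],         # assumption: runs the validation set
    refine: Optional[Callable[[str], list[str]]] = None,  # assumption: asks for variants
    n_candidates: int = 8,
    keep: int = 2,
    rounds: int = 1,
) -> tuple[str, float]:
    """Return the best (instruction, score) pair found."""
    # Steps 1 and 2: generate candidates, then evaluate them.
    ranked = rank(propose(n_candidates), score)
    # Step 3: keep the top candidates and, optionally, refine them.
    if refine is not None:
        for _ in range(rounds):
            survivors = [c for c, _ in ranked[:keep]]
            variants = [v for c in survivors for v in refine(c)]
            ranked = rank(survivors + variants, score)
    return ranked[0]
```

Passing the callables in as parameters keeps the loop independent of any particular model or API, so the same function works with a real LLM client or with simple stand-ins during testing.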

This approach treats instruction search as a black-box optimization problem: the prompt becomes a variable to optimize rather than a fixed parameter. The original paper reports that automatically generated instructions matched or outperformed human-written prompts on most of the tasks tested. It also reports that APE found a zero-shot chain-of-thought trigger that scored higher than the hand-written 'Let's think step by step'.

APE paves the way for scalable prompt engineering, where instruction optimization can be applied systematically to hundreds of tasks without mobilizing human expertise for each one. It moves prompt engineering from a craft toward a reproducible engineering process.

Etymology

The term comes from the paper 'Large Language Models Are Human-Level Prompt Engineers' (Zhou et al., 2022). It combines 'Automatic' (without human intervention), 'Prompt' (the instruction given to the model), and 'Engineer' (one who designs and optimizes). The acronym APE doubles as the English word for the primate, which can be read as a humorous nod to the idea that even a non-human process can excel at this task.

Concrete examples

Optimizing a sentiment classification prompt

Generate 10 different instructions to classify the sentiment of a customer review as positive, negative, or neutral. For each instruction, test it on these 20 examples and return the one that achieves the best accuracy score.
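Outside a chat window, the "best accuracy score" in this example can be computed directly. The sketch below is one hedged way to do it: `accuracy`, `best_instruction`, and the `model` callable are hypothetical names, and `model(instruction, text)` is assumed to wrap your own LLM call and return a label string.

```python
from typing import Callable

Example = tuple[str, str]  # (review text, expected label in lowercase)


def accuracy(
    instruction: str,
    examples: list[Example],
    model: Callable[[str, str], str],  # assumption: (instruction, text) -> label
) -> float:
    """Fraction of examples where the model's label matches the expected one."""
    if not examples:
        raise ValueError("need at least one labeled example")
    hits = sum(
        model(instruction, text).strip().lower() == label
        for text, label in examples
    )
    return hits / len(examples)


def best_instruction(
    instructions: list[str],
    examples: list[Example],
    model: Callable[[str, str], str],
) -> tuple[str, float]:
    """Return the (instruction, accuracy) pair with the highest accuracy."""
    scored = [(i, accuracy(i, examples, model)) for i in instructions]
    return max(scored, key=lambda pair: pair[1])
```

With only 20 examples, a single answer moves accuracy by five percentage points, so small differences between candidates should be treated with caution.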

Automatic improvement of an existing prompt

Here is my current prompt: 'Summarize this text.' Generate 5 improved variants of this instruction, then evaluate each variant on the following 3 texts by rating the summary quality from 1 to 10.

Searching for the optimal prompt for a specialized translation task

I need to translate medical documents from French to English. Propose 8 different instruction formulations, varying the level of context given, terminology constraints, and output format. Indicate which one would be most suitable and why.

Practical usage

To apply APE concretely, start by clearly defining your task and preparing a small set of examples with expected outputs. Then ask the LLM to generate several prompt variants, test each on your examples, and keep the formulation that produces the best results. This approach is particularly useful when you need to optimize prompts at scale or when manual adjustments have plateaued.
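The "ask the LLM to generate several prompt variants" step can be scripted by turning your examples into a request and parsing the numbered reply. The sketch below assumes nothing about a specific API: `build_proposal_prompt` and `parse_numbered` are hypothetical helpers, and the request wording is illustrative, not the template used in the paper.

```python
import re


def build_proposal_prompt(examples: list[tuple[str, str]], n_variants: int = 5) -> str:
    """Turn (input, expected output) pairs into a request for candidate instructions."""
    if not examples:
        raise ValueError("need at least one example")
    lines = ["Here are input-output pairs that illustrate a task:", ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(
        f"Write {n_variants} different instructions that would lead a model "
        f"to produce these outputs from these inputs. "
        f"Number them 1 to {n_variants}, one per line."
    )
    return "\n".join(lines)


def parse_numbered(reply: str) -> list[str]:
    """Extract the text of lines shaped like '1. ...' or '2) ...' from a reply."""
    found = []
    for line in reply.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            found.append(match.group(1))
    return found
```

The string returned by `build_proposal_prompt` is what you would send to the model; `parse_numbered` then yields the candidate list to test on your examples.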

Related concepts

Prompt Optimization, Meta-Prompting, Chain-of-Thought Prompting, Few-Shot Learning

FAQ

What is the difference between APE and classic prompt engineering?

Classic prompt engineering relies on human intuition and manual experimentation. APE automates this process by using the LLM itself to generate, test, and select the best formulations. It is faster, more systematic, and often more effective on well-defined tasks.

Do you need technical skills to use APE?

The full version of APE requires setting up an automatic evaluation pipeline with API calls. However, the principle can be applied in a simplified way directly in a conversation with an LLM: ask it to generate several variants of a prompt and compare them, then choose the best one.

Does APE make human prompt engineering obsolete?

No. APE excels at optimizing formulations for measurable and well-defined tasks, but human expertise remains essential to frame the problem, define quality criteria, handle edge cases, and address ethical considerations. APE is a tool that augments the prompt engineer; it does not replace them.
