Reflection: Definition and Examples

Reflection is an AI technique where a language model iteratively evaluates and corrects its own responses, analyzing its errors to produce a more accurate and reliable result.

Full definition

Reflection (or Reflexion) is an advanced paradigm in artificial intelligence that allows an agent or language model to self-evaluate after generating a response. Rather than producing a single, final output, the model examines its own production, identifies errors, inconsistencies, or gaps, and then generates an improved version. This process can be repeated over several iterations until a satisfactory quality level is reached.

This approach is directly inspired by human metacognitive mechanisms: when we solve a complex problem, we naturally check our reasoning, spot logical flaws, and adjust our response. Reflection transposes this behavior to AI systems by introducing an explicit feedback loop in the generation process.

In practice, reflection can be implemented in several ways. In prompt engineering, the model is explicitly asked to critique its own response before finalizing it. In more elaborate architectures like the Reflexion framework (Shinn et al., 2023), an agent receives feedback signals—from an environment, an external evaluator, or its own analysis—which it stores in memory to guide subsequent attempts.
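The memory-based variant described above can be sketched roughly as follows. This is a minimal illustration and not the implementation from the Reflexion paper: the `actor` and `evaluator` callables, their signatures, and the toy stand-ins are all assumptions made here so the loop runs without a real model. It also stores the evaluator's feedback directly, whereas a fuller system would typically have the model rephrase that feedback into its own reflection.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures for illustration, not any real library's API:
#   actor(task, reflections) -> attempt
#   evaluator(task, attempt) -> (success, feedback)
Actor = Callable[[str, List[str]], str]
Evaluator = Callable[[str, str], Tuple[bool, str]]


def reflexion_loop(task: str, actor: Actor, evaluator: Evaluator,
                   max_trials: int = 3) -> Tuple[str, List[str]]:
    """Retry a task, carrying verbal feedback forward in memory."""
    memory: List[str] = []  # feedback accumulated across trials
    attempt = ""
    for _ in range(max_trials):
        attempt = actor(task, memory)
        success, feedback = evaluator(task, attempt)
        if success:
            break
        memory.append(feedback)  # stored as text; no weights are updated
    return attempt, memory


# Toy stand-ins (assumptions) so the sketch runs without a model.
def toy_actor(task: str, reflections: List[str]) -> str:
    # Pretend the model only answers correctly after seeing feedback.
    return "4" if reflections else "5"


def toy_evaluator(task: str, attempt: str) -> Tuple[bool, str]:
    ok = attempt == "4"
    return ok, "" if ok else f"'{attempt}' is wrong; recompute 2 + 2."


answer, notes = reflexion_loop("What is 2 + 2?", toy_actor, toy_evaluator)
```

In a real setting, `actor` would wrap a model call that includes the stored feedback in its prompt, and `evaluator` could be a test suite, an environment reward, or a second model call.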

Reflection is particularly effective for complex reasoning tasks, mathematical problem solving, code generation, and argumentative writing. It can reduce hallucinations and logical errors, making the model's output more reliable.

Etymology

The term "Reflection" is borrowed from Latin reflexio (the action of turning back, folding back). In the context of AI, it was popularized by the paper "Reflexion: Language Agents with Verbal Reinforcement Learning" (Shinn et al., 2023), which formalizes the idea of an agent capable of learning from its errors through verbal self-evaluation rather than weight updates.

Concrete examples

Iterative correction of logical reasoning

Solve this problem step by step. Then review your solution, identify any reasoning errors, and propose a corrected version if necessary.

Improving the quality of a written text

Write an argumentative paragraph about [TOPIC]. Then evaluate your text according to these criteria: clarity, strength of arguments, concrete examples. Rewrite an improved version taking your own critique into account.

Code debugging through self-evaluation

Write a Python function that [OBJECTIVE]. After writing it, analyze it to detect any bugs, unhandled edge cases, or performance issues. Correct the code, explaining each modification.

Practical usage

To apply reflection in your prompts, systematically add a self-critique step after the initial generation. Ask the model to explicitly identify weaknesses in its response and then produce a revised version. This technique is particularly cost-effective on complex tasks where the first naive response often contains subtle errors.
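One way to make the self-critique step systematic is to wrap any task prompt with a helper. The sketch below is only an illustration: the function name `with_self_critique` and the exact wording of the appended instructions are assumptions, and the wording should be adapted to your target model.

```python
from typing import List


def with_self_critique(task_prompt: str, criteria: List[str]) -> str:
    """Append an explicit critique-then-revise step to a task prompt."""
    bullet_list = "\n".join(f"- {c}" for c in criteria)
    return (
        f"{task_prompt.strip()}\n\n"
        "After your first answer, critique it against these criteria:\n"
        f"{bullet_list}\n\n"
        "List the weaknesses you find, then write a revised version "
        "that addresses each one."
    )


prompt = with_self_critique(
    "Write an argumentative paragraph about [TOPIC].",
    ["clarity", "strength of arguments", "concrete examples"],
)
```

The resulting string can be pasted into any assistant as a single prompt; the criteria list mirrors the evaluation criteria used in the writing example above.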

Related concepts

Chain of Thought, Self-Consistency, Tree of Thoughts, Self-evaluation

FAQ

What is the difference between reflection and Chain of Thought?

Chain of Thought breaks down reasoning into sequential steps during initial generation. Reflection goes further: it adds a self-evaluation phase after this generation, allowing the model to review its steps, detect errors, and correct its response. The two techniques are complementary.

Does reflection work with all language models?

Reflection is most effective with large models (GPT-4, Claude, etc.) that have sufficient self-analysis capabilities. Smaller models may struggle to reliably identify their own errors, limiting the technique's usefulness. It is recommended to test with your target model to assess the actual benefit.

How many iterations of reflection are recommended?

In most cases, one or two reflection iterations are enough to produce a noticeable improvement. Beyond three iterations, gains tend to become marginal while token costs keep growing. For simple tasks, a single review-and-correct pass is usually sufficient.
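The iteration budget can be enforced in code with a cap and an early exit when the critique finds nothing to fix. The sketch below assumes a hypothetical `llm` callable that maps a prompt string to a response string, and a "NO ISSUES" stop phrase chosen here for illustration; the `toy_llm` stub only exists so the loop can run without a model.

```python
from typing import Callable, Tuple


def reflect(task: str, llm: Callable[[str], str],
            max_iterations: int = 2) -> Tuple[str, int]:
    """Draft, then critique and revise up to max_iterations times."""
    draft = llm(f"Task: {task}\nAnswer:")
    revisions = 0
    for _ in range(max_iterations):
        critique = llm(
            f"Task: {task}\nDraft: {draft}\n"
            "List any errors, or reply NO ISSUES."
        )
        if critique.strip().upper() == "NO ISSUES":
            break  # stop early: further passes would only cost tokens
        draft = llm(
            f"Task: {task}\nDraft: {draft}\n"
            f"Critique: {critique}\nRevised answer:"
        )
        revisions += 1
    return draft, revisions


# Scripted stand-in for a model (assumption, for demonstration only).
def toy_llm(prompt: str) -> str:
    if "Revised answer:" in prompt:
        return "Paris"
    if "reply NO ISSUES" in prompt:
        return "NO ISSUES" if "Draft: Paris" in prompt else "Lyon is not the capital."
    return "Lyon"


final, revisions = reflect("Name the capital of France.", toy_llm)
```

Each iteration costs two model calls (one critique, one revision), which is why a low default cap keeps token usage predictable.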

