Precision and Recall: Definition and Examples
Precision and recall are two complementary metrics used to evaluate the quality of a classification model or information retrieval system. Precision measures the proportion of returned results that are relevant, while recall measures the proportion of all relevant items that the system actually returned.
Full definition
Precision and recall are fundamental metrics in artificial intelligence and information retrieval. They measure how well a system identifies the relevant items in a dataset. The two are usually reported together because each captures a different facet of result quality, and either one alone can be misleading.
Precision answers the question: 'Among all the items the model identified as positive, how many are actually positive?' For example, if a spam filter classifies 100 emails as spam and 90 of them are indeed spam, the precision is 90/100 = 90%. A high-precision system produces few false positives. Recall answers the question: 'Among all the actually positive items, how many were correctly identified?' If the inbox contains 120 spam emails in total and the filter catches 90 of them, recall is 90/120 = 75%. A high-recall system produces few false negatives.
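The arithmetic in the spam filter example can be checked with a few lines of Python. This is a minimal sketch: the function name `precision_recall` and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, not part of any particular library.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from confusion-matrix counts."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # Assumed convention: report 0.0 when the ratio is undefined.
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    return precision, recall


# Spam example: 100 emails flagged, 90 of them really spam, 120 spam emails in total.
tp = 90          # spam correctly flagged
fp = 100 - 90    # legitimate emails wrongly flagged
fn = 120 - 90    # spam emails the filter missed

precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f}, recall={recall:.2f}")
# prints: precision=0.90, recall=0.75
```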
There is generally a trade-off between these two metrics, known as the precision-recall trade-off. Increasing precision tends to decrease recall, and vice versa. A very conservative system will have high precision but low recall (it only flags cases it is sure about), while a permissive system will have high recall but lower precision (it catches nearly every positive case, at the cost of more false positives). The F1-score, the harmonic mean of precision and recall, F1 = 2 × precision × recall / (precision + recall), summarizes the balance between the two in a single number.
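The trade-off can be illustrated by sweeping a decision threshold over scored items. The sketch below uses made-up scores and labels (purely illustrative data, not real results), and the helper names `f1_score` and `metrics_at_threshold` are hypothetical. Undefined ratios are reported as 0.0 by assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_at_threshold(scores, labels, threshold):
    """Flag every item whose score is >= threshold, then score the result."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# Illustrative data: model confidence per item, and whether it is truly positive.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [True, True, False, True, False, True, False, False]

for threshold in (0.85, 0.50, 0.25):
    p, r, f1 = metrics_at_threshold(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}  f1={f1:.2f}")
# prints:
# threshold=0.85  precision=1.00  recall=0.50  f1=0.67
# threshold=0.50  precision=0.60  recall=0.75  f1=0.67
# threshold=0.25  precision=0.57  recall=1.00  f1=0.73
```

On this toy data, lowering the threshold raises recall from 0.50 to 1.00 while precision falls from 1.00 to 0.57. With the figures from the spam example (precision 0.90, recall 0.75), `f1_score` gives roughly 0.82.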
In the context of prompt engineering, understanding these metrics helps you write more effective instructions. When asking an LLM to extract information or classify content, you can steer its responses toward precision ('return only results you are certain of') or toward recall ('list all possible items, even uncertain ones'). Knowing which of the two matters for your task also sets realistic expectations and tells you what to adjust when the results fall short.
Etymology
The terms 'precision' and 'recall' come from the field of information retrieval, where they were formalized in the 1950s-1960s. The word 'precision' comes from Latin 'praecisio' (cutting off, exactness), while 'recall' comes from the English verb 'to recall' (remember, retrieve). In French, the equivalent terms are 'précision' (or 'taux de précision') and 'rappel' (or 'taux de rappel'); in the medical field, recall is sometimes called 'sensibilité' (sensitivity).
Concrete examples
Named entity extraction in a document
Extract all companies mentioned in this text. Prioritize recall: list every possible mention, even if you are not 100% sure it is a company. Indicate your confidence level for each entry.
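To check whether a recall-oriented extraction prompt behaves as intended, its output can be compared with a hand-labeled reference list. The following is a minimal sketch under stated assumptions: the company names are invented, the helper names `normalize` and `extraction_metrics` are hypothetical, and matching is done on exact names after lowercasing and whitespace cleanup, which is a deliberate simplification.

```python
def normalize(name):
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(name.lower().split())


def extraction_metrics(predicted, gold):
    """Set-based precision and recall of extracted names against a reference list."""
    predicted_set = {normalize(p) for p in predicted}
    gold_set = {normalize(g) for g in gold}
    true_positives = len(predicted_set & gold_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Invented example: companies a human annotator found in the text...
gold = ["Acme Corp", "Globex", "Initech"]
# ...and what the model returned when told to prioritize recall.
predicted = ["acme corp", "Globex", "Initech", "Springfield", "Hooli"]

precision, recall = extraction_metrics(predicted, gold)
print(f"precision={precision:.2f}, recall={recall:.2f}")
# prints: precision=0.60, recall=1.00
```

Here the model found every reference company (recall 1.00) but two of its five answers were wrong (precision 0.60), which is the expected profile for a recall-first prompt.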
Support ticket classification
Classify this support ticket into one of the following categories: bug, feature request, question. Only classify the ticket if you are more than 90% confident; otherwise, answer 'uncertain'. I prefer precision over recall here.
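The effect of an 'uncertain' option can be measured on a small labeled sample. The sketch below assumes the model's answers have already been parsed into (label, confidence) pairs; that parsing step, the sample data, and the helper names `apply_threshold` and `selective_metrics` are all hypothetical. It applies a confidence cut-off in code and reports two numbers: how often answered tickets are correct, and what share of tickets received an answer at all.

```python
def apply_threshold(label, confidence, threshold=0.9):
    """Keep the label only when confidence reaches the threshold."""
    return label if confidence >= threshold else "uncertain"


def selective_metrics(decisions, gold):
    """Return (precision on answered tickets, share of tickets answered)."""
    answered = [(d, g) for d, g in zip(decisions, gold) if d != "uncertain"]
    correct = sum(1 for d, g in answered if d == g)
    precision = correct / len(answered) if answered else 0.0
    coverage = len(answered) / len(gold) if gold else 0.0
    return precision, coverage


# Invented sample: (label, self-reported confidence) per ticket, plus true labels.
raw = [
    ("bug", 0.97),
    ("question", 0.55),
    ("feature request", 0.92),
    ("bug", 0.60),
    ("question", 0.95),
]
gold = ["bug", "bug", "feature request", "feature request", "bug"]

decisions = [apply_threshold(label, conf) for label, conf in raw]
precision, coverage = selective_metrics(decisions, gold)
print(decisions)
print(f"precision={precision:.2f}, coverage={coverage:.2f}")
# prints:
# ['bug', 'uncertain', 'feature request', 'uncertain', 'question']
# precision=0.67, coverage=0.60
```

A confidence figure reported by a language model is not guaranteed to be calibrated, so the threshold is best tuned against labeled examples like these rather than taken at face value.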
Detection of inappropriate content in comments
Analyze these comments and flag those that contain offensive content. It is better to flag a false positive than to let an offensive comment slip through, so prioritize recall.
Practical usage
In prompt engineering, mastering the precision-recall trade-off lets you calibrate an LLM's responses to the use case. For tasks where a missed case is costly (medical screening, fraud detection), prioritize recall so that as few cases as possible slip through. For tasks where false positives are costly (sending alerts, customer recommendations), prioritize precision by adding confidence thresholds in the prompt.
FAQ
What is the difference between precision and accuracy?
How to choose between precision and recall in a prompt?
What is the F1-Score and when to use it?