Accuracy: Definition and Examples
Accuracy (or exactness) measures the proportion of correct answers an AI model produces out of all the responses it generates. It is one of the fundamental metrics for evaluating the reliability of an artificial intelligence system.
Full definition
Accuracy (or exactness) is an evaluation metric that quantifies the ability of an artificial intelligence model to produce correct results. It is calculated by dividing the number of correct predictions by the total number of predictions made. A model with 95% accuracy gives the correct answer, on average, 95 times out of 100 on the data it was evaluated on.
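The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `accuracy` and the cat/dog data are hypothetical and chosen only for the example.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that equal their reference label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count the positions where the prediction matches the reference.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical data: 20 predictions, 19 of them correct.
labels = ["cat"] * 10 + ["dog"] * 10
predictions = ["cat"] * 10 + ["dog"] * 9 + ["cat"]
print(accuracy(predictions, labels))  # 0.95
```

The figure only describes the examples that were scored; a different evaluation set can give a different value for the same model.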
In the context of large language models (LLMs) like GPT-4 or Claude, accuracy takes on a more nuanced dimension. Unlike a binary classifier, where the answer is either right or wrong, an LLM generates free text whose correctness can be partial, contextual, or subjective. It is therefore useful to distinguish factual accuracy (are the stated facts verifiable?), semantic accuracy (does the meaning of the response match the question?), and logical accuracy (is the reasoning coherent?).
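One common, if crude, way to score free-text answers is normalized exact match: the response and a reference answer are lowercased, stripped of punctuation, and compared. The sketch below assumes short answers with a single acceptable reference; the helper names and the sample data are hypothetical. It only approximates factual accuracy and says nothing about semantic or logical accuracy, which usually require human or model-based review.

```python
import string


def normalize(text):
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match_accuracy(responses, references):
    """Fraction of responses that equal their reference after normalization."""
    if len(responses) != len(references) or not references:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(
        normalize(r) == normalize(ref) for r, ref in zip(responses, references)
    )
    return matches / len(references)


# Hypothetical model responses and reference answers.
responses = ["Paris.", "  1889", "The Seine"]
references = ["paris", "1889", "the Loire"]
score = exact_match_accuracy(responses, references)
print(f"{score:.2f}")
```

A correct answer phrased differently from the reference is counted as wrong, which is why exact match tends to understate the accuracy of longer answers.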
In prompt engineering, accuracy is directly influenced by the quality of the instructions given to the model. A vague or ambiguous prompt tends to produce less accurate responses, while a structured prompt with clear constraints, examples, and a defined output format usually improves the accuracy of results. Techniques like Chain-of-Thought, few-shot prompting, or cross-checking can yield measurable gains in accuracy on many tasks.
It is important to note that accuracy alone is not always sufficient to evaluate a model. On imbalanced datasets, a model can display high accuracy while consistently failing on minority cases. That is why it is often complemented by other metrics like precision, recall, or F1-score to obtain a more complete view of performance.
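The imbalance problem is easy to reproduce. In the sketch below, a hypothetical fraud detector labels every transaction as "legit" on a dataset where only 5 of 100 transactions are fraudulent. The function name, labels, and counts are illustrative assumptions, and precision is reported as 0.0 when the model makes no positive prediction, which is a convention rather than a mathematical necessity.

```python
def binary_metrics(predictions, labels, positive):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(predictions, labels))
    tp = sum(p == positive and y == positive for p, y in pairs)  # true positives
    fp = sum(p == positive and y != positive for p, y in pairs)  # false positives
    fn = sum(p != positive and y == positive for p, y in pairs)  # false negatives
    correct = sum(p == y for p, y in pairs)

    # Undefined ratios (zero denominator) are reported as 0.0 by convention.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced dataset: 95 legitimate transactions, 5 fraudulent.
labels = ["legit"] * 95 + ["fraud"] * 5
# A degenerate model that never predicts the minority class.
predictions = ["legit"] * 100

metrics = binary_metrics(predictions, labels, positive="fraud")
print(metrics)
```

Accuracy comes out at 0.95 even though recall on the fraud class is 0.0: the model never catches a single fraudulent transaction.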
Etymology
The term 'accuracy' comes from the Latin 'accuratus', the past participle of 'accurare', meaning 'to do with care'. In English, it became established in scientific vocabulary to denote the exactness of a measurement. In artificial intelligence, it has served as a standard evaluation metric since the early work in machine learning in the 1950s-1960s.
Concrete examples
Image classification: evaluate whether a model correctly identifies photos of cats and dogs
Analyze this image and identify the animal present. Reply with 'cat' or 'dog' on the first line, then justify your choice in one sentence on the second line.
Factual verification: ensure that an LLM does not generate hallucinations on historical data
Answer the following question based solely on verifiable facts. If you are unsure of any information, indicate it explicitly rather than inventing. Question: In which year was the Eiffel Tower built?
Structured data extraction: measure the model's ability to correctly extract information from text
Extract the following information from this resume in JSON format: name, email, years of experience, main skills. If any information is missing, use null. Do not infer anything that is not explicitly mentioned.
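For the structured extraction example, accuracy can be measured field by field against a hand-labeled reference. The sketch below is one possible approach: the JSON key names, the `field_accuracy` helper, and the sample resume values are all assumptions made for illustration, since the prompt above does not fix exact key names.

```python
import json

# Hypothetical key names for the four fields requested in the prompt.
FIELDS = ("name", "email", "years_of_experience", "main_skills")


def field_accuracy(model_output, expected):
    """Fraction of expected fields that the model's JSON reproduces exactly."""
    try:
        extracted = json.loads(model_output)
    except json.JSONDecodeError:
        return 0.0  # Unparseable output counts as entirely wrong.
    if not isinstance(extracted, dict):
        return 0.0
    correct = sum(
        field in extracted and extracted[field] == expected[field]
        for field in FIELDS
    )
    return correct / len(FIELDS)


# Hand-labeled reference: the resume gives no email, so the expected value is None.
expected = {
    "name": "Ada Example",
    "email": None,
    "years_of_experience": 7,
    "main_skills": ["Python", "SQL"],
}
# Hypothetical model response with one wrong field (years of experience).
model_output = (
    '{"name": "Ada Example", "email": null, '
    '"years_of_experience": 5, "main_skills": ["Python", "SQL"]}'
)
print(field_accuracy(model_output, expected))  # 0.75
```

Scoring the missing email as `null` rather than a guessed address also checks the "do not infer" instruction: an invented email would lower the score.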
Practical usage
To improve the accuracy of the responses your prompts produce, be explicit about the expected output format and provide concrete examples of correct responses (few-shot prompting). Use verification instructions like 'Check your answer before giving it' or 'If you're not sure, say so' to reduce errors. Finally, break down complex tasks into successive steps (Chain-of-Thought) so that the model reasons more rigorously.
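These recommendations can be combined in a small prompt template. The sketch below only assembles a text string; it does not call any model API, and the `build_prompt` function, its parameters, and the sample task are hypothetical. Whether the resulting prompt improves accuracy should be checked on your own evaluation set.

```python
def build_prompt(task, output_format, examples, question):
    """Assemble a structured prompt: task, format, few-shot examples, checks."""
    lines = [task, f"Expected format: {output_format}", ""]
    # Few-shot prompting: show concrete examples of correct responses.
    for example_input, example_output in examples:
        lines += [f"Input: {example_input}", f"Output: {example_output}", ""]
    lines += [
        # Chain-of-Thought style instruction.
        "Reason step by step, then give the final answer in the expected format.",
        # Verification instructions quoted from the advice above.
        "Check your answer before giving it. If you're not sure, say so.",
        "",
        f"Input: {question}",
        "Output:",
    ]
    return "\n".join(lines)


# Hypothetical sentiment classification task with two worked examples.
prompt = build_prompt(
    task="Classify the sentiment of each customer review.",
    output_format="one word, either 'positive' or 'negative'",
    examples=[
        ("The delivery was fast and the product works.", "positive"),
        ("It broke after two days.", "negative"),
    ],
    question="Support never answered my emails.",
)
print(prompt)
```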