
Overfitting: Definition and Examples

Overfitting (or overtraining) is the phenomenon where an AI model fits its training data so closely that it loses the ability to generalize correctly to new data.

Full definition

Overfitting is one of the most common problems in machine learning. It occurs when a model learns not only the relevant patterns in the training data but also the statistical noise, anomalies, and peculiarities specific to that particular dataset. The result is a model that performs remarkably well on data it has already seen but fails on new, unseen data.

To intuitively understand overfitting, imagine a student who memorizes word-for-word the answers to a past exam without understanding the underlying concepts. They will get a perfect score if asked exactly the same questions, but will be unable to answer questions phrased differently on the same topic. This is exactly what happens with an overfitted model: it 'memorizes' instead of 'learning.'

In prompt engineering, the concept of overfitting takes on a different but equally important dimension. When optimizing a prompt based solely on a few test examples, you risk creating a prompt that works perfectly for those specific cases but fails on slightly different ones. This is called 'prompt overfitting': the prompt is too specific and lacks robustness to the variety of possible inputs.

To detect overfitting in a classic model, compare its performance on the training data with its performance on a separate validation dataset. A large gap between the two is a warning sign. Regularization (including dropout for neural networks), early stopping, and data augmentation help reduce overfitting, while cross-validation gives a more reliable estimate of how well the model generalizes. In prompt engineering, the best defense is to test your prompts on a large and diverse set of use cases before considering them final.
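The train/validation comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the synthetic dataset, the 20% label noise, and the helper names (make_data, one_nn_predict, accuracy) are all invented for the example. The "model" is a 1-nearest-neighbor lookup, chosen because it memorizes its training set by construction.

```python
import random

def make_data(n, rng, noise=0.2):
    """Synthetic points x in [0, 1); the true label is x > 0.5,
    flipped with probability `noise` to simulate statistical noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        data.append((x, label))
    return data

def one_nn_predict(train, x):
    """Memorizing model: copy the label of the closest training point."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def accuracy(predict, data):
    """Fraction of (x, label) pairs that `predict` gets right."""
    return sum(predict(x) == label for x, label in data) / len(data)

rng = random.Random(0)       # fixed seed so the run is repeatable
train = make_data(200, rng)  # data the model memorizes
val = make_data(200, rng)    # separate data it has never seen

train_acc = accuracy(lambda x: one_nn_predict(train, x), train)
val_acc = accuracy(lambda x: one_nn_predict(train, x), val)
gap = train_acc - val_acc

print(f"train={train_acc:.2f}  validation={val_acc:.2f}  gap={gap:.2f}")
```

Because each training point is its own nearest neighbor, the training accuracy is perfect, noise included. The validation accuracy is noticeably lower, and that gap is the warning sign described above.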

Etymology

The term 'overfitting' comes from the English 'over' (excess) and 'fitting' (adjustment). Literally, it means 'excessive adjustment.' The concept originates in statistics, where it is associated with the 'overparameterization' of a model that fits the observed data too closely. In French, the official term is 'surapprentissage,' although the Anglicism 'overfitting' remains widely used in the French-speaking community.

Concrete examples

Training an image classification model

My model achieves 99% accuracy on the training set but only 60% on the test set. Explain why this gap exists and propose 5 concrete techniques to reduce overfitting, ranked by effectiveness.

Prompt optimization for a customer support chatbot

I optimized my system prompt based on 10 typical conversations. Act as a prompt engineering expert and help me identify the risks of prompt overfitting. Propose a testing methodology with edge cases to verify the robustness of my prompt.

Fine-tuning an LLM on a small business dataset

I want to fine-tune a language model on 500 examples of domain-specific responses. What are the risks of overfitting with such a small dataset, and what precautions should I take (learning rate, number of epochs, early stopping)?

Practical usage

In prompt engineering, avoid overfitting by testing your prompts on a diverse set of scenarios, not just the ones that motivated their creation. Use generalizable instructions rather than ultra-specific rules tailored to a single example. If you use few-shot prompting, vary the provided examples to cover different cases and prevent the model from replicating too narrow a pattern.
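A minimal sketch of such a test harness, in Python, might look like the following. Everything here is hypothetical: pass_rate is an invented helper, and narrow and general are plain keyword functions standing in for "prompt plus model" so the example runs without any API. In practice each would wrap a call to your AI assistant.

```python
def pass_rate(respond, cases):
    """Run `respond` on every (input, check) case.
    Returns the fraction that pass and the list of failing inputs."""
    failures = [text for text, check in cases if not check(respond(text))]
    return 1 - len(failures) / len(cases), failures

# Hypothetical stand-ins for two prompt variants (no real model is called).
def narrow(text):
    # Tuned only on the two development examples below.
    return "refund" if "refund" in text.lower() else "other"

def general(text):
    # Written to cover the intent rather than one specific wording.
    keywords = ("refund", "money back", "reimburse", "charged twice")
    return "refund" if any(k in text.lower() for k in keywords) else "other"

def is_refund(output):
    return output == "refund"

def is_other(output):
    return output == "other"

# The cases that motivated the prompt...
dev_cases = [
    ("I want a refund", is_refund),
    ("How do I reset my password?", is_other),
]
# ...plus varied phrasings it was never tuned on.
diverse_cases = dev_cases + [
    ("Can I get my money back?", is_refund),
    ("You charged twice for one order", is_refund),
    ("Please reimburse me", is_refund),
    ("Where is my parcel?", is_other),
]

for name, respond in (("narrow", narrow), ("general", general)):
    dev_score, _ = pass_rate(respond, dev_cases)
    full_score, failed = pass_rate(respond, diverse_cases)
    print(f"{name}: dev={dev_score:.0%} diverse={full_score:.0%} failed={failed}")
```

The narrow variant looks perfect on the two development cases and only shows its weakness on the diverse set, which is why each prompt change is best re-run against the full set rather than the original examples alone.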

Related concepts

Underfitting, Regularization, Cross-validation, Generalization

FAQ

What is the difference between overfitting and underfitting?
Overfitting occurs when a model is too complex for the data available and learns the noise in the training data, while underfitting occurs when a model is too simple to capture the real patterns. Overfitting yields excellent performance on the training data but poor performance on test data; underfitting yields poor performance on both. The goal is to find the right balance between the two.
How do I know if my model is overfitting?
The most telltale sign is a significant gap between performance on training data and on validation data. If your model achieves 98% accuracy on training but only 70% on validation, that is a typical case of overfitting. Also monitor learning curves: if the validation loss starts increasing while the training loss continues to decrease, overfitting has begun.
Can we talk about overfitting in prompt engineering?
Yes, this is called 'prompt overfitting' when a prompt is excessively optimized for a few specific test cases at the expense of its ability to perform well on varied inputs. This often happens when you iterate on a prompt by only checking results on one or two examples. The solution is to build a diverse test set and systematically validate each prompt modification against all these cases.
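The learning-curve signal mentioned in the FAQ (validation loss rising while training loss keeps falling) can be checked programmatically. The Python sketch below is an illustration only: overfitting_onset is a hypothetical helper, the patience value of 2 is an arbitrary assumption, and the loss values are made up for the example.

```python
def overfitting_onset(train_loss, val_loss, patience=2):
    """Return the epoch with the best validation loss once validation loss
    has failed to improve for `patience` epochs while training loss is
    still lower than it was at that best epoch. Return None otherwise."""
    best_epoch = 0
    for epoch in range(1, len(val_loss)):
        if val_loss[epoch] < val_loss[best_epoch]:
            best_epoch = epoch  # validation is still improving
        elif epoch - best_epoch >= patience:
            # Validation has stalled; confirm training loss kept decreasing.
            if train_loss[epoch] < train_loss[best_epoch]:
                return best_epoch
    return None

# Made-up per-epoch losses for illustration.
train_loss = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22, 0.17]
val_loss = [0.95, 0.70, 0.58, 0.55, 0.57, 0.61, 0.66]

print(overfitting_onset(train_loss, val_loss))  # prints 3
```

Here the validation loss bottoms out at epoch 3 (counting from 0) and rises afterwards while the training loss keeps dropping, so epoch 3 is where you would stop training or restore the saved model. Early stopping in training frameworks follows the same idea.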

About Prompt Guide

Prompt Guide is a free library of 2500+ ready-to-use prompts for ChatGPT, Claude and other AIs, with guides to learn prompting and tools to build and optimize your own prompts.
