Unsupervised Learning: Definition and Examples
Unsupervised learning is a branch of machine learning where a model analyzes data without prior labels to discover structures, patterns, or groupings within it.
Full definition
Unsupervised learning is a machine learning method in which an algorithm is trained on a dataset without labels or expected outputs. Unlike supervised learning, where each example is associated with a known output, unsupervised learning lets the model explore the data on its own to extract underlying structures.
The most common techniques include clustering (automatic grouping of similar data, such as K-means or DBSCAN), dimensionality reduction (such as PCA or t-SNE, which simplify complex data while preserving essential features), and anomaly detection. These methods are particularly useful when dealing with large amounts of raw data without human annotations.
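As a rough illustration of clustering, here is a minimal sketch of the K-means idea (Lloyd's algorithm) in plain Python. The customer data, the `kmeans` helper, and the choice of the first k points as starting centroids are assumptions made for this example. Production code would normally use an established library and a better initialization.

```python
import math

def kmeans(points, k, iterations=20):
    """Alternate between assigning points to the nearest centroid and
    recomputing each centroid as the mean of its assigned points."""
    # Assumption: start from the first k points (deterministic, not k-means++).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point takes the label of its closest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical unlabeled data: (orders per year, average basket value)
customers = [(2, 20.0), (30, 95.0), (3, 25.0), (28, 110.0), (1, 18.0), (32, 100.0)]
labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

No labels are supplied at any point: the two groups emerge only from distances between the points. In practice the features would usually be scaled first so that no single column dominates the distance.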
In the context of large language models (LLMs), unsupervised learning plays a fundamental role. The pre-training phase of models like GPT or Claude relies on self-supervised learning, which is commonly grouped with unsupervised methods: the model learns to predict the next token in vast text corpora, and the training signal comes from the text itself rather than from human-written labels. It is this ability to learn rich language representations without manual annotation that makes these models so versatile.
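To make the next-word objective concrete, the toy sketch below builds (context, next word) training pairs directly from raw text and fits a simple bigram counter. This is only an illustration of where the training signal comes from. Real LLMs use neural networks over subword tokens, and the function names and corpus here are hypothetical.

```python
from collections import Counter, defaultdict

def next_token_pairs(text):
    """Turn raw text into (current word, next word) training pairs.
    The 'label' for each word is simply the word that follows it."""
    tokens = text.lower().split()
    return list(zip(tokens[:-1], tokens[1:]))

def train_bigram(text):
    """Count how often each word follows each context word."""
    counts = defaultdict(Counter)
    for context, target in next_token_pairs(text):
        counts[context][target] += 1
    return counts

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

# Hypothetical miniature corpus; no human annotation is involved.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

The pairs are derived mechanically from the text, which is why this style of training scales to corpora far too large to label by hand.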
For prompt engineering practitioners, understanding unsupervised learning makes it easier to see how an LLM acquired its knowledge and why it can sometimes generalize surprisingly well or, conversely, produce unexpected results. This understanding helps in writing prompts that make the most of the patterns the model internalized during training.
Etymology
The term 'unsupervised' means 'without supervision'. It contrasts with 'supervised learning', where a 'supervisor' (in the form of human labels) guides the learning. The metaphor evokes a student learning through autonomous observation rather than directed instruction.
Concrete examples
Customer segmentation in marketing
I have a dataset of 10,000 customers with their purchasing behaviors. Suggest an unsupervised learning approach to identify distinct customer segments, detailing the recommended algorithm and features to use.
Anomaly detection in server logs
Act as a data scientist specialized in cybersecurity. Explain how to use unsupervised learning to detect anomalous behaviors in connection logs, without prior examples of attacks.
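As one illustration of the kind of label-free approach this prompt asks about, the sketch below flags unusual values with a modified z-score based on the median absolute deviation (MAD). The hourly failed-login counts are invented for the example. The 0.6745 scaling and the 3.5 threshold are a commonly used convention, not a universal rule, and real log analysis would typically combine several features.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.
    Uses the median and MAD, which are robust to the outliers themselves."""
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        # All typical values are identical; the score is undefined here.
        return []
    return [
        i for i, dev in enumerate(deviations)
        if 0.6745 * dev / mad > threshold
    ]

# Hypothetical data: failed login attempts per hour, with no attack labels.
failed_logins = [3, 5, 4, 6, 2, 4, 5, 97, 3, 4]
suspicious_hours = mad_outliers(failed_logins)
print(suspicious_hours)
```

The method never sees an example of an attack. It only learns what typical behavior looks like from the data and reports what deviates from it.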
Text data exploration
I have 5,000 uncategorized customer reviews. How can I apply topic modeling (an unsupervised learning technique) to automatically discover recurring themes? Give me a step-by-step pipeline.
Practical usage
In prompt engineering, knowing how unsupervised learning works clarifies the strengths and limitations of LLMs. When a model spontaneously groups concepts or identifies analogies without explicit instruction, it draws on representations learned without labels during pre-training. You can exploit this by crafting prompts that ask the model to categorize, group, or identify patterns in unstructured data.