Text Classification: Definition and Examples

Text classification is a natural language processing (NLP) technique that assigns one or more predefined categories to a given text.

Full definition

Text classification is one of the fundamental tasks of natural language processing (NLP). It involves analyzing a text, whether an email, customer review, article, or tweet, and automatically assigning it a label from a set of predefined categories. Examples include classifying an email as "spam" or "not spam," or determining whether a comment expresses a positive, negative, or neutral sentiment.

Historically, text classification relied on statistical approaches such as Naive Bayes models or SVMs (support vector machines), which required significant feature engineering. With the advent of deep learning and language models like BERT, followed by large generative language models (LLMs), the task changed substantially: a well-crafted prompt is now often enough to classify texts accurately, without training a task-specific model.

In the context of prompt engineering, text classification is one of the most common and accessible use cases. One can ask an LLM to classify a text by providing the desired categories directly in the prompt, using techniques such as zero-shot (without examples), few-shot (with a few examples), or chain-of-thought (by asking the model to explain its reasoning before deciding).
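The three prompting styles mentioned above differ only in how the prompt is assembled. The following is a minimal sketch, assuming a hypothetical helper named `build_prompt` (it is not part of any library) and an illustrative prompt wording; it only builds the string and does not call a model.

```python
from typing import Optional, Sequence, Tuple


def build_prompt(
    text: str,
    categories: Sequence[str],
    examples: Optional[Sequence[Tuple[str, str]]] = None,
    reasoning: bool = False,
) -> str:
    """Assemble a classification prompt (hypothetical helper, not a library API)."""
    lines = [
        "Classify the following text into one of these categories: "
        + ", ".join(categories)
        + "."
    ]
    if reasoning:
        # Chain-of-thought: ask for a short explanation before the label.
        lines.append("Explain your reasoning in one sentence, then give the label on a final line.")
    else:
        lines.append("Reply only with the label.")
    # Few-shot: add labeled examples; with no examples the prompt is zero-shot.
    for example_text, label in examples or []:
        lines.append("")
        lines.append(f'Text: "{example_text}"')
        lines.append(f"Label: {label}")
    lines.append("")
    lines.append(f'Text: "{text}"')
    # The final cue tells the model what to produce first.
    lines.append("Reasoning:" if reasoning else "Label:")
    return "\n".join(lines)


CATEGORIES = ["POSITIVE", "NEGATIVE", "NEUTRAL"]
zero_shot = build_prompt("Great value for the price.", CATEGORIES)
few_shot = build_prompt(
    "Great value for the price.",
    CATEGORIES,
    examples=[("Broke after two days.", "NEGATIVE"), ("It does the job.", "NEUTRAL")],
)
chain_of_thought = build_prompt("Great value for the price.", CATEGORIES, reasoning=True)
```

Keeping the three variants behind one function makes it easy to compare them on the same texts before settling on one.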

Practical applications include content moderation, automatic routing of support tickets, sentiment analysis, document categorization, intent detection in chatbots, and automatic email sorting. The quality of classification depends heavily on the clarity of the defined categories and the precision of the instructions given in the prompt.

Etymology

The term combines English "text" and "classification" (formed, via French, on Latin classis, "class," and facere, "to make"). It appeared in the field of information retrieval in the 1960s-1970s, before becoming a pillar of modern NLP with the rise of machine learning in the 1990s.

Concrete examples

Sentiment analysis on customer reviews

Classify the following comment as POSITIVE, NEGATIVE, or NEUTRAL. Reply only with the label.

Comment: "The product arrived quickly but the quality really leaves something to be desired, very disappointed."

Classification:

Automatic routing of technical support tickets

You are a triage agent for customer service. Classify the following ticket into one of these categories: BILLING, TECHNICAL, DELIVERY, OTHER.

Ticket: "I can no longer log into my account since this morning's update."

Category:

Toxic content detection with multi-label classification

Analyze the following message and indicate which categories apply among: HARASSMENT, HATE_SPEECH, DISINFORMATION, NONE. Multiple categories may apply. Reply in JSON format.

Message: "These people don't deserve to live in our country."

Result:
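A JSON reply to the multi-label prompt above still has to be checked before it is used. The sketch below assumes the model answers with an object of the form {"categories": [...]}; that shape and the function name `parse_labels` are assumptions for illustration, since the prompt itself does not fix a schema.

```python
import json
from typing import List

# Categories taken from the prompt above.
ALLOWED = {"HARASSMENT", "HATE_SPEECH", "DISINFORMATION", "NONE"}


def parse_labels(raw: str) -> List[str]:
    """Parse a reply assumed to look like {"categories": ["HATE_SPEECH"]}."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []  # not valid JSON: the caller may retry or flag for review
    if not isinstance(data, dict):
        return []
    candidates = data.get("categories", [])
    if not isinstance(candidates, list):
        return []
    labels: List[str] = []
    for item in candidates:
        if isinstance(item, str):
            label = item.strip().upper()
            # Drop anything outside the allowed set, and drop duplicates.
            if label in ALLOWED and label not in labels:
                labels.append(label)
    # NONE only makes sense on its own, so remove it when other labels apply.
    if len(labels) > 1 and "NONE" in labels:
        labels.remove("NONE")
    return labels
```

An empty list signals that the reply could not be used, which is kept distinct from an explicit ["NONE"].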

Practical usage

In prompt engineering, text classification is implemented by clearly defining the possible categories in the prompt and asking the model to reply with the appropriate label. To improve accuracy, provide two or three examples (few-shot) and ask the model to briefly justify its choice before giving its final answer. Structuring the output as JSON makes it easier to integrate into automated pipelines.

Related concepts

Sentiment Analysis, Named Entity Recognition (NER), Zero-Shot Classification, Few-Shot Prompting

FAQ

What is the difference between single-label and multi-label classification?

In single-label classification, each text receives exactly one category (e.g., spam or not spam). In multi-label classification, a text can belong to multiple categories simultaneously (e.g., an article can be both "technology" and "business"). In a prompt, simply specify whether the model should choose one category or can select multiple.

Do I need to train a model to do text classification with an LLM?

No, that's one of the great advantages of modern LLMs. Thanks to zero-shot and few-shot prompting, you can classify texts simply by describing the categories in your prompt, without any training. However, fine-tuning may be worth considering for very large-scale use cases or those that require maximum accuracy.

How can I improve the accuracy of a classification by prompt?

Several techniques are effective: define mutually exclusive and unambiguous categories, provide representative examples (few-shot), add descriptions for each category, ask the model to reason before classifying (chain-of-thought), and constrain the output format to avoid off-category responses.
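Constraining the output format in the prompt does not guarantee that the reply stays within the categories, so a small check on the receiving side helps. This is a minimal sketch, assuming a single-label task with the ticket categories used earlier; `normalize_label` and its fallback rule are hypothetical choices, not a standard API.

```python
import re
from typing import Sequence


def normalize_label(reply: str, categories: Sequence[str], fallback: str = "OTHER") -> str:
    """Map a free-text reply to one allowed category, or to the fallback."""
    # Split the uppercased reply into word tokens so that punctuation,
    # casing, and prefixes such as "Category:" do not matter.
    tokens = re.findall(r"[A-Za-z0-9_]+", reply.upper())
    matches = [c for c in categories if c.upper() in tokens]
    # Accept the reply only when exactly one category is named;
    # ambiguous or off-category replies go to the fallback.
    return matches[0] if len(matches) == 1 else fallback


TICKET_CATEGORIES = ["BILLING", "TECHNICAL", "DELIVERY", "OTHER"]
label = normalize_label("Category: technical.", TICKET_CATEGORIES)
```

Sending ambiguous replies to a fallback category is one option; retrying the request or flagging the text for human review are equally reasonable depending on the pipeline.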
