Reranking: Definition and Examples
Reranking is a technique that reorders an initial list of results by applying a more precise model to improve the relevance of the top-ranked items.
Full definition
Reranking is a post-processing step used in information retrieval and Retrieval-Augmented Generation (RAG) systems. Its principle is simple: after a fast initial retrieval phase that returns a set of candidates, a more sophisticated model reassesses and reorders these results to place the most relevant ones at the top.
In a typical RAG pipeline, the initial search often relies on fast but approximate methods, such as cosine similarity between embeddings or keyword-based search (BM25). These approaches are effective at reducing a corpus of millions of documents to a few dozen candidates, but they can miss finer distinctions in meaning. The reranker then acts as a second filter, typically a cross-encoder model that jointly analyzes the query and each candidate document to produce a more accurate relevance score.
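The two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `keyword_overlap` is a crude stand-in for BM25 or embedding similarity, and `toy_pair_score` is a hypothetical stand-in for a real cross-encoder. None of these names come from an actual library.

```python
from typing import Callable, List


def keyword_overlap(query: str, doc: str) -> float:
    """Cheap first-stage score: fraction of query words found in the document."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def retrieve(query: str, corpus: List[str], top_n: int) -> List[str]:
    """Stage 1: fast, approximate search that keeps only the top_n candidates."""
    ranked = sorted(corpus, key=lambda d: keyword_overlap(query, d), reverse=True)
    return ranked[:top_n]


def toy_pair_score(query: str, doc: str) -> float:
    """Stand-in for a cross-encoder (assumption): looks at the pair jointly and
    adds a bonus when the whole query appears as a contiguous phrase."""
    q, d = query.lower(), doc.lower()
    return keyword_overlap(q, d) + (1.0 if q in d else 0.0)


def rerank(query: str, candidates: List[str],
           score_pair: Callable[[str, str], float], top_k: int) -> List[str]:
    """Stage 2: rescore each candidate with the more precise model and reorder."""
    scored = [(score_pair(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


corpus = [
    "hiking shoes that are not waterproof",
    "waterproof hiking shoes for wet trails",
    "waterproof phone case",
    "leather office shoes",
]
query = "waterproof hiking shoes"
candidates = retrieve(query, corpus, top_n=3)
print(rerank(query, candidates, toy_pair_score, top_k=2))
```

In this toy corpus the first stage cannot separate the first two documents, because both contain all three query words. The second stage moves the better match to the top.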
Common reranking models are transformers trained specifically for this task, such as Cohere Rerank, cross-encoder models from the SBERT family, or Jina AI rerankers. Unlike the bi-encoders used during the retrieval phase, which encode the query and each document separately, these models take the query-document pair as a single input and can capture fine-grained semantic interactions between the two texts.
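The difference in interface can be illustrated with two toy scorers. Both are assumptions made for this sketch, not real models: the "bi-encoder" turns each text into a bag-of-words vector on its own, and the "cross-encoder" reads both texts together and rewards query word pairs that appear in the same order in the document. Real bi-encoders are sensitive to word order within each text, so the toy exaggerates the gap. The structural point still holds: one compares two independently built vectors, the other scores the pair directly.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'bi-encoder' side: each text is encoded alone, without seeing the other."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def bi_encoder_score(query: str, doc: str) -> float:
    # Document vectors do not depend on the query, so they can be precomputed.
    return cosine(embed(query), embed(doc))


def cross_encoder_score(query: str, doc: str) -> float:
    """Toy 'cross-encoder' side: sees both texts at once and measures how many
    adjacent query word pairs also appear adjacent, in order, in the document."""
    q = query.lower().split()
    d = doc.lower().split()
    q_bigrams = list(zip(q, q[1:]))
    d_bigrams = set(zip(d, d[1:]))
    if not q_bigrams:
        return 0.0
    return sum(bg in d_bigrams for bg in q_bigrams) / len(q_bigrams)


query = "dog bites man"
for doc in ("man bites dog", "dog bites man"):
    print(doc, bi_encoder_score(query, doc), cross_encoder_score(query, doc))
```

The toy bi-encoder gives both documents the same score, since they contain the same words. The pair-level scorer separates them.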
Reranking has become an essential component of many modern RAG architectures because it can noticeably improve the quality of responses generated by LLMs, by making it more likely that the context provided to the model contains the most relevant information. The computational cost remains reasonable since the reranker only processes the top N results from the initial search, typically between 20 and 100 documents.
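A quick back-of-the-envelope calculation shows why the cost stays manageable. The function name and the corpus size of one million are illustrative assumptions. The only thing counted is the number of query-document pairs the expensive model has to score for one query.

```python
from typing import Optional


def pair_scorings_per_query(corpus_size: int, top_n: Optional[int] = None) -> int:
    """Number of (query, document) pairs the precise model must score for one query.

    With top_n=None the model scores the whole corpus; otherwise only the
    top_n candidates returned by the fast first stage.
    """
    if top_n is None:
        return corpus_size
    return min(top_n, corpus_size)


corpus_size = 1_000_000  # illustrative corpus of one million documents
without_first_stage = pair_scorings_per_query(corpus_size)
with_first_stage = pair_scorings_per_query(corpus_size, top_n=100)
print(without_first_stage // with_first_stage)  # 10000 times fewer pair scorings
```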
Etymology
The term "reranking" combines the verb "to rank" with the prefix "re-" (again), so it literally means "to rank again" or "to reorder". The concept originates from the field of Information Retrieval, where two-stage architectures (retrieve, then rerank) have been used since the 2000s, well before the LLM era.
Concrete examples
RAG pipeline for a document chatbot
You are an assistant that answers questions from internal documents. Here are the 5 most relevant passages after reranking. Use only these passages to answer, prioritizing the first ones which are the most relevant.
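A prompt like this one is usually assembled by code from the reranker's output. The helper below is a minimal sketch of that step. The function name, the `[1]`, `[2]` numbering, and the trailing `Question:` line are assumptions made for illustration. The passages are expected to arrive already sorted from most to least relevant.

```python
from typing import Sequence


def build_rag_prompt(question: str, reranked_passages: Sequence[str], top_k: int = 5) -> str:
    """Build a chatbot prompt from passages already ordered by the reranker."""
    kept = list(reranked_passages[:top_k])
    # Number the passages so their order (most relevant first) stays explicit.
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(kept, start=1))
    return (
        "You are an assistant that answers questions from internal documents. "
        f"Here are the {len(kept)} most relevant passages after reranking. "
        "Use only these passages to answer, prioritizing the first ones, "
        "which are the most relevant.\n\n"
        f"{numbered}\n\n"
        f"Question: {question}"
    )


print(build_rag_prompt("What is the refund policy?", ["Refunds are accepted within 30 days.",
                                                      "Shipping takes 3 to 5 days."]))
```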
Improving an e-commerce search engine
Rerank these 20 search results for the query 'waterproof hiking shoes' considering the relevance of the title, product description, and customer reviews. Return the 5 most relevant results in order.
Filtering context before injection into a prompt
Among these 10 documentation excerpts, identify the 3 most relevant for answering the following question: 'How do I configure OAuth2 authentication?' Rank them in descending order of relevance.
Practical usage
In prompt engineering, reranking is mainly applied in RAG pipelines to improve the quality of context injected into your prompts. Integrate a reranking model (like Cohere Rerank or a cross-encoder) between your vector retrieval step and the LLM call. This reduces noise in the context and yields more accurate answers, especially when the initial search returns results of varying relevance.
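The placement of the reranker can be sketched as a small function that takes the three stages as plain callables. Everything here is hypothetical: `retrieve`, `rerank`, and `generate` stand for whatever vector store, reranking model (for example Cohere Rerank or a cross-encoder), and LLM client you actually use. The defaults of 50 candidates and 5 kept passages are illustrative.

```python
from typing import Callable, List


def answer_with_reranking(
    question: str,
    retrieve: Callable[[str, int], List[str]],       # hypothetical: (query, top_n) -> candidates
    rerank: Callable[[str, List[str]], List[str]],   # hypothetical: (query, docs) -> docs, best first
    generate: Callable[[str], str],                  # hypothetical: prompt -> LLM answer
    top_n: int = 50,
    top_k: int = 5,
) -> str:
    """Retrieve broadly, rerank, keep the best few passages, then call the LLM."""
    candidates = retrieve(question, top_n)
    best = rerank(question, candidates)[:top_k]
    context = "\n\n".join(best)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)


# Stub stages so the sketch runs without any external service.
docs = ["doc about billing", "doc about OAuth2 setup", "doc about logging"]
print(answer_with_reranking(
    "How to configure OAuth2 authentication?",
    retrieve=lambda q, n: docs[:n],
    rerank=lambda q, ds: sorted(ds, key=lambda d: "OAuth2" in d, reverse=True),
    generate=lambda prompt: prompt,  # echo the prompt instead of calling a model
    top_k=2,
))
```

Passing the stages in as arguments keeps the reranker swappable, so a hosted API and a local model can be compared without touching the rest of the pipeline.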
FAQ
What is the difference between initial ranking and reranking?
Is reranking essential in a RAG pipeline?
What tools can be used to implement reranking?