Vector Database: Definition and Examples

A vector database is a specialized database for storing, indexing, and searching numerical vectors (embeddings), enabling fast retrieval of similar data through vector distance calculation.

Full definition

A vector database is a storage system designed specifically to manage high-dimensional numerical representations called embeddings. These embeddings are produced by artificial intelligence models that transform text, images, audio, or any other type of data into vectors, that is, lists of numbers that capture the semantic meaning of the original content.

Unlike traditional databases that search for exact matches (by keywords or identifiers), a vector database excels at similarity search. It uses approximate nearest neighbor (ANN) algorithms, typically backed by index structures such as HNSW or IVF, to quickly identify the vectors closest to a query vector, even among millions or billions of entries, trading a small amount of accuracy for speed. This capability is at the heart of RAG (Retrieval-Augmented Generation) systems that feed LLMs with contextual knowledge.
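
To make the idea of similarity search concrete, here is a minimal sketch in plain Python. It performs an exact (brute-force) search with cosine similarity, which is the baseline that ANN indexes approximate at larger scale. The function names, the toy 3-dimensional vectors, and the labels are illustrative assumptions, not the API of any particular vector database.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2):
    """Return the k (id, score) pairs most similar to the query, best first.

    This scans every stored vector, so it is exact but linear in the
    collection size; an ANN index avoids the full scan.
    """
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional "embeddings" (hypothetical values); real models
# produce vectors with hundreds or thousands of dimensions.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

results = top_k([1.0, 0.0, 0.0], vectors, k=2)
for vec_id, score in results:
    print(f"{vec_id}: {score:.3f}")
```

With these toy values, the two entries pointing in roughly the same direction as the query rank first and the unrelated one is left out.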

The most popular vector databases include Pinecone, Weaviate, Qdrant, Milvus, and Chroma, as well as vector extensions for existing databases like pgvector for PostgreSQL. Each offers different trade-offs in terms of performance, scalability, and features (metadata filtering, multi-tenancy, hybrid search).

In the prompt engineering and generative AI ecosystem, vector databases play a fundamental role: they give memory and context to language models by retrieving the most relevant information to inject into a prompt, making responses more accurate and grounded in real data.

Etymology

The term combines "vector" (from Latin vector, "one who carries"), used in mathematics for a quantity with both magnitude and direction and, by extension, for an ordered list of numbers, and "database". The expression became popular around 2022-2023 with the explosion of LLM-based applications and the need to store embeddings at scale for semantic search.

Concrete examples

Building a RAG chatbot with knowledge base

You are an assistant who answers based solely on the provided context. Here are the most relevant documents retrieved from our vector database:

{RETRIEVED_CONTEXT}

User question: {QUESTION}

Answer precisely and cite your sources.

Semantic search in a product catalog

The user is searching for: "{USER_QUERY}". Here are the 5 most similar products found by vector search:

{PRODUCTS}

Generate a natural response presenting these products and explaining why they match the search.

Duplicate detection and content deduplication

Analyze the following articles identified as semantically close (similarity score > 0.92) by our vector database. Determine if they are true duplicates or distinct content covering the same topic:

Article A: {TITLE_A}
Article B: {TITLE_B}

Reply with: DUPLICATE, SIMILAR, or DISTINCT, and justify.
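
As a rough illustration of how the candidate pairs for this last prompt might be selected, the sketch below compares every pair of article embeddings and keeps those whose cosine similarity exceeds 0.92. The embedding values and article identifiers are made up for the example, and the all-pairs comparison is only practical for small collections; a vector database would normally answer this with a nearest-neighbor query per article.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def near_duplicate_pairs(embeddings, threshold=0.92):
    """Return (id_a, id_b, score) for every pair scoring above the threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score > threshold:
            pairs.append((id_a, id_b, score))
    return pairs

# Hypothetical embeddings: article_a and article_b are nearly parallel,
# article_c points in a different direction.
articles = {
    "article_a": [0.70, 0.70, 0.10],
    "article_b": [0.68, 0.72, 0.12],
    "article_c": [0.10, 0.20, 0.95],
}

candidates = near_duplicate_pairs(articles)
for id_a, id_b, score in candidates:
    print(f"{id_a} / {id_b}: {score:.4f}")
```

Each pair returned here would then be passed to the LLM as {TITLE_A} and {TITLE_B} for the final DUPLICATE, SIMILAR, or DISTINCT judgment.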

Practical usage

In prompt engineering, vector databases are essential for building RAG systems: you split your documents into chunks, generate embeddings via a model (like OpenAI's text-embedding-3-small), store them in a vector database, then retrieve the most relevant passages to inject into the prompt before querying the LLM. The quality of the chunking strategy and embedding model directly impacts the relevance of generated responses.
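
The pipeline described above can be sketched end to end in a few lines of Python. This is a toy version under stated assumptions: toy_embed is a hypothetical word-count stand-in for a real embedding model, the "vector database" is an in-memory list, and the chunks are invented sample text. Only the prompt wording and the {RETRIEVED_CONTEXT} and {QUESTION} placeholders come from the chatbot example earlier on this page.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def toy_embed(text, vocabulary):
    """Hypothetical stand-in for an embedding model: one word count per vocabulary entry."""
    tokens = tokenize(text)
    return [tokens.count(word) for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Steps 1-3: chunks (already split), embeddings, and an in-memory index.
chunks = [
    "Refunds are accepted within 30 days of purchase.",
    "Shipping to Europe takes 5 to 7 business days.",
    "Support is available by email from Monday to Friday.",
]
vocabulary = sorted({word for chunk in chunks for word in tokenize(chunk)})
index = [(chunk, toy_embed(chunk, vocabulary)) for chunk in chunks]

def retrieve(question, k=1):
    """Step 4: return the k chunks whose vectors are closest to the question."""
    query_vector = toy_embed(question, vocabulary)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]

PROMPT_TEMPLATE = (
    "You are an assistant who answers based solely on the provided context. "
    "Here are the most relevant documents retrieved from our vector database:\n\n"
    "{RETRIEVED_CONTEXT}\n\n"
    "User question: {QUESTION}\n\n"
    "Answer precisely and cite your sources."
)

# Step 5: inject the retrieved passages into the prompt sent to the LLM.
question = "How long does shipping to Europe take?"
context = retrieve(question, k=1)
prompt = PROMPT_TEMPLATE.format(RETRIEVED_CONTEXT="\n".join(context), QUESTION=question)
print(prompt)
```

In a real system, toy_embed would be replaced by calls to an embedding model and the list by a vector database; the overall shape (embed, store, retrieve, fill the template) stays the same.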

Related concepts

Embedding, RAG (Retrieval-Augmented Generation), Semantic search, Cosine similarity

FAQ

What is the difference between a vector database and a classic database?
A classic database (SQL or NoSQL) searches for exact matches on structured fields (name, date, ID). A vector database stores multidimensional numerical representations and performs similarity searches: it finds items whose meaning is closest to the query, even without exact keyword matches. This enables semantic search.
Do I need a dedicated vector database or can I use PostgreSQL with pgvector?
For small to medium projects (fewer than a few million vectors), pgvector with PostgreSQL is often sufficient and avoids adding extra infrastructure. For large-scale needs with billions of vectors, very low latency requirements, or advanced features (hybrid search, automatic sharding), a dedicated solution like Pinecone, Qdrant, or Milvus is usually more suitable.
How to choose the right chunking strategy for feeding a vector database?
Chunk size directly influences retrieval quality: chunks that are too small lose context, while chunks that are too large dilute the relevant information. A good starting point is to split documents into chunks of 200 to 500 tokens with an overlap of 10 to 20%. It is also worth testing different strategies (by paragraph, by section, by sentence) and measuring result relevance on a reference question-answer set.
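
A fixed-size chunker with overlap might look like the sketch below. It assumes the text has already been turned into a list of tokens (here, placeholder strings stand in for real tokenizer output), and the defaults of 300 tokens with a 45-token overlap (15%) are just one point inside the ranges mentioned above, not a recommendation from any specific library.

```python
def chunk_tokens(tokens, chunk_size=300, overlap=45):
    """Split a token list into windows of chunk_size that share `overlap` tokens.

    Each new chunk starts (chunk_size - overlap) tokens after the previous one,
    so consecutive chunks repeat the last `overlap` tokens of their predecessor.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the document,
        # otherwise a trailing chunk made only of overlap would be emitted.
        if start + chunk_size >= len(tokens):
            break
    return chunks

# Placeholder tokens; in practice these would come from the tokenizer
# that matches the embedding model.
tokens = [f"w{i}" for i in range(1000)]
chunks = chunk_tokens(tokens, chunk_size=300, overlap=45)

for number, chunk in enumerate(chunks):
    print(number, chunk[0], chunk[-1], len(chunk))
```

Splitting on paragraph or section boundaries instead of fixed windows follows the same pattern; only the rule that decides where each chunk starts and ends changes.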
