Llama 3: Definition and Examples

Llama 3 is a family of open-source large language models developed by Meta (formerly Facebook), designed to compete with the best proprietary models while remaining freely accessible to the community.

Full definition

Llama 3 (Large Language Model Meta AI, 3rd generation) is a family of language models released by Meta in April 2024. It represents a significant leap over Llama 2, with benchmark results that Meta reported as competitive with proprietary models of the same period. Llama 3 is available in several sizes, notably 8B and 70B parameters, which lets it cover use cases ranging from deployment on a personal computer to the most demanding cloud infrastructures.

Meta's philosophy with Llama 3 is based on openness: the model weights are freely downloadable, allowing anyone to use them, fine-tune them, or integrate them into commercial applications, subject to the terms of the Meta Llama 3 Community License. This approach has catalyzed an entire ecosystem of tools, adaptations, and derivative models created by the community, making Llama 3 one of the most widely adopted openly available models in the world.

On the technical side, Llama 3 builds on an optimized Transformer architecture, an improved tokenizer (128K token vocabulary), and training on a massive corpus of over 15 trillion tokens. The model excels in reasoning, code generation, instruction following, and multilingual understanding. Meta has also released Llama 3.1 (with a 405B parameter version) and Llama 3.2 (integrating multimodal vision capabilities and lightweight versions for edge computing), solidifying Llama 3 as an ever-evolving platform.

For prompt engineering practitioners, Llama 3 offers the major advantage of being runnable locally or on private infrastructure, ensuring full data control. Its structured prompt format (with system, user, and assistant role tags) is compatible with advanced prompting techniques such as few-shot, chain-of-thought, and RAG.

Etymology

"Llama" stands for Large Language Model Meta AI. The number 3 denotes the third major generation of this model family. The name also winks at the llama, the animal, which Meta uses as an informal mascot for the project.

Concrete examples

Local deployment for a confidential corporate chatbot

You are a legal assistant specialized in French labor law. Answer precisely and cite the relevant legal articles. Question: what are the conditions for a valid mutually agreed termination?

Fine-tuning Llama 3 for a specific domain

Using the Alpaca format, generate 50 instruction/response pairs to train a model specialized in veterinary medical diagnosis for cattle.
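The Alpaca format named in this prompt stores each training example as a record with three fields: instruction, input, and output. The sketch below, a minimal illustration rather than a reference pipeline, validates such records and serializes them as JSON Lines. The helper name to_alpaca_jsonl and the placeholder record are hypothetical; the original Alpaca dataset ships as a single JSON array, and JSON Lines is used here only on the assumption that the fine-tuning tool accepts it.

```python
import json

# Field names used by the Stanford Alpaca dataset.
REQUIRED_FIELDS = ("instruction", "input", "output")


def to_alpaca_jsonl(records):
    """Validate Alpaca-style records and return them as JSON Lines text.

    Hypothetical helper for illustration only.
    """
    lines = []
    for index, record in enumerate(records):
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")
        # Keep only the three Alpaca fields, in a stable order.
        clean = {field: record[field] for field in REQUIRED_FIELDS}
        lines.append(json.dumps(clean, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


# Placeholder record: real pairs should be reviewed by a domain expert.
sample = [
    {
        "instruction": "PLACEHOLDER: question about a cattle health case",
        "input": "",
        "output": "PLACEHOLDER: expert-reviewed answer",
    }
]
jsonl_text = to_alpaca_jsonl(sample)
print(jsonl_text, end="")
```

Validating the fields before training catches malformed pairs early, which matters when the pairs themselves were generated by a model.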

Usage via a compatible API (Ollama, vLLM, Together AI)

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a Python data analysis expert. Answer with commented code.<|eot_id|><|start_header_id|>user<|end_header_id|>

Write a script that loads a CSV, detects outliers using IQR, and generates a visual report.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
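The role-tag layout can also be assembled programmatically. The following is a minimal sketch in Python, assuming the special tokens of Meta's published Llama 3 instruct format, in which each turn is closed with <|eot_id|> and the prompt ends on an open assistant header. The function name build_llama3_prompt is hypothetical; chat-style APIs and tokenizer chat templates normally apply this template for you, so building it by hand is mostly useful for raw completion endpoints.

```python
def build_llama3_prompt(messages):
    """Assemble a Llama 3 instruct prompt from {"role", "content"} dicts.

    Hypothetical helper for illustration only.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn: role header, blank line, content, end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # An open assistant header cues the model to write its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = build_llama3_prompt([
    {
        "role": "system",
        "content": "You are a Python data analysis expert. Answer with commented code.",
    },
    {
        "role": "user",
        "content": "Write a script that loads a CSV, detects outliers using IQR, "
                   "and generates a visual report.",
    },
])
print(prompt)
```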

Practical usage

In prompt engineering, Llama 3 is mainly used when a project requires data confidentiality, deep customization via fine-tuning, or control over inference costs. It can be deployed locally with tools like Ollama or llama.cpp, or used via compatible cloud providers. Standard prompting techniques (system prompt, few-shot, chain-of-thought) work effectively, provided the prompt follows Llama 3's specific format with its role tags.
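As an illustration of a system prompt combined with few-shot examples, the sketch below builds a chat request for a locally running Ollama server. It assumes Ollama's default local endpoint (http://localhost:11434/api/chat) and a model pulled under the name llama3; the helper names and the sentiment examples are invented for illustration. With a chat endpoint like this one, the server applies the role tags, so the messages are passed as plain role/content pairs.

```python
import json
import urllib.request

# Default local Ollama chat endpoint (assumes a standard local install).
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(system_prompt, examples, question, model="llama3"):
    """Build a chat payload with a system prompt and few-shot example turns."""
    messages = [{"role": "system", "content": system_prompt}]
    for example_question, example_answer in examples:
        # Each few-shot example is a user turn followed by the ideal reply.
        messages.append({"role": "user", "content": example_question})
        messages.append({"role": "assistant", "content": example_answer})
    messages.append({"role": "user", "content": question})
    return {"model": model, "messages": messages, "stream": False}


def send_chat(payload, url=OLLAMA_CHAT_URL):
    """POST the payload to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]


payload = build_chat_payload(
    system_prompt="Classify the sentiment of each review as positive or negative.",
    examples=[
        ("Great battery life.", "positive"),
        ("Stopped working after a week.", "negative"),
    ],
    question="The screen is sharp and bright.",
)
# With an Ollama server running locally: print(send_chat(payload))
```

Keeping payload construction separate from the network call makes the few-shot structure easy to inspect and reuse with another compatible provider.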

Related concepts

Open-source language model
Fine-tuning
Quantization (GGUF, GPTQ)
Local inference

FAQ

What is the difference between Llama 3, Llama 3.1, and Llama 3.2?
Llama 3 (April 2024) introduced the 8B and 70B models. Llama 3.1 (July 2024) added a massive 405B parameter model and extended the context window to 128K tokens. Llama 3.2 (September 2024) brought multimodal capabilities (vision) and ultra-lightweight models (1B and 3B) designed for mobile and edge computing.
Can Llama 3 be used commercially?
Yes, with conditions. Meta distributes Llama 3 under its own community license, which allows commercial use. The main exception concerns very large companies: those whose products had more than 700 million monthly active users when the model was released must request a separate license from Meta. Read the license carefully before any production deployment.
How do you run Llama 3 on your own computer?
The easiest way is to use Ollama (ollama run llama3) or LM Studio, which download a pre-quantized build of the model and run it for you. On a GPU with 8 GB of VRAM, the 8B version quantized to 4 bits works well. The 70B version needs roughly 40 GB of VRAM even at 4-bit quantization, so it is often split across multiple GPUs. Formats like GGUF also allow CPU execution, though more slowly.
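The VRAM figures in this answer follow from simple arithmetic: the memory for the weights is roughly the parameter count times the bits per weight, divided by 8. The sketch below applies that rule to the nominal 8B and 70B sizes. It is a rough estimate only: it ignores the KV cache, activations, and quantization metadata, which is why practical requirements sit above these numbers. The function name is hypothetical.

```python
def weight_memory_gb(parameters_billion, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = parameters_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Nominal sizes; runtime overhead (KV cache, activations) is not included.
for name, size in (("8B", 8), ("70B", 70)):
    for bits in (16, 4):
        estimate = weight_memory_gb(size, bits)
        print(f"{name} at {bits}-bit: ~{estimate:.0f} GB of weights")
```

At 4 bits, the 8B weights come to about 4 GB, which leaves headroom on an 8 GB GPU, while the 70B weights come to about 35 GB before any overhead.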
