RAG Reference Hub

Learn RAG

Retrieval-Augmented Generation, from first principles to practice

Understand why RAG emerged, how it differs from fine-tuning and long-context models, and how modern RAG systems are designed.

Beginner path

Read the definition of RAG.
Study the pipeline from documents to citations.
Learn the difference between RAG, fine-tuning, and long context.
Use the glossary whenever a term is unfamiliar.

Expert path

Compare naive, advanced, modular, graph, agentic, and multimodal RAG.
Design an evaluation set before changing models.
Measure retrieval separately from generation.
Track source freshness and access-control behavior.

What is RAG?

Retrieval-Augmented Generation connects a language model to external knowledge sources. The retriever finds relevant evidence, and the model uses that evidence to answer with better grounding, accuracy, traceability, and domain relevance.

Why RAG emerged

RAG emerged because model weights cannot reliably contain every private, current, or specialized fact. Retrieval lets knowledge be updated independently from the model.

RAG vs fine-tuning

RAG is generally suitable when information changes or must be cited. Fine-tuning is generally suitable for behavior, tone, formatting, or domain patterns rather than constantly changing facts.

RAG vs long-context models

Long context can hold more material, but retrieval still helps select relevant evidence, reduce cost, improve traceability, and manage very large collections.

Naive, advanced, modular, graph, agentic, and multimodal RAG

Naive RAG retrieves chunks directly. Advanced RAG adds query rewriting, hybrid search, reranking, and evaluation. Modular RAG separates components. Graph, agentic, and multimodal RAG add relationships, tool use, and multiple content types.

Core concepts

Important concepts include chunking, embeddings, vector databases, hybrid search, metadata, reranking, query rewriting, retrieval evaluation, hallucination control, and citations.

Knowledge-base design

A RAG system is only as good as the knowledge it can retrieve. Good knowledge-base design defines source authority, metadata, ownership, update cadence, permissions, document structure, and deprecation rules before indexing starts.

Retrieval quality

Retrieval quality depends on chunking, embeddings, keyword coverage, metadata filters, reranking, and query transformation. Inspect failed questions directly; dashboards alone rarely explain why a source was missed.

Citations and traceability

Citations should help users verify claims, not merely decorate answers. Strong systems preserve source IDs, page numbers, section titles, timestamps, permissions, and snippets throughout the pipeline.

Security and governance

RAG systems must handle access control, prompt injection, sensitive data, logging, retention, source licensing, and human escalation. Retrieved text should be treated as untrusted content that cannot override system policy.

Production RAG

Production RAG requires monitoring, test sets, trace review, feedback loops, source refresh workflows, incident handling, and clear ownership. A prototype can answer questions; a production system must be maintained.

Strengths and limitations

RAG improves grounding and updateability, but it can still fail through bad ingestion, weak retrieval, stale sources, prompt injection, poor evaluation, or unsupported generation.

Common mistakes

Common mistakes include chunking everything the same way, ignoring metadata, skipping evaluation, trusting top-k retrieval blindly, failing to manage permissions, and not showing sources to users.

Best practices

Start with a narrow use case, curate sources, keep metadata, evaluate retrieval and answers separately, use reranking for noisy corpora, add citations, monitor failures, and define human escalation.

Practical examples

A simple RAG example

A university policy assistant receives the question: 'Can graduate students borrow interlibrary loan books?' The retriever searches approved library policy pages, finds the relevant borrowing rule, and passes that passage to the model. The answer cites the policy page instead of relying on model memory.

User asks a question
Retriever searches approved sources
Relevant passages are selected
Model answers only from those passages
UI shows citations

When RAG is the wrong tool

RAG is not a cure for every AI problem. If the task is pure classification, style transfer, translation, or extraction from a single provided document, retrieval may add unnecessary complexity.

Check whether external knowledge is needed
Check whether sources must be updated
Check whether citations matter
Choose simpler patterns when retrieval adds no value

A strong first project

The best first RAG project is narrow, source-rich, and easy to evaluate: an internal policy assistant, course-material Q&A bot, support documentation assistant, or research-paper explorer.

Limit the domain
Start with 20-100 trusted documents
Write test questions before launch
Review failures with subject experts

On this page

What is RAG?Why RAG emerged RAG vs fine-tuning RAG vs long-context models Naive, advanced, modular, graph, agentic, and multimodal RAG Core concepts Knowledge-base design Retrieval quality Citations and traceability Security and governance Production RAG Strengths and limitations Common mistakes Best practices

Trusted starting sources

Original RAG paperFoundational research paper for Retrieval-Augmented Generation.LangChain RAG tutorialOfficial implementation tutorial for indexing, retrieval, generation, and agents.LlamaIndex RAG guideOfficial conceptual and framework guide for data-connected LLM applications.Dify documentationOfficial source for Dify apps, knowledge bases, workflows, models, and deployment.Haystack documentationOfficial docs for component-based search, QA, and RAG pipelines.