Why This Playbook?
RAG is often described as "retrieve, then generate," but production reliability requires explicit grounding policies and measurable abstention behavior. This playbook focuses on those engineering contracts.
Repos:
- RAG: github.com/amiya-pattnaik/rag-engineering-playbook
- Generative: github.com/amiya-pattnaik/generativeAI-engineering-playbook
- Agentic: github.com/amiya-pattnaik/agentic-engineering-playbook
Concept Primer: What Is RAG?
RAG combines retrieval and generation:
- Retrieve relevant chunks from local knowledge.
- Generate an answer constrained by retrieved context.
This lowers hallucination risk compared to prompt-only generation.
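The retrieve step can be sketched as similarity search over embedded chunks. The following is a minimal illustration, not the repo's actual implementation; the chunk texts and embedding values are toy data:

```typescript
// Hypothetical sketch: retrieve-then-generate over toy embeddings.
// Chunk texts and vector values are illustrative only.
type Chunk = { id: string; text: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function retrieveTopK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}

const chunks: Chunk[] = [
  { id: "c1", text: "Refunds are issued within 14 days.", embedding: [0.9, 0.1] },
  { id: "c2", text: "Support hours are 9am to 5pm.", embedding: [0.1, 0.9] },
];

// The generator then receives only `retrieved` as context,
// which is what constrains the answer to local knowledge.
const retrieved = retrieveTopK([0.8, 0.2], chunks, 1);
console.log(retrieved[0].id); // "c1"
```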
Broader RAG Use Cases
- Enterprise policy/compliance assistants.
- Internal engineering/support knowledge assistants.
- Product/API documentation Q&A.
- Customer support copilots grounded on approved KB content.
Demo scope in this repo:
- Policy-style Q&A focused on grounding, abstention, and citation validation.
Concept Comparison (GenAI vs Agentic vs RAG)
User Need
|
+--> Fast content draft from prompt/context
| -> Choose GENERATIVE AI
|
+--> Multi-step planning + tool orchestration
| -> Choose AGENTIC AI
|
+--> Answers grounded in source documents with citations
-> Choose RAG
What It Demonstrates
- Retrieval + ranking over local docs.
- Grounding guardrails (MIN_RETRIEVAL_SCORE, MIN_QUESTION_COVERAGE).
- Citation validation against retrieved chunk IDs.
- Blocked reasons for unsupported/unsafe answers.
- Scenario-based anti-hallucination evaluation.
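Citation validation boils down to a set-membership check: every cited ID must match a retrieved chunk ID. A minimal sketch, with illustrative function names rather than the repo's actual API:

```typescript
// Hypothetical sketch of citation validation: every citation in the
// generated answer must point at a retrieved chunk ID.
function validateCitations(
  citedIds: string[],
  retrievedIds: string[]
): { valid: boolean; unknown: string[] } {
  const known = new Set(retrievedIds);
  // Any cited ID not present in the retrieved set is a fabricated citation.
  const unknown = citedIds.filter((id) => !known.has(id));
  return { valid: unknown.length === 0, unknown };
}
```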
Flow
- User question is embedded and matched against indexed chunks.
- Retriever returns top-k ranked chunks.
- Score and coverage gates are evaluated.
- If gates fail, the system abstains and returns a blocked reason.
- If gates pass, generation uses only retrieved context.
- Citations are validated before final response.
ASCII Diagram
User Question
|
v
Embed Query -> Retrieve Top-K -> Score/Sort
|
v
Gate Checks
(score + coverage)
|
+-------------+-------------+
| |
v v
Block/Abstain Generate from Context
| + Validate Citations
+-------------+-------------+
v
Final Response + UI Status
Provider Support
- OpenAI is integrated out-of-the-box.
- You can connect Gemini, Claude, and others by adding provider adapters in demo-app/src/providers/ and extending the provider-selection logic in services.
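A provider adapter typically reduces to one interface plus a selection function. The shape below is a sketch of the pattern, assuming a hypothetical interface; the repo's real adapter contract may differ:

```typescript
// Hypothetical adapter shape for plugging in another provider.
// Interface and names are assumptions, not the repo's actual API.
interface LLMProvider {
  name: string;
  generate(prompt: string, context: string[]): Promise<string>;
}

// A stub adapter; a real one would call the provider's SDK here.
class EchoProvider implements LLMProvider {
  name = "echo";
  async generate(prompt: string, context: string[]): Promise<string> {
    return `[${this.name}] ${prompt} (context chunks: ${context.length})`;
  }
}

function selectProvider(providers: LLMProvider[], name: string): LLMProvider {
  const found = providers.find((p) => p.name === name);
  if (!found) throw new Error(`Unknown provider: ${name}`);
  return found;
}
```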
Quickstart
git clone https://github.com/amiya-pattnaik/rag-engineering-playbook.git
cd rag-engineering-playbook/demo-app
cp .env.example .env
npm install
npm run dev
# open http://localhost:3000
Use OpenAI provider mode:
- Set OPENAI_API_KEY in .env.
- Keep OPENAI_TEMPERATURE=0 for deterministic outputs.
Run and Evaluate
npm run demo:scenarios
npm run demo:anti-hallucination
Suite checks:
- answerable: grounded + expected facts
- unanswerable: abstain without grounded factual claims
- partial: answer known parts, abstain on unknown parts
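The three scenario classes can be expressed as one check per class. The field names below are illustrative assumptions about the scenario format, not the repo's actual schema:

```typescript
// Hypothetical sketch of the three anti-hallucination scenario classes.
// Field names are assumptions, not the repo's scenario schema.
type Scenario = {
  kind: "answerable" | "unanswerable" | "partial";
  expectedFacts: string[];
};

type Outcome = { answered: boolean; facts: string[] };

function evaluate(s: Scenario, o: Outcome): boolean {
  switch (s.kind) {
    case "answerable":
      // Must answer and include every expected fact.
      return o.answered && s.expectedFacts.every((f) => o.facts.includes(f));
    case "unanswerable":
      // Must abstain without asserting any facts.
      return !o.answered && o.facts.length === 0;
    case "partial":
      // Must answer, asserting only facts from the known subset.
      return o.answered && o.facts.every((f) => s.expectedFacts.includes(f));
  }
}
```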
Closing Thought
Reliable RAG is an engineering discipline: thresholds, coverage checks, citation constraints, and repeatable evaluation, not just prompt design.