Metadata-Version: 2.4
Name: ragkit-adhitya
Version: 0.1.0
Summary: Production RAG pipelines without the abstraction tax
Author-email: M Adhitya <adhitya5119@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/iamadhitya1/rag-kit
Project-URL: Repository, https://github.com/iamadhitya1/rag-kit
Project-URL: Issues, https://github.com/iamadhitya1/rag-kit/issues
Keywords: rag,retrieval-augmented-generation,llm,embeddings,vector-search,ai
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == "openai"
Provides-Extra: groq
Requires-Dist: groq>=0.9; extra == "groq"
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.36; extra == "anthropic"
Provides-Extra: supabase
Requires-Dist: supabase>=2.0; extra == "supabase"
Provides-Extra: pdf
Requires-Dist: pypdf>=4.0; extra == "pdf"
Provides-Extra: url
Requires-Dist: httpx>=0.27; extra == "url"
Requires-Dist: beautifulsoup4>=4.12; extra == "url"
Provides-Extra: local
Requires-Dist: sentence-transformers>=3.0; extra == "local"
Provides-Extra: all
Requires-Dist: openai>=1.0; extra == "all"
Requires-Dist: groq>=0.9; extra == "all"
Requires-Dist: anthropic>=0.36; extra == "all"
Requires-Dist: supabase>=2.0; extra == "all"
Requires-Dist: pypdf>=4.0; extra == "all"
Requires-Dist: httpx>=0.27; extra == "all"
Requires-Dist: beautifulsoup4>=4.12; extra == "all"
Requires-Dist: sentence-transformers>=3.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Dynamic: license-file

# rag-kit

Production RAG pipelines without the abstraction tax.

```python
from ragkit import RAGPipeline
from ragkit.embedders import OpenAIEmbedder
from ragkit.generators import GroqGenerator

rag = RAGPipeline(
    embedder=OpenAIEmbedder(api_key="sk-..."),
    generator=GroqGenerator(api_key="gsk_..."),
)
rag.ingest("handbook.pdf")
result = rag.query("What is the refund policy?")
print(result.answer)
# → "Refunds are available within 7 days of purchase. Contact support@..."
print(result.sources)
# → [Chunk(text="Refunds are available...", metadata={"source": "handbook.pdf"})]
```

No LangChain. No magic. Every line of the pipeline is readable Python you can modify.

---

## Before You Start

### What you need

| Requirement | Minimum | How to check |
|---|---|---|
| Python | 3.10+ | `python --version` |
| pip | any | `pip --version` |
| An LLM API key | Groq (free) | [console.groq.com](https://console.groq.com) |
| An embedding API key | OpenAI **or** free local model | [platform.openai.com](https://platform.openai.com/api-keys) |

**Don't have Python?** Download it from [python.org](https://python.org). Pick any version ≥ 3.10.

### What is an API key?

An API key is a password that lets your code talk to an AI service (like Groq or OpenAI). You get one by creating a free account on their website.

- **Groq** — free, no credit card needed. Go to [console.groq.com](https://console.groq.com) → API Keys → Create. Looks like `gsk_abc123...`
- **OpenAI** — needs a paid account for embeddings. Go to [platform.openai.com](https://platform.openai.com/api-keys) → Create new secret key. Looks like `sk-abc123...`
- **No money?** Use `LocalEmbedder` instead of `OpenAIEmbedder` — runs on your own computer, completely free. See [Quick Start #3](#3-free-local-embeddings-no-api-key-for-embedding).

### Set up your environment (recommended)

```bash
# Create a virtual environment so rag-kit doesn't conflict with other packages
python -m venv venv

# Activate it
source venv/bin/activate        # Mac / Linux
venv\Scripts\activate           # Windows

# Install rag-kit with the providers you want
pip install rag-kit[openai,groq]

# Put your API keys in a .env file (never commit this to git)
cp .env.example .env
# Open .env and fill in your keys
```

---

## What is RAG?

LLMs like GPT-4 and Llama3 are trained on public internet data up to a cutoff date. They know nothing about:
- Your company's internal documents
- Data created after their training cutoff
- Private knowledge bases

**RAG (Retrieval-Augmented Generation)** solves this by giving the LLM the right context at query time, instead of baking knowledge into model weights.

```
User asks: "What's our refund policy?"

Without RAG:
  LLM: "I don't have information about your specific refund policy."
  (or worse — it hallucinates a plausible-sounding policy)

With RAG:
  1. Retrieve: find the paragraph about refunds from your policy PDF
  2. Generate: "Here is the relevant section: [paragraph]. Based on this, ..."
  LLM: "Refunds are available within 7 days. Contact support@yourco.com."
```

This is what every AI assistant with "chat with your docs" capability uses under the hood — Notion AI, GitHub Copilot's context, Cursor, Claude Projects, all of them.

---

## How RAG Works — The Full Pipeline

Understanding this pipeline is more valuable than any certification.

```
INGESTION (run once per document)
──────────────────────────────────────────────────────────────────
                                                                  
  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
  │    LOADER    │──▶│   CHUNKER    │──▶│   EMBEDDER   │──▶│    STORE     │
  │              │   │              │   │              │   │              │
  │ PDF/TXT/URL  │   │ Split text   │   │ text → vec   │   │ Save vectors │
  │ → Document   │   │ into Chunks  │   │ (numbers)    │   │ to DB/memory │
  └──────────────┘   └──────────────┘   └──────────────┘   └──────────────┘

QUERYING (run for every user question)
──────────────────────────────────────────────────────────────────
                                                                  
  User Question                                                   
       │                                                          
       ▼                                                          
  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
  │   EMBEDDER   │──▶│  RETRIEVER   │──▶│  (RERANKER)  │──▶│  GENERATOR  │
  │              │   │              │   │   optional   │   │              │
  │ query → vec  │   │ find similar │   │ LLM re-scores│   │ LLM answers  │
  │              │   │ chunks by    │   │ for precision│   │ using chunks │
  │              │   │ vector dist  │   │              │   │ as context   │
  └──────────────┘   └──────────────┘   └──────────────┘   └──────────────┘
                                                                  │
                                                                  ▼
                                                            Answer + Sources
```

Let's go through each step.

---

### Step 1: Loading

The loader converts any input (PDF, URL, plain text) into a uniform `Document` object. This abstraction means the rest of the pipeline doesn't care whether your document was a PDF or a webpage.

```python
from ragkit.loaders import load_pdf, load_url, load_text

doc = load_pdf("report.pdf")
# doc.text = "Q1 revenue was $4.2M..."
# doc.source = "report.pdf"

doc = load_url("https://docs.yourapp.com/api")
# doc.text = "API Reference\n\nEndpoints:\n..."
# doc.source = "https://docs.yourapp.com/api"
```

---

### Step 2: Chunking

Every LLM has a **context window** — the maximum amount of text it can read at one time. Think of it like working memory: a human can hold a few paragraphs in their head while answering a question, but not an entire book.

Most models allow 8,000–128,000 tokens (roughly 6,000–96,000 words). A 200-page PDF is ~100,000 words. Even if it fit, sending the entire document every time a user asks a question would be very slow and very expensive.

Chunking solves this: instead of sending everything, you only send the 3–5 most relevant pieces.

Chunking splits the document into small, overlapping pieces that can be retrieved and fed to the LLM individually.

**Why overlap?**

```
Original text: "...the payment is processed. Refunds take 3-5 business days to appear..."
                                            ↑ chunk boundary without overlap

Chunk 1: "...the payment is processed."
Chunk 2: "Refunds take 3-5 business days to appear..."
```

Without overlap, "Refunds take 3-5..." has no context for what payment method, what product, what country. With overlap:

```
Chunk 1: "...the payment is processed."
Chunk 2: "processed. Refunds take 3-5 business days to appear..."
             ↑ tail of chunk 1 gives context
```

**Three chunking strategies:**

| Strategy | How it splits | Best for |
|---|---|---|
| `fixed` | Every N characters, always | Uniform text, simple baseline |
| `recursive` | Paragraphs → sentences → words → chars | Most documents (default) |
| `semantic` | Where meaning shifts (needs embedder) | High-precision knowledge bases |

```python
from ragkit.chunkers import recursive_chunker, fixed_chunker, semantic_chunker

chunks = recursive_chunker(doc, chunk_size=500, overlap=50)
# chunks[0].text = "First paragraph..."
# chunks[0].metadata = {"source": "report.pdf", "chunk_index": 0, "total_chunks": 42}
```

**How to pick chunk_size:**

- Too small (< 100 chars): chunks lose context, embeddings become noisy
- Too large (> 1000 chars): chunks cover multiple topics, retrieval is imprecise
- **Sweet spot: 300–700 characters** for most documents

---

### Step 3: Embedding

An embedding converts text into a list of numbers (a vector) that represents its meaning. Similar texts produce similar vectors.

```
"refund policy"   → [0.21, -0.54, 0.88, 0.12, ...]   (1536 numbers)
"money back"      → [0.19, -0.51, 0.85, 0.14, ...]   (similar direction)
"pizza recipes"   → [-0.72, 0.33, -0.41, 0.65, ...]  (different direction)
```

This is what makes semantic search work. "Refund" and "money back" don't share any words, but their embeddings are close — so a search for "refund policy" will find a chunk that says "money back guarantee."

Keyword search (like SQL `LIKE '%refund%'`) would miss it.

```python
from ragkit.embedders import OpenAIEmbedder, LocalEmbedder

# Cloud: best quality, small cost
embedder = OpenAIEmbedder(api_key="sk-...")
vector = embedder.embed("refund policy")
# → list of 1536 floats

# Local: free, private, slightly lower quality
embedder = LocalEmbedder(model_name="all-MiniLM-L6-v2")
vector = embedder.embed("refund policy")
# → list of 384 floats, runs on your CPU
```

**Choosing an embedding model:**

| Model | Dims | Cost | Quality | Use when |
|---|---|---|---|---|
| `text-embedding-3-small` | 1536 | ~$0.00002/1K tokens | Excellent | Default choice |
| `text-embedding-3-large` | 3072 | ~$0.00013/1K tokens | Best | Legal/medical precision |
| `all-MiniLM-L6-v2` (local) | 384 | Free | Good | Privacy, no API key |
| `BAAI/bge-small-en` (local) | 384 | Free | Very good | Best free model |

---

### Step 4: Vector Store

Once you have vectors, you need to store them so you can search them later.

```python
from ragkit.stores import MemoryStore, SupabaseStore

# Development: fast, in-process, no setup
store = MemoryStore()

# Production: persistent, searchable across restarts, scales to millions
store = SupabaseStore(url="https://xxxx.supabase.co", key="eyJ...")
```

**How vector search works:**

The store computes **cosine similarity** between your query vector and every stored chunk vector. Chunks most similar in direction to the query are returned.

```
query vector: [0.21, -0.54, 0.88, ...]

chunk A: [0.19, -0.51, 0.85, ...]  → similarity: 0.98  ✓ very similar
chunk B: [0.20, -0.52, 0.87, ...]  → similarity: 0.97  ✓ similar  
chunk C: [-0.72, 0.33, -0.41, ...] → similarity: 0.12  ✗ unrelated
```

**MemoryStore** does a linear scan (O(n)) — fine for < 10K chunks.  
**SupabaseStore** uses an HNSW index — sub-10ms at millions of vectors.

**What is HNSW?**  
Hierarchical Navigable Small World. A graph-based index where each node connects to its nearest neighbors. Search navigates the graph instead of scanning every vector. O(log n) instead of O(n). You never need to build or maintain it — pgvector handles it automatically.

---

### Step 5: Retrieval

Given a query vector, return the most relevant chunks.

```python
from ragkit.retrievers import topk_retriever, mmr_retriever

# Simple: return the 5 most similar chunks
chunks = topk_retriever(store, query_vector, top_k=5)

# Advanced: return 5 diverse chunks (avoids redundant results)
chunks = mmr_retriever(store, query_vector, top_k=5, lambda_mult=0.5)
```

**Top-K vs MMR:**

Top-K returns the 5 most similar chunks. If your document says "refund" 10 times across different sections, you'll get 5 near-duplicate chunks. The LLM gets confused by repetition and wastes context.

MMR (Maximal Marginal Relevance) picks chunks one at a time, penalizing choices that are too similar to what was already picked. Each selected chunk must contribute *new* information.

```
Query: "refund policy"

Top-K results:          MMR results:
1. "Refunds in 7 days"  1. "Refunds in 7 days"     ← most relevant
2. "Refunds in 7 days"  2. "Contact support for..."  ← new info
3. "7 day refund limit" 3. "Cancellations vs refunds" ← new info
4. "7 day refund limit" 4. "Razorpay processes..."   ← new info
5. "Refunds available"  5. "Exceptions to refunds"  ← new info
```

Use MMR when your documents have repetitive content. Use Top-K otherwise.

---

### Step 6 (Optional): Reranking

Vector similarity measures "are these about the same topic?" — not "does this directly answer the question?"

The reranker reads each chunk and the query, then scores relevance directly. More expensive (1 LLM call per chunk), but significantly more precise.

```python
from ragkit.rerankers import llm_reranker

# Initial retrieval: 8 candidates by vector similarity
candidates = topk_retriever(store, query_vector, top_k=8)

# Rerank: LLM scores each on 1-10 relevance, return top 3
final = llm_reranker(candidates, query="refund policy", llm=generator, top_k=3)
```

Use reranking when:
- Answer accuracy matters more than speed/cost
- You're seeing the LLM use slightly-wrong chunks
- Your documents have many similar-sounding sections

Skip reranking when:
- You're building a high-QPS API (latency will hurt)
- Your queries are simple and retrieval quality is already good

---

### Step 7: Generation

Feed the retrieved chunks + the user's question to an LLM.

```python
from ragkit.generators import GroqGenerator

generator = GroqGenerator(api_key="gsk_...")
answer = generator.generate(
    query="What is the refund policy?",
    chunks=retrieved_chunks,
)
```

The generator formats chunks into a numbered context block and passes it to the LLM with a system prompt that says: *"Answer using ONLY the provided context. Never make up information."*

This grounding instruction is critical. Without it, the LLM will blend retrieved facts with its training data and hallucinate confidently.

The answer will cite `[1]`, `[2]`, etc. corresponding to the numbered chunks. Show these citations to your users so they can verify the source.

---

## Installation

```bash
# Minimal (choose your providers):
pip install rag-kit[groq]       # + Groq LLM
pip install rag-kit[openai]     # + OpenAI embeddings + GPT
pip install rag-kit[anthropic]  # + Claude

# Add-ons:
pip install rag-kit[pdf]        # PDF loading
pip install rag-kit[url]        # URL/webpage loading
pip install rag-kit[supabase]   # Persistent vector store
pip install rag-kit[local]      # Local embeddings (no API key)

# Everything:
pip install rag-kit[all]
```

---

## Quick Start

### 1. Basic (in-memory, no persistence)

```python
import os
from ragkit import RAGPipeline
from ragkit.embedders import OpenAIEmbedder
from ragkit.generators import GroqGenerator

rag = RAGPipeline(
    embedder=OpenAIEmbedder(api_key=os.environ["OPENAI_API_KEY"]),
    generator=GroqGenerator(api_key=os.environ["GROQ_API_KEY"]),
)

rag.ingest("company_handbook.pdf")       # PDF
rag.ingest("https://docs.myapp.com")     # URL
rag.ingest_text("Prices: Pro = ₹399/mo") # Raw string

result = rag.query("What are the pricing tiers?")
print(result.answer)

for chunk in result.sources:
    print(f"  Source: {chunk.metadata['source']}")
```

### 2. Production (Supabase persistence)

```python
from ragkit import RAGPipeline
from ragkit.embedders import OpenAIEmbedder
from ragkit.generators import GroqGenerator
from ragkit.stores import SupabaseStore

rag = RAGPipeline(
    embedder=OpenAIEmbedder(api_key="sk-..."),
    generator=GroqGenerator(api_key="gsk_..."),
    store=SupabaseStore(url="https://xxxx.supabase.co", key="eyJ..."),
)
```

First, run the setup SQL in your Supabase SQL editor:
```python
from ragkit.stores.supabase import SETUP_SQL
print(SETUP_SQL)  # copy and run this in Supabase
```

### 3. Free (local embeddings, no API key for embedding)

```python
from ragkit import RAGPipeline
from ragkit.embedders import LocalEmbedder
from ragkit.generators import GroqGenerator  # Groq free tier is generous

rag = RAGPipeline(
    embedder=LocalEmbedder("all-MiniLM-L6-v2"),  # runs on your CPU, free
    generator=GroqGenerator(api_key="gsk_..."),   # Groq free tier
)
```

### 4. Advanced (MMR + reranking for maximum quality)

```python
from ragkit import RAGPipeline
from ragkit.embedders import OpenAIEmbedder
from ragkit.generators import GroqGenerator
from ragkit.stores import SupabaseStore

rag = RAGPipeline(
    embedder=OpenAIEmbedder(api_key="sk-..."),
    generator=GroqGenerator(api_key="gsk_...", model="llama3-70b-8192"),
    store=SupabaseStore(url="...", key="..."),
    chunker="recursive",
    chunk_size=600,
    chunk_overlap=80,
    retriever="mmr",          # diverse retrieval
    top_k=6,
    reranker=True,            # LLM re-scores for precision
)
```

---

## API Reference

### `RAGPipeline`

```python
RAGPipeline(
    embedder,                # Required: OpenAIEmbedder | LocalEmbedder
    generator,               # Required: GroqGenerator | OpenAIGenerator | AnthropicGenerator
    store=None,              # MemoryStore() by default; pass SupabaseStore for persistence
    chunker="recursive",     # "fixed" | "recursive" | "semantic"
    chunk_size=500,          # characters per chunk
    chunk_overlap=50,        # overlap between consecutive chunks
    retriever="topk",        # "topk" | "mmr"
    top_k=5,                 # number of chunks to retrieve
    min_score=0.0,           # discard chunks below this similarity (0.0–1.0)
    reranker=None,           # True to enable LLM reranking
)
```

| Method | Returns | Description |
|---|---|---|
| `.ingest(source)` | `int` | Load, chunk, embed, store a file/URL. Returns chunk count. |
| `.ingest_text(text, source_label)` | `int` | Same but from a raw string. |
| `.query(question)` | `QueryResult` | Embed query, retrieve, generate, return answer + sources. |

### `QueryResult`

```python
result.answer   # str — the LLM's answer, grounded in retrieved chunks
result.sources  # list[Chunk] — the chunks that were used
```

### `Chunk`

```python
chunk.text                         # str — the text content
chunk.metadata["source"]           # str — file path or URL
chunk.metadata["chunk_index"]      # int — position in original document
chunk.metadata["total_chunks"]     # int — total chunks in this document
chunk.metadata["strategy"]         # str — "fixed" | "recursive" | "semantic"
```

---

## Common Patterns

### Show citations in a chat UI

```python
result = rag.query(user_message)

response_parts = [result.answer, "\n\n**Sources:**"]
for i, chunk in enumerate(result.sources, 1):
    source = chunk.metadata.get("source", "unknown")
    preview = chunk.text[:120].strip().replace("\n", " ")
    response_parts.append(f"[{i}] {source}: _{preview}..._")

final_response = "\n".join(response_parts)
```

### Ingest only new documents (avoid duplicates)

```python
already_ingested = {"report_q1.pdf", "handbook.pdf"}

for file in Path("docs").glob("*.pdf"):
    if file.name not in already_ingested:
        n = rag.ingest(str(file))
        print(f"Ingested {file.name}: {n} chunks")
```

### Filter by source document

With SupabaseStore, you can search only within a specific document:

```python
results = store.search(
    query_embedding,
    top_k=5,
    filter={"source": "hr-policy.pdf"},
)
```

### Streaming responses

```python
# GroqGenerator supports streaming
from groq import Groq

client = Groq(api_key="gsk_...")
stream = client.chat.completions.create(
    model="llama3-8b-8192",
    messages=[...],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```

---

## Choosing Your Stack

| Need | Pick |
|---|---|
| Fastest setup | `MemoryStore` + `OpenAIEmbedder` + `GroqGenerator` |
| Zero API cost | `MemoryStore` + `LocalEmbedder` + `GroqGenerator` (free tier) |
| Production persistence | `SupabaseStore` + any embedder/generator |
| Maximum accuracy | `SupabaseStore` + `OpenAIEmbedder` + `mmr` retriever + reranker + GPT-4o |
| Private / on-premise | `MemoryStore` + `LocalEmbedder` + local Ollama generator |

---

## What You've Learned

If you read this far and ran the examples, you understand:

- **RAG architecture** — loader → chunker → embedder → store → retriever → generator
- **Chunking strategies** — fixed, recursive, semantic; why overlap matters
- **Embeddings** — what they are, how cosine similarity works, how to pick a model
- **Vector search** — how HNSW indexing works at scale
- **Retrieval strategies** — Top-K vs MMR, when diversity matters
- **Reranking** — LLM-as-judge, when precision > speed
- **Generation** — how to write grounding prompts, how to show citations

This is the complete knowledge stack behind every "chat with your docs" product, every enterprise knowledge base, and every AI assistant with document understanding. No paid course required.

---

## What's Next

Now that you understand RAG, the natural next steps:

1. **Agents** — instead of one retrieval+generation step, let the LLM decide *when* to retrieve and *what* to do with the result. See [agent-loop](https://github.com/iamadhitya1/agent-loop).

2. **Memory** — give your RAG system episodic memory (remember past conversations) and semantic memory (retrieve relevant facts from prior sessions). See [mem-store](https://github.com/iamadhitya1/mem-store).

3. **Evals** — measure whether your RAG pipeline is actually answering correctly. See [eval-bench](https://github.com/iamadhitya1/eval-bench).

---

## Troubleshooting

These are the errors every beginner hits. Fixes are here so you don't lose an hour to them.

---

### `ModuleNotFoundError: No module named 'ragkit'`

You haven't installed the library yet, or your virtual environment isn't activated.

```bash
# Make sure your venv is active first
source venv/bin/activate       # Mac/Linux
venv\Scripts\activate          # Windows

# Then install
pip install rag-kit[openai,groq]
```

---

### `ModuleNotFoundError: No module named 'openai'` (or `groq`, `supabase`, etc.)

rag-kit has zero mandatory dependencies. You only get the extras you ask for.

```bash
pip install rag-kit[openai]     # for OpenAIEmbedder / OpenAIGenerator
pip install rag-kit[groq]       # for GroqGenerator
pip install rag-kit[supabase]   # for SupabaseStore
pip install rag-kit[pdf]        # for load_pdf()
pip install rag-kit[url]        # for load_url()
pip install rag-kit[local]      # for LocalEmbedder
pip install rag-kit[all]        # everything at once
```

---

### `AuthenticationError` / `401 Unauthorized`

Your API key is wrong or not set.

```python
# Bad — key hardcoded with a typo or expired key
embedder = OpenAIEmbedder(api_key="sk-abc123WRONG")

# Good — read from environment variable
import os
embedder = OpenAIEmbedder(api_key=os.environ["OPENAI_API_KEY"])
```

Double-check:
1. You copied the full key (they're long — don't cut it off)
2. The key is for the right service (OpenAI key ≠ Groq key)
3. Your `.env` file is loaded before your script runs:
   ```bash
   export OPENAI_API_KEY=sk-...   # Mac/Linux — run this in your terminal first
   set OPENAI_API_KEY=sk-...      # Windows CMD
   $env:OPENAI_API_KEY="sk-..."   # Windows PowerShell
   ```

---

### `SyntaxError` or `TypeError` on Python 3.9 or below

rag-kit uses `list[str]` and `dict | None` type hints, which require Python 3.10+.

```bash
python --version   # must show 3.10, 3.11, 3.12, or 3.13

# If you're on 3.9 or below, upgrade Python at python.org
```

---

### `result.answer` is "I couldn't find relevant information"

This means no chunks passed the similarity threshold. Three possible causes:

**1. Your document wasn't ingested yet.**
```python
rag.ingest("your_file.pdf")   # do this before rag.query()
result = rag.query("your question")
```

**2. The question phrasing is too different from the document language.**

Try rephrasing. "What is the money-back guarantee?" retrieves better than "refund" if the document uses the phrase "money-back guarantee."

**3. `min_score` is set too high.**
```python
# Default min_score is 0.0 — everything passes through
# If you set it high (e.g. 0.8), lower it while debugging
rag = RAGPipeline(..., min_score=0.0)
```

---

### `SupabaseStore` not finding results after ingestion

Make sure you ran the setup SQL first. Open your Supabase SQL editor and run:

```python
from ragkit.stores.supabase import SETUP_SQL
print(SETUP_SQL)
```

Copy the output and paste it into Supabase → SQL Editor → Run. This creates the table, HNSW index, and `match_chunks` function that the store depends on.

---

### Still stuck?

Open an issue at [github.com/iamadhitya1/rag-kit/issues](https://github.com/iamadhitya1/rag-kit/issues) with:
1. Your Python version (`python --version`)
2. The full error message (copy-paste, don't screenshot)
3. The 5 lines of code that triggered it

---

## License

MIT © 2025 M Adhitya
