Metadata-Version: 2.4
Name: helix-rag
Version: 0.1.12
Summary: Helix: Temporal GraphRAG combining LightRAG and Graphiti for time-aware knowledge graphs
Author-email: Yash Nuhash <nuhashroxme@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/YashNuhash/Helix
Project-URL: Documentation, https://github.com/YashNuhash/Helix#readme
Project-URL: Repository, https://github.com/YashNuhash/Helix
Project-URL: Bug Tracker, https://github.com/YashNuhash/Helix/issues
Keywords: rag,graphrag,temporal,knowledge-graph,llm,ai
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: lightrag-hku>=1.4.9.11
Requires-Dist: graphiti-core>=0.27.1
Requires-Dist: python-dotenv>=1.2.1
Requires-Dist: pydantic<=2.12.3,>=2.0
Requires-Dist: neo4j>=5.28.0
Requires-Dist: aiohttp>=3.11.12
Requires-Dist: asyncio-throttle>=1.0.2
Requires-Dist: datasets>=3.2.0
Requires-Dist: pandas>=2.2.3
Requires-Dist: tqdm>=4.67.1
Provides-Extra: supabase
Requires-Dist: supabase>=2.13.0; extra == "supabase"
Provides-Extra: huggingface
Requires-Dist: transformers>=4.40.0; extra == "huggingface"
Requires-Dist: torch>=2.2.0; extra == "huggingface"
Requires-Dist: accelerate>=0.30.0; extra == "huggingface"
Requires-Dist: sentence-transformers>=3.0.0; extra == "huggingface"
Provides-Extra: ollama
Requires-Dist: ollama>=0.4.0; extra == "ollama"
Provides-Extra: api
Requires-Dist: fastapi>=0.115.0; extra == "api"
Requires-Dist: uvicorn[standard]>=0.34.0; extra == "api"
Provides-Extra: dev
Requires-Dist: pytest>=8.3.4; extra == "dev"
Requires-Dist: pytest-asyncio>=0.25.3; extra == "dev"
Requires-Dist: ruff>=0.9.4; extra == "dev"
Requires-Dist: mypy>=1.14.1; extra == "dev"
Requires-Dist: scikit-learn>=1.6.1; extra == "dev"
Provides-Extra: all
Requires-Dist: helix-rag[api,dev,huggingface,ollama,supabase]; extra == "all"
Dynamic: license-file

<div align="center">

# 🧬 Helix: Temporal GraphRAG

**LightRAG + Graphiti = Temporal Knowledge Graphs for RAG**

<p>
  <img src="https://img.shields.io/badge/🐍Python-3.10+-4ecdc4?style=for-the-badge&logo=python&logoColor=white&labelColor=1a1a2e">
  <img src="https://img.shields.io/badge/📊Version-0.1.12-ff6b6b?style=for-the-badge&labelColor=1a1a2e">
  <img src="https://img.shields.io/badge/🧠Graphiti-Temporal_KG-00d9ff?style=for-the-badge&labelColor=1a1a2e">
</p>

<p>
  <a href="#-quick-start"><img src="https://img.shields.io/badge/🚀Quick_Start-1a1a2e?style=for-the-badge"></a>
  <a href="#-installation"><img src="https://img.shields.io/badge/📦Installation-1a1a2e?style=for-the-badge"></a>
  <a href="#-evaluation"><img src="https://img.shields.io/badge/📈Evaluation-1a1a2e?style=for-the-badge"></a>
</p>

</div>

---

## 🎯 What is Helix?

**Helix** combines [LightRAG](https://github.com/HKUDS/LightRAG)'s dual-level retrieval with [Graphiti](https://github.com/getzep/graphiti)'s bi-temporal knowledge graph into a single RAG system that offers:

| Feature | Capability |
|---------|------------|
| **Temporal Awareness** | Point-in-time queries, automatic edge invalidation |
| **Multi-Hop Reasoning** | BFS-based path exploration with scoring |
| **Hallucination Detection** | Composite Fidelity Index (CFI) verification |
| **Incremental Updates** | No full graph rebuild required |

---

## 📊 Benchmark Targets

| Category | Datasets | Metrics | Target | Baseline |
|----------|----------|---------|--------|----------|
| **Temporal** | TSQA, Time-LongQA, ECT-QA, MultiTQ | Hit@1, Hit@5, Acc | **70-75%** | 45-55% |
| **Hallucination** | Legal QA, Medical QA, FEVER | AUC, CFI | **>0.95** | 0.84-0.94 |
| **Multi-Hop** | MuSiQue, 2WikiMHQA, HotpotQA | F1, EM | **70-75** | 54-59 |
| **Scalability** | UltraDomain (all) | Tokens, Latency | **<600K** | 14M |

---

## 📦 Installation

### From PyPI

```bash
pip install helix-rag
```

### From Source (Development)

```bash
git clone https://github.com/YashNuhash/Helix.git
cd Helix

# Install in editable mode with the development extras
pip install -e ".[dev]"
```

### Dependencies

Helix relies on the following external services:
- **Neo4j** (for Graphiti Knowledge Graph)
- **Supabase** (optional, for vector storage)
- **LLM API** (any provider - configured via environment)

---

## ⚙️ Configuration

Copy `.env.example` to `.env` and configure:

```bash
cp .env.example .env
```

### Required Environment Variables

```env
# Neo4j Configuration (for Graphiti)
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password

# LLM Configuration (model-agnostic)
LLM_MODEL_NAME=your_model_name
LLM_API_KEY=your_api_key

# Supabase (optional)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your_key
```
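
Before starting Helix, it can help to confirm that the variables above are actually set. The following is a minimal sketch, not part of the Helix API: `check_env` is a hypothetical helper, and it assumes the values have already been loaded into the process environment (for example from `.env`).

```python
import os

# Variable names are taken from the .env example above.
# The helper itself is hypothetical and not shipped with Helix.
REQUIRED_VARS = (
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "LLM_MODEL_NAME",
    "LLM_API_KEY",
)
OPTIONAL_VARS = ("SUPABASE_URL", "SUPABASE_KEY")


def check_env(env=None):
    """Return (missing_required, missing_optional) for a mapping of settings.

    Empty strings count as missing. Defaults to the process environment.
    """
    env = os.environ if env is None else env
    missing_required = [name for name in REQUIRED_VARS if not env.get(name)]
    missing_optional = [name for name in OPTIONAL_VARS if not env.get(name)]
    return missing_required, missing_optional


missing_required, missing_optional = check_env()
if missing_required:
    print("Missing required settings:", ", ".join(missing_required))
if missing_optional:
    print("Optional settings not set:", ", ".join(missing_optional))
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to test.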

### Supabase Setup (Optional)

Run `scripts/supabase_schema.sql` in your Supabase SQL Editor to create the vector storage table.

---

## 🚀 Quick Start

### Basic Usage

```python
import asyncio
from helix import Helix

async def main():
    # Initialize Helix
    async with Helix() as helix:
        # Insert document with temporal tracking
        result = await helix.insert(
            "Alan Turing was born on June 23, 1912. "
            "He is considered the father of computer science.",
            source_description="Wikipedia"
        )
        print(f"Extracted {result['entities_extracted']} entities")
        
        # Query with temporal awareness
        answer = await helix.query(
            "When was Alan Turing born?",
            mode="hybrid"
        )
        print(answer["answer"])

asyncio.run(main())
```

### Temporal Queries

```python
import asyncio
from datetime import datetime
from helix import Helix
from helix.utils import is_temporal_query, extract_temporal_params

async def temporal_example():
    async with Helix() as helix:
        # Detect temporal intent
        query = "Who was the CEO of Apple in 2015?"
        
        if is_temporal_query(query):
            params = extract_temporal_params(query)
            print(f"Temporal query detected: {params.temporal_keywords}")
        
        # Query with point-in-time context
        result = await helix.query(
            query,
            valid_at=datetime(2015, 1, 1),
            include_temporal_context=True
        )
        print(result)

asyncio.run(temporal_example())
```
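
To illustrate what a point-in-time lookup such as `valid_at=datetime(2015, 1, 1)` means conceptually, here is a minimal sketch that filters facts by their validity interval. The `Fact` class and `facts_valid_at` function are hypothetical and are not Helix or Graphiti classes. The sketch models only the "when was this true" axis of a bi-temporal graph, not when the fact was recorded.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    """Hypothetical stand-in for a temporal edge in the knowledge graph."""
    subject: str
    relation: str
    obj: str
    valid_at: datetime                     # when the fact became true
    invalid_at: Optional[datetime] = None  # when it stopped being true; None = still true


def facts_valid_at(facts, when):
    """Keep facts whose half-open interval [valid_at, invalid_at) contains `when`."""
    return [
        f for f in facts
        if f.valid_at <= when and (f.invalid_at is None or when < f.invalid_at)
    ]


# Illustrative data only.
facts = [
    Fact("Apple", "has_ceo", "Steve Jobs", datetime(1997, 9, 16), datetime(2011, 8, 24)),
    Fact("Apple", "has_ceo", "Tim Cook", datetime(2011, 8, 24)),
]

print([f.obj for f in facts_valid_at(facts, datetime(2015, 1, 1))])  # ['Tim Cook']
```

Closing the earlier fact with `invalid_at` instead of deleting it is what keeps historical questions answerable after the graph is updated.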

### Hallucination Detection

```python
import asyncio
from helix import Helix
from helix.hallucination import HallucinationDetector

async def verify_response():
    async with Helix() as helix:
        detector = HallucinationDetector(graphiti=helix.graphiti)
        
        # Get response
        result = await helix.query("Tell me about Alan Turing")
        
        # Verify against knowledge graph
        verification = await detector.verify_response(
            response=result["answer"],
            query="Tell me about Alan Turing",
            context=result.get("temporal_context")
        )
        
        print(f"Grounded: {verification.is_grounded}")
        print(f"CFI Score: {verification.confidence_score:.2f}")
        print(f"Entity Coverage: {verification.entity_coverage:.2%}")

asyncio.run(verify_response())
```

### Multi-Hop Reasoning

```python
import asyncio
from helix import Helix
from helix.multihop import MultiHopRetriever

async def multihop_example():
    async with Helix() as helix:
        retriever = MultiHopRetriever(graphiti=helix.graphiti)
        
        # Find reasoning paths
        paths = await retriever.find_paths(
            query="How is Alan Turing connected to modern AI?",
            max_hops=3
        )
        
        # Format as context
        context = retriever.format_paths_as_context(paths)
        print(context)

asyncio.run(multihop_example())
```
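
The idea behind BFS-based path exploration with scoring can be sketched on a toy in-memory graph. This is not the `MultiHopRetriever` implementation: the graph, the `find_paths_bfs` function, and the scoring rule (product of edge weights) are all assumptions made for illustration.

```python
from collections import deque

# Toy graph: node -> list of (relation, neighbor, edge_weight). Illustrative only.
GRAPH = {
    "Alan Turing": [("proposed", "Turing Test", 0.9), ("designed", "Turing Machine", 0.8)],
    "Turing Test": [("influenced", "Modern AI", 0.7)],
    "Turing Machine": [("underpins", "Computability Theory", 0.9)],
    "Computability Theory": [("informs", "Modern AI", 0.5)],
}


def find_paths_bfs(graph, start, goal, max_hops=3):
    """Breadth-first search for cycle-free paths from start to goal.

    Returns (score, nodes, relations) tuples sorted by descending score,
    where score is the product of edge weights along the path
    (a hypothetical rule that favors short, strong chains).
    """
    results = []
    queue = deque([([start], [], 1.0)])  # (nodes so far, relations so far, score)
    while queue:
        nodes, relations, score = queue.popleft()
        current = nodes[-1]
        if current == goal:
            results.append((score, nodes, relations))
            continue
        if len(relations) >= max_hops:
            continue  # hop budget exhausted
        for relation, neighbor, weight in graph.get(current, []):
            if neighbor in nodes:
                continue  # avoid revisiting a node on the same path
            queue.append((nodes + [neighbor], relations + [relation], score * weight))
    return sorted(results, key=lambda item: item[0], reverse=True)


for score, nodes, relations in find_paths_bfs(GRAPH, "Alan Turing", "Modern AI"):
    print(f"{score:.2f}  " + " -> ".join(nodes))
```

Bounding the search with `max_hops` keeps the number of explored paths manageable on dense graphs, at the cost of missing longer chains.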

---

## 📈 Evaluation

### Running Benchmarks

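Helix includes evaluation scripts for academic benchmarks. The snippet below is written for a notebook environment such as Google Colab or Kaggle, where `!pip` and top-level `await` are available: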

```python
# Install Helix
!pip install helix-rag

# Run temporal benchmark
from helix.eval import TemporalBenchmark

benchmark = TemporalBenchmark(dataset="time-longqa")
results = await benchmark.run()
print(f"Hit@1: {results['hit_at_1']:.2%}")
```
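
For reference, the metrics named in this README (Hit@k, EM, F1) are commonly computed along the following lines. This is a simplified sketch with hypothetical function names, not the scoring code in `helix.eval`. It normalizes only case and whitespace, whereas official benchmark scripts may also strip punctuation and articles.

```python
from collections import Counter


def hit_at_k(ranked_answers, gold_answers, k):
    """Return 1.0 if any of the top-k ranked answers matches a gold answer."""
    gold = {g.strip().lower() for g in gold_answers}
    return float(any(a.strip().lower() in gold for a in ranked_answers[:k]))


def exact_match(prediction, gold):
    """Return 1.0 if prediction and gold are equal after light normalization."""
    return float(prediction.strip().lower() == gold.strip().lower())


def token_f1(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(hit_at_k(["1913", "1912"], ["1912"], k=1))  # 0.0
print(hit_at_k(["1913", "1912"], ["1912"], k=5))  # 1.0
print(exact_match("June 23, 1912", "june 23, 1912"))  # 1.0
```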

### Supported Benchmarks

| Benchmark | Dataset | Command |
|-----------|---------|---------|
| Temporal | TSQA | `helix eval --dataset tsqa` |
| Temporal | Time-LongQA | `helix eval --dataset time-longqa` |
| Temporal | ECT-QA | `helix eval --dataset ect-qa` |
| Multi-Hop | MuSiQue | `helix eval --dataset musique` |
| Multi-Hop | HotpotQA | `helix eval --dataset hotpotqa` |
| Hallucination | FEVER | `helix eval --dataset fever` |
| Scalability | UltraDomain | `helix eval --dataset ultradomain` |

### Colab/Kaggle Notebook

```python
# Quick evaluation notebook
import os
os.environ["LLM_API_KEY"] = "your_key"
os.environ["LLM_MODEL_NAME"] = "your_model"
os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ["NEO4J_PASSWORD"] = "password"

from helix import Helix
from helix.eval import run_all_benchmarks

# Run all benchmarks
results = await run_all_benchmarks()
print(results.to_dataframe())
```

---

## 🏗️ Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                         Helix                                │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────┐   │
│  │   LightRAG  │  │   Graphiti   │  │  Helix Modules    │   │
│  │  (Retrieval)│  │ (Temporal KG)│  │                   │   │
│  ├─────────────┤  ├──────────────┤  ├───────────────────┤   │
│  │ - Chunking  │  │ - Episodes   │  │ - TemporalHandler │   │
│  │ - Embedding │  │ - Bi-temporal│  │ - Hallucination   │   │
│  │ - Vector DB │  │ - Resolution │  │ - MultiHop        │   │
│  │ - Dual-level│  │ - Invalidate │  │ - CFI Scoring     │   │
│  └──────┬──────┘  └──────┬───────┘  └─────────┬─────────┘   │
│         │                │                    │              │
│         └────────────────┼────────────────────┘              │
│                          ▼                                   │
│  ┌─────────────────────────────────────────────────────┐    │
│  │                    Storage Layer                     │    │
│  ├─────────────────────────────────────────────────────┤    │
│  │  Neo4j (Graph)  │  Supabase (Vector)  │  Local KV   │    │
│  └─────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘
```

---

## 📁 Project Structure

```
helix/
├── __init__.py           # Package entry (v0.1.12)
├── core/
│   └── helix.py          # Main Helix class
├── storage/
│   ├── graphiti_impl.py  # GraphitiStorage
│   └── supabase_impl.py  # SupabaseVectorStorage
├── temporal/
│   └── query_handler.py  # TemporalQueryHandler
├── hallucination/
│   └── detector.py       # HallucinationDetector (CFI)
├── multihop/
│   └── retriever.py      # MultiHopRetriever (BFS)
└── utils/
    └── temporal_utils.py # Temporal parsing
```

---

## 🔬 Research Goals

Helix targets the following results, in line with the benchmark targets listed above:

1. **Temporal GraphRAG**: 70-75% accuracy on temporal QA benchmarks
2. **Hallucination Detection**: AUC >0.95 using graph-aligned verification
3. **Multi-Hop Reasoning**: F1 70-75 on complex reasoning benchmarks
4. **Scalability**: <600K tokens for indexing (vs 14M baseline)

See [PLAN.md](PLAN.md) for detailed research methodology.

---

## 📚 Citation

If you use Helix in your research, please cite:

```bibtex
@software{helix2024,
  title = {Helix: Temporal GraphRAG with LightRAG and Graphiti},
  author = {Yash Nuhash},
  year = {2024},
  url = {https://github.com/YashNuhash/Helix}
}
```

---

## 🤝 Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

---

## 📄 License

MIT License - see [LICENSE](LICENSE) for details.

---

<div align="center">
  <p><strong>Built with 🧬 Helix</strong></p>
  <p>LightRAG + Graphiti = Temporal GraphRAG</p>
</div>
