Metadata-Version: 2.4
Name: glassbox-rag
Version: 0.1.0
Summary: An open-source, high-transparency modular RAG framework for AI/ML applications
Author: averoe
License-Expression: Apache-2.0
Keywords: rag,retrieval,generation,onnx,ollama,vector-store,multimodal
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.104.0
Requires-Dist: uvicorn[standard]>=0.24.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: requests>=2.31.0
Requires-Dist: python-multipart>=0.0.6
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Provides-Extra: embeddings
Requires-Dist: onnxruntime>=1.16.0; extra == "embeddings"
Requires-Dist: ollama>=0.0.50; extra == "embeddings"
Requires-Dist: openai>=1.3.0; extra == "embeddings"
Provides-Extra: vector-stores
Requires-Dist: qdrant-client>=2.4.0; extra == "vector-stores"
Requires-Dist: chromadb>=0.4.0; extra == "vector-stores"
Requires-Dist: psycopg2-binary>=2.9.0; extra == "vector-stores"
Provides-Extra: multimodal
Requires-Dist: pillow>=10.0.0; extra == "multimodal"
Requires-Dist: pypdf>=3.17.0; extra == "multimodal"
Requires-Dist: python-pptx>=0.6.21; extra == "multimodal"
Dynamic: license-file

# GlassBox

An open-source, high-transparency modular RAG (Retrieval-Augmented Generation) framework for AI/ML applications.

## Features

- Modular architecture with pluggable components
- Support for multiple vector stores (Qdrant, Chroma)
- Multiple embedding providers (OpenAI, Ollama, ONNX)
- Database support (PostgreSQL, SQLite)
- Comprehensive tracing and monitoring
- FastAPI-based REST API
- Web dashboard for trace visualization
- Async/await support with asyncio
- Pydantic v2 for robust configuration management

## Installation

Install from PyPI:

```bash
pip install glassbox-rag
```

The package requires Python 3.9 or newer and declares four optional extras: `dev` (testing and linting tools), `embeddings` (ONNX Runtime, Ollama, and OpenAI clients), `vector-stores` (Qdrant, Chroma, and psycopg2), and `multimodal` (Pillow, pypdf, and python-pptx). Install them with the standard extras syntax, for example `pip install "glassbox-rag[embeddings,vector-stores]"`.

## Quick Start

```python
from glassbox_rag.core.engine import GlassBoxEngine
from glassbox_rag.config import GlassBoxConfig

# Initialize the engine
config = GlassBoxConfig()
engine = GlassBoxEngine(config)

# Ingest documents
documents = [
    {"content": "Document 1", "metadata": {"source": "source1"}},
    {"content": "Document 2", "metadata": {"source": "source2"}},
]
engine.ingest(documents)

# Retrieve relevant documents
results = engine.retrieve("query text", top_k=5)
```

## Configuration

Configure GlassBox with a YAML file (`config/default.yaml`):

```yaml
server:
  host: "0.0.0.0"
  port: 8000

vector_store:
  type: "qdrant"
  config:
    url: "http://localhost:6333"

encoder:
  type: "openai"
  config:
    api_key: "${OPENAI_API_KEY}"
    model: "text-embedding-3-small"

database:
  type: "postgresql"
  config:
    url: "postgresql://user:password@localhost/glassbox"
```
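
The `${OPENAI_API_KEY}` value above looks like an environment-variable placeholder. How GlassBox resolves it is not documented here, so the following is only a minimal sketch of one common approach: expand the placeholders in the raw text before handing it to a YAML parser. The helper name `expand_env` is hypothetical and not part of the GlassBox API.

```python
import os
import re

# Matches ${NAME} placeholders such as ${OPENAI_API_KEY}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text: str) -> str:
    """Replace each ${NAME} in `text` with the value of that environment variable.

    Hypothetical helper for illustration only.
    Raises KeyError if a referenced variable is not set, so a missing
    secret fails loudly instead of being passed on as a literal string.
    """

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]

    return _PLACEHOLDER.sub(_lookup, text)


if __name__ == "__main__":
    os.environ.setdefault("OPENAI_API_KEY", "sk-example")
    print(expand_env('api_key: "${OPENAI_API_KEY}"'))
```

The expanded text could then be parsed with `yaml.safe_load` from PyYAML, which is already a declared dependency. Keeping secrets in the environment rather than in the YAML file avoids committing them to version control.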

## API Endpoints

- GET /health - Health check
- POST /retrieve - Retrieve documents
- POST /ingest - Ingest new documents
- POST /update - Update existing documents
- GET /traces/{id} - Get execution trace
- GET /traces/{id}/visualize - Visualize trace
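
The request and response schemas for these endpoints are not listed here. As a rough sketch, a client call to `POST /retrieve` might look like the following, assuming the JSON body mirrors the `engine.retrieve("query text", top_k=5)` call from the Quick Start. The field names `query` and `top_k` and the helper names are assumptions, so check the server's actual schema before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # host and port from the example configuration


def build_retrieve_request(
    query: str, top_k: int = 5, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST /retrieve request.

    The JSON field names "query" and "top_k" are assumptions that mirror
    engine.retrieve(); the real request schema may differ.
    """
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/retrieve",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires a running GlassBox server.
    print(send(build_retrieve_request("query text", top_k=5)))
```

If the application keeps FastAPI's default settings, the interactive documentation at `/docs` should show the exact request and response models.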

## Running the Server

Start the FastAPI server with Uvicorn:

```bash
python -m glassbox_rag
```

The server will be available at http://localhost:8000.
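
In scripts or tests it can help to wait until the server is ready before sending requests. The sketch below polls the `GET /health` endpoint listed above. It assumes the endpoint answers with HTTP 200 once the server is up; `health_ok` and `wait_until` are hypothetical helper names.

```python
import time
import urllib.request
from typing import Callable


def health_ok(base_url: str = "http://localhost:8000") -> bool:
    """Return True if GET /health answers with HTTP 200 (assumed success code)."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=2.0) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


def wait_until(
    check: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5
) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("server ready" if wait_until(health_ok) else "server did not start in time")
```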

## Web Dashboard

Access the web dashboard at http://localhost:8000/ to:

- View recent execution traces
- Visualize trace hierarchy and timing
- Monitor system metrics
- Track token usage and costs

## Testing

Run the test suite:

```bash
pytest tests/ -v
```

## Plugin Architecture

Extend GlassBox with custom plugins:

1. Create a plugin class inheriting from the appropriate base class
2. Implement required abstract methods
3. Register in the configuration

Example custom embedder:

```python
from glassbox_rag.plugins.base import EmbedderBase

class CustomEmbedder(EmbedderBase):
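    def encode(self, texts):
        # Placeholder: swap in your own embedding implementation.
        # Assumed contract: return one vector per input text.
        embeddings = [[float(len(text))] for text in texts]
        return embeddings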
```
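
The three steps above can be sketched end to end without GlassBox installed. In the example below, `EmbedderBase` is a stand-in abstract class, and the registry and `build_embedder` function are hypothetical. The real base class and registration mechanism may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence, Type


class EmbedderBase(ABC):
    """Stand-in for glassbox_rag.plugins.base.EmbedderBase (real interface may differ)."""

    @abstractmethod
    def encode(self, texts: Sequence[str]) -> List[List[float]]:
        """Return one embedding vector per input text."""


# Hypothetical registry mapping a config `type` string to a plugin class.
EMBEDDERS: Dict[str, Type[EmbedderBase]] = {}


def register_embedder(name: str):
    """Class decorator that records an embedder under `name`."""

    def decorator(cls: Type[EmbedderBase]) -> Type[EmbedderBase]:
        EMBEDDERS[name] = cls
        return cls

    return decorator


@register_embedder("char-count")
class CharCountEmbedder(EmbedderBase):
    """Toy embedder: [length, vowel count] per text. Not useful for real retrieval."""

    def encode(self, texts: Sequence[str]) -> List[List[float]]:
        return [
            [float(len(t)), float(sum(c in "aeiou" for c in t.lower()))]
            for t in texts
        ]


def build_embedder(config: dict) -> EmbedderBase:
    """Instantiate the embedder named by config["type"]."""
    return EMBEDDERS[config["type"]]()


if __name__ == "__main__":
    embedder = build_embedder({"type": "char-count"})
    print(embedder.encode(["abc", "hello"]))
```

Declaring `encode` as an abstract method means an incomplete plugin fails at instantiation time rather than in the middle of a request.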

## Architecture

GlassBox consists of several core components:

- Engine: Orchestrates all operations
- Encoder: Handles document and query encoding
- Retriever: Implements retrieval strategies
- Writeback: Manages document updates with protection
- Metrics: Tracks token usage and costs
- Trace: Records execution traces for debugging
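
To illustrate the kind of information a trace component can record, here is a minimal sketch of hierarchical span timing. It is not GlassBox's implementation: `Span`, `Tracer`, and `render` are hypothetical names, and the real trace format may differ.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    """One timed step; nested steps are stored as children."""

    name: str
    start: float
    end: Optional[float] = None
    children: List["Span"] = field(default_factory=list)

    @property
    def duration(self) -> float:
        """Elapsed seconds; uses the current time if the span is still open."""
        end = self.end if self.end is not None else time.perf_counter()
        return end - self.start


class Tracer:
    """Collects spans into a tree using a stack of currently open spans."""

    def __init__(self) -> None:
        self.roots: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str) -> Iterator[Span]:
        current = Span(name=name, start=time.perf_counter())
        # Attach to the innermost open span, or record as a new root.
        if self._stack:
            self._stack[-1].children.append(current)
        else:
            self.roots.append(current)
        self._stack.append(current)
        try:
            yield current
        finally:
            current.end = time.perf_counter()
            self._stack.pop()


def render(span: Span, depth: int = 0) -> List[str]:
    """Return an indented text view of a span tree with timings in ms."""
    lines = [f"{'  ' * depth}{span.name}: {span.duration * 1000:.2f} ms"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    tracer = Tracer()
    with tracer.span("retrieve"):
        with tracer.span("encode"):
            time.sleep(0.01)
        with tracer.span("search"):
            time.sleep(0.02)
    print("\n".join(render(tracer.roots[0])))
```

Recording spans as a tree is what makes a hierarchy-and-timing view possible: each step's duration can be compared against its parent's.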

## License

Licensed under the Apache License 2.0. See the LICENSE file for details.

## Contributing

Contributions are welcome! Please feel free to submit pull requests.

## Support

For issues, questions, or suggestions, please open an issue on GitHub.
