Metadata-Version: 2.4
Name: allyoucanrag
Version: 0.1.1
Summary: A reusable RAG package and optional Streamlit app for academic question answering.
Author-email: Yibo Qiao <qiaoy33@mcmaster.ca>
Maintainer-email: Yibo Qiao <qiaoy33@mcmaster.ca>
License-Expression: MIT
Project-URL: Homepage, https://pypi.org/project/allyoucanrag/
Keywords: rag,qa,langchain,streamlit,chroma
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Education
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langchain==1.2.0
Requires-Dist: langchain-classic==1.0.1
Requires-Dist: langchain-openai
Requires-Dist: langchain-community
Requires-Dist: langchain-text-splitters
Requires-Dist: langchain-core
Requires-Dist: langchain_chroma
Requires-Dist: chromadb
Requires-Dist: openai
Requires-Dist: pypdf
Requires-Dist: unstructured
Provides-Extra: streamlit
Requires-Dist: streamlit>=1.28.0; extra == "streamlit"
Dynamic: license-file

# allyoucanRAG

This repository contains the `allyoucanrag` package, published under the public-facing brand name allyoucanRAG. The RAG core is packaged as a reusable Python library, while the Streamlit UI remains available as an optional frontend.

## What the project provides

- The allyoucanRAG package for reusable academic RAG workflows.
- A Streamlit QA app for course-material and user-uploaded PDF retrieval.
- A dual-vector-store RAG workflow based on Chroma and LangChain.
- Runtime-managed data directories for uploads, vector stores, and local config.

## Important packaging note

The following runtime data is intentionally excluded from package distributions:

- `CourseMaterials/`
- `UserUploads/`
- `chroma_db/`
- `config.json`
- `.streamlit/secrets.toml`

This means course materials are treated as external runtime data, not packaged assets, so distributions published to PyPI never contain them.
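One way to confirm the exclusions before uploading is to scan the member names of a built archive. The sketch below is illustrative and not part of the package: `find_runtime_data` and `scan_sdist` are hypothetical helper names, and matching on path components is an assumption about how a leak would show up.

```python
import tarfile
from pathlib import PurePosixPath

# Names taken from the exclusion list above.
EXCLUDED_DIRS = {"CourseMaterials", "UserUploads", "chroma_db"}
EXCLUDED_FILES = {"config.json"}
EXCLUDED_TAIL = (".streamlit", "secrets.toml")


def find_runtime_data(member_names):
    """Return archive member names that look like excluded runtime data."""
    leaked = []
    for name in member_names:
        parts = PurePosixPath(name).parts
        if any(part in EXCLUDED_DIRS for part in parts):
            leaked.append(name)
        elif parts and parts[-1] in EXCLUDED_FILES:
            leaked.append(name)
        elif parts[-2:] == EXCLUDED_TAIL:
            leaked.append(name)
    return leaked


def scan_sdist(path):
    """Hypothetical convenience wrapper: scan a .tar.gz sdist on disk."""
    with tarfile.open(path) as archive:
        return find_runtime_data(archive.getnames())
```

Calling `scan_sdist` on the archive in `dist/` should return an empty list if the exclusions work as intended. Note that `config.example.json` is not flagged, because only the exact file name `config.json` is matched.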

## Current layout

```text
MAC-Academy-QA-System/
├── app.py
├── pyproject.toml
├── MANIFEST.in
├── requirements.txt
├── config.example.json
├── src/
│   └── allyoucanrag/
│       ├── __init__.py
│       ├── batch_upload_helper.py
│       ├── config.py
│       ├── document_manager.py
│       ├── exceptions.py
│       ├── paths.py
│       ├── rag_system.py
│       ├── streamlit_app.py
│       └── utils.py
└── CourseMaterials/
```

## Installation

### Local development

```bash
pip install -r requirements.txt
```

### Install the package only

```bash
pip install .
```

### Install the package with the Streamlit frontend

```bash
pip install ".[streamlit]"
```

Once the package is on PyPI, the install command is:

```bash
pip install "allyoucanrag[streamlit]"
```

The public-facing brand is allyoucanRAG. The PyPI distribution name and the Python import path are both `allyoucanrag`.

## CLI usage

After installing the package, start the interactive CLI with:

```bash
allyoucanrag start
```

The CLI can update API keys, upload supported text documents, and ask questions against the knowledge base.

## Configuration

The package resolves API keys in this order:

1. Streamlit secrets
2. Environment variables
3. `config.json`

Example local `config.json`:

```json
{
  "OpenAIAPIKey": "your-openai-api-key",
  "LangChainAPIKey": "your-langchain-api-key"
}
```
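The resolution order above can be sketched as a small lookup function. This is a minimal illustration, not the package's actual implementation: `resolve_api_key` is a hypothetical name, the `secrets` argument is a plain mapping standing in for Streamlit secrets, and looking up secrets by the environment-variable name is an assumption.

```python
import json
import os
from pathlib import Path


def resolve_api_key(env_name, config_key, config_path="config.json",
                    secrets=None, environ=None):
    """Return the first non-empty key found, or None if no source has one."""
    environ = os.environ if environ is None else environ

    # 1. Streamlit secrets (assumption: keyed by the environment-variable name).
    if secrets is not None and secrets.get(env_name):
        return secrets[env_name]

    # 2. Environment variables.
    value = environ.get(env_name)
    if value:
        return value

    # 3. Local config.json, using the key names from the example above.
    path = Path(config_path)
    if path.is_file():
        with path.open(encoding="utf-8") as handle:
            config = json.load(handle)
        return config.get(config_key) or None
    return None
```

For example, `resolve_api_key("OPENAI_API_KEY", "OpenAIAPIKey")` would return the environment value when it is set and fall back to `config.json` otherwise.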

Supported environment variables:

- `OPENAI_API_KEY`
- `LANGCHAIN_API_KEY`
- `MAC_ACADEMY_QA_HOME`

If `MAC_ACADEMY_QA_HOME` is set, runtime directories are resolved from that base path. Otherwise, the current working directory is used.
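The base-path rule can be sketched in a few lines. `resolve_base_dir` is a hypothetical helper, not the package's API, and treating an empty variable the same as an unset one is an assumption.

```python
import os
from pathlib import Path


def resolve_base_dir(environ=None):
    """Return the base directory used for runtime data."""
    environ = os.environ if environ is None else environ
    home = environ.get("MAC_ACADEMY_QA_HOME")
    # Assumption: an unset or empty variable falls back to the working directory.
    return Path(home) if home else Path.cwd()
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to exercise in tests.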

## Run the Streamlit app

```bash
streamlit run app.py
```

## Use as a package

```python
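from allyoucanrag import DualVectorStoreRAG, resolve_runtime_paths

# Point the base document directory at course materials kept outside the package.
runtime_paths = resolve_runtime_paths(base_docs_dir="D:/external-course-materials")
rag = DualVectorStoreRAG(runtime_paths=runtime_paths)

# Initialize the course-material (base) and user-upload vector stores.
rag.initialize_base_vectorstore()
rag.initialize_user_vectorstore()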
```

## Runtime data directories

By default, the package expects these runtime paths under the active base directory:

- `CourseMaterials/`
- `UserUploads/`
- `chroma_db/base/`
- `chroma_db/user/`
- `config.json`

These are not installed into site-packages and should be managed by the runtime environment.
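Because these paths live outside the installed package, a deployment script may want to check which of them exist before starting. The following is a minimal sketch under that assumption; `runtime_layout` and `missing_runtime_paths` are hypothetical names and are not provided by `allyoucanrag`.

```python
from pathlib import Path

# Default runtime paths from the list above, relative to the base directory.
RUNTIME_DIRS = ("CourseMaterials", "UserUploads", "chroma_db/base", "chroma_db/user")
CONFIG_FILE = "config.json"


def runtime_layout(base_dir):
    """Map each expected runtime name to its full path under base_dir."""
    base = Path(base_dir)
    layout = {name: base / name for name in RUNTIME_DIRS}
    layout[CONFIG_FILE] = base / CONFIG_FILE
    return layout


def missing_runtime_paths(base_dir):
    """List expected runtime names that do not exist yet under base_dir."""
    return [name for name, path in runtime_layout(base_dir).items()
            if not path.exists()]
```

A non-empty result from `missing_runtime_paths(".")` only reports what is absent; whether the package creates those directories on first use is not stated here, so the sketch does not create anything itself.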

## Build distributions

```bash
python -m build --sdist --wheel
```

The generated files appear in `dist/`.

## Publish checklist

1. Choose and add a real `LICENSE` file.
2. Verify `README.md` renders correctly on PyPI.
3. Build with `python -m build --sdist --wheel`.
4. Upload to TestPyPI first.
5. Upload to PyPI.

Detailed release steps are in `DEPLOYMENT.md`.

