Metadata-Version: 2.4
Name: tnh-scholar
Version: 0.4.1
Summary: TNH Scholar is an AI-driven project designed to explore, query, and translate the teachings of Thich Nhat Hanh and the Plum Village community.
License: GPL-3.0-only
License-File: LICENSE
Keywords: nlp,dharma,processing,text,translation,ai
Author: Aaron K. Solomon
Author-email: aaron.kyle.solomon@gmail.com
Requires-Python: >=3.12,<3.13
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Provides-Extra: gui
Provides-Extra: ocr
Provides-Extra: query
Requires-Dist: assemblyai
Requires-Dist: beautifulsoup4 (==4.12.3) ; extra == "query"
Requires-Dist: click (==8.1.8)
Requires-Dist: colorlog (==6.9.0)
Requires-Dist: ebooklib (==0.18) ; extra == "query"
Requires-Dist: fitz (==0.0.1.dev2) ; extra == "ocr"
Requires-Dist: flask (==3.1.3)
Requires-Dist: gitpython (==3.1.50)
Requires-Dist: google-cloud (==0.34.0) ; extra == "ocr"
Requires-Dist: google-cloud-vision (==3.9.0) ; extra == "ocr"
Requires-Dist: jinja2 (==3.1.6)
Requires-Dist: langdetect (==1.0.9)
Requires-Dist: lxml (==6.1.0)
Requires-Dist: nltk (==3.9.4)
Requires-Dist: numpy (>=2.0,<2.1)
Requires-Dist: openai (>=2,<3)
Requires-Dist: packaging (==24.2)
Requires-Dist: pdf2image (==1.17.0) ; extra == "ocr"
Requires-Dist: pillow (==12.2.0) ; extra == "ocr"
Requires-Dist: platformdirs (==4.3.6)
Requires-Dist: pycountry (==24.6.1)
Requires-Dist: pydantic (>=2.10,<3)
Requires-Dist: pydantic-settings (>=2.6,<3)
Requires-Dist: pydub (==0.25.1)
Requires-Dist: pyngrok (==7.2.3)
Requires-Dist: pysrt (==1.1.2)
Requires-Dist: python-dotenv (==1.2.2)
Requires-Dist: python-json-logger (==2.0.7)
Requires-Dist: pyyaml (==6.0.2)
Requires-Dist: regex (==2024.11.6)
Requires-Dist: requests (==2.33.0)
Requires-Dist: rich (==13.9.4)
Requires-Dist: spacy (>=3.8,<3.9) ; extra == "query"
Requires-Dist: streamlit (>=1.41,<2) ; extra == "gui"
Requires-Dist: tenacity (==9.0.0)
Requires-Dist: tiktoken (>=0.8,<1)
Requires-Dist: tqdm (==4.67.1)
Requires-Dist: typer (==0.12.5)
Requires-Dist: unidecode (==1.3.8)
Requires-Dist: yt-dlp
Project-URL: Bug Tracker, https://github.com/aaronksolomon/tnh-scholar/issues
Project-URL: Documentation, https://aaronksolomon.github.io/tnh-scholar/
Project-URL: Homepage, https://aaronksolomon.github.io/tnh-scholar/
Description-Content-Type: text/markdown

# TNH Scholar README

TNH Scholar is a human-centered research project for the study and exploration of the teachings of Thích Nhất Hạnh, the Plum Village tradition, and the broader Buddhist tradition. It is grounded in mindful human–AI collaboration, with an emphasis on transparency, authenticity, and care.

## Vision & Aspiration

TNH Scholar aspires to make the teachings of Thích Nhất Hạnh, the Plum Village tradition, and the broader Buddhist tradition more accessible, discoverable, and researchable. It explores how AI can support deep study, translation, navigation, and content processing, while also opening new possibilities for the collaborative development of the software and research workflows behind that work. Our vision is to keep human discernment, scholarly care, contemplative use, and mindful human–AI collaboration at the center.

## Features

TNH Scholar is currently in active prototyping. Key capabilities:

- **Audio and transcript processing**: `audio-transcribe` with diarization and YouTube support
- **Text formatting and translation**: `tnh-gen` CLI for punctuation, translation, sectioning, and prompt-driven processing. See [ADR-TG01](docs/architecture/tnh-gen/adr/adr-tg01-cli-architecture.md) and [ADR-TG02](docs/architecture/tnh-gen/adr/adr-tg02-prompt-integration.md) for architecture details. See the [Thầy Edited Journal Text Case Study](docs/user-guide/journal-pipeline-case-study.md) for a fully worked OCR-to-translation example.
- **Acquisition utilities**: `ytt-fetch` for transcripts; `token-count` and `nfmt` for prep and planning
- **Setup and configuration**: `tnh-setup` plus guided config in Getting Started
- **Agent orchestration bootstrap**: `tnh-conductor`, a maintained local/headless workflow runner for bounded, worktree-backed bootstrap flows
- **Prompt system**: See ADRs under [docs/architecture/prompt-system/index.md](docs/architecture/prompt-system/index.md) for decisions and roadmap

> **⚠️ Rapid Prototype Phase (0.x)**: TNH Scholar is in active development with **no backward compatibility guarantees**. Breaking changes may occur in ANY 0.x release (including patches). Pin to a specific version if stability is needed, for example the current release: `pip install tnh-scholar==0.4.1`. See [ADR-PP01](docs/architecture/project-policies/adr/adr-pp01-rapid-prototype-versioning.md) for versioning policy.

## Quick Start

### Installation (PyPI)

```bash
pip install tnh-scholar
tnh-setup
```

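Prerequisites: Python 3.12 (3.12.4 or later; the package metadata requires `>=3.12,<3.13`, so 3.13 is not supported), an OpenAI API key (for the CLI tools), Google Vision (optional, for OCR), and pip or Poetry.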
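
The package metadata declares `Requires-Python: >=3.12,<3.13`, so pip will refuse to install on other interpreters. A minimal pre-flight check, assuming you want to test the interpreter before installing, could look like the sketch below; `python_supported` is a hypothetical helper, not something TNH Scholar ships.

```python
import sys

# Bounds copied from the package metadata: Requires-Python >=3.12,<3.13
MIN_VERSION = (3, 12)
MAX_EXCLUSIVE = (3, 13)


def python_supported(version_info=sys.version_info) -> bool:
    """Return True when (major, minor) falls inside the declared range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_EXCLUSIVE


if __name__ == "__main__":
    status = "supported" if python_supported() else "not supported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```

The check compares only the major and minor components, which mirrors the declared range; it does not enforce the 3.12.4 patch level mentioned in the prerequisites.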

### Development setup (from source)

Follow [DEV_SETUP.md](DEV_SETUP.md) for the full workflow. Short version:

```bash
pyenv install 3.12.4
poetry config virtualenvs.in-project true
make setup-dev    # Full dev environment (recommended)
make build-all    # Full rebuild (poetry update, yt-dlp, pipx, docs)
make pipx-build   # Install CLI tools globally (audio-transcribe, tnh-gen, etc.)
```

### Set OpenAI credentials

```bash
export OPENAI_API_KEY="your-api-key"
```
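
Scripts that wrap the CLI tools can verify that the key is present before starting a long job. The following is a sketch under that assumption; `get_openai_api_key` is a hypothetical helper and does not describe how TNH Scholar itself reads the variable.

```python
import os
from collections.abc import Mapping


def get_openai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the key from the environment, or raise with a clear hint."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            'OPENAI_API_KEY is not set; run: export OPENAI_API_KEY="your-api-key"'
        )
    return key
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary instead of the real process environment.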

### Example usage

**Transcribe Audio from YouTube:**

```bash
audio-transcribe --yt_url "https://youtube.com/watch?v=example" --split --transcribe
```

**Download Video Transcripts:**

```bash
ytt-fetch "https://youtube.com/watch?v=example" -l en -o transcript.txt
```

**Process Text with tnh-gen:**

```bash
# List available prompts
tnh-gen list

# Run a prompt on a file
tnh-gen run --prompt translate --input-file input.txt --var source_language=vi --var target_language=en
```
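
To drive the same `tnh-gen run` call from a Python script, one option is to assemble the argument list and hand it to `subprocess`. This is a sketch, not a documented API: it uses only the subcommand and flags shown in the example above (`run`, `--prompt`, `--input-file`, `--var`), the helper names are hypothetical, and it assumes the result is written to standard output.

```python
import subprocess
from pathlib import Path


def build_tnh_gen_run(prompt: str, input_file: Path, variables: dict[str, str]) -> list[str]:
    """Assemble the argv for `tnh-gen run` using only the flags shown above."""
    argv = ["tnh-gen", "run", "--prompt", prompt, "--input-file", str(input_file)]
    for name, value in variables.items():
        # Each variable becomes its own `--var name=value` pair.
        argv += ["--var", f"{name}={value}"]
    return argv


def run_tnh_gen(argv: list[str]) -> str:
    """Run the command and return stdout (assumed to hold the result).

    Raises subprocess.CalledProcessError on a non-zero exit status.
    """
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout
```

Building the list separately from running it avoids shell quoting problems and lets you inspect or log the exact command before it executes.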

## Getting Started

- **Practitioners**: Install, configure credentials, and follow the [Quick Start Guide](docs/getting-started/quick-start-guide.md); workflows live in the [User Guide](docs/user-guide/overview.md).
- **Developers**: Set up via [DEV_SETUP.md](DEV_SETUP.md) and [Contributing](CONTRIBUTING.md); review [System Design](docs/development/system-design.md) and the [CLI docs](docs/cli-reference/index.md); run `make docs` to view locally.
  - **Project Philosophy & Vision**: Developers and researchers should review the conceptual foundations in `docs/project/vision.md`, `docs/project/philosophy.md`, `docs/project/principles.md`, and `docs/project/conceptual-architecture.md` to understand the system’s long-term direction and design intent.
- **Researchers**: Explore [Research](docs/research/index.md) for experiments and direction in the study of Thích Nhất Hạnh, Plum Village, and broader Buddhist materials; see [Architecture](docs/architecture/index.md) for pipelines/ADRs (e.g., [ADR-K01](docs/architecture/knowledge-base/adr/adr-k01-kb-architecture-strategy.md)).

## Documentation Overview

Documentation is available online and in the source repository:

- **Online Documentation**: [aaronksolomon.github.io/tnh-scholar/](https://aaronksolomon.github.io/tnh-scholar/)
- **GitHub Repository**: [github.com/aaronksolomon/tnh-scholar](https://github.com/aaronksolomon/tnh-scholar)

### Documentation Structure

- **[Getting Started](docs/getting-started/index.md)** – Installation, setup, and first steps
- **[CLI Docs](docs/cli-reference/index.md)** – Command-line tool documentation
- **[User Guide](docs/user-guide/index.md)** – Detailed usage guides, prompts, and workflows
- **[API Reference](docs/api/index.md)** – Python API documentation for programmatic use
- **[Architecture](docs/architecture/index.md)** – Design decisions, ADRs, and system overview
- **[Development](docs/development/index.md)** – Contributing guidelines and development setup
- **[Research](docs/research/index.md)** – Research notes, experiments, and background
- **[Documentation Operations](docs/docs-ops/index.md)** – Documentation roadmap and maintenance

## Architecture Overview

- Documentation strategy: [ADR-DD01](docs/architecture/docs-system/adr/adr-dd01-docs-reorg-strategy.md) and [ADR-DD02](docs/architecture/docs-system/adr/adr-dd02-main-content-nav.md)
- GenAI, transcription, and prompt system ADRs live under [Architecture](docs/architecture/index.md) (see ADR-A*, ADR-TR*, ADR-PT*).
- System design references: [Object–Service Design](docs/architecture/object-service/object-service-design-overview.md) and [System Design](docs/development/system-design.md).

## Development

**Common commands:**

- `make setup-dev` - Full development environment setup
- `make build-all` - Full rebuild (poetry update, yt-dlp, pipx tools, docs)
- `make update` - Update dependencies and reinstall pipx tools
- `make pipx-build` - Install CLI tools globally via pipx (editable mode)
- `make test`, `make lint`, `make format` - Testing and code quality
- `make docs`, `make ci-check` - Documentation and CI validation
- `poetry run mypy src/` - Type checking

**CLI Tool Access:**

All CLI tools can be installed globally via pipx for easy access in any shell:

```bash
make pipx-build  # Installs: audio-transcribe, tnh-gen, ytt-fetch, token-count, nfmt, etc.
```

**Optional dependency groups (development only):** `tnh-scholar[ocr]`, `tnh-scholar[gui]`, `tnh-scholar[query]`, `tnh-scholar[dev]`

**Troubleshooting and workflows:** [DEV_SETUP.md](DEV_SETUP.md)

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for coding standards, testing expectations, and PR workflow. We welcome contributions from practitioners, developers, and scholars who resonate with a human-centered, mindful, transparent, and careful approach to AI-supported research.

## Project Status

TNH Scholar is currently in **alpha stage** (v0.4.1, part of the bootstrap-oriented `0.4.x` series). Expect ongoing API and workflow changes during active development.

## Support & Community

- Bug reports & feature requests: [GitHub Issues](https://github.com/aaronksolomon/tnh-scholar/issues)
- Questions & discussions: [GitHub Discussions](https://github.com/aaronksolomon/tnh-scholar/discussions)

## Documentation Map

For an auto-generated list of every document (titles and metadata), see the [Documentation Index](docs/documentation_index.md).

## License

This project is licensed under the [GNU General Public License v3.0 only (GPL-3.0-only)](LICENSE).

---

**For more information, visit the [full documentation](https://aaronksolomon.github.io/tnh-scholar/) or explore the [source code](https://github.com/aaronksolomon/tnh-scholar).**

