Metadata-Version: 2.4
Name: document-integrity-layer
Version: 0.1.0
Summary: Git pre-commit hook + web dashboard for detecting LLM-induced document corruption
Author-email: Zach <zach@example.com>
License: MIT
Requires-Python: >=3.10
Requires-Dist: aiosqlite>=0.19.0
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: click>=8.1.7
Requires-Dist: fastapi>=0.109.0
Requires-Dist: gitpython>=3.1.40
Requires-Dist: jinja2>=3.1.3
Requires-Dist: markdown>=3.5.0
Requires-Dist: pymupdf>=1.23.0
Requires-Dist: python-docx>=1.1.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: python-multipart>=0.0.6
Requires-Dist: rich>=13.7.0
Requires-Dist: sqlalchemy>=2.0.25
Requires-Dist: stripe>=7.11.0
Requires-Dist: uvicorn[standard]>=0.27.0
Provides-Extra: dev
Requires-Dist: black>=23.12.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest>=7.4.0; extra == 'dev'
Requires-Dist: ruff>=0.1.9; extra == 'dev'
Description-Content-Type: text/markdown

# Document Integrity Layer

**Detect LLM-induced document corruption before it ships.**

## What is this?

Document Integrity Layer is a Git pre-commit hook and web dashboard that catches document corruption introduced by AI assistants before it is committed. It scans Word, PDF, and Markdown files for hallucinated citations, broken cross-references, malformed tables, and formatting inconsistencies, then generates an audit trail recording what changed and why it was flagged. It is built for developers and technical writers who delegate writing to Claude, ChatGPT, or similar tools and need to verify that the AI did not silently break their work.

## Features

- **Git pre-commit scanning** — Automatically checks staged documents before commits
- **Multi-format support** — Analyzes .docx, .pdf, and .md files with semantic understanding
- **Corruption detection** — Identifies hallucinated URLs, broken internal links, table structure corruption, and citation inconsistencies
- **Web dashboard** — Visual history of all integrity checks across your repository
- **Audit trails** — Export compliance-ready reports showing what changed and when
- **Slack/Discord alerts** — Real-time notifications when corruption is detected
- **Custom rule configuration** — Define project-specific validation rules in `.dil.toml`
- **Docker-ready** — Run locally or containerized; no external dependencies required
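
To make the "broken internal links" check concrete, here is a minimal sketch for Markdown files. It is not the package's implementation: `slugify` and `find_broken_anchors` are hypothetical names, and the slug rule only approximates GitHub-style heading anchors.

```python
import re

# Hypothetical helpers for illustration; not part of the package's API.
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)
ANCHOR_LINK_RE = re.compile(r"\[[^\]]*\]\(#([^)\s]+)\)")


def slugify(heading: str) -> str:
    """Approximate GitHub-style anchors: lowercase, drop punctuation, spaces to hyphens."""
    text = re.sub(r"[^\w\s-]", "", heading.strip().lower())
    return re.sub(r"\s+", "-", text)


def find_broken_anchors(markdown_text: str) -> list[str]:
    """Return anchor targets that are linked to but match no heading."""
    known = {slugify(h) for h in HEADING_RE.findall(markdown_text)}
    return [t for t in ANCHOR_LINK_RE.findall(markdown_text) if t.lower() not in known]


sample = """# Quick Start

See [setup](#quick-start) and [the API](#api-reference).
"""
print(find_broken_anchors(sample))  # → ['api-reference']
```

This sketch treats any line starting with `#` as a heading, so comments inside fenced code blocks would be counted too; a real checker would need to skip fenced regions.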

## Quick Start

### Installation

```bash
pip install document-integrity-layer
```

### Setup

Initialize in your repository:

```bash
dil init
```

This creates `.dil.toml` with default configuration. Install the Git hook:

```bash
dil install-hook
```

### Configuration

Edit `.dil.toml` to customize detection rules:

```toml
[scanner]
check_citations = true
check_cross_references = true
check_table_integrity = true
check_formatting = true

[alerts]
slack_webhook = "https://hooks.slack.com/services/YOUR/WEBHOOK"
```
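
As an illustration of how a file like this could be consumed, the sketch below parses the TOML and overlays it on a set of defaults. The `load_config` function and the assumption that every check defaults to on are hypothetical; the tool's actual defaults are whatever `dil init` writes.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:  # Python 3.10 needs the third-party "tomli" package
    import tomli as tomllib

# Assumed defaults for illustration: every check on, no alert webhook configured.
DEFAULTS = {
    "scanner": {
        "check_citations": True,
        "check_cross_references": True,
        "check_table_integrity": True,
        "check_formatting": True,
    },
    "alerts": {},
}


def load_config(text: str) -> dict:
    """Parse .dil.toml content and overlay it on the assumed defaults."""
    parsed = tomllib.loads(text)
    config = {section: dict(values) for section, values in DEFAULTS.items()}
    for section, values in parsed.items():
        if isinstance(values, dict):  # ignore stray top-level keys in this sketch
            config.setdefault(section, {}).update(values)
    return config


config = load_config("[scanner]\ncheck_formatting = false\n")
enabled = sorted(name for name, on in config["scanner"].items() if on)
print(enabled)  # → ['check_citations', 'check_cross_references', 'check_table_integrity']
```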

## Usage

### CLI

Run a one-time scan:

```bash
dil scan document.docx
```

Scan an entire directory:

```bash
dil scan ./docs --recursive
```
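
A rough sketch of what a recursive scan has to do first: walk the directory and keep only the supported formats. `collect_documents` is a hypothetical helper, not the CLI's actual code.

```python
import tempfile
from pathlib import Path

DOCUMENT_SUFFIXES = {".docx", ".pdf", ".md"}


def collect_documents(root: Path, recursive: bool = False) -> list[Path]:
    """Return supported documents under root, sorted for stable output."""
    if root.is_file():
        return [root] if root.suffix.lower() in DOCUMENT_SUFFIXES else []
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates if p.is_file() and p.suffix.lower() in DOCUMENT_SUFFIXES
    )


# Build a throwaway tree to show the difference between flat and recursive scans.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    (root / "a.md").write_text("# A\n")
    (root / "nested" / "b.pdf").write_bytes(b"%PDF-")
    (root / "nested" / "c.txt").write_text("ignored\n")
    flat = [p.name for p in collect_documents(root)]
    deep = [p.name for p in collect_documents(root, recursive=True)]

print(flat, deep)  # → ['a.md'] ['a.md', 'b.pdf']
```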

View integrity history:

```bash
dil history
```
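
The history view implies that findings are persisted somewhere; the Tech Stack section names SQLite for this. The sketch below shows one way such a store could look using the standard-library `sqlite3` module. The `findings` table, its columns, and both helper functions are assumptions for illustration, not the project's actual schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per finding per scan.
SCHEMA = """
CREATE TABLE IF NOT EXISTS findings (
    id INTEGER PRIMARY KEY,
    scanned_at TEXT NOT NULL,
    path TEXT NOT NULL,
    rule TEXT NOT NULL,
    severity TEXT NOT NULL,
    message TEXT NOT NULL
)
"""


def record_finding(conn, path, rule, severity, message):
    """Append one finding with a UTC timestamp."""
    conn.execute(
        "INSERT INTO findings (scanned_at, path, rule, severity, message) "
        "VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), path, rule, severity, message),
    )
    conn.commit()


def history(conn, path=None):
    """Return findings oldest first, optionally limited to one file."""
    query = "SELECT path, rule, severity, message FROM findings"
    params = ()
    if path is not None:
        query += " WHERE path = ?"
        params = (path,)
    return conn.execute(query + " ORDER BY id", params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
record_finding(conn, "docs/guide.md", "cross_reference", "high", "link target #setup not found")
record_finding(conn, "report.docx", "table_integrity", "low", "row 3 has 4 cells, header has 5")
for row in history(conn, path="docs/guide.md"):
    print(row)
```

Keeping the table append-only (inserts, never updates) is what makes it usable as an audit trail.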

### Web Dashboard

Start the dashboard server:

```bash
dil server --port 8000
```

Navigate to `http://localhost:8000` to explore:
- Scan history across commits
- Corruption reports with side-by-side diffs
- Citation and link validation results
- Custom audit exports

### Pre-commit Hook

Once installed, the hook runs automatically:

```bash
git add my-document.docx
git commit -m "Update docs"
# → Pre-commit hook scans my-document.docx
# → Reports corruption if found
# → Blocks commit if severity threshold exceeded
```
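
The flow above can be sketched in a few functions: list the staged files, keep the document formats, and decide whether the findings cross the threshold. The function names and the `low`/`medium`/`high` severity scale are assumptions; only the `git diff` invocation is standard Git.

```python
import subprocess
from pathlib import PurePosixPath

DOCUMENT_SUFFIXES = {".docx", ".pdf", ".md"}
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}  # assumed severity levels


def staged_paths() -> list[str]:
    """List files staged for commit (added, copied, or modified)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def filter_documents(paths: list[str]) -> list[str]:
    """Keep only the document formats the scanner handles."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in DOCUMENT_SUFFIXES]


def should_block(severities: list[str], threshold: str = "high") -> bool:
    """Block when any finding is at or above the threshold."""
    limit = SEVERITY_ORDER[threshold]
    return any(SEVERITY_ORDER[s] >= limit for s in severities)


print(filter_documents(["README.md", "src/app.py", "docs/Spec.DOCX"]))  # → ['README.md', 'docs/Spec.DOCX']
print(should_block(["low", "medium"]))  # → False
print(should_block(["low", "high"]))  # → True
```

A real hook would call `staged_paths()`, scan each filtered file, and exit with a non-zero status when `should_block` returns `True`, since Git aborts the commit whenever the pre-commit hook exits non-zero.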

## Tech Stack

- **Python 3.10+**: Core language
- **FastAPI** (served by Uvicorn): Web dashboard and API
- **python-docx**: DOCX parsing and analysis
- **PyMuPDF**: PDF text extraction
- **Markdown**: Native Markdown support
- **SQLite** (via SQLAlchemy and aiosqlite): Audit trail storage
- **Docker**: Containerized deployment

## License

MIT