Metadata-Version: 2.4
Name: agent-ci-verify
Version: 0.3.0
Summary: CI/CD verification pipeline for AI agent outputs — fact check, schema validation, diff verification
Author: Lewis-404
License-Expression: MIT
Project-URL: Homepage, https://github.com/Lewis-404/agent-ci-verify
Project-URL: Repository, https://github.com/Lewis-404/agent-ci-verify.git
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pyyaml>=6.0
Requires-Dist: jsonschema>=4.20
Requires-Dist: httpx>=0.27
Requires-Dist: rich>=13.0
Requires-Dist: click>=8.1
Provides-Extra: llm
Requires-Dist: openai>=1.0; extra == "llm"
Requires-Dist: litellm>=1.0; extra == "llm"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov>=5.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.24; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: mypy>=1.8; extra == "dev"
Dynamic: license-file

# agent-ci-verify

> CI/CD verification pipeline for AI agent outputs.  
> **Don't trust your agent's output — verify it.**

[![CI](https://github.com/Lewis-404/agent-ci-verify/actions/workflows/ci.yml/badge.svg)](https://github.com/Lewis-404/agent-ci-verify/actions/workflows/ci.yml)
[![PyPI version](https://img.shields.io/pypi/v/agent-ci-verify.svg)](https://pypi.org/project/agent-ci-verify/)
[![Python](https://img.shields.io/pypi/pyversions/agent-ci-verify.svg)](https://pypi.org/project/agent-ci-verify/)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

[中文](./README_CN.md)

---

## Why agent-ci-verify?

AI agents are entering production, but **no one can answer "can I trust this output?"** 

Most existing tools are "eval libraries": you import them and write the tests yourself. That is self-review, not independent verification.

**agent-ci-verify is your agent's CI/CD pipeline** — plug it in, and every agent output goes through an independent verification layer before it reaches your users.

## Quick Start

```bash
pip install agent-ci-verify
agent-ci ./agent-output/
```

```
agent-ci v0.3.0
Output dir: ./agent-output/
Checkers: schema, fact, diff

                               📋 Schema Checker
┏━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ✅   │ json_valid           │                                                ┃
┃ ✅   │ yaml_valid           │                                                ┃
┃ ✅   │ security_scan        │ No secrets detected                            ┃
┗━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

                               🔍 Fact Checker
┏━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ✅   │ fact:file_count      │ 1 files for '*.json'                           ┃
┃ ✅   │ fact:content_contains│ 'success' found in result.json                 ┃
┗━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

╭────────────────────────────────── Verdict ────────────────────────────────╮
│   ✅  PASS                                                                 │
╰───────────────────────────────────────────────────────────────────────────╯
```

## Three Verification Layers

| Layer | What it checks | Example |
|-------|---------------|---------|
| **Schema** | Format, structure, security | Valid JSON? API key leaked? Required files present? |
| **Fact** | File existence, API reconciliation, LLM judging | Agent claimed `result.json` exists — does it? API returned 200? |
| **Diff** | Regression detection, semantic drift | Output changed vs baseline? Similarity below threshold? |
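
To make the Diff layer's "similarity below threshold" idea concrete, here is a minimal sketch built on Python's standard `difflib`. This README does not say which metric agent-ci-verify uses, so `similarity` and `drifted` are hypothetical helpers, and the character-based ratio below is a lexical stand-in rather than a true semantic measure. The default of 0.7 mirrors the `semantic_threshold` value in the configuration example.

```python
from difflib import SequenceMatcher


def similarity(baseline: str, current: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two texts (character-based)."""
    return SequenceMatcher(None, baseline, current).ratio()


def drifted(baseline: str, current: str, threshold: float = 0.7) -> bool:
    """Flag drift when the similarity falls below the threshold."""
    return similarity(baseline, current) < threshold
```

A small edit such as an added punctuation mark keeps the ratio close to 1.0, while an unrelated output drops it toward 0.0 and is flagged.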

## Configuration

Place a `.agent-ci.yaml` file in your agent project's root directory:

```yaml
pipeline:
  enabled_checkers: [schema, fact, diff]
  fail_fast: false

schema:
  security:
    enabled: true
  required_files:
    - "output/result.json"
  json_schemas:
    schemas/output.schema.json: "output/**/*.json"

fact:
  files:
    - pattern: "output/**/*.json"
      expected_count: 1
      min_size_bytes: 10
      content_checks:
        - type: contains
          value: "success"
        - type: not_contains
          value: "error"
  api:
    - endpoint: "https://api.example.com/health"
      expected_status: 200
  llm_judge:
    - file: "output/answer.md"
      rubric: "Is the answer factually correct?"
      model: "gpt-4o-mini"

diff:
  baseline: "./baseline-output/"
  semantic_threshold: 0.7
  max_changed_files: 5
```
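
As a rough illustration of what one `fact.files` rule asks for, the sketch below evaluates the `pattern`, `expected_count`, `min_size_bytes`, and `content_checks` keys with the standard library only. `check_files` is a hypothetical helper written for this README, not the tool's implementation, and it assumes the rule has already been parsed from YAML into a plain dict.

```python
from pathlib import Path


def check_files(root: Path, rule: dict) -> list[str]:
    """Return failure messages for one `fact.files` rule (empty list = pass)."""
    failures: list[str] = []
    # Path.glob supports "**", so "output/**/*.json" also matches output/result.json.
    matches = sorted(p for p in root.glob(rule["pattern"]) if p.is_file())

    expected = rule.get("expected_count")
    if expected is not None and len(matches) != expected:
        failures.append(f"expected {expected} file(s), found {len(matches)}")

    for path in matches:
        if path.stat().st_size < rule.get("min_size_bytes", 0):
            failures.append(f"{path.name} is smaller than min_size_bytes")
        text = path.read_text(encoding="utf-8", errors="replace")
        for check in rule.get("content_checks", []):
            found = check["value"] in text
            if check["type"] == "contains" and not found:
                failures.append(f"{path.name} is missing {check['value']!r}")
            elif check["type"] == "not_contains" and found:
                failures.append(f"{path.name} contains {check['value']!r}")
    return failures
```

Returning a list of messages instead of a boolean keeps every failed condition visible, which matches the per-check rows shown in the Quick Start output.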

## Security Scanning

Built-in patterns detect:
- AWS Access Keys (`AKIA...`)
- GitHub Tokens (`ghp_...`)
- OpenAI API Keys (`sk-proj-...`)
- JWT Tokens
- Private Keys (RSA, EC, DSA, OpenSSH)
- Password/Secret assignments
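
For a sense of how pattern-based detection of the items above can work, here is a small sketch. The regular expressions are commonly used approximations chosen for illustration; they are not the tool's built-in rules, and `PATTERNS` and `scan` are hypothetical names.

```python
import re

# Illustrative patterns only; NOT the rules shipped with agent-ci-verify.
PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan(text: str) -> list[str]:
    """Return the names of all patterns that match somewhere in the text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]
```

An empty list from `scan` corresponds to the "No secrets detected" row in the Quick Start output.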

## CI Integration

```bash
# JSON output for programmatic parsing
agent-ci --json ./output/ | jq .verdict
# "PASS"

agent-ci --json ./output/ | jq .summary
# {"total_checks": 6, "passed": 5, "warnings": 1, "failed": 0}
```

```yaml
# .github/workflows/agent-check.yml
- name: Verify agent output
  run: |
    # Without pipefail, tee's exit status would mask a non-zero exit from agent-ci.
    set -o pipefail
    pip install agent-ci-verify
    agent-ci --json ./output/ | tee result.json
```
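
If you would rather gate a later step on the saved report than on `jq`, a sketch like the following can turn the JSON into an exit code. It relies only on the `verdict` and `summary` keys shown above. Treating every verdict other than `"PASS"` as a failure is an assumption made here, and `exit_code` is a hypothetical helper.

```python
import json


def exit_code(report_text: str) -> int:
    """Map an `agent-ci --json` report to a process exit code (0 = pass)."""
    report = json.loads(report_text)
    summary = report.get("summary", {})
    print(f"verdict={report.get('verdict')} failed={summary.get('failed', 0)}")
    # Assumption: anything other than "PASS" should block the pipeline.
    return 0 if report.get("verdict") == "PASS" else 1
```

In a workflow step this could be called as `sys.exit(exit_code(Path("result.json").read_text()))`, reading the file written by `tee` above.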

## Plugins

Write custom checkers in any `.py` file:

```python
from agent_ci.checkers import BaseChecker
from agent_ci.types import CheckResult, CheckerReport, Severity

class SizeChecker(BaseChecker):
    name = "size"

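    async def verify(self, output_dir):
        report = CheckerReport(checker_name=self.name)
        # Sum the size of every regular file under the output directory.
        total = sum(f.stat().st_size for f in output_dir.rglob("*") if f.is_file())
        # Read the limit from the `size:` section of .agent-ci.yaml (default: 10 MB).
        limit = self.config.get("size", {}).get("max_bytes", 10_000_000)
        severity = Severity.FAIL if total > limit else Severity.PASS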
        report.checks.append(CheckResult(
            checker=self.name, check_name="size_limit",
            severity=severity,
            message=f"Output size: {total:,} bytes (limit: {limit:,})",
        ))
        return report
```

Configure in `.agent-ci.yaml`:

```yaml
plugins:
  paths:
    - ./checks/

pipeline:
  enabled_checkers: [schema, fact, size]
  parallel: true  # Run all checkers concurrently

size:
  max_bytes: 5000000
```
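
The `parallel: true` option pairs naturally with the `async def verify` signature. The sketch below shows one plausible way concurrent and sequential runs can differ, using `asyncio.gather`. How agent-ci-verify schedules checkers internally is not described here, so `run_checker` and `run_all` are hypothetical stand-ins that sleep instead of doing real verification.

```python
import asyncio


async def run_checker(name: str, delay: float) -> str:
    """Stand-in for an async `verify()` call; sleeps instead of doing real work."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_all(parallel: bool) -> list[str]:
    checkers = [("schema", 0.02), ("fact", 0.01), ("size", 0.01)]
    if parallel:
        # gather() runs the coroutines concurrently and preserves input order.
        return list(await asyncio.gather(*(run_checker(n, d) for n, d in checkers)))
    # Sequential fallback: each checker waits for the previous one to finish.
    return [await run_checker(n, d) for n, d in checkers]
```

Both modes return results in the same order, so a report built from them stays stable regardless of which checker finishes first.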

## Development

```bash
git clone https://github.com/Lewis-404/agent-ci-verify.git
cd agent-ci-verify
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -v
```

## License

MIT — see [LICENSE](./LICENSE)
