Metadata-Version: 2.4
Name: agent-uniformity
Version: 0.1.0
Summary: Reference implementation for the Agent Almanac Code Uniformity benchmark
Project-URL: Homepage, https://github.com/saucam/agent-uniformity
Project-URL: Methodology, https://github.com/saucam/agent-uniformity-q2-2026/blob/main/methodology.md
Project-URL: Report, https://github.com/saucam/agent-uniformity-q2-2026
Project-URL: Dataset, https://huggingface.co/datasets/saucam/agent-uniformity-q2-2026
Author-email: Yash Datta <yash@saucam.dev>
License: MIT
License-File: LICENSE
Keywords: agent-almanac,ai-authorship,benchmark,code-search,structural-uniformity
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Requires-Dist: click>=8.1
Requires-Dist: pydantic<3,>=2.7
Requires-Dist: radon>=6.0
Requires-Dist: semble>=0.1.3
Requires-Dist: tree-sitter-language-pack<1.8,>=1.6
Requires-Dist: tree-sitter<0.26,>=0.23
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Description-Content-Type: text/markdown

# agent-uniformity

> Reference implementation for the [Agent Almanac Code Uniformity benchmark](https://github.com/saucam/agent-uniformity-q2-2026).
> The same code path that produced the published numbers: install it, point it at any of the 48 sampled repos, and get matching results.

[![PyPI version](https://img.shields.io/pypi/v/agent-uniformity)](https://pypi.org/project/agent-uniformity/)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[![Methodology](https://img.shields.io/badge/methodology-v0.1.0-green)](methodology.md)

## What this is

A Python package that implements every step of the Agent Almanac Code Uniformity benchmark: function extraction (Python ast + tree-sitter for TS/JS/Go/Rust), AI authorship detection via `git blame` against Claude-tagged commits, semble-driven similarity scoring, the six per-repo evaluators, and the multi-repo aggregation that produces the headline hypothesis tests.

The locked task list (48 repos with HEAD SHAs from the inaugural run) ships with the package. You can re-run any single repo against the locked SHA and verify the numbers we published, or re-run the whole sample sequentially and compare your aggregates against ours.

## Install

```bash
pip install agent-uniformity
```

Requires Python 3.11 or 3.12 (semble's tree-sitter pin doesn't currently support 3.13). Installing it brings in `semble`, `radon`, `tree-sitter-language-pack` (pinned to <1.8 — newer versions have a broken Python API), and `pydantic`.

## Quick verification — reproduce one repo

```bash
agent-uniformity tasks                                     # list the 48 task IDs
agent-uniformity run-one davila7-claude-code-templates --output ./out
```

This clones the repo at the SHA in `tasks/q2-2026.json`, runs the analysis pipeline, and writes `./out/davila7-claude-code-templates.json` containing every per-function metric (similarity scores, AI ratio, complexity, etc.) and per-repo evaluator scores.
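The result file can be inspected with nothing but the standard library. A minimal sketch (the `load_repo_output` helper is illustrative, not part of the package, and it assumes the JSON serializes the same `RepoOutput` fields the library exposes, e.g. `function_count` and `repo_ai_ratio_observed`):

```python
import json
from pathlib import Path

def load_repo_output(path: Path) -> dict:
    """Parse a per-repo result JSON written by `run-one`."""
    return json.loads(path.read_text())

# Example (path from the run above):
# out = load_repo_output(Path("./out/davila7-claude-code-templates.json"))
# print(out["function_count"], out["repo_ai_ratio_observed"])
```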

Compare key numbers against [`agent-uniformity-q2-2026/analysis/summary.csv`](https://github.com/saucam/agent-uniformity-q2-2026/blob/main/analysis/summary.csv). Expected variance: ~1% from semble's BM25 non-determinism.
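Because of that non-determinism, an exact diff against the published CSV is the wrong check; a relative-tolerance comparison along these lines (the helper is a sketch, not part of the package) is more robust:

```python
def within_tolerance(observed: float, published: float, rel: float = 0.01) -> bool:
    """True when `observed` is within `rel` relative tolerance of `published`."""
    if published == 0.0:
        return observed == 0.0
    return abs(observed - published) / abs(published) <= rel

# within_tolerance(0.503, 0.500)  -> True  (0.6% off)
# within_tolerance(0.520, 0.500)  -> False (4% off)
```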

## Run the full benchmark

```bash
agent-uniformity run-all --output ./out         # ~6-8 hours sequential on a laptop
agent-uniformity aggregate ./out                # writes ./out/analysis/*.csv
agent-uniformity deep ./out                     # writes H4 + H5 CSVs
```

## Use as a library

```python
from pathlib import Path
from agent_uniformity import analyze_repo, runner

# Run one task by ID
tasks = {t.task_id: t for t in runner.load_tasks()}
result = runner.run_one(tasks["dora-rs-dora"], Path("./out"))
print(result.output.function_count, result.output.repo_ai_ratio_observed)

# Or analyze an arbitrary repo you've already cloned
out = analyze_repo(
    repo=Path("./my-repo"),
    repo_slug="myorg/my-repo",
    base_sha="HEAD",
)
print(out.function_count)
```

## What's in this repo

```
agent-uniformity/
├── methodology.md             frozen v0.1.0 — pre-registered hypotheses + metrics
├── tasks/q2-2026.json         the 48 sampled repos with locked HEAD SHAs
├── agent_uniformity/
│   ├── extract.py             function extraction (Python ast, tree-sitter)
│   ├── blame.py               git blame + AI commit detection
│   ├── analyze.py             per-repo metrics + 6 evaluators
│   ├── aggregate.py           multi-repo summary CSVs + hypothesis tests
│   ├── deep.py                H4 + H5 deep-pass analyses
│   ├── runner.py              sequential runner (clone → checkout → analyze)
│   ├── schema.py              FunctionFact, RepoOutput pydantic models
│   └── cli.py                 click-based CLI
└── tests/
```

## Versioning policy

The package version (currently `0.1.0`) tracks the methodology version. A change to:

- the metric definitions
- the evaluator formulas
- the hypothesis-test logic
- the function-extraction filters
- the AI-commit detection signals

…requires a methodology version bump and a corresponding package release. Old methodology versions remain installable from PyPI; old reports cite specific package versions.

## Related repos

- [`saucam/agent-uniformity-q2-2026`](https://github.com/saucam/agent-uniformity-q2-2026) — the published report (methodology, sampling, analysis CSVs, hypothesis tests)
- [`huggingface.co/datasets/saucam/agent-uniformity-q2-2026`](https://huggingface.co/datasets/saucam/agent-uniformity-q2-2026) — raw per-repo partials (~120 MB) and `report.json`
- [`agentalmanac.com`](https://agentalmanac.com) — Agent Almanac publication

## License

MIT — see [`LICENSE`](LICENSE). Each upstream repository referenced in `tasks/q2-2026.json` retains its original license.

## Citation

```
Datta, Y. (saucam). (2026). agent-uniformity: reference implementation for the
Agent Almanac Code Uniformity benchmark. https://github.com/saucam/agent-uniformity
```
