Metadata-Version: 2.4
Name: ai-code-quality-auditor
Version: 0.1.0
Summary: Empirical Safety Harness for agentic AI coding systems. Scores AI-generated code on 5 metrics across 5 vendor conditions against one fixed spec.
Author-email: Dominic Rume <dominicrume@gmail.com>
License: MIT License
        
        Copyright (c) 2026 Dominic Rume
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/dominicrume/NEW-enterprise-ai-code-quality-auditor
Project-URL: Repository, https://github.com/dominicrume/NEW-enterprise-ai-code-quality-auditor
Project-URL: Issues, https://github.com/dominicrume/NEW-enterprise-ai-code-quality-auditor/issues
Project-URL: Documentation, https://github.com/dominicrume/NEW-enterprise-ai-code-quality-auditor/tree/main/docs
Keywords: ai,code-quality,llm,agents,evaluation,claude-code,cursor,sonarcloud,dissertation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: click>=8.1
Requires-Dist: rich>=13.0
Requires-Dist: radon>=6.0
Requires-Dist: python-dotenv>=1.0
Requires-Dist: bandit>=1.8
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: ruff>=0.5; extra == "dev"
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Provides-Extra: notebook
Requires-Dist: pandas>=2.0; extra == "notebook"
Requires-Dist: scipy>=1.10; extra == "notebook"
Requires-Dist: matplotlib>=3.7; extra == "notebook"
Requires-Dist: jupyter>=1.0; extra == "notebook"
Provides-Extra: dashboard
Requires-Dist: flask>=3.0; extra == "dashboard"
Provides-Extra: recorder
Requires-Dist: pynput>=1.7; extra == "recorder"
Provides-Extra: all
Requires-Dist: pytest>=8.0; extra == "all"
Requires-Dist: pandas>=2.0; extra == "all"
Requires-Dist: scipy>=1.10; extra == "all"
Requires-Dist: matplotlib>=3.7; extra == "all"
Requires-Dist: jupyter>=1.0; extra == "all"
Requires-Dist: flask>=3.0; extra == "all"
Requires-Dist: pynput>=1.7; extra == "all"
Dynamic: license-file

# AI Code Quality Auditor — the Referee Tool

[![CI](https://github.com/dominicrume/NEW-enterprise-ai-code-quality-auditor/actions/workflows/ci.yml/badge.svg)](https://github.com/dominicrume/NEW-enterprise-ai-code-quality-auditor/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/ai-code-quality-auditor.svg)](https://pypi.org/project/ai-code-quality-auditor/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Live dashboard](https://img.shields.io/badge/live-dashboard-purple)](https://auditor-dashboard.fly.dev)

> An empirical Safety Harness for agentic AI coding systems.
> Quantifies where AI-assisted development fails at governance, security,
> and ethical alignment — *before* the code reaches production.

**🟢 Try it in 30 seconds:**
```bash
pipx install ai-code-quality-auditor
auditor --help
```

**🚀 Or wire it into your CI with a short workflow** (`.github/workflows/auditor.yml`):
```yaml
name: auditor
on: [push, pull_request]  # assumed triggers; adjust to your workflow
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dominicrume/NEW-enterprise-ai-code-quality-auditor@main
        with:
          run-id: ${{ github.run_id }}
          conditions: claude_code,cursor_agent
```

**📊 Live dashboard:** https://auditor-dashboard.fly.dev *(pending deploy)*

This is the experimental instrument for the MSc dissertation
**"AI-Assisted Coding Assessment Tool: Evaluating LLM Performance, Governance,
and Security in an Agent Education System"** (Aston University, MSc AI &
Business Strategy). The same instrument is the working prototype for the
PhD extension at the Aston-Capgemini Centre of Excellence for Enterprise AI.

---

## What it does
Given a fixed specification (the "spec box"), the Auditor:
1. Runs five experimental conditions against the same task, including a
   human control, a visualisation→Claude→Replit pipeline, the Cursor IDE,
   and an autonomous agent.
2. Captures every output and every interaction event.
3. Scores each result on five empirical metrics: security vulnerability
   density, cyclomatic complexity, code duplication, hallucination frequency
   (features outside spec), and keystroke dynamics (correction frequency).
4. Emits CSV/JSON reports for statistical comparison.
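
The scoring and reporting steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the Auditor's own analyzers. The function names (`approx_cyclomatic_complexity`, `vulnerability_density`, `write_report`), the CSV column names, and the "findings per 1,000 lines" definition are all assumptions made for this example, and the complexity figure is a simplified decision-point count rather than the output of radon.

```python
import ast
import csv
import io

# Node types counted as decision points in this simplified approximation.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in `source`."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


def vulnerability_density(findings: int, source: str) -> float:
    """Findings per 1,000 non-blank lines (hypothetical definition)."""
    loc = sum(1 for line in source.splitlines() if line.strip())
    return 0.0 if loc == 0 else findings * 1000 / loc


def write_report(rows: list[dict], out) -> None:
    """Write one CSV row per experimental condition."""
    writer = csv.DictWriter(
        out, fieldnames=["condition", "complexity", "vuln_density"]
    )
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample input, standing in for one captured output.
SAMPLE = '''
def grade(score, bonus):
    if score > 90 and bonus:
        return "A"
    for _ in range(3):
        pass
    return "B"
'''

row = {
    "condition": "human_control",
    "complexity": approx_cyclomatic_complexity(SAMPLE),
    # The finding count is invented for illustration only.
    "vuln_density": vulnerability_density(findings=1, source=SAMPLE),
}
buffer = io.StringIO()
write_report([row], buffer)
print(buffer.getvalue())
```

Writing one row per condition keeps the report flat, so it loads directly into pandas or a spreadsheet for the statistical comparison.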

## Quick start
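```bash
# Clone the repository first; the remaining commands run from its root.
git clone https://github.com/dominicrume/NEW-enterprise-ai-code-quality-auditor.git
cd NEW-enterprise-ai-code-quality-auditor
cp .env.example .env
pip install -e .
auditor run --spec specs/agent_education_system.yaml --workflow human_control
auditor report --out data/reports/
```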

## Read in this order
1. `docs/ARCHITECTURE.md` — how the pieces fit
2. `docs/METHODOLOGY.md` — how an experiment is run
3. `docs/METRICS.md` — what each metric means and how it's computed
4. `docs/ETHICS.md` — GDPR, synthetic data, academic integrity
5. `docs/DISSERTATION_LINKAGE.md` — which folder serves which proposal section
6. `docs/ROADMAP.md` — the PhD extension (API security + enterprise risk)

## Principles
- One analyzer per metric. One adapter per AI workflow. Single responsibility.
- The spec is data, not code — externalised in `specs/` for reproducibility.
- Synthetic data only. No PII, no proprietary corporate records, ever.
- Every analyzer has a test. A green test suite is what makes an experiment trustworthy.
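
As a rough sketch of the "one analyzer per metric, every analyzer has a test" principles, the snippet below defines a small interface and a toy analyzer alongside its test. The names `Analyzer`, `DuplicationAnalyzer`, and `run_analyzers` are hypothetical and are not taken from the project's code, and the duplication measure (share of non-blank lines that repeat an earlier line) is a deliberately simple stand-in for a real duplication metric.

```python
from dataclasses import dataclass
from typing import Protocol


class Analyzer(Protocol):
    """Hypothetical interface: one implementation per metric."""

    metric: str

    def score(self, source: str) -> float: ...


@dataclass
class DuplicationAnalyzer:
    """Share of non-blank lines that repeat an earlier line (toy definition)."""

    metric: str = "code_duplication"

    def score(self, source: str) -> float:
        lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
        if not lines:
            return 0.0
        return (len(lines) - len(set(lines))) / len(lines)


def run_analyzers(analyzers: list[Analyzer], source: str) -> dict[str, float]:
    """Apply each analyzer independently and key the results by metric name."""
    return {a.metric: a.score(source) for a in analyzers}


def test_duplication_analyzer() -> None:
    # Three non-blank lines, one of which repeats an earlier line.
    assert DuplicationAnalyzer().score("x = 1\ny = 2\nx = 1\n") == 1 / 3
    # Empty input must not divide by zero.
    assert DuplicationAnalyzer().score("") == 0.0


test_duplication_analyzer()
results = run_analyzers([DuplicationAnalyzer()], "a = 1\na = 1\n")
print(results)
```

Because each analyzer only depends on the source text it is given, it can be tested in isolation, and adding a metric means adding one class and one test without touching the others.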
