Metadata-Version: 2.4
Name: code-maat-python
Version: 0.2.0
Summary: Modern Python tool for mining and analyzing version control system data
License: GPL-3.0
License-File: LICENSE
Keywords: vcs,git,analysis,mining,coupling,churn
Author: Cameron Yick
Author-email: cameron.yick@gmail.com
Requires-Python: >=3.10,<4.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Version Control
Requires-Dist: click (>=8.1.0,<9.0.0)
Requires-Dist: pandas (>=2.0.0,<3.0.0)
Requires-Dist: python-dateutil (>=2.8.0,<3.0.0)
Project-URL: Homepage, https://github.com/hydrosquall/code-maat-python
Project-URL: Repository, https://github.com/hydrosquall/code-maat-python
Description-Content-Type: text/markdown

# code-maat

[![License: GPL-3.0](https://img.shields.io/badge/License-GPL%203.0-blue.svg)](https://opensource.org/licenses/GPL-3.0)
[![Python: 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)

A Python implementation of [Code Maat](https://github.com/adamtornhill/code-maat) by Adam Tornhill. Analyzes Git repository history to identify code coupling, change patterns, and team dynamics.

## Overview

code-maat extracts insights from version control history:

- Temporal coupling: Files that change together frequently
- Code churn: Change frequency and stability metrics
- Knowledge distribution: Contributor patterns and code ownership
- Team coordination: Communication needs inferred from code changes

These analyses help identify architectural dependencies that are not visible in the static code structure, inform code review priorities, and show how teams interact with the code over time.

## Why Use This Tool?

Version control history contains information about code dependencies that aren't captured by static analysis. Files that frequently change together often share hidden coupling through business logic, data structures, or implicit contracts, even when they don't import each other directly.

Research shows that files with high temporal coupling have 2-10x higher defect rates ([Predicting Faults from Cached History, MSR 2008](https://dl.acm.org/doi/10.1145/1453101.1453106)). Understanding these patterns helps with:

- Prioritizing code review effort on coupled changes
- Identifying coordination requirements between team members
- Finding knowledge silos and ownership concentration
- Discovering stable versus volatile areas of the codebase

## Installation

```bash
# Using pip
pip install code-maat-python

# From source with Poetry
git clone https://github.com/hydrosquall/code-maat-python.git
cd code-maat-python
poetry install
```

## Quick Start

Generate a Git log in the required format:

```bash
cd your-project
git log --all -M -C --numstat --date=short --pretty=format:'--%h--%cd--%cn' > git.log
```
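To make the log format concrete, here is a minimal sketch of how such a file could be read in Python. It assumes only what the `git log` command above produces: a header line of the form `--<hash>--<date>--<name>` followed by tab-separated `--numstat` rows. The names `Change` and `parse_git_log` are hypothetical and are not part of this package's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One file touched by one commit (hypothetical record type)."""
    rev: str
    date: str
    author: str
    entity: str
    added: Optional[int]    # None when numstat prints "-" (binary file)
    deleted: Optional[int]


def parse_git_log(text: str) -> list:
    """Parse the '--%h--%cd--%cn' + --numstat format into Change records."""
    changes = []
    rev = date = author = None
    for line in text.splitlines():
        if line.startswith("--"):
            # Header line. maxsplit=2 keeps any "--" inside the name intact.
            rev, date, author = line[2:].split("--", 2)
        elif line.strip() and rev is not None:
            # numstat row: <added> TAB <deleted> TAB <path>
            # Rename notation such as "{old => new}" is left untouched here.
            added, deleted, entity = line.split("\t", 2)
            changes.append(Change(
                rev, date, author, entity,
                None if added == "-" else int(added),
                None if deleted == "-" else int(deleted),
            ))
    return changes


# Made-up sample input in the same shape as git.log
sample = (
    "--a1b2c3d--2024-03-01--Alice\n"
    "10\t2\tsrc/models/user.py\n"
    "4\t1\tsrc/views/profile.py\n"
    "\n"
    "--d4e5f6a--2024-03-02--Bob\n"
    "-\t-\tdocs/logo.png\n"
)
changes = parse_git_log(sample)
for change in changes:
    print(change)
```

The sketch is only meant to show what information each line carries; the tool's own parser may treat renames, binary files, and malformed lines differently.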

Run a coupling analysis to find files that frequently change together:

```bash
code-maat coupling git.log --min-coupling 50 --rows 10
# Or: python -m code_maat_python coupling git.log --min-coupling 50 --rows 10
```

Example output:

```csv
entity,coupled,degree,average-revs
src/models/user.py,src/views/profile.py,87,45
src/api/auth.py,src/middleware/session.py,76,32
src/utils/validators.py,src/forms/registration.py,65,28
```

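In the first row, `src/models/user.py` and `src/views/profile.py` have a coupling degree of 87. Following the original Code Maat's definition, this means the commits that touch both files amount to roughly 87% of the pair's average revision count, which the `average-revs` column reports as 45. High coupling may indicate shared responsibilities or implicit dependencies worth reviewing.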
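The following sketch shows one way a coupling degree like the one above can be derived. It assumes the definition used by the original Code Maat (shared revisions divided by the average revision count of the two files, as a percentage). The function name `coupling` and the sample data are hypothetical, and the code is not this package's implementation; thresholds such as `--min-coupling` and any rounding are not modeled.

```python
from collections import Counter
from itertools import combinations


def coupling(commits):
    """Return (entity, coupled, degree, average_revs) rows, highest degree first.

    `commits` maps a revision id to the set of files changed in that revision.
    """
    revs = Counter()    # number of commits touching each file
    shared = Counter()  # number of commits touching both files of a pair
    for files in commits.values():
        revs.update(files)
        # Sort so each pair is always counted under the same (a, b) key.
        shared.update(combinations(sorted(files), 2))

    rows = []
    for (a, b), together in shared.items():
        average = (revs[a] + revs[b]) / 2
        degree = 100 * together / average
        rows.append((a, b, degree, average))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up history: profile.py and user.py change together in 3 commits.
commits = {
    "r1": {"profile.py", "user.py"},
    "r2": {"profile.py", "user.py"},
    "r3": {"profile.py", "user.py"},
    "r4": {"profile.py", "auth.py"},
    "r5": {"user.py"},
}
rows = coupling(commits)
for row in rows:
    print(row)
```

In this sample, `profile.py` and `user.py` each have 4 revisions and share 3 of them, so the degree is 3 / 4 = 75%. The pair `auth.py` and `profile.py` shares 1 commit against an average of 2.5 revisions, giving 40%.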

## Available Analyses

code-maat provides 17 analysis types:

| Command | Description | Use Case |
|---------|-------------|----------|
| `coupling` | Temporal coupling between file pairs | Identify files that frequently change together |
| `soc` | Sum of coupling scores per entity | Aggregate coupling metric for prioritization |
| `entity-churn` | Lines added/deleted per entity over time | Measure change volatility |
| `age` | Time since last modification | Identify stable or abandoned code |
| `communication` | Shared code changes between authors | Map coordination requirements |
| `authors` | Number of distinct authors per entity | Measure knowledge distribution |
| `entity-ownership` | Line contribution percentage by author | Determine primary code owners |
| `main-dev` | Primary contributor by lines changed | Identify domain experts |
| `main-dev-by-revs` | Primary contributor by commit count | Identify active maintainers |
| `refactoring-main-dev` | Primary contributor by lines deleted | Track code quality effort |
| `revisions` | Commit count per entity | Find frequently modified files |
| `abs-churn` | Aggregate churn over time windows | Track development activity trends |
| `author-churn` | Per-author contribution over time | Analyze individual contribution patterns |
| `entity-effort` | Revisions per author per entity | Quantify development effort distribution |
| `fragmentation` | Author distribution per entity | Detect coordination overhead |
| `summary` | Repository-wide statistics | Overview metrics |
| `entities` | List of all tracked entities | Verify log parsing scope |

See [REFERENCE.md](REFERENCE.md) for detailed command documentation.
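As a rough illustration of what a few of these metrics measure, the sketch below computes revision counts, distinct authors, and a main developer by added lines from a handful of made-up log rows. The row layout and the function names `revisions`, `authors`, and `main_dev` are hypothetical stand-ins that mirror the command names; the tool's actual output columns and tie-breaking rules are described in REFERENCE.md and are not reproduced here.

```python
from collections import Counter, defaultdict

# Hypothetical parsed log rows: (revision, author, entity, lines_added)
rows = [
    ("r1", "Alice", "src/api/auth.py", 40),
    ("r2", "Bob", "src/api/auth.py", 10),
    ("r3", "Alice", "src/api/auth.py", 5),
    ("r3", "Alice", "src/utils/validators.py", 12),
]


def revisions(rows):
    """Count the distinct commits that touched each entity."""
    seen = defaultdict(set)
    for rev, _author, entity, _added in rows:
        seen[entity].add(rev)
    return {entity: len(revs) for entity, revs in seen.items()}


def authors(rows):
    """Count the distinct authors that touched each entity."""
    seen = defaultdict(set)
    for _rev, author, entity, _added in rows:
        seen[entity].add(author)
    return {entity: len(names) for entity, names in seen.items()}


def main_dev(rows):
    """Pick the author with the most added lines per entity.

    Returns {entity: (author, lines_added_by_author, total_lines_added)}.
    """
    added = defaultdict(Counter)
    for _rev, author, entity, lines in rows:
        added[entity][author] += lines
    result = {}
    for entity, per_author in added.items():
        author, lines = per_author.most_common(1)[0]
        result[entity] = (author, lines, sum(per_author.values()))
    return result


print(revisions(rows))
print(authors(rows))
print(main_dev(rows))
```

Here `src/api/auth.py` has 3 revisions by 2 authors, and Alice contributed 45 of its 55 added lines, which is the kind of signal the ownership and main-developer analyses surface.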

## Documentation

- [Setup Guide](docs/SETUP.md) — Git log format, shell aliases, advanced options
- [Use Cases & Best Practices](docs/USE_CASES.md) — Application scenarios and analysis patterns
- [Command Reference](REFERENCE.md) — Complete command documentation
- [FAQ](docs/FAQ.md) — Common questions and further reading
- [Contributing](CONTRIBUTING.md) — Contribution guidelines

## License

GPL-3.0 — see [LICENSE](LICENSE) for details.

This project is inspired by and compatible with [Code Maat](https://github.com/adamtornhill/code-maat) by Adam Tornhill.

