Metadata-Version: 2.4
Name: rpo
Version: 0.1.0a0
Summary: Repository Participation Observer. A tool for investigating git repository contribution data and patterns.
Project-URL: Homepage, https://github.com/crlane/rpo
Project-URL: Documentation, https://rpo.readthedocs.io/en/latest/
Project-URL: Repository, https://github.com/crlane/rpo
Project-URL: Bug Tracker, https://github.com/crlane/rpo/issues
Project-URL: Changelog, https://github.com/crlane/rpo/blob/master/CHANGELOG.md
Author-email: Cameron Lane <crlane@adamanteus.com>
Maintainer-email: Cameron Lane <crlane@adamanteus.com>
License-File: LICENSE
Keywords: git,repository,statistics
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.13
Requires-Dist: altair[all]>=5.5.0
Requires-Dist: click>=8.2.1
Requires-Dist: gitpython>=3.1.44
Requires-Dist: polars>=1.30.0
Requires-Dist: pydantic>=2.11.5
Description-Content-Type: text/markdown

# RPO: Repository Participation Observer

A command line tool and Python library to help you analyze and visualize Git repositories. Ever wondered who has the most contributions? How participation has changed over time? Which hotspots in your code change most frequently? Who has the highest bus factor? `rpo` can help.

A note on analyzing code repositories: Attempting to quantify developer productivity by lines of code (or git commits) is generally a bad idea. `rpo` is designed to help you uncover how your code's contribution model has changed over time, and how you can build a more efficient and sustainable software operation. The tools here _might_ tell you something about your development team - but it's even more likely that they'll tell you something about your *management*. Do you have high turnover and/or burnout problems? Are people committing way outside their normal work hours? How are you doing at documentation and knowledge transfer?

All that to say, while I hope this tool will be useful, it is not a substitute for thinking.

## Usage

### CLI

```bash
Usage: rpo [OPTIONS] COMMAND [ARGS]...

Options:
  -g, --glob TEXT            File path glob patterns to INCLUDE. If specified,
                             matching paths will be the only files included in
                             aggregation. If neither --glob nor --xglob are
                             specified, all files will be included in
                             aggregation. Paths are relative to root of
                             repository.
  -xg, --xglob TEXT          File path glob patterns to EXCLUDE. If specified,
                             matching paths will be filtered before
                             aggregation. If neither --glob nor --xglob are
                             specified, all files will be included in
                             aggregation. Paths are relative to root of
                             repository.
  -A, --aggregate-by TEXT
  -I, --identify-by TEXT
  -S, --sort-by TEXT
  -a, --alias-file FILENAME  A JSON file that maps a contributor name to one
                             or more aliases. Useful in cases where authors
                             have used multiple email addresses, names, or
                             spellings to create commits.
  -r, --repository PATH
  -b, --branch TEXT
  --help                     Show this message and exit.

Commands:
  activity-report  Simple commit report aggregated by author or committer
  repo-blame       Computes the per contributor blame for all files at a...
  revisions        List all revisions in the repository
  summary
```
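
The `--alias-file` option above expects a JSON file that maps a contributor name to one or more aliases. The exact schema is not spelled out here, so the following is only a minimal sketch that assumes the simplest reading: a JSON object whose keys are canonical names and whose values are lists of alias strings. The names `ALIASES`, `write_alias_file`, and `canonical_name` are hypothetical helpers for illustration and are not part of `rpo`.

```python
import json
from pathlib import Path

# ASSUMPTION: canonical contributor name -> list of aliases (names or emails).
# rpo's real alias schema may differ; check the project documentation.
ALIASES = {
    "Ada Lovelace": ["ada@example.com", "A. Lovelace", "ada.lovelace@example.org"],
    "Grace Hopper": ["ghopper@example.com"],
}


def write_alias_file(path: Path, mapping: dict[str, list[str]]) -> None:
    """Serialize the alias mapping as indented JSON."""
    path.write_text(json.dumps(mapping, indent=2) + "\n", encoding="utf-8")


def canonical_name(mapping: dict[str, list[str]], identity: str) -> str:
    """Return the canonical name for an identity, or the identity unchanged."""
    for canonical, aliases in mapping.items():
        if identity == canonical or identity in aliases:
            return canonical
    return identity
```

Calling `write_alias_file(Path("aliases.json"), ALIASES)` would produce a file that could then be passed with `-a aliases.json`, provided the assumed shape matches what `rpo` actually reads.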

### Library

```bash
pip install rpo
```

```python
from rpo import Project, Repository


```
## Examples

> NOTE: depending on your shell, you may or may not need to escape the splat character in the glob patterns used below.

### Git Blame for all Files in a Repo at a Given Revision, Identify Authors by Email
```
$ rpo -r ../my-local-repo -I email repo-blame -R HEAD
```


### Author Activity Report, Including Only Files that Match a Pattern
```
$ rpo -r ../my-local-repo -g tests/\* activity-report
```

### Author Activity Report, Excluding Files that Match a Pattern
```
$ rpo -r ../my-local-repo -xg tests/\* activity-report
```

### File Activity Report, Excluding Files that Match a Pattern
```
$ rpo -r ../my-local-repo -xg tests/\* activity-report --files-report
```
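
The include and exclude semantics used in the examples above can be sketched in a few lines of Python. This is not `rpo`'s implementation: it uses the standard library's `fnmatch` rules as a stand-in for whatever glob matching `rpo` performs, and the behavior when both kinds of pattern are given (include first, then exclude) is an assumption. The function name `select_paths` is hypothetical.

```python
from fnmatch import fnmatchcase


def select_paths(paths, include=(), exclude=()):
    """Filter repository-relative paths by include and exclude glob patterns.

    With no patterns at all, every path is kept.
    """
    selected = []
    for path in paths:
        # If include patterns exist, a path must match at least one of them.
        if include and not any(fnmatchcase(path, pat) for pat in include):
            continue
        # A path matching any exclude pattern is dropped.
        if any(fnmatchcase(path, pat) for pat in exclude):
            continue
        selected.append(path)
    return selected


paths = ["src/rpo/main.py", "tests/test_main.py", "README.md"]
only_tests = select_paths(paths, include=["tests/*"])
without_tests = select_paths(paths, exclude=["tests/*"])
```

Note that in `fnmatch`, `*` also matches path separators, so `tests/*` covers nested directories; `rpo` may treat separators differently.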


## Features
- [ ] Automatically generate aliases that refer to the same person
- [x] Support analyzing by glob
- [x] Support excluding by glob
- [ ] Produce blame charts
- [x] Optionally ignore merge commits
- [x] Optionally ignore whitespace
- [ ] Identify major refactorings
- [ ] Fast execution, even on giant repositories


## Performance

The goal is for the library to work even on the largest repositories. In general, the performance is proportional to the number of authors, commits, and files being considered in the aggregations.

The authors regularly [test](./tests/integration/test_cpython_repository.py) using the [cpython repository](https://github.com/python/cpython), which contains over 1,000,000 objects. That takes a while.

> TODO: Performance graphs

## Similar Projects and Inspiration

Thanks to [GitPandas](https://github.com/wdm0006/git-pandas) for inspiration.
