Metadata-Version: 2.4
Name: repo-guide
Version: 0.4.2
Summary: Uses AI to help understand repositories and their changes.
Project-URL: Homepage, https://github.com/wolfmanstout/repo-guide
Project-URL: Changelog, https://github.com/wolfmanstout/repo-guide/releases
Project-URL: Issues, https://github.com/wolfmanstout/repo-guide/issues
Project-URL: CI, https://github.com/wolfmanstout/repo-guide/actions
Author: James Stout
License-File: LICENSE
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.11
Requires-Dist: bleach-allowlist>=1.0.3
Requires-Dist: bleach>=6.2.0
Requires-Dist: click
Requires-Dist: gitpython>=3.1.43
Requires-Dist: llm-gemini>=0.5a0
Requires-Dist: llm>=0.19a0
Requires-Dist: mkdocs-material>=9.5.46
Requires-Dist: mkdocs>=1.6.1
Requires-Dist: tiktoken>=0.8.0
Requires-Dist: tqdm>=4.67.1
Provides-Extra: magika
Requires-Dist: magika>=0.6.0rc3; extra == 'magika'
Description-Content-Type: text/markdown

# repo-guide

[![PyPI](https://img.shields.io/pypi/v/repo-guide.svg)](https://pypi.org/project/repo-guide/)
[![Changelog](https://img.shields.io/github/v/release/wolfmanstout/repo-guide?include_prereleases&label=changelog)](https://github.com/wolfmanstout/repo-guide/releases)
[![Tests](https://github.com/wolfmanstout/repo-guide/actions/workflows/test.yml/badge.svg)](https://github.com/wolfmanstout/repo-guide/actions/workflows/test.yml)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/wolfmanstout/repo-guide/blob/master/LICENSE)

Use AI to generate guides to code repositories.

> _NEO:_ Can you fly that thing?  
> _TRINITY:_ Not yet. Tank, I need a pilot program for a military B-212 helicopter. Hurry!  
> _[seconds later ...]_  
> _TRINITY:_ Let's go.
>
> — _The Matrix (1999)_

You can see the output of repo-guide on its own repository at
https://wolfmanstout.github.io/repo-guide/. This guide is automatically generated
and published after every release. As of 1/4/2025, each run consumes under 25K
tokens, which would cost less than 1 cent on Gemini 1.5 Flash (the published
guide actually uses Gemini 2.0 Experimental, which is currently free).

**NOTE**: The guides generated by `repo-guide` are intended to be complementary to
human-authored documentation, not replacements.

## Installation

Install this tool using `pip`, `pipx`, or `uv tool install`, e.g.:

```bash
pip install repo-guide
```

By default it uses Gemini Flash as the AI model, which [requires an API
key](https://ai.google.dev/gemini-api/docs/api-key). You can either set the
`LLM_GEMINI_KEY` environment variable or [install Simon Willison's
LLM command-line tool](https://llm.datasette.io/) and use [`llm keys set
gemini`](https://llm.datasette.io/en/stable/setup.html#saving-and-using-stored-keys)
to store your key in a configuration file.
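
As a rough illustration, the two options might look like this in a shell
session. The key value is a placeholder, not a real key, and the second option
is left commented out because it prompts for input.

```bash
# Option 1: set the key for the current shell session only.
# "your-api-key-here" is a placeholder; substitute your own key.
export LLM_GEMINI_KEY="your-api-key-here"

# Option 2: store the key in the llm tool's configuration file instead.
# This prompts for the key interactively, so it is commented out here.
# llm keys set gemini
```

With option 1 the key is only available in the current session, so you would
need to set it again (or add it to your shell profile) in new terminals.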

**DISCLAIMER**: LLM API calls may cost money. Although this tool displays token
counts and provides methods to limit token usage, you are ultimately responsible
for any costs incurred, including costs that may be higher than expected due to
bugs in this tool. Consider setting hard limits or other protections in your API
accounts where possible.

## Usage

This tool currently only supports Git repositories, with some additional
features for GitHub repositories (e.g. links to files).

Typical usage:

```bash
repo-guide <path_to_cloned_repo_or_subdirectory>
```

This will create a `generated_docs` directory within the current directory,
populate it with an AI-generated Markdown guide, then run a local
[MkDocs](https://www.mkdocs.org/) server at `localhost:8000` to serve the docs.
While generating, it shows a progress bar that includes how many tokens the
model has used (input and output combined). You can start viewing the docs
immediately, and the page will automatically reload as new docs are generated.

If you stop the server and need to restart it later, the tool reuses any
previously generated Markdown files by default, so you can simply rerun the same
command. Add `--no-resume` to delete and regenerate the files, or `--no-gen` to
explicitly disable doc generation (e.g. even if a new directory has been added).

Alternatively, you can add `--no-serve` when building the guide and run
[MkDocs](https://www.mkdocs.org/) directly to host the files. To do so, install
MkDocs and the necessary dependencies (e.g. with `uv tool
install mkdocs --with mkdocs-material,bleach,bleach-allowlist`), then run `mkdocs
serve -f generated_docs/mkdocs.yml`. With this approach, you can also [deploy
the docs to other hosting
platforms](https://www.mkdocs.org/user-guide/deploying-your-docs/#deploying-your-docs).
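
The `--no-serve` workflow described above can be sketched as a small shell
helper. The function name `build_and_serve` is hypothetical, and the sketch
assumes the default `generated_docs` output directory and that MkDocs has
already been installed as described.

```bash
# One-time setup (from the instructions above), shown here as a comment:
#   uv tool install mkdocs --with mkdocs-material,bleach,bleach-allowlist

# Hypothetical helper: generate the guide without serving it, then serve the
# result with a separately installed MkDocs.
# Usage: build_and_serve path/to/cloned/repo
build_and_serve() {
  # Generate Markdown into ./generated_docs without starting a server.
  repo-guide --no-serve "$1" || return 1
  # Serve the generated site using the config file in the output directory.
  mkdocs serve -f generated_docs/mkdocs.yml
}
```

Defining the helper does not run anything; call it with the path to a cloned
repository when you are ready to generate and serve the docs.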

Here are some of the most common flags you may want to use:

- `--output-dir`: Changes where the generated docs are written.
- `-v` or `--verbose`: Prints details on doc generation progress instead of a
  progress bar.
- `--model`: Sets the LLM model to use. As of 1/1/2025, the default is Gemini
  1.5 Flash, which costs under a dollar for 10 million tokens and has a 1
  million token context window, making it a great fit. You can try other Gemini
  models such as `gemini-2.0-flash-exp`, or OpenAI models if `OPENAI_API_KEY` is
  set, as supported by [simonw/llm](https://github.com/simonw/llm) and
  [simonw/llm-gemini](https://github.com/simonw/llm-gemini).
- `--token-budget`: Sets an approximate token budget to avoid overspending.
  Tokens are counted after each LLM call, so the actual number may be higher.
- `--custom-instructions` and `--custom-instructions-file`: Use either of these
  to append custom instructions to the system prompt. Let me know if you come up
  with something that significantly improves the general result quality!
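
As a sketch, several of these flags might be combined as follows. The function
name `guide_with_budget`, the output directory `my_docs`, and the budget of
100000 tokens are illustrative assumptions; check `repo-guide --help` for the
exact argument formats.

```bash
# Hypothetical wrapper combining several common flags.
# Usage: guide_with_budget path/to/cloned/repo
# Writes docs to ./my_docs, uses an alternative Gemini model, stops at an
# approximate budget of 100000 tokens, and prints verbose progress.
guide_with_budget() {
  repo-guide \
    --output-dir my_docs \
    --model gemini-2.0-flash-exp \
    --token-budget 100000 \
    --verbose \
    "$1"
}
```

Because tokens are counted after each LLM call, actual usage may end up
somewhat above the budget passed here.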

For a full description of command line flags, run:

```bash
repo-guide --help
```

You can also use:

```bash
python -m repo_guide --help
```

## Troubleshooting

If the command fails, whether due to an error or because it hit the token
budget, simply rerun it and it will resume and retry (unless `--no-resume` is
passed). Most common model errors (e.g. rate limiting) should be automatically
retried with exponential backoff. Use `--ignore` to skip large generated or
binary files that aren't automatically filtered out (the tool automatically
respects `.gitignore` files and ignores files annotated in `git ls-files --eol`
as non-text). If you still hit the model token limit, try setting
`--files-token-limit`, which is applied per directory.
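
A minimal sketch of these two options used together is shown below. The
function name `guide_large_repo` is hypothetical, and both the `data/` value
passed to `--ignore` and the limit of 200000 are placeholder assumptions, since
the accepted argument formats are not described here; confirm them with
`repo-guide --help`.

```bash
# Hypothetical helper for repositories that exceed the model's token limit.
# Usage: guide_large_repo path/to/cloned/repo
# Both flag values are placeholders (assumptions); confirm the accepted
# formats with `repo-guide --help` before relying on them.
guide_large_repo() {
  repo-guide \
    --ignore "data/" \
    --files-token-limit 200000 \
    "$1"
}
```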

LLMs are unpredictable, and the generated Markdown may contain errors and broken
links. The system prompt tries to mitigate common issues, but they happen
anyway. The only real fix for this will be better models, which will surely come
soon.

## Development

To contribute to this tool, use [uv](https://docs.astral.sh/uv/). The following
command will establish the venv and run tests:

```bash
uv run pytest
```

To run repo-guide locally, use:

```bash
uv run repo-guide
```
