Metadata-Version: 2.2
Name: gh-tokens-loader
Version: 0.1.2
Summary: Simple utility library for loading GitHub tokens from a YAML file, for use in large-scale mining of the GitHub API.
Author-email: Eva Maxfield Brown <evamaxfieldbrown@gmail.com>
License: MPLv2
Project-URL: Homepage, https://github.com/evamaxfield/gh-tokens-loader
Project-URL: Bug Tracker, https://github.com/evamaxfield/gh-tokens-loader/issues
Project-URL: Documentation, https://evamaxfield.github.io/gh-tokens-loader
Project-URL: User Support, https://github.com/evamaxfield/gh-tokens-loader/issues
Classifier: Development Status :: 4 - Beta
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: msgspec<1,>=0.13
Requires-Dist: PyYAML<7,>=6
Provides-Extra: dev
Requires-Dist: ipython; extra == "dev"
Requires-Dist: jupyterlab; extra == "dev"
Provides-Extra: lint
Requires-Dist: pre-commit>=2.20.0; extra == "lint"
Provides-Extra: test
Requires-Dist: pytest<9,>=8; extra == "test"

# GitHub Tokens Loader

A simple utility library for loading GitHub tokens from a structured file that records who each token is linked to and when it expires.

## Usage

### GitHub Tokens File

To use this library, create a file with the following structure:

```yaml
tokens:
  jane:
    token: "github_pat_56..."
    expiration_date: null
  john:
    token: "github_pat_41..."
    expiration_date: "2026-02-13"
  bob:
    token: "ghp_VeaD3..."
    expiration_date: null
  sally:
    token: "ghp_VVv5E..."
    expiration_date: "2023-02-13"
```

Each token has a short name (generally who or what it is linked to), the token itself, and an optional expiration date. The expiration date should be in the format `YYYY-MM-DD`.

⚠️⚠️ **Important:** Do not share this file with anyone else. Keep it secure and private. It is recommended to add the filename to your `.gitignore` as well. ⚠️⚠️

### Loading Tokens

To load the tokens from the file, use the following code:

```python
from gh_tokens_loader import load_github_tokens

tokens = load_github_tokens("path/to/tokens.yaml")
```

This will return only the tokens that are valid (i.e., not expired).
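
To make "valid" concrete, here is a minimal sketch of the kind of expiration check involved. It is not the library's actual implementation: `filter_valid_tokens` and `example_tokens` are hypothetical names, the token strings are placeholders, and treating a token as still valid on its expiration date is an assumption.

```python
from datetime import date

# Hypothetical in-memory mirror of the YAML structure shown earlier;
# the token strings are placeholders, not real tokens.
example_tokens = {
    "jane": {"token": "placeholder-a", "expiration_date": None},
    "john": {"token": "placeholder-b", "expiration_date": "2026-02-13"},
    "sally": {"token": "placeholder-c", "expiration_date": "2023-02-13"},
}


def filter_valid_tokens(tokens: dict, today: date) -> dict:
    """Keep entries with no expiration date or one that has not passed yet."""
    valid = {}
    for name, entry in tokens.items():
        expiration = entry["expiration_date"]
        # Assumption: a token is still treated as valid on its expiration date.
        if expiration is None or date.fromisoformat(expiration) >= today:
            valid[name] = entry["token"]
    return valid


# A fixed "today" keeps the example reproducible.
valid_tokens = filter_valid_tokens(example_tokens, today=date(2025, 1, 1))
print(sorted(valid_tokens))  # ['jane', 'john']
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, makes the check easy to test against known dates.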

## Why

I am a researcher who spends a lot of time mining the GitHub API for data about scientific software. Due to the rate limits on the GitHub API, my research collaborators and I commonly pool our tokens. I found myself copy-pasting some form of this code into different projects and decided to make it a library.

In my own usage, I generally use the `GitHubTokensCycler` that is also built into the library:

```python
import time
from concurrent.futures import ThreadPoolExecutor

from gh_tokens_loader import GitHubTokensCycler

# Load tokens
gh_tokens_cycler = GitHubTokensCycler("path/to/tokens.yaml")

# Imagine some function that uses the GitHub API
def get_repo_data(repo: str, gh_token: str) -> dict:
    # Important to sleep to avoid rate limits
    time.sleep(1)

    # Use the token to get data from the GitHub API
    # ...
    # Return the data
    return {"some": "data"}

# Imagine some list of repos
repos = ["repo1", "repo2", "repo3", "..."]

# Thread with cycling tokens
with ThreadPoolExecutor(max_workers=len(gh_tokens_cycler)) as exe:
    results = list(exe.map(
        get_repo_data,
        repos,
        [next(gh_tokens_cycler) for _ in range(len(repos))],
    ))

# Do something with the results
# ...
```

The `GitHubTokensCycler` will also occasionally refresh its internal set of tokens so that it only includes valid tokens.
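
For readers curious about the general idea, the following is a rough sketch of a round-robin token cycler with a manual refresh step. It is an illustration only and makes no claim about how `GitHubTokensCycler` works internally: the class name `SimpleTokenCycler` and its `refresh` method are hypothetical, and the tokens are placeholders.

```python
from itertools import cycle


class SimpleTokenCycler:
    """Hypothetical round-robin iterator over a pool of tokens."""

    def __init__(self, tokens: list[str]) -> None:
        self.refresh(tokens)

    def refresh(self, tokens: list[str]) -> None:
        """Replace the pool, e.g. after dropping expired tokens."""
        if not tokens:
            raise ValueError("at least one token is required")
        self._tokens = list(tokens)
        # Restart the rotation from the first token of the new pool.
        self._cycle = cycle(self._tokens)

    def __len__(self) -> int:
        return len(self._tokens)

    def __iter__(self) -> "SimpleTokenCycler":
        return self

    def __next__(self) -> str:
        # itertools.cycle never raises StopIteration on a non-empty pool.
        return next(self._cycle)


cycler = SimpleTokenCycler(["token-a", "token-b"])
first_five = [next(cycler) for _ in range(5)]
print(first_five)  # ['token-a', 'token-b', 'token-a', 'token-b', 'token-a']
```

Supporting `len()` is what allows the pool size to be used as the worker count, as in the `ThreadPoolExecutor` example above.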

## License

This project is licensed under the Mozilla Public License 2.0 - see the [LICENSE](LICENSE) file for details.
