Metadata-Version: 2.4
Name: ptrepo
Version: 0.0.2
Summary: Exposed repository metadata testing tool
Home-page: https://www.penterep.com/
Author: Penterep
Author-email: info@penterep.com
License: GPLv3
Project-URL: homepage, https://www.penterep.com/
Project-URL: repository, https://github.com/penterep/ptrepo
Project-URL: tracker, https://github.com/penterep/ptrepo/issues
Project-URL: changelog, https://github.com/penterep/ptrepo/blob/main/CHANGELOG.md
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Environment :: Console
Classifier: Topic :: Security
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: ptlibs<2,>=1.0.33
Requires-Dist: requests<3,>=2.31
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: project-url
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

[![penterepTools](https://www.penterep.com/external/penterepToolsLogo.png)](https://www.penterep.com/)

## PTREPO - Exposed repository testing tool

ptrepo is a Penterep tool for testing exposed source-code repositories on web servers.
The planned scope covers repository discovery, Git and SVN repository download, commit/revision listing, and native secret scanning of recovered content.

## Current MVP status

The current version supports repository discovery, best-effort Git/SVN content download, listing of reachable or dangling Git commits, reporting of observed SVN revisions, and native secret scanning of recovered Git/SVN content and Git history.

Implemented:

- URL normalization
- `.git`, `.svn`, `_svn`, `.bzr`, `.hg`, and `cgi-bin/cvsweb.cgi` candidate generation
- HTTP probing
- discovery classification
- Git recovery for metadata, refs, reflogs, loose objects, pack files, files recoverable from `.git/index`, and files exportable from the reconstructed Git object database
- SVN recovery for `entries`, `text-base`, `wc.db`, `pristine`, and recovered file contents
- Git validation and history reporting through a defensive low-level `git` backend
- SVN observed revision reporting from recovered `entries` and `wc.db` metadata
- native Git commit patch/message and recovered Git/SVN file secret scanning with built-in rules, redaction, fingerprints, and coverage reporting
- human and JSON output
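
The first two steps above, URL normalization and candidate generation, can be sketched roughly as follows. This is a minimal illustration and not ptrepo's actual code: the function names, the `REPO_MARKERS` mapping, and the choice to treat every path as a directory are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Marker paths come from the list above; the mapping structure is hypothetical.
REPO_MARKERS = {
    "git": [".git"],
    "svn": [".svn", "_svn"],
    "bzr": [".bzr"],
    "hg": [".hg"],
    "cvs": ["cgi-bin/cvsweb.cgi"],
}


def normalize_url(url: str) -> str:
    """Drop query and fragment and make the path end with a slash.

    Assumption: the given path is treated as a directory.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if not path.endswith("/"):
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def candidate_urls(url: str, repo_types=None) -> list[str]:
    """Return one candidate URL per marker for the selected repository types."""
    base = normalize_url(url)
    selected = repo_types or list(REPO_MARKERS)
    return [base + marker for t in selected for marker in REPO_MARKERS[t]]


if __name__ == "__main__":
    for candidate in candidate_urls("https://www.example.com/plugins/mpdf", ["git", "svn"]):
        print(candidate)
```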

Git download currently saves:

- `.git/HEAD`
- `.git/config`
- `.git/index`
- `.git/packed-refs`
- `.git/info/refs`
- `.git/objects/info/packs`
- `.git/logs/HEAD` and discovered/common ref logs
- branch/tag ref files where discovered
- loose objects discovered from refs, reflogs, commits, trees, and `.git/index`
- pack files listed in `.git/objects/info/packs`
- locally reconstructed pack indexes where a `.pack` file is recovered but the matching `.idx` file is unavailable
- recovered blob contents under `git/files/`
- files exported from reachable or dangling commit trees when the local Git object database is usable
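
For readers unfamiliar with loose objects, the sketch below shows how an object ID maps to a path under `.git/objects/` and how a downloaded loose object can be decoded and verified. It relies only on the standard Git object format (zlib-compressed `<type> <size>\0<body>`) and assumes a SHA-1 repository; the function names are illustrative and do not describe ptrepo's internal API.

```python
import hashlib
import zlib


def loose_object_path(sha1_hex: str) -> str:
    """Loose objects live at .git/objects/<first 2 hex chars>/<remaining 38>."""
    return f".git/objects/{sha1_hex[:2]}/{sha1_hex[2:]}"


def parse_loose_object(raw: bytes) -> tuple[str, bytes]:
    """Decompress a loose object and split its header from its body."""
    data = zlib.decompress(raw)
    header, _, body = data.partition(b"\x00")
    obj_type, size = header.split(b" ")
    if int(size) != len(body):
        raise ValueError("declared size does not match body length")
    return obj_type.decode(), body


def object_id(obj_type: str, body: bytes) -> str:
    """Recompute the SHA-1 object ID to check a download for corruption."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()
```

Comparing `object_id(...)` with the ID used to build the request path is one way to detect truncated or substituted responses.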

SVN download currently saves:

- `.svn/entries` or `_svn/entries`
- `.svn/wc.db` or `_svn/wc.db`
- old working-copy `text-base` files where discoverable
- recursive old working-copy `entries`/`text-base` files where subdirectory metadata is exposed
- new working-copy `pristine` files where discoverable from `wc.db`
- recovered file contents under `svn/files/`
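
As an illustration of how `pristine` files can be located from `wc.db`, the sketch below assumes the SVN 1.7+ working-copy layout: a `NODES` table whose `checksum` column holds values such as `$sha1$<hex>`, with the content stored at `pristine/<first two hex chars>/<hex>.svn-base`. The function names are hypothetical and the query is simplified; it is not ptrepo's implementation.

```python
import sqlite3


def pristine_relpath(checksum: str, admin_dir: str = ".svn") -> str:
    """Map a wc.db checksum such as '$sha1$<hex>' to its pristine file path."""
    prefix = "$sha1$"
    if not checksum.startswith(prefix):
        raise ValueError(f"unsupported checksum format: {checksum!r}")
    digest = checksum[len(prefix):]
    return f"{admin_dir}/pristine/{digest[:2]}/{digest}.svn-base"


def pristine_targets(conn: sqlite3.Connection, admin_dir: str = ".svn") -> dict[str, str]:
    """Return {working-copy path: pristine path} for rows that carry a checksum."""
    rows = conn.execute(
        "SELECT local_relpath, checksum FROM NODES WHERE checksum IS NOT NULL"
    )
    return {relpath: pristine_relpath(checksum, admin_dir) for relpath, checksum in rows}
```

Pass `admin_dir="_svn"` when the exposed metadata directory uses the alternative name.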

These planned options are accepted by the CLI contract but intentionally fail in the current MVP slice:

- `-r/--redirects`
- `-C/--cache`

## Installation

```
pip install ptrepo
```

## Adding to PATH

If your terminal cannot find the script after installation, the install location is probably not in your PATH. Add it by running the commands below for your shell:

For Bash Users

```bash
echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.bashrc
source ~/.bashrc
```

For ZSH Users

```bash
echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.zshrc
source ~/.zshrc
```

## Usage examples

```
ptrepo -u https://www.example.com/
ptrepo -u https://www.example.com/plugins/mpdf
ptrepo -u https://www.example.com/ -t git svn bzr hg cvs
ptrepo -f urls.txt -w repository_paths.txt
ptrepo -u https://www.example.com/ --download
ptrepo -u https://www.example.com/ --commits
ptrepo -u https://www.example.com/ --commits --commit-limit 20
ptrepo -u https://www.example.com/ --download --commits
ptrepo -u https://www.example.com/ --secrets
ptrepo -u https://www.example.com/ --max-response-bytes 32768 -j
```

## Options

```
   -u   --url           <url>           Test specified URL
   -f   --file          <file>          Load URLs from file
   -w   --wordlist      <file>          Load additional supported repository path candidates from file
   -t   --repo-type     <type>          Repository type(s) to test: git, svn, bzr, hg, cvs
        --download                      Download recoverable Git/SVN repository content and report history summary
        --commits                       Temporarily recover metadata and list Git commits or observed SVN revisions
        --commit-limit  <count>         Maximum commit/revision entries to print and Git commits to scan; 0 disables both
        --secrets                       Temporarily recover Git/SVN content and scan for secrets
        --secrets-rules <file>          Load additional JSON secret rules
        --secrets-baseline <file>       Ignore previously reported secret finding fingerprints
        --secrets-mode  <mode>          Secret scan mode: auto, files, or history
        --entropy                       Enable entropy checks for generic secret rules
        --no-entropy                    Disable entropy checks for generic secret rules
        --allowlist     <file>          Load JSON secret allowlist
        --max-secret-file-size <bytes>  Maximum recovered file size to scan for secrets
   -H   --headers       <header:value>  Set custom header(s)
   -T   --timeout       <timeout>       Set timeout
        --max-response-bytes <bytes>    Maximum bytes to read from each discovery response
        --max-download-bytes <bytes>    Maximum bytes to write for each downloaded file
   -a   --user-agent    <user-agent>    Set User-Agent header
   -c   --cookie        <cookie=value>  Set cookie(s)
   -p   --proxy         <proxy>         Set proxy (e.g. http://127.0.0.1:8080)
   -v   --version                       Show script version and exit
   -h   --help                          Show this help message and exit
   -j   --json                          Output JSON only, suppresses banner and human output
```
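
The idea behind byte limits such as `--max-response-bytes` and `--max-download-bytes` can be sketched as a capped read over streamed chunks. The helper name `read_capped` is hypothetical and only illustrates the general technique; how ptrepo enforces its limits internally is not shown here.

```python
from typing import Iterable


def read_capped(chunks: Iterable[bytes], limit: int) -> tuple[bytes, bool]:
    """Collect at most `limit` bytes; return (data, truncated)."""
    buf = bytearray()
    for chunk in chunks:
        remaining = limit - len(buf)
        if len(chunk) > remaining:
            # Keep only what still fits, then stop reading.
            buf += chunk[:remaining]
            return bytes(buf), True
        buf += chunk
    return bytes(buf), False
```

With `requests`, the chunks could come from `response.iter_content(chunk_size=8192)` on a request made with `stream=True`, so that an oversized body is never read fully into memory.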

## Planned options

These options are accepted by the CLI contract but intentionally fail in the current MVP slice.

```
   -r   --redirects                     Planned, not implemented in current MVP slice
   -C   --cache                         Planned, not implemented in current MVP slice
```

## Secret rule files

`--secrets-rules` loads additional JSON rules. The file may contain either a list of rules or an object with a `rules` list.
Custom rules must include at least one keyword so the scanner can skip regex evaluation on unrelated lines.
Custom rule regexes are length-limited, `secret_group` must reference an existing capture group, and `entropy_threshold` must be between `0.0` and `8.0`.

```json
{
  "rules": [
    {
      "id": "custom-demo-token",
      "name": "Custom demo token",
      "description": "Project-specific token",
      "regex": "(DEMO_[A-Z0-9]{12})",
      "secret_group": 1,
      "keywords": ["DEMO_"],
      "severity": "high",
      "confidence": "medium",
      "allowlist": {
        "patterns": ["DEMO_PUBLIC_FIXTURE"],
        "regexes": ["^DEMO_TEST_[A-Z0-9]+$"]
      }
    }
  ]
}
```
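
The constraints on custom rules described above can be expressed as a small validation routine. This is a sketch under stated assumptions: `MAX_REGEX_LENGTH` is a made-up value (the real limit is not documented here), the function names are illustrative, and the entropy helper assumes byte-level Shannon entropy, which is consistent with the documented `0.0` to `8.0` range.

```python
import math
import re
from collections import Counter

MAX_REGEX_LENGTH = 1000  # hypothetical limit, chosen only for this example


def shannon_entropy(value: str) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    data = value.encode()
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def validate_rule(rule: dict) -> re.Pattern:
    """Check a custom rule against the documented constraints and compile it."""
    if not rule.get("keywords"):
        raise ValueError("rule needs at least one keyword")
    if len(rule["regex"]) > MAX_REGEX_LENGTH:
        raise ValueError("regex is too long")
    pattern = re.compile(rule["regex"])
    group = rule.get("secret_group", 0)
    if not 0 <= group <= pattern.groups:
        raise ValueError("secret_group does not reference an existing capture group")
    threshold = rule.get("entropy_threshold")
    if threshold is not None and not 0.0 <= threshold <= 8.0:
        raise ValueError("entropy_threshold must be between 0.0 and 8.0")
    return pattern
```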

`--allowlist` loads JSON allowlists:

```json
{
  "patterns": ["known-fixture-value"],
  "regexes": ["^example_[A-Za-z0-9]+$"]
}
```

Allowlist regexes are length-limited before compilation.

`--secrets-baseline` loads previously reported fingerprints and suppresses
matching findings. It accepts a JSON list of fingerprint strings, an
object with a `fingerprints` list, or a previous PTREPO-style JSON report that
contains nested `fingerprint` fields. Suppressed findings are counted as
ignored baseline findings in human and JSON output.
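
The three accepted baseline shapes can be handled by a single recursive walk, as in the sketch below. It is an illustration only: the function name is hypothetical and no particular report layout is assumed beyond the `fingerprint` and `fingerprints` keys mentioned above.

```python
import json


def collect_fingerprints(data) -> set[str]:
    """Extract fingerprints from a list, a {'fingerprints': [...]} object,
    or any nested structure containing 'fingerprint' string fields."""
    found: set[str] = set()

    # Shape 1: a plain JSON list of fingerprint strings.
    if isinstance(data, list) and all(isinstance(item, str) for item in data):
        return set(data)

    def walk(node) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "fingerprint" and isinstance(value, str):
                    found.add(value)
                elif key == "fingerprints" and isinstance(value, list):
                    found.update(v for v in value if isinstance(v, str))
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(data)
    return found


def load_baseline(path: str) -> set[str]:
    """Read a baseline file from disk and return its fingerprints."""
    with open(path, encoding="utf-8") as handle:
        return collect_fingerprints(json.load(handle))
```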

When Git history contains more commits than `--commit-limit`, history-aware
secret scanning reports partial coverage instead of implying that the whole
history was scanned. Setting `--commit-limit 0` disables detailed history
listing and Git commit secret scanning; file-mode secret scanning can still run.

Built-in secret rules cover common provider and generic credential patterns,
including private key markers, AWS `AKIA`/`ASIA` access key IDs, GitHub tokens,
GitLab access/build/deploy/runner/OAuth token prefixes, Slack tokens and
incoming webhooks, Stripe secret/restricted/webhook keys, Google API keys and
OAuth client secrets, Google service-account JSON, database URLs with
credentials, URLs with embedded credentials, JWT-like tokens, generic
password/token/API key assignments, and conservative base64/hex decoded
credential assignments. Git history scanning checks added and deleted patch
lines plus commit message text. Recovered-file scanning skips oversized files
and files that look binary based on NUL bytes or a high ratio of binary control
bytes.
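
The binary-file heuristic described above can be sketched as follows. The sample size and the control-byte ratio threshold are assumptions picked for the example, not ptrepo's actual values, and the function name is hypothetical.

```python
# Tab, line feed, and carriage return are normal in text files.
TEXT_CONTROL_BYTES = {9, 10, 13}


def looks_binary(data: bytes, sample_size: int = 8192, max_control_ratio: float = 0.30) -> bool:
    """Guess whether content is binary from a leading sample."""
    sample = data[:sample_size]
    if not sample:
        return False
    # Any NUL byte is treated as a strong binary indicator.
    if b"\x00" in sample:
        return True
    control = sum(
        1 for byte in sample
        if (byte < 32 and byte not in TEXT_CONTROL_BYTES) or byte == 127
    )
    return control / len(sample) > max_control_ratio
```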

## Dependencies

```
ptlibs>=1.0.33,<2
requests>=2.31,<3
```

## License

Copyright (c) 2026 Penterep Security s.r.o.

ptrepo is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

ptrepo is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with ptrepo. If not, see https://www.gnu.org/licenses/.

## Warning

Run this tool only against websites that you have been given
permission to pentest. We do not accept any responsibility for any
damage or harm that this application causes to your computer or your
network. Penterep is not responsible for any illegal or malicious use
of this code. Be ethical!
