Metadata-Version: 2.4
Name: prylint
Version: 0.4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: Microsoft :: Windows
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Debuggers
License-File: LICENSE
Summary: A Rust reimplementation of pylint's error checking that produces byte-for-byte identical output to pylint, 15-2300x faster (median ~85x)
Keywords: pylint,linter,lint,static-analysis,rust,python
Author-email: Adam Raudonis <adam.raudonis@gmail.com>
License-Expression: GPL-2.0-or-later
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Changelog, https://github.com/adamraudonis/prylint/blob/main/PROGRESS.md
Project-URL: Issues, https://github.com/adamraudonis/prylint/issues
Project-URL: Repository, https://github.com/adamraudonis/prylint

# prylint

A Rust reimplementation of [pylint](https://github.com/pylint-dev/pylint) that
produces **byte-for-byte identical output** — **15–2300× faster** (median ~85×).

prylint is not "inspired by" pylint. It is a bug-for-bug port: the same
messages, at the same lines and columns, with the same text, in the same
order, with the same exit codes and the same `Your code has been rated`
footer — verified byte-identically against real pylint on **27 production
codebases** (~60,000 Python files), including django, numpy, pandas, sympy,
home-assistant, sqlalchemy, twisted, scikit-learn, and pylint's own functional
test suite. Where pylint has bugs, prylint reproduces them. Where pylint
crashes, prylint reports the same crash message.

## Install

```bash
pip install prylint
```

Requirements: a `python3` (≥3.9) on `PATH` (used only to mirror pylint's
module-resolution paths and to reproduce CPython's exact syntax-error messages
for unparseable files). pylint and astroid themselves are **not** required.

## Usage

Use it exactly like pylint — full check mode is the default:

```bash
prylint .                      # all checks (like `pylint .`)
prylint -E .                   # errors only (like `pylint -E .`)
prylint --disable=C0114,... .  # same --disable / --enable / inline pragmas
prylint -j 8 .                 # opt-in parallel mode (see "Parallelism" below)
```

Output, message order, exit codes, the score footer, `--rcfile` /
`pyproject.toml` discovery, `init-hook`, and `# pylint:` pragmas all match
pylint 4.0.5.
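
For reference, the pragma forms mentioned above look like this in source. This is standard pylint comment syntax rather than anything prylint-specific; the message names are only examples, and `legacyName` and `handler` are hypothetical.

```python
# Trailing pragma: suppresses the message on this line only.
import os  # pylint: disable=unused-import

# `disable-next` suppresses the message on the following line only.
# pylint: disable-next=invalid-name
legacyName = 1


def handler(request):
    """Hypothetical function with an intentionally unused argument."""
    # A pragma on its own line inside a block applies to that block.
    # pylint: disable=unused-argument
    return "ok"
```

Since prylint aims to honor the same pragmas, a file annotated this way should need no changes when switching tools.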

## Parallelism

prylint is **single-threaded by default** in the checking phase, on purpose:
the byte-identity guarantee depends on replicating astroid's process-global,
order-sensitive inference cache exactly, which parallelizing would perturb. The
default path is the one verified byte-identical to pylint.

`-j N` (alias `--jobs N`) is an **opt-in** parallel mode, mirroring pylint's
own `-j`:

- **The serial default stays byte-identical.** `-j 1`, or no `-j`, runs the
  exact existing single-engine path. The parallel branch is taken only for
  `-j N>1` (or `-j 0` = auto = number of cores).
- **`-j N` may differ from serial — just like pylint's `-j`.** Each worker has
  its own thread-local inference cache and checks a fixed `file_index % N`
  slice of the files, so cache warmth differs and the cross-file "close" checks
  (`R0801` duplicate-code, `R0401` cyclic-import) see only one slice. On django
  (full mode, 10-core M-series), serial vs `-j 8` differs by ~938 message
  lines, ~99% of which are R0801/R0401.
- **It is deterministic across runs.** The partition is fixed (not
  work-stealing) and outputs merge in file order, so the same input gives the
  same `-j N` output every time; it just isn't equal to the `-j 1` output.
- **Speedup is modest and partition-bound.** Each worker re-boots the full file
  set (the cost of determinism), so on django the win is ~15.4s → ~13.7s at
  `-j 8` (~1.1×); it grows on corpora where per-module inference dominates the
  boot cost.
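
The fixed `file_index % N` partition and the in-order merge described in the list above can be sketched as follows. This is a minimal illustration in Python, not prylint's Rust implementation: `partition` and `merge_in_file_order` are hypothetical names, and the per-file "check" is a stand-in string.

```python
from collections import defaultdict


def partition(files, n_workers):
    """Assign file i to worker i % n_workers (a fixed split, no work-stealing)."""
    slices = defaultdict(list)
    for index, path in enumerate(files):
        slices[index % n_workers].append((index, path))
    return [slices[worker] for worker in range(n_workers)]


def merge_in_file_order(worker_results):
    """Flatten per-worker (index, output) pairs back into original file order."""
    merged = [pair for result in worker_results for pair in result]
    merged.sort(key=lambda pair: pair[0])
    return [output for _, output in merged]


files = ["a.py", "b.py", "c.py", "d.py", "e.py"]
slices = partition(files, 2)

# Stand-in for the real per-file check: each worker only sees its own slice.
results = [[(i, f"checked {path}") for i, path in s] for s in slices]
merged = merge_in_file_order(results)
```

Because the assignment depends only on a file's position in the list, rerunning with the same inputs and the same `N` yields the same slices and the same merged order.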

**Use the default when you need byte-identity to pylint; reach for `-j N` only
when the single-core run is long and exact parity is not required.** Full
details and the divergence breakdown are in
[LIMITATIONS.md](LIMITATIONS.md) §5.

## Benchmarks

`prylint .` vs `pylint .` (both full check mode), pylint 4.0.5, Apple M-series,
single-threaded:

| codebase | pylint | prylint | speedup |
|---|---:|---:|---:|
| black | 26.7 hr | 41s | **2328×** |
| sentry | 3.7 hr | 24s | 546× |
| home-assistant (17.5k files) | 10.3 hr | 82s | 452× |
| airflow | 1.9 hr | 17s | 399× |
| salt | 1890s | 8.8s | 215× |
| zulip | 909s | 5.3s | 172× |
| django | 1524s | 10.1s | 150× |
| ansible | 419s | 2.9s | 143× |
| fastapi | 116s | 1.0s | 120× |
| nova (OpenStack) | 1209s | 10.3s | 117× |
| mypy | 367s | 3.9s | 95× |
| sqlalchemy | 614s | 7.1s | 87× |
| pandas | 1009s | 14.2s | 71× |
| scikit-learn | 613s | 9.6s | 64× |
| sympy | 1238s | 26s | 48× |
| *…and 12 more, all ≥30×* | | | |
| **aggregate (27 repos)** | **45.8 hr** | **4.9 min** | **~560×** |

Median per-repo speedup is ~85×; the aggregate is higher because pylint's
duplicate-code check (`R0801`) is O(n²) and dominates on test-heavy repos like
black. **These are single-core numbers**: the default inference engine is
deliberately single-threaded to replicate astroid's order-sensitive global
cache exactly (see [LIMITATIONS.md](LIMITATIONS.md)). An opt-in `-j N` mode
trades that byte-identity for cores (see [Parallelism](#parallelism)), but the
byte-identical default is already 15–2300× faster than pylint, so extra cores
are rarely needed.
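
To see why the aggregate sits so far above the median, here is a small worked example using three rows from the table. It is only an illustration: the table's times are rounded, so the per-row ratios computed here will not match the speedup column exactly.

```python
from statistics import median

# (pylint seconds, prylint seconds) for three rows of the table above.
rows = {
    "black": (26.7 * 3600, 41.0),
    "django": (1524.0, 10.1),
    "sympy": (1238.0, 26.0),
}

# Per-repo speedup: each repo counts equally, however long it takes.
per_repo = {name: slow / fast for name, (slow, fast) in rows.items()}
median_speedup = median(per_repo.values())

# Aggregate speedup: ratio of total wall-clock times, so the slowest
# pylint run (black) dominates the result.
total_slow = sum(slow for slow, _ in rows.values())
total_fast = sum(fast for _, fast in rows.values())
aggregate_speedup = total_slow / total_fast

print(f"median {median_speedup:.0f}x, aggregate {aggregate_speedup:.0f}x")
```

The median is the django ratio, while the aggregate is pulled toward black's because black accounts for almost all of the pylint time in this subset.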

Every row above is also an accuracy test: each repo's full output is
byte-identical to pylint's (see exceptions in [LIMITATIONS.md](LIMITATIONS.md)).

## Accuracy

prylint was built by differential testing against pinned pylint 4.0.5 /
astroid 4.0.4 / CPython 3.12:

1. **AST fidelity** — prylint's parse tree (built on the
   [ruff](https://github.com/astral-sh/ruff) parser) is compared node-by-node
   against astroid's (positions, scopes, locals, brain transforms) across all
   corpus files: zero differences.
2. **Inference fidelity** — astroid's inference engine is ported exactly:
   lazy-generator semantics, the 100-node inference budget, the bounded-LRU
   caches (`lookup` 128, `_metaclass_lookup_attribute` 1024) with their exact
   eviction, the 64-entry inference-tip FIFO, `Uninferable` propagation. Every
   name/attribute/call node's inference is dumped and compared against astroid.
3. **Output fidelity** — full runs compared byte-for-byte, including message
   order, module headers, the score footer, `# pylint:` pragma handling
   (disable/enable blocks, `disable-next`, `skip-file`), config-file discovery,
   and exit-code bitmasks.
4. **Blind testing** — two batteries of 10 repos each were added *after*
   development and judged cold; every divergence was root-caused and fixed.
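
As a concrete reading of the exit-code bitmasks in item 3: pylint documents its exit status as a bitwise OR of 1 (fatal), 2 (error), 4 (warning), 8 (refactor), 16 (convention), and 32 (usage error). A small sketch for decoding one in a CI script follows; `decode_exit_code` is a hypothetical helper, not part of prylint or pylint.

```python
# Bit values as documented for pylint's exit status.
EXIT_BITS = {
    1: "fatal",
    2: "error",
    4: "warning",
    8: "refactor",
    16: "convention",
    32: "usage error",
}


def decode_exit_code(code: int) -> list[str]:
    """Return the message categories whose bits are set in an exit status."""
    return [name for bit, name in EXIT_BITS.items() if code & bit]


# Example: a run that emitted both error and warning messages exits with 2 | 4.
categories = decode_exit_code(2 | 4)
```

A CI job that should fail only on errors could test `code & 3` (fatal or error) instead of treating any non-zero status as a failure.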

Known, documented exceptions (one obscure SQLAlchemy class; the deliberately
excluded `no-member` family; the places pylint is nondeterministic against
itself) are catalogued in **[LIMITATIONS.md](LIMITATIONS.md)**.

## How it works

- File discovery, message control, config parsing, and reporting are direct
  ports of pylint's own logic (down to `os.walk` ordering, the
  `************* Module` header rule, and the score-report footer).
- Parsing uses ruff's Rust parser, then rebuilds astroid's exact tree shape
  (docstring extraction, decorator positions, implicit class locals, metaclass
  handling, brain transforms for dataclasses/enums/namedtuples/attrs/…).
- A full port of astroid's inference engine resolves names, calls, attributes,
  MROs, and operator protocols with astroid's exact conservatism — including
  its caches and their quirks, because the quirks are observable in the output.
- Files the Rust parser rejects are re-judged by CPython itself (an embedded,
  stdlib-only helper) so syntax-error messages match `ast.parse` exactly.

## Reproducing the test suite

`scripts/setup_corpora.sh` clones all 27 corpora at pinned commits and builds
the pinned pylint/astroid ground-truth venv. The accuracy contract: every
change must keep the corpora byte-identical (`harness/` holds the differential
comparators).
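
The contract reduces to a byte-level comparison of two captured outputs. The following is a minimal sketch of that idea, assuming both outputs have already been captured as `bytes`; `first_divergence` is a hypothetical helper and is not the code in `harness/`.

```python
def first_divergence(expected: bytes, actual: bytes):
    """Return None if byte-identical, else (line_no, expected_line, actual_line).

    A missing line on either side is reported as None.
    """
    if expected == actual:
        return None
    exp_lines = expected.split(b"\n")
    act_lines = actual.split(b"\n")
    for line_no, (exp, act) in enumerate(zip(exp_lines, act_lines), start=1):
        if exp != act:
            return line_no, exp, act
    # All shared lines match, so one output has extra lines at the end.
    shared = min(len(exp_lines), len(act_lines))
    exp = exp_lines[shared] if shared < len(exp_lines) else None
    act = act_lines[shared] if shared < len(act_lines) else None
    return shared + 1, exp, act


same = first_divergence(b"a.py:1:0: E0001\n", b"a.py:1:0: E0001\n")
diff = first_divergence(b"line one\nline two\n", b"line one\nline 2\n")
```

Comparing raw bytes rather than decoded text means differences in trailing newlines or encoding also count as divergences, which is what a byte-identity guarantee requires.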

## License

GPL-2.0-or-later, the same license as
[pylint](https://github.com/pylint-dev/pylint) — prylint reproduces pylint's
message texts and behavior verbatim.

