Metadata-Version: 2.4
Name: purepatch
Version: 0.1.0
Summary: Apply unified diffs and fuzzy search/replace edits in pure Python - the patch engine for code agents. No git, no patch binary.
Author: adam2go
License: MIT
Project-URL: Homepage, https://github.com/adam2go/purepatch
Keywords: patch,diff,unified-diff,apply,git,llm,agents,pure-python
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Version Control
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# purepatch

[![CI](https://github.com/adam2go/purepatch/actions/workflows/ci.yml/badge.svg)](https://github.com/adam2go/purepatch/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/purepatch)](https://pypi.org/project/purepatch/)
[![Python](https://img.shields.io/badge/python-3.9%E2%80%933.14%20%7C%20PyPy-blue)](.github/workflows/ci.yml)
[![License: MIT](https://img.shields.io/badge/license-MIT-green)](LICENSE)

**The patch engine for code agents, in pure Python.** Apply unified diffs
and fuzzy search/replace edits with no git, no `patch` binary, no C
extension — in sandboxes, Pyodide/WASM, Lambda, anywhere `pip install`
works. Because it runs in-process, it applies a small patch (5 edits to a
200-line file) in ~25 µs, where spawning a binary costs milliseconds.

```sh
pip install purepatch
```

```python
import purepatch

new_text = purepatch.apply(diff_text, old_text)          # unified diff -> text
report   = purepatch.apply_files(diff_text, root=".")    # multi-file patch
new_text = purepatch.apply_edit(text, search, replace)   # fuzzy block edit
```

```sh
purepatch --dry-run < change.patch     # the familiar CLI, agent-friendly
purepatch -R < change.patch            # un-apply
```

## Why

LLMs edit code by emitting **unified diffs** and **SEARCH/REPLACE blocks**
— and both arrive slightly wrong: line numbers drifted, context rotted,
indentation moved, trailing whitespace differs. The existing Python
options either only *parse* diffs (unidiff) or are long abandoned
(python-patch, last release 2019). So every agent framework re-implements
patching, badly, or shells out to git.

purepatch is that missing engine:

- **GNU patch semantics for unified diffs**: cumulative offset tracking,
  bidirectional position search, fuzz degradation — verified against the
  real thing (below).
- **A fuzzy edit ladder for LLM edit blocks**: exact match → trailing
  whitespace tolerance → indentation transplant (the block the model wrote
  at top level gets re-indented to where it actually lives). Refuses to
  guess on ambiguity.
- **Errors an agent can act on**: failed matches report the closest
  near-miss (`closest match: line 41, 87% similar`) so the model can
  correct its edit instead of retrying blind.
- **Git extended headers understood**: new/deleted files, renames, quoted
  paths, `\ No newline at end of file`, CRLF content.
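
To make the edit ladder concrete, here is a minimal sketch of the three rungs (exact, trailing-whitespace tolerant, indentation transplant) written from the description above. It is not purepatch's implementation: `apply_edit_sketch` and `_match_starts` are hypothetical names, and the indentation rung is simplified to compare lines with all surrounding whitespace stripped.

```python
import textwrap


def _match_starts(lines, search, normalize):
    """Every start index where `search` matches `lines` after `normalize`."""
    want = [normalize(s) for s in search]
    n = len(want)
    return [
        i
        for i in range(len(lines) - n + 1)
        if [normalize(s) for s in lines[i:i + n]] == want
    ]


def apply_edit_sketch(content, search, replace):
    """Return (new_content, strategy); raise ValueError on no match or ambiguity."""
    lines = content.splitlines()
    search_lines = search.splitlines()
    replace_lines = replace.splitlines()
    if not search_lines:
        raise ValueError("empty search block")

    # Rungs are tried from strictest to loosest.
    ladder = [
        ("exact", lambda s: s),
        ("trailing-whitespace", str.rstrip),
        ("indentation", str.strip),
    ]
    for strategy, normalize in ladder:
        hits = _match_starts(lines, search_lines, normalize)
        if len(hits) > 1:
            # Refuse to guess between several candidate locations.
            raise ValueError(f"ambiguous: {len(hits)} matches ({strategy})")
        if len(hits) == 1:
            start = hits[0]
            if strategy == "indentation":
                # Transplant: re-indent the replacement to the depth of the
                # first matched line in the file.
                first = lines[start]
                indent = first[: len(first) - len(first.lstrip())]
                dedented = textwrap.dedent("\n".join(replace_lines)).splitlines()
                replace_lines = [indent + s if s.strip() else s for s in dedented]
            lines[start:start + len(search_lines)] = replace_lines
            tail = "\n" if content.endswith("\n") else ""
            return "\n".join(lines) + tail, strategy
    raise ValueError("no match")
```

A block written at top level, such as `x = 1` / `return x`, lands inside an indented function body on the third rung, with the replacement re-indented to match the file.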

## Verified against GNU patch and git apply

Following the [pure* series methodology](https://github.com/adam2go/purejq):
behavior is checked by **differential testing against the reference
implementations**, run in CI on every commit —

- **500 random clean patches**: `purepatch ≡ GNU patch ≡ git apply ≡
  expected output`, byte for byte;
- **200 drift scenarios** (the file gained unrelated lines): offset
  behavior matches GNU patch exactly;
- **200 rotted-context scenarios**: fuzz behavior matches GNU patch's
  output wherever GNU patch succeeds;
- **300 property cases**: `apply(diff(a,b), a) == b` and
  `apply(diff(a,b), b, reverse=True) == a`.
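
The round-trip property in the last bullet can be sketched with the standard library alone. This is an illustration of the idea, not the project's test harness: `difflib.unified_diff` stands in for `diff`, and `apply_strict` is a hypothetical zero-offset, zero-fuzz applier standing in for `purepatch.apply`. Inputs are assumed to be newline-terminated.

```python
import difflib
import random
import re

HUNK = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def make_diff(a, b):
    """Unified diff from text a to text b (both newline-terminated)."""
    return "".join(
        difflib.unified_diff(a.splitlines(True), b.splitlines(True), "a", "b")
    )


def apply_strict(diff_text, source, reverse=False):
    """Apply a single-file unified diff exactly; raise ValueError on mismatch."""
    old_tag, new_tag = ("+", "-") if reverse else ("-", "+")
    src = source.splitlines(True)
    out, pos, seen_hunk = [], 0, False
    for line in diff_text.splitlines(True):
        m = HUNK.match(line)
        if m:
            # In reverse, the "+" range describes the text we start from.
            start = int(m.group(3 if reverse else 1))
            count = int(m.group(4 if reverse else 2) or 1)
            # An empty range names the line *before* the insertion point.
            index = start - 1 if count else start
            out.extend(src[pos:index])  # copy untouched lines before the hunk
            pos, seen_hunk = index, True
        elif not seen_hunk:
            continue  # "---" / "+++" file headers
        else:
            tag, body = line[0], line[1:]
            if tag in (" ", old_tag):
                if pos >= len(src) or src[pos] != body:
                    raise ValueError(f"hunk does not match at line {pos + 1}")
                pos += 1
            if tag in (" ", new_tag):
                out.append(body)
    out.extend(src[pos:])
    return "".join(out)


def check_roundtrip(cases=300, seed=0):
    """Assert apply(diff(a,b), a) == b and the reverse for random texts."""
    rng = random.Random(seed)
    for _ in range(cases):
        a = "".join(f"line {rng.randrange(20)}\n" for _ in range(rng.randrange(30)))
        b = "".join(f"line {rng.randrange(20)}\n" for _ in range(rng.randrange(30)))
        d = make_diff(a, b)
        assert apply_strict(d, a) == b
        assert apply_strict(d, b, reverse=True) == a
    return cases
```

Swapping `apply_strict` for `purepatch.apply` in `check_roundtrip` should give a check of the same shape against the library itself.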

## Performance

Latency is measured per application, which is how a code agent actually
uses a patcher: one patch at a time. For the binaries the figure includes
process spawn, since that is their real cost; for purepatch it is the
in-process call. Each figure is the median of 7 runs, repeated over three
independent rounds (spread <10%), with outputs verified equal before
timing. Reproduce with `python tools/bench.py --verify`.

| workload | purepatch (in-process) | GNU patch (spawn) | git apply (spawn) |
|---|---:|---:|---:|
| 200-line file, 5 edits | 0.025 ms | 2.6 ms (**~100×**) | 7.3 ms (**~290×**) |
| 2k-line file, 30 edits | 0.17 ms | 2.8 ms (16×) | 7.9 ms (46×) |
| 20k-line file, 200 edits | 1.6 ms | 5.3 ms (3.4×) | 15.5 ms (10×) |

Fuzzy `apply_edit` on a 400-line file: ~0.01 ms per call.

An agent loop applying hundreds of edits per session pays milliseconds
total, not seconds — and needs no git in its sandbox.
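
For readers who want to see the shape of such a measurement without the repository, here is a rough sketch of a median-of-7 timer. It is not `tools/bench.py`: `median_ms`, `in_process`, and `spawned` are hypothetical names, the in-process workload is a placeholder string operation, and the spawned process is the Python interpreter itself, which starts far more slowly than `patch` or `git`, so the absolute numbers will not match the table.

```python
import statistics
import subprocess
import sys
import time


def median_ms(fn, repeats=7):
    """Median wall-clock time of fn() over `repeats` calls, in milliseconds."""
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(samples)


def in_process():
    # Placeholder workload; a real benchmark would call the patcher here.
    "\n".join(str(i) for i in range(200)).replace("7", "8")


def spawned():
    # Any external binary pays process start-up; a no-op interpreter run
    # is used here only because it is available everywhere.
    subprocess.run([sys.executable, "-c", "pass"], check=True)


if __name__ == "__main__":
    print(f"in-process: {median_ms(in_process):.3f} ms")
    print(f"spawn:      {median_ms(spawned):.3f} ms")
```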

## API sketch

```python
purepatch.parse(text) -> PatchSet               # inspect hunks/files
purepatch.apply(patch, source, reverse=False, max_fuzz=2) -> str
purepatch.apply_files(patch, root=".", strip=None,  # strip auto-detected
                      reverse=False, dry_run=False) -> ApplyReport
purepatch.apply_edit(content, search, replace) -> str
purepatch.find_block(content, search) -> (start, end, strategy)
```

`ApplyReport.ok`, per-file actions
(`patched/created/deleted/renamed/failed`), and per-hunk offset/fuzz are
all inspectable; log them and an agent can explain exactly what happened.

Exceptions: `ParseError`, `HunkApplyError`, `NoMatchError` (with
`closest_line` / `closest_similarity`), `AmbiguousMatchError` (with all
locations).
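
One way an agent loop might turn these exceptions into feedback for the model is sketched below. The helper names `describe_failure` and `try_edit` are hypothetical, the edit function is passed in so the sketch does not depend on where purepatch exposes its exception classes, and the value of `closest_similarity` is printed as-is because its scale is not stated here.

```python
def describe_failure(exc):
    """Turn a failed-edit exception into one line of feedback for the model."""
    line = getattr(exc, "closest_line", None)
    similarity = getattr(exc, "closest_similarity", None)
    name = type(exc).__name__
    if line is not None:
        return (
            f"{name}: no match; closest candidate at line {line} "
            f"(similarity {similarity}). Re-read that region and resend the edit."
        )
    return f"{name}: {exc}"


def try_edit(apply_edit, content, search, replace):
    """Apply one edit; return (new_content, None) or (content, feedback)."""
    try:
        return apply_edit(content, search, replace), None
    except Exception as exc:
        # Broad catch for brevity; with purepatch one would presumably catch
        # NoMatchError and AmbiguousMatchError specifically.
        return content, describe_failure(exc)
```

With the library installed, the call would presumably look like `try_edit(purepatch.apply_edit, text, search, replace)`; on failure the original text is returned unchanged alongside the message to send back to the model.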

## Limitations (honest ones)

- **Binary patches are rejected**, not applied.
- File modes are parsed from git headers but not applied to the
  filesystem (chmod is on the roadmap).
- The `purepatch` CLI covers the agent subset
  (`-p -d -R --fuzz --dry-run`), not every GNU patch flag.
- Like GNU patch, fuzzy hunk placement can in principle pick a wrong spot
  in pathological inputs; `--fuzz 0` disables tolerance entirely.

## License

[MIT](LICENSE)
