Metadata-Version: 2.4
Name: pyprior
Version: 0.0.2
Summary: A constructive prior over small Python programs using AST-driven constrained decoding
Project-URL: Homepage, https://github.com/PatrickHua/pyprior
Project-URL: Repository, https://github.com/PatrickHua/pyprior
Author: Tianyu Hua
License: MIT
License-File: LICENSE
Keywords: ast,code-generation,constrained-decoding,fuzzing,grammar,program-synthesis,property-based-testing
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Software Development :: Compilers
Requires-Python: >=3.10
Requires-Dist: numpy>=1.23
Requires-Dist: pillow>=9
Provides-Extra: demo
Requires-Dist: torch; extra == 'demo'
Provides-Extra: dev
Requires-Dist: hypothesis>=6; extra == 'dev'
Requires-Dist: pytest-benchmark>=4; extra == 'dev'
Requires-Dist: pytest-cov>=4; extra == 'dev'
Requires-Dist: pytest>=7; extra == 'dev'
Provides-Extra: vllm
Requires-Dist: vllm; (platform_system == 'Linux' or (platform_system == 'Darwin' and platform_machine == 'arm64')) and extra == 'vllm'
Description-Content-Type: text/markdown

# PyPrior

**A constructive prior over Python programs: sample them, condition on
prompts, or use the prior as a constrained decoder for model-generated code.**

![A function with a for-loop and an if/else assembling itself one legal token at a time, then running](assets/demo.webp)

<sub>A single free-form sample, sampled from a random transformer network under PyPrior's constraints — a function with a `for` loop and an `if`/`else` (each branch returns), a comparison used as a value, and nested calls, assembled one legal token at a time and then run. Scope-safe, no dead code. Reproduce it (and the GIF) with `python examples/api/demo_gif.py`.</sub>

PyPrior samples programs from a restricted Python grammar by deciding, at every
step, the exact set of tokens that are legal next — so every program it produces
is syntactically valid, scope-correct (no undefined variables), and bounded in
size **by construction**. Its token stream is a preorder serialization of the
restricted Python AST.

The GIF above shows a nontrivial program with loops and branches. That program
uses **47 PyPrior tokens**; the same source costs **111 GPT-2 tokens**.
PyPrior's AST vocabulary is compact, and AST-token generation spends tokens on
program structure instead of source-code spelling.


## What existing constrained decoders miss

Existing constrained decoding libraries such as
[SynCode](https://github.com/structuredllm/syncode),
[XGrammar](https://github.com/mlc-ai/xgrammar), and
[Outlines](https://github.com/dottxt-ai/outlines) are valuable when a normal
text-token LLM has already learned a Python prior from lots of human-written
code and you want to keep its output inside a grammar or structured format. If
the sampler or model has no Python prior, a syntax mask alone is not enough.
PyPrior targets that different problem: a Python AST-token language with scope
tracking, bounded completion, direct execution semantics, and useful
unconditional sampling.

For example, uniform random sampling through SynCode's Python grammar mask with
a GPT-2 tokenizer can produce syntactically valid Python, but GPT-2's token
distribution is not a useful Python program prior. A prompt-conditioned sample
can look like:

```python
def f(x):
     KCatioushaIntroductioneeAddedRushymmWrittennyderescalALTHpecAccess713EyeensitiveenderedtriggerRahredditmenuappedicheixtpossiblybandJJwatersORD
```

That parses as Python, but it is just a bare expression using an undefined name.
The constrained decoder did its job: it controlled syntax. It did not provide a
program prior, scope safety, or execution semantics.

PyPrior samples directly in an AST vocabulary, so even uniform random
sampling produces structured, scope-safe, runnable programs — every name is
bound, there is no unreachable code, and every program is guaranteed to
**terminate** (no `while`, no recursion — only bounded `for`/`range` loops and
calls to already-defined functions, so every run halts):

```python
def func():
    for a in range(9, 3 + 5):
        print(a)
    return 2 + 4

print(func())   # runs -> 6
```

| Approach | Good at | Missing for PyPrior's use case |
| --- | --- | --- |
| Text-token grammar masks | Keeping an LLM's text output syntactically valid | Scope tracking, AST-token output, small controlled vocabulary, bounded AST size/depth, direct execution model |
| PyPrior | Random code augmentation and PyPrior-vocab constrained decoding | Full Python coverage, arbitrary libraries/imports, unconstrained human-style code |

This is the core distinction: text-level constrained decoders constrain a
language model; PyPrior defines the language, the vocabulary, the decoder state,
and the executable subset together.

## Install

Install directly from the GitHub repo for now:

```bash
pip install "pyprior @ git+https://github.com/PatrickHua/pyprior.git"
```

The core is numpy-only. Optional extras: `[demo]` (torch, for the sample
decoders), `[vllm]` (the vLLM V1 logits-processor integration), `[dev]` (pytest,
hypothesis, pytest-benchmark).

## Using PyPrior

The engine works the same at every step: ask what's legal, advance with any of
those tokens, repeat. A random sampler, a weighted prior, and a neural model all
plug into that one loop. There are two surfaces.

### 1. The engine

Drive the decode directly — pick any legal token however you like.

This is the primitive state-machine API that `Prior` wraps; `examples/api/prior.py`
shows the higher-level model/prior loop built on top of it.

```python
import random
import pyprior as pp

tok = pp.Tokenizer(value_range=(0, 9))
s = pp.init(vocab=tok, max_len=30)                  # a fresh free-form decode
                                                    # (pass ids=[...] to start from a prompt)
while not pp.done(s):
    legal = s.rows[0].legal()                       # the set of legal next token ids
    pp.advance(s, [random.choice(sorted(legal))])   # pick any one — here, uniformly
print(tok.decode(s.ids[0]))
```

`legal` (or `pp.next_legal_tokens(s)` for the additive `[B, V]` mask a model adds
to its logits) is intersected from four sources: the grammar's FIRST set, the
in-scope variables (definite assignment), the depth/length guards, and a
min-cost-to-complete check that guarantees the program closes within `max_len`.
It is never empty until the program is done.

### 2. The prior + a model

`Prior` blends a model's logits, a symbolic prior, and the legal mask — built once,
then a session loop. The same `prior.apply` is what the vLLM processor calls, so
behavior is identical whether you own the loop or vLLM does:

```python
import pyprior as pp
from pyprior import PriorConfig
from pyprior.demo import RandomDecoder          # flat-logit stand-in model (the torch extra)

def sample(logits, temperature=1.0):            # [B, V] -> [B]
    import torch
    return torch.multinomial(torch.softmax(logits / temperature, -1), 1)[:, 0]

tok = pp.Tokenizer(value_range=(0, 9))
prior = pp.Prior(tok, PriorConfig(apply="entropy"), max_len=30)  # entropy | fixed | additive | off
model = RandomDecoder(tok.vocab_size)           # flat logits -> entropy ~max -> follows the prior

def function_prefix(arity):
    return (tok.id_of("BLK_1"), tok.id_of("FUNC_1"), tok.id_of(f"ARITY_{arity}"))

prior.start([function_prefix(1), function_prefix(2)])  # or start(n=B) for free-form
while not prior.done():
    logits = prior.apply(model(prior.input_ids)[:, -1])   # combine per PriorConfig.apply
    prior.advance(sample(logits))
prior.show(0)                                   # animate the first sample
```

Apply modes: **entropy** (param-free confidence gate — lean on the prior when the
model is unsure, fade as it sharpens; default), **fixed** (convex blend, weight
`alpha`), **additive** (prior as a bias), **off** (mask only).

Runnable walk-throughs in **[`examples/api/`](examples/api)**: `prior.py`,
`torch_nn.py` (a real `torch.nn` model), and `vllm_adapter.py` (serving). For
batched production serving the engine ships a
vLLM V1 logits processor (`pyprior.integrations.vllm.PyPriorLogitsProcessor`); the
model-free walk-through is `tests/integration/test_vllm_processor.py`.

## What it generates

A small, bounded, free-form subset of Python over integers — a module of top-level
statements, where a `def` is one statement form:

```python
x = 3                     # top-level statements: assign / if / for / def
def func(a, b):           # a def is one of them (params + locals are scoped)
    d = a * b + 1         # assignments to fresh / existing locals
    if d < 0:             # if / else
        return 0          # return is a statement (early returns OK), function-body only
    return d              # ...or fall through to None
```

The shape is configured entirely on the `Tokenizer`: `value_range` (integer
range), `max_block_stmts`, `max_expr_depth`, `max_control_depth`, `max_funcs`,
and `max_vars`. The grammar is otherwise *maximal* — bounded `for` loops (one- and
two-arg `range`), non-recursive `def` calls, `print(expr)`, and comparisons used
as int values (`b = a == 7` → `int(a == 7)`) are always available, and the
language stays total (recursion is structurally impossible, loops are bounded). By
default a `return` ends its block (no unreachable code) — pass `no_dead_code=False`
to allow statements after a return. To *restrict* generation, subtract with
`exclude=` rather than a flag: `exclude={"CALL"}` (no calls), `exclude={"for"}`
(no loops), `exclude={"PRINT"}` (pure functions). To generate a single function,
start from an explicit prefix such as `BLK_1 FUNC_1 ARITY_2` and set
`max_block_stmts=1`.
`init(..., exclude=...)` forbids whole forms during generation (e.g.
`exclude={"binop"}` or `{"func"}`).

## Watch it decode

`pp.show` replays a token stream into the animation above — one legal token at a
time, holes (`<expr>`, `<stmt>`, `<block>`) filling in:

```python
import pyprior as pp

tok = pp.Tokenizer(value_range=(0, 9))
ids = pp.init(vocab=tok, max_len=30)               # ...sample a stream (see "Using PyPrior")
pp.show(tok, ids.ids[0])                            # animate in the terminal
pp.show(tok, ids.ids[0], save_as_gif=True, gif_path="decode.gif")   # ...or export a GIF
```

## Status

Supports Integer types. List, strings, ... etc remains TBD.

## License

MIT — see [LICENSE](LICENSE).
