Metadata-Version: 2.4
Name: atpx
Version: 0.0.2
Summary: Agentic mathematics workbench: every result is a certificate
Project-URL: Homepage, https://phvv.me/atpx
Project-URL: Documentation, https://phvv.me/atpx
Project-URL: Repository, https://github.com/phvv-me/atpx
Project-URL: Issues, https://github.com/phvv-me/atpx/issues
Author-email: Pedro Valois <contact@phvv.me>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: agents,atp,atpx,certificate,mathematics,proof-assistant,provenance,smt,theorem-proving
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.14
Requires-Dist: cvc5>=1.2
Requires-Dist: cyclopts>=4
Requires-Dist: httpx>=0.28
Requires-Dist: mpmath>=1.3
Requires-Dist: patos>=0.0.6
Requires-Dist: plumbum>=2
Requires-Dist: pydantic>=2.13
Requires-Dist: python-flint>=0.8
Requires-Dist: sympy>=1.14
Requires-Dist: z3-solver>=4.12
Provides-Extra: dev
Requires-Dist: dirty-equals>=0.8; extra == 'dev'
Requires-Dist: faker>=30; extra == 'dev'
Requires-Dist: hypothesis>=6.100; extra == 'dev'
Requires-Dist: mypy>=1.11; extra == 'dev'
Requires-Dist: pyrefly>=1; extra == 'dev'
Requires-Dist: pytest-cov>=5; extra == 'dev'
Requires-Dist: pytest-mock>=3.14; extra == 'dev'
Requires-Dist: pytest-subprocess>=1.5; extra == 'dev'
Requires-Dist: pytest>=8; extra == 'dev'
Requires-Dist: respx>=0.22; extra == 'dev'
Requires-Dist: ruff>=0.6; extra == 'dev'
Requires-Dist: syrupy>=4; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.5; extra == 'docs'
Requires-Dist: mkdocs-static-i18n>=1.2; extra == 'docs'
Description-Content-Type: text/markdown

<div align="center">

<!-- [![atpx banner](https://raw.githubusercontent.com/phvv-me/atpx/main/docs/assets/banner.png)](https://phvv.me/atpx) -->

[![CI](https://github.com/phvv-me/atpx/actions/workflows/ci.yml/badge.svg)](https://github.com/phvv-me/atpx/actions/workflows/ci.yml)
[![Publish](https://github.com/phvv-me/atpx/actions/workflows/publish.yml/badge.svg)](https://github.com/phvv-me/atpx/actions/workflows/publish.yml)
[![PyPI](https://img.shields.io/pypi/v/atpx)](https://pypi.org/project/atpx/)
[![Python](https://img.shields.io/pypi/pyversions/atpx)](https://pypi.org/project/atpx/)
[![Docs](https://img.shields.io/badge/docs-phvv.me%2Fatpx-EAB308)](https://phvv.me/atpx)
[![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen)](https://github.com/phvv-me/atpx/actions/workflows/ci.yml)

</div>

[🇧🇷](https://phvv.me/atpx/pt-BR/) [🇲🇽](https://phvv.me/atpx/es/) [🇯🇵](https://phvv.me/atpx/ja/) [🇨🇳](https://phvv.me/atpx/zh/)

Agentic mathematics workbench: every result is a certificate

atpx (Automatic Theorem Prover Accelerated, formerly published as prova) is the
ledger underneath an agentic math loop. Every operation returns a
`Certificate` stamping the claim, the result, the engine name and version, the
hostname, device, seed, git revision, timestamp, and exit status. There are no
naked results anywhere, no daemon, and no database, just three surfaces over one
filesystem state.

1. **Python API.** `atpx.workspace()` returns a `Workspace` whose methods are the verbs.
2. **CLI.** `atpx <verb>` exposes the same methods through cyclopts, which owns the
   event loop and runs the async verbs to completion.
3. **Filesystem.** Blueprint directories hold per-host append-only evidence ledgers
   at `evidence/<hostname>.json`, and vault zettels carry node status.

## Install

```sh
pip install atpx
```
For a persistent CLI install:

```sh
uv tool install atpx   # or: pipx install atpx
```

## Use

A workspace root is any directory whose `atpx.toml` declares a `[workspace]`
table pointing at the blueprints directory and the zettel vault.

```sh
atpx check <slug> <claim>      # run a blueprint claim, stamp + persist a certificate
atpx check <slug> <claim> --background   # detach it, the child persists the certificate
atpx checks <slug>             # background submissions, pending or landed
atpx verify [<slug>]           # freshness sweep, re-run cheap claims, flag stale evidence
atpx brief <slug>              # the full agent context bundle for one node, as markdown
atpx judge_brief <slug>        # what changed since the last refuter judgment
atpx status                    # nodes grouped by zettel status
atpx graph                     # open nodes whose dependencies are all settled
atpx recall "<query>"          # federated search, one certificate of hits per source
atpx connect <slug>            # OEIS lookups for integer runs in the node's evidence
atpx strategies                # close-rates by strategy tag over the append-only logs
atpx lean_candidates           # sketched nodes ranked for formalization
atpx log <zettel> <role> <tag> "message" --status sketched   # role-gated journal
atpx index --write             # regenerate the results index note
atpx compute <engine> <operation> <payload>                  # one typed engine op
atpx prove "<goal>"            # try ATP engines in order, certify who closed it
atpx cross_check <operation> <payload>   # same probe on independent engines
```

Arguments are plain shell tokens under cyclopts, so a multi-word query needs
only ordinary quoting, `atpx recall "Leech lattice"`, never the doubled
quoting the old fire CLI required. Options follow their verb's signature
(`--seed 7`, `--background`, `--sources oeis`), and `atpx --help` lists every
verb with its parameters.

Status transitions are enforced in code, not prose. Only a refuter sets
`sketched` or `refuted` and only the formalizer sets `verified`.

```python
import atpx

ws = atpx.workspace()

# the I/O verbs are async, so they compose on one event loop
certificate = await ws.check("voronoi-e8-codec", "bijectivity-m1")
await ws.recall("196560, 16773120")   # the Leech theta series turns up OEIS A008408

# synchronous scripts and one-liners block on them through the sync facade
ws.sync.recall("196560, 16773120")
ws.cross_check("evaluate", "exp(pi*sqrt(163))")   # CPU-bound verbs stay plain sync
```

A blueprint claim is either a bare command string or a table with `command` and
`requires`. A claim whose requirement this host cannot meet, `requires = "cuda"`
on a machine without an NVIDIA driver, is skipped gracefully: the certificate
says so and nothing enters the evidence ledger, since a skip is not a run.

```toml
[claims]
bijectivity-m1 = "python {dir}/checks.py bijectivity 1"

[claims.gaussian-ladder]
command = "python {dir}/checks.py claim3"
requires = "cuda"
```

## Recall

`recall` fans one query out to every engine with the `search` capability and
returns a single certificate listing the hits per source. A source that is
unavailable or fails at run time becomes an entry under `errors` and the
certificate exits nonzero, so a partial recall is never mistaken for a complete
one. The network engines carry short timeouts and no API keys; loogle expects a
Lean identifier or pattern such as `Real.sqrt _ * _` and reports plain-English
queries as errors.

## Loop mechanics

These verbs exist because loop bookkeeping, not mathematics, is what slows an
agentic prove-and-refute cycle down.

`brief <slug>` assembles the whole opening context in one command, the node
text, its dependency statuses from the wikilink walk, the per-host evidence
summary with stale flags against the current git revision, the last refuter
judgment verbatim, and the blueprint file list, all as markdown on stdout.

`judge_brief <slug>` keeps re-judgment rounds cheap. Every refuter log line on
a node with a blueprint snapshots the full node text into the blueprint's
`judgments/` directory (a verbatim snapshot rather than a hash or git, since it
needs no repository around the vault and always yields a real diff). The verb
then prints the unified diff since that snapshot plus the claims whose
certificates landed after it.

`check <slug> <claim> --background` detaches the run in its own session with
stdout under the blueprint's `checks/` directory. The child stamps and persists
the certificate exactly as a foreground check does, and `checks <slug>` reports
each submission as pending or landed by reading the evidence ledgers. Remote
stays composition, lote runs this same CLI on another host.

`verify [<slug>]` is the freshness sweep. It re-runs every claim this host can
run (`requires`-gated claims it cannot meet are reported skipped), appends the
fresh certificates, and flags claims whose latest prior certificate carries a
git revision different from the tree now. Nothing is ever deleted.

`strategies` aggregates the `[who/strategy date]` journal lines across all
nodes into close-rates per strategy tag, and `lean_candidates` ranks sketched
nodes by backlink count over statement length. Both are v1 tables built on
documented heuristics, no learning and no scoring model.

`connect <slug>` extracts integer runs from the node's persisted certificate
payloads and queries each through the OEIS search engine via `recall`, ambient
conjecture generation under the same certificate discipline.

## Adversarial probes

`atpx.adversarial` is the refuter's typed toolkit, four reusable attack
probes that turn folklore into one-liners.

```python
from atpx.adversarial import boundary_ties, precision_tilt, rederive, seed_sensitivity

sweep = seed_sensitivity(lambda seed: run_check(seed), seeds=range(8))
assert sweep.stable, sweep.outcomes          # a nonzero spread is a refutation lead

ties = boundary_ties([[2, 0], [1, 1]])       # exact dyadic midpoints v/2, decoder tie bait
batches = precision_tilt(ties, [2**-20, 2**-30, 2**-40])   # verdicts must survive tilts

verdict = rederive(basis_a, basis_b)         # exact unimodular change-of-basis check
assert verdict.same_lattice, verdict.determinant
```

`boundary_ties` returns points exactly halfway between the origin and nearby
lattice vectors, exactly representable in binary floating point, so a decoder
must break the tie deterministically. `rederive` runs over exact rationals
through flint and reports integrality and the determinant of the basis-change
matrix.

## Concurrency

atpx has no threads. The verbs with real I/O underneath are `async def` and
compose on the caller's event loop. `recall` awaits every search source
concurrently (`httpx.AsyncClient` for the web sources and an asyncio
subprocess for the vault), `check` runs the claim command as an asyncio
subprocess, `verify` re-runs claims with at most four in flight, and
`connect` awaits `recall` per integer run. The purely local or CPU-bound
verbs (`status`, `graph`, `brief`, `compute`, `prove`, `cross_check`, ...)
stay plain sync; `cross_check` probes its in-process engines in a sequential
loop on purpose, since there is no I/O for a loop to overlap.

The CLI never exposes any of this, cyclopts owns the event loop and runs sync
and async verbs alike. In Python, async code awaits the verbs directly and
can fan them out with `asyncio.gather`; synchronous scripts use the
`workspace().sync` facade, which drives one verb to completion on a fresh
loop per call. Calling the facade from inside an already running loop fails
with a clear error instead of a nested-loop crash, await the verb directly
there.

### Free threading (historical)

Free-threaded CPython 3.14t was evaluated while these surfaces still ran on
`ThreadPoolExecutor`. That question is now moot by design, since no
shared-state parallelism remains anywhere in the package. The wheel matrix
from the evaluation stays below as historical context. It was measured on
`cpython-3.14.3+freethreaded` (macOS arm64, uv-managed, 2026-06).

| dependency | free-threaded wheel | imports under `PYTHON_GIL=0` |
| ------------- | ------------------- | ---------------------------- |
| pydantic-core | yes | yes, GIL stays off |
| z3-solver | yes | yes, GIL stays off |
| python-flint | yes (0.8.0) | yes, GIL stays off |
| cvc5 | **no** (through 1.3.4, `cp38`-`cp314` only) | n/a |
| sympy, mpmath, httpx, fire, plumbum, patos | yes | yes |

At the time, `uv pip install atpx` on 3.14t failed to resolve because of
cvc5, while with cvc5 stubbed out 130 of the 132 tests passed under
`PYTHON_GIL=0`. The package was already free-threading clean and the threads
have since been removed entirely. Standard CPython 3.14 is the supported
runtime.

## Engines

Engines self-register through a typed registry and stamp their name and version
into every certificate. Each declares one capability.

| engine  | capability  | availability                                  |
| ------- | ----------- | --------------------------------------------- |
| sympy   | evaluate    | always (hard dependency)                       |
| mpmath  | evaluate    | always (hard dependency)                       |
| flint   | factor      | always (hard dependency, python-flint)         |
| pari    | factor      | linux only, and only when cypari2 is installed |
| z3      | solve-smt   | always (hard dependency, z3-solver)            |
| cvc5    | solve-smt   | always (hard dependency)                       |
| eprover | prove-tptp  | only when the `eprover` binary is on PATH      |
| vampire | prove-tptp  | only when the `vampire` binary is on PATH      |
| vault   | search      | only when the `chefe` binary is on PATH        |
| oeis    | search      | always, network EAFP at run time               |
| loogle  | search      | always, network EAFP at run time               |
| arxiv   | search      | always, network EAFP at run time               |
| zbmath  | search      | always, network EAFP at run time               |

cypari2 is not a dependency because its wheel only installs cleanly on linux
x86-64, so the pari engine reports unavailable everywhere else. The two
first-order provers are subprocess engines and never Python dependencies.

The vault engine searches the Zettelkasten through `qmd search` inside the
chefe env, the no-LLM BM25 surface, so recall never waits on a model download.
The four web sources are keyless: OEIS by sequence values or words, loogle for
mathlib declarations, arXiv full-field phrase search, and the zbMATH Open REST
API, which answers without a key.

## Documentation

Full documentation lives at [https://phvv.me/atpx](https://phvv.me/atpx).

For LLM-assisted use, start with [`llms.txt`](https://phvv.me/atpx/llms.txt).

## Development

The dev environment is managed by [uv](https://docs.astral.sh/uv/).

- Install: `uv sync --extra dev`
- Lint: `uv run ruff check . && uv run ruff format --check .`
- Typecheck: `uv run mypy src && uv run pyrefly check`
- Test: `uv run pytest -q`
- Docs: `uv run --extra docs mkdocs build -d site`
- Build: `uv build`
