Metadata-Version: 2.4
Name: lap-score
Version: 0.3.0
Summary: lap - measure & improve the token-efficiency of agent-facing APIs (OpenAPI & MCP): scorer, linter, the LAP profile, and a token benchmark
License: MIT
Project-URL: Homepage, https://github.com/lCrazyblindl/lap
Project-URL: Roadmap, https://github.com/lCrazyblindl/lap/blob/main/ROADMAP.md
Project-URL: Changelog, https://github.com/lCrazyblindl/lap/blob/main/CHANGELOG.md
Project-URL: Leaderboard, https://github.com/lCrazyblindl/lap/blob/main/docs/LEADERBOARD.md
Keywords: llm,agents,openapi,mcp,tokens,api,token-efficiency
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx
Requires-Dist: tiktoken
Requires-Dist: pyyaml
Provides-Extra: mcp
Requires-Dist: fastmcp; extra == "mcp"
Provides-Extra: faithful
Requires-Dist: anthropic; extra == "faithful"
Provides-Extra: all
Requires-Dist: fastmcp; extra == "all"
Requires-Dist: anthropic; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Dynamic: license-file

# lap — token-efficiency scorer for agent-facing APIs

`lap` is the open, neutral, standalone toolkit (no pet-zoo dependency) that measures
how many **tokens** an API's definitions cost an LLM. It answers: *is my agent-API
menu efficient, and by how much could it shrink?* — an open, reproducible number to set
beside the fast-growing MCP/OpenAPI tooling (see [`../docs/LANDSCAPE.md`](../docs/LANDSCAPE.md)
for the neighbors LAP builds on and credits).

## Install

```bash
pip install -e .                  # from the repo root (or: pip install lap-score once published)
pip install -e ".[mcp]"           # + real-MCP baseline (fastmcp)
pip install -e ".[faithful]"      # + faithful Anthropic count_tokens
```

Core deps are just `httpx` + `tiktoken` + `pyyaml`; `fastmcp` and `anthropic` are optional extras.
Robust to real specs: `allOf`/`oneOf`/`anyOf`, `$ref` in params/requestBodies/responses,
path-item-level parameters, OpenAPI 3.1 `type` lists, external `$ref`s (left intact), YAML input,
**Swagger/OpenAPI 2.0** (response `schema`, `in: body` params, type-on-parameter, `#/definitions`),
and non-JSON media types (`*+json`, form, XML). Verified crash-free + non-degenerate across 175+
real APIs.guru specs — re-run with [`../experiments/fuzz_corpus.py`](../experiments/fuzz_corpus.py).

## Quickstart

```bash
lap score  https://petstore3.swagger.io/api/v3/openapi.json   # menu (bucket A) token cost
lap lint   https://petstore3.swagger.io/api/v3/openapi.json   # flag LAP rule violations
lap score  --mcp-url http://localhost:8080/mcp                # score a live MCP server's tools
lap score  lap/examples/bookstore.openapi.json

# no install needed, from the repo root:
python -m lap.score lap/examples/bookstore.openapi.json
```

Example output:

```
LAP menu score - Bookstore API
operations: 6   referenced component schemas: 2

  variant       A tokens  saved vs full  form
  openapi_full  418       +0%            6 tool(s)
  compact_sig   205       +51%           manifest text
  numbered      168       +60%           manifest text

Menu efficiency: compact signatures are +51% vs naive OpenAPI->tools (418 -> 205 tokens).
```

When `fastmcp` is installed, the score also includes a **real-MCP baseline**
(`FastMCP.from_openapi`) — what an actual MCP generator emits — plus its
output-schema-inclusive figure (`--no-mcp` to skip). On a real public API
(**Swagger Petstore**, 19 ops) the real MCP server costs **2226** menu tokens
(**3844** with output schemas) vs **415** for compact signatures — an ~81%
reduction. The toy finding holds in the wild: a real MCP generator is *heavier*
than the naive baseline.

The score also includes a lazy **`tool_search`** form (the Anthropic Tool Search /
Cloudflare Code Mode pattern: a fixed 2-tool menu + a name index, schemas loaded on
demand). Because it doesn't preload schemas, its bucket A is ~flat in the number of
operations — on a 120-operation API it collapses the menu ~83% vs full schemas,
beating even compact signatures at scale (Petstore: 1740 → 207, −88%).

- **Faithful counts:** set `ANTHROPIC_API_KEY` (uses the free Anthropic `count_tokens`
  endpoint; tool defs counted via the real `tools=` parameter). Without it, a GPT-style
  `tiktoken` approximation — absolute numbers approximate, **relative ordering robust**.

## CI gate

`--json` makes both commands machine-readable; thresholds set the exit code so LAP can fail a build:

```bash
lap score openapi.json --gate-form compact_sig --max-menu-tokens 800   # exit 1 if the menu is too heavy
lap lint  openapi.json --fail-on warn                                  # exit 1 on any warning
lap lint  openapi.json --ignore R2,A1                                  # suppress rules (or a ./.lapignore file)
```

GitHub Actions:

```yaml
- run: pip install lap-score          # or: pip install -e .
- run: lap score api/openapi.json --gate-form compact_sig --max-menu-tokens 800
- run: lap lint  api/openapi.json --fail-on warn
```

…or the bundled composite **Action** (one step, no manual install):

```yaml
- uses: lCrazyblindl/lap@v0.3.0
  with:
    spec: api/openapi.json
    max-menu-tokens: "800"     # gate the compact_sig menu (omit = report only)
    fail-on: warn              # fail on any lint warning (omit = report only)
```

Already lint OpenAPI with **Spectral**? The same LAP rules ship as a ruleset —
see [`../spectral/`](../spectral/README.md).

## What it measures (and what it doesn't)

It measures **bucket A** (the definitions/menu the model carries in context) and **estimates
C** (result size, from each response schema + an assumed `--page-size` — a structural lower
bound that captures keys/nesting/types). **B** (the call) still depends on per-API tasks; for a
full measured A/B/C run see
[`../experiments/token-bench`](../experiments/token-bench/README.md). The conventions
behind the compact form are the [LAP profile](../profile/llm-api-profile.md).

## Files

| file | role |
| --- | --- |
| `openapi_ir.py` | load any OpenAPI (file/URL) → normalized operations + `inline_refs` |
| `menu.py` | render the menu forms (openapi_full / compact_sig / numbered) from the IR |
| `mcp_form.py` | real-MCP baseline via `FastMCP.from_openapi` (optional; `--no-mcp` to skip) |
| `mcp_client.py` | scores a live MCP server's advertised tools (`lap score --mcp-url`) |
| `estimate.py` | estimates bucket C (result size) from response schemas (`--page-size`) |
| `tokens.py` | token counting (Anthropic endpoint, or tiktoken approx) |
| `score.py` | the `lap score` CLI |
| `lint.py` | the `lap lint` CLI — checks a spec against the LAP profile rules (D3/R1/R2/R3/W1/E1/A1) |
| `examples/` | sample specs: a Bookstore API, a gnarly OpenAPI 3.1 (allOf / $ref-params / nullable / external-ref), and a Swagger 2.0 spec (`swagger2.json`) |
