Metadata-Version: 2.4
Name: llm-speed
Version: 0.0.2
Summary: Benchmark any LLM on any hardware. CLI for the llm-speed.com flywheel.
Author-email: meadow-kun <2424351+meadow-kun@users.noreply.github.com>
License: Apache-2.0
Project-URL: Homepage, https://llm-speed.com
Project-URL: Source, https://github.com/meadow-kun/llm-speed
Project-URL: Documentation, https://llm-speed.com/methodology
Project-URL: Issues, https://github.com/meadow-kun/llm-speed/issues
Keywords: llm,benchmark,inference,tokens-per-second,ollama,llama.cpp,vllm,mlx
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Environment :: GPU
Classifier: Environment :: GPU :: NVIDIA CUDA
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: System :: Benchmark
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx<0.30,>=0.27
Requires-Dist: psutil>=5.9
Requires-Dist: rich>=13.7
Requires-Dist: cryptography>=44.0.1
Requires-Dist: joserfc>=1.0
Provides-Extra: seed
Requires-Dist: praw<8,>=7.7; extra == "seed"
Provides-Extra: mlx
Requires-Dist: mlx-lm>=0.18; extra == "mlx"
Provides-Extra: vllm
Requires-Dist: vllm>=0.6; extra == "vllm"
Provides-Extra: exllamav2
Requires-Dist: exllamav2>=0.2; extra == "exllamav2"
Provides-Extra: mcp
Requires-Dist: mcp>=1.10; extra == "mcp"
Requires-Dist: cachetools>=5; extra == "mcp"
Provides-Extra: test
Requires-Dist: pytest>=8; extra == "test"
Requires-Dist: pytest-cov>=5; extra == "test"
Dynamic: license-file

# llm-speed-web

Source for **llm-speed.com** — the canonical, crowdsourced source of truth for
how fast LLMs actually run, across hosted APIs, consumer GPUs, and prosumer rigs.

## Install

### One-liners

- pipx (recommended for local-LLM users): `pipx install llm-speed`
- uv: `uv tool install llm-speed`
- Homebrew: `brew install llm-speed/tap/llm-speed` *(coming soon)*
- Docker: `docker run --rm -it llmspeed/llm-speed bench` *(coming soon)*
- npm: `npm install -g llm-speed` *(coming soon)*
- Standalone binary: download from [Releases](https://github.com/meadow-kun/llm-speed/releases) *(coming soon)*

### Optional backends

- MLX (Apple Silicon): `pip install 'llm-speed[mlx]'`
- vLLM (NVIDIA): `pip install 'llm-speed[vllm]'`
- ExLlamaV2 (NVIDIA): `pip install 'llm-speed[exllamav2]'`
- llama.cpp + ollama are detected as binaries on PATH; no extras needed.
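
For illustration, PATH-based detection amounts to a `shutil.which` probe per candidate binary. This is a minimal sketch, not the CLI's actual detection code, and the binary-name lists here are assumptions:

```python
import shutil

# Candidate binary names for the PATH-detected backends (illustrative —
# the real CLI may probe a different set of names).
PATH_BACKENDS = {
    "ollama": ["ollama"],
    "llama.cpp": ["llama-server", "llama-cli"],
}

def detect_path_backends() -> dict[str, str]:
    """Return a mapping of backend name -> resolved binary path."""
    found = {}
    for backend, names in PATH_BACKENDS.items():
        for name in names:
            path = shutil.which(name)
            if path:
                found[backend] = path
                break
    return found

if __name__ == "__main__":
    for backend, path in detect_path_backends().items():
        print(f"{backend}: {path}")
```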

See `docs/RELEASE.md` for the publishing runbook (release steps + rollback).

## Status

Phase 0 (seeding). The CLI and website come next; see `docs/` for the brief and plan.

## Layout

```
docs/                Strategic & design documents
  BRIEF.md           Project brief — why, who, monetization, phases
  CLI.md             llm-speed CLI requirements + portability strategy
  DATA_SOURCES.md    Folklore source inventory + scrape policy
  MARKETING.md       CLI launch & flywheel marketing strategy

db/
  schema.sql         SQLite seed schema (mirrors the eventual Postgres prod schema)
  seed.sqlite        Created on first run

seed/                The seeding pipeline (Python)
  models.py            Plain dataclasses (RawDocument, Claim)
  db.py                Connection + insert helpers
  extractors.py        Regex-based (model × hardware × backend × tok/s) extractor
  reddit/client.py     Reddit (PRAW) — r/LocalLLaMA + neighbors, full sweep
  scrapers/hn.py             Hacker News (Algolia API)
  scrapers/openrouter.py     OpenRouter API
  scrapers/localscore.py     LocalScore HTML + Next.js data
  scrapers/artificial_analysis.py   AA public pages (cross-reference)
  scrapers/github.py         GitHub issues/PRs in inference backends
  scrapers/blogs.py          Curated blog list
  run.py               Top-level orchestrator
```
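
To give a feel for what `extractors.py` does, a stripped-down tok/s pattern might look like the sketch below. This is purely illustrative; the real extractor handles many more phrasings and also pulls out the model, hardware, and backend dimensions:

```python
import re

# Illustrative pattern: a decimal number followed by a tok/s-style unit.
# The real regex set in seed/extractors.py is more elaborate.
TPS_RE = re.compile(
    r"(?P<tps>\d+(?:\.\d+)?)\s*"
    r"(?:tok(?:ens)?\s*/\s*s(?:ec(?:ond)?)?|t/s|tps)\b",
    re.IGNORECASE,
)

def extract_tps(text: str) -> list[float]:
    """Pull every tokens-per-second figure out of a free-text comment."""
    return [float(m.group("tps")) for m in TPS_RE.finditer(text)]

print(extract_tps("Qwen3 32B on a 3090 gets ~34.5 tok/s, 28 t/s at long context"))
```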

## Quick start

```sh
# 1. Install
python -m venv .venv && source .venv/bin/activate
pip install -e .                       # uses pyproject.toml

# 2. Set credentials (only what you have available; missing ones get skipped)
export REDDIT_CLIENT_ID=...
export REDDIT_CLIENT_SECRET=...
export REDDIT_USER_AGENT="llm-speed-seeder/0.1 (+https://llm-speed.com)"
# Reddit's API rules want a description + contact info. The project URL is
# the right contact for a project-operated bot — DO NOT put `by u/<handle>`
# here, since Reddit logs the user-agent on every request and that ties
# every scrape back to a personal handle.
export GITHUB_TOKEN=...                # optional but strongly recommended

# 3. Smoke test
python -m seed.run --quick --only hn artificial_analysis blogs

# 4. Full sweep
python -m seed.run

# 5. Inspect what landed
python -m seed.run --stats
sqlite3 db/seed.sqlite \
  'SELECT model_family, hardware_name, backend, AVG(decode_tps), COUNT(*)
   FROM claims
   WHERE confidence > 0.5
   GROUP BY 1,2,3 ORDER BY 5 DESC LIMIT 30;'
```
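
The "missing ones get skipped" behavior in step 2 presumably reduces to an env-var check per seeder. A minimal sketch of that idea (the variable names match the exports above, but the mapping and skip logic are assumptions, not copied from `seed/run.py`):

```python
import os

# Env vars each seeder needs before it can run (assumed mapping —
# check seed/run.py for the authoritative list).
REQUIRED_ENV = {
    "reddit": ["REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT"],
    "github": ["GITHUB_TOKEN"],
    "hn": [],        # Algolia API needs no credentials
    "blogs": [],
}

def runnable_seeders() -> list[str]:
    """Return the seeders whose required environment variables are all set."""
    return [
        name for name, keys in REQUIRED_ENV.items()
        if all(os.environ.get(k) for k in keys)
    ]

if __name__ == "__main__":
    print("will run:", runnable_seeders())
```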

## Per-seeder usage

Each module is also runnable on its own:

```sh
python -m seed.reddit.client --quick                      # auth smoke test
python -m seed.scrapers.hn --query "Qwen3 Coder tok/s"
python -m seed.scrapers.openrouter --no-endpoints
python -m seed.scrapers.localscore --max-tests 50
python -m seed.scrapers.github --repo ggerganov/llama.cpp
python -m seed.scrapers.blogs --urls-file my_urls.txt
python -m seed.scrapers.artificial_analysis
```

## Local CI (no GitHub Actions)

This project runs its CI on the developer's machine — the lint / test /
smoke / API-roundtrip / build pipeline that used to run in GitHub Actions
on every push now lives at `scripts/check.sh`.

```bash
# Full check (~30s — lint, test, smoke, API roundtrip, build)
./scripts/check.sh

# Inner-loop fast pass (~10s — lint + test only)
./scripts/check.sh --quick

# Skip individual phases:
SKIP_BUILD=1 ./scripts/check.sh
SKIP_LINT=1 SKIP_BUILD=1 ./scripts/check.sh
```

To run it automatically on every `git push` (skippable per-push with
`--no-verify`), opt in to the bundled hook once per clone:

```bash
git config core.hooksPath .githooks
```

The pre-push hook runs `--quick` by default; `CHECK_FULL=1 git push`
runs the full check.

The previous `.github/workflows/ci.yml` was deleted; only manual /
release-event workflows remain (`release.yml`, `daily-metrics.yml`,
`scheduled-seed.yml`, `reddit-poster.yml`). None of them fire on push.

## Design notes

- **Idempotent.** Re-running a seeder is safe; documents are uniqued by `(source, source_id)`.
- **Provenance preserved.** Every claim links back to its source URL + author + scrape time. Folklore stays distinguishable from CLI-verified canonical results.
- **Confidence-scored.** Regex extraction caps at ~0.85; structured-API extraction reaches ~0.9; nothing in the seed phase counts as canonical (that's the CLI's job).
- **Polite.** All scrapers honor source-appropriate rate limits and identify themselves in `User-Agent`.
- **No LLM calls in this phase.** Heuristic extraction only. An LLM second-pass for ambiguous cases is a Phase 1 add-on (see `docs/CLI.md`).
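
The idempotency guarantee in the first bullet maps naturally onto a SQLite upsert keyed on `(source, source_id)`. A self-contained sketch under that assumption (the real table in `db/schema.sql` has more columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE documents (
        source    TEXT NOT NULL,
        source_id TEXT NOT NULL,
        body      TEXT,
        UNIQUE (source, source_id)   -- the idempotency key
    )
""")

def insert_document(source: str, source_id: str, body: str) -> None:
    # Re-running a seeder replays the same rows; the conflict clause
    # makes that a no-op instead of a duplicate.
    conn.execute(
        "INSERT INTO documents (source, source_id, body) VALUES (?, ?, ?) "
        "ON CONFLICT (source, source_id) DO NOTHING",
        (source, source_id, body),
    )

insert_document("hn", "item-123", "45 tok/s on a 3090")
insert_document("hn", "item-123", "45 tok/s on a 3090")  # replay: no duplicate
print(conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0])
```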

## Next milestone

`docs/CLI.md` — design for the `llm-speed` benchmark CLI. The seeded folklore is
inventory; the CLI is the actual data-quality moat. Ship the CLI in 2–4 weeks; if
adoption trips the kill criterion (see `docs/MARKETING.md`), stop before building
the website.
