Metadata-Version: 2.4
Name: tokenflame
Version: 0.1.0
Summary: Token-level behavioral profiler for LLM generation. Flame graph for LLM reasoning.
Project-URL: Homepage, https://github.com/bh3r1th/tokenflame
Project-URL: Repository, https://github.com/bh3r1th/tokenflame
Project-URL: Bug Tracker, https://github.com/bh3r1th/tokenflame/issues
License: MIT
License-File: LICENSE
Keywords: entropy,llm,logprobs,profiler,rag,tokenizer,visualization
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Requires-Dist: anthropic>=0.25
Requires-Dist: click>=8.1
Requires-Dist: httpx>=0.27
Requires-Dist: openai>=1.30
Requires-Dist: pydantic>=2.0
Requires-Dist: python-dotenv>=1.0
Requires-Dist: rich>=13
Requires-Dist: tiktoken>=0.7
Requires-Dist: websockets>=12
Provides-Extra: all
Requires-Dist: aiosqlite>=0.20; extra == 'all'
Requires-Dist: fastapi>=0.110; extra == 'all'
Requires-Dist: groq>=0.9; extra == 'all'
Requires-Dist: together>=1.2; extra == 'all'
Requires-Dist: uvicorn[standard]>=0.29; extra == 'all'
Provides-Extra: groq
Requires-Dist: groq>=0.9; extra == 'groq'
Provides-Extra: server
Requires-Dist: aiosqlite>=0.20; extra == 'server'
Requires-Dist: fastapi>=0.110; extra == 'server'
Requires-Dist: uvicorn[standard]>=0.29; extra == 'server'
Provides-Extra: together
Requires-Dist: together>=1.2; extra == 'together'
Description-Content-Type: text/markdown

# tokenflame

Token-level behavioral profiler for LLM generation.
**"Flame graph for LLM reasoning."**

Works with prepared RAG context and structured prompt inputs.
Not a document ingestion framework.

---

## Install

From PyPI (coming soon):

    pip install tokenflame

From source (current):

    git clone https://github.com/bh3r1th/tokenflame
    cd tokenflame/packages/backend
    pip install -e .

## Quick Start

    tokenflame run --prompt "What is DNA?" --model-a openai/gpt-5.5 --model-b ollama/llama4-maverick

## Output Files

Traces are auto-named with timestamp and prompt slug by default:

    tokenflame_what_is_dna_20260525_172300.json
    tokenflame_what_is_dna_20260525_172300.html

Use --out to specify an explicit filename:

    tokenflame run --prompt "..." --out my_trace.json --html
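
The default names above appear to follow a `tokenflame_<slug>_<YYYYMMDD>_<HHMMSS>` pattern. The sketch below reproduces that shape for scripts that need to predict or match trace names. The function name `trace_basename` and the slug rule (lowercase, non-alphanumerics collapsed to `_`) are assumptions inferred from the example filename, not tokenflame's actual code.

```python
import re
from datetime import datetime


def trace_basename(prompt: str, when: datetime) -> str:
    """Build a name shaped like tokenflame_<slug>_<YYYYMMDD>_<HHMMSS>.

    Hypothetical helper: the slug rule is inferred from the README example.
    """
    # Lowercase, then collapse every run of non-alphanumerics into "_".
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return f"tokenflame_{slug}_{when:%Y%m%d_%H%M%S}"


name = trace_basename("What is DNA?", datetime(2026, 5, 25, 17, 23, 0))
print(name + ".json")  # tokenflame_what_is_dna_20260525_172300.json
```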

## With RAG Context

    tokenflame run \
      --prompt "What does the policy say about returns?" \
      --context-file retrieved_chunks.txt \
      --context-mode system \
      --model-a openai/gpt-5.5 \
      --model-b ollama/llama4-maverick \
      --out trace_returns.json \
      --html \
      --open

## View a Saved Trace

    tokenflame view trace_returns.json
    tokenflame view trace_returns.html

## Supported Context Formats

| Format | Notes |
|--------|-------|
| .txt   | Plain text, UTF-8 |
| .md    | Markdown |
| .json  | Pretty-printed JSON |
| .jsonl | One JSON object per line |
| .yaml  | Requires: pip install pyyaml |
| .csv   | Tabular, max 500 rows |

PDF and DOCX are not supported.
Extract the text first and save it as .txt.
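
Since retrieval happens outside tokenflame, the retrieved chunks have to be written to one of the formats above before a run. A minimal sketch for the plain-text case follows. The helper name `write_context_file` and the blank-line separator between chunks are illustrative assumptions; tokenflame does not prescribe a chunk layout.

```python
from pathlib import Path


def write_context_file(chunks: list[str], path: str) -> Path:
    """Join retrieved chunks into one UTF-8 text file.

    Hypothetical helper: chunks are separated by a blank line, which is an
    assumed convention rather than anything tokenflame requires.
    """
    # Drop empty chunks and trim surrounding whitespace.
    cleaned = [c.strip() for c in chunks if c.strip()]
    out = Path(path)
    out.write_text("\n\n".join(cleaned) + "\n", encoding="utf-8")
    return out
```

The resulting file can then be passed with `--context-file`.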

## Configure

Create ~/.tokenflame/config.toml:

    [providers]
    openai_api_key    = "sk-..."
    anthropic_api_key = "sk-ant-..."
    groq_api_key      = "..."
    ollama_host       = "http://localhost:11434"

    [ui]
    default_model_a = "openai/gpt-5.5"
    default_model_b = "ollama/llama4-maverick"

Or use environment variables:

    OPENAI_API_KEY
    ANTHROPIC_API_KEY
    GROQ_API_KEY
    OLLAMA_HOST

## Available Models

| Model ID | Provider | Logprobs |
|----------|----------|----------|
| openai/gpt-5.5 | OpenAI | ❌ (reasoning model, no logprobs) |
| openai/gpt-5.4 | OpenAI | ❌ (reasoning model, no logprobs) |
| anthropic/claude-sonnet-4-6 | Anthropic | ❌ |
| anthropic/claude-opus-4-6 | Anthropic | ❌ |
| anthropic/claude-opus-4-7 | Anthropic | ❌ |
| ollama/llama4-maverick | Ollama (local) | ⚠️ (logprobs if Ollama >= 0.12.11) |
| ollama/qwen3:32b | Ollama (local) | ⚠️ (logprobs if Ollama >= 0.12.11) |

Models without logprob support stream tokens but show no entropy signal.
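
The entropy signal depends on logprobs because per-token uncertainty is computed from the probability distribution the model reports at each step. The sketch below shows one common way to do this: Shannon entropy over the top-k logprobs for a single token position, renormalized because top-k lists rarely sum to 1. The function name `topk_entropy_bits` and the renormalization step are assumptions; tokenflame's exact formula is not documented here.

```python
import math


def topk_entropy_bits(logprobs: list[float]) -> float:
    """Shannon entropy in bits over a renormalized top-k distribution.

    `logprobs` holds natural-log probabilities for the candidate tokens at
    one position. Hypothetical helper, not tokenflame's own implementation.
    """
    probs = [math.exp(lp) for lp in logprobs]
    total = sum(probs)
    if total <= 0.0:
        return 0.0
    entropy = 0.0
    for p in probs:
        q = p / total  # renormalize so the top-k candidates sum to 1
        if q > 0.0:
            entropy -= q * math.log2(q)
    return entropy
```

A position where one candidate holds nearly all the probability scores close to 0 bits, while two equally likely candidates score 1 bit.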

## What tokenflame Shows

- **Entropy heatmap** — which tokens were uncertain vs confident
- **Tokenizer diff** — where GPT and Llama split text differently
- **DTW alignment** — where outputs structurally diverged
- **Divergence markers** — exact fork points scored by similarity
- **Replay** — scrub through generation token by token
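
For readers unfamiliar with DTW, the following is a minimal sketch of classic dynamic time warping over two token sequences, using a 0/1 mismatch cost. It illustrates the general technique only; the function name `dtw_distance` and the cost function are assumptions, and tokenflame's alignment may differ in cost, windowing, or scoring.

```python
def dtw_distance(a: list[str], b: list[str]) -> float:
    """Classic dynamic time warping cost with a 0/1 token mismatch penalty.

    Illustrative only: not tokenflame's alignment code.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0.0 if a[i - 1] == b[j - 1] else 1.0
            # Extend the cheapest of: advance a only, advance b only, advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


print(dtw_distance(["a", "b", "c"], ["a", "x", "c"]))  # 1.0
```

Because warping lets one token align with several, `["a", "b"]` against `["a", "a", "b"]` costs 0.0. If exactly one sequence is empty, the result is infinity.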

## What tokenflame Does NOT Do

- Retrieval or embedding
- Document parsing (PDF, DOCX)
- Eval scoring or CI/CD gates
- Agent tracing
- Multi-turn conversation

## Trace Schema

All traces are saved as JSON following schema/trace.schema.json.
Build your own viewer on top of the schema — it is stable.
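
As a starting point for a custom viewer, this sketch loads a saved trace and reports the type of each top-level field without assuming any field names. The helper name `summarize_trace` is hypothetical, and the assumption that a trace is a JSON object at the top level should be checked against schema/trace.schema.json.

```python
import json
from pathlib import Path


def summarize_trace(path: str) -> dict[str, str]:
    """Map each top-level key of a JSON trace to the type name of its value.

    Assumes the trace is a JSON object; no field names are presumed.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in data.items()}
```

Calling `summarize_trace("trace_returns.json")` on a saved trace gives a quick overview before writing any schema-specific code.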

## License

MIT

