Metadata-Version: 2.4
Name: stackunderflow
Version: 0.6.1
Summary: A local-first knowledge base for your AI coding sessions
Project-URL: Homepage, https://github.com/0bserver07/StackUnderflow
Project-URL: Repository, https://github.com/0bserver07/StackUnderflow
Project-URL: Issues, https://github.com/0bserver07/StackUnderflow/issues
Project-URL: Documentation, https://0bserver07.github.io/StackUnderflow/
Project-URL: Changelog, https://github.com/0bserver07/StackUnderflow/blob/main/CHANGELOG.md
Author: StackUnderflow Contributors
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.11
Requires-Dist: click>=8.0.0
Requires-Dist: fastapi>=0.100.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: mcp>=1.2.0
Requires-Dist: orjson>=3.9.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: python-multipart>=0.0.6
Requires-Dist: rich>=13.0.0
Requires-Dist: uvicorn[standard]>=0.23.0
Requires-Dist: uvloop>=0.17.0; sys_platform != 'win32'
Requires-Dist: winloop>=0.1.8; sys_platform == 'win32'
Provides-Extra: dev
Requires-Dist: build>=1.0.0; extra == 'dev'
Requires-Dist: mypy>=1.5.0; extra == 'dev'
Requires-Dist: psutil>=5.9.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: twine>=4.0.0; extra == 'dev'
Requires-Dist: types-psutil; extra == 'dev'
Requires-Dist: types-python-dateutil; extra == 'dev'
Description-Content-Type: text/markdown

# StackUnderflow

**StackUnderflow: local observability for your coding agents.** Search, replay, and analyse every session, entirely offline. Support starts with [Claude Code](https://claude.ai/code).

[Quickstart](#quickstart) | [Features](#features) | [Configuration](#configuration) | [Architecture](#architecture) | [Contributing](#contributing)

![StackUnderflow Dashboard](assets/dashboard.png)

## Quickstart

**Requirements:** Python 3.11+ and an existing `~/.claude/` directory from using Claude Code. Adapters for more coding agents are on the way.

```bash
pip install stackunderflow
stackunderflow init
```

Your browser opens to `http://localhost:8081` with every project under `~/.claude/projects/` indexed and ready to browse.

**Common knobs:**

```bash
stackunderflow init --no-browser      # don't auto-open the browser
stackunderflow cfg set port 8090      # change the port
stackunderflow backup create          # snapshot ~/.claude/ before risky changes
stackunderflow --help                 # everything else
```

If port 8081 is taken: `stackunderflow cfg set port <free-port>` then re-run `init`.

### Nix (reproducible)

A flake is provided for Nix users — no Python or Node setup required:

```bash
nix run github:0bserver07/StackUnderflow      # launch the dashboard
nix build github:0bserver07/StackUnderflow    # build, output at ./result
nix develop                                   # dev shell with python + node
```

See [flake.nix](flake.nix) for the full derivation.

### Development setup

If you want to hack on StackUnderflow (or install from source), you'll also need Node 18+ to build the React UI:

```bash
git clone https://github.com/0bserver07/StackUnderflow.git
cd StackUnderflow

# 1. Build the React UI (one-time)
cd stackunderflow-ui && npm install && npm run build && cd ..

# 2. Install the Python package in editable mode
pip install -e .

# 3. Launch the dashboard
stackunderflow init
```

## Features

- **Analytics dashboard** — token usage, cost breakdown, model distribution, error patterns, hourly activity
- **Cost tab** — token-burn attribution: top sessions by cost, most expensive commands (click → Messages tab), tool-cost ranking, token composition (donut + stacked daily), cache ROI, outliers, retry-loop signals, week-over-week trends, and an error-cost estimate. Filter state (range / session / tool) is URL-encoded so views are shareable. Implementation: `stackunderflow-ui/src/pages/cost/` + `stackunderflow/routes/cost.py` + `stackunderflow/routes/commands.py`.
- **Session viewer** — browse individual JSONL session files with conversation replay, sub-agent grouping, per-session cost. Deep-linked detail views show a breadcrumb + back button.
- **Light / dark theme** — toggle in the header (sun/moon). Persists to `localStorage['suf:theme']`.
- **Full-text search** — across all sessions, with filters for date, model, and role
- **Q&A pair detection** — heuristic extraction of question-answer pairs based on text patterns and follow-up cues
- **Auto-tagging** — tags sessions by language, framework, topic, and intent (`build`, `fix`, `explore`, `refactor`, `test`, `ops`) using keyword and pattern matching
- **Resolution status** — flags Q&A pairs as `resolved`, `looped`, or `abandoned` based on follow-up patterns, with loop counts surfaced in the dashboard
- **Bookmarks** — save and organise important conversations
- **Incremental backups** — `stackunderflow backup create` snapshots `~/.claude/` with hard-linked `rsync --link-dest` (use `backup auto` on macOS for daily scheduling)
- **Multi-project** — switch between projects, view cross-project statistics
- **Legacy project recovery** — pre-January 2026 Claude Code stored prompts in `~/.claude/history.jsonl` instead of per-project JSONL files. StackUnderflow auto-detects these old projects and surfaces them from that file (prompts and timestamps only — token/model data wasn't stored locally in the old format).
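
The incremental-backup bullet relies on `rsync --link-dest`, which hard-links files that are unchanged since the previous snapshot so each new snapshot only costs the space of what changed. The sketch below shows how such a command line could be assembled; the function name and the snapshot layout are assumptions for illustration, not the actual `backup create` implementation.

```python
from pathlib import Path
from typing import Optional


def rsync_snapshot_cmd(source: Path, snapshot: Path, previous: Optional[Path]) -> list[str]:
    """Build an rsync argv that copies `source` into a new `snapshot` directory.

    Hypothetical helper: when `previous` is given, unchanged files are
    hard-linked against it instead of being copied again.
    """
    cmd = ["rsync", "-a"]
    if previous is not None:
        # rsync resolves a relative --link-dest against the destination
        # directory, so insist on an absolute path to avoid surprises.
        if not previous.is_absolute():
            raise ValueError("previous snapshot path must be absolute")
        cmd.append(f"--link-dest={previous}")
    # Trailing slash on the source copies its contents, not the directory itself.
    cmd += [f"{source}/", f"{snapshot}/"]
    return cmd
```

Passing the returned list to `subprocess.run(cmd, check=True)` would perform the copy; the first snapshot simply has no `previous` to link against.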

## Using as a Library

StackUnderflow also works as a Python package for scripting and automation. The
public API reads from the local SQLite store at `~/.stackunderflow/store.db` so
you get every project from every provider that has been ingested:

```python
import stackunderflow

# Every project the local store knows about — provider-tagged.
projects = stackunderflow.list_projects()
# [{"slug": ..., "provider": "claude" | "codex" | ...,
#   "display_name": ..., "path": ..., "first_seen": ..., "last_modified": ...}, ...]

# Filter to one provider:
codex_only = stackunderflow.list_projects(provider="codex")

# Pipeline-formatted messages + statistics for one project:
messages, stats = stackunderflow.process(projects[0]["slug"])

tokens = stats["overview"]["total_tokens"]
print(f"Sessions: {stats['overview']['sessions']}")
print(f"Tokens: {tokens['input']:,} in / {tokens['output']:,} out")
print(f"Total cost: ${stats['overview']['total_cost']:.2f}")
```

If the store doesn't exist yet (fresh install, no `stackunderflow init` run),
`list_projects()` returns an empty list rather than raising. `process()` raises
`KeyError` when a slug isn't found.
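
Because `process()` raises `KeyError` for an unknown slug, scripts that loop over user-supplied slugs may want a small wrapper. This is a minimal sketch: `overview_or_none` and `format_overview` are hypothetical helpers, and the `process` callable is passed in so the snippet stays runnable without a populated store. The dictionary keys are the ones used in the example above.

```python
from typing import Callable, Optional


def overview_or_none(process: Callable[[str], tuple], slug: str) -> Optional[dict]:
    """Return stats["overview"] for `slug`, or None when the slug is unknown."""
    try:
        _messages, stats = process(slug)
    except KeyError:
        return None
    return stats["overview"]


def format_overview(overview: dict) -> str:
    """Render the overview fields shown in the README example as one line."""
    tokens = overview["total_tokens"]
    return (
        f"{overview['sessions']} sessions, "
        f"{tokens['input']:,} in / {tokens['output']:,} out, "
        f"${overview['total_cost']:.2f}"
    )
```

In a real script the call would be `overview_or_none(stackunderflow.process, slug)`.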

For lower-level access, the store helpers are still importable:

```python
from stackunderflow.store import db, queries
from stackunderflow.stats import classifier, enricher, aggregator, formatter
from stackunderflow.infra.discovery import locate_logs
```

## MCP server

StackUnderflow ships an [MCP](https://modelcontextprotocol.io/) server that exposes your local Claude Code session logs as a tool any MCP client can call. With it wired up, your AI can answer "what tools did I run in the last hour" or "find the last error I hit" by reading your real on-disk session files.

```bash
stackunderflow-mcp     # console script
stackunderflow mcp     # equivalent CLI subcommand
```

To wire it into Claude Desktop on macOS, add this to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "stackunderflow": {
      "command": "stackunderflow-mcp"
    }
  }
}
```

Restart Claude Desktop and the `session_query` tool becomes callable. See [`docs/mcp.md`](docs/mcp.md) for the full tool reference, Cursor / Claude Code wiring, supported agent roots, and the architectural rationale (stateless, adapter-imported, schema-evolution insulated).
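
If the config file already lists other MCP servers, the `stackunderflow` entry should be merged in rather than pasted over them. The following sketch does that with the standard library only; both function names are hypothetical, and it assumes the file, when present, contains valid JSON.

```python
import json
from pathlib import Path


def with_stackunderflow_server(config: dict) -> dict:
    """Return a copy of `config` with the stackunderflow MCP entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["stackunderflow"] = {"command": "stackunderflow-mcp"}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read, merge, and rewrite the config file; a missing file counts as empty."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(with_stackunderflow_server(config), indent=2) + "\n")
```

Existing servers and unrelated top-level keys are preserved; only the `stackunderflow` key is added or overwritten.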

## Configuration

```bash
# Change port (default: 8081)
stackunderflow cfg set port 8090

# Disable auto-opening browser
stackunderflow cfg set auto_browser false

# Show current settings
stackunderflow cfg ls

# Reset a setting to default
stackunderflow cfg rm port

# Sync the session store with the latest JSONL files (incremental)
stackunderflow reindex
```

| Key | Default | Description |
|-----|---------|-------------|
| `port` | 8081 | Server port |
| `host` | 127.0.0.1 | Server host |
| `auto_browser` | true | Auto-open browser on start |
| `max_date_range_days` | 30 | Default date range for analytics |

See [CLI reference](docs/cli-reference.md) for all commands.
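
The architecture notes describe `settings.py` as resolving each key in the order env, then file, then default. A minimal sketch of that precedence follows, using the defaults from the table above. The `STACKUNDERFLOW_<KEY>` environment variable naming and the `resolve` helper are assumptions for illustration; the real module is descriptor-based and may differ.

```python
from typing import Any, Mapping

# Defaults copied from the configuration table.
DEFAULTS: dict[str, Any] = {
    "port": 8081,
    "host": "127.0.0.1",
    "auto_browser": True,
    "max_date_range_days": 30,
}


def _coerce(raw: str, default: Any) -> Any:
    """Convert an env string to the type of the default value."""
    if isinstance(default, bool):  # check bool first: bool is a subclass of int
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    if isinstance(default, int):
        return int(raw)
    return raw


def resolve(key: str, env: Mapping[str, str], file_cfg: Mapping[str, Any],
            env_prefix: str = "STACKUNDERFLOW_") -> Any:
    """Resolve one setting: environment first, then config file, then default."""
    default = DEFAULTS[key]
    raw = env.get(env_prefix + key.upper())  # hypothetical variable name
    if raw is not None:
        return _coerce(raw, default)
    if key in file_cfg:
        return file_cfg[key]
    return default
```

Under these assumptions an environment variable overrides a value written by `cfg set`, and `cfg rm` falls back to the table default.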

## Architecture

```
stackunderflow/
  adapters/       # source-adapter layer (one adapter per AI tool)
    base.py       #   LogAdapter ABC — discover() + stream_messages() protocol
    claude.py     #   Claude Code adapter — reads ~/.claude/projects/ + history.jsonl
  ingest/         # mtime-gated incremental import into the store
    enumerate.py  #   fan all adapters' SessionRefs into one iterable
    writer.py     #   transactional writer — one file → one transaction → one ingest_log row
  store/          # SQLite session store (~/.stackunderflow/store.db)
    db.py         #   connection factory (WAL mode, row_factory)
    schema.py     #   CREATE TABLE migrations
    queries.py    #   typed read helpers (list_projects, get_project_stats, …)
    types.py      #   frozen dataclasses returned by query helpers
  stats/          # message classification + analytics (no I/O — pure transforms)
    classifier.py #   tag message types and error patterns
    enricher.py   #   build dataset with interaction chains
    aggregator.py #   compute statistics (one pass, collector-based)
    formatter.py  #   shape messages for the REST API
  infra/
    discovery.py  # find and enumerate Claude log directories
    costs.py      # per-model cost estimation
  routes/         # FastAPI route modules
    projects.py   #   project selection and listing
    data.py       #   stats, dashboard-data, messages, refresh (store-backed)
    sessions.py   #   session browsing (store-backed)
    search.py     #   full-text search
    qa.py         #   Q&A pair browsing
    tags.py       #   auto-tags and manual tagging
    bookmarks.py  #   bookmark CRUD + session metadata enrichment
    misc.py       #   pricing, health, static
  services/       # search, Q&A, tags, bookmarks, pricing
  deps.py         # shared state (config, services, store_path)
  server.py       # thin shell — app creation, middleware, lifespan
  settings.py     # env → file → default config resolution (descriptor-based)
  cli.py          # click CLI (init, start, cfg, backup, clear-cache, reindex)

stackunderflow-ui/  # React + TypeScript + Tailwind frontend
```
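
The tree above describes `adapters/base.py` as a `LogAdapter` ABC with a `discover()` + `stream_messages()` protocol. The sketch below illustrates what such a two-method contract can look like. The method signatures, the `SessionRef` fields, and the `JsonlDirAdapter` class are assumptions made for the example and are not copied from the package.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, Iterator


@dataclass(frozen=True)
class SessionRef:
    """Pointer to one session file (fields are illustrative)."""
    provider: str
    path: Path


class LogAdapter(ABC):
    """One adapter per AI tool: find session files, then stream their messages."""

    @abstractmethod
    def discover(self) -> Iterable[SessionRef]: ...

    @abstractmethod
    def stream_messages(self, ref: SessionRef) -> Iterator[dict]: ...


class JsonlDirAdapter(LogAdapter):
    """Hypothetical adapter for a `<root>/<project>/*.jsonl` layout."""

    def __init__(self, provider: str, root: Path) -> None:
        self.provider = provider
        self.root = root

    def discover(self) -> Iterator[SessionRef]:
        for path in sorted(self.root.glob("*/*.jsonl")):
            yield SessionRef(self.provider, path)

    def stream_messages(self, ref: SessionRef) -> Iterator[dict]:
        with ref.path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:  # skip blank lines between records
                    yield json.loads(line)
```

Keeping discovery separate from streaming lets the ingest layer decide which files to skip before any file content is parsed.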

### Data flow

```
~/.claude/projects/<slug>/*.jsonl
         │
         ▼
  ClaudeAdapter.discover() + stream_messages()
         │  (incremental: skip unchanged mtime/offset)
         ▼
  ingest/writer.py  →  SQLite store (~/.stackunderflow/store.db)
         │
         ▼
  queries.get_project_stats(conn, project_id)
         │  (reconstructs messages, feeds stats/ chain)
         ▼
  stats/{classifier → enricher → aggregator → formatter}
         │
         ▼
  /api/stats, /api/dashboard-data, /api/messages
```

### How refresh works

`stackunderflow reindex` (or `/api/refresh`) fans every registered adapter's discovered files through `ingest/enumerate.py`. For each file the runner compares the stored `last_mtime` and `last_offset` and only reads newly-appended bytes. New messages are written transactionally to the store. The store is then the source of truth for all API endpoints — session browsing, cross-project aggregation (`reports/aggregate.py`), and the waste-finding heuristic (`reports/optimize.py`).
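
The mtime/offset check described above can be sketched as follows. This is an illustration of the technique and not the project's ingest runner: `FileCursor` and `read_appended` are hypothetical names, and the handling of truncated files and half-written trailing lines is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass
class FileCursor:
    """Per-file bookkeeping, mirroring the stored last_mtime / last_offset."""
    last_mtime: float = 0.0
    last_offset: int = 0


def read_appended(path: str, cursor: FileCursor) -> bytes:
    """Return only the complete lines appended since the previous call."""
    st = os.stat(path)
    if st.st_mtime == cursor.last_mtime and st.st_size == cursor.last_offset:
        return b""  # nothing changed: skip the file without opening it
    # A file shorter than the stored offset was rewritten; start from the top.
    start = cursor.last_offset if st.st_size >= cursor.last_offset else 0
    with open(path, "rb") as fh:
        fh.seek(start)
        data = fh.read()
    # Hold back a trailing partial line until its newline arrives.
    data = data[: data.rfind(b"\n") + 1]
    cursor.last_mtime = st.st_mtime
    cursor.last_offset = start + len(data)
    return data
```

Each returned chunk would then be split into JSONL records and written to the store in a single transaction together with the updated cursor.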

On first successful ingest the legacy `~/.stackunderflow/cache/` directory is removed if present.

### Source adapters

StackUnderflow ingests sessions from more than one coding agent. Four adapters are default-on; the thirteen other rows in the table below are opt-in beta.

| Provider | Status | Source | Default state |
|----------|--------|--------|---------------|
| Claude Code | stable | `~/.claude/projects/<slug>/*.jsonl` (+ legacy `~/.claude/history.jsonl`) | on |
| Codex | stable | `~/.codex/` rollout JSONL | on |
| Cursor | stable | `~/Library/Application Support/Cursor/User/globalStorage/state.vscdb` | on |
| Cline | stable | `~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/tasks/` | on |
| KiloCode | beta | `~/Library/Application Support/Code/User/globalStorage/kilocode.kilo-code/tasks/` | off |
| Roo Code | beta | `…/rooveterinaryinc.roo-cline/tasks/` | off |
| OpenCode | beta | SQLite under `$XDG_DATA_HOME/opencode/` | off |
| Cursor Agent | beta | `~/.cursor/projects/{p}/agent-transcripts/` (+ ai-tracking.db) | off |
| Qwen | beta | `~/.qwen/projects/{p}/chats/*.jsonl` | off |
| Gemini | beta | `~/.gemini/tmp/{p}/chats/session-*.{json,jsonl}` | off |
| Copilot | beta | `~/.copilot/session-state/` + VS Code transcripts | off |
| Codeium | beta | `~/.codeium/` (discovery stub — protobuf decoding deferred) | off |
| Continue | beta | `~/.continue/*.{db,sqlite,sqlite3}` | off |
| Droid | beta | `$FACTORY_DIR` (or `~/.factory/sessions/`) | off |
| Kiro | beta | `~/Library/Application Support/Kiro/User/globalStorage/kiro.kiroagent/` | off |
| OpenClaw | beta | `~/.openclaw/`, `~/.clawdbot/`, `~/.moltbot/`, `~/.moldbot/` | off |
| Pi + OMP | beta | `~/.pi/agent/sessions/` and `~/.omp/agent/sessions/` | off |

Cursor and Cline shipped behind beta flags from v0.4.0 through v0.6.0 and are promoted to default-on in v0.7.0: both have full test coverage, the Cursor vscdb uses fingerprint-based caching, and both have been stable against real local data for several releases.

#### Beta adapters

The remaining beta adapters are off by default. Opt in with an env var when starting the server:

```bash
STACKUNDERFLOW_BETA_KILOCODE=1 stackunderflow start
STACKUNDERFLOW_BETA_QWEN=1 STACKUNDERFLOW_BETA_GEMINI=1 stackunderflow start
```

The flag parser accepts `1`, `true`, `yes`, `on` (case-insensitive). Most beta adapters are macOS-only in v1. See [docs/multi-provider.md](docs/multi-provider.md) for the full feature reference and [docs/adapters.md](docs/adapters.md) for how to write a new adapter.
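
The flag rule above (`1`, `true`, `yes`, `on`, case-insensitive) is small enough to sketch. `beta_enabled` is a hypothetical helper, and stripping surrounding whitespace is an assumption. Only the `KILOCODE`, `QWEN`, and `GEMINI` variable names appear in this README; names for other adapters would need checking against the docs.

```python
import os
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}


def beta_enabled(adapter: str, env: Mapping[str, str] = os.environ) -> bool:
    """True when STACKUNDERFLOW_BETA_<ADAPTER> is set to a truthy value."""
    raw = env.get(f"STACKUNDERFLOW_BETA_{adapter.upper()}", "")
    return raw.strip().lower() in TRUTHY
```

Anything outside the truthy set, including an unset variable, leaves the adapter off.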

## Privacy

StackUnderflow processes all your Claude Code logs locally.

**What it reads on your machine:**
- `~/.claude/projects/<slug>/*.jsonl` — per-project conversation logs (new format)
- `~/.claude/history.jsonl` — centralized prompt history (legacy format, pre-January 2026)
- `~/.claude/settings*.json` — only via the `backup` command, for snapshots

**What it does with that data:**
- Parsing, search indexing, and analytics run locally — nothing is uploaded
- The session store is written to `~/.stackunderflow/store.db` (SQLite, WAL mode)
- Backups (opt-in) are written to `~/.stackunderflow/backups/` unencrypted — protect this directory like you would your `~/.claude/`

**What leaves your machine (only if you enable it):**
- **Pricing** — fetches model cost data from a public GitHub source (no user data sent)

No telemetry, no tracking, no crash reports. No sharing.

## Contributing

```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
python -m pytest tests/stackunderflow/ -v

# Lint
bash lint.sh
```

See [docs/README-DEV.md](docs/README-DEV.md) for architecture details.

## License

MIT — see [LICENSE](LICENSE).
