Metadata-Version: 2.4
Name: claude-session-logger
Version: 0.3.36
Summary: Sync your claude code sessions and memories in the cloud via Motherduck.
Author-email: Keith Fajardo <contact@keithfajardo.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/keith-fajardo/claude-session-logger
Project-URL: Repository, https://github.com/keith-fajardo/claude-session-logger
Keywords: claude,claude-code,duckdb,motherduck,logging
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: duckdb==1.5.3
Requires-Dist: pyyaml>=6
Requires-Dist: textual>=0.60
Requires-Dist: textual-plotext>=0.2
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Dynamic: license-file

# claude-session-logger

**Your Claude Code history and notes follow you across machines.**

## The problem

Claude Code keeps its sessions, costs, and the notes it takes locally, per machine.
Switch laptops — or just `rm -rf` a project — and that history is gone. You lose:

- **Which sessions happened, what they cost, what you worked on.**
- **The issues/findings/decisions** you asked Claude to jot down mid-session.

This tool fixes that by mirroring both to the cloud (MotherDuck), so a fresh
machine pulls everything back.

## What you get

- **Every session auto-logged** — cost, tokens, duration, project, session type —
  pushed to a cloud DB you can query from anywhere.
- **Browse, search & resume any past session** — an interactive menu
  (`claude-session-logger`, no arguments) lists every session (not just the
  recent few), filters by title/project/date or **transcript content**, opens
  several at once to read, and **resumes one** straight into Claude. A **Summary**
  view charts tokens/cost over time, stacked by model, with per-project and
  per-model breakdowns.
- **On-demand notes** ("log an issue: …", "log a decision: …") that survive a
  machine wipe and **sync across devices**. The issue table is even **shared** —
  a teammate can resolve an issue you opened, and you get an inbox badge.
- **Local-first:** the local DuckDB file is the source of truth; the cloud copy is
  a synced mirror. Sessions log at session end; notes are written on demand.
  Offline? Writes queue locally and flush on the next run — nothing is lost.

## Prerequisites

You need a **MotherDuck account** — that's the cloud DB the sync targets. The
[free tier](https://motherduck.com) is plenty (this stores tiny metadata, not
transcripts). Sign up at [app.motherduck.com](https://app.motherduck.com) — the
setup wizard will prompt you for a token during `claude-session-logger-setup`.

> **Token type:** use a **personal access token** (Settings → Access Tokens), not
> a service account token. Service account tokens can't create the DuckDB session
> needed by the sync extension.

> **Platform support:**
> - **macOS** — fully supported; background sync uses launchd.
> - **Linux / WSL** — fully supported; background sync uses a user crontab entry
>   (every 15 min). On WSL, cron must be running: `sudo service cron start`.
> - **Windows (native)** — CLI commands work but hook auto-wiring and background
>   sync are not supported. Use WSL for the full experience.

## Quickstart

```bash
pip install claude-session-logger     # 1. install (3 commands land on PATH)
claude-session-logger-setup           # 2. wire it up — run ONCE
```

That's it. Step 2 (`claude-session-logger-setup`) is the one-time wiring: it
creates the local schema, installs the SessionStart/PostToolUse/SessionEnd hooks
into `~/.claude/settings.json`, registers background sync (launchd on macOS,
crontab on Linux/WSL), and **prompts for your MotherDuck token** (input hidden).
It's **idempotent** (safe to re-run — skips the token prompt if the file already
exists) and backs up `settings.json.bak`. After that, sessions log themselves —
you never call it again.

Verify:

```bash
claude-session-logger recent          # should print a (possibly empty) banner
```

On a **new machine**: same two commands. Once the token's in place, your sessions
and notes pull back down on the next session start.

## Uninstalling

```bash
claude-session-logger-setup --uninstall   # removes hooks, background sync, token, and local DB
pip uninstall claude-session-logger       # removes the package
```

`--uninstall` is a clean slate: hooks removed from `settings.json`, background sync
removed (LaunchAgent on macOS, cron entry on Linux/WSL), MotherDuck token file
deleted, and local `sessions.duckdb` deleted. Your data in MotherDuck is untouched —
reinstalling and re-running setup pulls it back down on the first sync.

## Updating

```bash
pip install --upgrade claude-session-logger   # pull the latest release
claude-session-logger-setup                   # re-run: refreshes hooks + the chat-usage doc + @imports
claude-session-logger --version               # confirm the new version
```

After updating, **start a new Claude session** (so the updated SessionStart hooks
fire) and **restart any running `claude-hive`** (a live TUI process won't
hot-reload). `claude-session-logger-setup` is idempotent — it backs up
`settings.json`, never duplicates hooks, and overwrites only the package-managed
`~/.claude/claude-session-logger.md` usage doc.

> **Editable/dev install gotcha.** If you installed from a checkout with
> `pip install -e .`, note that `pip install --upgrade claude-session-logger`
> **replaces** the editable link with the PyPI wheel — the package moves to
> `site-packages` and your working tree is no longer live. To keep developing
> against the repo, re-link after upgrading:
>
> ```bash
> cd ~/.claude/claude-session-logger && pip install -e .
> ```
>
> Check which mode you're in with `pip show claude-session-logger` (look at
> `Location:` / `Editable project location:`). A re-run of `pip install -e .` is
> also needed whenever `pyproject.toml` adds a **new console script** (e.g.
> `claude-hive`) or a **new dependency** — editable installs only register those on
> (re)install.

## Browse & resume sessions (the `sessions` TUI)

Run the bare command to open an interactive menu (a Textual TUI, like
`claude-hive`):

```bash
claude-session-logger          # → menu: 1) Resume a session  2) Summary  q) Quit
# (subcommands like `log`, `list`, `record`, `sync` still work as before)
```

**1) Resume a session** — every logged session, newest first:

- **Search** as you type — matches title / project / date instantly **and
  transcript content** (greps the on-disk transcripts in the background; the
  matching line is shown with the term **highlighted**). `^F` toggles content
  matching off for fast metadata-only filtering.
- **`Space`** selects rows; **`o`** opens the selected set in a read-only viewer —
  the transcript is rendered as a readable **conversation** (user/assistant turns
  + the tools called), not raw JSON. Page through them; large transcripts are
  capped for speed.
- **`Enter` / `r`** resumes the **highlighted** row — it runs
  `claude --resume <id>` for you. You can only ever resume **one** session;
  selection is just for opening. A session whose transcript is gone shows as
  `⚠ transcript gone` and can't be resumed.

> Resume needs the session's transcript on **this** machine — it's read from
> `~/.claude/projects/`. Sessions logged on another device show up in the list
> (for the Summary) but are resumable only where their transcript lives.

**2) Summary of sessions** — usage analytics from the local DB:

- A **stacked-by-model time series** of tokens (or cost), showing the whole
  period at a glance — every **hour of the day**, **day of the month**, or
  **month of the year** — with the value labelled on top of each bar.
- **Bar breakdowns** by **project**, by **model**, and by **token component**
  (input / output / cache-write / cache-read) — the component split shows *why*
  a total is large: cache-read dominates token volume, while output and
  cache-read dominate cost (green bars, orange headers).
- Toggle **`t`** tokens ↔ cost, **`h` / `d` / `m`** the period, **`q`** back.

The Summary reads the per-model `session_models` table (filled at session end,
and synced so it reflects all your devices — see [Schema](#schema)). Sessions
recorded before this feature roll up under an `(unknown)` model until you
backfill them (one-off: re-records per-model usage from on-disk transcripts).

## How it works

```
Claude Code session ends
        │
        ▼
SessionEnd hook (~/.claude/settings.json)
        │  passes session_id + transcript_path + cwd on stdin (JSON)
        ▼
log_session.py record --no-sync
        │  1. parse transcript JSONL → aggregate tokens by model, derive metadata
        │  2. compute cost from prices.json (incl. cache-write/read rates)
        │  3. UPSERT one row into LOCAL sessions table (synced = FALSE)
        ▼
launchctl kickstart …sync            (fires asynchronously, off the hot path)
        ▼
LaunchAgent → claude-session-logger sync
        │  push all synced = FALSE rows → MotherDuck, mark synced = TRUE
        ▼
sessions.duckdb  ──sync──▶  md:claude_sessions  (cloud)
```

Session end writes locally and returns fast (`--no-sync`); the cloud push runs
out-of-band via a background sync job (launchd on macOS, crontab every 15 min on
Linux/WSL). This keeps session exit instant and avoids the script blocking on a
slow/offline network. On the next `SessionStart`, the macOS LaunchAgent is kicked
again to drain anything still queued; on Linux the cron job handles periodic drain.

### Concurrency

DuckDB allows **one** read-write process **or many** read-only processes per file.
Concurrent Claude sessions plus the flush LaunchAgent contend for the write lock,
so `db.connect()` retries with exponential backoff on a transient
`Conflicting lock is held` error. The startup banner (`recent`) connects
**read-only** so it coexists with a writer and degrades silently if the file is
missing, locked past retries, or has no tables yet.

## Files

Code is a pip package under `src/claude_session_logger/`. Installing exposes three
console commands: **`claude-session-logger`** (`cli.py`),
**`claude-session-logger-setup`** (`_setup.py`), and **`claude-memory-sync`**
(`memory.py`, the separate `.md` memory system).

| File (`src/claude_session_logger/`) | Purpose |
|---|---|
| `cli.py` | CLI entry point. Subcommands: `init`, `record`, `log`, `resolve`, `list`, `show`, `recent`, `inbox`, `sync` (`flush` kept as alias), `sessions` (the TUI; also the no-arg default). |
| `db.py` | DuckDB connect (with lock retry), schema DDL + idempotent migrations, CRUD helpers (incl. `session_models`), token resolution + self-heal, cross-device `sync()`, inbox/meta helpers. |
| `transcript.py` | Parse a session transcript `.jsonl` → tokens by model, models, timestamps, skills, message/tool counts, metadata; `render_markdown()` for the readable conversation view. |
| `cost.py` | Compute USD cost from per-model token usage and `prices.json`; shape per-model `session_models` rows (`model_rows_from_usage`). |
| `sessions.py` | Pure data layer for the sessions TUI: locate/grep transcripts, and analytics queries (`totals`, `by_project`, `by_model`, `by_component`, stacked `usage_series_by_model`, `backfill_session_models`). No Textual import. |
| `sessions_tui.py` | Textual app for `claude-session-logger sessions`: the menu + the `resume()` handoff (`execvp claude --resume`). |
| `sessions_resume.py` | Resume screen — list/search/content-search, open many, resume the one highlighted row. |
| `sessions_summary.py` | Summary screen — stacked-by-model time series (`textual-plotext`) + Rich-bar breakdowns by project/model/component; tokens/cost + period toggles; humanized counts + family model names. |
| `text_highlight.py` | Shared search-match helpers (`stylize_all`, `snippet`) used by both the hive and sessions lists to emphasize the matched query. |
| `classify.py` | Infer `session_type` from skills used + project/cwd. |
| `identity.py` | Resolve the current username (`$CLAUDE_LOGGER_USER` → `git config user.name`) for `created_by`/`updated_by`. |
| `resolve.py` | Resolve the active `session_id` from the newest `.jsonl` in the cwd's project dir. |
| `memory.py` | `.md` file-memory ↔ MotherDuck (`md:claude_memory`); `push`/`pull`/`sync`/`search`/`get`/`hook`. |
| `_setup.py` | Post-install wiring: LaunchAgent + idempotent hook merge + interactive token prompt. `--uninstall` reverses everything (hooks, LaunchAgent, token file, local DB). |
| `prices.json` | Editable model → `{input, output, cache_write, cache_read}` per-1M-token rates. |
| `sessions.duckdb` | Local buffer / source of truth (created at `~/.claude/claude-session-logger/`, gitignored). |
| `CLAUDE.md` / `DESIGN.md` | Contributor guide (incl. security guardrails) / as-built architecture. |
| `tests/` | pytest suite for each module + the CLI + setup. |

> Legacy note: examples below use `claude-session-logger <cmd>`; the equivalent
> `claude-session-logger <cmd>` still works from a source checkout.

## Setup details (what `claude-session-logger-setup` automates)

The [Quickstart](#quickstart) above is all most people need. This section explains
each piece it wires up — useful for debugging, a manual/Linux install, or
understanding what changed in your `settings.json`.

> Dev/editable install from a checkout: `pip install -e .` instead of the PyPI
> install. DuckDB is pinned to 1.5.3 (MotherDuck's max). Three commands land on
> PATH: `claude-session-logger`, `claude-session-logger-setup`, `claude-memory-sync`.

1. **MotherDuck token** — cloud sync resolves a token from, in order: the
   `motherduck_token` env var, `MOTHERDUCK_TOKEN`, then the file
   `~/.config/motherduck/token` (chmod 600). That path is **MotherDuck's own**
   credential store — the `motherduck` CLI / device-auth writes it, so the scripts,
   the flush LaunchAgent, and official tooling all read **one** file (no second
   copy to drift out of sync on rotation). The file matters because **launchd does
   not inherit your shell exports** — the flush LaunchAgent relies on it, so
   without it sync would fall back to interactive browser device-auth (and hang).
   If a stale shell/uppercase token outranks a freshly rotated file token, the
   ATTACH self-heals by retrying once with the file token. Without any token, local
   writes still work and sync raises a clean error (logged to `sync.log`, retried
   later).

2. **Create the schema:**

   ```bash
   claude-session-logger init
   ```

3. **Wire the hooks** into `~/.claude/settings.json` (merge — don't replace
   existing hooks). `claude-session-logger-setup` writes these for you; shown here for reference.

   **macOS** (with launchctl kick for background sync):
   ```jsonc
   "SessionStart": [
     { "hooks": [
       { "type": "command", "command": "python3 -m claude_session_logger.cli recent" },
       { "type": "command", "command": "python3 -m claude_session_logger.cli inbox --count" },
       { "type": "command", "command": "launchctl kickstart gui/$(id -u)/com.keithfajardo.claude-session-logger.flush" },
       { "type": "command", "command": "python3 -m claude_session_logger.memory pull 2>&1 || true" },
       { "type": "command", "command": "python3 -m claude_session_logger.hive sync 2>&1 || true" }
     ]}
   ],
   "PostToolUse": [
     { "matcher": "Write|Edit", "hooks": [
       { "type": "command", "command": "python3 -m claude_session_logger.memory hook 2>/dev/null || true", "async": true }
     ]}
   ],
   "SessionEnd": [
     { "hooks": [
       { "type": "command", "command": "python3 -m claude_session_logger.cli record --no-sync; launchctl kickstart gui/$(id -u)/com.keithfajardo.claude-session-logger.flush" }
     ]}
   ]
   ```

   **Linux / WSL** (launchctl omitted — cron handles periodic sync):
   ```jsonc
   "SessionStart": [
     { "hooks": [
       { "type": "command", "command": "python3 -m claude_session_logger.cli recent" },
       { "type": "command", "command": "python3 -m claude_session_logger.cli inbox --count" },
       { "type": "command", "command": "python3 -m claude_session_logger.memory pull 2>&1 || true" },
       { "type": "command", "command": "python3 -m claude_session_logger.hive sync 2>&1 || true" }
     ]}
   ],
   "SessionEnd": [
     { "hooks": [{ "type": "command", "command": "python3 -m claude_session_logger.cli record --no-sync" }]}
   ]
   ```

   The `inbox --count` line prints the shared-issue badge (silent when nothing is
   unseen).

4. **Background sync** runs `claude-session-logger sync` out-of-band, pushing
   queued rows to the cloud and pulling any missing rows back down. Since `sync`
   does a **bidirectional** `log_entries` merge, this also pulls other users'
   resolves/reopens to this device. On first run, `sync` auto-creates the
   `claude_sessions` database in MotherDuck if it doesn't exist yet.

   - **macOS** — a launchd LaunchAgent
     (`~/Library/LaunchAgents/com.keithfajardo.claude-session-logger.flush.plist`)
     kicked by the SessionStart and SessionEnd hooks. The plist embeds **no token**
     — the code resolves it from the token file in-process (launchd doesn't inherit
     shell exports, so the file path is what keeps unattended flush working).
   - **Linux / WSL** — a user crontab entry that runs every 15 minutes. No token
     in the crontab — same in-process resolution. On WSL, cron must be enabled:
     `sudo service cron start` (add to `~/.bashrc` or `/etc/rc.local` to persist
     across WSL restarts).

5. **Username (optional)** — set `CLAUDE_LOGGER_USER` in the hook environment to
   control the name recorded in `created_by`/`updated_by`. If unset, it falls back
   to `git config user.name`. Needed only if you share the issue table.

## Usage

```bash
# Create local + cloud schema (idempotent)
claude-session-logger init

# Record a session (reads hook JSON on stdin). --no-sync = local write only.
claude-session-logger record [--no-sync]

# Sync queued (synced = FALSE) rows to/from the cloud (bidirectional)
claude-session-logger sync

# Print the most recent sessions banner (read-only)
claude-session-logger recent [--limit 3]

# Inbox of shared-issue changes by other users
claude-session-logger inbox            # list changes, clear the badge
claude-session-logger inbox --count    # badge only (read-only; used by SessionStart)
```

### On-demand log entries

Tell Claude in plain language and it runs the right command. session_id
auto-resolves from the active transcript, so this works from **any** repo.

**What to say → what runs:**

| Say | Runs |
|---|---|
| "log an issue: \<title\> — \<details\>" | `log --category issue --status open …` |
| "log a finding: \<title\>" | `log --category finding --status open …` |
| "log a task: \<title\>" | `log --category task --status open …` |
| "log a decision: \<title\>" | `log --category decision --status info …` |
| "log a note: \<title\>" | `log --category note --status info …` |
| "resolve the \<words\> issue" | `resolve --title-match "<words>" --status resolved` |
| "resolve \<id\> [because …]" | `resolve --id <id> --status resolved [--note …]` |
| "set \<id\> to blocked / in progress" | `resolve --id <id> --status <status>` |
| "list open issues" / "what did I log" | `list [--category …] [--open] …` |
| "check my inbox" / "what changed" | `inbox` (lists others' changes, clears the badge) |

Default status: issue/finding/task → `open`; decision/note → `info`. Add "… status
blocked" (etc.) to override. The headline goes in `--title`, extra context in
`--body`.

```bash
# Add an entry. session_id auto-resolved from the active transcript if omitted.
# Prints the new id: "logged issue (open) id=<uuid>" — keep it to resolve later.
claude-session-logger log \
  --category issue --status open \
  --title "staging model double-counts refunds" \
  --body "details…" [--session-id <id>]

# Resolve / update an existing entry (appends a note to the body if given).
# By words — no UUID needed; must match exactly one entry, else it lists candidates.
claude-session-logger resolve --title-match "double-counts refunds" --status resolved \
  [--note "fixed in PR #42"]
# Or by exact id
claude-session-logger resolve --id <entry-id> --status resolved [--note "fixed in PR #42"]

# List entries, newest first (by updated_at, falling back to logged_at).
# Use this to grab an id when a title match is ambiguous.
claude-session-logger list [--category issue] [--status open] [--open] [--limit N]

# Show ONE entry's full body (read-only). By id / id prefix, or title words.
# Use this instead of a hand-written duckdb query — `list` omits the body.
claude-session-logger show <id-or-prefix>
claude-session-logger show --title-match "double-counts refunds"
```

`--open` excludes `status = 'resolved'` entries. `show` accepts a short id prefix
(e.g. `b045dd70`) and prints the id/type/status/title/timestamps/body; if a title
match is ambiguous it lists the candidates instead. `resolve` takes **either**
`--title-match` (resolve by words; errors if it matches 0 or >1 entries) **or**
`--id` (exact).

**Categories:** `finding | task | decision | issue | note`
**Statuses:** `open | in_progress | blocked | resolved | info`

## Sharing the issue table (multi-user)

The `log_entries` table is **shared and bidirectional**: install the logger on a
second device/account and both people see and resolve the same issues. `sessions`
stay **private** (push-only) — only `log_entries` is pulled back down.

**Identity.** Every entry records who created/changed it. Username resolves from
`$CLAUDE_LOGGER_USER` → `git config user.name` → `"unknown"`:
- `created_by` — who logged the entry.
- `updated_by` — who last changed its status (e.g. who resolved it).

**Sync model.** `sync`/`record` run a last-write-wins merge (newest
`updated_at` wins) of `log_entries` between local and `md:claude_sessions`. So
Jane can resolve an issue Keith opened; on Keith's next sync the resolve lands
locally with `updated_by = "Jane"`. Concurrent edits to the *same* entry: latest
timestamp wins (low-contention assumption — like the rest of the system).

### Inbox — change notifications

When someone else resolves or reopens an issue, you get a count badge at session
start, then read the details on demand.

```bash
# Badge (wired into the SessionStart hook): read-only, prints e.g.
#   📬 2 shared issue update(s) — run: …log_session.py inbox
# Silent when nothing is unseen. Never blocks; reads local state only.
claude-session-logger inbox --count

# Inbox view: lists changes made by others since you last looked,
# then advances the "seen" watermark (clears the badge).
claude-session-logger inbox
```

A change counts as unseen if `COALESCE(updated_at, logged_at)` is newer than the
local watermark **and** the actor (`updated_by`, else `created_by`) isn't you. The
watermark lives in a local-only `meta` table — it is **not** synced, so "seen" is
per-device. The badge reads whatever the last sync pulled down (kept current by
the flush LaunchAgent), so it never makes a network call on the session-start hot
path.

> Real-time push (email/Slack) is intentionally out of scope — the badge + inbox
> is a pull-and-diff, no extra infrastructure.

## Hive memory (share with others)

Hive memory lets you publish curated `.md` memory files to teammates without using
the shared issue table. Use `claude-hive` (a Textual TUI) to **Publish**, **Subscribe**,
or **Refresh**.

**You and your teammate both need a MotherDuck account** (the same
[free tier](https://motherduck.com) that claude-session-logger uses). Published content is
visible to **anyone with the share URL** — treat it like a public link (no secrets, PII,
or credentials).

- **Publish:** `claude-hive` → select **Publish** → toggle the `.md` files you want to
  share → apply → share the URL with a teammate.
- **Subscribe:** from the shared URL, run `claude-hive` → select **Subscribe** → paste
  the URL → toggle which projects' files you want → apply. Files land in
  `~/.claude/hive/<project>/` and are auto-loaded via `@hive/HIVE.md` in your
  `~/.claude/CLAUDE.md`.
- **Refresh:** SessionStart auto-runs `python -m claude_session_logger.hive sync` to
  pull fresh shared content (non-fatal if it fails).

Unlike the shared issue table (`log_entries`), hive memory is **read-only for subscribers**
and **curated by the publisher** — no bidirectional editing or per-subscriber access control.

### Team issue sharing (hive mesh)

Teammates on separate MotherDuck accounts can share a project's issues. In the
repo-root `CLAUDE.md`, add YAML frontmatter (committed to git, so everyone gets it):

```markdown
---
project-name: Acme Pipeline
is-hive-project: true
---
```

Then exchange hive share URLs in `claude-hive` (Subscribe to each other). Issues
for that project merge both ways on each `claude-hive sync` — last edit wins.
A subscriber receives issues for **all projects** their teammate publishes (not just
the current one), visible via `claude-session-logger list --all-projects`.
Memories stay one-way; issues are bidirectional.

## Pulling memory (local & remote)

"Memory" spans **two separate systems**. Know which one you want before pulling.

| | System A — Session memory (**this repo**) | System B — File-based memory (sibling) |
|---|---|---|
| What | Sessions + on-demand log entries | Markdown memory files Claude writes |
| Local store | `~/.claude/claude-session-logger/sessions.duckdb` | `~/.claude/projects/<slug>/memory/*.md` |
| Remote store | MotherDuck `md:claude_sessions` | MotherDuck `md:claude_memory` |
| Tool | `log_session.py` (this repo) | `~/.claude/scripts/memory_sync.py` |

### System A — session memory (this repo)

**Pull local** — read straight from `sessions.duckdb`:

```bash
# On-demand log entries (findings/tasks/decisions/issues/notes), newest first
claude-session-logger list [--category finding] [--status open] [--open] [--limit N]

# Recent sessions banner
claude-session-logger recent [--limit 5]

# Or query the local file directly
duckdb ~/.claude/claude-session-logger/sessions.duckdb \
  "SELECT session_date, project, cost_usd, total_tokens FROM sessions ORDER BY ended_at DESC LIMIT 10"
```

**Pull remote** — query the MotherDuck mirror. DuckDB's `md:` reads the token from
the `motherduck_token` **env var**. Load it from the file into the env **for this
one process** (the value is never printed, and never goes in the DSN — a token in a
connection string leaks into shell history and error messages):

```bash
motherduck_token="$(< ~/.config/motherduck/token)" duckdb "md:" \
  "SELECT category, status, title, logged_at
   FROM claude_sessions.log_entries
   WHERE title ILIKE '%refund%' ORDER BY logged_at DESC"
```

Local is the source of truth; remote is the synced mirror, useful from another
device. Both hold the same `sessions` + `log_entries` tables. `sessions` syncs
**up only** (private); `log_entries` syncs **both ways** (shared — see
[Sharing the issue table](#sharing-the-issue-table-multi-user)).

### System B — file-based memory (`memory_sync.py`)

Separate system: the `.md` memory files Claude maintains per project, mirrored to
MotherDuck `md:claude_memory` for cross-device access. It lives in
`~/.claude/scripts/`, **not** in this repo. **Always use the script — never
hand-write inline `duckdb` queries (token-wasteful, fragile).**

```bash
# Pull remote — grep memory by content (current project). Matching lines only.
python3 ~/.claude/scripts/memory_sync.py search "<thing>"
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --project-key <repo-name>
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --all-projects   # every project
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --full           # whole files

# Pull remote — print one file's full content (name substring match)
python3 ~/.claude/scripts/memory_sync.py get <name-fragment> [--project-key <repo-name>]

# Refresh local .md files from remote (newer wins), or reconcile both ways
python3 ~/.claude/scripts/memory_sync.py pull
python3 ~/.claude/scripts/memory_sync.py sync
```

`search`/`get` are read-only (no local write). Workflow: `search` to find the file
+ matching lines, then `get` only if the full file is needed. Local `.md` files
are read directly by the Claude Code harness; `pull`/`sync` keep them current
across devices.

## Schema

Three synced tables plus a local-only `meta` table. The `synced` column is **local
only** (tracks what still needs pushing); it is carried into the cloud copy but
only meaningful locally.

### `sessions` (one row per session, auto at session end)

`session_id` (PK), `project`, `cwd`, `session_date`, `started_at`, `ended_at`,
`duration_min`, `models`, `input_tokens`, `output_tokens`, `cache_write_tokens`,
`cache_read_tokens`, `total_tokens`, `cost_usd`, `message_count`,
`tool_call_count`, `skills_used`, `git_branch`, `cc_version`, `title`,
`session_type`, `created_by`, `synced`. Synced both ways now (push **and**
pull-down) so the Summary reflects every device — session rows are write-once, so
the pull is a simple missing-row insert.

### `session_models` (per-model usage, one row per model per session)

`session_id`, `model` (compound PK), `input_tokens`, `output_tokens`,
`cache_write_tokens`, `cache_read_tokens`, `total_tokens`, `cost_usd`, `synced`.
Powers the Summary's by-model breakdown. **Metadata only** (no message content),
so it is cloud-safe and on the guardrail allow-list. Written at session end from
the same `usage_by_model` the cost is computed from; sync push + pull-down.

### `log_entries` (on-demand, shared)

`id` (PK, UUID), `session_id`, `logged_at`, `category`, `status`, `title`, `body`,
`synced`, `updated_at` (NULL until first resolve/update), `created_by`,
`updated_by`. `create_tables()` runs idempotent `ADD COLUMN IF NOT EXISTS`
migrations (`updated_at`, `created_by`, `updated_by`) for older DBs.

### `meta` (local only — not synced)

`key` (PK), `value`. Holds the per-device `inbox_watermark` (last time you cleared
the inbox). Never created cloud-side.

All timestamps are stored in UTC.

> **Column order note:** sync uses **explicit column lists**, never `SELECT *`.
> `ADD COLUMN` appends physically at the end, so a migrated table and a freshly
> created one can have different column *positions* — positional `SELECT *` across
> local↔cloud would silently misalign (e.g. map `created_by` onto `synced`).

## Cost model

Per assistant message, token usage is accumulated per model
(`input`, `output`, `cache_creation` → `cache_write`, `cache_read`), de-duplicated
by message uuid/id. Cost:

```
cost_usd = Σ_model ( input·p.input + output·p.output
                   + cache_write·p.cache_write + cache_read·p.cache_read ) / 1e6
```

Rates live in `prices.json` (per 1M tokens). Seeded values:

| model | input | output | cache_write | cache_read |
|---|---|---|---|---|
| claude-opus-4-8 | 5.00 | 25.00 | 6.25 | 0.50 |
| claude-sonnet-4-6 | 3.00 | 15.00 | 3.75 | 0.30 |
| claude-haiku-4-5 | 1.00 | 5.00 | 1.25 | 0.10 |

An unknown model contributes 0 cost and is logged to `sync.log` so it can be added.

## `session_type` — auto-tagging sessions (customizable)

Every session gets **one** `session_type` label at session end (so you can later
filter/report — "how much did debugging cost this month"). It's assigned by simple
**keyword rules**: you say *"if I see keyword X, label the session Y."*

### The mental model

A rule is just **keywords → a label**:

```yaml
- type: infra                       # the label to assign
  skills: [terraform, k8s]          # keyword(s) to look for in the skills you used
  path: [ops/, infra/]              # …and/or in the folder you ran claude in
```

- **`skills`** matches *what you did* — the skills used this session.
- **`path`** matches *where you did it* — the directory you launched `claude` in
  (its cwd + folder name), as plain text. Not where any file lives.
- A rule fires if **`skills` OR `path`** matches — **either one is enough**, never
  both required. You can also give just one of them.
- Matching is **case-insensitive substring**, so one keyword catches a family:
  `dbt` matches the skill `superpowers:dbt-helper`; `ops/` matches cwd `.../ops/api`.

### How it picks ONE type

The classifier re-runs the whole rule list **every session**, against *that*
session's own skills + folder, and stops at the **first rule that matches** (rule
order = priority). Nothing matches → `general`.

You do **not** list every skill that exists — only the few keywords for the buckets
you care about. Everything else falls to `general` automatically. It's opt-in
tagging, not a registry.

| You ran claude in… | used skill… | → type |
|---|---|---|
| `~/work/ops/api` | — | `infra` (path has `ops/`) |
| `~/anything` | `systematic-debugging` | `debugging` (skill match) |
| `~/he-dbt-bi` | — | `dbt` (path has `dbt`) |
| `~/random` | none of the keywords | `general` |

> **path vs skills:** prefer **`skills`** — it travels with you across repos
> (debugging is debugging anywhere). Use **`path`** only when your folders follow a
> naming convention (`ops/`, `web/`); it's brittle if project names vary.

### Built-in defaults (`classify.py`), first match wins

1. skill has `dbt`/`kimball`, or folder path has `dbt` → `dbt`
2. skill `systematic-debugging` → `debugging`
3. skill `deep-research`/`brainstorming`/`research` → `research`
4. skill `writing-plans`/`executing-plans` → `planning`
5. else → `general`

### Add your own

Drop a YAML file at `~/.claude/claude-session-logger/session_types.yaml` (copy
`session_types.example.yaml` — it has the full annotated format). **Your rules run
before the defaults**, so they can add a new type or override a built-in one. No code
changes.

Loading is hook-safe — a missing file, malformed YAML, or absent PyYAML silently
falls back to the defaults, so session-end never breaks.

## Reliability

- **Cloud unreachable / token missing** → local write succeeds, sync fails silently,
  logged to `sync.log`, retried on the next `sync`.
- **Lock contention** → `connect()` retries with backoff; the read-only banner
  degrades silently.
- **Malformed transcript line** → skipped (counted, not fatal).

## Tests

```bash
python3 -m pytest
```

## Limitations

- **Single active session per directory** assumption: `resolve.py` picks the
  newest-mtime `.jsonl` in the cwd's project dir. Concurrent sessions in the same
  directory can mis-attribute an on-demand log entry.
- No per-message conversation storage (by design — avoids dumping file contents /
  secrets into the cloud).
- Background sync is platform-specific: launchd on macOS, crontab on Linux/WSL,
  not supported on native Windows. Use WSL on Windows for full functionality.
