Metadata-Version: 2.4
Name: openclaw-session-grep
Version: 0.1.2
Summary: Fast local search for OpenClaw session transcript JSONL files.
Project-URL: Homepage, https://github.com/pfrederiksen/openclaw-session-grep
Project-URL: Issues, https://github.com/pfrederiksen/openclaw-session-grep/issues
Author: OpenClaw contributors
License: MIT
License-File: LICENSE
Keywords: cli,jsonl,openclaw,search,transcripts
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Utilities
Requires-Python: >=3.9
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == 'dev'
Description-Content-Type: text/markdown

# openclaw-session-grep

`openclaw-session-grep` is a small CLI for quickly searching local OpenClaw session and transcript files. It is meant for finding prior conversations, tool usage, errors, expensive model runs, channel-specific history, and session labels without manually opening JSONL files.

It stays deliberately narrow: fast transcript search with useful filters, not an analytics database.

## Install

```bash
pipx install openclaw-session-grep
```

For local development:

```bash
python -m venv .venv
. .venv/bin/activate
python -m pip install -e ".[dev]"
```

## Where It Searches

By default, the CLI searches these locations if they exist:

- `$OPENCLAW_SESSION_DIR`
- `$OPENCLAW_TRANSCRIPT_DIR`
- `~/.openclaw/sessions`
- `~/.openclaw/transcripts`
- `~/.openclaw`

`OPENCLAW_SESSION_DIR` and `OPENCLAW_TRANSCRIPT_DIR` may contain multiple paths separated with your platform path separator. You can also pass `--path` one or more times to search specific files or directories.

Candidate files currently use these extensions: `.jsonl`, `.json`, `.log`, and `.ndjson`.

## Examples

```bash
openclaw-session-grep "morning briefing"
openclaw-session-grep "timeout" --channel telegram --last 7d
openclaw-session-grep "message tool" --agent main --json
openclaw-session-grep "network\\s+timeout" --regex --tool-only
openclaw-session-grep --tool shell --summary
openclaw-session-grep "briefing" --context 2 --open
```

## Filters

- `--agent main`
- `--channel telegram`
- `--last 7d`, `--last 24h`, `--since 2026-04-10T00:00:00Z`, `--until ...`
- `--model gpt-5.4`
- `--tool message`
- `--session alpha`
- `--type assistant|user|tool|system`
- `--tool-only`

Search is case-insensitive by default. Use `--case-sensitive` to change that, and `--regex` to treat the query as a Python regular expression.

## Output Modes

Default terminal output includes timestamp, session, agent, channel, optional tool name, and a short excerpt.

Other modes:

```bash
openclaw-session-grep "timeout" --json
openclaw-session-grep "timeout" --markdown
openclaw-session-grep "timeout" --count-only
openclaw-session-grep "timeout" --summary
```

Use `--usage` to include usage/cost fields when present in the transcript record. Use `--open` to include `path:line` references.

## Transcript Format

OpenClaw transcript schemas may vary, so the parser is intentionally tolerant. It reads newline-delimited JSON and extracts common fields such as:

- timestamp: `timestamp`, `time`, `created_at`, `createdAt`, `ts`
- session: `session_key`, `sessionKey`, `session`, `label`, `conversation_id`
- agent: `agent`, `agent_id`, `agentId`, `role`
- channel: `channel`, `source`, `transport`
- message type: `type`, `kind`, `role`
- model: `model`, `model_name`, `modelName`
- tool: `tool`, `tool_name`, `toolName`, `tool.name`, `function.name`, first `tool_calls[]`
- usage/cost: `usage`, `cost`, `token_usage`, `tokenUsage`

Malformed JSON lines are treated as plain text records so older logs can still be searched.

## Performance Notes

The utility streams candidate files by path, then loads one transcript file at a time to support context records around each hit. That keeps memory bounded by the largest single transcript file rather than the whole transcript directory.

For large stores, pass a narrower `--path`, combine structured filters, or use `--limit` when you only need the first few hits.

## Known Limitations

- There is no persistent index; every invocation scans files.
- Date filters skip records without timestamps instead of rejecting them.
- Tool and structured filters are exact matches.
- JSON object files that are not line-delimited are not recursively unpacked as full transcript arrays yet.

## Development

```bash
python -m pip install -e ".[dev]"
python -m unittest discover -s tests
```

Sample fixtures live in `tests/fixtures`.

## Releases

Tagged releases use GitHub Actions:

- `v*` tags build and publish distributions to PyPI using the `PYPI_API_TOKEN` repository secret.
- The Homebrew workflow updates `pfrederiksen/homebrew-tap` with a formula that installs from the PyPI source distribution.

Do not commit PyPI tokens. Store PyPI credentials as GitHub Actions secrets, or adapt the workflow to PyPI trusted publishing after configuring the publisher on PyPI.
