Metadata-Version: 2.4
Name: home-agent
Version: 0.1.1
Summary: Windows personal ops agent with pluggable providers
Requires-Python: >=3.12
Requires-Dist: claude-agent-sdk<0.1.49,>=0.1.48
Requires-Dist: cryptography>=43.0.0
Requires-Dist: desktop-notifier>=6.2.0
Requires-Dist: google-api-python-client>=2.154.0
Requires-Dist: google-auth-oauthlib>=1.2.1
Requires-Dist: msal>=1.31.0
Requires-Dist: requests>=2.32.3
Requires-Dist: sentence-transformers>=3.0.1
Requires-Dist: sqlite-vec>=0.1.6
Requires-Dist: structlog>=25.5.0
Description-Content-Type: text/markdown

# home-agent

Windows-friendly personal operations agent for triaging email and calendar items with:

- provider plugins
- deterministic urgency rules
- human-reviewed memory rules
- Claude/Codex classification
- semantic retrieval over historical items
- swappable vector search backends (`sqlite-vec` or `Qdrant`)
- SQLite-backed observability for costs, runs, and retrieval traces

## Why this project exists

`home-agent` is built around a simple problem: inboxes and calendars contain operational risk, but most personal automation tools are either brittle rule engines or opaque LLM demos.

This project aims for a middle ground:

- deterministic rules handle obvious signals
- reviewed memory rules capture repeated patterns
- retrieval finds semantically similar historical items
- the LLM gets compact prior context before making the final urgency call

The result is a small local-first AI system that is easier to reason about than a generic agent loop.

## Architecture

Current classification flow:

1. provider plugins collect recent email and calendar items
2. rule scoring assigns a first-pass urgency score
3. approved memory rules can boost priority deterministically
4. items are rendered into canonical text
5. item text is embedded through a pluggable embedding provider
6. similar historical items are retrieved through a swappable vector backend
7. the Claude shell runner receives the current item plus bounded retrieval context
8. usage, retrieval traces, items, todos, and runs are persisted in SQLite
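
Steps 2 to 4 of the flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Item` type, the keyword weights, the boost values, and the canonical text layout are all assumptions made for the example. Only the `subject_token:<token>` rule-key shape is taken from the memory commands in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical minimal item shape; the real model likely has more fields."""
    kind: str  # "email" or "calendar"
    subject: str
    body: str = ""


# Assumed first-pass rules: keyword -> score contribution.
RULE_WEIGHTS = {"due": 0.4, "deadline": 0.4, "final notice": 0.5}

# Assumed approved memory rules, keyed in the `subject_token:<token>` form.
APPROVED_MEMORY_RULES = {"subject_token:university": 0.3}


def rule_score(item: Item) -> float:
    """Step 2: sum the weights of keywords found in the subject, capped at 1.0."""
    subject = item.subject.lower()
    score = sum(weight for keyword, weight in RULE_WEIGHTS.items() if keyword in subject)
    return min(score, 1.0)


def memory_boost(item: Item, base_score: float) -> float:
    """Step 3: add a fixed boost for each approved rule whose token is in the subject."""
    tokens = set(item.subject.lower().split())
    boost = sum(
        weight
        for rule_key, weight in APPROVED_MEMORY_RULES.items()
        if rule_key.split(":", 1)[1] in tokens
    )
    return min(base_score + boost, 1.0)


def canonical_text(item: Item) -> str:
    """Step 4: render a stable text form so the same item always embeds the same way."""
    return f"kind: {item.kind}\nsubject: {item.subject.strip()}\nbody: {item.body.strip()}"
```

Both scoring functions are pure and deterministic, so the same item always gets the same pre-LLM score, which is what makes this part of the pipeline easy to test and explain.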

Key design choices:

- SQLite-first storage keeps the architecture lightweight and inspectable
- `sqlite-vec` is the default retrieval backend because it fits the local-first design
- `Qdrant` is a first-class optional backend for a more production-style vector-service setup
- embeddings are pluggable across local and API-backed providers
- Claude shell execution is intentionally preserved for now so retrieval can be investigated independently of SDK migration
- LangGraph is intentionally not used because this pipeline is mostly linear

## Quick start

```bash
uv sync --dev
uv run pytest
uv run mypy
uv run home-agent doctor
uv run home-agent run --debug
```

## Auth setup

Set provider app credentials in `.env`:

```bash
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...
MICROSOFT_CLIENT_ID=...
MICROSOFT_TENANT_ID=consumers
```

Initialize tokens (stored encrypted under `.data/tokens`):

```bash
uv run home-agent auth google --init
uv run home-agent auth microsoft --init
```

## Memory review commands

```bash
uv run home-agent memory list-candidates --status pending
uv run home-agent memory approve --rule-key subject_token:university
uv run home-agent memory reject --rule-key subject_token:promo --reason "noise"
```

## Embeddings and retrieval

The embeddings pipeline is additive. It does not replace rules or reviewed memory.

What gets stored:

- canonical item text used for embeddings
- compact summary text used for retrieval context
- embedding vectors keyed by provider/model
- retrieval traces for both `retrieved` and `prompt_included` stages

Supported embedding providers:

- `local_sentence_transformers`
- `voyage`

Supported retrieval backends:

- `sqlite_vec` (default)
- `qdrant` (optional)
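
One way to keep retrieval backends swappable is to code against a small interface. The sketch below is an assumption about how such an interface could look, not the project's real API: `VectorBackend`, `InMemoryBackend`, and `retrieve_similar` are hypothetical names, and the brute-force cosine search stands in for what `sqlite-vec` or `Qdrant` would do.

```python
import math
from typing import Protocol, Sequence


class VectorBackend(Protocol):
    """Hypothetical interface that any retrieval backend would satisfy."""

    def upsert(self, item_id: str, vector: Sequence[float]) -> None: ...

    def search(self, query: Sequence[float], limit: int) -> list[tuple[str, float]]: ...


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryBackend:
    """Brute-force stand-in for a real backend, useful in tests."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, item_id: str, vector: Sequence[float]) -> None:
        self._vectors[item_id] = list(vector)

    def search(self, query: Sequence[float], limit: int) -> list[tuple[str, float]]:
        scored = [(item_id, cosine(query, vec)) for item_id, vec in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
        return scored[:limit]


def retrieve_similar(
    backend: VectorBackend, query: Sequence[float], limit: int = 3
) -> list[tuple[str, float]]:
    """Callers depend only on the protocol, so the backend can be swapped via config."""
    return backend.search(query, limit)
```

Because `VectorBackend` is a structural `Protocol`, a backend class does not need to inherit from it; it only needs matching method signatures.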

Useful commands:

```bash
uv run home-agent embeddings backfill --dry-run
uv run home-agent embeddings backfill --kind email
uv run home-agent embeddings backfill --limit 100 --batch-size 25
uv run home-agent embeddings backfill --rebuild
uv run home-agent retrieval doctor
uv run home-agent retrieval stats
uv run home-agent retrieval rebuild-index
```

Why backfill matters:

- retrieval is weak if the corpus starts empty
- backfill makes the feature immediately testable on historical items
- changing rendering or embedding logic can be handled with `--rebuild`

## Example retrieval value

Keyword-only memory can miss cases like:

- new item: `FIT3171 project due Friday`
- older item: `assignment deadline tomorrow`

Those strings may not share the exact keyword you approved, but they are semantically related. The retrieval layer can surface the older urgent item and pass it to the LLM as prior context.

Another example:

- new item: `final notice: action required`
- older item: `urgent submission reminder`

If the older item was previously classified as high urgency, retrieval can make the new decision more consistent and easier to explain later.

## Vector backend choices

### `sqlite-vec`

Use `sqlite-vec` when you want:

- local-first runtime
- one-database deployment
- minimal infrastructure
- a backend matched to the actual scale of a personal corpus

This is the default backend in `home-agent`.

### `Qdrant`

Use `Qdrant` when you want:

- a dedicated vector service
- a stronger production-style portfolio signal
- easier future growth toward larger corpora or service-based deployment

This backend is optional and selected through config.

## LLM runners

- The scheduler defaults to Claude via `claude -p --output-format json`
- Codex wiring is available via `codex exec --json`
- Usage and cost metadata are persisted in SQLite table `llm_usage`
- Retrieval context is appended to the prompt in a bounded form rather than dumping raw historical content
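
A bounded retrieval context might be assembled along these lines. This is a sketch under stated assumptions: the `RetrievedItem` fields, the limits, and the line format are all hypothetical and are not taken from the project's prompt code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RetrievedItem:
    """Hypothetical shape of one retrieval hit."""
    summary: str
    urgency: str
    similarity: float


def build_retrieval_context(
    hits: Iterable[RetrievedItem],
    max_items: int = 3,
    max_chars: int = 600,
    max_summary_chars: int = 160,
) -> str:
    """Keep the top hits as compact one-line summaries within a character budget."""
    lines: list[str] = []
    used = 0
    ranked = sorted(hits, key=lambda hit: hit.similarity, reverse=True)
    for hit in ranked[:max_items]:
        summary = hit.summary.strip()[:max_summary_chars]  # never pass raw full text
        line = f"- [{hit.urgency}] {summary} (similarity {hit.similarity:.2f})"
        if used + len(line) + 1 > max_chars:
            break  # stop before exceeding the overall budget
        lines.append(line)
        used += len(line) + 1  # +1 accounts for the newline separator
    return "\n".join(lines)
```

Capping both the item count and the total size keeps prompt cost predictable regardless of how large the historical corpus grows.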

## Logging

Runtime logs go to `.data/logs/home-agent.jsonl` by default.

- `uv run home-agent run` writes structured JSON logs and keeps raw shell payloads off by default
- `uv run home-agent run --debug` enables debug console logging and raw Claude/Codex stdout/stderr capture
- `uv run home-agent run --raw-shell-io` enables raw subprocess output capture without changing the rest of the console verbosity
- `uv run home-agent run --log-dir .data/custom-logs` overrides the log directory for that run

Useful event names in the JSON log:

- run lifecycle: `orchestrator.run.start`, `orchestrator.run.completed`
- plugin and item flow: `orchestrator.plugin.collection.completed`, `orchestrator.item.processed`
- subprocess boundaries: `llm.claude.command.*`, `llm.codex.command.*`, `notifications.toast.*`
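
Because the log is JSON Lines, it can be filtered with a few lines of standard-library Python. The sketch below assumes each record stores its event name under an `event` key, which is structlog's default but is not confirmed by this README; `iter_events` is a hypothetical helper, not part of `home-agent`.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str], prefix: str) -> Iterator[dict]:
    """Yield parsed records whose event name starts with the given prefix."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate a partially written trailing line
        # Assumption: the event name lives under the "event" key.
        if isinstance(record, dict) and str(record.get("event", "")).startswith(prefix):
            yield record


# Example usage against the default log path:
#
#     with open(".data/logs/home-agent.jsonl", encoding="utf-8") as handle:
#         for record in iter_events(handle, "orchestrator.run."):
#             print(record)
```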

Config file support:

```toml
[logging]
directory = ".data/logs"
file_name = "home-agent.jsonl"
console_level = "INFO"
file_level = "DEBUG"
capture_raw_payloads = false
subprocess_preview_chars = 4000
```

## Observability

SQLite persists:

- `runs`
- `items`
- `todos`
- `memory_candidates`
- `memory_rules`
- `memory_reviews`
- `budget`
- `llm_usage`
- `item_text_representations`
- `item_embeddings`
- `retrieval_events`

This makes the system inspectable after each run instead of relying on prompt anecdotes.

Vector storage details:

- relational source-of-truth data stays in SQLite
- `sqlite-vec` uses an in-database vector index when selected
- `Qdrant` stores vectors externally while retrieval traces still persist in SQLite

## Development workflow

Implementation in this repo is intended to follow:

- test-driven development with vertical red-green-refactor slices
- strict typing with `mypy --strict`
- one commit per completed phase of work

Current verification commands:

```bash
uv run pytest
uv run mypy
```
