Metadata-Version: 2.4
Name: smart-autocommit
Version: 0.1.0
Summary: Smart git auto-commit: deterministic gates decide what may enter history; the LLM only writes messages and researches unknown paths — and never blocks a commit.
Project-URL: Homepage, https://github.com/crhan/smart-autocommit
Project-URL: Source, https://github.com/crhan/smart-autocommit
Project-URL: Changelog, https://github.com/crhan/smart-autocommit/blob/main/CHANGELOG.md
Author-email: crhan <crhan123@gmail.com>
License: MIT
License-File: LICENSE
Keywords: autocommit,automation,cli,git,llm,systemd
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Version Control :: Git
Classifier: Topic :: Utilities
Requires-Python: >=3.11
Description-Content-Type: text/markdown

# smart-autocommit (`sac`)

Smart git auto-commit. **Deterministic rules decide what may enter your history;
the LLM only writes the commit message and researches unfamiliar paths — and it
never blocks a commit.**

Traditional autocommit (`cron` + `git add -A`, or a hand-maintained allowlist) has
two chronic problems:

- **A static allowlist rots.** The moment your repo layout changes you must hand-edit
  the list — miss one and it isn't backed up, add one too many and it's noise.
- **`add -A` is reckless.** It rakes dependency dirs, caches, nested repos, and orphan
  data straight into history — and **git history is a one-way door; you can't fully
  delete it.**

`sac` replaces the allowlist with **a controlled `add -A` behind six deterministic
gates**, then hands the two judgement-heavy, fault-tolerant jobs (writing the message,
classifying an unknown path) to an LLM that is *always optional*.

```
tracked changes ─┐
                 ├─▶ 6 deterministic gates ─▶ git commit ─▶ git push
untracked entries┘        (zero LLM)            ▲
                                                │ commit message: LLM → template → auto:
unknown top-level paths ──▶ research (LLM) ──▶ suggestion + warning (never auto-applied)
```

## Why it's safe

- **The LLM never blocks a commit.** Any LLM error, network failure, or timeout silently
  degrades to the pure deterministic path. Setting `message.enabled` and `research.enabled`
  to `false` switches the LLM off globally and returns you to rules-only.
- **Unknown paths default to *excluded*, not included.** A brand-new top-level directory
  is held back and flagged — never auto-committed into history.
- **The LLM only ever sees `--stat` / `--name-status`** (filenames + add/del counts),
  never the diff body — so secret *contents* can't leak into a message or be sent out.
- **Gates are directory-level**, not single-file — because real junk is thousands of
  small files.
- **Push never `--force`s.** If the remote has diverged, sac keeps your local commit and leaves the resolution to a human.
- **Push allowlist.** Set `push_allowed_hosts` and sac refuses to push anywhere else — a
  secret-bearing config repo can't be pushed to the wrong remote.
- **Provenance.** Every autocommit carries an `auto(sac): ` subject prefix and an
  `Auto-committed-by:` trailer, so it's never mistaken for a human commit in `git log`.
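
The "never blocks" and provenance guarantees above can be pictured as a small fallback wrapper. This is a minimal sketch, not sac's actual code: `build_message`, `template_subject`, and the trailer value are hypothetical names chosen for illustration, and the template wording is an assumption.

```python
from typing import Callable, Optional, Sequence

SUBJECT_PREFIX = "auto(sac): "                    # prefix named in the list above
TRAILER = "Auto-committed-by: smart-autocommit"   # trailer value is an assumption


def template_subject(name_status: Sequence[str]) -> str:
    """Deterministic fallback built only from the number of changed paths."""
    count = len(name_status)
    return f"update {count} file{'' if count == 1 else 's'}"


def build_message(
    name_status: Sequence[str],
    llm: Optional[Callable[[Sequence[str]], str]] = None,
) -> str:
    """Return a full commit message; any LLM failure falls back to the template."""
    subject = ""
    if llm is not None:
        try:
            lines = llm(name_status).strip().splitlines()
            subject = lines[0].strip() if lines else ""
        except Exception:
            subject = ""  # timeout, network error, bad output: degrade silently
    if not subject:
        subject = template_subject(name_status)
    return f"{SUBJECT_PREFIX}{subject}\n\n{TRAILER}\n"
```

Passing `llm=None` models the off switch: the function then never leaves the deterministic path.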

## The six gates

| # | Gate | What it stops |
|---|------|---------------|
| 1 | **denylist** | `.gitignore` pre-filters junk; an optional engine glob denylist adds defense-in-depth |
| 2 | **nested repo skip** | a path containing `.git` would become a gitlink polluting the parent |
| 3 | **unknown top-level path** | a new top-level segment not in the known set — excluded + researched (the core safety gate) |
| 4 | **dir size / file count** | a new entry over the size or file-count cap (measured on the entry itself) |
| 5 | **sensitive files** | only `--stat`/`--name-status` ever reach the LLM; sensitive-looking filenames are flagged |
| 6 | **bulk delete** | too many staged deletions aborts the commit (guards against a wiped directory) |
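
To make gates 3 and 6 concrete, here is a rough sketch of how such checks could be written as pure functions. The function names and the exact comparison (`<=` against the cap) are assumptions for illustration, not sac's implementation; the input format follows `git diff --name-status`, where each line is a status letter, a tab, and a path.

```python
from pathlib import PurePosixPath
from typing import Iterable


def top_level(path: str) -> str:
    """First path segment, e.g. 'vendor' for 'vendor/lib/x.js'."""
    return PurePosixPath(path).parts[0]


def split_unknown(
    untracked: Iterable[str], known_top_levels: set[str]
) -> tuple[list[str], list[str]]:
    """Gate 3 sketch: hold back any path whose top-level segment is not known."""
    allowed: list[str] = []
    held_back: list[str] = []
    for path in untracked:
        target = allowed if top_level(path) in known_top_levels else held_back
        target.append(path)
    return allowed, held_back


def bulk_delete_ok(name_status: Iterable[str], max_deletes: int = 20) -> bool:
    """Gate 6 sketch: pass only if staged deletions stay within the cap."""
    deletes = sum(1 for line in name_status if line.startswith("D\t"))
    return deletes <= max_deletes
```

Paths that land in `held_back` are the ones that would be excluded and handed to research rather than committed.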

## Install

Requires Python ≥ 3.11 and `git`. Recommended (isolated, on your PATH):

```sh
uv tool install smart-autocommit          # or: pipx install smart-autocommit
```

From a checkout, for system-service hosting:

```sh
./install.sh                              # installs `sac` + writes ~/.config/smart-autocommit/config.json
```

## Use

### In-place (like git) — zero registration

```sh
cd /any/git/repo
sac --dry-run        # show the decisions + message, commit nothing
sac                  # smart-commit the repo you're standing in
sac init             # drop a .smart-autocommit.json so policy travels with the repo
```

### Managed — one config, many repos, unattended

```sh
sac --all                       # process every enabled repo in the central config
sac --repo openclaw-config      # just one
sac --all --dry-run             # rehearse the whole batch
```

Run it on a schedule with the systemd user timer in [`systemd/`](systemd/) (see
[ARCHITECTURE.md](ARCHITECTURE.md#service-mode)).

### CI / scripts

```sh
sac --repo build-artifacts --json    # structured result on stdout
echo "exit: $?"                      # 0 ok · 1 a repo failed a gate · 2 usage/config error
```
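
From a Python-based pipeline, the same call can be wrapped so the exit code is turned into a readable status. This is only a sketch: `run_sac` and `describe_exit` are hypothetical helpers, the exit-code meanings are copied from the comment above, and the shape of the JSON output is not assumed here, so stdout is returned as raw text.

```python
import subprocess

# Meanings taken from the shell example above.
EXIT_MEANING = {0: "ok", 1: "a repo failed a gate", 2: "usage/config error"}


def describe_exit(code: int) -> str:
    """Map a sac exit code to a short human-readable status."""
    return EXIT_MEANING.get(code, f"unexpected exit code {code}")


def run_sac(repo: str) -> tuple[int, str]:
    """Run `sac --repo <repo> --json` and return (exit code, raw stdout)."""
    proc = subprocess.run(
        ["sac", "--repo", repo, "--json"],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout
```

A caller might log `describe_exit(code)` and fail the job only on a non-zero code.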

## Configure (three layers)

Lowest to highest precedence: **built-in defaults → central `defaults` → repo-local
`.smart-autocommit.json` → the repo's `repos[]` entry → CLI flags.**

```jsonc
{
  "defaults": {
    "push": true, "remote": "origin", "branch": "main",
    "gates": { "max_dir_size_mb": 5, "max_dir_files": 200, "max_deletes": 20, "denylist": [] },
    "message":  { "enabled": true, "language": "en", "provider": "default", "timeout": 45 },
    "research": { "enabled": true, "provider": "default", "timeout": 120, "skip_on_dry_run": true }
  },
  "providers": {
    "default": { "type": "openai", "base_url": "https://api.openai.com/v1",
                 "model": "gpt-4o-mini", "api_key_env": "OPENAI_API_KEY" }
  },
  "repos": [
    { "name": "openclaw-config", "path": "/home/me/.openclaw" }
  ]
}
```
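
The precedence chain can be read as a left-to-right merge in which later layers win. The sketch below assumes nested objects such as `gates` are merged key by key rather than replaced wholesale; that behaviour, and the names `deep_merge` and `resolve`, are assumptions made for illustration.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve(*layers: dict) -> dict:
    """Merge layers from lowest to highest precedence."""
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Lowest to highest: built-in, central defaults, repo-local, repos[] entry, CLI.
effective = resolve(
    {"push": True, "gates": {"max_deletes": 20, "max_dir_files": 200}},
    {"gates": {"max_deletes": 50}},
    {"push": False},
    {},
    {"gates": {"max_dir_files": 10}},
)
```

Here `effective` ends up with `push` set to `False`, `max_deletes` at 50, and `max_dir_files` at 10, since each value comes from the highest layer that sets it.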

If you set nothing, `sac` builds an OpenAI-compatible provider from the environment
(`SAC_API_KEY`/`OPENAI_API_KEY`, `SAC_BASE_URL`, `SAC_MODEL`). No key → messages fall
back to a template, and the commit still happens.
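
A rough sketch of that environment fallback follows. The lookup order (`SAC_API_KEY` before `OPENAI_API_KEY`) and the default `base_url` and `model` values (borrowed from the example config above) are assumptions, and `provider_from_env` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def provider_from_env(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Build an OpenAI-compatible provider dict, or None when no key is set."""
    api_key = env.get("SAC_API_KEY") or env.get("OPENAI_API_KEY")
    if not api_key:
        return None  # caller falls back to the template message
    return {
        "type": "openai",
        # Defaults below mirror the example config; they are assumptions.
        "base_url": env.get("SAC_BASE_URL", "https://api.openai.com/v1"),
        "model": env.get("SAC_MODEL", "gpt-4o-mini"),
        "api_key": api_key,
    }
```

Returning `None` instead of raising keeps the no-key case on the deterministic path.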

**Providers are pluggable** (`base_url` + `model` + `api_key_env`): any OpenAI-compatible
endpoint via `openai`, the Aliyun `bl` CLI via `bailian`, or any command via `command`.
Unknown-path research uses a coding agent (`agent`, e.g. `pi`) or, by default, the same
chat model over read-only evidence the engine gathers. See
[`config.example.json`](config.example.json) and [ARCHITECTURE.md](ARCHITECTURE.md).

> **Note:** the optional `agent` research provider runs a real coding agent with
> file-reading tools, so it is *outside* the "only filenames reach the LLM" guarantee —
> it may read file contents while investigating an unknown path. The default `chat`
> researcher does not (it only sees metadata the engine gathers). Use `agent` only where
> sending file contents to your model is acceptable.

## License

MIT — see [LICENSE](LICENSE).
