Metadata-Version: 2.4
Name: quiclabel-coco-sync
Version: 0.0.4
Summary: CLI to incrementally sync a QuicLabel COCO dataset (annotations + images) from quiclabel-admin
Project-URL: Homepage, https://github.com/weavejam/quiclabel/tree/main/apps/quiclabel-sync-project-coco
Project-URL: Repository, https://github.com/weavejam/quiclabel
Project-URL: Issues, https://github.com/weavejam/quiclabel/issues
Author: weavejam / quiclabel contributors
License-Expression: MIT
License-File: LICENSE
Keywords: annotation,coco,computer-vision,dataset,quiclabel,sync
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Requires-Dist: click>=8.1
Requires-Dist: requests>=2.31
Description-Content-Type: text/markdown

# quiclabel-coco-sync

CLI to incrementally sync a QuicLabel COCO dataset (annotations + images)
from `quiclabel-admin`. It writes a fresh `annotations-YYYYMMDD-HHMMSS.json`
next to your existing dataset and downloads, in parallel threads, only the
images you don't already have.

## Prerequisites

- **uv** — Python package & runtime manager. Install:
  - macOS / Linux: `curl -LsSf https://astral.sh/uv/install.sh | sh`
  - Windows: `winget install astral-sh.uv` (or `irm https://astral.sh/uv/install.ps1 | iex`)
  - via pipx: `pipx install uv`
- **An API key** — get one from quiclabel-admin: *Settings → API Keys → New key*.
  Copy the `qk_...` value immediately (it's only shown once).

## Quick start (from PyPI — recommended)

No clone, no install — `uvx` downloads, caches and runs in one shot:

```bash
uvx quiclabel-coco-sync path/to/annotations.json \
  --admin-url https://quiclabel-admin.example.com \
  --api-key qk_xxxxxxxxxxxxxxxxxxxxxx
```

Or set env vars and call it bare:

```bash
export QUICLABEL_ADMIN_URL=https://quiclabel-admin.example.com
export QUICLABEL_API_KEY=qk_xxxxxxxxxxxxxxxxxxxxxx
uvx quiclabel-coco-sync path/to/annotations.json
```

Prefer a persistent install? Use `uv tool`:

```bash
uv tool install quiclabel-coco-sync
quiclabel-coco-sync path/to/annotations.json --admin-url ... --api-key ...
```

## From the monorepo (contributors)

```bash
# From the repo root
pnpm sync-project-coco path/to/annotations.json \
  --admin-url https://quiclabel-admin.example.com \
  --api-key qk_xxxxxxxxxxxxxxxxxxxxxx
```

Or directly with `uv` against this app directory:

```bash
cd apps/quiclabel-sync-project-coco
uv sync
uv run quiclabel-coco-sync path/to/annotations.json \
  --admin-url https://quiclabel-admin.example.com \
  --api-key qk_xxxxxxxxxxxxxxxxxxxxxx
```

## What it does

1. Reads `path/to/annotations.json` and its `meta` block (added by the COCO exporter).
2. Calls `GET /api/v1/projects/<project_id>/coco` with the same filters,
   paging by cursor — so 10k+ task projects don't blow up server memory.
3. Writes `path/to/annotations-20260519-143045.json` (timestamped — never
   overwrites your input).
4. Diffs `task_id` sets, then downloads any missing images to `path/to/images/`
   using a thread pool. Files already on disk are skipped by file name.

The old `annotations.json` and the existing `images/*` files are never touched.

## Configuration priority

Each value is resolved in this order — first wins:

1. CLI flag (`--project-id`, `--statuses`, …)
2. Env var (`QUICLABEL_ADMIN_URL`, `QUICLABEL_API_KEY`)
3. `meta` block of the input json

If anything required is missing from all three, the CLI exits with a clear
message naming the missing key and where to provide it.
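
A minimal sketch of that lookup order, for illustration only. The helper name `resolve_setting` and the `meta` key names (such as `admin_url`) are assumptions, not the CLI's real internals.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    cli_value: Optional[str],
    env_var: Optional[str],
    meta: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the first value found: CLI flag, then env var, then the meta block."""
    if cli_value:
        return cli_value
    if env_var and environ.get(env_var):
        return environ[env_var]
    if meta.get(key):
        return meta[key]
    # Nothing found anywhere: name the key and every place it could be set.
    hint = f"--{key.replace('_', '-')}"
    if env_var:
        hint += f", ${env_var}"
    raise SystemExit(
        f"Missing required setting '{key}': provide {hint}, "
        f"or meta.{key} in the input json"
    )
```

Settings without an env var (for example `--project-id`) would pass `env_var=None`, so the lookup falls straight through from the flag to the `meta` block.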

## Recovery

- **Partial failure** (some images failed mid-run): just re-run the same
  command. Already-downloaded files are skipped by file name, so a retry
  fetches only the remaining ones. The CLI tells you this in the failure summary.
- **Corrupt image file**: delete it, then re-run.
- **A `.part` file in `images/`** indicates a crashed download. Safe to delete.

## Development

```bash
cd apps/quiclabel-sync-project-coco
uv sync --group dev
uv run pytest
```

## Releasing to PyPI (maintainers)

Releases are tag-driven. Pushing a `coco-sync-v<version>` tag triggers
`.github/workflows/coco-sync-release.yml`, which builds, runs tests, and
publishes to PyPI via Trusted Publishers (OIDC — no API token needed).

```bash
cd apps/quiclabel-sync-project-coco

# 1. bump version in pyproject.toml (e.g. 0.0.1 -> 0.0.2)
# 2. add a section in CHANGELOG.md
# 3. commit, then tag the exact same version
git add pyproject.toml CHANGELOG.md
git commit -m "coco-sync: release v0.0.2"
git tag coco-sync-v0.0.2
git push && git push origin coco-sync-v0.0.2
```

The release workflow asserts that the version in the git tag matches the
`version` in `pyproject.toml`, and fails the publish if they differ.

### One-time PyPI Trusted Publisher setup

In <https://pypi.org/manage/project/quiclabel-coco-sync/settings/publishing/>,
add a publisher with:

| Field           | Value                       |
| --------------- | --------------------------- |
| Owner           | `weavejam`                  |
| Repository      | `quiclabel`                 |
| Workflow name   | `coco-sync-release.yml`     |
| Environment     | `pypi`                      |

Also create a `pypi` environment in GitHub repo settings (optional protection
rules: required reviewers, branch restrictions).

### Manual fallback

If the workflow is unavailable, publish from your machine. Note that
`uv publish` **does not** read `~/.pypirc` (unlike twine) — pass the
token via `UV_PUBLISH_TOKEN` or `--token`:

```bash
cd apps/quiclabel-sync-project-coco
uv build
# PowerShell: extract token from ~/.pypirc and publish
# (run either this variant or the bash one below, not both)
$env:UV_PUBLISH_TOKEN = (Get-Content (Join-Path $env:USERPROFILE '.pypirc') |
  Where-Object { $_ -match '^password\s*=' } |
  ForEach-Object { ($_ -replace '^password\s*=\s*','').Trim() } |
  Select-Object -First 1)
uv publish

# bash equivalent:
export UV_PUBLISH_TOKEN=$(grep '^password' ~/.pypirc | head -1 | sed 's/^password *= *//')
uv publish
```

Get a PyPI API token at <https://pypi.org/manage/account/token/>.

> ⚠️ **PyPI versions are immutable.** Once a version is published it
> cannot be overwritten or re-uploaded — even after `yank`. Every release
> must bump the `version` in `pyproject.toml` (e.g. `0.0.1` → `0.0.2`).

