Reading additional input from stdin...
OpenAI Codex v0.128.0 (research preview)
--------
workdir: J:\CLAUDE\PROJECTS\Wakeword
model: gpt-5.5
provider: openai
approval: never
sandbox: danger-full-access
reasoning effort: xhigh
reasoning summaries: none
session id: 019e0233-5693-7c13-93b6-d2dc72a8c757
--------
user
Environment: Windows 11, bash shell. Project root: J:\CLAUDE\PROJECTS\Wakeword (master).

PROBLEM
We just discovered three runtime bugs in the training pipeline that ONLY surface when actually invoking the pipeline (not caught by 147 unit tests):

1. `OWWModel()` defaulting to TFLite when bundled tflite_runtime can't read current openwakeword .tflite schema → fixed by pinning `inference_framework="onnx"` in `src/violawake_sdk/tools/train.py` (3 sites, lines ~732, ~872, ~1764).

2. openwakeword resource files not pre-downloaded in container → fixed by `download_models()` call in `entrypoint.sh`.

3. fast_mp3_augment dependency missing → added to `console/backend/requirements.txt`.

Those were the obvious ones the live test surfaced. Find the OTHER ones we haven't tripped over yet — the classes of latent bugs in the training + SDK paths that would bite users on a fresh deploy.

CRITICAL CONSTRAINTS
- Do NOT use PowerShell with complex quoting.
- Use Read tool / `head` / `tail` / `sed -n` / `grep`.
- READ-ONLY investigation primarily; only fix things that are clearly bugs and have minimum-touch fixes (one-line / one-import additions). Document larger bugs with proposed-fix notes — DO NOT do big refactors.

INVESTIGATE
Audit these classes of issues across both `src/violawake_sdk/` and `console/backend/`:

A. **Hardcoded backend / framework / format choices that fail when the assumed default isn't there.**
   - Search for: `OWWModel(`, `Interpreter(`, `tflite`, `ort.InferenceSession(`, `torch.load(`, `transformers.AutoModel`, `whisper.load_model(`.
   - For each, check: does the constructor pick the right backend by default in our container? Are there fallbacks? If the assumed file/format isn't present, does the error message give a clear hint?
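
A minimal sketch of the kind of check item A asks for, assuming the backend can be chosen by probing which runtime package is importable. `pick_inference_framework` and the `runtime_modules` mapping are hypothetical names for illustration, not part of the SDK or of openwakeword.

```python
from __future__ import annotations

import importlib.util

# Assumed mapping from framework name to the package it needs at runtime.
DEFAULT_RUNTIME_MODULES = {"onnx": "onnxruntime", "tflite": "tflite_runtime"}


def pick_inference_framework(
    preferred: tuple[str, ...] = ("onnx", "tflite"),
    runtime_modules: dict[str, str] | None = None,
) -> str:
    """Return the first preferred framework whose runtime package is importable."""
    modules = runtime_modules or DEFAULT_RUNTIME_MODULES
    missing: list[str] = []
    for name in preferred:
        module = modules[name]
        # find_spec returns None for a top-level package that is not installed.
        if importlib.util.find_spec(module) is not None:
            return name
        missing.append(module)
    # Fail loudly, naming every package that was tried.
    raise RuntimeError(
        "No inference runtime available; tried: "
        + ", ".join(missing)
        + ". Install one of them or pass the framework explicitly."
    )
```

The returned name could then be passed as `inference_framework=` at the `OWWModel(...)` call sites, so a missing runtime fails with a message that names the package instead of a schema error.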

B. **Imports inside functions (lazy) that fail at runtime, not import time.**
   - Search for: `def [^_].*:.*\n.*import ` (function-local imports). If a dependency is missing, the module still loads but the function fails when called, which hides the issue from CI.
   - List all such imports in `src/violawake_sdk/tools/train.py`, `src/violawake_sdk/oww_backbone.py`, and any backend services.
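
The multi-line pattern in item B is hard to express with line-based `grep`, so one alternative is to walk the syntax tree. This is a sketch only; `find_function_local_imports` is a hypothetical helper, and it reports the innermost enclosing function for each import.

```python
from __future__ import annotations

import ast


def find_function_local_imports(source: str) -> list[tuple[int, str, str]]:
    """Return (line, enclosing function, module) for every import inside a function."""
    found: list[tuple[int, str, str]] = []

    def visit(node: ast.AST, func: str | None) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Entering a function: imports below this point are lazy.
                visit(child, child.name)
            elif isinstance(child, ast.Import) and func is not None:
                for alias in child.names:
                    found.append((child.lineno, func, alias.name))
            elif isinstance(child, ast.ImportFrom) and func is not None:
                # Preserve leading dots of relative imports.
                module = "." * child.level + (child.module or "")
                found.append((child.lineno, func, module))
            else:
                # Descend into try/if/with/class bodies and similar containers.
                visit(child, func)

    visit(ast.parse(source), None)
    return sorted(found)
```

Run against the text of a file such as `src/violawake_sdk/tools/train.py`, the result lists each lazy import with the function that would fail at call time if the dependency were missing.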

C. **External-network dependencies in the synchronous training path.**
   - Search for: `urllib.request`, `requests.`, `httpx.`, `urlopen(`, `Communicate(`, `download_models`, `download_url`.
   - Each one is a potential failure point on a customer machine without internet, behind a corporate proxy, or with rate-limited APIs.
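
One way to make the failure points in item C loud is to wrap each download with a bounded timeout and an error that points at connectivity. The sketch below uses only the standard library; `fetch_with_hint` is a hypothetical name, and the wording of the hint is an assumption.

```python
from __future__ import annotations

import urllib.error
import urllib.request


def fetch_with_hint(url: str, timeout: float = 10.0) -> bytes:
    """Download url, converting network failures into one actionable error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except (urllib.error.URLError, TimeoutError) as exc:
        # URLError also covers HTTPError (for example 429 from a rate-limited API).
        raise RuntimeError(
            f"Could not download {url}: {exc}. "
            "Check connectivity and proxy settings, or pre-download the file."
        ) from exc
```

The timeout keeps a synchronous training job from hanging indefinitely on a machine without internet access.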

D. **Filesystem expectations.**
   - Search for: `Path(__file__).resolve().parent`, `Path("corpus")`, `Path.home()`, `os.environ.get(.*[Dd]ir`.
   - Each is a path the code expects to exist; document which ones are required vs. optional, and whether failure is loud or silent.
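
The required-versus-optional distinction in item D can be made explicit in code. This is a sketch under the assumption that each expected location is a list of candidate paths; `first_existing` and the logger name are hypothetical.

```python
from __future__ import annotations

import logging
from pathlib import Path

logger = logging.getLogger("path_check")


def first_existing(candidates: list[Path], *, label: str, required: bool) -> Path | None:
    """Return the first candidate that exists; fail loudly only if required."""
    for candidate in candidates:
        if candidate.exists():
            return candidate
    searched = ", ".join(str(candidate) for candidate in candidates)
    if required:
        # Required path: raise with every location that was tried.
        raise FileNotFoundError(f"{label} not found; searched: {searched}")
    # Optional path: keep going, but leave a trace in the logs.
    logger.warning("%s not found (optional); searched: %s", label, searched)
    return None
```

With this shape, an optional lookup that finds nothing leaves a warning in the logs instead of being skipped silently.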

E. **Race conditions / startup order.**
   - Look at `console/backend/app/main.py` startup hooks. Do any depend on Postgres being ready? On models being downloaded? If so, do they retry?
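
For item E, a startup step that depends on an external service can be wrapped in a bounded retry. The following is a sketch with exponential backoff; `retry_startup` is a hypothetical helper, and the attempt count and delays are illustrative defaults, not values taken from the project.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable


async def retry_startup(
    step: Callable[[], Awaitable[None]],
    *,
    attempts: int = 5,
    base_delay: float = 0.5,
) -> None:
    """Run an async startup step, retrying with exponential backoff on failure."""
    for attempt in range(1, attempts + 1):
        try:
            await step()
            return
        except Exception:
            if attempt == attempts:
                # Out of retries: surface the last error to the caller.
                raise
            # Delays grow as base_delay * 1, 2, 4, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

A lifespan hook could call something like `await retry_startup(init_db)` so that a database that is still starting delays readiness instead of crashing the process on the first attempt.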

F. **Quota + rate limit gotchas.**
   - Look at `console/backend/app/services/training_service.py`. What happens if a user is at quota and tries to train? Does it 4xx cleanly or 500?

DECIDE & IMPLEMENT
Produce a single audit doc at `docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md` containing:

For EACH finding:
- Severity: CRITICAL (would block deploy) / HIGH (would block a customer) / MEDIUM (degrades quality) / LOW (cosmetic)
- File + line(s)
- Concrete evidence (a code snippet or a `grep` result)
- Proposed fix (one paragraph; do NOT implement except for trivially-mechanical one-liners)
- Estimated effort

If you find a CRITICAL or HIGH issue with an OBVIOUS one-line fix (e.g., another forgotten `inference_framework="onnx"` somewhere, missing requirement), apply it and commit separately.

Do NOT refactor. Do NOT rewrite functions. The point is the audit.

PROVE IT
1. Show the audit doc head -100.
2. List any one-line fixes you applied with their commit SHAs.
3. Confirm no test regressions: `cd console && python -m pytest tests/ --no-cov --timeout=30 --ignore=tests/e2e -q 2>&1 | tail -3`.

REPORT
- Number of findings by severity.
- Any commits made.
- One-liner takeaway: what's the next biggest landmine after the three we already fixed?

Time budget: ~25 min.
codex
I’ll keep this audit read-only first, then only patch if there’s an unambiguous one-line fix. I’ll use bash-style `grep`/`sed`/`head` commands and avoid PowerShell-heavy quoting as requested.
mcp: local-terminal/run_command started
mcp: local-terminal/run_command (failed)
mcp: local-terminal/run_command started
mcp: local-terminal/run_command (failed)
mcp: local-terminal/run_command started
mcp: local-terminal/run_command (completed)
mcp: local-terminal/run_command started
mcp: local-terminal/run_command (completed)
  → Map repo state and relevant training/backend files
  • Audit framework defaults, lazy imports, network calls, filesystem paths, startup, quota behavior
  • Apply only obvious one-line fixes if any, with separate commits
  • Write audit doc and verify with requested commands/tests
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"sed -n '1,260p' console/backend/app/services/training_service.py\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git status --short --branch"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"sed -n '1,220p' console/backend/app/main.py\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "find src/violawake_sdk console/backend -maxdepth 3 -type f | sort | head -120"' in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1123ms:
"""Training pipeline helpers used by the async job queue."""

from __future__ import annotations

import json
import logging
import random
import shutil
import tempfile
import time
from collections.abc import Callable
from dataclasses import dataclass
from pathlib import Path
from typing import Any

from app.config import settings
from app.monitoring import log_exception
from app.storage import get_storage

logger = logging.getLogger("violawake.training")


class TrainingCancelledError(RuntimeError):
    """Raised when a running training job is cancelled."""


@dataclass(slots=True)
class TrainingArtifact:
    """Artifacts produced by a completed training run."""

    local_path: Path
    config_json: str | None
    config_bytes: bytes | None
    d_prime: float | None
    size_bytes: int


def run_training_job_sync(
    *,
    job_id: int,
    wake_word: str,
    recording_identifiers: list[str],
    output_path: Path,
    epochs: int,
    timeout_seconds: int,
    progress_callback: Callable[[dict[str, Any]], None],
    is_cancelled: Callable[[], bool],
    negatives_dir: Path | None = None,
) -> TrainingArtifact:
    """Run the ViolaWake SDK training pipeline synchronously."""
    positives_dir: Path | None = None
    neg_temp_dir: Path | None = None
    storage = get_storage()

    def _ensure_not_cancelled() -> None:
        if is_cancelled():
            raise TrainingCancelledError("Training cancelled by user")

    try:
        _ensure_not_cancelled()
        progress_callback({
            "status": "running",
            "progress": 0.0,
            "epoch": 0,
            "total_epochs": epochs,
            "train_loss": 0.0,
            "val_loss": 0.0,
            "message": "Preparing training data...",
            "error": None,
        })

        positives_dir = Path(tempfile.mkdtemp(prefix="violawake_train_", dir=str(settings.tmp_dir)))
        for index, recording_identifier in enumerate(recording_identifiers):
            _ensure_not_cancelled()
            if not storage.exists(recording_identifier):
                logger.warning("Recording %s was missing for training job %s", recording_identifier, job_id)
                continue

            dst = positives_dir / f"sample_{index:04d}.wav"
            dst.write_bytes(storage.download(recording_identifier))

        wav_count = len(list(positives_dir.glob("*.wav")))
        if wav_count < 5:
            raise RuntimeError("Only %s valid WAV files found. Need at least 5." % wav_count)

        pos_files = sorted(positives_dir.glob("*.wav"))

        progress_callback({
            "status": "running",
            "progress": 2.0,
            "epoch": 0,
            "total_epochs": epochs,
            "train_loss": 0.0,
            "val_loss": 0.0,
            "message": "Loaded %s recordings. Generating TTS corpus..." % wav_count,
            "error": None,
        })

        # -- Production pipeline: full auto-corpus (matching CLI train) --
        from violawake_sdk.tools.train import (
            _generate_confusable_negatives,
            _generate_speech_negatives,
            _generate_tts_positives,
            _train_temporal_cnn,
        )

        neg_temp_dir = Path(tempfile.mkdtemp(prefix="violawake_neg_", dir=str(settings.tmp_dir)))

        # Auto-generate TTS positives when user has <100 samples (production behavior)
        # Keep track of user-provided files so augmentation targets only real recordings
        user_pos_files = list(pos_files)
        if len(pos_files) < 100:
            tts_pos_dir = neg_temp_dir / "tts_positives"
            try:
                tts_pos_files = _generate_tts_positives(
                    wake_word,
                    tts_pos_dir,
                    verbose=False,
                )
                if tts_pos_files:
                    pos_files = list(pos_files) + tts_pos_files
                    logger.info(
                        "Generated %s TTS positives for job %s (total: %s)",
                        len(tts_pos_files), job_id, len(pos_files),
                    )
            except Exception as exc:
                logger.error(
                    "TTS positive generation FAILED for job %s: %s — "
                    "model quality will be degraded without TTS diversity",
                    job_id, exc,
                )

            _ensure_not_cancelled()
            progress_callback({
                "status": "running",
                "progress": 3.0,
                "epoch": 0,
                "total_epochs": epochs,
                "train_loss": 0.0,
                "val_loss": 0.0,
                "message": "Corpus: %s positives. Generating negatives..." % len(pos_files),
                "error": None,
            })
        neg_tag_map: dict[str, list[Path]] = {}

        # Source 1: User/paid-tier corpus negatives
        if negatives_dir and negatives_dir.exists():
            user_neg = sorted(
                list(negatives_dir.rglob("*.wav")) + list(negatives_dir.rglob("*.flac"))
            )
            if user_neg:
                neg_tag_map["neg_user"] = user_neg
                logger.info("Loaded %s corpus negatives for job %s", len(user_neg), job_id)

        _ensure_not_cancelled()

        # Source 2: Auto-generated confusable negatives (phonetically similar)
        # Two rounds matching CLI production pipeline:
        #   Round 1: 30 confusables x 10 voices (broad phonetic coverage)
        #   Round 2: 16 confusables x 10 voices (tight variants for hard negatives)
        confusable_dir_r1 = neg_temp_dir / "confusables_r1"
        try:
            confusable_r1 = _generate_confusable_negatives(
                wake_word,
                confusable_dir_r1,
                n_confusables=30,
                voices_per_word=10,
                verbose=False,
            )
            if confusable_r1:
                neg_tag_map["neg_confusable_r1"] = confusable_r1
        except Exception as exc:
            logger.error(
                "Confusable round 1 FAILED for job %s: %s — "
                "model will have higher false positive rate on similar-sounding words",
                job_id, exc,
            )

        _ensure_not_cancelled()

        confusable_dir_r2 = neg_temp_dir / "confusables_r2"
        try:
            confusable_r2 = _generate_confusable_negatives(
                wake_word,
                confusable_dir_r2,
                n_confusables=16,
                voices_per_word=10,
                verbose=False,
            )
            if confusable_r2:
                neg_tag_map["neg_confusable_r2"] = confusable_r2
        except Exception as exc:
            logger.error(
                "Confusable round 2 FAILED for job %s: %s",
                job_id, exc,
            )

        _ensure_not_cancelled()
        progress_callback({
            "status": "running",
            "progress": 4.0,
            "epoch": 0,
            "total_epochs": epochs,
            "train_loss": 0.0,
            "val_loss": 0.0,
            "message": "Generated confusables. Generating speech negatives...",
            "error": None,
        })

        # Source 3: Auto-generated speech negatives (common phrases)
        # 5 voices matching CLI production pipeline (was 3)
        speech_dir = neg_temp_dir / "speech"
        try:
            speech_files = _generate_speech_negatives(
                speech_dir,
                n_voices=5,
                verbose=False,
            )
            if speech_files:
                neg_tag_map["neg_speech"] = speech_files
        except Exception as exc:
            logger.error(
                "Speech neg generation FAILED for job %s: %s — "
                "model will have higher false positive rate on general speech",
                job_id, exc,
            )

        _ensure_not_cancelled()

        # Source 4: Universal corpus (LibriSpeech, MUSAN) if available
        _CORPUS_SEARCH_PATHS = [
            Path(__file__).resolve().parent.parent.parent.parent / "corpus",  # repo root
            Path.home() / ".violawake" / "corpus",
            Path("corpus"),
        ]
        _CORPUS_SUBDIRS: dict[str, tuple[str, ...]] = {
            "neg_librispeech": ("librispeech",),
            "neg_musan_speech": ("musan/musan/speech", "musan/speech"),
            "neg_musan_music": ("musan/musan/music", "musan/music"),
            "neg_musan_noise": ("musan/musan/noise", "musan/noise"),
        }
        _rng = random.Random(42)
        for tag, subdirs in _CORPUS_SUBDIRS.items():
            for corpus_root in _CORPUS_SEARCH_PATHS:
                if not corpus_root.exists():
                    continue
                for subdir in subdirs:
                    candidate = corpus_root / subdir
                    if candidate.exists():
                        corpus_files = sorted(
                            list(candidate.rglob("*.wav")) + list(candidate.rglob("*.flac"))
                        )
                        if corpus_files:
                            if len(corpus_files) > 2000:
                                corpus_files = sorted(_rng.sample(corpus_files, 2000))
                            neg_tag_map[tag] = corpus_files
                            break
                if tag in neg_tag_map:
                    break


 succeeded in 1146ms:
"""FastAPI application for ViolaWake Console backend."""

from __future__ import annotations

import asyncio
import logging
import secrets
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager, suppress

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from slowapi import _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

from app.config import settings
from app.database import init_db
from app.health import router as health_router
from app.job_queue import init_job_queue, shutdown_job_queue
from app.middleware import (
    ErrorHandlingMiddleware,
    MaxBodySizeMiddleware,
    RequestLoggingMiddleware,
    SecurityHeadersMiddleware,
    configure_logging,
    init_sentry,
    register_exception_handlers,
)
from app.monitoring import APP_VERSION, init_monitoring_state, log_exception, mark_startup_complete
from app.rate_limit import limiter
from app.routes import auth, billing, files, jobs, models, recordings, teams, training

configure_logging()
logger = logging.getLogger("violawake.console")
init_sentry()

_RETENTION_INTERVAL_SECONDS = 24 * 60 * 60  # 24 hours


async def _retention_loop() -> None:
    """Run retention cleanup every 24 hours in the background."""
    from app.retention import (
        cleanup_expired_models,
        cleanup_expired_recordings,
        cleanup_soft_deleted_recordings,
    )

    while True:
        try:
            await cleanup_soft_deleted_recordings()
            await cleanup_expired_recordings()
            await cleanup_expired_models()
        except Exception as exc:
            log_exception(logger, exc, message="Retention cleanup cycle failed", source="retention")

        await asyncio.sleep(_RETENTION_INTERVAL_SECONDS)


@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncIterator[None]:
    """Application lifespan: initialize DB and mark the app ready."""
    init_monitoring_state(app)
    logger.info(
        "%s",
        "Starting ViolaWake Console backend",
        extra={"event_data": {"source": "startup", "database_target": settings.database_log_target}},
    )

    retention_task: asyncio.Task[None] | None = None
    try:
        await init_db()
        await init_job_queue()

        # Eagerly initialize the email service so the "Resend not configured"
        # warning is logged at startup rather than on the first registration.
        from app.email_service import get_email_service
        email_svc = get_email_service()
        if not email_svc.enabled:
            logger.warning(
                "Email service is disabled (VIOLAWAKE_RESEND_API_KEY not set). "
                "Users will be auto-verified on registration."
            )

        retention_task = asyncio.create_task(_retention_loop(), name="retention-cleanup")

        mark_startup_complete(app)
        logger.info(
            "%s",
            "ViolaWake Console backend ready",
            extra={"event_data": {"source": "startup", "version": APP_VERSION}},
        )
        yield
    except Exception as exc:
        log_exception(logger, exc, message="Application startup failed", source="startup")
        raise
    finally:
        if retention_task is not None:
            retention_task.cancel()
            with suppress(asyncio.CancelledError):
                await retention_task
        await shutdown_job_queue()
        logger.info("%s", "Shutting down ViolaWake Console backend", extra={"event_data": {"source": "shutdown"}})


app = FastAPI(
    title="ViolaWake Console",
    description="Backend API for ViolaWake wake word training console",
    version=APP_VERSION,
    lifespan=lifespan,
)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
init_monitoring_state(app)
register_exception_handlers(app)

app.add_middleware(ErrorHandlingMiddleware)
app.add_middleware(MaxBodySizeMiddleware)
app.add_middleware(RequestLoggingMiddleware)
app.add_middleware(SecurityHeadersMiddleware)
app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.effective_cors_origins,
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"],
    allow_headers=["Authorization", "Content-Type", "Accept", "Origin", "X-Requested-With"],
)

app.include_router(health_router)
app.include_router(auth.router)
app.include_router(recordings.router)
app.include_router(jobs.router)
app.include_router(training.router)
app.include_router(models.router)
app.include_router(billing.router)
app.include_router(files.router)
app.include_router(teams.router)


# ---------------------------------------------------------------------------
# Admin endpoint: POST /api/admin/cleanup
# ---------------------------------------------------------------------------
# Guarded by VIOLAWAKE_ADMIN_TOKEN.  When the env var is not set (development
# default), the endpoint is disabled and returns 404 so it is never reachable
# without explicit configuration.
# ---------------------------------------------------------------------------

from fastapi import APIRouter, Depends, Header, HTTPException, status  # noqa: E402

_admin_router = APIRouter(prefix="/api/admin", tags=["admin"])


async def _require_admin(x_admin_token: str | None = Header(default=None)) -> None:
    """Dependency that verifies the X-Admin-Token header against config."""
    admin_token: str = getattr(settings, "admin_token", "")
    if not admin_token:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Not found")
    if not secrets.compare_digest(x_admin_token or "", admin_token):
        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Forbidden")


@_admin_router.post("/cleanup")
async def trigger_cleanup(
    _: None = Depends(_require_admin),
) -> dict[str, int]:
    """Manually trigger retention cleanup for recordings and models.

    Requires ``VIOLAWAKE_ADMIN_TOKEN`` to be set in the environment.
    Protected by the ``X-Admin-Token`` request header.
    """
    from app.retention import (
        cleanup_expired_models,
        cleanup_expired_recordings,
        cleanup_soft_deleted_recordings,
    )

    soft_deleted_purged = await cleanup_soft_deleted_recordings()
    recordings_deleted = await cleanup_expired_recordings()
    models_deleted = await cleanup_expired_models()
    logger.info(
        "Admin cleanup triggered: %s soft-deleted recording(s) purged, %s recording(s) and %s model(s) deleted",
        soft_deleted_purged,
        recordings_deleted,
        models_deleted,
    )
    return {
        "soft_deleted_recordings_purged": soft_deleted_purged,
        "recordings_deleted": recordings_deleted,
        "models_deleted": models_deleted,
    }


app.include_router(_admin_router)

 succeeded in 1215ms:
## master...origin/master
?? .codex_log_a.txt
?? .codex_log_alpha.txt
?? .codex_log_b.txt
?? .codex_log_c.txt
?? .codex_log_delta.txt
?? .codex_prompt_a.txt
?? .codex_prompt_alpha.txt
?? .codex_prompt_b.txt
?? .codex_prompt_c.txt
?? .codex_prompt_delta.txt

 succeeded in 1263ms:
console/backend/.coverage
console/backend/.env
console/backend/.env.example
console/backend/__pycache__/run.cpython-311.pyc
console/backend/alembic.ini
console/backend/alembic/env.py
console/backend/alembic/script.py.mako
console/backend/alembic/versions/20260326_0001_a1b2c3d4e5f6_initial_schema.py
console/backend/alembic/versions/20260328_0001_b2c3d4e5f6a7_add_teams.py
console/backend/alembic/versions/20260405_0001_c3d4e5f6a7b8_add_login_lockout_and_soft_delete.py
console/backend/alembic/versions/20260507_0001_d4e5f6a7b8c9_add_processed_stripe_events_table.py
console/backend/app/__init__.py
console/backend/app/__pycache__/__init__.cpython-311.pyc
console/backend/app/__pycache__/auth.cpython-311.pyc
console/backend/app/__pycache__/config.cpython-311.pyc
console/backend/app/__pycache__/database.cpython-311.pyc
console/backend/app/__pycache__/email_service.cpython-311.pyc
console/backend/app/__pycache__/health.cpython-311.pyc
console/backend/app/__pycache__/job_queue.cpython-311.pyc
console/backend/app/__pycache__/main.cpython-311.pyc
console/backend/app/__pycache__/middleware.cpython-311.pyc
console/backend/app/__pycache__/models.cpython-311.pyc
console/backend/app/__pycache__/monitoring.cpython-311.pyc
console/backend/app/__pycache__/rate_limit.cpython-311.pyc
console/backend/app/__pycache__/retention.cpython-311.pyc
console/backend/app/__pycache__/schemas.cpython-311.pyc
console/backend/app/__pycache__/storage.cpython-311.pyc
console/backend/app/auth.py
console/backend/app/config.py
console/backend/app/database.py
console/backend/app/email_service.py
console/backend/app/health.py
console/backend/app/job_queue.py
console/backend/app/main.py
console/backend/app/middleware.py
console/backend/app/models.py
console/backend/app/monitoring.py
console/backend/app/rate_limit.py
console/backend/app/retention.py
console/backend/app/routes/__init__.py
console/backend/app/routes/auth.py
console/backend/app/routes/billing.py
console/backend/app/routes/files.py
console/backend/app/routes/jobs.py
console/backend/app/routes/models.py
console/backend/app/routes/recordings.py
console/backend/app/routes/teams.py
console/backend/app/routes/training.py
console/backend/app/schemas.py
console/backend/app/services/__init__.py
console/backend/app/services/training_service.py
console/backend/app/storage.py
console/backend/data/auth-test-1774540988789338000.db
console/backend/data/auth-test-inline.db
console/backend/data/auth-test-temp.db
console/backend/data/demo.db
console/backend/data/e2e_test.db
console/backend/data/job_queue.db
console/backend/data/violawake.db
console/backend/entrypoint.sh
console/backend/requirements.txt
console/backend/run.py
console/backend/scripts/train_full_pipeline.py
console/backend/tests/__init__.py
console/backend/tests/__pycache__/__init__.cpython-311.pyc
console/backend/tests/__pycache__/test_change_password.cpython-311.pyc
console/backend/tests/__pycache__/test_change_password.cpython-311-pytest-8.4.1.pyc
console/backend/tests/test_change_password.py
src/violawake_sdk/__init__.py
src/violawake_sdk/__pycache__/__init__.cpython-311.pyc
src/violawake_sdk/__pycache__/__init__.cpython-313.pyc
src/violawake_sdk/__pycache__/_constants.cpython-311.pyc
src/violawake_sdk/__pycache__/_exceptions.cpython-311.pyc
src/violawake_sdk/__pycache__/_exceptions.cpython-313.pyc
src/violawake_sdk/__pycache__/async_detector.cpython-311.pyc
src/violawake_sdk/__pycache__/audio.cpython-311.pyc
src/violawake_sdk/__pycache__/audio_source.cpython-311.pyc
src/violawake_sdk/__pycache__/confidence.cpython-311.pyc
src/violawake_sdk/__pycache__/ensemble.cpython-311.pyc
src/violawake_sdk/__pycache__/models.cpython-311.pyc
src/violawake_sdk/__pycache__/models.cpython-313.pyc
src/violawake_sdk/__pycache__/noise_profiler.cpython-311.pyc
src/violawake_sdk/__pycache__/oww_backbone.cpython-311.pyc
src/violawake_sdk/__pycache__/pipeline.cpython-311.pyc
src/violawake_sdk/__pycache__/pipeline.cpython-313.pyc
src/violawake_sdk/__pycache__/power_manager.cpython-311.pyc
src/violawake_sdk/__pycache__/speaker.cpython-311.pyc
src/violawake_sdk/__pycache__/stt.cpython-311.pyc
src/violawake_sdk/__pycache__/stt.cpython-313.pyc
src/violawake_sdk/__pycache__/stt_engine.cpython-311.pyc
src/violawake_sdk/__pycache__/stt_engine.cpython-313.pyc
src/violawake_sdk/__pycache__/tts.cpython-311.pyc
src/violawake_sdk/__pycache__/tts.cpython-313.pyc
src/violawake_sdk/__pycache__/tts_engine.cpython-311.pyc
src/violawake_sdk/__pycache__/vad.cpython-311.pyc
src/violawake_sdk/__pycache__/vad.cpython-313.pyc
src/violawake_sdk/__pycache__/vad_engine.cpython-311.pyc
src/violawake_sdk/__pycache__/wake_detector.cpython-311.pyc
src/violawake_sdk/__pycache__/wake_detector.cpython-313.pyc
src/violawake_sdk/_constants.py
src/violawake_sdk/_exceptions.py
src/violawake_sdk/async_detector.py
src/violawake_sdk/audio.py
src/violawake_sdk/audio_source.py
src/violawake_sdk/backends/__init__.py
src/violawake_sdk/backends/__pycache__/__init__.cpython-311.pyc
src/violawake_sdk/backends/__pycache__/base.cpython-311.pyc
src/violawake_sdk/backends/__pycache__/onnx_backend.cpython-311.pyc
src/violawake_sdk/backends/__pycache__/tflite_backend.cpython-311.pyc
src/violawake_sdk/backends/base.py
src/violawake_sdk/backends/onnx_backend.py
src/violawake_sdk/backends/tflite_backend.py
src/violawake_sdk/cli/__init__.py
src/violawake_sdk/cli/__pycache__/__init__.cpython-311.pyc
src/violawake_sdk/cli/__pycache__/download.cpython-311.pyc
src/violawake_sdk/cli/__pycache__/evaluate.cpython-311.pyc
src/violawake_sdk/cli/__pycache__/train.cpython-311.pyc
src/violawake_sdk/cli/download.py
src/violawake_sdk/cli/evaluate.py
src/violawake_sdk/cli/train.py

codex
I’ve confirmed the working tree has only untracked Codex log/prompt files so far. Next I’m scanning the requested bug classes with line-numbered grep results, then I’ll read the surrounding code for anything that looks like a customer-facing failure rather than just an implementation detail.
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInE 'OWWModel\\(|Interpreter\\(|tflite|ort\\.InferenceSession\\(|torch\\.load\\(|transformers\\.AutoModel|whisper\\.load_model\\(' src/violawake_sdk console/backend | head -240\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInE 'urllib\\.request|requests\\.|httpx\\.|urlopen\\(|Communicate\\(|download_models|download_url' src/violawake_sdk console/backend | head -240\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInE 'Path\\(__file__\\)\\.resolve\\(\\)\\.parent|Path\\(\\\"corpus\\\"\\)|Path\\.home\\(\\)|os\\.environ\\.get\\(.*[Dd]ir' src/violawake_sdk console/backend | head -240\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInE '"'^[[:space:]]+(import |from [A-Za-z0-9_'"\\.]+ import )' src/violawake_sdk/tools/train.py src/violawake_sdk/oww_backbone.py console/backend/app console/backend/scripts | head -300\"" in J:\CLAUDE\PROJECTS\Wakeword
2026-05-07T11:30:41.754139Z ERROR codex_core::tools::router: error=Exit code: 1
Wall time: 1.6 seconds
Output:
/usr/bin/bash: -c: line 1: unexpected EOF while looking for matching `''

 exited 1 in 1612ms:
/usr/bin/bash: -c: line 1: unexpected EOF while looking for matching `''

 succeeded in 1916ms:
src/violawake_sdk/tools/train.py:68:    import numpy as np
src/violawake_sdk/tools/train.py:227:    import onnxruntime as ort
src/violawake_sdk/tools/train.py:267:    import asyncio
src/violawake_sdk/tools/train.py:268:    import io
src/violawake_sdk/tools/train.py:269:    import tempfile
src/violawake_sdk/tools/train.py:272:        import edge_tts
src/violawake_sdk/tools/train.py:290:                import concurrent.futures
src/violawake_sdk/tools/train.py:304:            from pydub import AudioSegment
src/violawake_sdk/tools/train.py:322:            import torchaudio
src/violawake_sdk/tools/train.py:343:    import numpy as np
src/violawake_sdk/tools/train.py:344:    from scipy.signal import resample_poly
src/violawake_sdk/tools/train.py:363:    import numpy as np
src/violawake_sdk/tools/train.py:366:        from violawake_sdk.tts import TTS_SAMPLE_RATE, TTSEngine
src/violawake_sdk/tools/train.py:400:    import numpy as np
src/violawake_sdk/tools/train.py:402:    from violawake_sdk.training.augment import (
src/violawake_sdk/tools/train.py:418:            from violawake_sdk.tts import AVAILABLE_VOICES, TTS_SAMPLE_RATE, TTSEngine
src/violawake_sdk/tools/train.py:472:                    from violawake_sdk.audio import load_audio
src/violawake_sdk/tools/train.py:473:                    from violawake_sdk.training.augment import apply_additive_noise
src/violawake_sdk/tools/train.py:522:    from violawake_sdk.tools.confusables import generate_confusables
src/violawake_sdk/tools/train.py:604:    import wave
src/violawake_sdk/tools/train.py:606:    import numpy as np
src/violawake_sdk/tools/train.py:634:    import numpy as np
src/violawake_sdk/tools/train.py:637:        from audiomentations import (
src/violawake_sdk/tools/train.py:686:    import numpy as np
src/violawake_sdk/tools/train.py:688:    from violawake_sdk._constants import CLIP_SAMPLES
src/violawake_sdk/tools/train.py:689:    from violawake_sdk.audio import center_crop
src/violawake_sdk/tools/train.py:721:    import numpy as np
src/violawake_sdk/tools/train.py:724:        from openwakeword.model import Model as OWWModel
src/violawake_sdk/tools/train.py:820:    from violawake_sdk.audio import load_audio
src/violawake_sdk/tools/train.py:865:    import numpy as np
src/violawake_sdk/tools/train.py:867:    from violawake_sdk._constants import CLIP_SAMPLES
src/violawake_sdk/tools/train.py:868:    from violawake_sdk.audio import center_crop, load_audio
src/violawake_sdk/tools/train.py:871:        from openwakeword.model import Model as OWWModel
src/violawake_sdk/tools/train.py:950:    import numpy as np
src/violawake_sdk/tools/train.py:1056:        import numpy as np
src/violawake_sdk/tools/train.py:1057:        import torch
src/violawake_sdk/tools/train.py:1058:        import torch.optim as optim
src/violawake_sdk/tools/train.py:1059:        from torch.utils.data import DataLoader, TensorDataset
src/violawake_sdk/tools/train.py:1065:    from violawake_sdk.training.losses import FocalLoss
src/violawake_sdk/tools/train.py:1066:    from violawake_sdk.training.temporal_model import (
src/violawake_sdk/tools/train.py:1071:    from violawake_sdk.training.weight_averaging import (
src/violawake_sdk/tools/train.py:1087:    from violawake_sdk._constants import SAMPLE_RATE
src/violawake_sdk/tools/train.py:1088:    from violawake_sdk.audio import load_audio
src/violawake_sdk/tools/train.py:1396:    from violawake_sdk._constants import DEFAULT_THRESHOLD, get_feature_config
src/violawake_sdk/tools/train.py:1397:    from violawake_sdk.oww_backbone import get_openwakeword_backbone_hashes
src/violawake_sdk/tools/train.py:1447:            from violawake_sdk.training.evaluate import evaluate_onnx_model
src/violawake_sdk/tools/train.py:1538:    import tempfile
src/violawake_sdk/tools/train.py:1540:    import numpy as np
src/violawake_sdk/tools/train.py:1541:    import torch
src/violawake_sdk/tools/train.py:1543:    from violawake_sdk.tools.confusables import generate_confusables
src/violawake_sdk/tools/train.py:1728:        import numpy as np
src/violawake_sdk/tools/train.py:1729:        import torch
src/violawake_sdk/tools/train.py:1730:        import torch.nn as nn
src/violawake_sdk/tools/train.py:1731:        import torch.optim as optim
src/violawake_sdk/tools/train.py:1732:        from torch.utils.data import DataLoader, TensorDataset
src/violawake_sdk/tools/train.py:1738:    from violawake_sdk._constants import CLIP_SAMPLES, get_feature_config
src/violawake_sdk/tools/train.py:1739:    from violawake_sdk.audio import center_crop, load_audio
src/violawake_sdk/tools/train.py:1740:    from violawake_sdk.oww_backbone import get_openwakeword_backbone_hashes
src/violawake_sdk/tools/train.py:1741:    from violawake_sdk.training.augment import AugmentationPipeline
src/violawake_sdk/tools/train.py:1742:    from violawake_sdk.training.losses import FocalLoss
src/violawake_sdk/tools/train.py:1743:    from violawake_sdk.training.weight_averaging import (
src/violawake_sdk/tools/train.py:1750:        from openwakeword.model import Model as OWWModel
src/violawake_sdk/tools/train.py:2066:    import numpy as np
src/violawake_sdk/tools/train.py:2067:    import onnx
src/violawake_sdk/tools/train.py:2068:    from onnx import numpy_helper
src/violawake_sdk/tools/train.py:2451:                            import random
src/violawake_sdk/tools/train.py:2586:            from violawake_sdk.tools.evaluate import evaluate_onnx_model
src/violawake_sdk/oww_backbone.py:19:    from violawake_sdk.backends.base import InferenceBackend
src/violawake_sdk/oww_backbone.py:126:            from openwakeword.utils import download_models  # type: ignore[import-not-found]
src/violawake_sdk/oww_backbone.py:187:            from violawake_sdk.models import MODEL_REGISTRY
console/backend/app/config.py:128:            import logging
console/backend/app/config.py:142:            import logging
console/backend/app/job_queue.py:117:    from app.models import Subscription
console/backend/app/job_queue.py:791:                from app.email_service import get_email_service
console/backend/app/job_queue.py:1170:        from app.models import Subscription
console/backend/app/job_queue.py:1195:            from app.retention import mark_recordings_for_deletion
console/backend/app/main.py:42:    from app.retention import (
console/backend/app/main.py:76:        from app.email_service import get_email_service
console/backend/app/main.py:170:    from app.retention import (
console/backend/app/middleware.py:111:        import sentry_sdk
console/backend/app/middleware.py:112:        from sentry_sdk.integrations.fastapi import FastApiIntegration
console/backend/app/monitoring.py:21:    import tomllib
console/backend/app/monitoring.py:23:    import tomli as tomllib
console/backend/app/retention.py:32:        from app.job_queue import JobStatus, get_job_queue
console/backend/app/retention.py:38:    import aiosqlite
console/backend/app/retention.py:261:                from app.storage import build_companion_config_identifier
console/backend/app/routes/auth.py:407:            import stripe
console/backend/app/routes/billing.py:69:    import stripe
console/backend/app/routes/billing.py:273:            from app.email_service import get_email_service
console/backend/app/routes/billing.py:277:                import asyncio
console/backend/app/routes/recordings.py:128:    import io
console/backend/app/routes/recordings.py:130:    import numpy as np
console/backend/app/routes/recordings.py:131:    from scipy.io import wavfile
console/backend/app/routes/recordings.py:156:        from scipy.signal import resample
console/backend/app/services/training_service.py:100:        from violawake_sdk.tools.train import (
console/backend/app/storage.py:151:            import boto3
console/backend/app/storage.py:152:            from botocore.config import Config as BotoConfig
console/backend/app/storage.py:198:        from botocore.exceptions import ClientError
console/backend/scripts/train_full_pipeline.py:56:    import edge_tts
console/backend/scripts/train_full_pipeline.py:67:    from pydub import AudioSegment
console/backend/scripts/train_full_pipeline.py:76:    from violawake_sdk.training.augment import apply_additive_noise, rir_augment
console/backend/scripts/train_full_pipeline.py:77:    from violawake_sdk.audio import load_audio
console/backend/scripts/train_full_pipeline.py:169:    from violawake_sdk.tools.train import _generate_confusable_negatives
console/backend/scripts/train_full_pipeline.py:189:    from violawake_sdk.tools.train import _generate_speech_negatives
console/backend/scripts/train_full_pipeline.py:245:    from violawake_sdk.tools.train import _train_temporal_cnn

2026-05-07T11:30:54.168891Z ERROR codex_core::tools::router: error=Exit code: 124
Wall time: 14 seconds
Output:
command timed out after 14029 milliseconds

 exited 124 in 14029ms:
2026-05-07T11:30:54.184559Z ERROR codex_core::tools::router: error=Exit code: 124
Wall time: 14 seconds
Output:
command timed out after 14044 milliseconds

 exited 124 in 14044ms:
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' --include='*.sh' --exclude-dir='__pycache__' --exclude-dir='.venv' -E 'OWWModel\\(|Interpreter\\(|tflite|ort\\.InferenceSession\\(|torch\\.load\\(|transformers\\.AutoModel|whisper\\.load_model\\(' src/violawake_sdk console/backend | head -240\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' --include='*.sh' --exclude-dir='__pycache__' --exclude-dir='.venv' -E 'urllib\\.request|requests\\.|httpx\\.|urlopen\\(|Communicate\\(|download_models|download_url' src/violawake_sdk console/backend | head -240\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' --include='*.sh' --exclude-dir='__pycache__' --exclude-dir='.venv' -E 'Path\\(__file__\\)\\.resolve\\(\\)\\.parent|Path\\(\\x22corpus\\x22\\)|Path\\.home\\(\\)|os\\.environ\\.get\\(.*[Dd]ir' src/violawake_sdk console/backend | head -240\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' --include='*.sh' --exclude-dir='__pycache__' --exclude-dir='.venv' -E 'OPENAI|RATE_LIMIT|quota|Quota|subscription|training' console/backend/app/services/training_service.py console/backend/app/job_queue.py console/backend/app/routes/training.py console/backend/app/models.py | head -260\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1544ms:
console/backend/app/services/training_service.py:20:logger = logging.getLogger("violawake.training")
console/backend/app/services/training_service.py:24:    """Raised when a running training job is cancelled."""
console/backend/app/services/training_service.py:29:    """Artifacts produced by a completed training run."""
console/backend/app/services/training_service.py:38:def run_training_job_sync(
console/backend/app/services/training_service.py:50:    """Run the ViolaWake SDK training pipeline synchronously."""
console/backend/app/services/training_service.py:68:            "message": "Preparing training data...",
console/backend/app/services/training_service.py:76:                logger.warning("Recording %s was missing for training job %s", recording_identifier, job_id)
console/backend/app/services/training_service.py:358:            source="training",
console/backend/app/job_queue.py:1:"""Persistent async training job queue with circuit breaker protection."""
console/backend/app/job_queue.py:26:from app.services.training_service import TrainingCancelledError, run_training_job_sync
console/backend/app/job_queue.py:68:# Priority values assigned by subscription tier.
console/backend/app/job_queue.py:77:    """Persisted training job metadata."""
console/backend/app/job_queue.py:116:    """Return the queue priority for a user based on their subscription tier."""
console/backend/app/job_queue.py:142:    """Persistent async training job queue."""
console/backend/app/job_queue.py:210:        """Persist a new training job and enqueue it when capacity allows.
console/backend/app/job_queue.py:213:        user's subscription tier (free=0, developer=5, business=10).
console/backend/app/job_queue.py:283:            "Queued training job %s for user %s (priority=%s)",
console/backend/app/job_queue.py:301:                "message": "Queued for training.",
console/backend/app/job_queue.py:618:            logger.info("Resumed interrupted training jobs for user %s", user_id)
console/backend/app/job_queue.py:702:                raise RuntimeError(f"No valid recordings found for training job {job_id}")
console/backend/app/job_queue.py:720:                run_training_job_sync,
console/backend/app/job_queue.py:726:                timeout_seconds=settings.training_timeout,
console/backend/app/job_queue.py:767:            # Schedule post-training recording deletion (privacy: recordings
console/backend/app/job_queue.py:768:            # are deleted after training per the privacy FAQ).
console/backend/app/job_queue.py:789:            # Best-effort training-complete email notification.
console/backend/app/job_queue.py:799:                        await email_svc.send_training_complete(
console/backend/app/job_queue.py:1169:        # Check user's subscription tier
console/backend/app/job_queue.py:1186:        """Soft-delete recordings after training completes.
console/backend/app/job_queue.py:1191:        if settings.post_training_retention_hours <= 0:
console/backend/app/job_queue.py:1201:                "Failed to mark recordings for post-training deletion: %s",
console/backend/app/routes/training.py:1:"""Compatibility routes for legacy training endpoints."""
console/backend/app/routes/training.py:17:from app.routes.billing import check_training_quota
console/backend/app/routes/training.py:18:from app.routes.jobs import get_owned_job_or_404, submit_training_job
console/backend/app/routes/training.py:21:router = APIRouter(prefix="/api/training", tags=["training"])
console/backend/app/routes/training.py:26:async def _quota_user_with_rate_key(
console/backend/app/routes/training.py:28:    current_user: Annotated[User, Depends(check_training_quota)],
console/backend/app/routes/training.py:30:    """Resolve the user via training-quota check and stash ID for rate limiting."""
console/backend/app/routes/training.py:36:    """Map queue status names to the legacy training API values."""
console/backend/app/routes/training.py:44:async def start_training(
console/backend/app/routes/training.py:47:    current_user: Annotated[User, Depends(_quota_user_with_rate_key)],
console/backend/app/routes/training.py:50:    """Submit a training job through the persistent queue."""
console/backend/app/routes/training.py:51:    job_response = await submit_training_job(body, current_user, db)
console/backend/app/routes/training.py:56:async def get_training_status(
console/backend/app/routes/training.py:60:    """Get the current status of a queued training job."""
console/backend/app/routes/training.py:84:            expected_action="training_stream",
console/backend/app/routes/training.py:108:            detail="Verify your email address to access recording, training, and billing features.",
console/backend/app/routes/training.py:114:async def stream_training(
console/backend/app/routes/training.py:120:    """Stream queued training progress via Server-Sent Events."""
console/backend/app/routes/training.py:144:                    "event": "training",
console/backend/app/routes/training.py:180:                        "event": "training",
console/backend/app/models.py:30:    training_jobs: Mapped[list[TrainingJob]] = relationship(back_populates="user", cascade="all, delete-orphan")
console/backend/app/models.py:83:    __tablename__ = "training_jobs"
console/backend/app/models.py:97:    user: Mapped[User] = relationship(back_populates="training_jobs")
console/backend/app/models.py:118:    __tablename__ = "subscriptions"
console/backend/app/models.py:125:    stripe_subscription_id: Mapped[str | None] = mapped_column(String(255), nullable=True, unique=True)
console/backend/app/models.py:132:    user: Mapped[User] = relationship(backref="subscription")

 succeeded in 3411ms:
src/violawake_sdk/models.py:27:DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
src/violawake_sdk/tools/train.py:2425:        Path(__file__).resolve().parent.parent.parent.parent / "corpus",  # repo root
src/violawake_sdk/tools/train.py:2426:        Path.home() / ".violawake" / "corpus",
src/violawake_sdk/_constants.py:131:DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
console/backend/app/config.py:30:    base_dir: Path = Path(__file__).resolve().parent.parent
console/backend/app/config.py:31:    data_dir: Path = Path(__file__).resolve().parent.parent / "data"
console/backend/app/config.py:32:    db_path: Path = Path(__file__).resolve().parent.parent / "data" / "violawake.db"
console/backend/app/config.py:33:    upload_dir: Path = Path(__file__).resolve().parent.parent / "data" / "recordings"
console/backend/app/config.py:34:    models_dir: Path = Path(__file__).resolve().parent.parent / "data" / "models"
console/backend/app/config.py:35:    tmp_dir: Path = Path(__file__).resolve().parent.parent / "data" / "tmp"
console/backend/app/services/training_service.py:232:            Path(__file__).resolve().parent.parent.parent.parent / "corpus",  # repo root
console/backend/app/services/training_service.py:233:            Path.home() / ".violawake" / "corpus",
console/backend/scripts/train_full_pipeline.py:199:        Path(__file__).resolve().parent.parent.parent / "corpus",  # repo root
console/backend/scripts/train_full_pipeline.py:200:        Path.home() / ".violawake" / "corpus",

 succeeded in 3461ms:
src/violawake_sdk/backends/base.py:98:        """Short human-readable name (e.g. ``'onnx'``, ``'tflite'``)."""
src/violawake_sdk/backends/base.py:110:            model_path: Path to the model file (``.onnx`` or ``.tflite``).
src/violawake_sdk/backends/onnx_backend.py:88:            session = ort.InferenceSession(str(model_path), providers=providers)
src/violawake_sdk/backends/tflite_backend.py:4:``tflite-runtime`` (or full ``tensorflow``) is available but
src/violawake_sdk/backends/tflite_backend.py:8:Model files must be in ``.tflite`` format.  Use the
src/violawake_sdk/backends/tflite_backend.py:9:``convert_onnx_to_tflite()`` utility at the bottom of this module
src/violawake_sdk/backends/tflite_backend.py:16:    backend = get_backend("tflite")
src/violawake_sdk/backends/tflite_backend.py:17:    session = backend.load("model.tflite")
src/violawake_sdk/backends/tflite_backend.py:101:def _get_tflite_interpreter_class() -> type:
src/violawake_sdk/backends/tflite_backend.py:104:    Tries ``tflite_runtime`` first (lightweight, ~5 MB), then falls
src/violawake_sdk/backends/tflite_backend.py:111:        from tflite_runtime.interpreter import Interpreter
src/violawake_sdk/backends/tflite_backend.py:126:        "  pip install violawake[tflite]\n"
src/violawake_sdk/backends/tflite_backend.py:127:        "  pip install tflite-runtime>=2.14"
src/violawake_sdk/backends/tflite_backend.py:298:    """Backend powered by ``tflite-runtime`` (or ``tensorflow.lite``)."""
src/violawake_sdk/backends/tflite_backend.py:302:        self._interpreter_cls = _get_tflite_interpreter_class()
src/violawake_sdk/backends/tflite_backend.py:306:        return "tflite"
src/violawake_sdk/backends/tflite_backend.py:328:            _get_tflite_interpreter_class()
src/violawake_sdk/backends/tflite_backend.py:339:def convert_onnx_to_tflite(
src/violawake_sdk/backends/tflite_backend.py:341:    tflite_path: str | Path | None = None,
src/violawake_sdk/backends/tflite_backend.py:350:        .onnx  -->  TF SavedModel  -->  .tflite
src/violawake_sdk/backends/tflite_backend.py:354:        tflite_path: Destination path for the ``.tflite`` file.
src/violawake_sdk/backends/tflite_backend.py:360:        Path to the written ``.tflite`` file.
src/violawake_sdk/backends/tflite_backend.py:370:    tflite_path = onnx_path.with_suffix(".tflite") if tflite_path is None else Path(tflite_path)
src/violawake_sdk/backends/tflite_backend.py:376:    _saved_model_to_tflite(saved_model_dir, tflite_path, quantize=quantize)
src/violawake_sdk/backends/tflite_backend.py:389:        tflite_path.name,
src/violawake_sdk/backends/tflite_backend.py:390:        tflite_path.stat().st_size / 1024,
src/violawake_sdk/backends/tflite_backend.py:393:    return tflite_path
src/violawake_sdk/backends/tflite_backend.py:442:def _saved_model_to_tflite(
src/violawake_sdk/backends/tflite_backend.py:444:    tflite_path: Path,
src/violawake_sdk/backends/tflite_backend.py:447:    """Convert a TF SavedModel to a ``.tflite`` flatbuffer."""
src/violawake_sdk/backends/tflite_backend.py:461:        tflite_model = converter.convert()
src/violawake_sdk/backends/tflite_backend.py:465:    tflite_path.parent.mkdir(parents=True, exist_ok=True)
src/violawake_sdk/backends/tflite_backend.py:466:    tflite_path.write_bytes(tflite_model)
src/violawake_sdk/backends/tflite_backend.py:467:    logger.debug("SavedModel -> TFLite: %s", tflite_path)
src/violawake_sdk/backends/__init__.py:11:    backend = get_backend("auto")           # onnx > tflite
src/violawake_sdk/backends/__init__.py:39:            - ``"tflite"`` -- TFLite Runtime (requires ``tflite-runtime``
src/violawake_sdk/backends/__init__.py:54:    if name == "tflite":
src/violawake_sdk/backends/__init__.py:55:        return _make_tflite()
src/violawake_sdk/backends/__init__.py:58:    raise ValueError(f"Unknown backend {name!r}.  Choose from: 'onnx', 'tflite', 'auto'.")
src/violawake_sdk/backends/__init__.py:67:def _make_tflite() -> InferenceBackend:
src/violawake_sdk/backends/__init__.py:68:    from violawake_sdk.backends.tflite_backend import TFLiteBackend
src/violawake_sdk/backends/__init__.py:83:        backend = _make_tflite()
src/violawake_sdk/backends/__init__.py:91:        "violawake  OR  tflite-runtime: pip install violawake[tflite]"
src/violawake_sdk/models.py:88:    # temporal_cnn_tflite removed — TFLite conversion not yet validated.
src/violawake_sdk/models.py:89:    # Re-add once .tflite is generated and uploaded to GitHub Releases.
src/violawake_sdk/models.py:313:        format: Desired model format (``"onnx"`` or ``"tflite"``).
src/violawake_sdk/models.py:314:            When ``"tflite"`` is requested, the function first checks for
src/violawake_sdk/models.py:315:            a ``<model_name>_tflite`` entry in the registry, then falls back
src/violawake_sdk/models.py:316:            to looking for a locally converted ``.tflite`` file next to the
src/violawake_sdk/models.py:328:    # Handle format-based lookup: resolve "temporal_cnn" + format="tflite"
src/violawake_sdk/models.py:329:    # to the "temporal_cnn_tflite" registry entry if it exists.
src/violawake_sdk/models.py:331:    if format == "tflite" and not model_name.endswith("_tflite"):
src/violawake_sdk/models.py:332:        tflite_name = f"{model_name}_tflite"
src/violawake_sdk/models.py:333:        if tflite_name in MODEL_REGISTRY:
src/violawake_sdk/models.py:334:            effective_name = tflite_name
src/violawake_sdk/models.py:336:            # No registry entry; check for a locally converted .tflite file
src/violawake_sdk/models.py:339:                tflite_path = get_model_dir() / f"{spec.name}.tflite"
src/violawake_sdk/models.py:340:                if tflite_path.exists():
src/violawake_sdk/models.py:341:                    return tflite_path
src/violawake_sdk/models.py:343:                    f"TFLite version of '{model_name}' not found at {tflite_path}. "
src/violawake_sdk/models.py:344:                    f"Convert with: violawake-download --model {model_name} --format tflite"
src/violawake_sdk/oww_backbone.py:113:    extension = ".tflite" if backend_name == "tflite" else ".onnx"
src/violawake_sdk/tools/contamination_check.py:114:    oww = OWWModel()
src/violawake_sdk/tools/download_model.py:53:        choices=["onnx", "tflite"],
src/violawake_sdk/tools/download_model.py:56:        "When 'tflite' is specified, downloads the ONNX model first then "
src/violawake_sdk/tools/download_model.py:120:            if args.format == "tflite":
src/violawake_sdk/tools/download_model.py:121:                _convert_to_tflite(path, quantize=args.quantize)
src/violawake_sdk/tools/download_model.py:134:def _convert_to_tflite(onnx_path, *, quantize: bool = False) -> None:
src/violawake_sdk/tools/download_model.py:139:    tflite_path = onnx_path.with_suffix(".tflite")
src/violawake_sdk/tools/download_model.py:141:    if tflite_path.exists():
src/violawake_sdk/tools/download_model.py:142:        size_mb = tflite_path.stat().st_size / 1_000_000
src/violawake_sdk/tools/download_model.py:143:        print(f"  TFLite already exists: {tflite_path} ({size_mb:.1f} MB)")
src/violawake_sdk/tools/download_model.py:148:        from violawake_sdk.backends.tflite_backend import convert_onnx_to_tflite
src/violawake_sdk/tools/download_model.py:150:        result = convert_onnx_to_tflite(onnx_path, tflite_path, quantize=quantize)
src/violawake_sdk/tools/train.py:733:    # present, but the bundled tflite_runtime in our backend image rejects the
src/violawake_sdk/tools/train.py:734:    # current openwakeword .tflite schema with "Could not open ...". ONNX path
src/violawake_sdk/tools/train.py:736:    oww = OWWModel(inference_framework="onnx")
src/violawake_sdk/tools/train.py:876:    oww = OWWModel(inference_framework="onnx")
src/violawake_sdk/tools/train.py:1768:    oww = OWWModel(inference_framework="onnx")
src/violawake_sdk/training/evaluate.py:264:    oww = OWWModel()
src/violawake_sdk/training/evaluate.py:327:    oww = OWWModel()
src/violawake_sdk/training/evaluate.py:377:    session = ort.InferenceSession(str(model_path), providers=["CPUExecutionProvider"])
src/violawake_sdk/wake_detector.py:305:        backend: Inference backend selector (``"onnx"``, ``"tflite"``, ``"auto"``).
src/violawake_sdk/wake_detector.py:319:    _VALID_BACKENDS = ("onnx", "tflite", "auto")
src/violawake_sdk/wake_detector.py:577:        Resolves *model* to a file path (direct path, .onnx/.tflite suffix,
src/violawake_sdk/wake_detector.py:581:        the method looks for a sibling ``.tflite`` file with the same stem.
src/violawake_sdk/wake_detector.py:585:        # When using the TFLite backend, prefer a .tflite sibling if the
src/violawake_sdk/wake_detector.py:587:        if self._backend.name == "tflite" and model_path.suffix == ".onnx":
src/violawake_sdk/wake_detector.py:588:            tflite_sibling = model_path.with_suffix(".tflite")
src/violawake_sdk/wake_detector.py:589:            if tflite_sibling.exists():
src/violawake_sdk/wake_detector.py:590:                model_path = tflite_sibling
src/violawake_sdk/wake_detector.py:591:                logger.debug("TFLite backend: using .tflite sibling %s", model_path)
src/violawake_sdk/wake_detector.py:596:                    '"from violawake_sdk.backends.tflite_backend import '
src/violawake_sdk/wake_detector.py:597:                    "convert_onnx_to_tflite; convert_onnx_to_tflite('%s')\"",
src/violawake_sdk/wake_detector.py:615:        2. If *model* ends with ``.onnx`` or ``.tflite``, treat as a path
src/violawake_sdk/wake_detector.py:622:        if model.endswith((".onnx", ".tflite")):

 succeeded in 3460ms:
src/violawake_sdk/models.py:58:        # Not directly downloadable — fetched at runtime by openwakeword.utils.download_models()
src/violawake_sdk/models.py:199:    import urllib.request
src/violawake_sdk/models.py:234:        response = urllib.request.urlopen(spec.url, timeout=60)  # noqa: S310 — URLs validated as HTTPS above
src/violawake_sdk/models.py:483:        response = requests.get(spec.url, stream=True, timeout=30)
src/violawake_sdk/oww_backbone.py:121:        # the user is expected to call openwakeword.utils.download_models() once.
src/violawake_sdk/oww_backbone.py:126:            from openwakeword.utils import download_models  # type: ignore[import-not-found]
src/violawake_sdk/oww_backbone.py:129:                "OpenWakeWord backbone files missing (%s); fetching via openwakeword.utils.download_models()",
src/violawake_sdk/oww_backbone.py:132:            download_models()
src/violawake_sdk/oww_backbone.py:142:                f"{missing_str}. Tried to auto-download via openwakeword.utils.download_models() "
src/violawake_sdk/oww_backbone.py:144:                "`python -c \"from openwakeword.utils import download_models; download_models()\"`."
src/violawake_sdk/security/cert_pinning.py:630:    This wraps requests.get() with pre- and post-connection pin
src/violawake_sdk/security/cert_pinning.py:640:        **kwargs: Additional kwargs passed to requests.get().
src/violawake_sdk/security/cert_pinning.py:643:        The requests.Response object.
src/violawake_sdk/security/cert_pinning.py:664:        return requests.get(url, stream=True, timeout=timeout, **kwargs)
src/violawake_sdk/security/cert_pinning.py:673:        return requests.get(url, stream=True, timeout=timeout, **kwargs)
src/violawake_sdk/security/cert_pinning.py:680:        return requests.get(url, stream=True, timeout=timeout, **kwargs)
src/violawake_sdk/security/cert_pinning.py:706:    response = requests.get(url, stream=True, timeout=timeout, **kwargs)
src/violawake_sdk/tools/expand_corpus.py:89:    response = requests.get(url, stream=True, timeout=timeout)
src/violawake_sdk/tools/generate_samples.py:171:    communicate = edge_tts.Communicate(text, voice_name)
src/violawake_sdk/tools/train.py:278:        communicate = edge_tts.Communicate(text, voice)
console/backend/app/email_service.py:87:    async def send_training_complete(self, to: str, model_name: str, download_url: str) -> bool:
console/backend/app/email_service.py:93:            button_url=self._absolute_url(download_url),
console/backend/app/job_queue.py:798:                        download_url = f"/models/{model_id}/download"
console/backend/app/job_queue.py:802:                            download_url=download_url,
console/backend/entrypoint.sh:17:# wheel ships without ONNX/TFLite resources; download_models() fetches
console/backend/entrypoint.sh:22:python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
console/backend/scripts/train_full_pipeline.py:57:    communicate = edge_tts.Communicate(text, voice)

codex
One early signal is exactly the kind of repeat bug you warned about: `OWWModel()` is pinned to `inference_framework="onnx"` in `tools/train.py`, but default constructors remain in `training/evaluate.py` (lines 264 and 327) and `tools/contamination_check.py` (line 114). I'm checking whether those are reachable from fresh-deploy training paths before deciding whether they qualify for minimum-touch fixes.
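As a rough illustration of how this class of miss could be caught mechanically rather than by grep, here is a minimal sketch that walks a module's AST and reports constructor calls that omit a given keyword. The helper name `find_unpinned_calls` and the sample source are hypothetical and not part of the repo; only the `OWWModel` name and the `inference_framework` keyword come from the grep hits above.

```python
import ast


def find_unpinned_calls(
    source: str,
    name: str = "OWWModel",
    keyword: str = "inference_framework",
) -> list[int]:
    """Return line numbers of `name(...)` calls that do not pass `keyword`.

    Hypothetical helper for illustration. Calls that only forward **kwargs
    are reported too, since the keyword is not visibly pinned at the call site.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `OWWModel(...)` and `module.OWWModel(...)`.
        called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if called != name:
            continue
        if not any(kw.arg == keyword for kw in node.keywords):
            hits.append(node.lineno)
    return sorted(hits)


# Made-up two-line module: one default constructor, one pinned constructor.
sample = (
    "oww = OWWModel()\n"
    "pinned = OWWModel(inference_framework='onnx')\n"
)
print(find_unpinned_calls(sample))
```

Run over each `.py` file's text, a check like this could back a unit test that fails whenever a new default constructor appears. It only inspects call syntax, so it cannot tell whether a given site is actually reachable in the container.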
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/training/evaluate.py | sed -n '240,390p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/contamination_check.py | sed -n '90,135p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/oww_backbone.py | sed -n '1,215p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '700,895p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1189ms:
     1	"""OpenWakeWord backbone resolution and streaming embedding extraction."""
     2	
     3	from __future__ import annotations
     4	
     5	import hashlib
     6	import importlib.util
     7	import logging
     8	from dataclasses import dataclass
     9	from pathlib import Path
    10	from typing import TYPE_CHECKING
    11	
    12	import numpy as np
    13	
    14	from violawake_sdk._exceptions import ModelNotFoundError
    15	
    16	logger = logging.getLogger(__name__)
    17	
    18	if TYPE_CHECKING:
    19	    from violawake_sdk.backends.base import InferenceBackend
    20	
    21	SAMPLE_RATE = 16_000
    22	MEL_FRAMES_PER_EMBEDDING = 76
    23	MEL_STRIDE = 8
    24	EMBEDDING_DIM = 96
    25	OWW_CHUNK_SAMPLES = 1_280
    26	_OWW_MELSPEC_CONTEXT_SAMPLES = 160 * 3
    27	# 10 seconds of raw audio at 16 kHz — rolling window for mel context.
    28	_MAX_RAW_SAMPLES = SAMPLE_RATE * 10
    29	# 10 seconds of mel frames at ~97 frames/sec — bounds spectrogram buffer growth.
    30	_MAX_MELSPEC_FRAMES = 10 * 97
    31	
    32	
    33	class _RingBuffer:
    34	    """Fixed-capacity ring buffer backed by a pre-allocated numpy int16 array.
    35	
    36	    Avoids the Python-object overhead of ``deque.extend(array.tolist())`` by
    37	    keeping samples in contiguous numpy memory and using a write pointer that
    38	    wraps around.
    39	    """
    40	
    41	    __slots__ = ("_buf", "_capacity", "_write_pos", "_count")
    42	
    43	    def __init__(self, capacity: int) -> None:
    44	        self._buf = np.zeros(capacity, dtype=np.int16)
    45	        self._capacity = capacity
    46	        self._write_pos = 0
    47	        self._count = 0
    48	
    49	    @property
    50	    def count(self) -> int:
    51	        """Number of samples currently stored."""
    52	        return self._count
    53	
    54	    def extend(self, data: np.ndarray) -> None:
    55	        """Append *data* (int16 ndarray) to the ring buffer."""
    56	        n = data.shape[0]
    57	        if n == 0:
    58	            return
    59	
    60	        if n >= self._capacity:
    61	            # Data larger than buffer — keep only the tail.
    62	            self._buf[:] = data[-self._capacity :]
    63	            self._write_pos = 0
    64	            self._count = self._capacity
    65	            return
    66	
    67	        end = self._write_pos + n
    68	        if end <= self._capacity:
    69	            self._buf[self._write_pos : end] = data
    70	        else:
    71	            first = self._capacity - self._write_pos
    72	            self._buf[self._write_pos :] = data[:first]
    73	            self._buf[: n - first] = data[first:]
    74	
    75	        self._write_pos = end % self._capacity
    76	        self._count = min(self._count + n, self._capacity)
    77	
    78	    def tail(self, n: int) -> np.ndarray:
    79	        """Return the last *n* samples in chronological order.
    80	
    81	        If fewer than *n* samples have been written, returns all available
    82	        samples.
    83	        """
    84	        n = min(n, self._count)
    85	        if n == 0:
    86	            return np.empty(0, dtype=np.int16)
    87	
    88	        start = (self._write_pos - n) % self._capacity
    89	        if start + n <= self._capacity:
    90	            return self._buf[start : start + n].copy()
    91	
    92	        # Wraps around — two slices.
    93	        return np.concatenate((self._buf[start:], self._buf[: self._write_pos]))
    94	
    95	
    96	@dataclass(frozen=True)
    97	class OpenWakeWordBackbonePaths:
    98	    """Resolved OpenWakeWord backbone asset paths."""
    99	
   100	    melspectrogram: Path
   101	    embedding_model: Path
   102	
   103	
   104	def resolve_openwakeword_backbone_paths(backend_name: str = "onnx") -> OpenWakeWordBackbonePaths:
   105	    """Resolve the installed OpenWakeWord backbone assets for the chosen backend."""
   106	    spec = importlib.util.find_spec("openwakeword")
   107	    if spec is None or not spec.submodule_search_locations:
   108	        raise ModelNotFoundError(
   109	            "OpenWakeWord is required for wake word detection. "
   110	            "Install with: pip install openwakeword"
   111	        )
   112	
   113	    extension = ".tflite" if backend_name == "tflite" else ".onnx"
   114	    resources_dir = Path(spec.submodule_search_locations[0]) / "resources" / "models"
   115	    melspectrogram = resources_dir / f"melspectrogram{extension}"
   116	    embedding_model = resources_dir / f"embedding_model{extension}"
   117	
   118	    missing = [path.name for path in (melspectrogram, embedding_model) if not path.exists()]
   119	    if missing:
   120	        # Recent openwakeword wheels do not bundle the ONNX backbone files; instead,
   121	        # the user is expected to call openwakeword.utils.download_models() once.
   122	        # We do that automatically here so `pip install violawake[oww]` works
   123	        # without surprises. Only triggered on the missing path; subsequent calls
   124	        # short-circuit because the files exist.
   125	        try:
   126	            from openwakeword.utils import download_models  # type: ignore[import-not-found]
   127	
   128	            logger.info(
   129	                "OpenWakeWord backbone files missing (%s); fetching via openwakeword.utils.download_models()",
   130	                ", ".join(missing),
   131	            )
   132	            download_models()
   133	        except Exception as exc:  # pragma: no cover - best-effort
   134	            logger.warning("Auto-download of OpenWakeWord backbone failed: %s", exc)
   135	
   136	        # Re-check after download attempt
   137	        missing = [path.name for path in (melspectrogram, embedding_model) if not path.exists()]
   138	        if missing:
   139	            missing_str = ", ".join(missing)
   140	            raise ModelNotFoundError(
   141	                "OpenWakeWord backbone files are missing from the installed package: "
   142	                f"{missing_str}. Tried to auto-download via openwakeword.utils.download_models() "
   143	                "but files are still missing. Try: "
   144	                "`python -c \"from openwakeword.utils import download_models; download_models()\"`."
   145	            )
   146	
   147	    return OpenWakeWordBackbonePaths(
   148	        melspectrogram=melspectrogram,
   149	        embedding_model=embedding_model,
   150	    )
   151	
   152	
   153	def _sha256_file(path: Path) -> str:
   154	    """Compute the SHA-256 digest for a file."""
   155	    digest = hashlib.sha256()
   156	    with path.open("rb") as f:
   157	        for chunk in iter(lambda: f.read(1024 * 1024), b""):
   158	            digest.update(chunk)
   159	    return digest.hexdigest()
   160	
   161	
   162	def get_openwakeword_backbone_hashes(backend_name: str = "onnx") -> dict[str, str]:
   163	    """Return SHA-256 hashes for the installed OpenWakeWord backbone files."""
   164	    paths = resolve_openwakeword_backbone_paths(backend_name)
   165	    return {
   166	        "oww_mel_sha256": _sha256_file(paths.melspectrogram),
   167	        "oww_emb_sha256": _sha256_file(paths.embedding_model),
   168	    }
   169	
   170	
   171	class OpenWakeWordBackbone:
   172	    """Streaming wrapper around OpenWakeWord's mel + embedding backbone."""
   173	
   174	    def __init__(self, backend: InferenceBackend) -> None:
   175	        paths = resolve_openwakeword_backbone_paths(backend.name)
   176	        self._verify_backbone_integrity(paths)
   177	        self._melspec_session = backend.load(paths.melspectrogram)
   178	        self._embedding_session = backend.load(paths.embedding_model)
   179	        self._melspec_input_name = self._melspec_session.get_inputs()[0].name
   180	        self._embedding_input_name = self._embedding_session.get_inputs()[0].name
   181	        self.reset()
   182	
   183	    @staticmethod
   184	    def _verify_backbone_integrity(paths: OpenWakeWordBackbonePaths) -> None:
   185	        """Verify OWW backbone files against known SHA-256 hashes."""
   186	        try:
   187	            from violawake_sdk.models import MODEL_REGISTRY
   188	
   189	            spec = MODEL_REGISTRY.get("oww_backbone")
   190	            if spec is None or spec.sha256.startswith("PLACEHOLDER"):
   191	                return  # No known hash to verify against
   192	
   193	            # Compute combined hash (mel + embedding) matching training-time pinning
   194	            actual_hashes = get_openwakeword_backbone_hashes()
   195	            combined = actual_hashes["oww_mel_sha256"] + actual_hashes["oww_emb_sha256"]
   196	            combined_hash = hashlib.sha256(combined.encode()).hexdigest()
   197	
   198	            if combined_hash != spec.sha256:
   199	                logger.warning(
   200	                    "OWW backbone hash mismatch: expected %s, got %s. "
   201	                    "The openwakeword package may have been updated. "
   202	                    "Re-train your model if detection accuracy degrades.",
   203	                    spec.sha256[:16],
   204	                    combined_hash[:16],
   205	                )
   206	        except Exception:
   207	            logger.debug("Skipping backbone integrity check", exc_info=True)
   208	
   209	    @property
   210	    def last_embedding(self) -> np.ndarray | None:
   211	        """Return the most recent 96-d embedding, if any."""
   212	        if self._last_embedding is None:
   213	            return None
   214	        return self._last_embedding.copy()
   215	
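
Note on the listing above: the wrap-around arithmetic in `_RingBuffer.extend()` and `_RingBuffer.tail()` can be checked in isolation. The following is a minimal standalone sketch, not the SDK class itself. `ring_extend` and `ring_tail` are hypothetical free functions that mirror the same write-pointer logic on a plain numpy array, with the buffer state passed in explicitly.

```python
import numpy as np


def ring_extend(buf, write_pos, count, data):
    """Mirror of _RingBuffer.extend; returns the new (write_pos, count)."""
    capacity = buf.shape[0]
    n = data.shape[0]
    if n == 0:
        return write_pos, count
    if n >= capacity:
        # Data at least as large as the buffer: keep only the tail.
        buf[:] = data[-capacity:]
        return 0, capacity
    end = write_pos + n
    if end <= capacity:
        buf[write_pos:end] = data
    else:
        # Split the write across the end of the array.
        first = capacity - write_pos
        buf[write_pos:] = data[:first]
        buf[: n - first] = data[first:]
    return end % capacity, min(count + n, capacity)


def ring_tail(buf, write_pos, count, n):
    """Mirror of _RingBuffer.tail; last n samples in chronological order."""
    capacity = buf.shape[0]
    n = min(n, count)
    if n == 0:
        return np.empty(0, dtype=buf.dtype)
    start = (write_pos - n) % capacity
    if start + n <= capacity:
        return buf[start : start + n].copy()
    # The requested span wraps around: stitch two slices together.
    return np.concatenate((buf[start:], buf[:write_pos]))


# Capacity 4, six samples written in two batches; the second batch wraps.
buf = np.zeros(4, dtype=np.int16)
pos, cnt = ring_extend(buf, 0, 0, np.array([1, 2, 3], dtype=np.int16))
pos, cnt = ring_extend(buf, pos, cnt, np.array([4, 5, 6], dtype=np.int16))
latest = ring_tail(buf, pos, cnt, 3)
```

With a capacity of 4, the second write overwrites the oldest samples, and `ring_tail` still returns the newest samples in the order they were written.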

 succeeded in 1217ms:
   700	
   701	    audio_f32 = center_crop(audio_f32, CLIP_SAMPLES)
   702	    audio_i16 = np.clip(audio_f32, -1.0, 1.0)
   703	    audio_i16 = (audio_i16 * 32767).astype(np.int16)
   704	
   705	    if len(audio_i16) < CLIP_SAMPLES:
   706	        audio_i16 = np.pad(audio_i16, (0, CLIP_SAMPLES - len(audio_i16)))
   707	    else:
   708	        audio_i16 = audio_i16[:CLIP_SAMPLES]
   709	
   710	    return audio_i16
   711	
   712	
   713	def _extract_temporal_windows_from_audio(
   714	    audio_clips: list[np.ndarray],
   715	    source_ids: list[int],
   716	    tag: str,
   717	    verbose: bool = True,
   718	    seq_len: int = 9,
   719	) -> tuple[list[np.ndarray], list[int], list[str]]:
   720	    """Extract temporal OWW embedding windows from in-memory audio arrays."""
   721	    import numpy as np
   722	
   723	    try:
   724	        from openwakeword.model import Model as OWWModel
   725	    except ImportError as e:
   726	        print(f"ERROR: openwakeword required: {e}", file=sys.stderr)
   727	        sys.exit(1)
   728	
   729	    if len(audio_clips) != len(source_ids):
   730	        raise ValueError("audio_clips and source_ids must have the same length")
   731	
   732	    # Pin ONNX backend explicitly. openwakeword defaults to TFLite when both are
   733	    # present, but the bundled tflite_runtime in our backend image rejects the
   734	    # current openwakeword .tflite schema with "Could not open ...". ONNX path
   735	    # is the canonical production target anyway.
   736	    oww = OWWModel(inference_framework="onnx")
   737	    preprocessor = oww.preprocessor
   738	
   739	    all_embeddings: list[np.ndarray] = []
   740	    all_source_idx: list[int] = []
   741	    all_tags: list[str] = []
   742	    failures = 0
   743	
   744	    for clip_idx, audio in enumerate(audio_clips):
   745	        audio_i16 = _prepare_audio_for_oww(
   746	            audio,
   747	            clip_name=f"{tag}_{clip_idx:04d}",
   748	            verbose=verbose and failures == 0,
   749	        )
   750	        if audio_i16 is None:
   751	            failures += 1
   752	            continue
   753	
   754	        try:
   755	            frame_embeddings_3d = preprocessor.embed_clips(audio_i16.reshape(1, -1), ncpu=1)
   756	            frame_embeddings = frame_embeddings_3d[0]
   757	
   758	            if len(frame_embeddings.shape) == 1:
   759	                frame_embeddings = frame_embeddings.reshape(1, -1)
   760	
   761	            n_frames = frame_embeddings.shape[0]
   762	
   763	            if n_frames >= seq_len:
   764	                for i in range(n_frames - seq_len + 1):
   765	                    window = frame_embeddings[i : i + seq_len].astype(np.float32)
   766	                    all_embeddings.append(window)
   767	                    all_source_idx.append(source_ids[clip_idx])
   768	                    all_tags.append(tag)
   769	            elif n_frames > 0:
   770	                padded = np.zeros((seq_len, frame_embeddings.shape[1]), dtype=np.float32)
   771	                padded[:n_frames] = frame_embeddings
   772	                for j in range(n_frames, seq_len):
   773	                    padded[j] = frame_embeddings[-1]
   774	                all_embeddings.append(padded)
   775	                all_source_idx.append(source_ids[clip_idx])
   776	                all_tags.append(tag)
   777	        except Exception:
   778	            failures += 1
   779	
   780	        if verbose and (clip_idx + 1) % 100 == 0:
   781	            print(f"    {clip_idx + 1}/{len(audio_clips)} clips -> {len(all_embeddings)} windows")
   782	
   783	    if verbose:
   784	        print(
   785	            f"  [{tag}] {len(audio_clips)} clips -> {len(all_embeddings)} temporal windows "
   786	            f"({failures} failures)"
   787	        )
   788	
   789	    return all_embeddings, all_source_idx, all_tags
   790	
   791	
   792	# ---------------------------------------------------------------------------
   793	# Temporal embedding extraction (9-frame windows from OWW backbone)
   794	# ---------------------------------------------------------------------------
   795	
   796	
   797	def _extract_temporal_embeddings(
   798	    audio_files: list[Path],
   799	    tag: str,
   800	    verbose: bool = True,
   801	    seq_len: int = 9,
   802	) -> tuple[list[np.ndarray], list[int], list[str]]:
   803	    """Extract 9-frame temporal OWW embedding windows from audio files.
   804	
   805	    Uses OWW's preprocessor.embed_clips (batch mode) — the same embedding
   806	    extraction method used to train the production temporal_cnn model.
   807	    This is critical for pipeline equivalence: streaming push_audio() produces
   808	    subtly different embeddings due to internal state accumulation.
   809	
   810	    For each audio file, center-crops to CLIP_SAMPLES (1.5s), runs embed_clips
   811	    to get (n_frames, 96) embeddings, and builds sliding windows of `seq_len`
   812	    consecutive embeddings. Each window is a (seq_len, 96) tensor.
   813	
   814	    Returns:
   815	        embeddings: List of (seq_len, 96) numpy arrays.
   816	        source_indices: Source file index for each embedding (for group-aware split).
   817	        tags: Tag string for each embedding.
   818	    """
   819	
   820	    from violawake_sdk.audio import load_audio
   821	
   822	    audio_clips: list[np.ndarray] = []
   823	    source_ids: list[int] = []
   824	    failures = 0
   825	
   826	    for file_idx, wav_path in enumerate(audio_files):
   827	        audio = load_audio(wav_path)
   828	        if audio is None:
   829	            failures += 1
   830	            continue
   831	        audio_clips.append(audio)
   832	        source_ids.append(file_idx)
   833	
   834	    embeddings, embedding_source_ids, tags = _extract_temporal_windows_from_audio(
   835	        audio_clips,
   836	        source_ids,
   837	        tag,
   838	        verbose=verbose,
   839	        seq_len=seq_len,
   840	    )
   841	
   842	    if verbose and failures > 0:
   843	        print(f"  [{tag}] skipped {failures} files during audio loading")
   844	
   845	    return embeddings, embedding_source_ids, tags
   846	
   847	
   848	# ---------------------------------------------------------------------------
   849	# MLP single-frame embedding extraction (legacy path)
   850	# ---------------------------------------------------------------------------
   851	
   852	
   853	def _extract_mlp_embeddings(
   854	    audio_files: list[Path],
   855	    tag: str,
   856	    verbose: bool = True,
   857	) -> tuple[list[np.ndarray], list[int], list[str]]:
   858	    """Extract mean-pooled OWW embeddings for legacy MLP architecture.
   859	
   860	    Returns:
   861	        embeddings: List of (96,) numpy arrays.
   862	        source_indices: Source file index for each embedding.
   863	        tags: Tag string for each embedding.
   864	    """
   865	    import numpy as np
   866	
   867	    from violawake_sdk._constants import CLIP_SAMPLES
   868	    from violawake_sdk.audio import center_crop, load_audio
   869	
   870	    try:
   871	        from openwakeword.model import Model as OWWModel
   872	    except ImportError as e:
   873	        print(f"ERROR: openwakeword required: {e}", file=sys.stderr)
   874	        sys.exit(1)
   875	
   876	    oww = OWWModel(inference_framework="onnx")
   877	    preprocessor = oww.preprocessor
   878	
   879	    all_embeddings: list[np.ndarray] = []
   880	    all_source_idx: list[int] = []
   881	    all_tags: list[str] = []
   882	    failures = 0
   883	
   884	    for file_idx, wav_path in enumerate(audio_files):
   885	        audio = load_audio(wav_path)
   886	        if audio is None:
   887	            failures += 1
   888	            continue
   889	
   890	        # Guard against zero-energy files (corrupted or silent recordings).
   891	        # If these slip through upload validation, they corrupt training:
   892	        # the model learns silence = wake word.
   893	        audio_rms = float(np.sqrt(np.mean(audio**2)))
   894	        if audio_rms < 1e-6:
   895	            if verbose and failures == 0:
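
Note on the listing above: the windowing branch in `_extract_temporal_windows_from_audio` (sliding windows when `n_frames >= seq_len`, otherwise padding by repeating the last frame) can be sketched on synthetic data. `build_windows` is a hypothetical helper written only for illustration. It assumes frame embeddings of shape `(n_frames, dim)` and does not call openwakeword.

```python
import numpy as np


def build_windows(frame_embeddings, seq_len=9):
    """Return a list of (seq_len, dim) float32 windows, as in the listing."""
    frames = np.asarray(frame_embeddings, dtype=np.float32)
    if frames.ndim == 1:
        frames = frames.reshape(1, -1)
    n_frames = frames.shape[0]
    if n_frames >= seq_len:
        # One window per valid start offset, stride of one frame.
        return [frames[i : i + seq_len] for i in range(n_frames - seq_len + 1)]
    if n_frames == 0:
        return []
    # Too few frames: copy what exists, then repeat the last frame.
    padded = np.zeros((seq_len, frames.shape[1]), dtype=np.float32)
    padded[:n_frames] = frames
    padded[n_frames:] = frames[-1]
    return [padded]


# Synthetic stand-ins for embed_clips output: 12 frames and 3 frames of 96-d.
long_clip = np.arange(12 * 96).reshape(12, 96)
short_clip = np.arange(3 * 96).reshape(3, 96)
long_windows = build_windows(long_clip)
short_windows = build_windows(short_clip)
```

A 12-frame clip yields `12 - 9 + 1` windows, while a 3-frame clip yields a single padded window whose trailing rows all equal the last real frame.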

 succeeded in 1232ms:
    90	def _check_embedding_overlap(
    91	    train_files: list[Path],
    92	    eval_files: list[Path],
    93	    cosine_threshold: float = 0.99,
    94	) -> dict:
    95	    """Check for near-duplicate embeddings (cosine similarity > threshold).
    96	
    97	    Loads openwakeword to extract embeddings, then computes pairwise cosine
    98	    similarity between all train and eval embeddings.
    99	    """
   100	    import numpy as np
   101	
   102	    try:
   103	        from openwakeword.model import Model as OWWModel
   104	    except ImportError:
   105	        return {
   106	            "method": "embedding",
   107	            "error": "openwakeword not installed. Install with: pip install openwakeword",
   108	            "overlap_count": -1,
   109	        }
   110	
   111	    from violawake_sdk._constants import CLIP_SAMPLES
   112	    from violawake_sdk.audio import center_crop, load_audio
   113	
   114	    oww = OWWModel()
   115	    preprocessor = oww.preprocessor
   116	
   117	    def _embed_file(path: Path) -> np.ndarray | None:
   118	        audio = load_audio(path)
   119	        if audio is None:
   120	            return None
   121	        audio = center_crop(audio, CLIP_SAMPLES)
   122	        audio_int16 = np.clip(audio, -1.0, 1.0)
   123	        audio_int16 = (audio_int16 * 32767).astype(np.int16)
   124	        if len(audio_int16) < CLIP_SAMPLES:
   125	            audio_int16 = np.pad(audio_int16, (0, CLIP_SAMPLES - len(audio_int16)))
   126	        else:
   127	            audio_int16 = audio_int16[:CLIP_SAMPLES]
   128	        try:
   129	            embeddings = preprocessor.embed_clips(audio_int16.reshape(1, -1), ncpu=1)
   130	            return embeddings.mean(axis=1)[0].astype(np.float32)
   131	        except Exception:
   132	            return None
   133	
   134	    print(f"Extracting train embeddings ({len(train_files)} files)...", file=sys.stderr)
   135	    train_embs: list[tuple[Path, np.ndarray]] = []

 succeeded in 1251ms:
   240	    frame_embeddings = np.asarray(preprocessor.embed_clips(audio_i16.reshape(1, -1), ncpu=1))
   241	    if frame_embeddings.ndim == 3:
   242	        frame_embeddings = frame_embeddings[0]
   243	    else:
   244	        frame_embeddings = np.squeeze(frame_embeddings)
   245	
   246	    if frame_embeddings.ndim == 1:
   247	        frame_embeddings = frame_embeddings.reshape(1, -1)
   248	
   249	    return frame_embeddings.astype(np.float32, copy=False)
   250	
   251	
   252	def _build_oww_scorer(session, input_name: str):
   253	    """Build a scorer for mean-pooled OpenWakeWord embedding models."""
   254	    try:
   255	        from openwakeword.model import Model as OWWModel  # type: ignore[import]
   256	    except ImportError as e:
   257	        raise ImportError(
   258	            "openwakeword required for MLP-on-OWW evaluation. pip install openwakeword"
   259	        ) from e
   260	
   261	    from violawake_sdk._constants import CLIP_SAMPLES
   262	    from violawake_sdk.audio import center_crop, load_audio
   263	
   264	    oww = OWWModel()
   265	    preprocessor = oww.preprocessor
   266	    if not hasattr(preprocessor, "onnx_execution_provider"):
   267	        preprocessor.onnx_execution_provider = "CPUExecutionProvider"
   268	
   269	    def _score_file_oww(wav_path: Path) -> float | None:
   270	        try:
   271	            embeddings = _extract_oww_frame_embeddings(
   272	                wav_path,
   273	                preprocessor=preprocessor,
   274	                load_audio=load_audio,
   275	                center_crop=center_crop,
   276	                clip_samples=CLIP_SAMPLES,
   277	            )
   278	            if embeddings is None:
   279	                return None
   280	
   281	            embedding = embeddings.mean(axis=0).astype(np.float32)
   282	            score = session.run(None, {input_name: embedding.reshape(1, -1)})[0]
   283	            return float(np.asarray(score).flatten()[0])
   284	        except Exception:
   285	            logger.warning("Failed to score file (OWW path): %s", wav_path, exc_info=True)
   286	            return None
   287	
   288	    return _score_file_oww
   289	
   290	
   291	def _build_cnn_scorer(session, input_name: str):
   292	    """Build a scorer for legacy mel/CNN models."""
   293	    from violawake_sdk._constants import CLIP_SAMPLES
   294	    from violawake_sdk.audio import center_crop, compute_features, load_audio
   295	
   296	    def _score_file_cnn(wav_path: Path) -> float | None:
   297	        audio = load_audio(wav_path)
   298	        if audio is None:
   299	            return None
   300	
   301	        audio = center_crop(audio, CLIP_SAMPLES)
   302	        features = compute_features(audio)
   303	        feat_input = features[np.newaxis, :, :].astype(np.float32)
   304	
   305	        try:
   306	            outputs = session.run(None, {input_name: feat_input})
   307	            return float(np.asarray(outputs[0]).flatten()[0])
   308	        except Exception:
   309	            logger.warning("Failed to score file (CNN path): %s", wav_path, exc_info=True)
   310	            return None
   311	
   312	    return _score_file_cnn
   313	
   314	
   315	def _build_temporal_oww_scorer(session, input_name: str):
   316	    """Build a scorer for temporal OpenWakeWord embedding models."""
   317	    try:
   318	        from openwakeword.model import Model as OWWModel  # type: ignore[import]
   319	    except ImportError as e:
   320	        raise ImportError(
   321	            "openwakeword required for temporal OWW evaluation. pip install openwakeword"
   322	        ) from e
   323	
   324	    from violawake_sdk._constants import CLIP_SAMPLES
   325	    from violawake_sdk.audio import center_crop, load_audio
   326	
   327	    oww = OWWModel()
   328	    preprocessor = oww.preprocessor
   329	    if not hasattr(preprocessor, "onnx_execution_provider"):
   330	        preprocessor.onnx_execution_provider = "CPUExecutionProvider"
   331	
   332	    seq_len = _infer_temporal_seq_len(session.get_inputs()[0].shape)
   333	
   334	    def _score_file_temporal(wav_path: Path) -> float | None:
   335	        try:
   336	            emb = _extract_oww_frame_embeddings(
   337	                wav_path,
   338	                preprocessor=preprocessor,
   339	                load_audio=load_audio,
   340	                center_crop=center_crop,
   341	                clip_samples=CLIP_SAMPLES,
   342	            )
   343	            if emb is None:
   344	                return None
   345	
   346	            n_frames = emb.shape[0]
   347	            if n_frames < seq_len:
   348	                padded = np.zeros((seq_len, emb.shape[1]), dtype=np.float32)
   349	                padded[:n_frames] = emb
   350	                for idx in range(n_frames, seq_len):
   351	                    padded[idx] = emb[-1]
   352	                window = padded[np.newaxis, :, :].astype(np.float32)
   353	                score = session.run(None, {input_name: window})[0]
   354	                return float(np.asarray(score).flatten()[0])
   355	
   356	            max_score = -1.0
   357	            for idx in range(n_frames - seq_len + 1):
   358	                window = emb[idx : idx + seq_len][np.newaxis, :, :].astype(np.float32)
   359	                score = session.run(None, {input_name: window})[0]
   360	                max_score = max(max_score, float(np.asarray(score).flatten()[0]))
   361	            return max_score
   362	        except Exception:
   363	            logger.warning("Failed to score file (temporal OWW path): %s", wav_path, exc_info=True)
   364	            return None
   365	
   366	    return _score_file_temporal
   367	
   368	
   369	def build_model_scorer(model_path: str | Path) -> tuple[str, Callable[[Path], float | None]]:
   370	    """Create a clip scorer for an ONNX wake-word model."""
   371	    try:
   372	        import onnxruntime as ort
   373	    except ImportError as e:
   374	        raise ImportError("onnxruntime required. pip install onnxruntime") from e
   375	
   376	    model_path = Path(model_path)
   377	    session = ort.InferenceSession(str(model_path), providers=["CPUExecutionProvider"])
   378	    input_name = session.get_inputs()[0].name
   379	
   380	    architecture = detect_architecture(model_path, session)
   381	    logger.info("Using scoring path: %s", architecture)
   382	
   383	    if architecture == "mlp_on_oww":
   384	        return architecture, _build_oww_scorer(session, input_name)
   385	    if architecture == "temporal_oww":
   386	        return architecture, _build_temporal_oww_scorer(session, input_name)
   387	    return architecture, _build_cnn_scorer(session, input_name)
   388	
   389	
   390	def evaluate_onnx_model(
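
Note on the listings above: `_check_embedding_overlap`, `_build_oww_scorer`, and `_build_temporal_oww_scorer` each construct `OWWModel()` with no arguments, whereas the `train.py` call sites pass `inference_framework="onnx"`. One possible way to keep every site consistent is a single factory. The sketch below is an assumption, not existing SDK code: `make_oww_model` is a hypothetical name, and the model class is passed in as a parameter so the example runs without openwakeword installed. The `inference_framework` keyword is the one already used in `train.py`.

```python
from typing import Any, Callable


def make_oww_model(model_cls: Callable[..., Any], framework: str = "onnx") -> Any:
    """Construct an OWW model with the inference framework pinned explicitly."""
    try:
        return model_cls(inference_framework=framework)
    except Exception as exc:
        # Re-raise with a hint instead of a bare backend loading error.
        raise RuntimeError(
            f"Could not load OpenWakeWord with inference_framework={framework!r}. "
            "Check that the backbone files for that framework are present."
        ) from exc


class FakeOWWModel:
    """Stand-in for openwakeword.model.Model, used only for this demo."""

    def __init__(self, inference_framework: str = "tflite") -> None:
        self.inference_framework = inference_framework


model = make_oww_model(FakeOWWModel)
```

In real code the call would presumably be `make_oww_model(OWWModel)` after the guarded import already present at each site.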

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '1700,1790p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '260,430p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '2400,2485p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/services/training_service.py | sed -n '220,390p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1050ms:
  1700	def _train_mlp_on_oww(
  1701	    positives_dir: Path,
  1702	    output_path: Path,
  1703	    epochs: int = 50,
  1704	    augment: bool = True,
  1705	    eval_dir: Path | None = None,
  1706	    negatives_dir: Path | None = None,
  1707	    batch_size: int = 32,
  1708	    lr: float = 1e-3,
  1709	    hidden_dim: int = 64,
  1710	    neg_ratio: int = 5,
  1711	    patience: int = 10,
  1712	    verbose: bool = True,
  1713	    progress_callback: ProgressCallback | None = None,
  1714	    device: str | None = None,
  1715	    ema_decay: float = 0.999,
  1716	    swa_epochs: int = 10,
  1717	    swa_lr: float | None = None,
  1718	    save_raw_model: bool = False,
  1719	) -> None:
  1720	    """Legacy MLP training on mean-pooled OWW embeddings.
  1721	
  1722	    Kept for backward compatibility with --architecture mlp.
  1723	    See _train_temporal_cnn for the production architecture.
  1724	    """
  1725	    training_start = time.monotonic()
  1726	
  1727	    try:
  1728	        import numpy as np
  1729	        import torch
  1730	        import torch.nn as nn
  1731	        import torch.optim as optim
  1732	        from torch.utils.data import DataLoader, TensorDataset
  1733	    except ImportError as e:
  1734	        print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
  1735	        print("Install with: pip install 'violawake[training]'", file=sys.stderr)
  1736	        sys.exit(1)
  1737	
  1738	    from violawake_sdk._constants import CLIP_SAMPLES, get_feature_config
  1739	    from violawake_sdk.audio import center_crop, load_audio
  1740	    from violawake_sdk.oww_backbone import get_openwakeword_backbone_hashes
  1741	    from violawake_sdk.training.augment import AugmentationPipeline
  1742	    from violawake_sdk.training.losses import FocalLoss
  1743	    from violawake_sdk.training.weight_averaging import (
  1744	        EMATracker,
  1745	        SWACollector,
  1746	        auto_select_averaging,
  1747	    )
  1748	
  1749	    try:
  1750	        from openwakeword.model import Model as OWWModel
  1751	    except ImportError as e:
  1752	        print(f"ERROR: openwakeword required: {e}", file=sys.stderr)
  1753	        sys.exit(1)
  1754	
  1755	    # -- Collect files -------------------------------------------------------
  1756	    pos_files = sorted(list(positives_dir.rglob("*.wav")) + list(positives_dir.rglob("*.flac")))
  1757	    if len(pos_files) < 5:
  1758	        print(f"ERROR: Found only {len(pos_files)} positive samples.", file=sys.stderr)
  1759	        sys.exit(1)
  1760	
  1761	    if verbose:
  1762	        print(f"Found {len(pos_files)} positive samples")
  1763	
  1764	    # -- Embedding extraction ------------------------------------------------
  1765	    if verbose:
  1766	        print("Loading OpenWakeWord backbone...")
  1767	
  1768	    oww = OWWModel(inference_framework="onnx")
  1769	    preprocessor = oww.preprocessor
  1770	
  1771	    def _audio_to_embedding(audio_f32):
  1772	        audio = center_crop(audio_f32, CLIP_SAMPLES)
  1773	        audio_i16 = np.clip(audio, -1.0, 1.0)
  1774	        audio_i16 = (audio_i16 * 32767).astype(np.int16)
  1775	        if len(audio_i16) < CLIP_SAMPLES:
  1776	            audio_i16 = np.pad(audio_i16, (0, CLIP_SAMPLES - len(audio_i16)))
  1777	        else:
  1778	            audio_i16 = audio_i16[:CLIP_SAMPLES]
  1779	        try:
  1780	            embeddings = preprocessor.embed_clips(audio_i16.reshape(1, -1), ncpu=1)
  1781	            return embeddings.mean(axis=1)[0].astype(np.float32)
  1782	        except Exception:
  1783	            return None
  1784	
  1785	    # Extract positives
  1786	    pos_embeddings = []
  1787	    pos_source_file_idx = []
  1788	
  1789	    if augment:
  1790	        pipeline = AugmentationPipeline(seed=42)

 succeeded in 1060ms:
   260	
   261	
   262	def _edge_tts_synthesize(text: str, voice: str, output_path: Path) -> bool:
   263	    """Synthesize a single phrase with edge-tts and save as WAV at 16kHz.
   264	
   265	    Returns True on success, False on failure.
   266	    """
   267	    import asyncio
   268	    import io
   269	    import tempfile
   270	
   271	    try:
   272	        import edge_tts
   273	    except ImportError:
   274	        print("WARNING: edge-tts not installed. pip install edge-tts", file=sys.stderr)
   275	        return False
   276	
   277	    async def _synth():
   278	        communicate = edge_tts.Communicate(text, voice)
   279	        mp3_buf = io.BytesIO()
   280	        async for chunk in communicate.stream():
   281	            if chunk["type"] == "audio":
   282	                mp3_buf.write(chunk["data"])
   283	        return mp3_buf.getvalue()
   284	
   285	    try:
   286	        # Run the async synthesis
   287	        try:
   288	            loop = asyncio.get_event_loop()
   289	            if loop.is_running():
   290	                import concurrent.futures
   291	
   292	                with concurrent.futures.ThreadPoolExecutor() as pool:
   293	                    mp3_data = pool.submit(lambda: asyncio.run(_synth())).result(timeout=30)
   294	            else:
   295	                mp3_data = loop.run_until_complete(_synth())
   296	        except RuntimeError:
   297	            mp3_data = asyncio.run(_synth())
   298	
   299	        if not mp3_data or len(mp3_data) < 100:
   300	            return False
   301	
   302	        # Convert MP3 to WAV at 16kHz using pydub or ffmpeg
   303	        try:
   304	            from pydub import AudioSegment
   305	
   306	            seg = AudioSegment.from_mp3(io.BytesIO(mp3_data))
   307	            seg = seg.set_channels(1).set_frame_rate(16000).set_sample_width(2)
   308	            seg.export(str(output_path), format="wav")
   309	            return True
   310	        except ImportError:
   311	            pass
   312	
   313	        # Fallback: write MP3 to temp, load with torchaudio/scipy
   314	        tmp_fd, tmp_path = tempfile.mkstemp(suffix=".mp3", dir=_TMP_DIR)
   315	        try:
   316	            os.write(tmp_fd, mp3_data)
   317	        finally:
   318	            os.close(tmp_fd)
   319	        os.chmod(tmp_path, 0o600)
   320	
   321	        try:
   322	            import torchaudio
   323	
   324	            waveform, sr = torchaudio.load(tmp_path)
   325	            if waveform.shape[0] > 1:
   326	                waveform = waveform.mean(dim=0, keepdim=True)
   327	            if sr != 16000:
   328	                waveform = torchaudio.transforms.Resample(sr, 16000)(waveform)
   329	            torchaudio.save(str(output_path), waveform, 16000)
   330	            return True
   331	        except Exception:
   332	            pass
   333	        finally:
   334	            Path(tmp_path).unlink(missing_ok=True)
   335	
   336	        return False
   337	    except Exception:
   338	        return False
   339	
   340	
   341	def _resample_audio(audio: np.ndarray, source_rate: int, target_rate: int) -> np.ndarray:
   342	    """Resample mono audio while keeping float32 output."""
   343	    import numpy as np
   344	    from scipy.signal import resample_poly
   345	
   346	    if source_rate == target_rate:
   347	        return np.asarray(audio, dtype=np.float32)
   348	
   349	    gcd = math.gcd(source_rate, target_rate)
   350	    up = target_rate // gcd
   351	    down = source_rate // gcd
   352	    return np.asarray(resample_poly(audio, up, down), dtype=np.float32)
   353	
   354	
   355	def _kokoro_tts_synthesize(
   356	    text: str,
   357	    voice: str,
   358	    output_path: Path,
   359	    *,
   360	    engine: Any | None = None,
   361	) -> bool:
   362	    """Synthesize a single phrase with Kokoro and save as WAV at 16kHz."""
   363	    import numpy as np
   364	
   365	    try:
   366	        from violawake_sdk.tts import TTS_SAMPLE_RATE, TTSEngine
   367	    except ImportError:
   368	        return False
   369	
   370	    try:
   371	        kokoro_engine = engine
   372	        if kokoro_engine is None:
   373	            kokoro_engine = TTSEngine(voice=voice, sample_rate=TTS_SAMPLE_RATE)
   374	        else:
   375	            kokoro_engine.voice = voice
   376	
   377	        audio = np.asarray(kokoro_engine.synthesize(text), dtype=np.float32)
   378	        if audio.size == 0:
   379	            return False
   380	        if int(kokoro_engine.sample_rate) != 16000:
   381	            audio = _resample_audio(audio, int(kokoro_engine.sample_rate), 16000)
   382	        _save_wav(audio, output_path, sample_rate=16000)
   383	        return True
   384	    except Exception:
   385	        return False
   386	
   387	
   388	def _generate_tts_positives(
   389	    wake_word: str,
   390	    output_dir: Path,
   391	    verbose: bool = True,
   392	) -> list[Path]:
   393	    """Generate diverse TTS positive samples using Edge TTS with Kokoro fallback.
   394	
   395	    Produces: 20 voices x 3 phrases (WORD, hey WORD, ok WORD) = 60 clean files.
   396	    Then augmentation (noisy + reverb) multiplies to ~180 total.
   397	
   398	    Returns list of generated WAV file paths.
   399	    """
   400	    import numpy as np
   401	
   402	    from violawake_sdk.training.augment import (
   403	        rir_augment,
   404	    )
   405	
   406	    output_dir.mkdir(parents=True, exist_ok=True)
   407	    phrases = [wake_word, f"hey {wake_word}", f"ok {wake_word}"]
   408	    generated: list[Path] = []
   409	    kokoro_fallback = False
   410	    kokoro_engine: Any | None = None
   411	    kokoro_voices: list[str] = []
   412	
   413	    def _ensure_kokoro_ready() -> bool:
   414	        nonlocal kokoro_fallback, kokoro_engine, kokoro_voices
   415	        if kokoro_fallback:
   416	            return kokoro_engine is not None and len(kokoro_voices) > 0
   417	        try:
   418	            from violawake_sdk.tts import AVAILABLE_VOICES, TTS_SAMPLE_RATE, TTSEngine
   419	        except ImportError:
   420	            return False
   421	
   422	        print("Using Kokoro TTS for sample generation (Edge TTS unavailable)")
   423	        kokoro_fallback = True
   424	        kokoro_voices = list(AVAILABLE_VOICES)
   425	        if not kokoro_voices:
   426	            return False
   427	        try:
   428	            kokoro_engine = TTSEngine(
   429	                voice=kokoro_voices[0],
   430	                sample_rate=TTS_SAMPLE_RATE,

 succeeded in 1027ms:
  2400	            verbose=verbose,
  2401	        )
  2402	        if confusable_r2:
  2403	            neg_tag_map["neg_confusable_r2"] = confusable_r2
  2404	            confusable_files.extend(confusable_r2)
  2405	
  2406	    # Source 3: Auto-generated speech negatives
  2407	    speech_neg_files: list[Path] = []
  2408	    if args.auto_corpus:
  2409	        if verbose:
  2410	            print("\nStep 1c: Auto-generating speech negatives...")
  2411	        speech_neg_dir = corpus_dir / "speech_negatives"
  2412	        speech_neg_files = _generate_speech_negatives(
  2413	            speech_neg_dir,
  2414	            n_voices=5,
  2415	            verbose=verbose,
  2416	        )
  2417	        if speech_neg_files:
  2418	            neg_tag_map["neg_speech"] = speech_neg_files
  2419	
  2420	    # Source 4: Shared universal negative corpus (LibriSpeech, MUSAN, etc.)
  2421	    # These are word-agnostic negatives that every wake word model needs.
  2422	    # Without them, models only learn to distinguish the wake word from a
  2423	    # tiny auto-generated set and false-trigger on any real-world speech.
  2424	    _CORPUS_SEARCH_PATHS = [
  2425	        Path(__file__).resolve().parent.parent.parent.parent / "corpus",  # repo root
  2426	        Path.home() / ".violawake" / "corpus",
  2427	        Path("corpus"),
  2428	    ]
  2429	    _CORPUS_SUBDIRS = {
  2430	        "neg_librispeech": "librispeech",
  2431	        "neg_musan_speech": ("musan/musan/speech", "musan/speech"),
  2432	        "neg_musan_music": ("musan/musan/music", "musan/music"),
  2433	        "neg_musan_noise": ("musan/musan/noise", "musan/noise"),
  2434	    }
  2435	    for tag, subdirs in _CORPUS_SUBDIRS.items():
  2436	        if isinstance(subdirs, str):
  2437	            subdirs = (subdirs,)
  2438	        for corpus_root in _CORPUS_SEARCH_PATHS:
  2439	            if not corpus_root.exists():
  2440	                continue
  2441	            for subdir in subdirs:
  2442	                candidate = corpus_root / subdir
  2443	                if candidate.exists():
  2444	                    corpus_files = sorted(
  2445	                        list(candidate.rglob("*.wav")) + list(candidate.rglob("*.flac"))
  2446	                    )
  2447	                    if corpus_files:
  2448	                        # Cap each source to avoid swamping the dataset
  2449	                        max_per_source = 2000
  2450	                        if len(corpus_files) > max_per_source:
  2451	                            import random
  2452	
  2453	                            rng = random.Random(42)
  2454	                            corpus_files = sorted(rng.sample(corpus_files, max_per_source))
  2455	                        neg_tag_map[tag] = corpus_files
  2456	                        if verbose:
  2457	                            print(
  2458	                                f"  Shared corpus [{tag}]: {len(corpus_files)} files from {candidate}"
  2459	                            )
  2460	                        break  # found this tag, move to next
  2461	            if tag in neg_tag_map:
  2462	                break  # found in this root, move to next tag
  2463	
  2464	    corpus_paths = {
  2465	        "neg_librispeech": "~/.violawake/corpus/librispeech/   (speech recordings)",
  2466	        "neg_musan_speech": "~/.violawake/corpus/musan/speech/  (MUSAN speech subset)",
  2467	        "neg_musan_music": "~/.violawake/corpus/musan/music/   (MUSAN music subset)",
  2468	        "neg_musan_noise": "~/.violawake/corpus/musan/noise/   (MUSAN noise subset)",
  2469	    }
  2470	    found_corpus_tags = [tag for tag in _CORPUS_SUBDIRS if neg_tag_map.get(tag)]
  2471	    missing_corpus_tags = [tag for tag in _CORPUS_SUBDIRS if tag not in found_corpus_tags]
  2472	    if not found_corpus_tags:
  2473	        print(
  2474	            "\nWARNING: No universal negative corpus found.\n"
  2475	            "Training with TTS-only negatives may produce a model with high\n"
  2476	            "false positive rates on real speech and music.\n"
  2477	            "\n"
  2478	            "Place audio files in one of these locations:\n"
  2479	            "  ~/.violawake/corpus/librispeech/   (speech recordings)\n"
  2480	            "  ~/.violawake/corpus/musan/speech/  (MUSAN speech subset)\n"
  2481	            "  ~/.violawake/corpus/musan/music/   (MUSAN music subset)\n"
  2482	            "  ~/.violawake/corpus/musan/noise/   (MUSAN noise subset)\n"
  2483	            "\n"
  2484	            "Or provide negatives via: --negatives <dir>\n"
  2485	        )

 succeeded in 1053ms:
   220	                neg_tag_map["neg_speech"] = speech_files
   221	        except Exception as exc:
   222	            logger.error(
   223	                "Speech neg generation FAILED for job %s: %s — "
   224	                "model will have higher false positive rate on general speech",
   225	                job_id, exc,
   226	            )
   227	
   228	        _ensure_not_cancelled()
   229	
   230	        # Source 4: Universal corpus (LibriSpeech, MUSAN) if available
   231	        _CORPUS_SEARCH_PATHS = [
   232	            Path(__file__).resolve().parent.parent.parent.parent / "corpus",  # repo root
   233	            Path.home() / ".violawake" / "corpus",
   234	            Path("corpus"),
   235	        ]
   236	        _CORPUS_SUBDIRS: dict[str, tuple[str, ...]] = {
   237	            "neg_librispeech": ("librispeech",),
   238	            "neg_musan_speech": ("musan/musan/speech", "musan/speech"),
   239	            "neg_musan_music": ("musan/musan/music", "musan/music"),
   240	            "neg_musan_noise": ("musan/musan/noise", "musan/noise"),
   241	        }
   242	        _rng = random.Random(42)
   243	        for tag, subdirs in _CORPUS_SUBDIRS.items():
   244	            for corpus_root in _CORPUS_SEARCH_PATHS:
   245	                if not corpus_root.exists():
   246	                    continue
   247	                for subdir in subdirs:
   248	                    candidate = corpus_root / subdir
   249	                    if candidate.exists():
   250	                        corpus_files = sorted(
   251	                            list(candidate.rglob("*.wav")) + list(candidate.rglob("*.flac"))
   252	                        )
   253	                        if corpus_files:
   254	                            if len(corpus_files) > 2000:
   255	                                corpus_files = sorted(_rng.sample(corpus_files, 2000))
   256	                            neg_tag_map[tag] = corpus_files
   257	                            break
   258	                if tag in neg_tag_map:
   259	                    break
   260	
   261	        all_neg_files: list[Path] = []
   262	        for files in neg_tag_map.values():
   263	            all_neg_files.extend(files)
   264	
   265	        total_neg = len(all_neg_files)
   266	        if total_neg < 5:
   267	            raise RuntimeError(
   268	                "Only %s negative files generated. "
   269	                "edge-tts may not be installed or network unavailable." % total_neg
   270	            )
   271	
   272	        progress_callback({
   273	            "status": "running",
   274	            "progress": 8.0,
   275	            "epoch": 0,
   276	            "total_epochs": epochs,
   277	            "train_loss": 0.0,
   278	            "val_loss": 0.0,
   279	            "message": "Corpus ready: %s pos, %s neg. Training TemporalCNN..." % (len(pos_files), total_neg),
   280	            "error": None,
   281	        })
   282	
   283	        started_at = time.monotonic()
   284	
   285	        def _on_epoch(info: dict[str, Any]) -> None:
   286	            _ensure_not_cancelled()
   287	            elapsed = time.monotonic() - started_at
   288	            if elapsed > timeout_seconds:
   289	                raise RuntimeError(
   290	                    "Training job timed out after %ss (%s minutes)"
   291	                    % (timeout_seconds, timeout_seconds // 60)
   292	                )
   293	
   294	            epoch = int(info.get("epoch", 0))
   295	            total_epochs = int(info.get("total_epochs", epochs)) or epochs
   296	            train_loss = float(info.get("train_loss", 0.0))
   297	            val_loss = float(info.get("val_loss", 0.0))
   298	            progress = min(10.0 + 85.0 * (epoch / total_epochs), 95.0)
   299	
   300	            progress_callback({
   301	                "status": "running",
   302	                "progress": round(progress, 2),
   303	                "epoch": epoch,
   304	                "total_epochs": total_epochs,
   305	                "train_loss": round(train_loss, 6),
   306	                "val_loss": round(val_loss, 6),
   307	                "message": "Epoch %s/%s - loss: %.4f" % (epoch, total_epochs, train_loss),
   308	                "error": None,
   309	            })
   310	
   311	        _train_temporal_cnn(
   312	            pos_files=pos_files,
   313	            neg_files=all_neg_files,
   314	            output_path=output_path,
   315	            wake_word=wake_word,
   316	            epochs=epochs,
   317	            augment=True,
   318	            eval_dir=None,
   319	            verbose=True,
   320	            progress_callback=_on_epoch,
   321	            neg_tags=neg_tag_map,
   322	            tmp_dir=settings.tmp_dir,
   323	            augment_source_files=user_pos_files,
   324	        )
   325	
   326	        _ensure_not_cancelled()
   327	        if not output_path.exists():
   328	            raise RuntimeError("Training completed but no model file was produced")
   329	
   330	        config_path = output_path.with_suffix(".config.json")
   331	        config_json: str | None = None
   332	        config_bytes: bytes | None = None
   333	        d_prime_value: float | None = None
   334	        if config_path.exists():
   335	            config_bytes = config_path.read_bytes()
   336	            config_data = json.loads(config_bytes.decode("utf-8"))
   337	            config_json = json.dumps(config_data)
   338	            if isinstance(config_data, dict):
   339	                raw_d_prime = config_data.get("d_prime")
   340	                if isinstance(raw_d_prime, (int, float)):
   341	                    d_prime_value = float(raw_d_prime)
   342	
   343	        return TrainingArtifact(
   344	            local_path=output_path,
   345	            config_json=config_json,
   346	            config_bytes=config_bytes,
   347	            d_prime=d_prime_value,
   348	            size_bytes=output_path.stat().st_size,
   349	        )
   350	    except TrainingCancelledError:
   351	        logger.info("Training job %s cancelled", job_id)
   352	        raise
   353	    except Exception as exc:
   354	        log_exception(
   355	            logger,
   356	            exc,
   357	            message="Training job failed",
   358	            source="training",
   359	            extra={"job_id": job_id, "wake_word": wake_word},
   360	        )
   361	        raise
   362	    finally:
   363	        if positives_dir is not None and positives_dir.exists():
   364	            shutil.rmtree(positives_dir, ignore_errors=True)
   365	        if neg_temp_dir is not None and neg_temp_dir.exists():
   366	            shutil.rmtree(neg_temp_dir, ignore_errors=True)
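
Note on the two listings above: the universal-corpus lookup in `train.py` (lines 2424-2462) and in the console backend job (lines 231-259) are near copies, with one visible difference. `train.py` creates a fresh `random.Random(42)` for each capped source, while the backend shares one `_rng` across all tags, so the two paths can select different 2000-file subsets from the same corpus. A minimal sketch of a shared helper that both could call is below. The name `find_corpus_files` and its parameters are hypothetical, and it follows the `train.py` behaviour of one fresh RNG per tag.

```python
from __future__ import annotations

import random
from pathlib import Path
from typing import Iterable, Mapping, Sequence


def find_corpus_files(
    search_roots: Iterable[Path],
    subdirs_by_tag: Mapping[str, Sequence[str]],
    max_per_source: int = 2000,
    seed: int = 42,
) -> dict[str, list[Path]]:
    """Return the first non-empty .wav/.flac file list found for each tag."""
    found: dict[str, list[Path]] = {}
    roots = [root for root in search_roots if root.exists()]
    for tag, subdirs in subdirs_by_tag.items():
        for root in roots:
            for subdir in subdirs:
                candidate = root / subdir
                if not candidate.exists():
                    continue
                files = sorted(
                    list(candidate.rglob("*.wav")) + list(candidate.rglob("*.flac"))
                )
                if not files:
                    continue
                if len(files) > max_per_source:
                    # Fresh RNG per tag: the subset does not depend on
                    # which other tags were found earlier.
                    files = sorted(random.Random(seed).sample(files, max_per_source))
                found[tag] = files
                break  # first matching subdir wins for this root
            if tag in found:
                break  # first root with files wins for this tag
    return found
```

Callers would pass their own search paths and subdirectory table, so the warning text about missing corpora could stay where it is in `train.py`.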

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '430,610p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '610,700p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/models.py | sed -n '1,120p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/models.py | sed -n '180,255p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1023ms:
   430	                sample_rate=TTS_SAMPLE_RATE,
   431	            )
   432	        except Exception:
   433	            kokoro_engine = None
   434	        return kokoro_engine is not None
   435	
   436	    if verbose:
   437	        total = len(EDGE_TTS_VOICES) * len(phrases)
   438	        print(
   439	            f"  Generating TTS positives: {len(EDGE_TTS_VOICES)} voices x {len(phrases)} phrases = {total} clean samples..."
   440	        )
   441	
   442	    for voice_idx, voice in enumerate(EDGE_TTS_VOICES):
   443	        for phrase_idx, phrase in enumerate(phrases):
   444	            clean_path = output_dir / f"tts_pos_{voice_idx:02d}_{phrase_idx}_{voice}.wav"
   445	            if clean_path.exists():
   446	                generated.append(clean_path)
   447	                continue
   448	
   449	            if kokoro_fallback:
   450	                kokoro_voice = kokoro_voices[voice_idx % len(kokoro_voices)]
   451	                ok = _kokoro_tts_synthesize(
   452	                    phrase,
   453	                    kokoro_voice,
   454	                    clean_path,
   455	                    engine=kokoro_engine,
   456	                )
   457	            else:
   458	                ok = _edge_tts_synthesize(phrase, voice, clean_path)
   459	                if not ok and _ensure_kokoro_ready():
   460	                    kokoro_voice = kokoro_voices[voice_idx % len(kokoro_voices)]
   461	                    ok = _kokoro_tts_synthesize(
   462	                        phrase,
   463	                        kokoro_voice,
   464	                        clean_path,
   465	                        engine=kokoro_engine,
   466	                    )
   467	            if ok and clean_path.exists():
   468	                generated.append(clean_path)
   469	
   470	                # Generate noisy variant
   471	                try:
   472	                    from violawake_sdk.audio import load_audio
   473	                    from violawake_sdk.training.augment import apply_additive_noise
   474	
   475	                    audio = load_audio(clean_path)
   476	                    if audio is not None and len(audio) > 0:
   477	                        rng = np.random.default_rng(voice_idx * 100 + phrase_idx)
   478	
   479	                        # Noisy variant (SNR 10-15 dB)
   480	                        noisy = apply_additive_noise(audio, snr_db=12.0, rng=rng)
   481	                        noisy_path = (
   482	                            output_dir / f"tts_pos_{voice_idx:02d}_{phrase_idx}_{voice}_noisy.wav"
   483	                        )
   484	                        _save_wav(noisy, noisy_path)
   485	                        generated.append(noisy_path)
   486	
   487	                        # Reverb variant
   488	                        reverbed = rir_augment(audio, rng=rng)
   489	                        reverb_path = (
   490	                            output_dir / f"tts_pos_{voice_idx:02d}_{phrase_idx}_{voice}_reverb.wav"
   491	                        )
   492	                        _save_wav(reverbed, reverb_path)
   493	                        generated.append(reverb_path)
   494	                except Exception:
   495	                    pass  # Augmented variants are best-effort
   496	
   497	        if verbose and (voice_idx + 1) % 5 == 0:
   498	            print(
   499	                f"    {voice_idx + 1}/{len(EDGE_TTS_VOICES)} voices done ({len(generated)} files)"
   500	            )
   501	
   502	    if verbose:
   503	        print(f"  TTS positives generated: {len(generated)} files")
   504	
   505	    return generated
   506	
   507	
   508	def _generate_confusable_negatives(
   509	    wake_word: str,
   510	    output_dir: Path,
   511	    n_confusables: int = 30,
   512	    voices_per_word: int = 10,
   513	    verbose: bool = True,
   514	) -> list[Path]:
   515	    """Generate confusable negative samples via TTS.
   516	
   517	    Uses the confusables generator to find phonetically similar words,
   518	    then synthesizes each with multiple TTS voices.
   519	
   520	    Returns list of generated WAV file paths.
   521	    """
   522	    from violawake_sdk.tools.confusables import generate_confusables
   523	
   524	    output_dir.mkdir(parents=True, exist_ok=True)
   525	    confusable_words = generate_confusables(wake_word, count=n_confusables)
   526	
   527	    if verbose:
   528	        print(f"  Generated {len(confusable_words)} confusable words for '{wake_word}'")
   529	        if confusable_words[:5]:
   530	            print(f"    Top 5: {', '.join(confusable_words[:5])}")
   531	        total = len(confusable_words) * voices_per_word
   532	        print(
   533	            f"  Synthesizing: {len(confusable_words)} words x {voices_per_word} voices = {total} samples..."
   534	        )
   535	
   536	    voices_subset = EDGE_TTS_VOICES[:voices_per_word]
   537	    generated: list[Path] = []
   538	
   539	    for word_idx, word in enumerate(confusable_words):
   540	        for voice_idx, voice in enumerate(voices_subset):
   541	            safe_word = word.replace(" ", "_")[:30]
   542	            out_path = output_dir / f"confusable_{word_idx:03d}_{voice_idx}_{safe_word}.wav"
   543	            if out_path.exists():
   544	                generated.append(out_path)
   545	                continue
   546	
   547	            ok = _edge_tts_synthesize(word, voice, out_path)
   548	            if ok and out_path.exists():
   549	                generated.append(out_path)
   550	
   551	        if verbose and (word_idx + 1) % 10 == 0:
   552	            print(f"    {word_idx + 1}/{len(confusable_words)} words done ({len(generated)} files)")
   553	
   554	    if verbose:
   555	        print(f"  Confusable negatives generated: {len(generated)} files")
   556	
   557	    return generated
   558	
   559	
   560	def _generate_speech_negatives(
   561	    output_dir: Path,
   562	    n_voices: int = 5,
   563	    verbose: bool = True,
   564	) -> list[Path]:
   565	    """Generate speech negative samples via TTS using common phrases.
   566	
   567	    Returns list of generated WAV file paths.
   568	    """
   569	    output_dir.mkdir(parents=True, exist_ok=True)
   570	    voices_subset = EDGE_TTS_VOICES[:n_voices]
   571	    generated: list[Path] = []
   572	
   573	    total = len(SPEECH_NEGATIVE_PHRASES) * n_voices
   574	    if verbose:
   575	        print(
   576	            f"  Generating speech negatives: {len(SPEECH_NEGATIVE_PHRASES)} phrases x {n_voices} voices = {total} samples..."
   577	        )
   578	
   579	    for phrase_idx, phrase in enumerate(SPEECH_NEGATIVE_PHRASES):
   580	        for voice_idx, voice in enumerate(voices_subset):
   581	            safe_phrase = phrase.replace(" ", "_")[:40]
   582	            out_path = output_dir / f"speech_neg_{phrase_idx:03d}_{voice_idx}_{safe_phrase}.wav"
   583	            if out_path.exists():
   584	                generated.append(out_path)
   585	                continue
   586	
   587	            ok = _edge_tts_synthesize(phrase, voice, out_path)
   588	            if ok and out_path.exists():
   589	                generated.append(out_path)
   590	
   591	        if verbose and (phrase_idx + 1) % 25 == 0:
   592	            print(
   593	                f"    {phrase_idx + 1}/{len(SPEECH_NEGATIVE_PHRASES)} phrases done ({len(generated)} files)"
   594	            )
   595	
   596	    if verbose:
   597	        print(f"  Speech negatives generated: {len(generated)} files")
   598	
   599	    return generated
   600	
   601	
   602	def _save_wav(audio: np.ndarray, path: Path, sample_rate: int = 16000) -> None:
   603	    """Save float32 audio to a WAV file."""
   604	    import wave
   605	
   606	    import numpy as np
   607	
   608	    audio = np.clip(audio, -1.0, 1.0)
   609	    pcm_i16 = (audio * 32767).astype(np.int16)
   610	    with wave.open(str(path), "wb") as wf:

 succeeded in 1026ms:
   610	    with wave.open(str(path), "wb") as wf:
   611	        wf.setnchannels(1)
   612	        wf.setsampwidth(2)
   613	        wf.setframerate(sample_rate)
   614	        wf.writeframes(pcm_i16.tobytes())
   615	
   616	
   617	# ---------------------------------------------------------------------------
   618	# Positive augmentation and temporal embedding helpers
   619	# ---------------------------------------------------------------------------
   620	
   621	
   622	def _augment_positives(
   623	    raw_audio_arrays: list[np.ndarray],
   624	    *,
   625	    sample_rate: int = 16000,
   626	    copies_per_clip: int = 21,
   627	    seed: int = 42,
   628	) -> list[np.ndarray]:
   629	    """Augment positive clips with the roadmap audiomentations chain.
   630	
   631	    This operates on raw waveform arrays before OWW embedding extraction and
   632	    returns only augmented copies (the originals remain unchanged).
   633	    """
   634	    import numpy as np
   635	
   636	    try:
   637	        from audiomentations import (
   638	            Compose,
   639	            Gain,
   640	            Mp3Compression,
   641	            PitchShift,
   642	            TimeMask,
   643	            TimeStretch,
   644	        )
   645	    except ImportError as e:
   646	        raise RuntimeError(
   647	            "audiomentations is required for positive augmentation. "
   648	            "Install with: pip install 'violawake[training]'"
   649	        ) from e
   650	
   651	    if not raw_audio_arrays:
   652	        return []
   653	
   654	    augmenter = Compose(
   655	        [
   656	            Gain(min_gain_db=-6.0, max_gain_db=6.0, p=0.8),
   657	            TimeStretch(min_rate=0.9, max_rate=1.1, p=0.5),
   658	            PitchShift(min_semitones=-2.0, max_semitones=2.0, p=0.5),
   659	            Mp3Compression(min_bitrate=32, max_bitrate=128, p=0.3),
   660	            TimeMask(min_band_part=0.0, max_band_part=0.1, p=0.3),
   661	        ],
   662	        shuffle=False,
   663	    )
   664	
   665	    augmented: list[np.ndarray] = []
   666	    rng = np.random.default_rng(seed)
   667	
   668	    for audio in raw_audio_arrays:
   669	        base_audio = np.asarray(audio, dtype=np.float32)
   670	        for _ in range(copies_per_clip):
   671	            # audiomentations reads numpy's global RNG internally.
   672	            np.random.seed(int(rng.integers(0, 2**31 - 1)))
   673	            augmented_audio = augmenter(samples=base_audio.copy(), sample_rate=sample_rate)
   674	            augmented.append(np.asarray(augmented_audio, dtype=np.float32))
   675	
   676	    return augmented
   677	
   678	
   679	def _prepare_audio_for_oww(
   680	    audio: np.ndarray,
   681	    *,
   682	    clip_name: str,
   683	    verbose: bool,
   684	) -> np.ndarray | None:
   685	    """Center-crop/pad an audio clip and convert it to int16 for OWW."""
   686	    import numpy as np
   687	
   688	    from violawake_sdk._constants import CLIP_SAMPLES
   689	    from violawake_sdk.audio import center_crop
   690	
   691	    audio_f32 = np.asarray(audio, dtype=np.float32)
   692	    if audio_f32.size == 0:
   693	        return None
   694	
   695	    audio_rms = float(np.sqrt(np.mean(audio_f32**2)))
   696	    if audio_rms < 1e-6:
   697	        if verbose:
   698	            print(f"    WARNING: Skipping zero-energy clip: {clip_name}")
   699	        return None
   700	

 succeeded in 1009ms:
     1	"""Model registry, download, and cache management.
     2	
     3	Models are distributed via GitHub Releases (not PyPI — too large).
     4	This module handles:
     5	  - Declaring the model registry (name, URL, SHA-256, size)
     6	  - Downloading models on demand with progress and verification
     7	  - Caching models in ~/.violawake/models/ (or VIOLAWAKE_MODEL_DIR)
     8	
     9	See ADR-005 for the full rationale behind this distribution approach.
    10	"""
    11	
    12	from __future__ import annotations
    13	
    14	import contextlib
    15	import hashlib
    16	import logging
    17	import os
    18	import sys
    19	from dataclasses import dataclass
    20	from pathlib import Path
    21	
    22	from violawake_sdk._exceptions import ModelNotFoundError
    23	
    24	logger = logging.getLogger(__name__)
    25	
    26	# Default model cache directory
    27	DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
    28	
    29	
    30	@dataclass(frozen=True)
    31	class ModelSpec:
    32	    """Specification for a downloadable model."""
    33	
    34	    name: str
    35	    url: str
    36	    sha256: str
    37	    size_bytes: int
    38	    description: str
    39	    version: str = "latest"
    40	
    41	
    42	# ──────────────────────────────────────────────────────────────────────────────
    43	# Model Registry
    44	# Update this table when releasing new model versions.
    45	# SHA-256 values are filled in during the release process by tools/update_model_registry.py.
    46	# ──────────────────────────────────────────────────────────────────────────────
    47	MODEL_REGISTRY: dict[str, ModelSpec] = {
    48	    "temporal_cnn": ModelSpec(
    49	        name="temporal_cnn",
    50	        url="https://github.com/GeeIHadAGoodTime/ViolaWake/releases/download/v0.1.0/temporal_cnn.onnx",
    51	        sha256="9c0b12c68593cfdb3d320a3b34667913b18d63e89eb01247d6332d7839ac9efe",
    52	        size_bytes=102378,
    53	        description="Temporal CNN on OWW embeddings — production default, best live recall + lowest FP",
    54	        version="0.1.0",
    55	    ),
    56	    "oww_backbone": ModelSpec(
    57	        name="oww_backbone",
    58	        # Not directly downloadable — fetched at runtime by openwakeword.utils.download_models()
    59	        # on first WakeDetector use. URL is a reference only; download is blocked by
    60	        # _PACKAGE_MANAGED_MODELS. The pinned sha256 is the combined SHA-256 of
    61	        # (mel_sha256 + emb_sha256) — see oww_backbone._verify_backbone_integrity.
    62	        # Updated 2026-05-07 to track current upstream openwakeword release.
    63	        url="https://github.com/dscripka/openWakeWord/tree/main/openwakeword/resources",
    64	        sha256="e8444299a314fbb2971d33b39ff6fce4838be0f4a8d98aa4cf87537ee1350454",
    65	        size_bytes=1_326_578,
    66	        description="OpenWakeWord embedding backbone — installed with openwakeword package, downloaded at runtime",
    67	        version="0.6.0",
    68	    ),
    69	    # Kokoro TTS models hosted upstream at thewh1teagle/kokoro-onnx (Apache 2.0).
    70	    # These are large (325MB + 28MB) so they're not bundled in the PyPI package —
    71	    # they auto-download on first TTSEngine use.
    72	    "kokoro_v1_0": ModelSpec(
    73	        name="kokoro_v1_0",
    74	        url="https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx",
    75	        sha256="7d5df8ecf7d4b1878015a32686053fd0eebe2bc377234608764cc0ef3636a6c5",
    76	        size_bytes=325_532_387,
    77	        description="Kokoro-82M TTS model — Apache 2.0 licensed, 24kHz output",
    78	        version="1.0",
    79	    ),
    80	    "kokoro_voices_v1_0": ModelSpec(
    81	        name="kokoro_voices_v1_0",
    82	        url="https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin",
    83	        sha256="bca610b8308e8d99f32e6fe4197e7ec01679264efed0cac9140fe9c29f1fbf7d",
    84	        size_bytes=28_214_398,
    85	        description="Kokoro voice embeddings — required for TTS",
    86	        version="1.0",
    87	    ),
    88	    # temporal_cnn_tflite removed — TFLite conversion not yet validated.
    89	    # Re-add once .tflite is generated and uploaded to GitHub Releases.
    90	    "temporal_convgru": ModelSpec(
    91	        name="temporal_convgru",
    92	        url="https://github.com/GeeIHadAGoodTime/ViolaWake/releases/download/v0.1.0/temporal_convgru.onnx",
    93	        sha256="5990edf8c6228c45a53e08a270ec4ad9350c9133091ebd95d023a5745578e6f8",
    94	        size_bytes=80780,
    95	        description="Temporal Conv-GRU on OWW embeddings — reserve model",
    96	        version="0.1.0",
    97	    ),
    98	    "r3_10x_s42": ModelSpec(
    99	        name="r3_10x_s42",
   100	        url="https://github.com/GeeIHadAGoodTime/ViolaWake/releases/download/v0.1.0/r3_10x_s42.onnx",
   101	        sha256="98d028fe36ed10c51791da91fcead6c7c0dd3149049da02a0436ed008da7a362",
   102	        size_bytes=34010,
   103	        description="MLP on OWW embeddings — DEPRECATED: fails live mic test (max score 0.50)",
   104	        version="0.1.0",
   105	    ),
   106	}
   107	# Alias: "viola" resolves to "temporal_cnn" — the current production default model.
   108	# This allows callers to use the generic name without coupling to a specific architecture.
   109	MODEL_REGISTRY["viola"] = MODEL_REGISTRY["temporal_cnn"]
   110	
   111	_PACKAGE_MANAGED_MODELS = {"oww_backbone"}
   112	
   113	# Size validation tolerance: downloaded file size must be within this fraction of
   114	# the declared size_bytes.  5% accommodates minor compression/header differences
   115	# while still catching truncated or wildly wrong files.
   116	SIZE_TOLERANCE_FRACTION = 0.05
   117	
   118	
   119	def check_registry_integrity(*, strict: bool = True) -> list[str]:
   120	    """Verify that no model in MODEL_REGISTRY has a placeholder SHA-256 hash.

 succeeded in 1059ms:
   180	
   181	    This is a lightweight download path that does NOT require tqdm or
   182	    requests as hard dependencies — it uses urllib from the standard library.
   183	    For full download features (progress bars, pinning), use ``download_model()``.
   184	
   185	    Models with placeholder hashes are refused (raises RuntimeError).
   186	
   187	    Args:
   188	        model_name: Name from MODEL_REGISTRY.
   189	        spec: The ModelSpec for this model.
   190	
   191	    Returns:
   192	        Path to the downloaded model file.
   193	
   194	    Raises:
   195	        RuntimeError: If the download fails.
   196	    """
   197	    import tempfile
   198	    import urllib.error
   199	    import urllib.request
   200	
   201	    url_suffix = Path(spec.url).suffix
   202	    ext = url_suffix or ".onnx"
   203	    model_dir = get_model_dir()
   204	    model_path = model_dir / f"{spec.name}{ext}"
   205	
   206	    has_placeholder = "placeholder" in spec.sha256.lower()
   207	    size_str = _format_size(spec.size_bytes)
   208	
   209	    # Print progress to stderr so it doesn't interfere with piped output
   210	    print(
   211	        f"Downloading model '{model_name}' ({size_str})...",
   212	        end="",
   213	        flush=True,
   214	        file=sys.stderr,
   215	    )
   216	
   217	    if has_placeholder:
   218	        print(" REFUSED", file=sys.stderr, flush=True)
   219	        raise RuntimeError(
   220	            f"Model '{model_name}' has a placeholder SHA-256 hash and cannot be "
   221	            f"verified. This model was never released. Use 'temporal_cnn' instead."
   222	        )
   223	
   224	    # Reject non-HTTPS URLs to prevent MITM attacks on model downloads
   225	    if not spec.url.startswith("https://"):
   226	        raise ValueError(
   227	            f"Refusing to download model '{model_name}' from non-HTTPS URL: "
   228	            f"{spec.url}. Only HTTPS URLs are allowed for model downloads."
   229	        )
   230	
   231	    tmp_fd = None
   232	    tmp_path = None
   233	    try:
   234	        response = urllib.request.urlopen(spec.url, timeout=60)  # noqa: S310 — URLs validated as HTTPS above
   235	
   236	        tmp_fd, tmp_path_str = tempfile.mkstemp(
   237	            dir=str(model_dir),
   238	            prefix=f".{model_name}_autodownload_",
   239	            suffix=".tmp",
   240	        )
   241	        tmp_path = Path(tmp_path_str)
   242	        os.chmod(tmp_path_str, 0o600)
   243	
   244	        with os.fdopen(tmp_fd, "wb") as f:
   245	            tmp_fd = None  # os.fdopen takes ownership
   246	            chunk_size = 65_536
   247	            while True:
   248	                chunk = response.read(chunk_size)
   249	                if not chunk:
   250	                    break
   251	                f.write(chunk)
   252	
   253	        # Atomic rename
   254	        tmp_path.replace(model_path)
   255	        tmp_path = None

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/models.py | sed -n '255,375p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/models.py | sed -n '455,515p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/backends/onnx_backend.py | sed -n '60,115p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/backends/__init__.py | sed -n '1,105p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1210ms:
    60	        input_feed: dict[str, np.ndarray],
    61	    ) -> list[np.ndarray]:
    62	        return self._session.run(output_names, input_feed)
    63	
    64	
    65	class OnnxBackend(InferenceBackend):
    66	    """Backend powered by ``onnxruntime``."""
    67	
    68	    def __init__(self, providers: list[str] | None = None) -> None:
    69	        # Import eagerly so ``get_backend("onnx")`` fails fast when
    70	        # onnxruntime is not installed.
    71	        import onnxruntime  # noqa: F401
    72	
    73	        self._providers = providers or ["CPUExecutionProvider"]
    74	
    75	    @property
    76	    def name(self) -> str:
    77	        return "onnx"
    78	
    79	    def load(self, model_path: str | Path, **kwargs: Any) -> OnnxSession:
    80	        import onnxruntime as ort
    81	
    82	        model_path = Path(model_path)
    83	        if not model_path.exists():
    84	            raise FileNotFoundError(f"Model file not found: {model_path}")
    85	
    86	        providers = kwargs.get("providers", self._providers)
    87	        try:
    88	            session = ort.InferenceSession(str(model_path), providers=providers)
    89	        except Exception as e:
    90	            raise ModelLoadError(f"ONNX Runtime failed to load {model_path}: {e}") from e
    91	
    92	        logger.debug("OnnxBackend loaded: %s", model_path)
    93	        return OnnxSession(session)
    94	
    95	    def is_available(self) -> bool:
    96	        try:
    97	            import onnxruntime  # noqa: F401
    98	
    99	            return True
   100	        except ImportError:
   101	            return False

 succeeded in 1247ms:
   255	        tmp_path = None
   256	
   257	    except Exception as e:
   258	        if tmp_fd is not None:
   259	            with contextlib.suppress(OSError):
   260	                os.close(tmp_fd)
   261	        if tmp_path is not None:
   262	            with contextlib.suppress(OSError):
   263	                tmp_path.unlink(missing_ok=True)
   264	        print(" FAILED", file=sys.stderr, flush=True)
   265	        raise RuntimeError(
   266	            f"Auto-download of model '{model_name}' failed: {e}. "
   267	            f"Download manually with: violawake-download --model {model_name}"
   268	        ) from e
   269	
   270	    print(" done.", file=sys.stderr, flush=True)
   271	
   272	    # Verify SHA-256 (placeholder hashes are blocked above, so this always runs)
   273	    _verify_sha256(model_path, spec.sha256, model_name)
   274	
   275	    # Size validation (same logic as download_model)
   276	    actual_size = model_path.stat().st_size
   277	    if spec.size_bytes and abs(actual_size - spec.size_bytes) > max(
   278	        1024, spec.size_bytes * SIZE_TOLERANCE_FRACTION
   279	    ):
   280	        model_path.unlink(missing_ok=True)
   281	        raise ValueError(
   282	            f"Size validation failed for auto-downloaded '{model_name}'. "
   283	            f"Expected ~{spec.size_bytes} bytes, got {actual_size} bytes. "
   284	            f"File deleted — retry or download manually."
   285	        )
   286	
   287	    logger.info(
   288	        "Auto-downloaded model '%s' to %s (%.1f MB)",
   289	        model_name,
   290	        model_path,
   291	        model_path.stat().st_size / 1e6,
   292	    )
   293	    return model_path
   294	
   295	
   296	def get_model_path(
   297	    model_name: str,
   298	    *,
   299	    auto_download: bool = True,
   300	    format: str | None = None,
   301	) -> Path:
   302	    """Return the local path for a model, optionally downloading on first use.
   303	
   304	    When the model is not cached and ``auto_download`` is True (the default),
   305	    the model is downloaded automatically from the registry.  Auto-download
   306	    can be disabled globally with the ``VIOLAWAKE_NO_AUTO_DOWNLOAD=1``
   307	    environment variable, or per-call with ``auto_download=False``.
   308	
   309	    Args:
   310	        model_name: Name from MODEL_REGISTRY (without extension).
   311	        auto_download: If True (default), download the model if not cached.
   312	            Set to False to get the old raise-on-missing behavior.
   313	        format: Desired model format (``"onnx"`` or ``"tflite"``).
   314	            When ``"tflite"`` is requested, the function first checks for
   315	            a ``<model_name>_tflite`` entry in the registry, then falls back
   316	            to looking for a locally converted ``.tflite`` file next to the
   317	            ONNX model.  Defaults to ``None`` (use the registry's native
   318	            format, typically ONNX).
   319	
   320	    Returns:
   321	        Path to the cached model file.
   322	
   323	    Raises:
   324	        FileNotFoundError: If model is not in cache and auto-download is
   325	            disabled or unavailable.
   326	        ModelNotFoundError: If model_name is not in MODEL_REGISTRY.
   327	    """
   328	    # Handle format-based lookup: resolve "temporal_cnn" + format="tflite"
   329	    # to the "temporal_cnn_tflite" registry entry if it exists.
   330	    effective_name = model_name
   331	    if format == "tflite" and not model_name.endswith("_tflite"):
   332	        tflite_name = f"{model_name}_tflite"
   333	        if tflite_name in MODEL_REGISTRY:
   334	            effective_name = tflite_name
   335	        else:
   336	            # No registry entry; check for a locally converted .tflite file
   337	            if model_name in MODEL_REGISTRY:
   338	                spec = MODEL_REGISTRY[model_name]
   339	                tflite_path = get_model_dir() / f"{spec.name}.tflite"
   340	                if tflite_path.exists():
   341	                    return tflite_path
   342	                raise FileNotFoundError(
   343	                    f"TFLite version of '{model_name}' not found at {tflite_path}. "
   344	                    f"Convert with: violawake-download --model {model_name} --format tflite"
   345	                )
   346	            raise ModelNotFoundError(f"Unknown model '{model_name}'. No TFLite variant found.")
   347	
   348	    if effective_name not in MODEL_REGISTRY:
   349	        available = ", ".join(MODEL_REGISTRY.keys())
   350	        raise ModelNotFoundError(f"Unknown model '{effective_name}'. Available: {available}")
   351	
   352	    if effective_name in _PACKAGE_MANAGED_MODELS:
   353	        raise FileNotFoundError(
   354	            f"Model '{effective_name}' is provided by the openwakeword package, not the ViolaWake "
   355	            "model cache. Install with: pip install openwakeword"
   356	        )
   357	
   358	    spec = MODEL_REGISTRY[effective_name]
   359	    # Determine file extension from URL
   360	    url_suffix = Path(spec.url).suffix
   361	    if not url_suffix:
   362	        logger.warning(
   363	            "URL for model '%s' has no file extension; defaulting to .onnx",
   364	            effective_name,
   365	        )
   366	    ext = url_suffix or ".onnx"
   367	    model_path = get_model_dir() / f"{spec.name}{ext}"
   368	
   369	    if not model_path.exists():
   370	        if auto_download and not _is_auto_download_disabled():
   371	            return _auto_download_model(effective_name, spec)
   372	
   373	        raise FileNotFoundError(
   374	            f"Model '{effective_name}' not found in cache at {model_path}. "
   375	            f"Run: violawake-download --model {effective_name}"

 succeeded in 1255ms:
     1	"""Inference backend abstraction for ViolaWake.
     2	
     3	Supports multiple inference runtimes (ONNX Runtime, TFLite) behind a
     4	unified interface.  The ``get_backend()`` factory selects a backend
     5	by name or auto-detects the best available runtime.
     6	
     7	Public API::
     8	
     9	    from violawake_sdk.backends import get_backend, InferenceBackend
    10	
    11	    backend = get_backend("auto")           # onnx > tflite
    12	    session = backend.load("model.onnx")    # returns a BackendSession
    13	    out = session.run(input_array)          # numpy in, numpy out
    14	"""
    15	
    16	from __future__ import annotations
    17	
    18	import logging
    19	from typing import TYPE_CHECKING
    20	
    21	if TYPE_CHECKING:
    22	    from violawake_sdk.backends.base import InferenceBackend
    23	
    24	logger = logging.getLogger(__name__)
    25	
    26	# Re-export public names
    27	from violawake_sdk.backends.base import BackendSession, InferenceBackend  # noqa: E402, F811
    28	
    29	
    30	def get_backend(
    31	    name: str = "auto",
    32	    providers: list[str] | None = None,
    33	) -> InferenceBackend:
    34	    """Return an inference backend instance by name.
    35	
    36	    Args:
    37	        name: Backend selector.  One of:
    38	            - ``"onnx"`` -- ONNX Runtime (requires ``onnxruntime``).
    39	            - ``"tflite"`` -- TFLite Runtime (requires ``tflite-runtime``
    40	              or ``tensorflow``).
    41	            - ``"auto"`` -- Try ONNX Runtime first, fall back to TFLite.
    42	        providers: ONNX Runtime execution providers (ignored for TFLite).
    43	            Default: ``["CPUExecutionProvider"]``.
    44	
    45	    Returns:
    46	        An ``InferenceBackend`` instance ready to load models.
    47	
    48	    Raises:
    49	        ImportError: If no suitable runtime is installed.
    50	        ValueError: If *name* is not a recognised backend.
    51	    """
    52	    if name == "onnx":
    53	        return _make_onnx(providers)
    54	    if name == "tflite":
    55	        return _make_tflite()
    56	    if name == "auto":
    57	        return _auto_select(providers)
    58	    raise ValueError(f"Unknown backend {name!r}.  Choose from: 'onnx', 'tflite', 'auto'.")
    59	
    60	
    61	def _make_onnx(providers: list[str] | None = None) -> InferenceBackend:
    62	    from violawake_sdk.backends.onnx_backend import OnnxBackend
    63	
    64	    return OnnxBackend(providers=providers or ["CPUExecutionProvider"])
    65	
    66	
    67	def _make_tflite() -> InferenceBackend:
    68	    from violawake_sdk.backends.tflite_backend import TFLiteBackend
    69	
    70	    return TFLiteBackend()
    71	
    72	
    73	def _auto_select(providers: list[str] | None = None) -> InferenceBackend:
    74	    """Try ONNX Runtime first, then TFLite, raising if neither works."""
    75	    try:
    76	        backend = _make_onnx(providers)
    77	        logger.debug("Auto-selected ONNX Runtime backend")
    78	        return backend
    79	    except ImportError:
    80	        pass
    81	
    82	    try:
    83	        backend = _make_tflite()
    84	        logger.debug("Auto-selected TFLite backend")
    85	        return backend
    86	    except ImportError:
    87	        pass
    88	
    89	    raise ImportError(
    90	        "No inference backend available. Install onnxruntime: pip install "
    91	        "violawake  OR  tflite-runtime: pip install violawake[tflite]"
    92	    )
    93	
    94	
    95	__all__ = [
    96	    "BackendSession",
    97	    "InferenceBackend",
    98	    "get_backend",
    99	]

 succeeded in 1292ms:
   455	        )
   456	    ext = url_suffix or ".onnx"
   457	    model_path = get_model_dir() / f"{spec.name}{ext}"
   458	
   459	    if model_path.exists() and not force:
   460	        logger.info("Model already cached: %s", model_path)
   461	        if verify:
   462	            _verify_sha256(model_path, spec.sha256, model_name)
   463	        return model_path
   464	
   465	    logger.info("Downloading model '%s' from %s", model_name, spec.url)
   466	
   467	    # G3: Atomic write — download to a temp file, rename on success.
   468	    # This prevents partial/corrupt files from being left in the cache
   469	    # if the download is interrupted or fails mid-stream.
   470	    import tempfile
   471	
   472	    # Certificate pinning available but requires real SPKI pins. See security/cert_pinning.py
   473	    if use_pinning:
   474	        from violawake_sdk.security.cert_pinning import pinned_download
   475	
   476	        response = pinned_download(
   477	            spec.url,
   478	            model_path,
   479	            verify_pin=True,
   480	            timeout=30,
   481	        )
   482	    else:
   483	        response = requests.get(spec.url, stream=True, timeout=30)
   484	    response.raise_for_status()
   485	
   486	    total_bytes = int(response.headers.get("content-length", spec.size_bytes))
   487	    chunk_size = 8_192
   488	
   489	    tmp_fd = None
   490	    tmp_path = None
   491	    try:
   492	        # Create temp file in same directory as dest (ensures same filesystem for rename)
   493	        # Use mode 0o600 to prevent world-readable temp files
   494	        tmp_fd, tmp_path_str = tempfile.mkstemp(
   495	            dir=str(model_path.parent),
   496	            prefix=f".{model_name}_download_",
   497	            suffix=".tmp",
   498	        )
   499	        tmp_path = Path(tmp_path_str)
   500	        os.chmod(tmp_path_str, 0o600)
   501	
   502	        with (
   503	            os.fdopen(tmp_fd, "wb") as f,
   504	            tqdm(
   505	                total=total_bytes,
   506	                unit="B",
   507	                unit_scale=True,
   508	                desc=f"Downloading {model_name}",
   509	            ) as progress,
   510	        ):
   511	            tmp_fd = None  # os.fdopen takes ownership of the fd
   512	            for chunk in response.iter_content(chunk_size=chunk_size):
   513	                if chunk:
   514	                    f.write(chunk)
   515	                    progress.update(len(chunk))

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/backends/tflite_backend.py | sed -n '95,140p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/backends/tflite_backend.py | sed -n '298,340p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/wake_detector.py | sed -n '300,365p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/wake_detector.py | sed -n '560,630p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1187ms:
   298	    """Backend powered by ``tflite-runtime`` (or ``tensorflow.lite``)."""
   299	
   300	    def __init__(self) -> None:
   301	        # Fail fast if the runtime is missing
   302	        self._interpreter_cls = _get_tflite_interpreter_class()
   303	
   304	    @property
   305	    def name(self) -> str:
   306	        return "tflite"
   307	
   308	    def load(self, model_path: str | Path, **kwargs: Any) -> TFLiteSession:
   309	        model_path = Path(model_path)
   310	        if not model_path.exists():
   311	            raise FileNotFoundError(f"Model file not found: {model_path}")
   312	
   313	        num_threads: int = kwargs.get("num_threads", 2)
   314	        try:
   315	            interpreter = self._interpreter_cls(
   316	                model_path=str(model_path),
   317	                num_threads=num_threads,
   318	            )
   319	            interpreter.allocate_tensors()
   320	        except Exception as e:
   321	            raise ModelLoadError(f"TFLite failed to load {model_path}: {e}") from e
   322	
   323	        logger.debug("TFLiteBackend loaded: %s (threads=%d)", model_path, num_threads)
   324	        return TFLiteSession(interpreter, model_path)
   325	
   326	    def is_available(self) -> bool:
   327	        try:
   328	            _get_tflite_interpreter_class()
   329	            return True
   330	        except ImportError:
   331	            return False
   332	
   333	
   334	# ──────────────────────────────────────────────────────────────────────────────
   335	# Model conversion utility:  ONNX -> TFLite
   336	# ──────────────────────────────────────────────────────────────────────────────
   337	
   338	
   339	def convert_onnx_to_tflite(
   340	    onnx_path: str | Path,

 succeeded in 1203ms:
    95	    )
    96	    if np_dt is None:
    97	        np_dt = np.dtype(np.float32)  # safe fallback
    98	    return np_dt, type_str
    99	
   100	
   101	def _get_tflite_interpreter_class() -> type:
   102	    """Import and return the TFLite ``Interpreter`` class.
   103	
   104	    Tries ``tflite_runtime`` first (lightweight, ~5 MB), then falls
   105	    back to the full ``tensorflow`` package.
   106	
   107	    Raises:
   108	        ImportError: If neither package is installed.
   109	    """
   110	    try:
   111	        from tflite_runtime.interpreter import Interpreter
   112	
   113	        return Interpreter
   114	    except ImportError:
   115	        pass
   116	
   117	    try:
   118	        from tensorflow.lite.python.interpreter import Interpreter
   119	
   120	        return Interpreter
   121	    except ImportError:
   122	        pass
   123	
   124	    raise ImportError(
   125	        "TFLite backend runtime is not available. Install one of:\n"
   126	        "  pip install violawake[tflite]\n"
   127	        "  pip install tflite-runtime>=2.14"
   128	    )
   129	
   130	
   131	# ──────────────────────────────────────────────────────────────────────────────
   132	# TFLiteSession
   133	# ──────────────────────────────────────────────────────────────────────────────
   134	
   135	
   136	class TFLiteSession(BackendSession):
   137	    """Wraps a TFLite ``Interpreter`` with the same API as ``OnnxSession``.
   138	
   139	    TFLite models do not carry tensor *names* (they use integer indices),
   140	    so this class synthesises names of the form ``input_0``, ``output_0``

 succeeded in 1170ms:
   300	    Args:
   301	        model: Model name from the registry, or a path to a model file.
   302	        threshold: Detection confidence threshold in [0.0, 1.0].
   303	        cooldown_s: Minimum seconds between consecutive detections.
   304	        providers: ONNX Runtime execution providers (ignored for TFLite).
   305	        backend: Inference backend selector (``"onnx"``, ``"tflite"``, ``"auto"``).
   306	        config: A ``DetectorConfig`` instance bundling all advanced options.
   307	            Mutually exclusive with the individual advanced kwargs below.
   308	        models: Additional model paths for ensemble scoring (K3).
   309	        fusion_strategy: Score fusion strategy for ensemble (K3).
   310	        fusion_weights: Per-model weights for weighted_average fusion (K3).
   311	        adaptive_threshold: Enable dynamic threshold based on noise (K4).
   312	        noise_profiler: Custom NoiseProfiler instance (K4).
   313	        speaker_verify_fn: Post-detection speaker verification callback (K5).
   314	        power_manager: Power management controller for duty cycling (K7).
   315	        confirm_count: Consecutive above-threshold scores required for detection (K2).
   316	        score_history_size: Number of recent scores to retain (K2).
   317	    """
   318	
   319	    _VALID_BACKENDS = ("onnx", "tflite", "auto")
   320	
   321	    def __init__(
   322	        self,
   323	        model: str = "temporal_cnn",
   324	        threshold: float = DEFAULT_THRESHOLD,
   325	        cooldown_s: float = DEFAULT_COOLDOWN_S,
   326	        providers: list[str] | None = None,
   327	        backend: str = "auto",
   328	        *,
   329	        config: DetectorConfig | None = None,
   330	        # K3: Multi-model ensemble (individual kwargs, backwards compat)
   331	        models: list[str] | None = _UNSET,
   332	        fusion_strategy: FusionStrategy | str = _UNSET,
   333	        fusion_weights: list[float] | None = _UNSET,
   334	        # K4: Adaptive threshold
   335	        adaptive_threshold: bool = _UNSET,
   336	        noise_profiler: NoiseProfiler | None = _UNSET,
   337	        # K5: Speaker verification
   338	        speaker_verify_fn: Callable[[np.ndarray], bool] | None = _UNSET,
   339	        # K7: Power management
   340	        power_manager: PowerManager | None = _UNSET,
   341	        # K2: Confidence tracking
   342	        confirm_count: int = _UNSET,
   343	        score_history_size: int = _UNSET,
   344	    ) -> None:
   345	        # --- Resolve config vs individual kwargs -------------------------
   346	        # Detect if any advanced kwarg was explicitly passed (not _UNSET)
   347	        _locals = {
   348	            "models": models,
   349	            "fusion_strategy": fusion_strategy,
   350	            "fusion_weights": fusion_weights,
   351	            "adaptive_threshold": adaptive_threshold,
   352	            "noise_profiler": noise_profiler,
   353	            "speaker_verify_fn": speaker_verify_fn,
   354	            "power_manager": power_manager,
   355	            "confirm_count": confirm_count,
   356	            "score_history_size": score_history_size,
   357	        }
   358	        explicit_kwargs = {name for name, val in _locals.items() if val is not _UNSET}
   359	        if config is not None and explicit_kwargs:
   360	            raise ValueError(
   361	                f"Cannot specify both config= and individual advanced kwargs. "
   362	                f"Conflicting kwargs: {sorted(explicit_kwargs)}. "
   363	                f"Either use config=DetectorConfig(...) or pass kwargs directly, not both."
   364	            )
   365	

 succeeded in 1185ms:
   560	
   561	        try:
   562	            current_hashes = get_openwakeword_backbone_hashes("onnx")
   563	        except Exception:
   564	            return
   565	
   566	        if (
   567	            current_hashes["oww_mel_sha256"] != expected_mel
   568	            or current_hashes["oww_emb_sha256"] != expected_emb
   569	        ):
   570	            logger.warning(
   571	                "OWW backbone version changed since training. Model may produce degraded results."
   572	            )
   573	
   574	    def _load_session(self, model: str) -> BackendSession:
   575	        """Load a model file via the configured backend.
   576	
   577	        Resolves *model* to a file path (direct path, .onnx/.tflite suffix,
   578	        or registry lookup), then delegates to ``self._backend.load()``.
   579	
   580	        For TFLite backends, if only a ``.onnx`` file exists in the cache
   581	        the method looks for a sibling ``.tflite`` file with the same stem.
   582	        """
   583	        model_path = self._resolve_model_path(model)
   584	
   585	        # When using the TFLite backend, prefer a .tflite sibling if the
   586	        # resolved path is an .onnx file.
   587	        if self._backend.name == "tflite" and model_path.suffix == ".onnx":
   588	            tflite_sibling = model_path.with_suffix(".tflite")
   589	            if tflite_sibling.exists():
   590	                model_path = tflite_sibling
   591	                logger.debug("TFLite backend: using .tflite sibling %s", model_path)
   592	            else:
   593	                logger.warning(
   594	                    "TFLite backend selected but only .onnx file found at %s. "
   595	                    "Convert with: python -c "
   596	                    '"from violawake_sdk.backends.tflite_backend import '
   597	                    "convert_onnx_to_tflite; convert_onnx_to_tflite('%s')\"",
   598	                    model_path,
   599	                    model_path,
   600	                )
   601	
   602	        try:
   603	            session = self._backend.load(model_path)
   604	        except Exception as e:
   605	            raise ModelLoadError(f"Failed to load model {model_path}: {e}") from e
   606	        logger.debug("Loaded model via %s backend: %s", self._backend.name, model_path)
   607	        return session
   608	
   609	    @staticmethod
   610	    def _resolve_model_path(model: str) -> Path:
   611	        """Resolve a model name or path string to a concrete file path.
   612	
   613	        Resolution order:
   614	        1. If *model* is an existing file path, use it directly.
   615	        2. If *model* ends with ``.onnx`` or ``.tflite``, treat as a path
   616	           (raise if not found).
   617	        3. Otherwise, look up *model* in the model registry / cache.
   618	        """
   619	        if Path(model).is_file():
   620	            return Path(model)
   621	
   622	        if model.endswith((".onnx", ".tflite")):
   623	            path = Path(model)
   624	            if not path.exists():
   625	                raise ModelNotFoundError(
   626	                    f"Model file not found: {model}. "
   627	                    f"If this is a named model, omit the file extension."
   628	                )
   629	            return path
   630	

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/wake_detector.py | sed -n '365,445p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/wake_detector.py | sed -n '445,560p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/wake_detector.py | sed -n '630,660p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'get_backend\\(|backend=\\\"auto\\\"|backend: str = \\\"auto\\\"' src/violawake_sdk console/backend | head -120\"" in J:\CLAUDE\PROJECTS\Wakeword
2026-05-07T11:32:05.137835Z ERROR codex_core::tools::router: error=Exit code: 1
Wall time: 1 seconds
Output:
str: -c: line 1: unexpected EOF while looking for matching `''

 succeeded in 1015ms:
   445	
   446	        # K5: Speaker verification
   447	        self._speaker_verify_fn = speaker_verify_fn
   448	
   449	        # K7: Power manager
   450	        self._power_manager = power_manager
   451	
   452	        # Warn on deprecated models
   453	        if model in MODEL_REGISTRY and "DEPRECATED" in MODEL_REGISTRY[model].description:
   454	            import warnings
   455	
   456	            warnings.warn(
   457	                f"Model '{model}' is deprecated: {MODEL_REGISTRY[model].description}. "
   458	                f"Use model='temporal_cnn' instead.",
   459	                DeprecationWarning,
   460	                stacklevel=2,
   461	            )
   462	
   463	        # Load models
   464	        self._oww_backbone = self._create_oww_backbone()
   465	        self._mlp_session = self._load_session(model)
   466	        self._mlp_input_name = self._mlp_session.get_inputs()[0].name
   467	        self._last_score = 0.0
   468	
   469	        # Detect temporal vs MLP model from input shape
   470	        mlp_input_shape = self._mlp_session.get_inputs()[0].shape
   471	        if len(mlp_input_shape) == 3:
   472	            # Temporal model: input is (batch, seq_len, embedding_dim)
   473	            self._is_temporal = True
   474	            self._temporal_seq_len = (
   475	                mlp_input_shape[1]
   476	                if isinstance(mlp_input_shape[1], int)
   477	                else _TEMPORAL_SEQ_LEN_DEFAULT
   478	            )
   479	            self._embedding_buffer: collections.deque[np.ndarray] = collections.deque(
   480	                maxlen=self._temporal_seq_len,
   481	            )
   482	            logger.info(
   483	                "Temporal model detected: seq_len=%d",
   484	                self._temporal_seq_len,
   485	            )
   486	        else:
   487	            self._is_temporal = False
   488	            self._temporal_seq_len = 0
   489	
   490	        # K3: Load additional ensemble models
   491	        if models and self._ensemble is not None:
   492	            # Add primary model to ensemble
   493	            self._ensemble.add_session(self._mlp_session, self._mlp_input_name)
   494	            for extra_model in models:
   495	                extra_session = self._load_session(extra_model)
   496	                extra_input_name = extra_session.get_inputs()[0].name
   497	                self._ensemble.add_session(extra_session, extra_input_name)
   498	
   499	        logger.info(
   500	            "WakeDetector initialized: model=%s, threshold=%.2f, backend=%s",
   501	            model,
   502	            threshold,
   503	            self._backend.name,
   504	        )
   505	        self._warn_on_oww_backbone_change(self._resolve_model_path(model))
   506	
   507	    # ------------------------------------------------------------------
   508	    # Context manager support
   509	    # ------------------------------------------------------------------
   510	
   511	    def __enter__(self) -> WakeDetector:
   512	        """Enter sync context manager. Returns self."""
   513	        return self
   514	
   515	    def __exit__(
   516	        self,
   517	        exc_type: type[BaseException] | None,
   518	        exc_val: BaseException | None,
   519	        exc_tb: object,
   520	    ) -> None:
   521	        """Exit sync context manager. Releases sessions and resets state."""
   522	        self.close()
   523	
   524	    def close(self) -> None:
   525	        """Release inference sessions and reset internal state.
   526	
   527	        After calling close(), the detector should not be used for inference.
   528	        This is called automatically when using WakeDetector as a context
   529	        manager.
   530	        """
   531	        self.reset()
   532	        # Release inference session references so the underlying runtime
   533	        # (ONNX / TFLite) can free memory immediately rather than waiting
   534	        # for garbage collection.
   535	        self._mlp_session = None  # type: ignore[assignment]
   536	        if self._ensemble is not None:
   537	            self._ensemble.clear()
   538	        self._oww_backbone = None  # type: ignore[assignment]
   539	
   540	    def _create_oww_backbone(self) -> OpenWakeWordBackbone:
   541	        """Create the shared OpenWakeWord backbone."""
   542	        return OpenWakeWordBackbone(self._backend)
   543	
   544	    def _warn_on_oww_backbone_change(self, model_path: Path) -> None:
   545	        """Warn when the installed OWW backbone differs from the training config."""
   546	        config_path = model_path.with_suffix(".config.json")
   547	        if not config_path.exists():
   548	            return
   549	
   550	        try:
   551	            with config_path.open(encoding="utf-8") as f:
   552	                config = json.load(f)
   553	        except (OSError, json.JSONDecodeError):
   554	            return
   555	
   556	        expected_mel = config.get("oww_mel_sha256")
   557	        expected_emb = config.get("oww_emb_sha256")
   558	        if not isinstance(expected_mel, str) or not isinstance(expected_emb, str):
   559	            return
   560	

 succeeded in 1015ms:
   365	
   366	        if config is not None:
   367	            # Unpack from DetectorConfig
   368	            models = config.models
   369	            fusion_strategy = config.fusion_strategy
   370	            fusion_weights = config.fusion_weights
   371	            adaptive_threshold = config.adaptive_threshold
   372	            noise_profiler = config.noise_profiler
   373	            speaker_verify_fn = config.speaker_verify_fn
   374	            power_manager = config.power_manager
   375	            confirm_count = config.confirm_count
   376	            score_history_size = config.score_history_size
   377	        else:
   378	            # Apply defaults for any _UNSET values (backwards compat path)
   379	            if models is _UNSET:
   380	                models = None
   381	            if fusion_strategy is _UNSET:
   382	                fusion_strategy = FusionStrategy.AVERAGE
   383	            if fusion_weights is _UNSET:
   384	                fusion_weights = None
   385	            if adaptive_threshold is _UNSET:
   386	                adaptive_threshold = False
   387	            if noise_profiler is _UNSET:
   388	                noise_profiler = None
   389	            if speaker_verify_fn is _UNSET:
   390	                speaker_verify_fn = None
   391	            if power_manager is _UNSET:
   392	                power_manager = None
   393	            if confirm_count is _UNSET:
   394	                confirm_count = 1
   395	            if score_history_size is _UNSET:
   396	                score_history_size = 50
   397	
   398	        # G1: Input validation for public constructor parameters
   399	        if not isinstance(threshold, (int, float)):
   400	            raise TypeError(f"threshold must be a number, got {type(threshold).__name__}")
   401	        if not 0.0 <= threshold <= 1.0:
   402	            raise ValueError(f"threshold must be in [0.0, 1.0], got {threshold!r}")
   403	        if not isinstance(cooldown_s, (int, float)):
   404	            raise TypeError(f"cooldown_s must be a number, got {type(cooldown_s).__name__}")
   405	        if cooldown_s < 0:
   406	            raise ValueError(f"cooldown_s must be >= 0, got {cooldown_s!r}")
   407	        if backend not in self._VALID_BACKENDS:
   408	            raise ValueError(f"backend must be one of {self._VALID_BACKENDS}, got {backend!r}")
   409	        if confirm_count < 1:
   410	            raise ValueError(f"confirm_count must be >= 1, got {confirm_count}")
   411	
   412	        self.threshold = threshold
   413	        self._lock = threading.Lock()
   414	        self._backbone_lock = threading.Lock()
   415	        self._policy = WakeDecisionPolicy(threshold=threshold, cooldown_s=cooldown_s)
   416	        self._providers = providers or ["CPUExecutionProvider"]
   417	        self._backend: InferenceBackend = get_backend(backend, providers=self._providers)
   418	
   419	        # K2: Confidence tracking
   420	        self._score_tracker = ScoreTracker(
   421	            threshold=threshold,
   422	            history_size=score_history_size,
   423	        )
   424	        self._confirm_required = confirm_count
   425	        self._confirm_counter = 0
   426	
   427	        # K3: Ensemble support
   428	        self._ensemble: EnsembleScorer | None = None
   429	        if models and len(models) > 0:
   430	            if isinstance(fusion_strategy, str):
   431	                fusion_strategy = FusionStrategy(fusion_strategy)
   432	            self._ensemble = EnsembleScorer(
   433	                strategy=fusion_strategy,
   434	                weights=fusion_weights,
   435	            )
   436	
   437	        # K4: Noise profiler / adaptive threshold
   438	        self._adaptive_threshold = adaptive_threshold
   439	        if noise_profiler is not None:
   440	            self._noise_profiler: NoiseProfiler | None = noise_profiler
   441	        elif adaptive_threshold:
   442	            self._noise_profiler = NoiseProfiler(base_threshold=threshold)
   443	        else:
   444	            self._noise_profiler = None
   445	

 succeeded in 1058ms:
   630	
   631	        try:
   632	            return get_model_path(model)
   633	        except FileNotFoundError as e:
   634	            raise ModelNotFoundError(
   635	                f"Model '{model}' not found in cache and auto-download failed or is disabled. "
   636	                f"Run: violawake-download --model {model}"
   637	            ) from e
   638	
   639	    def _get_embedding(self, audio_frame: bytes | np.ndarray) -> np.ndarray:
   640	        """Extract the OWW embedding from an audio frame.
   641	
   642	        Returns the raw embedding vector before MLP scoring.
   643	        Used internally for speaker verification (K5).
   644	        """
   645	        with self._backbone_lock:
   646	            embedding = self._oww_backbone.last_embedding
   647	            if embedding is None:
   648	                _, embedding = self._oww_backbone.push_audio(audio_frame)
   649	        if embedding is None:
   650	            return np.zeros(EMBEDDING_DIM, dtype=np.float32)
   651	        return embedding
   652	
   653	    @staticmethod
   654	    def _needs_int16_normalization(audio_frame: bytes | np.ndarray) -> bool:
   655	        """Check whether audio_frame requires int16-to-float normalization."""
   656	        return isinstance(audio_frame, bytes) or (
   657	            isinstance(audio_frame, np.ndarray) and audio_frame.dtype == np.int16
   658	        )
   659	
   660	    @staticmethod

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/database.py | sed -n '1,220p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '140,330p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '660,825p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '1140,1210p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1332ms:
   660	                logger.warning("Skipping job %s because user %s queue is paused", job_id, job.user_id)
   661	                return
   662	            if breaker.next_attempt_at is not None and breaker.next_attempt_at > now:
   663	                delay = (breaker.next_attempt_at - now).total_seconds()
   664	                self._schedule_retry_fill(job.user_id, delay)
   665	                logger.info(
   666	                    "Delaying job %s for user %s due to failure backoff (%ss)",
   667	                    job_id,
   668	                    job.user_id,
   669	                    round(delay, 2),
   670	                )
   671	                return
   672	
   673	            cancel_event = threading.Event()
   674	            async with self._state_lock:
   675	                self._cancel_events[job_id] = cancel_event
   676	
   677	            await self._update_job(
   678	                job_id,
   679	                status=JobStatus.RUNNING,
   680	                started_at=now,
   681	                error=None,
   682	            )
   683	            await self._publish(
   684	                job_id,
   685	                {
   686	                    "status": JobStatus.RUNNING.value,
   687	                    "progress": job.progress_pct,
   688	                    "epoch": 0,
   689	                    "total_epochs": job.epochs,
   690	                    "train_loss": 0.0,
   691	                    "val_loss": 0.0,
   692	                    "message": "Training started.",
   693	                    "error": None,
   694	                    "d_prime": job.d_prime,
   695	                    "model_id": job.model_id,
   696	                    "queue_position": None,
   697	                },
   698	            )
   699	
   700	            recording_paths = await self._load_recording_paths(job.user_id, job.recording_ids)
   701	            if len(recording_paths) < 5:
   702	                raise RuntimeError(f"No valid recordings found for training job {job_id}")
   703	
   704	            # Resolve negatives corpus for paid tiers
   705	            negatives_dir = await self._resolve_negatives_dir(job.user_id)
   706	
   707	            output_dir = Path(tempfile.mkdtemp(prefix=f"violawake_job_{job.id}_", dir=str(settings.tmp_dir)))
   708	            output_path = output_dir / f"{job.wake_word}_{job.id}_{int(now.timestamp())}.onnx"
   709	
   710	            loop = asyncio.get_running_loop()
   711	
   712	            def _on_progress(event: dict[str, Any]) -> None:
   713	                future = asyncio.run_coroutine_threadsafe(
   714	                    self._handle_progress_event(job_id, job.epochs, event),
   715	                    loop,
   716	                )
   717	                future.result(timeout=10)
   718	
   719	            artifact = await asyncio.to_thread(
   720	                run_training_job_sync,
   721	                job_id=job.id,
   722	                wake_word=job.wake_word,
   723	                recording_identifiers=recording_paths,
   724	                output_path=output_path,
   725	                epochs=job.epochs,
   726	                timeout_seconds=settings.training_timeout,
   727	                progress_callback=_on_progress,
   728	                is_cancelled=cancel_event.is_set,
   729	                negatives_dir=negatives_dir,
   730	            )
   731	
   732	            storage = get_storage()
   733	            model_key = build_model_key(job.user_id, artifact.local_path.name)
   734	            storage.upload(
   735	                model_key,
   736	                artifact.local_path.read_bytes(),
   737	                "application/octet-stream",
   738	            )
   739	            if artifact.config_bytes is not None:
   740	                storage.upload(
   741	                    build_companion_config_identifier(model_key),
   742	                    artifact.config_bytes,
   743	                    "application/json",
   744	                )
   745	
   746	            model_id = await self._create_model_record(
   747	                user_id=job.user_id,
   748	                wake_word=job.wake_word,
   749	                file_path=model_key,
   750	                config_json=artifact.config_json,
   751	                d_prime=artifact.d_prime,
   752	                size_bytes=artifact.size_bytes,
   753	            )
   754	
   755	            completed_at = _utcnow()
   756	            await self._update_job(
   757	                job_id,
   758	                status=JobStatus.COMPLETED,
   759	                progress_pct=100.0,
   760	                completed_at=completed_at,
   761	                error=None,
   762	                model_id=model_id,
   763	                d_prime=artifact.d_prime,
   764	            )
   765	            await self._record_success(job.user_id)
   766	
   767	            # Schedule post-training recording deletion (privacy: recordings
   768	            # are deleted after training per the privacy FAQ).
   769	            await self._schedule_recording_cleanup(job.recording_ids)
   770	
   771	            await self._publish(
   772	                job_id,
   773	                {
   774	                    "status": JobStatus.COMPLETED.value,
   775	                    "progress": 100.0,
   776	                    "epoch": job.epochs,
   777	                    "total_epochs": job.epochs,
   778	                    "train_loss": 0.0,
   779	                    "val_loss": 0.0,
   780	                    "message": "Training complete.",
   781	                    "error": None,
   782	                    "d_prime": artifact.d_prime,
   783	                    "model_id": model_id,
   784	                    "queue_position": None,
   785	                },
   786	            )
   787	            logger.info("Training job %s completed for user %s", job_id, job.user_id)
   788	
   789	            # Best-effort training-complete email notification.
   790	            try:
   791	                from app.email_service import get_email_service
   792	
   793	                email_svc = get_email_service()
   794	                if email_svc.enabled:
   795	                    async with async_session_factory() as session:
   796	                        user = await session.get(User, job.user_id)
   797	                    if user is not None:
   798	                        download_url = f"/models/{model_id}/download"
   799	                        await email_svc.send_training_complete(
   800	                            to=user.email,
   801	                            model_name=job.wake_word,
   802	                            download_url=download_url,
   803	                        )
   804	            except Exception as email_exc:
   805	                log_exception(logger, email_exc, message="Training-complete email failed", source="email")
   806	
   807	        except TrainingCancelledError as exc:
   808	            current_job = await self.get_job(job_id)
   809	            completed_at = _utcnow()
   810	            progress_pct = current_job.progress_pct if current_job is not None else 0.0
   811	            await self._update_job(
   812	                job_id,
   813	                status=JobStatus.CANCELLED,
   814	                completed_at=completed_at,
   815	                error=str(exc),
   816	            )
   817	            await self._publish(
   818	                job_id,
   819	                {
   820	                    "status": JobStatus.CANCELLED.value,
   821	                    "progress": progress_pct,
   822	                    "epoch": 0,
   823	                    "total_epochs": current_job.epochs if current_job is not None else 0,
   824	                    "train_loss": 0.0,
   825	                    "val_loss": 0.0,

 succeeded in 1334ms:
   140	
   141	class JobQueue:
   142	    """Persistent async training job queue."""
   143	
   144	    def __init__(
   145	        self,
   146	        *,
   147	        db_path: Path | None = None,
   148	        max_concurrent: int = 2,
   149	        max_pending: int = QUEUE_MAX_SIZE,
   150	    ) -> None:
   151	        self._db_path = db_path or (settings.data_dir / "job_queue.db")
   152	        self._queue: asyncio.Queue[int] = asyncio.Queue(maxsize=max_pending)
   153	        self._semaphore = asyncio.Semaphore(max_concurrent)
   154	        self._subscribers: dict[int, list[asyncio.Queue[dict[str, Any]]]] = {}
   155	        self._queued_job_ids: set[int] = set()
   156	        self._running_job_ids: set[int] = set()
   157	        self._cancel_events: dict[int, threading.Event] = {}
   158	        self._inflight_tasks: set[asyncio.Task[None]] = set()
   159	        self._retry_tasks: dict[int, asyncio.Task[None]] = {}
   160	        self._submission_lock = asyncio.Lock()
   161	        self._state_lock = asyncio.Lock()
   162	        self._refill_lock = asyncio.Lock()
   163	        self._worker_task: asyncio.Task[None] | None = None
   164	        self._closed = False
   165	
   166	    async def start(self) -> None:
   167	        """Initialize persistence and start the dispatcher loop."""
   168	        await self._initialize_db()
   169	        await self._resume_jobs()
   170	        self._worker_task = asyncio.create_task(self._worker_loop(), name="job-queue-worker")
   171	        await self._fill_queue_from_db()
   172	        logger.info("Job queue started with max_concurrent=%s", settings.max_concurrent_jobs)
   173	
   174	    async def shutdown(self) -> None:
   175	        """Stop the dispatcher loop and cancel outstanding retry timers."""
   176	        self._closed = True
   177	        if self._worker_task is not None:
   178	            self._worker_task.cancel()
   179	            with suppress(asyncio.CancelledError):
   180	                await self._worker_task
   181	            self._worker_task = None
   182	
   183	        for cancel_event in list(self._cancel_events.values()):
   184	            cancel_event.set()
   185	
   186	        if self._inflight_tasks:
   187	            done, pending = await asyncio.wait(list(self._inflight_tasks), timeout=15)
   188	            for task in pending:
   189	                task.cancel()
   190	            for task in done:
   191	                with suppress(asyncio.CancelledError):
   192	                    await task
   193	
   194	        for task in list(self._retry_tasks.values()):
   195	            task.cancel()
   196	        for task in list(self._retry_tasks.values()):
   197	            with suppress(asyncio.CancelledError):
   198	                await task
   199	        self._retry_tasks.clear()
   200	
   201	    async def submit_job(
   202	        self,
   203	        *,
   204	        user_id: int,
   205	        wake_word: str,
   206	        recording_ids: list[int],
   207	        epochs: int,
   208	        priority: int | None = None,
   209	    ) -> int:
   210	        """Persist a new training job and enqueue it when capacity allows.
   211	
   212	        When *priority* is not supplied it is resolved automatically from the
   213	        user's subscription tier (free=0, developer=5, business=10).
   214	        """
   215	        if await self._pending_count() >= self._queue.maxsize:
   216	            raise QueueFullError("Training queue is full. Please try again later.")
   217	
   218	        if priority is None:
   219	            priority = await _resolve_user_priority(user_id)
   220	
   221	        created_at = _utcnow()
   222	        payload = json.dumps(recording_ids)
   223	
   224	        async with self._submission_lock:
   225	            async with self._connect() as conn:
   226	                async with conn.execute(
   227	                    """
   228	                    SELECT COUNT(*) AS count
   229	                    FROM jobs
   230	                    WHERE user_id = ? AND status IN (?, ?)
   231	                    """,
   232	                    (
   233	                        user_id,
   234	                        JobStatus.PENDING.value,
   235	                        JobStatus.RUNNING.value,
   236	                    ),
   237	                ) as cursor:
   238	                    row = await cursor.fetchone()
   239	                active_job_count = int(row["count"]) if row is not None else 0
   240	                if active_job_count >= PER_USER_MAX_PENDING:
   241	                    raise TooManyPendingJobsError(
   242	                        "Too many pending jobs. Wait for current jobs to complete."
   243	                    )
   244	
   245	                cursor = await conn.execute(
   246	                    """
   247	                    INSERT INTO jobs (
   248	                        user_id,
   249	                        wake_word,
   250	                        status,
   251	                        created_at,
   252	                        started_at,
   253	                        completed_at,
   254	                        error,
   255	                        progress_pct,
   256	                        recording_ids,
   257	                        epochs,
   258	                        model_id,
   259	                        d_prime,
   260	                        priority
   261	                    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
   262	                    """,
   263	                    (
   264	                        user_id,
   265	                        wake_word,
   266	                        JobStatus.PENDING.value,
   267	                        _serialize_datetime(created_at),
   268	                        None,
   269	                        None,
   270	                        None,
   271	                        0.0,
   272	                        payload,
   273	                        epochs,
   274	                        None,
   275	                        None,
   276	                        priority,
   277	                    ),
   278	                )
   279	                await conn.commit()
   280	                job_id = int(cursor.lastrowid)
   281	
   282	        logger.info(
   283	            "Queued training job %s for user %s (priority=%s)",
   284	            job_id,
   285	            user_id,
   286	            priority,
   287	        )
   288	        await self._fill_queue_from_db()
   289	        # Publish an initial PENDING event so SSE subscribers immediately see
   290	        # their queue position after submission.
   291	        queue_position = await self._queue_position(job_id)
   292	        await self._publish(
   293	            job_id,
   294	            {
   295	                "status": JobStatus.PENDING.value,
   296	                "progress": 0.0,
   297	                "epoch": 0,
   298	                "total_epochs": epochs,
   299	                "train_loss": 0.0,
   300	                "val_loss": 0.0,
   301	                "message": "Queued for training.",
   302	                "error": None,
   303	                "d_prime": None,
   304	                "model_id": None,
   305	                "queue_position": queue_position,
   306	            },
   307	        )
   308	        return job_id
   309	
   310	    async def cancel_job(self, job_id: int) -> bool:
   311	        """Cancel a pending or running job."""
   312	        job = await self.get_job(job_id)
   313	        if job is None:
   314	            return False
   315	        if job.status in {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED}:
   316	            return False
   317	
   318	        if job.status is JobStatus.RUNNING:
   319	            async with self._state_lock:
   320	                cancel_event = self._cancel_events.get(job_id)
   321	            if cancel_event is not None:
   322	                cancel_event.set()
   323	            logger.info("Cancellation requested for running job %s", job_id)
   324	            return True
   325	
   326	        completed_at = _utcnow()
   327	        await self._update_job(
   328	            job_id,
   329	            status=JobStatus.CANCELLED,
   330	            completed_at=completed_at,

 succeeded in 1369ms:
     1	"""SQLAlchemy async database setup."""
     2	
     3	from __future__ import annotations
     4	
     5	from collections.abc import AsyncGenerator
     6	
     7	from sqlalchemy import inspect, text
     8	from sqlalchemy.engine import Connection
     9	from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
    10	from sqlalchemy.orm import DeclarativeBase
    11	
    12	from app.config import settings
    13	
    14	# Railway deployments should set VIOLAWAKE_DB_URL to a full
    15	# postgresql+asyncpg://... URL so the backend uses persistent PostgreSQL
    16	# instead of the default filesystem-backed SQLite database.
    17	DATABASE_URL = (
    18	    settings.db_url.strip()
    19	    if settings.db_url and settings.db_url.strip()
    20	    else f"sqlite+aiosqlite:///{settings.db_path}"
    21	)
    22	
    23	engine = create_async_engine(DATABASE_URL, echo=False)
    24	async_session_factory = async_sessionmaker(engine, expire_on_commit=False)
    25	
    26	
    27	class Base(DeclarativeBase):
    28	    """Declarative base for all ORM models."""
    29	    pass
    30	
    31	
    32	def _ensure_schema_updates(connection: Connection) -> None:
    33	    """Apply lightweight schema updates for local-first deployments."""
    34	    inspector = inspect(connection)
    35	    table_names = set(inspector.get_table_names())
    36	    if "users" not in table_names:
    37	        return
    38	
    39	    user_columns = {column["name"] for column in inspector.get_columns("users")}
    40	    if "email_verified" not in user_columns:
    41	        connection.execute(
    42	            text("ALTER TABLE users ADD COLUMN email_verified BOOLEAN NOT NULL DEFAULT FALSE")
    43	        )
    44	    if "failed_login_count" not in user_columns:
    45	        try:
    46	            connection.execute(
    47	                text(
    48	                    "ALTER TABLE users ADD COLUMN IF NOT EXISTS "
    49	                    "failed_login_count INTEGER DEFAULT 0 NOT NULL"
    50	                )
    51	            )
    52	        except Exception:
    53	            connection.execute(
    54	                text("ALTER TABLE users ADD COLUMN failed_login_count INTEGER DEFAULT 0 NOT NULL")
    55	            )
    56	    if "locked_until" not in user_columns:
    57	        try:
    58	            connection.execute(
    59	                text("ALTER TABLE users ADD COLUMN IF NOT EXISTS locked_until TIMESTAMP")
    60	            )
    61	        except Exception:
    62	            connection.execute(text("ALTER TABLE users ADD COLUMN locked_until TIMESTAMP"))
    63	
    64	    # Team FK columns on recordings and trained_models (nullable, so no default needed)
    65	    if "recordings" in table_names:
    66	        recording_columns = {col["name"] for col in inspector.get_columns("recordings")}
    67	        if "team_id" not in recording_columns:
    68	            connection.execute(text("ALTER TABLE recordings ADD COLUMN team_id INTEGER REFERENCES teams(id)"))
    69	
    70	    if "trained_models" in table_names:
    71	        model_columns = {col["name"] for col in inspector.get_columns("trained_models")}
    72	        if "team_id" not in model_columns:
    73	            connection.execute(text("ALTER TABLE trained_models ADD COLUMN team_id INTEGER REFERENCES teams(id)"))
    74	
    75	    # Soft-delete support: recordings are marked deleted_at after training completes
    76	    if "recordings" in table_names:
    77	        recording_columns = {col["name"] for col in inspector.get_columns("recordings")}
    78	        if "deleted_at" not in recording_columns:
    79	            connection.execute(text("ALTER TABLE recordings ADD COLUMN deleted_at TIMESTAMP"))
    80	
    81	
    82	async def get_db() -> AsyncGenerator[AsyncSession, None]:
    83	    """FastAPI dependency that yields a database session."""
    84	    async with async_session_factory() as session:
    85	        try:
    86	            yield session
    87	            await session.commit()
    88	        except Exception:
    89	            await session.rollback()
    90	            raise
    91	
    92	
    93	async def init_db() -> None:
    94	    """Create all tables (idempotent)."""
    95	    async with engine.begin() as conn:
    96	        await conn.run_sync(Base.metadata.create_all)
    97	        await conn.run_sync(_ensure_schema_updates)

 succeeded in 1337ms:
  1140	        existing = self._retry_tasks.get(user_id)
  1141	        if existing is not None and not existing.done():
  1142	            return
  1143	
  1144	        async def _delayed_fill() -> None:
  1145	            try:
  1146	                await asyncio.sleep(max(0.0, delay_seconds))
  1147	                await self._fill_queue_from_db()
  1148	            except asyncio.CancelledError:
  1149	                raise
  1150	            finally:
  1151	                self._retry_tasks.pop(user_id, None)
  1152	
  1153	        self._retry_tasks[user_id] = asyncio.create_task(
  1154	            _delayed_fill(),
  1155	            name=f"user-{user_id}-queue-retry",
  1156	        )
  1157	
  1158	    async def _resolve_negatives_dir(self, user_id: int) -> Path | None:
  1159	        """Return curated negatives corpus path for paid-tier users, None for free."""
  1160	        corpus_path = settings.negatives_corpus_dir
  1161	        if not corpus_path:
  1162	            return None
  1163	
  1164	        corpus = Path(corpus_path)
  1165	        if not corpus.is_dir():
  1166	            logger.warning("Negatives corpus dir configured but missing: %s", corpus_path)
  1167	            return None
  1168	
  1169	        # Check user's subscription tier
  1170	        from app.models import Subscription
  1171	
  1172	        async with async_session_factory() as session:
  1173	            result = await session.execute(
  1174	                select(Subscription.tier).where(Subscription.user_id == user_id)
  1175	            )
  1176	            row = result.first()
  1177	            tier = row[0] if row else "free"
  1178	
  1179	        if tier == "free":
  1180	            return None
  1181	
  1182	        logger.info("Using curated negatives corpus for user %s (tier=%s)", user_id, tier)
  1183	        return corpus
  1184	
  1185	    async def _schedule_recording_cleanup(self, recording_ids: list[int]) -> None:
  1186	        """Soft-delete recordings after training completes.
  1187	
  1188	        The actual storage file purge happens later via the periodic
  1189	        retention cleanup loop (``cleanup_soft_deleted_recordings``).
  1190	        """
  1191	        if settings.post_training_retention_hours <= 0:
  1192	            return
  1193	
  1194	        try:
  1195	            from app.retention import mark_recordings_for_deletion
  1196	            await mark_recordings_for_deletion(recording_ids)
  1197	        except Exception as exc:
  1198	            # Non-fatal: recordings will still be cleaned up by the
  1199	            # age-based retention policy even if this fails.
  1200	            logger.warning(
  1201	                "Failed to mark recordings for post-training deletion: %s",
  1202	                exc,
  1203	            )
  1204	
  1205	    async def _load_recording_paths(self, user_id: int, recording_ids: list[int]) -> list[str]:
  1206	        async with async_session_factory() as session:
  1207	            result = await session.execute(
  1208	                select(Recording.file_path)
  1209	                .where(
  1210	                    Recording.id.in_(recording_ids),

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '1,140p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -n 'def _connect\\|_initialize_db\\|settings.tmp_dir\\|mkdir' console/backend/app/job_queue.py console/backend/app/config.py\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '450,620p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/config.py | sed -n '1,180p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1044ms:
     1	"""Persistent async training job queue with circuit breaker protection."""
     2	
     3	from __future__ import annotations
     4	
     5	import asyncio
     6	import json
     7	import logging
     8	import shutil
     9	import tempfile
    10	import threading
    11	from collections.abc import AsyncIterator
    12	from contextlib import asynccontextmanager, suppress
    13	from dataclasses import dataclass, field
    14	from datetime import datetime, timedelta, timezone
    15	from enum import Enum
    16	from pathlib import Path
    17	from typing import Any
    18	
    19	import aiosqlite
    20	from sqlalchemy import select
    21	
    22	from app.config import settings
    23	from app.database import async_session_factory
    24	from app.models import Recording, TrainedModel, User
    25	from app.monitoring import log_exception
    26	from app.services.training_service import TrainingCancelledError, run_training_job_sync
    27	from app.storage import build_companion_config_identifier, build_model_key, get_storage
    28	
    29	logger = logging.getLogger("violawake.jobs")
    30	
    31	QUEUE_MAX_SIZE = 50
    32	PER_USER_MAX_PENDING = 3
    33	MAX_SUBSCRIBERS_PER_JOB = 5
    34	FAILURE_THRESHOLD = 3
    35	FAILURE_BACKOFF_SECONDS = 300
    36	ACCOUNT_DELETE_CANCEL_TIMEOUT_SECONDS = 30.0
    37	
    38	
    39	def _utcnow() -> datetime:
    40	    return datetime.now(timezone.utc)
    41	
    42	
    43	def _serialize_datetime(value: datetime | None) -> str | None:
    44	    if value is None:
    45	        return None
    46	    return value.astimezone(timezone.utc).isoformat()
    47	
    48	
    49	def _deserialize_datetime(value: str | None) -> datetime | None:
    50	    if not value:
    51	        return None
    52	    parsed = datetime.fromisoformat(value)
    53	    if parsed.tzinfo is None:
    54	        return parsed.replace(tzinfo=timezone.utc)
    55	    return parsed.astimezone(timezone.utc)
    56	
    57	
    58	class JobStatus(str, Enum):
    59	    """Persisted job states."""
    60	
    61	    PENDING = "pending"
    62	    RUNNING = "running"
    63	    COMPLETED = "completed"
    64	    FAILED = "failed"
    65	    CANCELLED = "cancelled"
    66	
    67	
    68	# Priority values assigned by subscription tier.
    69	PRIORITY_FREE = 0
    70	PRIORITY_DEVELOPER = 10
    71	PRIORITY_BUSINESS = 20
    72	PRIORITY_ENTERPRISE = 30
    73	
    74	
    75	@dataclass(slots=True)
    76	class Job:
    77	    """Persisted training job metadata."""
    78	
    79	    id: int
    80	    user_id: int
    81	    wake_word: str
    82	    status: JobStatus
    83	    created_at: datetime
    84	    started_at: datetime | None = None
    85	    completed_at: datetime | None = None
    86	    error: str | None = None
    87	    progress_pct: float = 0.0
    88	    recording_ids: list[int] = field(default_factory=list)
    89	    epochs: int = 80
    90	    model_id: int | None = None
    91	    d_prime: float | None = None
    92	    priority: int = PRIORITY_FREE
    93	
    94	
    95	@dataclass(slots=True)
    96	class CircuitBreakerState:
    97	    """Per-user failure tracking."""
    98	
    99	    user_id: int
   100	    consecutive_failures: int = 0
   101	    paused: bool = False
   102	    next_attempt_at: datetime | None = None
   103	    last_failure_at: datetime | None = None
   104	    pause_reason: str | None = None
   105	
   106	
   107	_TIER_PRIORITY: dict[str, int] = {
   108	    "free": PRIORITY_FREE,
   109	    "developer": PRIORITY_DEVELOPER,
   110	    "business": PRIORITY_BUSINESS,
   111	    "enterprise": PRIORITY_ENTERPRISE,
   112	}
   113	
   114	
   115	async def _resolve_user_priority(user_id: int) -> int:
   116	    """Return the queue priority for a user based on their subscription tier."""
   117	    from app.models import Subscription
   118	
   119	    async with async_session_factory() as session:
   120	        result = await session.execute(
   121	            select(Subscription.tier).where(Subscription.user_id == user_id)
   122	        )
   123	        row = result.first()
   124	        tier = row[0] if row else "free"
   125	
   126	    return _TIER_PRIORITY.get(str(tier), PRIORITY_FREE)
   127	
   128	
   129	class QueueFullError(RuntimeError):
   130	    """Raised when the persistent queue is at capacity."""
   131	
   132	
   133	class TooManyPendingJobsError(RuntimeError):
   134	    """Raised when a user already has too many active jobs."""
   135	
   136	
   137	class TooManySubscribersError(RuntimeError):
   138	    """Raised when a job already has too many SSE subscribers."""
   139	
   140	

 succeeded in 1056ms:
console/backend/app/job_queue.py:168:        await self._initialize_db()
console/backend/app/job_queue.py:524:    async def _initialize_db(self) -> None:
console/backend/app/job_queue.py:525:        self._db_path.parent.mkdir(parents=True, exist_ok=True)
console/backend/app/job_queue.py:707:            output_dir = Path(tempfile.mkdtemp(prefix=f"violawake_job_{job.id}_", dir=str(settings.tmp_dir)))
console/backend/app/job_queue.py:1243:        # in case _row_to_job is called from a test that does not run _initialize_db.
console/backend/app/job_queue.py:1266:    async def _connect(self) -> AsyncIterator[aiosqlite.Connection]:
console/backend/app/config.py:195:settings.data_dir.mkdir(parents=True, exist_ok=True)
console/backend/app/config.py:196:settings.upload_dir.mkdir(parents=True, exist_ok=True)
console/backend/app/config.py:197:settings.models_dir.mkdir(parents=True, exist_ok=True)
console/backend/app/config.py:198:settings.tmp_dir.mkdir(parents=True, exist_ok=True)

 succeeded in 1030ms:
   450	                """,
   451	                (user_id,),
   452	            )
   453	            await conn.commit()
   454	
   455	        retry_task = self._retry_tasks.pop(user_id, None)
   456	        if retry_task is not None:
   457	            retry_task.cancel()
   458	        logger.info("Resumed job queue for user %s", user_id)
   459	        await self._fill_queue_from_db()
   460	
   461	    async def get_circuit_breaker(self, user_id: int) -> CircuitBreakerState:
   462	        """Return the circuit breaker state for a user."""
   463	        async with self._connect() as conn:
   464	            return await self._get_circuit_breaker_with_conn(conn, user_id)
   465	
   466	    async def runtime_snapshot(self) -> dict[str, Any]:
   467	        """Return queue depth and worker state for health checks."""
   468	        async with self._connect() as conn:
   469	            async with conn.execute(
   470	                "SELECT COUNT(*) AS count FROM jobs WHERE status = ?",
   471	                (JobStatus.PENDING.value,),
   472	            ) as cursor:
   473	                pending_row = await cursor.fetchone()
   474	            async with conn.execute(
   475	                "SELECT COUNT(*) AS count FROM jobs WHERE status = ?",
   476	                (JobStatus.RUNNING.value,),
   477	            ) as cursor:
   478	                running_row = await cursor.fetchone()
   479	
   480	        pending_count = int(pending_row["count"]) if pending_row is not None else 0
   481	        persisted_running_count = int(running_row["count"]) if running_row is not None else 0
   482	        async with self._state_lock:
   483	            queued_job_ids = sorted(self._queued_job_ids)
   484	            running_job_ids = sorted(self._running_job_ids)
   485	
   486	        worker_task_running = self._worker_task is not None and not self._worker_task.done()
   487	        active_workers = len(running_job_ids)
   488	        max_workers = settings.max_concurrent_jobs
   489	        return {
   490	            "queue_depth": pending_count,
   491	            "in_memory_queue_depth": self._queue.qsize(),
   492	            "persisted_running_jobs": persisted_running_count,
   493	            "worker_status": {
   494	                "active_workers": active_workers,
   495	                "max_workers": max_workers,
   496	                "available_slots": max(max_workers - active_workers, 0),
   497	                "worker_task_running": worker_task_running,
   498	                "queued_job_ids": queued_job_ids,
   499	                "running_job_ids": running_job_ids,
   500	            },
   501	        }
   502	
   503	    def subscribe(self, job_id: int) -> asyncio.Queue[dict[str, Any]]:
   504	        """Subscribe to SSE-style job updates."""
   505	        queue: asyncio.Queue[dict[str, Any]] = asyncio.Queue()
   506	        listeners = self._subscribers.setdefault(job_id, [])
   507	        if len(listeners) >= MAX_SUBSCRIBERS_PER_JOB:
   508	            raise TooManySubscribersError("Too many subscribers for this job.")
   509	        listeners.append(queue)
   510	        return queue
   511	
   512	    def unsubscribe(self, job_id: int, queue: asyncio.Queue[dict[str, Any]]) -> None:
   513	        """Unsubscribe from SSE-style job updates."""
   514	        listeners = self._subscribers.get(job_id)
   515	        if listeners is None:
   516	            return
   517	        try:
   518	            listeners.remove(queue)
   519	        except ValueError:
   520	            return
   521	        if not listeners:
   522	            self._subscribers.pop(job_id, None)
   523	
   524	    async def _initialize_db(self) -> None:
   525	        self._db_path.parent.mkdir(parents=True, exist_ok=True)
   526	        async with self._connect() as conn:
   527	            await conn.execute("PRAGMA journal_mode=WAL")
   528	            await conn.execute("PRAGMA foreign_keys=ON")
   529	            await conn.execute(
   530	                """
   531	                CREATE TABLE IF NOT EXISTS jobs (
   532	                    id INTEGER PRIMARY KEY AUTOINCREMENT,
   533	                    user_id INTEGER NOT NULL,
   534	                    wake_word TEXT NOT NULL,
   535	                    status TEXT NOT NULL,
   536	                    created_at TEXT NOT NULL,
   537	                    started_at TEXT,
   538	                    completed_at TEXT,
   539	                    error TEXT,
   540	                    progress_pct REAL NOT NULL DEFAULT 0,
   541	                    recording_ids TEXT NOT NULL,
   542	                    epochs INTEGER NOT NULL DEFAULT 80,
   543	                    model_id INTEGER,
   544	                    d_prime REAL,
   545	                    priority INTEGER NOT NULL DEFAULT 0
   546	                )
   547	                """
   548	            )
   549	            await conn.execute(
   550	                """
   551	                CREATE TABLE IF NOT EXISTS user_circuit_breakers (
   552	                    user_id INTEGER PRIMARY KEY,
   553	                    consecutive_failures INTEGER NOT NULL DEFAULT 0,
   554	                    paused INTEGER NOT NULL DEFAULT 0,
   555	                    next_attempt_at TEXT,
   556	                    last_failure_at TEXT,
   557	                    pause_reason TEXT
   558	                )
   559	                """
   560	            )
   561	            await conn.execute(
   562	                "CREATE INDEX IF NOT EXISTS idx_jobs_user_created ON jobs(user_id, created_at DESC)"
   563	            )
   564	            await conn.execute(
   565	                "CREATE INDEX IF NOT EXISTS idx_jobs_status_created ON jobs(status, created_at ASC)"
   566	            )
   567	
   568	            # Migration: add priority column to existing databases that predate
   569	            # this feature.  Must run BEFORE the priority index creation below.
   570	            async with conn.execute("PRAGMA table_info(jobs)") as cursor:
   571	                columns = {row["name"] async for row in cursor}
   572	            if "priority" not in columns:
   573	                await conn.execute(
   574	                    "ALTER TABLE jobs ADD COLUMN priority INTEGER NOT NULL DEFAULT 0"
   575	                )
   576	                logger.info("Migrated jobs table: added priority column")
   577	
   578	            await conn.execute(
   579	                "CREATE INDEX IF NOT EXISTS idx_jobs_priority_created ON jobs(status, priority DESC, created_at ASC)"
   580	            )
   581	
   582	            await conn.commit()
   583	
   584	    async def _resume_jobs(self) -> None:
   585	        running_user_ids: set[int] = set()
   586	        async with self._connect() as conn:
   587	            async with conn.execute(
   588	                "SELECT DISTINCT user_id FROM jobs WHERE status = ?",
   589	                (JobStatus.RUNNING.value,),
   590	            ) as cursor:
   591	                rows = await cursor.fetchall()
   592	                running_user_ids = {int(row["user_id"]) for row in rows}
   593	
   594	            await conn.execute(
   595	                """
   596	                UPDATE jobs
   597	                SET status = ?, started_at = NULL, error = NULL
   598	                WHERE status IN (?, ?)
   599	                """,
   600	                (
   601	                    JobStatus.PENDING.value,
   602	                    JobStatus.PENDING.value,
   603	                    JobStatus.RUNNING.value,
   604	                ),
   605	            )
   606	            await conn.commit()
   607	
   608	            async with conn.execute(
   609	                """
   610	                SELECT user_id, next_attempt_at, paused
   611	                FROM user_circuit_breakers
   612	                WHERE next_attempt_at IS NOT NULL
   613	                """
   614	            ) as cursor:
   615	                breaker_rows = await cursor.fetchall()
   616	
   617	        for user_id in running_user_ids:
   618	            logger.info("Resumed interrupted training jobs for user %s", user_id)
   619	
   620	        now = _utcnow()

 succeeded in 1062ms:
     1	"""Application configuration for ViolaWake Console backend."""
     2	
     3	from __future__ import annotations
     4	
     5	import secrets
     6	from pathlib import Path
     7	from typing import Annotated, Any, Self
     8	
     9	from pydantic import field_validator, model_validator
    10	from pydantic_settings import BaseSettings, NoDecode, SettingsConfigDict
    11	
    12	DEFAULT_CORS_ORIGINS = [
    13	    "http://localhost:5173",
    14	    "http://127.0.0.1:5173",
    15	]
    16	
    17	
    18	def _generate_dev_secret_key() -> str:
    19	    """Generate a development-only JWT key."""
    20	    return secrets.token_urlsafe(32)
    21	
    22	
    23	class Settings(BaseSettings):
    24	    """Application settings loaded from environment / .env file."""
    25	
    26	    # Environment
    27	    env: str = "development"
    28	
    29	    # Paths
    30	    base_dir: Path = Path(__file__).resolve().parent.parent
    31	    data_dir: Path = Path(__file__).resolve().parent.parent / "data"
    32	    db_path: Path = Path(__file__).resolve().parent.parent / "data" / "violawake.db"
    33	    upload_dir: Path = Path(__file__).resolve().parent.parent / "data" / "recordings"
    34	    models_dir: Path = Path(__file__).resolve().parent.parent / "data" / "models"
    35	    tmp_dir: Path = Path(__file__).resolve().parent.parent / "data" / "tmp"
    36	
    37	    # Object storage
    38	    r2_endpoint: str = ""
    39	    r2_access_key_id: str = ""
    40	    r2_secret_access_key: str = ""
    41	    r2_bucket: str = "violawake"
    42	
    43	    # Database
    44	    db_url: str = ""  # Optional full SQLAlchemy async URL, e.g. Railway PostgreSQL
    45	
    46	    # Auth
    47	    secret_key: str = ""
    48	    algorithm: str = "HS256"
    49	    access_token_expire_hours: int = 2
    50	    trusted_proxy_count: int = 0
    51	
    52	    # CORS
    53	    cors_origins: Annotated[list[str], NoDecode] = DEFAULT_CORS_ORIGINS.copy()
    54	
    55	    # Training
    56	    training_timeout: int = 1800  # seconds (30 minutes)
    57	    max_concurrent_jobs: int = 2
    58	    negatives_corpus_dir: str = ""  # Path to curated negative audio corpus (paid tier)
    59	
    60	    # Retention cleanup (0 = disabled)
    61	    recording_retention_days: int = 90  # Days to keep recordings; 0 disables automatic cleanup
    62	    model_retention_days: int = 365  # Days to keep trained models; 0 disables automatic cleanup
    63	    post_training_retention_hours: int = 24  # Hours to keep recordings after training completes; 0 disables
    64	
    65	    # Admin
    66	    admin_token: str = ""  # When set, enables POST /api/admin/cleanup (protect with a strong secret)
    67	
    68	    # Stripe billing
    69	    stripe_secret_key: str = ""
    70	    stripe_webhook_secret: str = ""
    71	    stripe_price_developer: str = ""  # Stripe Price ID for $29/mo Developer tier
    72	    stripe_price_business: str = ""  # Stripe Price ID for $99/mo Business tier
    73	    sentry_dsn: str = ""
    74	
    75	    # Free trial
    76	    trial_days: int = 14  # 0 to disable free trial for paid tiers
    77	
    78	    # Console URLs (for Stripe checkout redirect)
    79	    console_base_url: str = "http://localhost:5173"
    80	
    81	    # Email
    82	    resend_api_key: str = ""
    83	
    84	    model_config = SettingsConfigDict(
    85	        env_prefix="VIOLAWAKE_",
    86	        env_file=".env",
    87	        extra="ignore",
    88	    )
    89	
    90	    @field_validator("env", mode="before")
    91	    @classmethod
    92	    def normalize_env(cls, value: Any) -> str:
    93	        """Accept mixed-case env names while keeping comparisons consistent."""
    94	        if value is None:
    95	            return "development"
    96	        return str(value).strip().lower() or "development"
    97	
    98	    @field_validator("cors_origins", mode="before")
    99	    @classmethod
   100	    def parse_cors_origins(cls, value: Any) -> Any:
   101	        """Accept comma-separated CORS origins from env vars."""
   102	        if isinstance(value, str):
   103	            raw = value.strip()
   104	            if not raw:
   105	                return []
   106	            if raw.startswith("["):
   107	                return value
   108	            return [origin.strip() for origin in raw.split(",") if origin.strip()]
   109	        return value
   110	
   111	    @model_validator(mode="after")
   112	    def validate_production_settings(self) -> Self:
   113	        """Resolve development defaults and enforce production requirements."""
   114	        _MIN_SECRET_KEY_LENGTH = 32
   115	        _INSECURE_PLACEHOLDERS = {"changeme", "secret", "password", "test", "dev"}
   116	
   117	        key = self.secret_key.strip()
   118	        key_is_empty = not key
   119	        key_is_placeholder = key.lower() in _INSECURE_PLACEHOLDERS
   120	
   121	        if key_is_empty or key_is_placeholder:
   122	            if self.is_production:
   123	                raise ValueError(
   124	                    "VIOLAWAKE_SECRET_KEY must be set to a unique, random value "
   125	                    "when VIOLAWAKE_ENV=production. "
   126	                    "Generate one with: python -c \"import secrets; print(secrets.token_urlsafe(64))\""
   127	                )
   128	            import logging
   129	            _logger = logging.getLogger("violawake.config")
   130	            self.secret_key = _generate_dev_secret_key()
   131	            _logger.warning(
   132	                "VIOLAWAKE_SECRET_KEY was empty or insecure — generated a random "
   133	                "development key. DO NOT use this in production."
   134	            )
   135	        elif len(key) < _MIN_SECRET_KEY_LENGTH:
   136	            if self.is_production:
   137	                raise ValueError(
   138	                    f"VIOLAWAKE_SECRET_KEY is too short ({len(key)} chars). "
   139	                    f"Production requires at least {_MIN_SECRET_KEY_LENGTH} characters. "
   140	                    "Generate one with: python -c \"import secrets; print(secrets.token_urlsafe(64))\""
   141	                )
   142	            import logging
   143	            _logger = logging.getLogger("violawake.config")
   144	            _logger.warning(
   145	                "VIOLAWAKE_SECRET_KEY is only %d characters (minimum %d recommended). "
   146	                "Short keys are brute-forceable. This is acceptable for development only.",
   147	                len(key),
   148	                _MIN_SECRET_KEY_LENGTH,
   149	            )
   150	        return self
   151	
   152	    @property
   153	    def is_production(self) -> bool:
   154	        return self.env == "production"
   155	
   156	    @property
   157	    def database_url(self) -> str:
   158	        """Return the configured database URL, defaulting to local SQLite."""
   159	        if self.db_url and self.db_url.strip():
   160	            return self.db_url.strip()
   161	        return f"sqlite+aiosqlite:///{self.db_path}"
   162	
   163	    @property
   164	    def database_log_target(self) -> str:
   165	        """Return a safe database identifier for logs without leaking credentials."""
   166	        if self.db_url and self.db_url.strip():
   167	            return "VIOLAWAKE_DB_URL"
   168	        return str(self.db_path)
   169	
   170	    @property
   171	    def effective_cors_origins(self) -> list[str]:
   172	        """Return CORS origins.
   173	
   174	        If ``cors_origins`` was explicitly set via env var, use exactly those.
   175	        Otherwise fall back to sensible defaults based on the environment.
   176	        """
   177	        if self.cors_origins != DEFAULT_CORS_ORIGINS:
   178	            return list(self.cors_origins)
   179	        if self.is_production:
   180	            return [

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/config.py | sed -n '180,210p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '1260,1285p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/entrypoint.sh | sed -n '1,80p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "tail -40 console/backend/requirements.txt"' in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 889ms:
fastapi>=0.110
slowapi>=0.1.9
uvicorn[standard]>=0.27
sqlalchemy>=2.0
alembic>=1.13
aiosqlite>=0.19
# Optional: needed when VIOLAWAKE_DB_URL points to PostgreSQL, such as Railway.
asyncpg>=0.29
PyJWT[crypto]>=2.8
bcrypt>=4.0
python-multipart>=0.0.9
sse-starlette>=1.6
pydantic>=2.0
pydantic-settings>=2.0
email-validator>=2.0
httpx>=0.27
resend>=2.0
numpy>=1.24
scipy>=1.11
edge-tts>=6.1
stripe>=8.0.0
sentry-sdk[fastapi]>=2.0
tomli>=2.0; python_version < "3.11"
boto3>=1.34
fast_mp3_augment

 succeeded in 930ms:
  1260	            model_id=int(row["model_id"]) if row["model_id"] is not None else None,
  1261	            d_prime=float(row["d_prime"]) if row["d_prime"] is not None else None,
  1262	            priority=priority,
  1263	        )
  1264	
  1265	    @asynccontextmanager
  1266	    async def _connect(self) -> AsyncIterator[aiosqlite.Connection]:
  1267	        connection = await aiosqlite.connect(self._db_path, timeout=30)
  1268	        connection.row_factory = aiosqlite.Row
  1269	        try:
  1270	            yield connection
  1271	        finally:
  1272	            await connection.close()
  1273	
  1274	
  1275	_job_queue: JobQueue | None = None
  1276	
  1277	
  1278	async def init_job_queue() -> JobQueue:
  1279	    """Initialize the process-wide job queue singleton."""
  1280	    global _job_queue
  1281	    if _job_queue is None:
  1282	        _job_queue = JobQueue(max_concurrent=settings.max_concurrent_jobs)
  1283	        await _job_queue.start()
  1284	    return _job_queue
  1285	

 succeeded in 947ms:
   180	            return [
   181	                "https://console.violawake.com",
   182	                "https://violawake.com",
   183	            ]
   184	        return list(DEFAULT_CORS_ORIGINS)
   185	
   186	    @property
   187	    def billing_enabled(self) -> bool:
   188	        """Billing features require a configured Stripe secret key."""
   189	        return bool(self.stripe_secret_key)
   190	
   191	
   192	settings = Settings()
   193	
   194	# Ensure runtime directories exist for uploads, models, and the default SQLite path.
   195	settings.data_dir.mkdir(parents=True, exist_ok=True)
   196	settings.upload_dir.mkdir(parents=True, exist_ok=True)
   197	settings.models_dir.mkdir(parents=True, exist_ok=True)
   198	settings.tmp_dir.mkdir(parents=True, exist_ok=True)
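
A minimal sketch of the branching in `parse_cors_origins` from the `config.py` listing, copied into a plain function so it can be exercised without pydantic. The name `parse_origins` and the example origins are hypothetical; how pydantic validates the returned value afterwards is outside this sketch.

```python
from typing import Any


def parse_origins(value: Any) -> Any:
    """Standalone copy of the string handling in `parse_cors_origins`."""
    if isinstance(value, str):
        raw = value.strip()
        if not raw:
            return []
        if raw.startswith("["):
            # JSON-looking input is passed through unchanged.
            return value
        # Comma-separated input: trim each entry and drop empty ones.
        return [origin.strip() for origin in raw.split(",") if origin.strip()]
    return value


comma = parse_origins("https://a.example, https://b.example,")
empty = parse_origins("   ")
json_like = parse_origins('["https://a.example"]')
```

An empty or whitespace-only value yields `[]`, which differs from `DEFAULT_CORS_ORIGINS`, so `effective_cors_origins` in the listing would return an empty list for it instead of falling back to the environment defaults.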

 succeeded in 909ms:
     1	#!/bin/sh
     2	set -e
     3	
     4	# Run database migrations before serving. Idempotent — alembic handles
     5	# already-applied migrations gracefully. Tolerates connection blips by
     6	# retrying briefly while postgres warms up.
     7	echo "[entrypoint] running alembic upgrade head"
     8	for i in 1 2 3 4 5; do
     9	    if alembic upgrade head; then
    10	        break
    11	    fi
    12	    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
    13	    sleep 3
    14	done
    15	
    16	# Ensure OpenWakeWord backbone resources are present. The openwakeword
    17	# wheel ships without ONNX/TFLite resources; download_models() fetches
    18	# them on first call. Idempotent — short-circuits if files exist.
    19	# Required for the training pipeline (OWWModel construction in
    20	# violawake_sdk.tools.train).
    21	echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
    22	python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
    23	    | tail -3 || true
    24	
    25	if [ "$(id -u)" = "0" ]; then
    26	    chown -R app:app /app/data
    27	    exec gosu app python run.py
    28	else
    29	    exec python run.py
    30	fi
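
One detail of the `entrypoint.sh` listing above: `alembic upgrade head` runs as an `if` condition, so `set -e` does not abort when it fails, and after five failed attempts the script goes on to the next step. A minimal Python sketch of a retry loop that instead fails loudly once attempts are exhausted. `retry` and `flaky` are hypothetical names for illustration, not project code.

```python
import time
from typing import Callable


def retry(action: Callable[[], bool], attempts: int = 5, delay: float = 3.0) -> int:
    """Run `action` until it returns True; raise if every attempt fails.

    Returns the 1-based number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # skip the sleep after the final failure
    raise RuntimeError(f"action failed after {attempts} attempts")


# Hypothetical stand-in for a command that succeeds on its third call.
calls = {"n": 0}


def flaky() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


succeeded_on = retry(flaky, attempts=5, delay=0.0)
```

In the shell script itself, the same effect would need an explicit failure path after the loop; that would be a change to the file, which this sketch does not make.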

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/routes/billing.py | sed -n '1,120p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/routes/jobs.py | sed -n '1,150p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/routes/training.py | sed -n '1,80p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'TooManyPendingJobsError|QueueFullError|HTTPException\\(|quota|training_used|training_quota' console/backend/app/routes console/backend/app/job_queue.py | head -220\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1069ms:
     1	"""Billing routes: Stripe checkout, webhook, subscription management, usage."""
     2	
     3	import logging
     4	from datetime import datetime, timedelta, timezone
     5	from typing import Annotated
     6	
     7	from fastapi import APIRouter, Depends, Header, HTTPException, Request, status
     8	from sqlalchemy import delete, select, text, update
     9	from sqlalchemy.exc import IntegrityError
    10	from sqlalchemy.ext.asyncio import AsyncSession
    11	
    12	from app.auth import get_verified_user
    13	from app.config import settings
    14	from app.database import get_db
    15	from app.models import ProcessedStripeEvent, Subscription, UsageRecord, User
    16	from app.rate_limit import CHECKOUT_LIMIT, PORTAL_LIMIT, key_by_user, limiter, set_rate_limit_user
    17	from app.schemas import (
    18	    BillingPortalResponse,
    19	    CheckoutRequest,
    20	    CheckoutResponse,
    21	    SubscriptionResponse,
    22	    UsageResponse,
    23	)
    24	
    25	logger = logging.getLogger("violawake.billing")
    26	
    27	router = APIRouter(prefix="/api/billing", tags=["billing"])
    28	
    29	# ---------------------------------------------------------------------------
    30	# Tier limits — single source of truth
    31	# ---------------------------------------------------------------------------
    32	
    33	TIER_LIMITS: dict[str, int | None] = {
    34	    "free": 3,
    35	    "developer": 20,
    36	    "business": None,      # unlimited
    37	    "enterprise": None,    # unlimited / custom
    38	}
    39	
    40	TIER_PRICE_MAP: dict[str, str] = {
    41	    "developer": "stripe_price_developer",
    42	    "business": "stripe_price_business",
    43	}
    44	
    45	# ---------------------------------------------------------------------------
    46	# Webhook idempotency: persist Stripe event IDs so redeliveries after a backend
    47	# restart do not re-run billing side effects.
    48	# ---------------------------------------------------------------------------
    49	
    50	# ---------------------------------------------------------------------------
    51	# Helpers
    52	# ---------------------------------------------------------------------------
    53	
    54	def _require_billing() -> None:
    55	    """Raise 503 if Stripe is not configured."""
    56	    if not settings.billing_enabled:
    57	        raise HTTPException(
    58	            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
    59	            detail="Billing is not configured. Set VIOLAWAKE_STRIPE_SECRET_KEY to enable.",
    60	        )
    61	
    62	
    63	def _get_stripe():
    64	    """Return the stripe module, configured with the secret key.
    65	
    66	    Deferred import so the app starts even when stripe is not installed
    67	    (e.g., in development without billing).
    68	    """
    69	    import stripe
    70	    stripe.api_key = settings.stripe_secret_key
    71	    return stripe
    72	
    73	
    74	async def _verified_user_with_rate_key(
    75	    request: Request,
    76	    current_user: Annotated[User, Depends(get_verified_user)],
    77	) -> User:
    78	    """Resolve the verified user and stash the ID for per-user rate limiting."""
    79	    set_rate_limit_user(request, current_user.id)
    80	    return current_user
    81	
    82	
    83	def _current_period_start() -> datetime:
    84	    """Return the first instant of the current UTC month (billing period)."""
    85	    now = datetime.now(timezone.utc)
    86	    return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    87	
    88	
    89	def _current_period_end() -> datetime:
    90	    """Return the first instant of the next UTC month (end of billing period)."""
    91	    now = datetime.now(timezone.utc)
    92	    if now.month == 12:
    93	        return now.replace(year=now.year + 1, month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    94	    return now.replace(month=now.month + 1, day=1, hour=0, minute=0, second=0, microsecond=0)
    95	
    96	
    97	def _price_id_for_tier(tier: str) -> str:
    98	    """Resolve the Stripe Price ID for a tier, or raise if not configured."""
    99	    attr = TIER_PRICE_MAP.get(tier)
   100	    if attr is None:
   101	        raise HTTPException(
   102	            status_code=status.HTTP_400_BAD_REQUEST,
   103	            detail=f"Tier '{tier}' is not available for checkout.",
   104	        )
   105	    price_id = getattr(settings, attr, "")
   106	    if not price_id:
   107	        raise HTTPException(
   108	            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
   109	            detail=f"Stripe price for '{tier}' tier is not configured.",
   110	        )
   111	    return price_id
   112	
   113	
   114	async def _get_or_create_subscription(db: AsyncSession, user_id: int) -> Subscription:
   115	    """Return the user's subscription row, creating a free-tier one if absent."""
   116	    result = await db.execute(
   117	        select(Subscription).where(Subscription.user_id == user_id)
   118	    )
   119	    sub = result.scalar_one_or_none()
   120	    if sub is None:
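
Note on the listing above: `_current_period_start` and `_current_period_end` are pure date arithmetic, so the December rollover can be checked in isolation. A minimal sketch, assuming the same UTC calendar-month billing period; `period_bounds` and its `now` parameter are hypothetical names introduced here for testability and are not part of billing.py.

```python
from datetime import datetime, timezone


def period_bounds(now: datetime) -> tuple[datetime, datetime]:
    """Return (start, end) of the UTC calendar month containing `now`.

    Hypothetical helper: mirrors _current_period_start and
    _current_period_end, but takes `now` as an argument instead of
    reading the clock, so the result is deterministic.
    """
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if start.month == 12:
        # December rolls over into January of the following year.
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end


# Last minute of the year: the period must end on 1 January of the next year.
dec_start, dec_end = period_bounds(
    datetime(2024, 12, 31, 23, 59, tzinfo=timezone.utc)
)
# Leap-day input: the period is still the whole of February.
feb_start, feb_end = period_bounds(
    datetime(2024, 2, 29, 12, 0, tzinfo=timezone.utc)
)
```

Because `start` is always day 1, replacing only the month (or year and month) cannot produce an invalid date.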

 succeeded in 1035ms:
console/backend/app/routes/auth.py:178:            raise HTTPException(
console/backend/app/routes/auth.py:187:        raise HTTPException(
console/backend/app/routes/auth.py:198:        raise HTTPException(
console/backend/app/routes/auth.py:249:            raise HTTPException(
console/backend/app/routes/auth.py:256:            raise HTTPException(
console/backend/app/routes/auth.py:285:        raise HTTPException(
console/backend/app/routes/auth.py:334:        raise HTTPException(
console/backend/app/routes/auth.py:355:        raise HTTPException(
console/backend/app/routes/auth.py:376:        raise HTTPException(
console/backend/app/routes/billing.py:57:        raise HTTPException(
console/backend/app/routes/billing.py:101:        raise HTTPException(
console/backend/app/routes/billing.py:107:        raise HTTPException(
console/backend/app/routes/billing.py:204:    requests could both pass the quota check before either increments.
console/backend/app/routes/billing.py:249:# Public helper: check_training_quota (dependency for training route)
console/backend/app/routes/billing.py:252:async def check_training_quota(
console/backend/app/routes/billing.py:256:    """FastAPI dependency that enforces the training quota for the current user.
console/backend/app/routes/billing.py:258:    Returns the User if quota is available; raises HTTP 403 if the monthly
console/backend/app/routes/billing.py:280:                    email_svc.send_quota_warning(
console/backend/app/routes/billing.py:299:        raise HTTPException(
console/backend/app/routes/billing.py:334:        raise HTTPException(
console/backend/app/routes/billing.py:390:        raise HTTPException(
console/backend/app/routes/billing.py:398:        raise HTTPException(
console/backend/app/routes/billing.py:411:        raise HTTPException(
console/backend/app/routes/billing.py:494:        raise HTTPException(
console/backend/app/routes/files.py:31:        raise HTTPException(
console/backend/app/routes/files.py:42:        raise HTTPException(
console/backend/app/routes/files.py:47:        raise HTTPException(
console/backend/app/routes/files.py:66:        raise HTTPException(
console/backend/app/routes/files.py:73:        raise HTTPException(
console/backend/app/routes/files.py:95:        raise HTTPException(
console/backend/app/routes/files.py:103:        raise HTTPException(
console/backend/app/routes/jobs.py:11:from app.job_queue import Job, QueueFullError, TooManyPendingJobsError, init_job_queue
console/backend/app/routes/jobs.py:14:from app.routes.billing import check_training_quota, record_usage
console/backend/app/routes/jobs.py:26:async def _quota_user_with_rate_key(
console/backend/app/routes/jobs.py:28:    current_user: Annotated[User, Depends(check_training_quota)],
console/backend/app/routes/jobs.py:30:    """Resolve the user via training-quota check and stash ID for rate limiting."""
console/backend/app/routes/jobs.py:71:        raise HTTPException(
console/backend/app/routes/jobs.py:78:        raise HTTPException(
console/backend/app/routes/jobs.py:84:        raise HTTPException(
console/backend/app/routes/jobs.py:108:    except TooManyPendingJobsError as exc:
console/backend/app/routes/jobs.py:109:        raise HTTPException(
console/backend/app/routes/jobs.py:113:    except QueueFullError as exc:
console/backend/app/routes/jobs.py:114:        raise HTTPException(
console/backend/app/routes/jobs.py:127:        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Training job not found")
console/backend/app/routes/jobs.py:136:    current_user: Annotated[User, Depends(_quota_user_with_rate_key)],
console/backend/app/routes/jobs.py:195:        raise HTTPException(
console/backend/app/routes/models.py:186:        raise HTTPException(
console/backend/app/routes/models.py:194:        raise HTTPException(
console/backend/app/routes/models.py:215:        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Model not found")
console/backend/app/routes/models.py:232:    raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Model not found")
console/backend/app/routes/models.py:278:        raise HTTPException(
console/backend/app/routes/models.py:372:        raise HTTPException(
console/backend/app/routes/recordings.py:62:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="File too small to be a valid WAV")
console/backend/app/routes/recordings.py:65:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Not a valid WAV file")
console/backend/app/routes/recordings.py:81:                raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Malformed fmt chunk")
console/backend/app/routes/recordings.py:84:                raise HTTPException(
console/backend/app/routes/recordings.py:105:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="WAV file missing fmt chunk")
console/backend/app/routes/recordings.py:108:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid WAV header values")
console/backend/app/routes/recordings.py:113:        raise HTTPException(
console/backend/app/routes/recordings.py:118:        raise HTTPException(
console/backend/app/routes/recordings.py:150:        raise HTTPException(
console/backend/app/routes/recordings.py:186:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="wake_word is required")
console/backend/app/routes/recordings.py:193:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Wake word contains no valid characters")
console/backend/app/routes/recordings.py:205:        raise HTTPException(
console/backend/app/routes/recordings.py:213:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Empty file")
console/backend/app/routes/recordings.py:216:        raise HTTPException(
console/backend/app/routes/recordings.py:239:        raise HTTPException(
console/backend/app/routes/recordings.py:320:        raise HTTPException(
console/backend/app/routes/teams.py:47:        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Team not found")
console/backend/app/routes/teams.py:95:        raise HTTPException(
console/backend/app/routes/teams.py:112:            raise HTTPException(status_code=status.HTTP_409_CONFLICT, detail="Already a team member")
console/backend/app/routes/teams.py:237:            raise HTTPException(
console/backend/app/routes/teams.py:244:        raise HTTPException(
console/backend/app/routes/teams.py:283:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Token does not match team")
console/backend/app/routes/teams.py:299:        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Cannot remove the team owner")
console/backend/app/routes/teams.py:309:        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Member not found")
console/backend/app/routes/teams.py:313:        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Insufficient role to remove this member")
console/backend/app/routes/teams.py:331:        raise HTTPException(
console/backend/app/routes/teams.py:352:        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Role must be 'admin' or 'member'")
console/backend/app/routes/teams.py:357:        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Cannot change the owner's role")
console/backend/app/routes/teams.py:370:        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Member not found")
console/backend/app/routes/teams.py:389:        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Only the team owner can delete the team")
console/backend/app/routes/teams.py:417:        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Model not found")
console/backend/app/routes/training.py:17:from app.routes.billing import check_training_quota
console/backend/app/routes/training.py:26:async def _quota_user_with_rate_key(
console/backend/app/routes/training.py:28:    current_user: Annotated[User, Depends(check_training_quota)],
console/backend/app/routes/training.py:30:    """Resolve the user via training-quota check and stash ID for rate limiting."""
console/backend/app/routes/training.py:47:    current_user: Annotated[User, Depends(_quota_user_with_rate_key)],
console/backend/app/routes/training.py:93:        raise HTTPException(
console/backend/app/routes/training.py:101:        raise HTTPException(
console/backend/app/routes/training.py:106:        raise HTTPException(
console/backend/app/routes/training.py:126:        raise HTTPException(
console/backend/app/routes/training.py:167:        raise HTTPException(
console/backend/app/job_queue.py:129:class QueueFullError(RuntimeError):
console/backend/app/job_queue.py:133:class TooManyPendingJobsError(RuntimeError):
console/backend/app/job_queue.py:216:            raise QueueFullError("Training queue is full. Please try again later.")
console/backend/app/job_queue.py:241:                    raise TooManyPendingJobsError(

 succeeded in 1045ms:
     1	"""Compatibility routes for legacy training endpoints."""
     2	
     3	import asyncio
     4	import json
     5	from typing import Annotated
     6	
     7	from fastapi import APIRouter, Depends, HTTPException, Query, Request, status
     8	from sqlalchemy import select
     9	from sqlalchemy.ext.asyncio import AsyncSession
    10	from sse_starlette.sse import EventSourceResponse
    11	
    12	from app.auth import decode_download_token, decode_token, get_verified_user
    13	from app.database import get_db
    14	from app.job_queue import TooManySubscribersError, init_job_queue
    15	from app.models import User
    16	from app.rate_limit import TRAINING_SUBMIT_LIMIT, key_by_user, limiter, set_rate_limit_user
    17	from app.routes.billing import check_training_quota
    18	from app.routes.jobs import get_owned_job_or_404, submit_training_job
    19	from app.schemas import JobSubmitRequest, TrainingStartResponse, TrainingStatusResponse
    20	
    21	router = APIRouter(prefix="/api/training", tags=["training"])
    22	MAX_SSE_PER_USER = 10
    23	_active_sse_connections: dict[int, int] = {}
    24	
    25	
    26	async def _quota_user_with_rate_key(
    27	    request: Request,
    28	    current_user: Annotated[User, Depends(check_training_quota)],
    29	) -> User:
    30	    """Resolve the user via training-quota check and stash ID for rate limiting."""
    31	    set_rate_limit_user(request, current_user.id)
    32	    return current_user
    33	
    34	
    35	def _legacy_status(status_value: str) -> str:
    36	    """Map queue status names to the legacy training API values."""
    37	    if status_value == "pending":
    38	        return "queued"
    39	    return status_value
    40	
    41	
    42	@router.post("/start", response_model=TrainingStartResponse, status_code=status.HTTP_202_ACCEPTED)
    43	@limiter.limit(TRAINING_SUBMIT_LIMIT, key_func=key_by_user)
    44	async def start_training(
    45	    request: Request,
    46	    body: JobSubmitRequest,
    47	    current_user: Annotated[User, Depends(_quota_user_with_rate_key)],
    48	    db: Annotated[AsyncSession, Depends(get_db)],
    49	) -> TrainingStartResponse:
    50	    """Submit a training job through the persistent queue."""
    51	    job_response = await submit_training_job(body, current_user, db)
    52	    return TrainingStartResponse(job_id=job_response.job_id, status=job_response.status)
    53	
    54	
    55	@router.get("/status/{job_id}", response_model=TrainingStatusResponse)
    56	async def get_training_status(
    57	    job_id: int,
    58	    current_user: Annotated[User, Depends(get_verified_user)],
    59	) -> TrainingStatusResponse:
    60	    """Get the current status of a queued training job."""
    61	    job = await get_owned_job_or_404(job_id, current_user)
    62	    return TrainingStatusResponse(
    63	        job_id=job.id,
    64	        status=_legacy_status(job.status.value),
    65	        progress=job.progress_pct,
    66	        d_prime=job.d_prime,
    67	        model_id=job.model_id,
    68	        error=job.error,
    69	    )
    70	
    71	
    72	async def _resolve_sse_user(
    73	    request: Request,
    74	    token: str | None,
    75	    db: AsyncSession,
    76	    *,
    77	    job_id: int,
    78	) -> User:
    79	    """Resolve the authenticated user for SSE endpoints."""
    80	    user_id: int | None = None

 succeeded in 1048ms:
     1	"""Async training job queue routes."""
     2	
     3	from typing import Annotated
     4	
     5	from fastapi import APIRouter, Depends, HTTPException, Request, status
     6	from sqlalchemy import select
     7	from sqlalchemy.ext.asyncio import AsyncSession
     8	
     9	from app.auth import get_verified_user
    10	from app.database import get_db
    11	from app.job_queue import Job, QueueFullError, TooManyPendingJobsError, init_job_queue
    12	from app.models import Recording, User
    13	from app.rate_limit import TRAINING_SUBMIT_LIMIT, key_by_user, limiter, set_rate_limit_user
    14	from app.routes.billing import check_training_quota, record_usage
    15	from app.schemas import (
    16	    JobCircuitBreakerResponse,
    17	    JobResponse,
    18	    JobSubmitRequest,
    19	    JobSubmitResponse,
    20	    MessageResponse,
    21	)
    22	
    23	router = APIRouter(prefix="/api/jobs", tags=["jobs"])
    24	
    25	
    26	async def _quota_user_with_rate_key(
    27	    request: Request,
    28	    current_user: Annotated[User, Depends(check_training_quota)],
    29	) -> User:
    30	    """Resolve the user via training-quota check and stash ID for rate limiting."""
    31	    set_rate_limit_user(request, current_user.id)
    32	    return current_user
    33	
    34	
    35	def serialize_job(job: Job) -> JobResponse:
    36	    """Convert a queue job dataclass into an API response."""
    37	    return JobResponse(
    38	        job_id=job.id,
    39	        user_id=job.user_id,
    40	        wake_word=job.wake_word,
    41	        status=job.status.value,
    42	        created_at=job.created_at,
    43	        started_at=job.started_at,
    44	        completed_at=job.completed_at,
    45	        error=job.error,
    46	        progress_pct=job.progress_pct,
    47	        d_prime=job.d_prime,
    48	        model_id=job.model_id,
    49	    )
    50	
    51	
    52	async def validate_training_request(
    53	    body: JobSubmitRequest,
    54	    current_user: User,
    55	    db: AsyncSession,
    56	) -> tuple[str, list[int], int]:
    57	    """Validate the submitted recordings for a new training job."""
    58	    wake_word = body.wake_word.strip().lower()
    59	    result = await db.execute(
    60	        select(Recording).where(
    61	            Recording.id.in_(body.recording_ids),
    62	            Recording.user_id == current_user.id,
    63	            Recording.deleted_at.is_(None),
    64	        )
    65	    )
    66	    recordings = result.scalars().all()
    67	
    68	    if len(recordings) != len(body.recording_ids):
    69	        found_ids = {recording.id for recording in recordings}
    70	        missing = [recording_id for recording_id in body.recording_ids if recording_id not in found_ids]
    71	        raise HTTPException(
    72	            status_code=status.HTTP_404_NOT_FOUND,
    73	            detail=f"Recordings not found or not owned by you: {missing}",
    74	        )
    75	
    76	    wrong_word = [recording.id for recording in recordings if recording.wake_word != wake_word]
    77	    if wrong_word:
    78	        raise HTTPException(
    79	            status_code=status.HTTP_400_BAD_REQUEST,
    80	            detail=f"Recordings {wrong_word} do not match wake word '{body.wake_word}'",
    81	        )
    82	
    83	    if len(recordings) < 5:
    84	        raise HTTPException(
    85	            status_code=status.HTTP_400_BAD_REQUEST,
    86	            detail=f"Need at least 5 recordings. Got {len(recordings)}.",
    87	        )
    88	
    89	    return wake_word, list(body.recording_ids), body.epochs
    90	
    91	
    92	async def submit_training_job(
    93	    body: JobSubmitRequest,
    94	    current_user: User,
    95	    db: AsyncSession,
    96	) -> JobSubmitResponse:
    97	    """Validate and enqueue a training job."""
    98	    wake_word, recording_ids, epochs = await validate_training_request(body, current_user, db)
    99	    queue = await init_job_queue()
   100	
   101	    try:
   102	        job_id = await queue.submit_job(
   103	            user_id=current_user.id,
   104	            wake_word=wake_word,
   105	            recording_ids=recording_ids,
   106	            epochs=epochs,
   107	        )
   108	    except TooManyPendingJobsError as exc:
   109	        raise HTTPException(
   110	            status_code=status.HTTP_429_TOO_MANY_REQUESTS,
   111	            detail=str(exc),
   112	        ) from exc
   113	    except QueueFullError as exc:
   114	        raise HTTPException(
   115	            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
   116	            detail=str(exc),
   117	        ) from exc
   118	
   119	    await record_usage(db, current_user.id, action="training_job")
   120	    return JobSubmitResponse(job_id=job_id, status="queued")
   121	
   122	
   123	async def get_owned_job_or_404(job_id: int, current_user: User) -> Job:
   124	    """Return an owned job or raise 404."""
   125	    job = await (await init_job_queue()).get_job(job_id)
   126	    if job is None or job.user_id != current_user.id:
   127	        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Training job not found")
   128	    return job
   129	
   130	
   131	@router.post("", response_model=JobSubmitResponse, status_code=status.HTTP_202_ACCEPTED)
   132	@limiter.limit(TRAINING_SUBMIT_LIMIT, key_func=key_by_user)
   133	async def create_job(
   134	    request: Request,
   135	    body: JobSubmitRequest,
   136	    current_user: Annotated[User, Depends(_quota_user_with_rate_key)],
   137	    db: Annotated[AsyncSession, Depends(get_db)],
   138	) -> JobSubmitResponse:
   139	    """Submit a new training job."""
   140	    return await submit_training_job(body, current_user, db)
   141	
   142	
   143	@router.get("", response_model=list[JobResponse])
   144	async def list_jobs(
   145	    current_user: Annotated[User, Depends(get_verified_user)],
   146	) -> list[JobResponse]:
   147	    """List the current user's training jobs."""
   148	    jobs = await (await init_job_queue()).list_jobs(current_user.id)
   149	    return [serialize_job(job) for job in jobs]
   150	

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/routes/billing.py | sed -n '180,320p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/models.py | sed -n '110,160p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/routes/jobs.py | sed -n '90,125p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/routes/training.py | sed -n '42,55p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1000ms:
   180	    if not event_id:
   181	        return True
   182	
   183	    result = await db.execute(
   184	        text(
   185	            "INSERT INTO processed_stripe_events (event_id) "
   186	            "VALUES (:event_id) "
   187	            "ON CONFLICT (event_id) DO NOTHING"
   188	        ),
   189	        {"event_id": event_id},
   190	    )
   191	    return result.rowcount != 0
   192	
   193	
   194	# ---------------------------------------------------------------------------
   195	# Public helper: record_usage (called by training service)
   196	# ---------------------------------------------------------------------------
   197	
   198	async def record_usage(db: AsyncSession, user_id: int, action: str = "training_job") -> None:
   199	    """Increment the usage counter for the current billing period.
   200	
   201	    Creates the UsageRecord row if it does not exist yet.
   202	
   203	    Uses atomic SQL to avoid a read-modify-write race where two concurrent
   204	    requests could both pass the quota check before either increments.
   205	    """
   206	    period_start = _current_period_start()
   207	
   208	    # Attempt atomic UPDATE first: SET count = count + 1 in SQL, not Python.
   209	    result = await db.execute(
   210	        update(UsageRecord)
   211	        .where(
   212	            UsageRecord.user_id == user_id,
   213	            UsageRecord.action == action,
   214	            UsageRecord.period_start == period_start,
   215	        )
   216	        .values(count=UsageRecord.count + 1)
   217	    )
   218	
   219	    if result.rowcount == 0:
   220	        # No existing row — insert one.  If a concurrent request races us and
   221	        # inserts between our UPDATE and this INSERT, the unique constraint
   222	        # (user_id, action, period_start) will raise IntegrityError; retry
   223	        # with the atomic UPDATE which is now guaranteed to match.
   224	        try:
   225	            db.add(UsageRecord(
   226	                user_id=user_id,
   227	                action=action,
   228	                period_start=period_start,
   229	                count=1,
   230	            ))
   231	            await db.flush()
   232	        except IntegrityError:
   233	            await db.rollback()
   234	            await db.execute(
   235	                update(UsageRecord)
   236	                .where(
   237	                    UsageRecord.user_id == user_id,
   238	                    UsageRecord.action == action,
   239	                    UsageRecord.period_start == period_start,
   240	                )
   241	                .values(count=UsageRecord.count + 1)
   242	            )
   243	            await db.flush()
   244	    else:
   245	        await db.flush()
   246	
   247	
   248	# ---------------------------------------------------------------------------
   249	# Public helper: check_training_quota (dependency for training route)
   250	# ---------------------------------------------------------------------------
   251	
   252	async def check_training_quota(
   253	    current_user: Annotated[User, Depends(get_verified_user)],
   254	    db: Annotated[AsyncSession, Depends(get_db)],
   255	) -> User:
   256	    """FastAPI dependency that enforces the training quota for the current user.
   257	
   258	    Returns the User if quota is available; raises HTTP 403 if the monthly
   259	    limit has been reached.
   260	    """
   261	    sub = await _get_or_create_subscription(db, current_user.id)
   262	    limit = TIER_LIMITS.get(sub.tier)
   263	
   264	    # None means unlimited
   265	    if limit is None:
   266	        return current_user
   267	
   268	    used = await _get_usage_count(db, current_user.id)
   269	
   270	    # Warn at 80% of limit (best-effort, non-blocking).
   271	    if limit and used == int(limit * 0.8):
   272	        try:
   273	            from app.email_service import get_email_service
   274	
   275	            email_svc = get_email_service()
   276	            if email_svc.enabled:
   277	                import asyncio
   278	
   279	                asyncio.create_task(
   280	                    email_svc.send_quota_warning(
   281	                        to=current_user.email,
   282	                        used=used,
   283	                        limit=limit,
   284	                        tier=sub.tier,
   285	                    )
   286	                )
   287	        except Exception:
   288	            pass  # non-critical — don't block training
   289	
   290	    if used >= limit:
   291	        tier_name = sub.tier.capitalize()
   292	        if sub.tier == "free":
   293	            upgrade_msg = "Upgrade to Developer for 20 models/month."
   294	        elif sub.tier == "developer":
   295	            upgrade_msg = "Upgrade to Business for unlimited models."
   296	        else:
   297	            upgrade_msg = "Contact sales for a custom plan."
   298	
   299	        raise HTTPException(
   300	            status_code=status.HTTP_403_FORBIDDEN,
   301	            detail=(
   302	                f"Monthly training limit reached ({used}/{limit}). "
   303	                f"You are on the {tier_name} plan. {upgrade_msg}"
   304	            ),
   305	            headers={"X-Upgrade-URL": "/pricing"},
   306	        )
   307	
   308	    return current_user
   309	
   310	
   311	# ---------------------------------------------------------------------------
   312	# Routes
   313	# ---------------------------------------------------------------------------
   314	
   315	@router.post("/checkout", response_model=CheckoutResponse)
   316	@limiter.limit(CHECKOUT_LIMIT, key_func=key_by_user)
   317	async def create_checkout_session(
   318	    request: Request,
   319	    body: CheckoutRequest,
   320	    current_user: Annotated[User, Depends(_verified_user_with_rate_key)],
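
Note on the listing above: `check_training_quota` reads the usage count as a request dependency, while `record_usage` increments it only after `queue.submit_job` has returned (jobs.py line 119). The atomic UPDATE keeps the counter itself consistent, but it may not stop two concurrent submissions from both passing the check. A toy in-memory sketch of that check-then-increment window follows; it is not the real database path, and `Counter`, `check_then_record`, `reserve`, and `run` are hypothetical names used only for illustration.

```python
import asyncio

LIMIT = 3  # free-tier value from TIER_LIMITS


class Counter:
    """In-memory stand-in for a single UsageRecord row."""

    def __init__(self, used: int) -> None:
        self.used = used


async def check_then_record(counter: Counter) -> bool:
    # Step 1: quota check (stands in for check_training_quota).
    if counter.used >= LIMIT:
        return False
    # Other awaits (for example enqueueing the job) run between check and
    # increment; sleep(0) yields to the event loop to model that gap.
    await asyncio.sleep(0)
    # Step 2: increment (stands in for record_usage).
    counter.used += 1
    return True


async def reserve(counter: Counter) -> bool:
    # Hypothetical alternative: check and increment with no await between.
    if counter.used >= LIMIT:
        return False
    counter.used += 1
    return True


async def run(fn, start_used: int = 2, requests: int = 2):
    counter = Counter(start_used)
    results = await asyncio.gather(*(fn(counter) for _ in range(requests)))
    return list(results), counter.used


racy_results, racy_used = asyncio.run(run(check_then_record))
safe_results, safe_used = asyncio.run(run(reserve))
```

In the racy variant both requests are admitted and the counter ends above the limit; in the combined variant the second request is refused. Whether the real routes are exposed depends on transaction isolation and on how `_get_usage_count` is implemented, neither of which is shown in this log.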

 succeeded in 1009ms:
   110	    d_prime: Mapped[float | None] = mapped_column(Float, nullable=True)
   111	    size_bytes: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
   112	    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow)
   113	
   114	    user: Mapped[User] = relationship(back_populates="trained_models")
   115	
   116	
   117	class Subscription(Base):
   118	    __tablename__ = "subscriptions"
   119	
   120	    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
   121	    user_id: Mapped[int] = mapped_column(
   122	        Integer, ForeignKey("users.id", ondelete="CASCADE"), unique=True, nullable=False, index=True,
   123	    )
   124	    stripe_customer_id: Mapped[str | None] = mapped_column(String(255), nullable=True, index=True)
   125	    stripe_subscription_id: Mapped[str | None] = mapped_column(String(255), nullable=True, unique=True)
   126	    tier: Mapped[str] = mapped_column(String(20), nullable=False, default="free")
   127	    status: Mapped[str] = mapped_column(String(20), nullable=False, default="active")
   128	    current_period_end: Mapped[datetime | None] = mapped_column(DateTime(timezone=True), nullable=True)
   129	    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow)
   130	    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow, onupdate=_utcnow)
   131	
   132	    user: Mapped[User] = relationship(backref="subscription")
   133	
   134	
   135	class UsageRecord(Base):
   136	    __tablename__ = "usage_records"
   137	    __table_args__ = (
   138	        UniqueConstraint("user_id", "action", "period_start", name="uq_usage_user_action_period"),
   139	    )
   140	
   141	    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
   142	    user_id: Mapped[int] = mapped_column(
   143	        Integer, ForeignKey("users.id", ondelete="CASCADE"), nullable=False, index=True,
   144	    )
   145	    action: Mapped[str] = mapped_column(String(50), nullable=False)
   146	    period_start: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False)
   147	    count: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
   148	    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow)
   149	
   150	
   151	class ProcessedStripeEvent(Base):
   152	    __tablename__ = "processed_stripe_events"
   153	
   154	    event_id: Mapped[str] = mapped_column(Text, primary_key=True)
   155	    processed_at: Mapped[datetime] = mapped_column(
   156	        DateTime(timezone=True),
   157	        nullable=False,
   158	        server_default=func.now(),
   159	    )
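
Note on the listing above: the webhook de-duplication in billing.py (lines 183-191) relies on `INSERT ... ON CONFLICT (event_id) DO NOTHING` reporting zero affected rows for a redelivered event. A minimal sketch of that pattern, using the standard-library `sqlite3` module as a stand-in; the production database and driver are not shown in this log, and `claim_event` is a hypothetical name.

```python
import sqlite3

# In-memory database with the same primary key as ProcessedStripeEvent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed_stripe_events (event_id TEXT PRIMARY KEY)"
)


def claim_event(event_id: str) -> bool:
    """Return True only for the first delivery of an event ID."""
    cur = conn.execute(
        "INSERT INTO processed_stripe_events (event_id) VALUES (?) "
        "ON CONFLICT (event_id) DO NOTHING",
        (event_id,),
    )
    # A skipped insert changes no rows, so rowcount is 0 on redelivery.
    return cur.rowcount != 0


first = claim_event("evt_123")   # new event: row inserted
second = claim_event("evt_123")  # redelivery: conflict, nothing inserted
row_total = conn.execute(
    "SELECT COUNT(*) FROM processed_stripe_events"
).fetchone()[0]
```

This assumes a SQLite build with upsert support (3.24 or later). Other drivers may report `rowcount` differently, so the check is worth confirming against the database used in deployment.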

 succeeded in 991ms:
    42	@router.post("/start", response_model=TrainingStartResponse, status_code=status.HTTP_202_ACCEPTED)
    43	@limiter.limit(TRAINING_SUBMIT_LIMIT, key_func=key_by_user)
    44	async def start_training(
    45	    request: Request,
    46	    body: JobSubmitRequest,
    47	    current_user: Annotated[User, Depends(_quota_user_with_rate_key)],
    48	    db: Annotated[AsyncSession, Depends(get_db)],
    49	) -> TrainingStartResponse:
    50	    """Submit a training job through the persistent queue."""
    51	    job_response = await submit_training_job(body, current_user, db)
    52	    return TrainingStartResponse(job_id=job_response.job_id, status=job_response.status)
    53	
    54	
    55	@router.get("/status/{job_id}", response_model=TrainingStatusResponse)

 succeeded in 1005ms:
    90	
    91	
    92	async def submit_training_job(
    93	    body: JobSubmitRequest,
    94	    current_user: User,
    95	    db: AsyncSession,
    96	) -> JobSubmitResponse:
    97	    """Validate and enqueue a training job."""
    98	    wake_word, recording_ids, epochs = await validate_training_request(body, current_user, db)
    99	    queue = await init_job_queue()
   100	
   101	    try:
   102	        job_id = await queue.submit_job(
   103	            user_id=current_user.id,
   104	            wake_word=wake_word,
   105	            recording_ids=recording_ids,
   106	            epochs=epochs,
   107	        )
   108	    except TooManyPendingJobsError as exc:
   109	        raise HTTPException(
   110	            status_code=status.HTTP_429_TOO_MANY_REQUESTS,
   111	            detail=str(exc),
   112	        ) from exc
   113	    except QueueFullError as exc:
   114	        raise HTTPException(
   115	            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
   116	            detail=str(exc),
   117	        ) from exc
   118	
   119	    await record_usage(db, current_user.id, action="training_job")
   120	    return JobSubmitResponse(job_id=job_id, status="queued")
   121	
   122	
   123	async def get_owned_job_or_404(job_id: int, current_user: User) -> Job:
   124	    """Return an owned job or raise 404."""
   125	    job = await (await init_job_queue()).get_job(job_id)

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'import edge_tts|from pydub|import torchaudio|from audiomentations|import onnxruntime|import torch|import onnx|import scipy|import numpy|import librosa|fast_mp3_augment|from kokoro_onnx|import soundfile' src/violawake_sdk console/backend | head -260\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -nE 'edge-tts|pydub|torchaudio|audiomentations|onnxruntime|torch|onnx|scipy|numpy|librosa|fast_mp3|kokoro|soundfile|openwakeword' pyproject.toml console/backend/requirements.txt\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba pyproject.toml | sed -n '1,160p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'fast_mp3_augment|Mp3Compression|audiomentations' src/violawake_sdk console/backend | head -80\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 967ms:
pyproject.toml:19:    "onnx",
pyproject.toml:39:# openwakeword is optional — see [oww] extra (tflite-runtime has no Python 3.12+ Linux wheels)
pyproject.toml:41:    "onnxruntime>=1.17",
pyproject.toml:42:    "numpy>=1.24",
pyproject.toml:43:    "scipy>=1.11",
pyproject.toml:51:    "soundfile>=0.12",
pyproject.toml:62:    "kokoro-onnx>=0.4",
pyproject.toml:71:# OWW: openwakeword backbone (optional due to tflite-runtime Python 3.12 issue)
pyproject.toml:73:    "openwakeword>=0.6",
pyproject.toml:76:# TFLite: lightweight inference backend (alternative to onnxruntime)
pyproject.toml:89:    "torch>=2.1",
pyproject.toml:90:    "torchaudio>=2.1",
pyproject.toml:91:    "openwakeword>=0.6",
pyproject.toml:92:    "audiomentations>=0.37",
pyproject.toml:93:    "librosa>=0.10",
pyproject.toml:97:    "edge-tts>=6.1",
pyproject.toml:98:    "pydub>=0.25",
pyproject.toml:99:    "onnx>=1.15",
pyproject.toml:107:# Sample generation: TTS-based sample creation (no torch needed)
pyproject.toml:109:    "edge-tts>=6.1",
pyproject.toml:110:    "pydub>=0.25",
pyproject.toml:111:    "soundfile>=0.12",
pyproject.toml:181:    "*.onnx",
pyproject.toml:231:# numpy/torch return types are complex; these codes are noisy in strict mode
pyproject.toml:237:    "openwakeword.*",
pyproject.toml:238:    "onnxruntime.*",
pyproject.toml:241:    "soundfile.*",
pyproject.toml:243:    "kokoro_onnx.*",
pyproject.toml:247:    "torchaudio.*",
pyproject.toml:248:    "torch.*",
pyproject.toml:250:    "scipy.*",
console/backend/requirements.txt:18:numpy>=1.24
console/backend/requirements.txt:19:scipy>=1.11
console/backend/requirements.txt:20:edge-tts>=6.1
console/backend/requirements.txt:25:fast_mp3_augment

 succeeded in 1035ms:
     1	[build-system]
     2	requires = ["hatchling>=1.21"]
     3	build-backend = "hatchling.build"
     4	
     5	[project]
     6	name = "violawake"
     7	version = "0.2.3"
     8	description = "Open-source wake word detection SDK with training pipeline — privacy-first, on-device, Python-native"
     9	readme = "README.md"
    10	license = { file = "LICENSE" }
    11	authors = [
    12	    { name = "ViolaWake Contributors" },
    13	]
    14	keywords = [
    15	    "wake-word",
    16	    "voice-assistant",
    17	    "speech-recognition",
    18	    "on-device",
    19	    "onnx",
    20	    "tts",
    21	    "stt",
    22	]
    23	classifiers = [
    24	    "Development Status :: 3 - Alpha",
    25	    "Intended Audience :: Developers",
    26	    "License :: OSI Approved :: Apache Software License",
    27	    "Programming Language :: Python :: 3",
    28	    "Programming Language :: Python :: 3.10",
    29	    "Programming Language :: Python :: 3.11",
    30	    "Programming Language :: Python :: 3.12",
    31	    "Topic :: Multimedia :: Sound/Audio :: Speech",
    32	    "Topic :: Scientific/Engineering :: Artificial Intelligence",
    33	    "Operating System :: OS Independent",
    34	]
    35	requires-python = ">=3.10"
    36	
    37	# Core dependencies (inference-only: wake word + VAD)
    38	# pyaudio, requests, and tqdm are optional — see [audio] and [download] extras
    39	# openwakeword is optional — see [oww] extra (tflite-runtime has no Python 3.12+ Linux wheels)
    40	dependencies = [
    41	    "onnxruntime>=1.17",
    42	    "numpy>=1.24",
    43	    "scipy>=1.11",
    44	    "pysbd>=0.3.4",
    45	]
    46	
    47	[project.optional-dependencies]
    48	# Audio: microphone capture (pyaudio)
    49	audio = [
    50	    "pyaudio>=0.2.14",
    51	    "soundfile>=0.12",
    52	]
    53	
    54	# Download: model downloading with progress bars
    55	download = [
    56	    "requests>=2.31",
    57	    "tqdm>=4.66",
    58	]
    59	
    60	# TTS: Kokoro-82M on-device TTS
    61	tts = [
    62	    "kokoro-onnx>=0.4",
    63	    "sounddevice>=0.4",
    64	]
    65	
    66	# STT: faster-whisper transcription
    67	stt = [
    68	    "faster-whisper>=1.0",
    69	]
    70	
    71	# OWW: openwakeword backbone (optional due to tflite-runtime Python 3.12 issue)
    72	oww = [
    73	    "openwakeword>=0.6",
    74	]
    75	
    76	# TFLite: lightweight inference backend (alternative to onnxruntime)
    77	tflite = [
    78	    "tflite-runtime>=2.14.0",
    79	]
    80	
    81	# VAD: preferred VAD backends
    82	vad = [
    83	    "silero-vad>=6.2.1",
    84	    "webrtcvad>=2.0.10",
    85	]
    86	
    87	# Training: model training pipeline
    88	training = [
    89	    "torch>=2.1",
    90	    "torchaudio>=2.1",
    91	    "openwakeword>=0.6",
    92	    "audiomentations>=0.37",
    93	    "librosa>=0.10",
    94	    "scikit-learn>=1.3",
    95	    "matplotlib>=3.8",
    96	    "pandas>=2.1",
    97	    "edge-tts>=6.1",
    98	    "pydub>=0.25",
    99	    "onnx>=1.15",
   100	]
   101	
   102	# Documentation: API reference generation
   103	docs = [
   104	    "pdoc>=14.0",
   105	]
   106	
   107	# Sample generation: TTS-based sample creation (no torch needed)
   108	generate = [
   109	    "edge-tts>=6.1",
   110	    "pydub>=0.25",
   111	    "soundfile>=0.12",
   112	]
   113	
   114	# Full install
   115	all = [
   116	    "violawake[audio,download,tts,stt,oww,tflite,vad,training,generate]",
   117	]
   118	
   119	# Dev / CI
   120	dev = [
   121	    "hatchling>=1.21",
   122	    "pytest>=8.0",
   123	    "pytest-cov>=4.1",
   124	    "pytest-benchmark>=4.0",
   125	    "ruff>=0.3",
   126	    "mypy>=1.8",
   127	    "pytest-asyncio>=0.23",
   128	    "pre-commit>=3.6",
   129	    "pyaudio>=0.2.14",
   130	    "requests>=2.31",
   131	    "tqdm>=4.66",
   132	    "types-requests>=2.31",
   133	]
   134	
   135	[project.scripts]
   136	violawake-train = "violawake_sdk.tools.train:main"
   137	violawake-eval = "violawake_sdk.tools.evaluate:main"
   138	violawake-collect = "violawake_sdk.tools.collect_samples:main"
   139	violawake-download = "violawake_sdk.tools.download_model:main"
   140	violawake-expand-corpus = "violawake_sdk.tools.expand_corpus:main"
   141	violawake-streaming-eval = "violawake_sdk.tools.streaming_eval:main"
   142	violawake-test-confusables = "violawake_sdk.tools.test_confusables:main"
   143	violawake-contamination-check = "violawake_sdk.tools.contamination_check:main"
   144	violawake-generate = "violawake_sdk.tools.generate_samples:main"
   145	
   146	[project.urls]
   147	Homepage = "https://github.com/GeeIHadAGoodTime/ViolaWake"
   148	Documentation = "https://github.com/GeeIHadAGoodTime/ViolaWake#readme"
   149	Repository = "https://github.com/GeeIHadAGoodTime/ViolaWake"
   150	"Bug Tracker" = "https://github.com/GeeIHadAGoodTime/ViolaWake/issues"
   151	
   152	[tool.hatch.build.targets.sdist]
   153	exclude = [
   154	    "_training_corpus/",
   155	    "benchmark_v2/",
   156	    "benchmark_oww/",
   157	    "console/",
   158	    "corpus/",
   159	    "data/",
   160	    "dist/",

 succeeded in 2683ms:
src/violawake_sdk/async_detector.py:29:import numpy as np
src/violawake_sdk/audio.py:15:import numpy as np
src/violawake_sdk/audio.py:45:    import torchaudio
src/violawake_sdk/audio.py:46:    import torchaudio.transforms as T
src/violawake_sdk/audio.py:214:    import librosa  # Lazy import to avoid startup cost for inference
src/violawake_sdk/audio_source.py:27:import numpy as np
src/violawake_sdk/audio_source.py:176:                import soundfile as sf
src/violawake_sdk/backends/base.py:13:import numpy as np
src/violawake_sdk/backends/onnx_backend.py:14:import numpy as np
src/violawake_sdk/backends/onnx_backend.py:71:        import onnxruntime  # noqa: F401
src/violawake_sdk/backends/onnx_backend.py:80:        import onnxruntime as ort
src/violawake_sdk/backends/onnx_backend.py:97:            import onnxruntime  # noqa: F401
src/violawake_sdk/backends/tflite_backend.py:27:import numpy as np
src/violawake_sdk/backends/tflite_backend.py:406:        import onnx2tf  # noqa: F401
src/violawake_sdk/backends/tflite_backend.py:422:        import onnx
src/violawake_sdk/ensemble.py:18:import numpy as np
src/violawake_sdk/noise_profiler.py:15:import numpy as np
src/violawake_sdk/oww_backbone.py:12:import numpy as np
src/violawake_sdk/pipeline.py:12:import numpy as np
src/violawake_sdk/power_manager.py:15:import numpy as np
src/violawake_sdk/speaker.py:21:import numpy as np
src/violawake_sdk/stt.py:40:import numpy as np
src/violawake_sdk/stt_engine.py:27:import numpy as np
src/violawake_sdk/stt_engine.py:28:import scipy.io.wavfile
src/violawake_sdk/tools/confusables.py:10:import numpy as np
src/violawake_sdk/tools/contamination_check.py:28:    import numpy as np
src/violawake_sdk/tools/contamination_check.py:100:    import numpy as np
src/violawake_sdk/tools/evaluate.py:181:        import numpy as np
src/violawake_sdk/tools/expand_corpus.py:33:    import numpy as np
src/violawake_sdk/tools/expand_corpus.py:229:    import numpy as np
src/violawake_sdk/tools/generate_samples.py:39:    import numpy as np
src/violawake_sdk/tools/generate_samples.py:55:        from pydub import AudioSegment
src/violawake_sdk/tools/generate_samples.py:94:    import numpy as np
src/violawake_sdk/tools/generate_samples.py:102:    import numpy as np
src/violawake_sdk/tools/generate_samples.py:110:    import numpy as np
src/violawake_sdk/tools/generate_samples.py:121:    import numpy as np
src/violawake_sdk/tools/generate_samples.py:143:        import edge_tts
src/violawake_sdk/tools/generate_samples.py:167:        import edge_tts
src/violawake_sdk/tools/generate_samples.py:484:        import edge_tts  # noqa: F401
src/violawake_sdk/tools/streaming_eval.py:59:        import soundfile as sf
src/violawake_sdk/tools/streaming_eval.py:65:    import numpy as np
src/violawake_sdk/tools/test_confusables.py:20:import numpy as np
src/violawake_sdk/tools/train.py:68:    import numpy as np
src/violawake_sdk/tools/train.py:227:    import onnxruntime as ort
src/violawake_sdk/tools/train.py:272:        import edge_tts
src/violawake_sdk/tools/train.py:304:            from pydub import AudioSegment
src/violawake_sdk/tools/train.py:322:            import torchaudio
src/violawake_sdk/tools/train.py:343:    import numpy as np
src/violawake_sdk/tools/train.py:363:    import numpy as np
src/violawake_sdk/tools/train.py:400:    import numpy as np
src/violawake_sdk/tools/train.py:606:    import numpy as np
src/violawake_sdk/tools/train.py:634:    import numpy as np
src/violawake_sdk/tools/train.py:637:        from audiomentations import (
src/violawake_sdk/tools/train.py:686:    import numpy as np
src/violawake_sdk/tools/train.py:721:    import numpy as np
src/violawake_sdk/tools/train.py:865:    import numpy as np
src/violawake_sdk/tools/train.py:950:    import numpy as np
src/violawake_sdk/tools/train.py:1056:        import numpy as np
src/violawake_sdk/tools/train.py:1057:        import torch
src/violawake_sdk/tools/train.py:1058:        import torch.optim as optim
src/violawake_sdk/tools/train.py:1540:    import numpy as np
src/violawake_sdk/tools/train.py:1541:    import torch
src/violawake_sdk/tools/train.py:1728:        import numpy as np
src/violawake_sdk/tools/train.py:1729:        import torch
src/violawake_sdk/tools/train.py:1730:        import torch.nn as nn
src/violawake_sdk/tools/train.py:1731:        import torch.optim as optim
src/violawake_sdk/tools/train.py:2066:    import numpy as np
src/violawake_sdk/tools/train.py:2067:    import onnx
src/violawake_sdk/tools/train.py:2068:    from onnx import numpy_helper
src/violawake_sdk/training/augment.py:27:import numpy as np
src/violawake_sdk/training/evaluate.py:18:import numpy as np
src/violawake_sdk/training/evaluate.py:372:        import onnxruntime as ort
src/violawake_sdk/training/losses.py:14:    import torch
src/violawake_sdk/training/losses.py:15:    import torch.nn as nn
src/violawake_sdk/training/temporal_model.py:39:    import torch
src/violawake_sdk/training/temporal_model.py:40:    import torch.nn as nn
src/violawake_sdk/training/weight_averaging.py:44:    import torch
src/violawake_sdk/training/weight_averaging.py:45:    import torch.nn as nn
src/violawake_sdk/tts.py:25:import numpy as np
src/violawake_sdk/vad.py:25:import numpy as np
src/violawake_sdk/vad.py:161:            import torch
src/violawake_sdk/wake_detector.py:15:import numpy as np
console/backend/app/routes/recordings.py:10:import numpy as np
console/backend/app/routes/recordings.py:130:    import numpy as np
console/backend/scripts/train_full_pipeline.py:28:import numpy as np
console/backend/scripts/train_full_pipeline.py:56:    import edge_tts
console/backend/scripts/train_full_pipeline.py:67:    from pydub import AudioSegment

 succeeded in 2660ms:
src/violawake_sdk/tools/train.py:629:    """Augment positive clips with the roadmap audiomentations chain.
src/violawake_sdk/tools/train.py:637:        from audiomentations import (
src/violawake_sdk/tools/train.py:640:            Mp3Compression,
src/violawake_sdk/tools/train.py:647:            "audiomentations is required for positive augmentation. "
src/violawake_sdk/tools/train.py:659:            Mp3Compression(min_bitrate=32, max_bitrate=128, p=0.3),
src/violawake_sdk/tools/train.py:671:            # audiomentations reads numpy's global RNG internally.
src/violawake_sdk/tools/train.py:1116:            print("\nStep 2: Augmenting positive audio arrays with audiomentations...")

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"find . -maxdepth 3 -iname '*docker*' -o -name 'Dockerfile*' | sort\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='Dockerfile*' --include='*.yml' --include='*.yaml' --include='*.toml' --include='*.sh' -E 'pip install|requirements|violawake\\[|pyproject|console/backend' . | head -220\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' --include='*.sh' -E 'violawake_sdk|PYTHONPATH|sys.path|pip install' console/backend | head -220\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"find console -maxdepth 3 -type f | sort | grep -E 'Dockerfile|requirements|railway|Procfile|entrypoint|\\.sh"'$'"'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1503ms:
console/backend/entrypoint.sh
console/backend/requirements.txt
console/Dockerfile.backend
console/Dockerfile.frontend
console/frontend/Dockerfile

 succeeded in 1602ms:
./.dockerignore
./console/docker-compose.yml
./console/Dockerfile.backend
./console/Dockerfile.frontend
./console/frontend/.dockerignore
./console/frontend/Dockerfile
./docker-compose.production.yml

 succeeded in 2007ms:
console/backend/app/services/training_service.py:100:        from violawake_sdk.tools.train import (
console/backend/entrypoint.sh:20:# violawake_sdk.tools.train).
console/backend/scripts/train_full_pipeline.py:76:    from violawake_sdk.training.augment import apply_additive_noise, rir_augment
console/backend/scripts/train_full_pipeline.py:77:    from violawake_sdk.audio import load_audio
console/backend/scripts/train_full_pipeline.py:169:    from violawake_sdk.tools.train import _generate_confusable_negatives
console/backend/scripts/train_full_pipeline.py:189:    from violawake_sdk.tools.train import _generate_speech_negatives
console/backend/scripts/train_full_pipeline.py:245:    from violawake_sdk.tools.train import _train_temporal_cnn

2026-05-07T11:33:37.897720Z ERROR codex_core::tools::router: error=Exit code: 124
Wall time: 14.1 seconds
Output:
command timed out after 14051 milliseconds

 exited 124 in 14051ms:
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/Dockerfile.backend | sed -n '1,200p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba docker-compose.production.yml | sed -n '1,170p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/docker-compose.yml | sed -n '1,170p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --exclude-dir='.git' --exclude-dir='node_modules' --exclude-dir='__pycache__' --include='Dockerfile*' --include='*.yml' --include='*.yaml' --include='*.toml' --include='*.sh' -E 'pip install|requirements|violawake\\[|pyproject' console docker-compose.production.yml pyproject.toml | head -220\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1228ms:
     1	services:
     2	  backend:
     3	    build:
     4	      context: ..
     5	      dockerfile: console/Dockerfile.backend
     6	    ports:
     7	      - "8000:8000"
     8	    volumes:
     9	      - backend-data:/app/data
    10	      - ../src:/sdk/src:ro
    11	    environment:
    12	      - VIOLAWAKE_SECRET_KEY=dev-secret-change-in-production
    13	      - PYTHONPATH=/sdk/src
    14	      # - VIOLAWAKE_STRIPE_SECRET_KEY=sk_live_...
    15	      # - VIOLAWAKE_STRIPE_WEBHOOK_SECRET=whsec_...
    16	      # - VIOLAWAKE_STRIPE_PRICE_DEVELOPER=price_...
    17	      # - VIOLAWAKE_STRIPE_PRICE_BUSINESS=price_...
    18	    healthcheck:
    19	      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health"]
    20	      interval: 10s
    21	      timeout: 5s
    22	      retries: 3
    23	
    24	  frontend:
    25	    build:
    26	      context: .
    27	      dockerfile: Dockerfile.frontend
    28	    ports:
    29	      - "80:80"
    30	    environment:
    31	      - VITE_API_URL=/api
    32	      - VITE_API_PROXY_TARGET=http://backend:8000
    33	    depends_on:
    34	      backend:
    35	        condition: service_healthy
    36	
    37	volumes:
    38	  backend-data:

 succeeded in 1255ms:
     1	services:
     2	  postgres:
     3	    image: postgres:16-alpine
     4	    restart: always
     5	    environment:
     6	      POSTGRES_DB: violawake
     7	      POSTGRES_USER: violawake
     8	      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
     9	    volumes:
    10	      - pgdata:/var/lib/postgresql/data
    11	    healthcheck:
    12	      test: ["CMD-SHELL", "pg_isready -U violawake"]
    13	      interval: 5s
    14	      timeout: 3s
    15	      retries: 5
    16	    shm_size: 256mb
    17	
    18	  backend:
    19	    build:
    20	      context: .
    21	      dockerfile: console/Dockerfile.backend
    22	    restart: always
    23	    expose:
    24	      - "8000"
    25	    volumes:
    26	      - backend-data:/app/data
    27	      - ${VIOLAWAKE_NEGATIVES_CORPUS_DIR:-./violawake_data/negatives}:/app/negatives:ro
    28	      # Universal corpus (LibriSpeech + MUSAN) for training-time negative samples.
    29	      # training_service searches Path("corpus") (CWD-relative = /app/corpus) when
    30	      # edge-tts negative generation falls short. Without this mount, training
    31	      # fails with "Only 0 negative files generated" if upstream TTS is degraded.
    32	      - ./corpus:/app/corpus:ro
    33	    env_file:
    34	      - .env.production
    35	    environment:
    36	      - VIOLAWAKE_DB_URL=postgresql+asyncpg://violawake:${POSTGRES_PASSWORD}@postgres:5432/violawake
    37	      - VIOLAWAKE_NEGATIVES_CORPUS_DIR=/app/negatives
    38	    depends_on:
    39	      postgres:
    40	        condition: service_healthy
    41	    healthcheck:
    42	      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health"]
    43	      interval: 10s
    44	      timeout: 5s
    45	      retries: 5
    46	      start_period: 30s
    47	
    48	  frontend:
    49	    build:
    50	      context: .
    51	      dockerfile: console/Dockerfile.frontend
    52	    restart: always
    53	    ports:
    54	      - "80:80"
    55	    depends_on:
    56	      backend:
    57	        condition: service_healthy
    58	
    59	  tunnel:
    60	    image: cloudflare/cloudflared:latest
    61	    restart: always
    62	    command: tunnel --no-autoupdate run --token ${CLOUDFLARE_TUNNEL_TOKEN}
    63	    depends_on:
    64	      backend:
    65	        condition: service_healthy
    66	
    67	volumes:
    68	  pgdata:
    69	  backend-data:

 succeeded in 1309ms:
     1	FROM python:3.11-slim
     2	
     3	WORKDIR /app
     4	
     5	# Install system deps for audio processing + gosu for entrypoint
     6	RUN apt-get update && apt-get install -y --no-install-recommends \
     7	    portaudio19-dev curl gosu \
     8	    && rm -rf /var/lib/apt/lists/*
     9	
    10	# Install Python deps + upgrade build tools
    11	COPY console/backend/requirements.txt .
    12	RUN pip install --no-cache-dir --upgrade pip hatchling && \
    13	    pip install --no-cache-dir -r requirements.txt
    14	
    15	# Install ViolaWake SDK
    16	COPY pyproject.toml /sdk/
    17	COPY README.md /sdk/
    18	COPY LICENSE /sdk/
    19	COPY src/ /sdk/src/
    20	RUN pip install --no-cache-dir "/sdk[training]"
    21	
    22	# Copy backend code
    23	COPY console/backend/ .
    24	
    25	# Create data directory and non-root user
    26	RUN mkdir -p /app/data
    27	RUN addgroup --system app && adduser --system --ingroup app app
    28	RUN chown -R app:app /app /app/data
    29	COPY console/backend/entrypoint.sh /app/entrypoint.sh
    30	RUN chmod +x /app/entrypoint.sh
    31	
    32	EXPOSE 8000
    33	
    34	ENTRYPOINT ["/app/entrypoint.sh"]

 succeeded in 2144ms:
console/Dockerfile.backend:11:COPY console/backend/requirements.txt .
console/Dockerfile.backend:12:RUN pip install --no-cache-dir --upgrade pip hatchling && \
console/Dockerfile.backend:13:    pip install --no-cache-dir -r requirements.txt
console/Dockerfile.backend:16:COPY pyproject.toml /sdk/
console/Dockerfile.backend:20:RUN pip install --no-cache-dir "/sdk[training]"
pyproject.toml:116:    "violawake[audio,download,tts,stt,oww,tflite,vad,training,generate]",

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E '"'^[[:space:]]+(import |from [A-Za-z0-9_'"\\.]+ import )' console/backend/app/services console/backend/app | head -240\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'Path\\(__file__\\)\\.resolve\\(\\)\\.parent|Path\\.home\\(\\)|Path\\(\\\"corpus\\\"\\)|os\\.environ\\.get|VIOLAWAKE_.*DIR|tmp_dir|models_dir|upload_dir|data_dir' src/violawake_sdk console/backend/app console/backend/scripts | head -260\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'urllib\\.request|requests\\.|httpx\\.|urlopen\\(|Communicate\\(|download_models|download_url|kokoro|AutoModel|load_model' src/violawake_sdk console/backend/app console/backend/scripts | head -260\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'torch\\.load\\(|whisper\\.load_model\\(|transformers\\.AutoModel|ort\\.InferenceSession\\(|Interpreter\\(' src/violawake_sdk console/backend/app console/backend/scripts | head -180\"" in J:\CLAUDE\PROJECTS\Wakeword
2026-05-07T11:34:24.859497Z ERROR codex_core::tools::router: error=Exit code: 1
Wall time: 1.2 seconds
Output:
/usr/bin/bash: -c: line 1: unexpected EOF while looking for matching `''

 exited 1 in 1163ms:
/usr/bin/bash: -c: line 1: unexpected EOF while looking for matching `''

 succeeded in 1247ms:
console/backend/app/services/training_service.py:100:        from violawake_sdk.tools.train import (
console/backend/app/config.py:128:            import logging
console/backend/app/config.py:142:            import logging
console/backend/app/job_queue.py:117:    from app.models import Subscription
console/backend/app/job_queue.py:791:                from app.email_service import get_email_service
console/backend/app/job_queue.py:1170:        from app.models import Subscription
console/backend/app/job_queue.py:1195:            from app.retention import mark_recordings_for_deletion
console/backend/app/main.py:42:    from app.retention import (
console/backend/app/main.py:76:        from app.email_service import get_email_service
console/backend/app/main.py:170:    from app.retention import (
console/backend/app/middleware.py:111:        import sentry_sdk
console/backend/app/middleware.py:112:        from sentry_sdk.integrations.fastapi import FastApiIntegration
console/backend/app/monitoring.py:21:    import tomllib
console/backend/app/monitoring.py:23:    import tomli as tomllib
console/backend/app/retention.py:32:        from app.job_queue import JobStatus, get_job_queue
console/backend/app/retention.py:38:    import aiosqlite
console/backend/app/retention.py:261:                from app.storage import build_companion_config_identifier
console/backend/app/routes/auth.py:407:            import stripe
console/backend/app/routes/billing.py:69:    import stripe
console/backend/app/routes/billing.py:273:            from app.email_service import get_email_service
console/backend/app/routes/billing.py:277:                import asyncio
console/backend/app/routes/recordings.py:128:    import io
console/backend/app/routes/recordings.py:130:    import numpy as np
console/backend/app/routes/recordings.py:131:    from scipy.io import wavfile
console/backend/app/routes/recordings.py:156:        from scipy.signal import resample
console/backend/app/services/training_service.py:100:        from violawake_sdk.tools.train import (
console/backend/app/storage.py:151:            import boto3
console/backend/app/storage.py:152:            from botocore.config import Config as BotoConfig
console/backend/app/storage.py:198:        from botocore.exceptions import ClientError

 succeeded in 1307ms:
src/violawake_sdk/cli/download.py:3:from violawake_sdk.tools.download_model import main
src/violawake_sdk/models.py:58:        # Not directly downloadable — fetched at runtime by openwakeword.utils.download_models()
src/violawake_sdk/models.py:69:    # Kokoro TTS models hosted upstream at thewh1teagle/kokoro-onnx (Apache 2.0).
src/violawake_sdk/models.py:72:    "kokoro_v1_0": ModelSpec(
src/violawake_sdk/models.py:73:        name="kokoro_v1_0",
src/violawake_sdk/models.py:74:        url="https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx",
src/violawake_sdk/models.py:80:    "kokoro_voices_v1_0": ModelSpec(
src/violawake_sdk/models.py:81:        name="kokoro_voices_v1_0",
src/violawake_sdk/models.py:82:        url="https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin",
src/violawake_sdk/models.py:137:    prevent shipping unverified models.  ``download_model()`` also calls this
src/violawake_sdk/models.py:178:def _auto_download_model(model_name: str, spec: ModelSpec) -> Path:
src/violawake_sdk/models.py:183:    For full download features (progress bars, pinning), use ``download_model()``.
src/violawake_sdk/models.py:199:    import urllib.request
src/violawake_sdk/models.py:234:        response = urllib.request.urlopen(spec.url, timeout=60)  # noqa: S310 — URLs validated as HTTPS above
src/violawake_sdk/models.py:275:    # Size validation (same logic as download_model)
src/violawake_sdk/models.py:371:            return _auto_download_model(effective_name, spec)
src/violawake_sdk/models.py:381:def download_model(
src/violawake_sdk/models.py:483:        response = requests.get(spec.url, stream=True, timeout=30)
src/violawake_sdk/oww_backbone.py:121:        # the user is expected to call openwakeword.utils.download_models() once.
src/violawake_sdk/oww_backbone.py:126:            from openwakeword.utils import download_models  # type: ignore[import-not-found]
src/violawake_sdk/oww_backbone.py:129:                "OpenWakeWord backbone files missing (%s); fetching via openwakeword.utils.download_models()",
src/violawake_sdk/oww_backbone.py:132:            download_models()
src/violawake_sdk/oww_backbone.py:142:                f"{missing_str}. Tried to auto-download via openwakeword.utils.download_models() "
src/violawake_sdk/oww_backbone.py:144:                "`python -c \"from openwakeword.utils import download_models; download_models()\"`."
src/violawake_sdk/security/cert_pinning.py:32:``models.download_model()``.
src/violawake_sdk/security/cert_pinning.py:630:    This wraps requests.get() with pre- and post-connection pin
src/violawake_sdk/security/cert_pinning.py:640:        **kwargs: Additional kwargs passed to requests.get().
src/violawake_sdk/security/cert_pinning.py:643:        The requests.Response object.
src/violawake_sdk/security/cert_pinning.py:664:        return requests.get(url, stream=True, timeout=timeout, **kwargs)
src/violawake_sdk/security/cert_pinning.py:673:        return requests.get(url, stream=True, timeout=timeout, **kwargs)
src/violawake_sdk/security/cert_pinning.py:680:        return requests.get(url, stream=True, timeout=timeout, **kwargs)
src/violawake_sdk/security/cert_pinning.py:706:    response = requests.get(url, stream=True, timeout=timeout, **kwargs)
src/violawake_sdk/tools/confusables.py:460:def is_kokoro_tts_available() -> bool:
src/violawake_sdk/tools/confusables.py:463:        import kokoro_onnx  # noqa: F401
src/violawake_sdk/tools/confusables.py:475:    if not is_kokoro_tts_available():
src/violawake_sdk/tools/confusables.py:477:            "kokoro-onnx is not installed. Install with: pip install 'violawake[tts]'"
src/violawake_sdk/tools/confusables.py:500:    "is_kokoro_tts_available",
src/violawake_sdk/tools/download_model.py:67:    from violawake_sdk.models import MODEL_REGISTRY, download_model, list_cached_models
src/violawake_sdk/tools/download_model.py:102:        print("For TTS, run: violawake-download --model kokoro_v1_0")
src/violawake_sdk/tools/download_model.py:109:            path = download_model(
src/violawake_sdk/tools/expand_corpus.py:89:    response = requests.get(url, stream=True, timeout=timeout)
src/violawake_sdk/tools/generate_samples.py:171:    communicate = edge_tts.Communicate(text, voice_name)
src/violawake_sdk/tools/test_confusables.py:213:    kokoro_audio: dict[str, np.ndarray] = {}
src/violawake_sdk/tools/test_confusables.py:216:        kokoro_audio = generate_confusable_tts_audio(
src/violawake_sdk/tools/test_confusables.py:221:        audio_backend = "kokoro"
src/violawake_sdk/tools/test_confusables.py:226:                "kokoro-onnx is unavailable and no --audio-dir was provided"
src/violawake_sdk/tools/test_confusables.py:241:                    if confusable not in kokoro_audio:
src/violawake_sdk/tools/test_confusables.py:243:                    audio = kokoro_audio[confusable]
src/violawake_sdk/tools/test_confusables.py:244:                    source = "kokoro"
src/violawake_sdk/tools/test_confusables.py:245:                    audio_backend = "audio_dir+kokoro"
src/violawake_sdk/tools/test_confusables.py:247:                    if audio_backend == "kokoro":
src/violawake_sdk/tools/test_confusables.py:248:                        audio_backend = "audio_dir+kokoro"
src/violawake_sdk/tools/test_confusables.py:250:                audio = kokoro_audio[confusable]
src/violawake_sdk/tools/test_confusables.py:251:                source = "kokoro"
src/violawake_sdk/tools/train.py:278:        communicate = edge_tts.Communicate(text, voice)
src/violawake_sdk/tools/train.py:355:def _kokoro_tts_synthesize(
src/violawake_sdk/tools/train.py:371:        kokoro_engine = engine
src/violawake_sdk/tools/train.py:372:        if kokoro_engine is None:
src/violawake_sdk/tools/train.py:373:            kokoro_engine = TTSEngine(voice=voice, sample_rate=TTS_SAMPLE_RATE)
src/violawake_sdk/tools/train.py:375:            kokoro_engine.voice = voice
src/violawake_sdk/tools/train.py:377:        audio = np.asarray(kokoro_engine.synthesize(text), dtype=np.float32)
src/violawake_sdk/tools/train.py:380:        if int(kokoro_engine.sample_rate) != 16000:
src/violawake_sdk/tools/train.py:381:            audio = _resample_audio(audio, int(kokoro_engine.sample_rate), 16000)
src/violawake_sdk/tools/train.py:409:    kokoro_fallback = False
src/violawake_sdk/tools/train.py:410:    kokoro_engine: Any | None = None
src/violawake_sdk/tools/train.py:411:    kokoro_voices: list[str] = []
src/violawake_sdk/tools/train.py:413:    def _ensure_kokoro_ready() -> bool:
src/violawake_sdk/tools/train.py:414:        nonlocal kokoro_fallback, kokoro_engine, kokoro_voices
src/violawake_sdk/tools/train.py:415:        if kokoro_fallback:
src/violawake_sdk/tools/train.py:416:            return kokoro_engine is not None and len(kokoro_voices) > 0
src/violawake_sdk/tools/train.py:423:        kokoro_fallback = True
src/violawake_sdk/tools/train.py:424:        kokoro_voices = list(AVAILABLE_VOICES)
src/violawake_sdk/tools/train.py:425:        if not kokoro_voices:
src/violawake_sdk/tools/train.py:428:            kokoro_engine = TTSEngine(
src/violawake_sdk/tools/train.py:429:                voice=kokoro_voices[0],
src/violawake_sdk/tools/train.py:433:            kokoro_engine = None
src/violawake_sdk/tools/train.py:434:        return kokoro_engine is not None
src/violawake_sdk/tools/train.py:449:            if kokoro_fallback:
src/violawake_sdk/tools/train.py:450:                kokoro_voice = kokoro_voices[voice_idx % len(kokoro_voices)]
src/violawake_sdk/tools/train.py:451:                ok = _kokoro_tts_synthesize(
src/violawake_sdk/tools/train.py:453:                    kokoro_voice,
src/violawake_sdk/tools/train.py:455:                    engine=kokoro_engine,
src/violawake_sdk/tools/train.py:459:                if not ok and _ensure_kokoro_ready():
src/violawake_sdk/tools/train.py:460:                    kokoro_voice = kokoro_voices[voice_idx % len(kokoro_voices)]
src/violawake_sdk/tools/train.py:461:                    ok = _kokoro_tts_synthesize(
src/violawake_sdk/tools/train.py:463:                        kokoro_voice,
src/violawake_sdk/tools/train.py:465:                        engine=kokoro_engine,
src/violawake_sdk/tts.py:4:This module wraps kokoro-onnx with sentence-chunked streaming for low-latency
src/violawake_sdk/tts.py:15:Note: TTSEngine requires the 'kokoro-onnx' package.
src/violawake_sdk/tts.py:64:    Calls are serialized via ``_synthesis_lock`` since kokoro-onnx is not
src/violawake_sdk/tts.py:69:        - ``kokoro_v1_0.onnx`` — Kokoro-82M model (~326MB)
src/violawake_sdk/tts.py:70:        - ``kokoro_voices_v1_0.bin`` — Voice embeddings (~28MB)
src/violawake_sdk/tts.py:106:        self._kokoro: object | None = None
src/violawake_sdk/tts.py:111:    def _get_kokoro(self) -> object:
src/violawake_sdk/tts.py:114:            if self._kokoro is None:
src/violawake_sdk/tts.py:115:                self._kokoro = self._load_kokoro()
src/violawake_sdk/tts.py:116:        return self._kokoro
src/violawake_sdk/tts.py:118:    def _load_kokoro(self) -> object:
src/violawake_sdk/tts.py:121:            import kokoro_onnx
src/violawake_sdk/tts.py:124:                "kokoro-onnx is not installed. Install with: pip install 'violawake[tts]'"
src/violawake_sdk/tts.py:128:            model_path = get_model_path("kokoro_v1_0")
src/violawake_sdk/tts.py:129:            voices_path = get_model_path("kokoro_voices_v1_0")
src/violawake_sdk/tts.py:133:                "  violawake-download --model kokoro_v1_0\n"
src/violawake_sdk/tts.py:134:                "  violawake-download --model kokoro_voices_v1_0"
src/violawake_sdk/tts.py:138:            kokoro = kokoro_onnx.Kokoro(str(model_path), str(voices_path))
src/violawake_sdk/tts.py:143:        return kokoro
src/violawake_sdk/tts.py:158:        kokoro = self._get_kokoro()
src/violawake_sdk/tts.py:160:        # Hold synthesis lock to serialize access to the kokoro model,
src/violawake_sdk/tts.py:161:        # which is not guaranteed to be thread-safe by kokoro-onnx.
src/violawake_sdk/tts.py:164:                # kokoro-onnx API: returns (samples, sample_rate)
src/violawake_sdk/tts.py:165:                audio, sr = kokoro.create(  # type: ignore[attr-defined]
src/violawake_sdk/tts.py:308:        self._kokoro = None
src/violawake_sdk/__init__.py:111:        if spec.name in ("oww_backbone", "kokoro_v1_0", "kokoro_voices_v1_0"):
console/backend/app/email_service.py:87:    async def send_training_complete(self, to: str, model_name: str, download_url: str) -> bool:
console/backend/app/email_service.py:93:            button_url=self._absolute_url(download_url),
console/backend/app/job_queue.py:798:                        download_url = f"/models/{model_id}/download"
console/backend/app/job_queue.py:802:                            download_url=download_url,
console/backend/app/routes/models.py:135:def _load_model_metadata(model: TrainedModel) -> dict:
console/backend/app/routes/models.py:261:async def download_model(
console/backend/app/routes/models.py:306:    training_config = _load_model_metadata(model)
console/backend/app/routes/models.py:332:    metadata = _load_model_metadata(model)
console/backend/scripts/train_full_pipeline.py:57:    communicate = edge_tts.Communicate(text, voice)

 succeeded in 1309ms:
src/violawake_sdk/backends/onnx_backend.py:88:            session = ort.InferenceSession(str(model_path), providers=providers)
src/violawake_sdk/training/evaluate.py:377:    session = ort.InferenceSession(str(model_path), providers=["CPUExecutionProvider"])

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'Path\\(__file__\\)\\.resolve\\(\\)\\.parent|Path\\.home\\(\\)|tmp_dir|models_dir|upload_dir|data_dir|negatives_corpus_dir' src/violawake_sdk console/backend/app console/backend/scripts | head -260\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 822ms:
src/violawake_sdk/models.py:27:DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
src/violawake_sdk/tools/train.py:72:# Set by _train_temporal_cnn() via its tmp_dir parameter.
src/violawake_sdk/tools/train.py:1011:    tmp_dir: Path | str | None = None,
src/violawake_sdk/tools/train.py:1050:    if tmp_dir is not None:
src/violawake_sdk/tools/train.py:1051:        _TMP_DIR = str(tmp_dir)
src/violawake_sdk/tools/train.py:1603:    with tempfile.TemporaryDirectory(prefix="violawake_qc_", dir=_TMP_DIR) as tmp_dir:
src/violawake_sdk/tools/train.py:1604:        quality_dir = Path(tmp_dir)
src/violawake_sdk/tools/train.py:2425:        Path(__file__).resolve().parent.parent.parent.parent / "corpus",  # repo root
src/violawake_sdk/tools/train.py:2426:        Path.home() / ".violawake" / "corpus",
src/violawake_sdk/_constants.py:131:DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
console/backend/app/config.py:30:    base_dir: Path = Path(__file__).resolve().parent.parent
console/backend/app/config.py:31:    data_dir: Path = Path(__file__).resolve().parent.parent / "data"
console/backend/app/config.py:32:    db_path: Path = Path(__file__).resolve().parent.parent / "data" / "violawake.db"
console/backend/app/config.py:33:    upload_dir: Path = Path(__file__).resolve().parent.parent / "data" / "recordings"
console/backend/app/config.py:34:    models_dir: Path = Path(__file__).resolve().parent.parent / "data" / "models"
console/backend/app/config.py:35:    tmp_dir: Path = Path(__file__).resolve().parent.parent / "data" / "tmp"
console/backend/app/config.py:58:    negatives_corpus_dir: str = ""  # Path to curated negative audio corpus (paid tier)
console/backend/app/config.py:195:settings.data_dir.mkdir(parents=True, exist_ok=True)
console/backend/app/config.py:196:settings.upload_dir.mkdir(parents=True, exist_ok=True)
console/backend/app/config.py:197:settings.models_dir.mkdir(parents=True, exist_ok=True)
console/backend/app/config.py:198:settings.tmp_dir.mkdir(parents=True, exist_ok=True)
console/backend/app/health.py:123:    upload_dir = _check_directory(settings.upload_dir)
console/backend/app/health.py:124:    models_dir = _check_directory(settings.models_dir)
console/backend/app/health.py:125:    component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
console/backend/app/health.py:128:        "upload_dir": upload_dir,
console/backend/app/health.py:129:        "models_dir": models_dir,
console/backend/app/job_queue.py:151:        self._db_path = db_path or (settings.data_dir / "job_queue.db")
console/backend/app/job_queue.py:707:            output_dir = Path(tempfile.mkdtemp(prefix=f"violawake_job_{job.id}_", dir=str(settings.tmp_dir)))
console/backend/app/job_queue.py:1160:        corpus_path = settings.negatives_corpus_dir
console/backend/app/services/training_service.py:72:        positives_dir = Path(tempfile.mkdtemp(prefix="violawake_train_", dir=str(settings.tmp_dir)))
console/backend/app/services/training_service.py:107:        neg_temp_dir = Path(tempfile.mkdtemp(prefix="violawake_neg_", dir=str(settings.tmp_dir)))
console/backend/app/services/training_service.py:232:            Path(__file__).resolve().parent.parent.parent.parent / "corpus",  # repo root
console/backend/app/services/training_service.py:233:            Path.home() / ".violawake" / "corpus",
console/backend/app/services/training_service.py:322:            tmp_dir=settings.tmp_dir,
console/backend/app/storage.py:70:                settings.upload_dir,
console/backend/app/storage.py:71:                settings.models_dir,
console/backend/app/storage.py:130:            return settings.upload_dir.joinpath(*relative_parts)
console/backend/app/storage.py:135:            return settings.models_dir.joinpath(*relative_parts)
console/backend/app/storage.py:259:    for base_dir in (settings.upload_dir.resolve(strict=False), settings.models_dir.resolve(strict=False)):
console/backend/app/storage.py:271:    upload_dir = settings.upload_dir.resolve(strict=False)
console/backend/app/storage.py:272:    models_dir = settings.models_dir.resolve(strict=False)
console/backend/app/storage.py:275:        relative = resolved.relative_to(upload_dir)
console/backend/app/storage.py:280:    relative = resolved.relative_to(models_dir)
console/backend/scripts/train_full_pipeline.py:199:        Path(__file__).resolve().parent.parent.parent / "corpus",  # repo root
console/backend/scripts/train_full_pipeline.py:200:        Path.home() / ".violawake" / "corpus",

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/health.py | sed -n '1,160p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/storage.py | sed -n '1,170p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/storage.py | sed -n '170,290p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/scripts/train_full_pipeline.py | sed -n '180,215p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1298ms:
     1	"""Storage backends for recordings and trained models."""
     2	
     3	from __future__ import annotations
     4	
     5	import logging
     6	from pathlib import Path, PurePosixPath
     7	from typing import Protocol
     8	from urllib.parse import quote
     9	
    10	from app.config import settings
    11	
    12	logger = logging.getLogger("violawake.storage")
    13	
    14	_RECORDINGS_PREFIX = "recordings"
    15	_MODELS_PREFIX = "models"
    16	_VALID_PREFIXES = {_RECORDINGS_PREFIX, _MODELS_PREFIX}
    17	_storage: StorageBackend | None = None
    18	
    19	
    20	class StorageBackend(Protocol):
    21	    """Protocol implemented by all storage backends."""
    22	
    23	    def upload(self, key: str, data: bytes, content_type: str) -> str:
    24	        """Store bytes at the given key and return an access URL."""
    25	
    26	    def download(self, key: str) -> bytes:
    27	        """Download bytes for the given key."""
    28	
    29	    def delete(self, key: str) -> bool:
    30	        """Delete the object at the given key if it exists."""
    31	
    32	    def exists(self, key: str) -> bool:
    33	        """Return whether the key exists."""
    34	
    35	    def presigned_url(self, key: str, expires: int = 3600) -> str:
    36	        """Return a time-limited or internal-access URL for the key."""
    37	
    38	
    39	def build_recording_key(user_id: int, wake_word: str, filename: str) -> str:
    40	    """Build a storage key for a user recording."""
    41	    return PurePosixPath(_RECORDINGS_PREFIX, str(user_id), wake_word, filename).as_posix()
    42	
    43	
    44	def build_model_key(user_id: int, filename: str) -> str:
    45	    """Build a storage key for a trained model artifact."""
    46	    return PurePosixPath(_MODELS_PREFIX, str(user_id), filename).as_posix()
    47	
    48	
    49	def build_companion_config_identifier(identifier: str) -> str:
    50	    """Return the companion config identifier for a model key or legacy path."""
    51	    legacy_path = _as_legacy_path(identifier)
    52	    if legacy_path is not None:
    53	        return str(legacy_path.with_suffix(".config.json"))
    54	
    55	    key = _normalize_storage_key(identifier)
    56	    return PurePosixPath(key).with_suffix(".config.json").as_posix()
    57	
    58	
    59	def get_storage() -> StorageBackend:
    60	    """Return the configured storage backend singleton."""
    61	    global _storage
    62	
    63	    if _storage is None:
    64	        if _r2_is_configured():
    65	            logger.info("Using Cloudflare R2 bucket %s at %s", settings.r2_bucket, settings.r2_endpoint)
    66	            _storage = R2StorageBackend()
    67	        else:
    68	            logger.info(
    69	                "Using local storage under %s and %s",
    70	                settings.upload_dir,
    71	                settings.models_dir,
    72	            )
    73	            _storage = LocalStorageBackend()
    74	    return _storage
    75	
    76	
    77	class LocalStorageBackend:
    78	    """Filesystem-backed storage for development."""
    79	
    80	    def upload(self, key: str, data: bytes, content_type: str) -> str:
    81	        """Write a file to the local filesystem."""
    82	        del content_type
    83	
    84	        path = self._resolve_path(key)
    85	        path.parent.mkdir(parents=True, exist_ok=True)
    86	        path.write_bytes(data)
    87	        logger.debug("Stored local object %s at %s", key, path)
    88	        return self.presigned_url(key)
    89	
    90	    def download(self, key: str) -> bytes:
    91	        """Read a file from the local filesystem."""
    92	        path = self._resolve_path(key)
    93	        return path.read_bytes()
    94	
    95	    def delete(self, key: str) -> bool:
    96	        """Delete a local file if it exists."""
    97	        path = self._resolve_path(key)
    98	        if not path.exists():
    99	            return False
   100	
   101	        path.unlink()
   102	        logger.debug("Deleted local object %s at %s", key, path)
   103	        return True
   104	
   105	    def exists(self, key: str) -> bool:
   106	        """Return whether the local file exists."""
   107	        path = self._resolve_path(key)
   108	        return path.exists()
   109	
   110	    def presigned_url(self, key: str, expires: int = 3600) -> str:
   111	        """Return the internal file route for a local storage key."""
   112	        del expires
   113	
   114	        route_key = self._key_for_identifier(key)
   115	        return f"/api/files/{quote(route_key, safe='/')}"
   116	
   117	    def _resolve_path(self, identifier: str) -> Path:
   118	        legacy_path = _as_legacy_path(identifier)
   119	        if legacy_path is not None:
   120	            return _validate_legacy_path(legacy_path)
   121	
   122	        key = _normalize_storage_key(identifier)
   123	        parts = PurePosixPath(key).parts
   124	        prefix = parts[0]
   125	        relative_parts = parts[1:]
   126	
   127	        if prefix == _RECORDINGS_PREFIX:
   128	            if len(relative_parts) < 3:
   129	                raise ValueError(f"Invalid recording key: {identifier}")
   130	            return settings.upload_dir.joinpath(*relative_parts)
   131	
   132	        if prefix == _MODELS_PREFIX:
   133	            if len(relative_parts) < 2:
   134	                raise ValueError(f"Invalid model key: {identifier}")
   135	            return settings.models_dir.joinpath(*relative_parts)
   136	
   137	        raise ValueError(f"Unsupported storage key prefix: {prefix}")
   138	
   139	    def _key_for_identifier(self, identifier: str) -> str:
   140	        legacy_path = _as_legacy_path(identifier)
   141	        if legacy_path is not None:
   142	            return _key_from_local_path(_validate_legacy_path(legacy_path))
   143	        return _normalize_storage_key(identifier)
   144	
   145	
   146	class R2StorageBackend:
   147	    """Cloudflare R2 storage using the S3-compatible API."""
   148	
   149	    def __init__(self) -> None:
   150	        try:
   151	            import boto3
   152	            from botocore.config import Config as BotoConfig
   153	        except ImportError as exc:
   154	            raise RuntimeError("boto3 is required when Cloudflare R2 storage is configured") from exc
   155	
   156	        self.bucket = settings.r2_bucket
   157	        self.client = boto3.client(
   158	            "s3",
   159	            endpoint_url=settings.r2_endpoint,
   160	            aws_access_key_id=settings.r2_access_key_id,
   161	            aws_secret_access_key=settings.r2_secret_access_key,
   162	            region_name="auto",
   163	            config=BotoConfig(signature_version="s3v4"),
   164	        )
   165	
   166	    def upload(self, key: str, data: bytes, content_type: str) -> str:
   167	        """Upload an object to R2."""
   168	        normalized_key = _normalize_storage_key(key)
   169	        self.client.put_object(
   170	            Bucket=self.bucket,

 succeeded in 1311ms:
     1	"""Production health checks for ViolaWake Console."""
     2	
     3	from __future__ import annotations
     4	
     5	import secrets
     6	import tempfile
     7	from pathlib import Path
     8	from typing import Any
     9	
    10	from fastapi import APIRouter, Depends, Header, HTTPException, Request, status
    11	from fastapi.responses import JSONResponse
    12	from sqlalchemy import text
    13	
    14	from app.config import settings
    15	from app.database import async_session_factory
    16	from app.job_queue import get_job_queue
    17	from app.monitoring import (
    18	    APP_VERSION,
    19	    ERROR_TRACKER,
    20	    HEALTH_STATUS_DEGRADED,
    21	    HEALTH_STATUS_ERROR,
    22	    HEALTH_STATUS_OK,
    23	    get_uptime_seconds,
    24	)
    25	
    26	router = APIRouter(prefix="/api/health", tags=["health"])
    27	
    28	
    29	async def require_admin_health_details(
    30	    x_admin_token: str | None = Header(default=None, alias="X-Admin-Token"),
    31	) -> None:
    32	    """Hide the detailed health endpoint unless the configured admin token matches."""
    33	    admin_token = getattr(settings, "admin_token", "")
    34	    if not admin_token or x_admin_token is None or not secrets.compare_digest(x_admin_token, admin_token):
    35	        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Not found")
    36	
    37	
    38	def _combine_statuses(*statuses: str) -> str:
    39	    if any(component_status == HEALTH_STATUS_ERROR for component_status in statuses):
    40	        return HEALTH_STATUS_ERROR
    41	    if any(component_status == HEALTH_STATUS_DEGRADED for component_status in statuses):
    42	        return HEALTH_STATUS_DEGRADED
    43	    return HEALTH_STATUS_OK
    44	
    45	
    46	def _check_directory(path: Path) -> dict[str, Any]:
    47	    exists = path.exists()
    48	    is_dir = path.is_dir()
    49	    writable = False
    50	    error: str | None = None
    51	
    52	    if exists and is_dir:
    53	        try:
    54	            with tempfile.NamedTemporaryFile(dir=path, prefix=".health-", delete=True):
    55	                writable = True
    56	        except OSError as exc:
    57	            error = str(exc)
    58	    else:
    59	        error = "Directory is missing" if not exists else "Path is not a directory"
    60	
    61	    return {
    62	        "path": str(path),
    63	        "exists": exists,
    64	        "writable": writable,
    65	        "status": HEALTH_STATUS_OK if exists and is_dir and writable else HEALTH_STATUS_ERROR,
    66	        "error": error,
    67	    }
    68	
    69	
    70	async def _check_database() -> dict[str, Any]:
    71	    try:
    72	        async with async_session_factory() as session:
    73	            await session.execute(text("SELECT 1"))
    74	    except Exception as exc:
    75	        return {
    76	            "status": HEALTH_STATUS_ERROR,
    77	            "connected": False,
    78	            "target": settings.database_log_target,
    79	            "error": str(exc),
    80	        }
    81	
    82	    return {
    83	        "status": HEALTH_STATUS_OK,
    84	        "connected": True,
    85	        "target": settings.database_log_target,
    86	    }
    87	
    88	
    89	async def _check_training_queue() -> dict[str, Any]:
    90	    try:
    91	        queue_snapshot = await get_job_queue().runtime_snapshot()
    92	    except RuntimeError as exc:
    93	        return {
    94	            "status": HEALTH_STATUS_ERROR,
    95	            "queue_depth": 0,
    96	            "worker_status": {
    97	                "active_workers": 0,
    98	                "max_workers": settings.max_concurrent_jobs,
    99	                "available_slots": settings.max_concurrent_jobs,
   100	                "worker_task_running": False,
   101	            },
   102	            "error": str(exc),
   103	        }
   104	
   105	    queue_depth = int(queue_snapshot["queue_depth"])
   106	    worker_status = dict(queue_snapshot["worker_status"])
   107	    worker_status["persisted_running_jobs"] = int(queue_snapshot["persisted_running_jobs"])
   108	
   109	    component_status = HEALTH_STATUS_OK
   110	    if not worker_status["worker_task_running"] or worker_status["active_workers"] > worker_status["max_workers"]:
   111	        component_status = HEALTH_STATUS_ERROR
   112	    elif queue_depth > 0:
   113	        component_status = HEALTH_STATUS_DEGRADED
   114	
   115	    return {
   116	        "status": component_status,
   117	        "queue_depth": queue_depth,
   118	        "worker_status": worker_status,
   119	    }
   120	
   121	
   122	def _check_storage() -> dict[str, Any]:
   123	    upload_dir = _check_directory(settings.upload_dir)
   124	    models_dir = _check_directory(settings.models_dir)
   125	    component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
   126	    return {
   127	        "status": component_status,
   128	        "upload_dir": upload_dir,
   129	        "models_dir": models_dir,
   130	    }
   131	
   132	
   133	def _check_billing() -> dict[str, Any]:
   134	    configured = bool(settings.stripe_secret_key)
   135	    return {
   136	        "status": HEALTH_STATUS_OK if configured else HEALTH_STATUS_DEGRADED,
   137	        "configured": configured,
   138	    }
   139	
   140	
   141	async def build_health_payload(app: Any) -> dict[str, Any]:
   142	    database = await _check_database()
   143	    training_queue = await _check_training_queue()
   144	    storage = _check_storage()
   145	    billing = _check_billing()
   146	
   147	    startup_complete = bool(getattr(app.state, "startup_complete", False))
   148	    ready = startup_complete and database["status"] == HEALTH_STATUS_OK
   149	
   150	    status_value = _combine_statuses(
   151	        database["status"],
   152	        training_queue["status"],
   153	        storage["status"],
   154	        billing["status"],
   155	    )
   156	    if not ready:
   157	        status_value = HEALTH_STATUS_ERROR
   158	
   159	    return {
   160	        "status": status_value,

 succeeded in 1315ms:
   170	            Bucket=self.bucket,
   171	            Key=normalized_key,
   172	            Body=data,
   173	            ContentType=content_type,
   174	        )
   175	        logger.debug("Uploaded R2 object %s to bucket %s", normalized_key, self.bucket)
   176	        return self.presigned_url(normalized_key)
   177	
   178	    def download(self, key: str) -> bytes:
   179	        """Download an object from R2."""
   180	        normalized_key = _normalize_storage_key(key)
   181	        response = self.client.get_object(Bucket=self.bucket, Key=normalized_key)
   182	        body = response["Body"]
   183	        try:
   184	            return body.read()
   185	        finally:
   186	            body.close()
   187	
   188	    def delete(self, key: str) -> bool:
   189	        """Delete an object from R2 if it exists."""
   190	        normalized_key = _normalize_storage_key(key)
   191	        existed = self.exists(normalized_key)
   192	        self.client.delete_object(Bucket=self.bucket, Key=normalized_key)
   193	        logger.debug("Deleted R2 object %s from bucket %s", normalized_key, self.bucket)
   194	        return existed
   195	
   196	    def exists(self, key: str) -> bool:
   197	        """Return whether an object exists in R2."""
   198	        from botocore.exceptions import ClientError
   199	
   200	        normalized_key = _normalize_storage_key(key)
   201	        try:
   202	            self.client.head_object(Bucket=self.bucket, Key=normalized_key)
   203	            return True
   204	        except ClientError as exc:
   205	            error_code = exc.response.get("Error", {}).get("Code")
   206	            if error_code in {"404", "NoSuchKey", "NotFound"}:
   207	                return False
   208	            raise
   209	
   210	    def presigned_url(self, key: str, expires: int = 3600) -> str:
   211	        """Generate a signed GET URL for an R2 object."""
   212	        normalized_key = _normalize_storage_key(key)
   213	        return self.client.generate_presigned_url(
   214	            "get_object",
   215	            Params={"Bucket": self.bucket, "Key": normalized_key},
   216	            ExpiresIn=expires,
   217	        )
   218	
   219	
   220	def _normalize_storage_key(key: str) -> str:
   221	    """Normalize and validate a storage key."""
   222	    candidate = key.replace("\\", "/").strip().strip("/")
   223	    if not candidate:
   224	        raise ValueError("Storage key cannot be empty")
   225	
   226	    parts = PurePosixPath(candidate).parts
   227	    if not parts:
   228	        raise ValueError("Storage key cannot be empty")
   229	    if parts[0] not in _VALID_PREFIXES:
   230	        raise ValueError(f"Unsupported storage key prefix: {parts[0]}")
   231	    if any(part in {"", ".", ".."} for part in parts):
   232	        raise ValueError(f"Invalid storage key: {key}")
   233	    if len(parts) < 3:
   234	        raise ValueError(f"Incomplete storage key: {key}")
   235	
   236	    return PurePosixPath(*parts).as_posix()
   237	
   238	
   239	def _r2_is_configured() -> bool:
   240	    """Return whether R2 credentials are configured."""
   241	    return all((
   242	        settings.r2_endpoint.strip(),
   243	        settings.r2_access_key_id.strip(),
   244	        settings.r2_secret_access_key.strip(),
   245	    ))
   246	
   247	
   248	def _as_legacy_path(identifier: str) -> Path | None:
   249	    """Return a legacy absolute path identifier if one was provided."""
   250	    path = Path(identifier)
   251	    if path.is_absolute():
   252	        return path
   253	    return None
   254	
   255	
   256	def _validate_legacy_path(path: Path) -> Path:
   257	    """Ensure a legacy filesystem path is inside a managed storage root."""
   258	    resolved = path.resolve(strict=False)
   259	    for base_dir in (settings.upload_dir.resolve(strict=False), settings.models_dir.resolve(strict=False)):
   260	        try:
   261	            resolved.relative_to(base_dir)
   262	            return resolved
   263	        except ValueError:
   264	            continue
   265	    raise ValueError(f"Path is outside managed storage roots: {path}")
   266	
   267	
   268	def _key_from_local_path(path: Path) -> str:
   269	    """Convert a managed local path into a storage key."""
   270	    resolved = _validate_legacy_path(path)
   271	    upload_dir = settings.upload_dir.resolve(strict=False)
   272	    models_dir = settings.models_dir.resolve(strict=False)
   273	
   274	    try:
   275	        relative = resolved.relative_to(upload_dir)
   276	        return PurePosixPath(_RECORDINGS_PREFIX, *relative.parts).as_posix()
   277	    except ValueError:
   278	        pass
   279	
   280	    relative = resolved.relative_to(models_dir)
   281	    return PurePosixPath(_MODELS_PREFIX, *relative.parts).as_posix()

 succeeded in 1324ms:
   180	        args.wake_word, corpus_dir / "conf_r2",
   181	        n_confusables=16, voices_per_word=10, verbose=True,
   182	    )
   183	    if conf_r2:
   184	        neg_tag_map["neg_confusable_r2"] = conf_r2
   185	    print()
   186	
   187	    # Step 3: Speech negatives
   188	    print("Step 3: Speech negatives (common phrases)...")
   189	    from violawake_sdk.tools.train import _generate_speech_negatives
   190	    speech = _generate_speech_negatives(corpus_dir / "speech", n_voices=5, verbose=True)
   191	    if speech:
   192	        neg_tag_map["neg_speech"] = speech
   193	    print()
   194	
   195	    # Step 4: Universal corpus
   196	    print("Step 4: Universal corpus (LibriSpeech, MUSAN)...")
   197	    rng = random.Random(42)
   198	    search_paths = [
   199	        Path(__file__).resolve().parent.parent.parent / "corpus",  # repo root
   200	        Path.home() / ".violawake" / "corpus",
   201	        Path("corpus"),
   202	    ]
   203	    subdirs = {
   204	        "neg_librispeech": ("librispeech",),
   205	        "neg_musan_speech": ("musan/musan/speech", "musan/speech"),
   206	        "neg_musan_music": ("musan/musan/music", "musan/music"),
   207	        "neg_musan_noise": ("musan/musan/noise", "musan/noise"),
   208	    }
   209	    for tag, sds in subdirs.items():
   210	        for root in search_paths:
   211	            if not root.exists():
   212	                continue
   213	            for sd in sds:
   214	                cand = root / sd
   215	                if cand.exists():

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tts.py | sed -n '100,150p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tts.py | sed -n '150,190p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '1010,1165p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '1165,1235p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1220ms:
  1010	    augment_source_files: list[Path] | None = None,
  1011	    tmp_dir: Path | str | None = None,
  1012	) -> dict[str, Any]:
  1013	    """Train a TemporalCNN on 9-frame OWW embedding windows.
  1014	
  1015	    This replicates the proven production training recipe:
  1016	    - TemporalCNN(96, 9) architecture (~25K params)
  1017	    - FocalLoss(gamma=2.0, alpha=0.75, label_smoothing=0.05)
  1018	    - AdamW + cosine annealing LR
  1019	    - EMA weight averaging
  1020	    - Group-aware split
  1021	    - Early stopping
  1022	
  1023	    Args:
  1024	        pos_files: Positive audio file paths.
  1025	        neg_files: All negative audio file paths (flat list).
  1026	        output_path: Where to save the ONNX model.
  1027	        wake_word: Wake word name (for config).
  1028	        epochs: Max training epochs.
  1029	        augment: Whether to augment positives before extraction.
  1030	        eval_dir: Optional eval directory.
  1031	        batch_size: Mini-batch size.
  1032	        lr: Learning rate.
  1033	        patience: Early stopping patience (default 15, matching J5 proven recipe).
  1034	        verbose: Print progress.
  1035	        progress_callback: Optional callback for UI.
  1036	        device: Torch device hint.
  1037	        ema_decay: EMA decay factor.
  1038	        seq_len: Number of frames per temporal window.
  1039	        neg_tags: Optional dict mapping tag -> file list, for tagged negatives.
  1040	        augment_source_files: Optional subset of positives to augment. Defaults
  1041	            to all positives when omitted.
  1042	
  1043	    Returns:
  1044	        Config dict with training results.
  1045	    """
  1046	    training_start = time.monotonic()
  1047	
  1048	    # -- Direct temp files to a non-system drive when requested --------------
  1049	    global _TMP_DIR  # noqa: PLW0603
  1050	    if tmp_dir is not None:
  1051	        _TMP_DIR = str(tmp_dir)
  1052	        Path(_TMP_DIR).mkdir(parents=True, exist_ok=True)
  1053	
  1054	    # -- Lazy imports --------------------------------------------------------
  1055	    try:
  1056	        import numpy as np
  1057	        import torch
  1058	        import torch.optim as optim
  1059	        from torch.utils.data import DataLoader, TensorDataset
  1060	    except ImportError as e:
  1061	        print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
  1062	        print("Install with: pip install 'violawake[training]'", file=sys.stderr)
  1063	        sys.exit(1)
  1064	
  1065	    from violawake_sdk.training.losses import FocalLoss
  1066	    from violawake_sdk.training.temporal_model import (
  1067	        TemporalCNN,
  1068	        count_parameters,
  1069	        export_temporal_onnx,
  1070	    )
  1071	    from violawake_sdk.training.weight_averaging import (
  1072	        EMATracker,
  1073	        auto_select_averaging,
  1074	    )
  1075	
  1076	    # -- Deterministic seeding (matches production) --------------------------
  1077	    SEED = 42
  1078	    torch.manual_seed(SEED)
  1079	    np.random.seed(SEED)
  1080	    if torch.cuda.is_available():
  1081	        torch.cuda.manual_seed_all(SEED)
  1082	
  1083	    EMBEDDING_DIM = 96
  1084	    torch_device = "cuda" if torch.cuda.is_available() else "cpu"
  1085	
  1086	    # -- Load and augment positives before embedding extraction ---------------
  1087	    from violawake_sdk._constants import SAMPLE_RATE
  1088	    from violawake_sdk.audio import load_audio
  1089	
  1090	    validation_fraction = 0.2
  1091	    raw_pos_audio: list[np.ndarray] = []
  1092	    raw_pos_source_ids: list[int] = []
  1093	    augment_candidates: list[np.ndarray] = []
  1094	    augment_candidate_source_ids: list[int] = []
  1095	    augment_target_paths = set(augment_source_files or pos_files)
  1096	    load_failures = 0
  1097	
  1098	    for file_idx, wav_path in enumerate(pos_files):
  1099	        audio = load_audio(wav_path)
  1100	        if audio is None:
  1101	            load_failures += 1
  1102	            continue
  1103	        raw_pos_audio.append(audio)
  1104	        raw_pos_source_ids.append(file_idx)
  1105	        if wav_path in augment_target_paths:
  1106	            augment_candidates.append(audio)
  1107	            augment_candidate_source_ids.append(file_idx)
  1108	
  1109	    original_pos_clip_count = len(raw_pos_audio)
  1110	    n_augmented = 0
  1111	    augmented_pos_audio: list[np.ndarray] = []
  1112	    augmented_pos_source_ids: list[int] = []
  1113	
  1114	    if augment and augment_candidates:
  1115	        if verbose:
  1116	            print("\nStep 2: Augmenting positive audio arrays with audiomentations...")
  1117	
  1118	        min_augmented_total = 210
  1119	        copies_per_clip = max(1, math.ceil(min_augmented_total / len(augment_candidates)))
  1120	        augmented_pos_audio = _augment_positives(
  1121	            augment_candidates,
  1122	            sample_rate=SAMPLE_RATE,
  1123	            copies_per_clip=copies_per_clip,
  1124	            seed=SEED,
  1125	        )
  1126	        augmented_pos_source_ids = [
  1127	            source_id for source_id in augment_candidate_source_ids for _ in range(copies_per_clip)
  1128	        ]
  1129	        n_augmented = len(augmented_pos_audio)
  1130	        raw_pos_audio.extend(augmented_pos_audio)
  1131	        raw_pos_source_ids.extend(augmented_pos_source_ids)
  1132	
  1133	        if verbose:
  1134	            print(
  1135	                f"  {original_pos_clip_count} original clips + {n_augmented} augmented clips "
  1136	                f"= {len(raw_pos_audio)} positive clips before embeddings"
  1137	            )
  1138	    elif verbose and not augment:
  1139	        print("\nStep 2: Positive augmentation disabled; using original clips only.")
  1140	    elif verbose:
  1141	        print("\nStep 2: No positive clips available for augmentation; using originals only.")
  1142	
  1143	    if verbose and load_failures > 0:
  1144	        print(f"  Skipped {load_failures} positive files during audio loading")
  1145	
  1146	    # -- Extract temporal embeddings -----------------------------------------
  1147	    if verbose:
  1148	        print(f"\nStep 3: Extracting {seq_len}-frame temporal OWW embeddings...")
  1149	        print(f"  Processing {len(raw_pos_audio)} positive clips...")
  1150	
  1151	    pos_embs, pos_src, pos_tags = _extract_temporal_windows_from_audio(
  1152	        raw_pos_audio,
  1153	        raw_pos_source_ids,
  1154	        "pos",
  1155	        verbose=verbose,
  1156	        seq_len=seq_len,
  1157	    )
  1158	
  1159	    if len(pos_embs) < 5:
  1160	        print(
  1161	            f"ERROR: Only {len(pos_embs)} positive embeddings extracted. "
  1162	            "Need at least 5. Check audio files.",
  1163	            file=sys.stderr,
  1164	        )
  1165	        sys.exit(1)

 succeeded in 1223ms:
  1165	        sys.exit(1)
  1166	
  1167	    if verbose:
  1168	        print(f"\n  Processing {len(neg_files)} negative files...")
  1169	
  1170	    # Extract negatives with tags if provided
  1171	    all_neg_embs: list[np.ndarray] = []
  1172	    all_neg_src: list[int] = []
  1173	    all_neg_tags: list[str] = []
  1174	    source_offset = 0
  1175	
  1176	    if neg_tags:
  1177	        for ntag, nfiles in neg_tags.items():
  1178	            if not nfiles:
  1179	                continue
  1180	            embs, srcs, tags = _extract_temporal_embeddings(
  1181	                nfiles, ntag, verbose=verbose, seq_len=seq_len
  1182	            )
  1183	            # Offset source indices to avoid collisions across tag groups
  1184	            all_neg_embs.extend(embs)
  1185	            all_neg_src.extend([s + source_offset for s in srcs])
  1186	            all_neg_tags.extend(tags)
  1187	            source_offset += len(nfiles) + 1
  1188	    else:
  1189	        all_neg_embs, all_neg_src, all_neg_tags = _extract_temporal_embeddings(
  1190	            neg_files, "neg", verbose=verbose, seq_len=seq_len
  1191	        )
  1192	
  1193	    corpus_tags = {
  1194	        "neg_librispeech",
  1195	        "neg_musan_speech",
  1196	        "neg_musan_music",
  1197	        "neg_musan_noise",
  1198	    }
  1199	    corpus_found = bool(
  1200	        neg_tags and any(tag in corpus_tags and files for tag, files in neg_tags.items())
  1201	    )
  1202	
  1203	    if len(all_neg_embs) < 5:
  1204	        print(
  1205	            f"ERROR: Only {len(all_neg_embs)} negative embeddings extracted. Need at least 5.",
  1206	            file=sys.stderr,
  1207	        )
  1208	        sys.exit(1)
  1209	
  1210	    # -- Build dataset -------------------------------------------------------
  1211	    n_pos = len(pos_embs)
  1212	    n_neg = len(all_neg_embs)
  1213	
  1214	    X_data = np.array(pos_embs + all_neg_embs, dtype=np.float32)  # (N, 9, 96)
  1215	    labels = np.array([1] * n_pos + [0] * n_neg, dtype=np.int32)
  1216	    source_idx = np.array(pos_src + [s + max(pos_src) + 1 for s in all_neg_src], dtype=np.int32)
  1217	    tags = np.array(pos_tags + all_neg_tags)
  1218	
  1219	    if verbose:
  1220	        print(f"\nDataset: {n_pos} pos + {n_neg} neg = {n_pos + n_neg} total")
  1221	        print(f"  Temporal shape: ({seq_len} frames, {EMBEDDING_DIM}-dim)")
  1222	        print(f"  corpus_found: {corpus_found}")
  1223	
  1224	        # Show tag breakdown
  1225	        unique_tags = sorted(set(tags.tolist()))
  1226	        for t in unique_tags:
  1227	            count = int((tags == t).sum())
  1228	            print(f"    {t}: {count}")
  1229	
  1230	    # -- Group-aware split ---------------------------------------------------
  1231	    train_idx, val_idx = _group_aware_split(
  1232	        labels,
  1233	        source_idx,
  1234	        seed=SEED,
  1235	        val_fraction=validation_fraction,

 succeeded in 1238ms:
   150	                  Long text is processed as a single batch call.
   151	
   152	        Returns:
   153	            Audio samples as float32 numpy array at ``self.sample_rate``.
   154	        """
   155	        if not text.strip():
   156	            return np.zeros(0, dtype=np.float32)
   157	
   158	        kokoro = self._get_kokoro()
   159	
   160	        # Hold synthesis lock to serialize access to the kokoro model,
   161	        # which is not guaranteed to be thread-safe by kokoro-onnx.
   162	        with self._synthesis_lock:
   163	            try:
   164	                # kokoro-onnx API: returns (samples, sample_rate)
   165	                audio, sr = kokoro.create(  # type: ignore[attr-defined]
   166	                    text,
   167	                    voice=self.voice,
   168	                    speed=self.speed,
   169	                    lang="en-us",
   170	                )
   171	            except Exception as e:
   172	                logger.exception("TTS synthesis failed for text: %.50s...", text)
   173	                raise RuntimeError(f"TTS synthesis failed: {e}") from e
   174	
   175	        audio = np.asarray(audio, dtype=np.float32)
   176	
   177	        # Resample if needed
   178	        if sr != self.sample_rate:
   179	            audio = self._resample(audio, sr, self.sample_rate)
   180	
   181	        return audio
   182	
   183	    def synthesize_chunked(self, text: str) -> Generator[np.ndarray, None, None]:
   184	        """Synthesize text sentence-by-sentence for lower latency.
   185	
   186	        Splits text at sentence boundaries and yields audio for each sentence
   187	        as soon as it's synthesized. This allows playback to begin before
   188	        the full text is processed — matching the pattern from production Viola.
   189	
   190	        Args:

 succeeded in 1258ms:
   100	
   101	        self.voice = voice
   102	        self.speed = speed
   103	        self.sample_rate = sample_rate
   104	        self._lock = threading.Lock()
   105	        self._synthesis_lock = threading.Lock()
   106	        self._kokoro: object | None = None
   107	
   108	        # Lazy initialization — load model on first use
   109	        logger.info("TTSEngine created: voice=%s, speed=%.1f", voice, speed)
   110	
   111	    def _get_kokoro(self) -> object:
   112	        """Lazy-load the Kokoro model (thread-safe)."""
   113	        with self._lock:
   114	            if self._kokoro is None:
   115	                self._kokoro = self._load_kokoro()
   116	        return self._kokoro
   117	
   118	    def _load_kokoro(self) -> object:
   119	        """Load the Kokoro ONNX model."""
   120	        try:
   121	            import kokoro_onnx
   122	        except ImportError as e:
   123	            raise ImportError(
   124	                "kokoro-onnx is not installed. Install with: pip install 'violawake[tts]'"
   125	            ) from e
   126	
   127	        try:
   128	            model_path = get_model_path("kokoro_v1_0")
   129	            voices_path = get_model_path("kokoro_voices_v1_0")
   130	        except FileNotFoundError as e:
   131	            raise ModelNotFoundError(
   132	                "Kokoro models not found. Run:\n"
   133	                "  violawake-download --model kokoro_v1_0\n"
   134	                "  violawake-download --model kokoro_voices_v1_0"
   135	            ) from e
   136	
   137	        try:
   138	            kokoro = kokoro_onnx.Kokoro(str(model_path), str(voices_path))
   139	        except Exception as e:
   140	            raise ModelLoadError(f"Failed to load Kokoro model: {e}") from e
   141	
   142	        logger.info("Kokoro-82M loaded: %s", model_path)
   143	        return kokoro
   144	
   145	    def synthesize(self, text: str) -> np.ndarray:
   146	        """Synthesize text to audio.
   147	
   148	        Args:
   149	            text: Text to synthesize. May be multi-sentence.
   150	                  Long text is processed as a single batch call.

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -n 'sys.exit' src/violawake_sdk/tools/train.py src/violawake_sdk/training/evaluate.py console/backend/app/services/training_service.py console/backend/app/job_queue.py\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '807,910p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '910,1035p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '920,970p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1244ms:
src/violawake_sdk/tools/train.py:727:        sys.exit(1)
src/violawake_sdk/tools/train.py:874:        sys.exit(1)
src/violawake_sdk/tools/train.py:1063:        sys.exit(1)
src/violawake_sdk/tools/train.py:1165:        sys.exit(1)
src/violawake_sdk/tools/train.py:1208:        sys.exit(1)
src/violawake_sdk/tools/train.py:1736:        sys.exit(1)
src/violawake_sdk/tools/train.py:1753:        sys.exit(1)
src/violawake_sdk/tools/train.py:1759:        sys.exit(1)
src/violawake_sdk/tools/train.py:1817:        sys.exit(1)
src/violawake_sdk/tools/train.py:2267:        sys.exit(1)
src/violawake_sdk/tools/train.py:2271:        sys.exit(1)
src/violawake_sdk/tools/train.py:2298:            sys.exit(1)
src/violawake_sdk/tools/train.py:2351:        sys.exit(1)
src/violawake_sdk/tools/train.py:2500:        sys.exit(1)
src/violawake_sdk/tools/train.py:2542:            sys.exit(1)

 succeeded in 1237ms:
   807	        except TrainingCancelledError as exc:
   808	            current_job = await self.get_job(job_id)
   809	            completed_at = _utcnow()
   810	            progress_pct = current_job.progress_pct if current_job is not None else 0.0
   811	            await self._update_job(
   812	                job_id,
   813	                status=JobStatus.CANCELLED,
   814	                completed_at=completed_at,
   815	                error=str(exc),
   816	            )
   817	            await self._publish(
   818	                job_id,
   819	                {
   820	                    "status": JobStatus.CANCELLED.value,
   821	                    "progress": progress_pct,
   822	                    "epoch": 0,
   823	                    "total_epochs": current_job.epochs if current_job is not None else 0,
   824	                    "train_loss": 0.0,
   825	                    "val_loss": 0.0,
   826	                    "message": "Training cancelled.",
   827	                    "error": str(exc),
   828	                    "d_prime": current_job.d_prime if current_job is not None else None,
   829	                    "model_id": current_job.model_id if current_job is not None else None,
   830	                    "queue_position": None,
   831	                },
   832	            )
   833	            logger.info("Training job %s cancelled", job_id)
   834	        except Exception as exc:
   835	            current_job = await self.get_job(job_id)
   836	            completed_at = _utcnow()
   837	            await self._update_job(
   838	                job_id,
   839	                status=JobStatus.FAILED,
   840	                completed_at=completed_at,
   841	                error=str(exc),
   842	            )
   843	            user_id = current_job.user_id if current_job is not None else None
   844	            if user_id is not None:
   845	                await self._record_failure(user_id, str(exc))
   846	            await self._publish(
   847	                job_id,
   848	                {
   849	                    "status": JobStatus.FAILED.value,
   850	                    "progress": current_job.progress_pct if current_job is not None else 0.0,
   851	                    "epoch": 0,
   852	                    "total_epochs": current_job.epochs if current_job is not None else 0,
   853	                    "train_loss": 0.0,
   854	                    "val_loss": 0.0,
   855	                    "message": "Training failed.",
   856	                    "error": str(exc),
   857	                    "d_prime": current_job.d_prime if current_job is not None else None,
   858	                    "model_id": current_job.model_id if current_job is not None else None,
   859	                    "queue_position": None,
   860	                },
   861	            )
   862	            log_exception(
   863	                logger,
   864	                exc,
   865	                message="Training job failed",
   866	                source="job_queue",
   867	                extra={"job_id": job_id},
   868	            )
   869	        finally:
   870	            async with self._state_lock:
   871	                self._running_job_ids.discard(job_id)
   872	                self._cancel_events.pop(job_id, None)
   873	            if output_dir is not None and output_dir.exists():
   874	                shutil.rmtree(output_dir, ignore_errors=True)
   875	            await self._fill_queue_from_db()
   876	
   877	    async def _pending_count(self) -> int:
   878	        async with self._connect() as conn, conn.execute(
   879	            "SELECT COUNT(*) AS count FROM jobs WHERE status = ?",
   880	            (JobStatus.PENDING.value,),
   881	        ) as cursor:
   882	            row = await cursor.fetchone()
   883	        return int(row["count"]) if row is not None else 0
   884	
   885	    async def _fill_queue_from_db(self) -> None:
   886	        async with self._refill_lock:
   887	            free_slots = self._queue.maxsize - self._queue.qsize()
   888	            if free_slots <= 0:
   889	                return
   890	
   891	            now = _utcnow()
   892	            async with self._connect() as conn:
   893	                async with conn.execute(
   894	                    """
   895	                    SELECT id, user_id
   896	                    FROM jobs
   897	                    WHERE status = ?
   898	                    ORDER BY priority DESC, created_at ASC, id ASC
   899	                    """,
   900	                    (JobStatus.PENDING.value,),
   901	                ) as cursor:
   902	                    rows = await cursor.fetchall()
   903	
   904	                for row in rows:
   905	                    if free_slots <= 0:
   906	                        break
   907	                    job_id = int(row["id"])
   908	                    user_id = int(row["user_id"])
   909	                    async with self._state_lock:
   910	                        if job_id in self._queued_job_ids or job_id in self._running_job_ids:
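
Note on the handlers above: both branches catch `TrainingCancelledError` and `Exception`, while the grep output lists many `sys.exit(1)` calls in `src/violawake_sdk/tools/train.py`. `sys.exit` raises `SystemExit`, which derives from `BaseException`, so an `except Exception` handler does not catch it. Whether such an exit actually reaches `_execute_job` depends on how the training call is invoked, which the output shown here does not confirm. The sketch below only illustrates the Python behavior; `train_stub`, `run_job`, and `run_job_guarded` are hypothetical names, not functions from the repository.

```python
import sys


def train_stub() -> None:
    """Hypothetical stand-in for a training call that aborts via sys.exit."""
    print("ERROR: not enough embeddings", file=sys.stderr)
    sys.exit(1)


def run_job(fn) -> str:
    """Mimics a handler that only catches Exception."""
    try:
        fn()
    except Exception as exc:  # SystemExit is NOT a subclass of Exception
        return f"FAILED: {exc}"
    return "COMPLETED"


def run_job_guarded(fn) -> str:
    """Same handler, but converts SystemExit into an ordinary failure."""
    try:
        fn()
    except SystemExit as exc:
        return f"FAILED: training exited with code {exc.code}"
    except Exception as exc:
        return f"FAILED: {exc}"
    return "COMPLETED"
```

Calling `run_job(train_stub)` lets `SystemExit` propagate to the caller, while `run_job_guarded(train_stub)` returns a failure string. If the same pattern applies to the job queue, a failing job might never be marked `FAILED`; a minimum-touch mitigation would be to catch `SystemExit` at the call site rather than refactor the `sys.exit` calls.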

 succeeded in 1285ms:
   920	                f"    {file_idx + 1}/{len(audio_files)} files -> {len(all_embeddings)} embeddings"
   921	            )
   922	
   923	    if verbose:
   924	        print(
   925	            f"  [{tag}] {len(audio_files)} files -> {len(all_embeddings)} embeddings "
   926	            f"({failures} failures)"
   927	        )
   928	
   929	    return all_embeddings, all_source_idx, all_tags
   930	
   931	
   932	# ---------------------------------------------------------------------------
   933	# Group-aware train/val split
   934	# ---------------------------------------------------------------------------
   935	
   936	
   937	def _group_aware_split(
   938	    labels: np.ndarray,
   939	    source_idx: np.ndarray,
   940	    seed: int = 42,
   941	    val_fraction: float = 0.2,
   942	) -> tuple[np.ndarray, np.ndarray]:
   943	    """Group-aware stratified train/val split.
   944	
   945	    All embeddings from the same source file go to the same split
   946	    to prevent data leakage from augmented variants.
   947	
   948	    Returns (train_indices, val_indices) as numpy arrays.
   949	    """
   950	    import numpy as np
   951	
   952	    rng = np.random.default_rng(seed)
   953	
   954	    pos_mask = labels == 1
   955	    neg_mask = labels == 0
   956	
   957	    pos_sources = sorted(set(source_idx[pos_mask].tolist()))
   958	    neg_sources = sorted(set(source_idx[neg_mask].tolist()))
   959	
   960	    rng.shuffle(pos_sources)
   961	    rng.shuffle(neg_sources)
   962	
   963	    n_val_pos = max(1, int(len(pos_sources) * val_fraction))
   964	    n_val_neg = max(1, int(len(neg_sources) * val_fraction))
   965	
   966	    val_pos_sources = set(pos_sources[:n_val_pos])
   967	    val_neg_sources = set(neg_sources[:n_val_neg])
   968	
   969	    val_mask = np.zeros(len(labels), dtype=bool)
   970	    for i in range(len(labels)):
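
Note on the split sizing above: `n_val_pos = max(1, int(len(pos_sources) * val_fraction))` always reserves at least one source for validation. With a single positive source file (all augmented copies share its source id in the training function), that one source lands in `val_pos_sources`. The sketch below reproduces only the arithmetic visible in lines 963-967; it assumes the unseen remainder of `_group_aware_split` assigns every sample whose source is in the validation set to validation. `val_source_count` and `split_sources` are hypothetical helpers written for illustration.

```python
def val_source_count(n_sources: int, val_fraction: float = 0.2) -> int:
    """Same sizing rule as the listing: at least one validation source."""
    return max(1, int(n_sources * val_fraction))


def split_sources(sources: list[int], val_fraction: float = 0.2) -> tuple[list[int], list[int]]:
    """Split source ids into (train, val) without shuffling, for illustration."""
    n_val = val_source_count(len(sources), val_fraction)
    val = set(sources[:n_val])
    train = [s for s in sources if s not in val]
    return train, sorted(val)
```

Under that assumption, `split_sources([7])` leaves the training side empty, so a user who supplies one recording could end up with no positive training samples even though the earlier "at least 5 embeddings" check passes.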

 succeeded in 1324ms:
   910	                        if job_id in self._queued_job_ids or job_id in self._running_job_ids:
   911	                            continue
   912	
   913	                    breaker = await self._get_circuit_breaker_with_conn(conn, user_id)
   914	                    if breaker.paused:
   915	                        continue
   916	                    if breaker.next_attempt_at is not None and breaker.next_attempt_at > now:
   917	                        delay = (breaker.next_attempt_at - now).total_seconds()
   918	                        self._schedule_retry_fill(user_id, delay)
   919	                        continue
   920	
   921	                    try:
   922	                        self._queue.put_nowait(job_id)
   923	                    except asyncio.QueueFull:
   924	                        break
   925	
   926	                    async with self._state_lock:
   927	                        self._queued_job_ids.add(job_id)
   928	                    free_slots -= 1
   929	
   930	    async def _queue_position(self, job_id: int) -> int | None:
   931	        """Return the 1-based queue position for a pending job, or None if not pending."""
   932	        async with self._connect() as conn:
   933	            async with conn.execute(
   934	                """
   935	                SELECT id
   936	                FROM jobs
   937	                WHERE status = ?
   938	                ORDER BY priority DESC, created_at ASC, id ASC
   939	                """,
   940	                (JobStatus.PENDING.value,),
   941	            ) as cursor:
   942	                rows = await cursor.fetchall()
   943	
   944	        for position, row in enumerate(rows, start=1):
   945	            if int(row["id"]) == job_id:
   946	                return position
   947	        return None
   948	
   949	    async def _handle_progress_event(
   950	        self,
   951	        job_id: int,
   952	        epochs: int,
   953	        event: dict[str, Any],
   954	    ) -> None:
   955	        progress = float(event.get("progress", 0.0))
   956	        await self._update_job(job_id, progress_pct=progress)
   957	        await self._publish(
   958	            job_id,
   959	            {
   960	                "status": str(event.get("status", JobStatus.RUNNING.value)),
   961	                "progress": progress,
   962	                "epoch": int(event.get("epoch", 0)),
   963	                "total_epochs": int(event.get("total_epochs", epochs)),
   964	                "train_loss": float(event.get("train_loss", 0.0)),
   965	                "val_loss": float(event.get("val_loss", 0.0)),
   966	                "message": str(event.get("message", "")),
   967	                "error": event.get("error"),
   968	                "d_prime": event.get("d_prime"),
   969	                "model_id": event.get("model_id"),
   970	                "queue_position": None,  # running jobs have no queue position
   971	            },
   972	        )
   973	
   974	    async def _publish(self, job_id: int, event: dict[str, Any]) -> None:
   975	        queues = list(self._subscribers.get(job_id, []))
   976	        for queue in queues:
   977	            try:
   978	                queue.put_nowait(event)
   979	            except asyncio.QueueFull:
   980	                logger.warning("Dropping event for job %s because subscriber queue is full", job_id)
   981	
   982	    async def _update_job(
   983	        self,
   984	        job_id: int,
   985	        *,
   986	        status: JobStatus | None = None,
   987	        progress_pct: float | None = None,
   988	        started_at: datetime | None = None,
   989	        completed_at: datetime | None = None,
   990	        error: str | None = None,
   991	        model_id: int | None = None,
   992	        d_prime: float | None = None,
   993	    ) -> None:
   994	        assignments: list[str] = []
   995	        values: list[Any] = []
   996	
   997	        if status is not None:
   998	            assignments.append("status = ?")
   999	            values.append(status.value)
  1000	        if progress_pct is not None:
  1001	            assignments.append("progress_pct = ?")
  1002	            values.append(progress_pct)
  1003	        if started_at is not None:
  1004	            assignments.append("started_at = ?")
  1005	            values.append(_serialize_datetime(started_at))
  1006	        if completed_at is not None:
  1007	            assignments.append("completed_at = ?")
  1008	            values.append(_serialize_datetime(completed_at))
  1009	        if error is not None or status in {JobStatus.RUNNING, JobStatus.COMPLETED, JobStatus.CANCELLED}:
  1010	            assignments.append("error = ?")
  1011	            values.append(error)
  1012	        if model_id is not None:
  1013	            assignments.append("model_id = ?")
  1014	            values.append(model_id)
  1015	        if d_prime is not None:
  1016	            assignments.append("d_prime = ?")
  1017	            values.append(d_prime)
  1018	
  1019	        if not assignments:
  1020	            return
  1021	
  1022	        values.append(job_id)
  1023	        async with self._connect() as conn:
  1024	            await conn.execute(
  1025	                f"UPDATE jobs SET {', '.join(assignments)} WHERE id = ?",
  1026	                values,
  1027	            )
  1028	            await conn.commit()
  1029	
  1030	    async def _get_circuit_breaker_with_conn(
  1031	        self,
  1032	        conn: aiosqlite.Connection,
  1033	        user_id: int,
  1034	    ) -> CircuitBreakerState:
  1035	        async with conn.execute(

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -n 'def _worker_loop\\|create_task\\|_process_job' console/backend/app/job_queue.py | head -80\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/app/job_queue.py | sed -n '620,660p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'except BaseException|SystemExit|except Exception' console/backend/app/job_queue.py console/backend/app/services/training_service.py src/violawake_sdk/tools/train.py | head -140\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/train.py | sed -n '2245,2305p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 959ms:
   620	        now = _utcnow()
   621	        for row in breaker_rows:
   622	            if bool(row["paused"]):
   623	                continue
   624	            next_attempt_at = _deserialize_datetime(row["next_attempt_at"])
   625	            if next_attempt_at is None:
   626	                continue
   627	            delay = max(0.0, (next_attempt_at - now).total_seconds())
   628	            self._schedule_retry_fill(int(row["user_id"]), delay)
   629	
   630	    async def _worker_loop(self) -> None:
   631	        while not self._closed:
   632	            job_id = await self._queue.get()
   633	            async with self._state_lock:
   634	                self._queued_job_ids.discard(job_id)
   635	            await self._semaphore.acquire()
   636	            task = asyncio.create_task(self._execute_job(job_id), name=f"job-{job_id}")
   637	            self._inflight_tasks.add(task)
   638	
   639	            def _on_done(completed: asyncio.Task[None]) -> None:
   640	                self._semaphore.release()
   641	                self._inflight_tasks.discard(completed)
   642	
   643	            task.add_done_callback(_on_done)
   644	
   645	    async def _execute_job(self, job_id: int) -> None:
   646	        async with self._state_lock:
   647	            self._running_job_ids.add(job_id)
   648	
   649	        output_dir: Path | None = None
   650	        try:
   651	            job = await self.get_job(job_id)
   652	            if job is None:
   653	                return
   654	            if job.status is not JobStatus.PENDING:
   655	                return
   656	
   657	            breaker = await self.get_circuit_breaker(job.user_id)
   658	            now = _utcnow()
   659	            if breaker.paused:
   660	                logger.warning("Skipping job %s because user %s queue is paused", job_id, job.user_id)

 succeeded in 986ms:
170:        self._worker_task = asyncio.create_task(self._worker_loop(), name="job-queue-worker")
630:    async def _worker_loop(self) -> None:
636:            task = asyncio.create_task(self._execute_job(job_id), name=f"job-{job_id}")
1153:        self._retry_tasks[user_id] = asyncio.create_task(

 succeeded in 1045ms:
  2245	        default=5,
  2246	        metavar="N",
  2247	        help="Negatives per positive (used in legacy MLP mode, default: 5)",
  2248	    )
  2249	    parser.add_argument(
  2250	        "--hidden-dim",
  2251	        type=int,
  2252	        default=64,
  2253	        metavar="N",
  2254	        help="Hidden dim for legacy MLP (default: 64)",
  2255	    )
  2256	    parser.add_argument(
  2257	        "--quiet",
  2258	        action="store_true",
  2259	        help="Suppress training progress output",
  2260	    )
  2261	
  2262	    args = parser.parse_args()
  2263	
  2264	    output_path = Path(args.output)
  2265	    eval_dir = Path(args.eval_dir) if args.eval_dir else None
  2266	    positives_dir = Path(args.positives) if args.positives else None
  2267	    negatives_dir = Path(args.negatives) if args.negatives else None
  2268	    verbose = not args.quiet
  2269	
  2270	    if positives_dir and not positives_dir.exists():
  2271	        print(f"ERROR: Positives directory not found: {positives_dir}", file=sys.stderr)
  2272	        sys.exit(1)
  2273	
  2274	    if negatives_dir and not negatives_dir.exists():
  2275	        print(f"ERROR: Negatives directory not found: {negatives_dir}", file=sys.stderr)
  2276	        sys.exit(1)
  2277	
  2278	    if verbose:
  2279	        print("=" * 70)
  2280	        print(f"ViolaWake Training: '{args.word}'")
  2281	        print("=" * 70)
  2282	        print(f"Architecture:       {args.architecture}")
  2283	        print(f"Auto corpus:        {'enabled' if args.auto_corpus else 'disabled'}")
  2284	        if positives_dir:
  2285	            print(f"Positives dir:      {positives_dir}")
  2286	        if negatives_dir:
  2287	            print(f"Negatives dir:      {negatives_dir}")
  2288	        print(f"Output:             {output_path}")
  2289	        print(f"Epochs:             {args.epochs} (patience={args.patience})")
  2290	        print(f"Batch size:         {args.batch_size}")
  2291	        print(f"Learning rate:      {args.lr}")
  2292	        print(f"Augmentation:       {'enabled' if args.augment else 'disabled'}")
  2293	        if eval_dir:
  2294	            print(f"Eval set:           {eval_dir}")
  2295	        print()
  2296	
  2297	    # ======================================================================
  2298	    # Legacy MLP path
  2299	    # ======================================================================
  2300	    if args.architecture == "mlp":
  2301	        if positives_dir is None:
  2302	            print("ERROR: --positives is required for MLP architecture.", file=sys.stderr)
  2303	            sys.exit(1)
  2304	        if verbose:
  2305	            print("Using legacy MLP architecture (single-frame, mean-pooled embeddings).\n")

 succeeded in 1048ms:
console/backend/app/job_queue.py:804:            except Exception as email_exc:
console/backend/app/job_queue.py:834:        except Exception as exc:
console/backend/app/job_queue.py:1197:        except Exception as exc:
console/backend/app/services/training_service.py:126:            except Exception as exc:
console/backend/app/services/training_service.py:172:        except Exception as exc:
console/backend/app/services/training_service.py:192:        except Exception as exc:
console/backend/app/services/training_service.py:221:        except Exception as exc:
console/backend/app/services/training_service.py:353:    except Exception as exc:
src/violawake_sdk/tools/train.py:336:        except Exception:
src/violawake_sdk/tools/train.py:342:    except Exception:
src/violawake_sdk/tools/train.py:389:    except Exception:
src/violawake_sdk/tools/train.py:437:        except Exception:
src/violawake_sdk/tools/train.py:499:                except Exception:
src/violawake_sdk/tools/train.py:782:        except Exception:
src/violawake_sdk/tools/train.py:920:        except Exception:
src/violawake_sdk/tools/train.py:1459:        except Exception as e:
src/violawake_sdk/tools/train.py:1787:        except Exception:
src/violawake_sdk/tools/train.py:2632:        except Exception as e:
src/violawake_sdk/tools/train.py:2643:    except Exception as e:

codex
I found three minimum-touch fixes that meet your “obvious one-line” bar: the remaining bare `OWWModel()` defaults in the SDK evaluation and contamination tooling, an entrypoint `download_models()` pipeline whose `| tail -3 || true` masks failures, and an Alembic retry loop that lets startup continue after all five attempts fail. I’m applying only those small changes; I’ll commit each class separately and keep the rest as audit findings.
  ✓ Map repo state and relevant training/backend files
  → Audit framework defaults, lazy imports, network calls, filesystem paths, startup, quota behavior
  • Apply only obvious one-line fixes if any, with separate commits
  • Write audit doc and verify with requested commands/tests
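For reference, a minimal Python sketch of the fail-fast retry behaviour the Alembic loop is meant to have: retry a step a fixed number of times, then abort instead of carrying on. The names `run_with_retries`, `step`, and the injectable `sleep` parameter are hypothetical and exist only for illustration; the fix applied in this session is the one-line shell change to `entrypoint.sh`.

```python
import time
from typing import Callable


def run_with_retries(
    step: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call step() until it returns True; return the attempt number that succeeded.

    Raises RuntimeError after the final failed attempt, so the caller cannot
    silently continue with an unmigrated database (hypothetical helper).
    """
    for attempt in range(1, attempts + 1):
        if step():
            return attempt
        if attempt == attempts:
            # Last attempt failed: stop here rather than falling out of the loop.
            raise RuntimeError(f"step failed after {attempts} attempts")
        sleep(delay_s)
    raise ValueError("attempts must be >= 1")
```

Passing `sleep` as a parameter keeps the helper testable without real delays.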
apply patch
patch: completed
J:\CLAUDE\PROJECTS\Wakeword\src\violawake_sdk\training\evaluate.py
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577f5e01cb4c3dceffcb9bca8aaf607ca497..508ecdcb73617a30016d4efb4e3e5a1916281b1b
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"

apply patch
patch: completed
J:\CLAUDE\PROJECTS\Wakeword\src\violawake_sdk\tools\contamination_check.py
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849b57da804d87d79e1e8a6e42b3d3b6d3a2..bc134136aaa00bd5e32a5e3ff573d854b0b6f2ae
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:

apply patch
patch: completed
J:\CLAUDE\PROJECTS\Wakeword\console\backend\entrypoint.sh
diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c88348827c34555b836785f894660bba10d..6ca9a2b24e4b434dd890c3ac79f92b9b9f0cfe19
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git diff -- src/violawake_sdk/training/evaluate.py src/violawake_sdk/tools/contamination_check.py"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git diff -- console/backend/entrypoint.sh"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' 'OWWModel()' src/violawake_sdk console/backend | head -80\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1199ms:
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849..bc13413 100644
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@ def _check_embedding_overlap(
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577..508ecdc 100644
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@ def _build_oww_scorer(session, input_name: str):
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@ def _build_temporal_oww_scorer(session, input_name: str):
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
warning: in the working copy of 'src/violawake_sdk/tools/contamination_check.py', LF will be replaced by CRLF the next time Git touches it
warning: in the working copy of 'src/violawake_sdk/training/evaluate.py', LF will be replaced by CRLF the next time Git touches it

 succeeded in 1169ms:
diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c8..6ca9a2b 100644
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@ for i in 1 2 3 4 5; do
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@ done
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data
warning: in the working copy of 'console/backend/entrypoint.sh', LF will be replaced by CRLF the next time Git touches it

 succeeded in 1845ms:

codex
The remaining `OWWModel()` defaults are now gone from the audited SDK/backend paths. I’m committing the small fixes before writing the doc so the audit can cite exact SHAs.
  ✓ Map repo state and relevant training/backend files
  → Audit framework defaults, lazy imports, network calls, filesystem paths, startup, quota behavior
  → Apply only obvious one-line fixes if any, with separate commits
  • Write audit doc and verify with requested commands/tests
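A possible regression guard for this class of bug, sketched under assumptions: instead of a plain `grep` for `OWWModel()`, walk the AST and flag any `OWWModel(...)` call that lacks an `inference_framework=` keyword. The function name `find_unpinned_oww_calls` is hypothetical, and the sketch assumes the constructor is always referenced under the name `OWWModel`, as in the files patched above.

```python
import ast


def find_unpinned_oww_calls(source: str, ctor_name: str = "OWWModel") -> list[int]:
    """Return line numbers of ctor_name(...) calls without inference_framework=."""
    hits: list[int] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both OWWModel(...) and module.OWWModel(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != ctor_name:
            continue
        # A **kwargs expansion has kw.arg None, so it is still reported as unpinned.
        if not any(kw.arg == "inference_framework" for kw in node.keywords):
            hits.append(node.lineno)
    return sorted(hits)
```

Running this over each `.py` file under `src/violawake_sdk` and `console/backend` and failing when any list is non-empty would catch a new bare call before it reaches a container.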
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"git add src/violawake_sdk/training/evaluate.py src/violawake_sdk/tools/contamination_check.py && git commit -m 'Pin OpenWakeWord evaluation helpers to ONNX'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1784ms:
[master 0c3923b] Pin OpenWakeWord evaluation helpers to ONNX
 2 files changed, 3 insertions(+), 3 deletions(-)
warning: in the working copy of 'src/violawake_sdk/tools/contamination_check.py', LF will be replaced by CRLF the next time Git touches it
warning: in the working copy of 'src/violawake_sdk/training/evaluate.py', LF will be replaced by CRLF the next time Git touches it

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"git add console/backend/entrypoint.sh && git commit -m 'Fail fast on backend startup prerequisites'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1899ms:
[master 9cbb8e6] Fail fast on backend startup prerequisites
 1 file changed, 2 insertions(+), 2 deletions(-)
warning: in the working copy of 'console/backend/entrypoint.sh', LF will be replaced by CRLF the next time Git touches it

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/audio.py | sed -n '1,80p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/audio.py | sed -n '200,230p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/vad.py | sed -n '130,180p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' -E 'torch\\.hub|load_silero|silero' src/violawake_sdk console/backend/app | head -100\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1285ms:
     1	"""
     2	ViolaWake Audio Processing
     3	===========================
     4	
     5	Utilities for loading audio and computing mel spectrograms.
     6	
     7	Copied from Viola's violawake/audio.py — Viola-specific imports replaced.
     8	"""
     9	
    10	from __future__ import annotations
    11	
    12	import logging
    13	from pathlib import Path
    14	
    15	import numpy as np
    16	
    17	from violawake_sdk._constants import (
    18	    AUDIO_INT16_SCALE,
    19	    CLIP_SAMPLES,
    20	    F_MAX,
    21	    F_MIN,
    22	    FEATURE_TYPE,
    23	    HOP_LENGTH,
    24	    HOP_LENGTH_MEL,
    25	    N_FFT_MEL,
    26	    N_MELS,
    27	    N_MELS_MEL,
    28	    PCEN_BIAS,
    29	    PCEN_EPS,
    30	    PCEN_GAIN,
    31	    PCEN_POWER,
    32	    PCEN_TIME_CONSTANT,
    33	    SAMPLE_RATE,
    34	    USE_PCEN,
    35	    WIN_LENGTH,
    36	    WIN_LENGTH_MEL,
    37	)
    38	
    39	logger = logging.getLogger(__name__)
    40	
    41	# Optional imports
    42	_TORCHAUDIO_AVAILABLE = False
    43	
    44	try:
    45	    import torchaudio
    46	    import torchaudio.transforms as T
    47	
    48	    _TORCHAUDIO_AVAILABLE = True
    49	except ImportError:
    50	    pass
    51	
    52	
    53	def load_audio(path: Path, target_sr: int = SAMPLE_RATE) -> np.ndarray | None:
    54	    """
    55	    Load audio file and return as numpy array.
    56	
    57	    Args:
    58	        path: Path to audio file (.wav, .flac, .mp3)
    59	        target_sr: Target sample rate (default 16kHz)
    60	
    61	    Returns:
    62	        Audio samples as float32 numpy array, or None if failed
    63	    """
    64	    # Try torchaudio first, fall back to wave module
    65	    if _TORCHAUDIO_AVAILABLE:
    66	        try:
    67	            waveform, sr = torchaudio.load(str(path))
    68	            # Convert to mono if stereo
    69	            if waveform.shape[0] > 1:
    70	                waveform = waveform.mean(dim=0, keepdim=True)
    71	            # Resample if needed
    72	            if sr != target_sr:
    73	                resampler = T.Resample(sr, target_sr)
    74	                waveform = resampler(waveform)
    75	            return waveform.squeeze().numpy()
    76	        except Exception:
    77	            logger.warning(
    78	                "torchaudio failed for %s, falling back to wave module", path, exc_info=True
    79	            )
    80	

 succeeded in 1306ms:
   200	def compute_mel_spectrogram_v2(audio: np.ndarray) -> np.ndarray:
   201	    """
   202	    Compute mel spectrogram features using librosa with full speech band coverage.
   203	
   204	    Uses 40 mel bins spanning 60-7800 Hz, capturing formants F1/F2/F3 that
   205	    distinguish wake words from other words. Optionally applies PCEN instead of
   206	    log compression for better robustness to volume and noise variation.
   207	
   208	    Args:
   209	        audio: Audio samples as float32 (should be CLIP_SAMPLES length)
   210	
   211	    Returns:
   212	        Feature array (N_MELS_MEL x time_frames), either log-mel or PCEN-mel
   213	    """
   214	    import librosa  # Lazy import to avoid startup cost for inference
   215	
   216	    # Compute mel spectrogram with librosa
   217	    mel = librosa.feature.melspectrogram(
   218	        y=audio,
   219	        sr=SAMPLE_RATE,
   220	        n_fft=N_FFT_MEL,
   221	        hop_length=HOP_LENGTH_MEL,
   222	        win_length=WIN_LENGTH_MEL,
   223	        n_mels=N_MELS_MEL,
   224	        fmin=F_MIN,
   225	        fmax=F_MAX,
   226	        power=2.0,  # Power spectrogram (magnitude squared)
   227	    )
   228	
   229	    if USE_PCEN:
   230	        # Try librosa.pcen first, fall back to manual implementation

 succeeded in 1306ms:
   130	                f"WebRTC VAD requires 10/20/30ms frames at 16kHz "
   131	                f"(320/640/960 bytes). Got {n_bytes} bytes "
   132	                f"({n_bytes / 2 / SAMPLE_RATE * 1000:.1f}ms)."
   133	            )
   134	        try:
   135	            is_speech = self._vad.is_speech(audio_bytes, sample_rate=SAMPLE_RATE)
   136	            return 1.0 if is_speech else 0.0
   137	        except Exception as e:
   138	            logger.warning("WebRTC VAD error: %s", e)
   139	            return 0.0
   140	
   141	    def reset(self) -> None:
   142	        """WebRTC VAD is stateless — no-op."""
   143	
   144	
   145	class SileroVAD:
   146	    """Silero VAD backend using the packaged ONNX model from ``silero-vad``.
   147	
   148	    The upstream package ships an embedded ONNX model and exposes it via
   149	    ``load_silero_vad(onnx=True)``. The wrapped ONNX runtime expects 512-sample
   150	    windows at 16kHz; shorter SDK frames are zero-padded and longer frames are
   151	    processed in 512-sample chunks, returning the maximum speech probability.
   152	    """
   153	
   154	    def __init__(self) -> None:
   155	        if load_silero_vad is None:
   156	            raise ImportError(
   157	                "silero-vad is not installed. Install it with: pip install 'violawake[vad]'"
   158	            ) from None
   159	
   160	        try:
   161	            import torch
   162	        except ImportError as e:
   163	            raise ImportError(
   164	                "silero-vad requires torch at runtime. Install it with: "
   165	                "pip install 'violawake[vad]'"
   166	            ) from e
   167	
   168	        try:
   169	            self._model = load_silero_vad(onnx=True)
   170	        except Exception as e:
   171	            raise RuntimeError(
   172	                f"Failed to load Silero VAD model from the silero-vad package: {e}"
   173	            ) from e
   174	
   175	        self._torch = torch
   176	        self._sample_rate = SAMPLE_RATE
   177	        self._frame_samples = SILERO_FRAME_SAMPLES
   178	
   179	    def process_frame(self, audio_bytes: bytes) -> float:
   180	        """Returns speech probability from Silero VAD model."""

 succeeded in 1424ms:
src/violawake_sdk/vad.py:4:  1. Silero VAD (silero-vad package, embedded ONNX model) — preferred
src/violawake_sdk/vad.py:13:    vad = VADEngine(backend="silero")
src/violawake_sdk/vad.py:28:    from silero_vad import load_silero_vad
src/violawake_sdk/vad.py:30:    load_silero_vad = None
src/violawake_sdk/vad.py:44:    SILERO = "silero"
src/violawake_sdk/vad.py:146:    """Silero VAD backend using the packaged ONNX model from ``silero-vad``.
src/violawake_sdk/vad.py:149:    ``load_silero_vad(onnx=True)``. The wrapped ONNX runtime expects 512-sample
src/violawake_sdk/vad.py:155:        if load_silero_vad is None:
src/violawake_sdk/vad.py:157:                "silero-vad is not installed. Install it with: pip install 'violawake[vad]'"
src/violawake_sdk/vad.py:164:                "silero-vad requires torch at runtime. Install it with: "
src/violawake_sdk/vad.py:169:            self._model = load_silero_vad(onnx=True)
src/violawake_sdk/vad.py:172:                f"Failed to load Silero VAD model from the silero-vad package: {e}"
src/violawake_sdk/vad.py:307:        vad = VADEngine(backend="silero")  # or "webrtc", "rms", "auto"
src/violawake_sdk/vad.py:320:            backend: One of "auto", "webrtc", "silero", "rms".
src/violawake_sdk/vad.py:323:                For "silero": no backend-specific args
src/violawake_sdk/_exceptions.py:35:    Falls back to RMS heuristic if webrtcvad/silero not installed.

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -n 'def get_model_dir\\|VIOLAWAKE_MODEL_DIR' -A45 src/violawake_sdk/models.py src/violawake_sdk/_constants.py\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' 'get_model_path(' src/violawake_sdk console/backend/app | head -100\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/models.py | sed -n '150,180p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/__init__.py | sed -n '90,125p'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1374ms:
src/violawake_sdk/models.py:7:  - Caching models in ~/.violawake/models/ (or VIOLAWAKE_MODEL_DIR)
src/violawake_sdk/models.py-8-
src/violawake_sdk/models.py-9-See ADR-005 for the full rationale behind this distribution approach.
src/violawake_sdk/models.py-10-"""
src/violawake_sdk/models.py-11-
src/violawake_sdk/models.py-12-from __future__ import annotations
src/violawake_sdk/models.py-13-
src/violawake_sdk/models.py-14-import contextlib
src/violawake_sdk/models.py-15-import hashlib
src/violawake_sdk/models.py-16-import logging
src/violawake_sdk/models.py-17-import os
src/violawake_sdk/models.py-18-import sys
src/violawake_sdk/models.py-19-from dataclasses import dataclass
src/violawake_sdk/models.py-20-from pathlib import Path
src/violawake_sdk/models.py-21-
src/violawake_sdk/models.py-22-from violawake_sdk._exceptions import ModelNotFoundError
src/violawake_sdk/models.py-23-
src/violawake_sdk/models.py-24-logger = logging.getLogger(__name__)
src/violawake_sdk/models.py-25-
src/violawake_sdk/models.py-26-# Default model cache directory
src/violawake_sdk/models.py-27-DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
src/violawake_sdk/models.py-28-
src/violawake_sdk/models.py-29-
src/violawake_sdk/models.py-30-@dataclass(frozen=True)
src/violawake_sdk/models.py-31-class ModelSpec:
src/violawake_sdk/models.py-32-    """Specification for a downloadable model."""
src/violawake_sdk/models.py-33-
src/violawake_sdk/models.py-34-    name: str
src/violawake_sdk/models.py-35-    url: str
src/violawake_sdk/models.py-36-    sha256: str
src/violawake_sdk/models.py-37-    size_bytes: int
src/violawake_sdk/models.py-38-    description: str
src/violawake_sdk/models.py-39-    version: str = "latest"
src/violawake_sdk/models.py-40-
src/violawake_sdk/models.py-41-
src/violawake_sdk/models.py-42-# ──────────────────────────────────────────────────────────────────────────────
src/violawake_sdk/models.py-43-# Model Registry
src/violawake_sdk/models.py-44-# Update this table when releasing new model versions.
src/violawake_sdk/models.py-45-# SHA-256 values are filled in during the release process by tools/update_model_registry.py.
src/violawake_sdk/models.py-46-# ──────────────────────────────────────────────────────────────────────────────
src/violawake_sdk/models.py-47-MODEL_REGISTRY: dict[str, ModelSpec] = {
src/violawake_sdk/models.py-48-    "temporal_cnn": ModelSpec(
src/violawake_sdk/models.py-49-        name="temporal_cnn",
src/violawake_sdk/models.py-50-        url="https://github.com/GeeIHadAGoodTime/ViolaWake/releases/download/v0.1.0/temporal_cnn.onnx",
src/violawake_sdk/models.py-51-        sha256="9c0b12c68593cfdb3d320a3b34667913b18d63e89eb01247d6332d7839ac9efe",
src/violawake_sdk/models.py-52-        size_bytes=102378,
--
src/violawake_sdk/models.py:152:def get_model_dir() -> Path:
src/violawake_sdk/models.py-153-    """Return the model cache directory, creating it if needed.
src/violawake_sdk/models.py-154-
src/violawake_sdk/models.py:155:    Override via VIOLAWAKE_MODEL_DIR environment variable.
src/violawake_sdk/models.py-156-    """
src/violawake_sdk/models.py:157:    model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
src/violawake_sdk/models.py-158-    model_dir.mkdir(parents=True, exist_ok=True)
src/violawake_sdk/models.py-159-    return model_dir
src/violawake_sdk/models.py-160-
src/violawake_sdk/models.py-161-
src/violawake_sdk/models.py-162-def _is_auto_download_disabled() -> bool:
src/violawake_sdk/models.py-163-    """Check if auto-download is disabled via environment variable."""
src/violawake_sdk/models.py-164-    return os.environ.get("VIOLAWAKE_NO_AUTO_DOWNLOAD", "").strip() in ("1", "true", "yes")
src/violawake_sdk/models.py-165-
src/violawake_sdk/models.py-166-
src/violawake_sdk/models.py-167-def _format_size(size_bytes: int) -> str:
src/violawake_sdk/models.py-168-    """Format a byte count as a human-readable string (e.g. '2.1 MB')."""
src/violawake_sdk/models.py-169-    if size_bytes >= 1_000_000_000:
src/violawake_sdk/models.py-170-        return f"{size_bytes / 1e9:.1f} GB"
src/violawake_sdk/models.py-171-    if size_bytes >= 1_000_000:
src/violawake_sdk/models.py-172-        return f"{size_bytes / 1e6:.1f} MB"
src/violawake_sdk/models.py-173-    if size_bytes >= 1_000:
src/violawake_sdk/models.py-174-        return f"{size_bytes / 1e3:.1f} KB"
src/violawake_sdk/models.py-175-    return f"{size_bytes} B"
src/violawake_sdk/models.py-176-
src/violawake_sdk/models.py-177-
src/violawake_sdk/models.py-178-def _auto_download_model(model_name: str, spec: ModelSpec) -> Path:
src/violawake_sdk/models.py-179-    """Auto-download a model on first use, with progress to stderr.
src/violawake_sdk/models.py-180-
src/violawake_sdk/models.py-181-    This is a lightweight download path that does NOT require tqdm or
src/violawake_sdk/models.py-182-    requests as hard dependencies — it uses urllib from the standard library.
src/violawake_sdk/models.py-183-    For full download features (progress bars, pinning), use ``download_model()``.
src/violawake_sdk/models.py-184-
src/violawake_sdk/models.py-185-    Models with placeholder hashes are refused (raises RuntimeError).
src/violawake_sdk/models.py-186-
src/violawake_sdk/models.py-187-    Args:
src/violawake_sdk/models.py-188-        model_name: Name from MODEL_REGISTRY.
src/violawake_sdk/models.py-189-        spec: The ModelSpec for this model.
src/violawake_sdk/models.py-190-
src/violawake_sdk/models.py-191-    Returns:
src/violawake_sdk/models.py-192-        Path to the downloaded model file.
src/violawake_sdk/models.py-193-
src/violawake_sdk/models.py-194-    Raises:
src/violawake_sdk/models.py-195-        RuntimeError: If the download fails.
src/violawake_sdk/models.py-196-    """
src/violawake_sdk/models.py-197-    import tempfile
src/violawake_sdk/models.py-198-    import urllib.error
src/violawake_sdk/models.py-199-    import urllib.request
src/violawake_sdk/models.py-200-
src/violawake_sdk/models.py-201-    url_suffix = Path(spec.url).suffix
src/violawake_sdk/models.py-202-    ext = url_suffix or ".onnx"

 succeeded in 1400ms:
    90	    """Return available wake word models with their descriptions.
    91	
    92	    Each entry is a dict with keys: ``name``, ``description``, ``version``.
    93	
    94	    Example::
    95	
    96	        >>> from violawake_sdk import list_models
    97	        >>> for m in list_models():
    98	        ...     print(f"{m['name']:20s} {m['description']}")
    99	    """
   100	    from violawake_sdk.models import MODEL_REGISTRY
   101	
   102	    seen: set[str] = set()
   103	    result: list[dict[str, str]] = []
   104	    for name, spec in MODEL_REGISTRY.items():
   105	        # Deduplicate aliases (e.g. "viola" -> "temporal_cnn")
   106	        if spec.name in seen:
   107	            continue
   108	        # Hide deprecated, package-managed, and non-wake-word models
   109	        if "DEPRECATED" in spec.description:
   110	            continue
   111	        if spec.name in ("oww_backbone", "kokoro_v1_0", "kokoro_voices_v1_0"):
   112	            continue
   113	        seen.add(spec.name)
   114	        result.append(
   115	            {
   116	                "name": name,
   117	                "description": spec.description,
   118	                "version": spec.version,
   119	            }
   120	        )
   121	    return result
   122	
   123	
   124	def list_voices() -> list[str]:
   125	    """Return available TTS voice names for use with ``TTSEngine``.

 succeeded in 1401ms:
   150	
   151	
   152	def get_model_dir() -> Path:
   153	    """Return the model cache directory, creating it if needed.
   154	
   155	    Override via VIOLAWAKE_MODEL_DIR environment variable.
   156	    """
   157	    model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
   158	    model_dir.mkdir(parents=True, exist_ok=True)
   159	    return model_dir
   160	
   161	
   162	def _is_auto_download_disabled() -> bool:
   163	    """Check if auto-download is disabled via environment variable."""
   164	    return os.environ.get("VIOLAWAKE_NO_AUTO_DOWNLOAD", "").strip() in ("1", "true", "yes")
   165	
   166	
   167	def _format_size(size_bytes: int) -> str:
   168	    """Format a byte count as a human-readable string (e.g. '2.1 MB')."""
   169	    if size_bytes >= 1_000_000_000:
   170	        return f"{size_bytes / 1e9:.1f} GB"
   171	    if size_bytes >= 1_000_000:
   172	        return f"{size_bytes / 1e6:.1f} MB"
   173	    if size_bytes >= 1_000:
   174	        return f"{size_bytes / 1e3:.1f} KB"
   175	    return f"{size_bytes} B"
   176	
   177	
   178	def _auto_download_model(model_name: str, spec: ModelSpec) -> Path:
   179	    """Auto-download a model on first use, with progress to stderr.
   180	

 succeeded in 1457ms:
src/violawake_sdk/models.py:296:def get_model_path(
src/violawake_sdk/tts.py:128:            model_path = get_model_path("kokoro_v1_0")
src/violawake_sdk/tts.py:129:            voices_path = get_model_path("kokoro_voices_v1_0")
src/violawake_sdk/wake_detector.py:632:            return get_model_path(model)

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba console/backend/entrypoint.sh | sed -n '1,35p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/training/evaluate.py | sed -n '252,332p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"nl -ba src/violawake_sdk/tools/contamination_check.py | sed -n '100,118p'\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git log --oneline -3"' in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1475ms:
   252	def _build_oww_scorer(session, input_name: str):
   253	    """Build a scorer for mean-pooled OpenWakeWord embedding models."""
   254	    try:
   255	        from openwakeword.model import Model as OWWModel  # type: ignore[import]
   256	    except ImportError as e:
   257	        raise ImportError(
   258	            "openwakeword required for MLP-on-OWW evaluation. pip install openwakeword"
   259	        ) from e
   260	
   261	    from violawake_sdk._constants import CLIP_SAMPLES
   262	    from violawake_sdk.audio import center_crop, load_audio
   263	
   264	    oww = OWWModel(inference_framework="onnx")
   265	    preprocessor = oww.preprocessor
   266	    if not hasattr(preprocessor, "onnx_execution_provider"):
   267	        preprocessor.onnx_execution_provider = "CPUExecutionProvider"
   268	
   269	    def _score_file_oww(wav_path: Path) -> float | None:
   270	        try:
   271	            embeddings = _extract_oww_frame_embeddings(
   272	                wav_path,
   273	                preprocessor=preprocessor,
   274	                load_audio=load_audio,
   275	                center_crop=center_crop,
   276	                clip_samples=CLIP_SAMPLES,
   277	            )
   278	            if embeddings is None:
   279	                return None
   280	
   281	            embedding = embeddings.mean(axis=0).astype(np.float32)
   282	            score = session.run(None, {input_name: embedding.reshape(1, -1)})[0]
   283	            return float(np.asarray(score).flatten()[0])
   284	        except Exception:
   285	            logger.warning("Failed to score file (OWW path): %s", wav_path, exc_info=True)
   286	            return None
   287	
   288	    return _score_file_oww
   289	
   290	
   291	def _build_cnn_scorer(session, input_name: str):
   292	    """Build a scorer for legacy mel/CNN models."""
   293	    from violawake_sdk._constants import CLIP_SAMPLES
   294	    from violawake_sdk.audio import center_crop, compute_features, load_audio
   295	
   296	    def _score_file_cnn(wav_path: Path) -> float | None:
   297	        audio = load_audio(wav_path)
   298	        if audio is None:
   299	            return None
   300	
   301	        audio = center_crop(audio, CLIP_SAMPLES)
   302	        features = compute_features(audio)
   303	        feat_input = features[np.newaxis, :, :].astype(np.float32)
   304	
   305	        try:
   306	            outputs = session.run(None, {input_name: feat_input})
   307	            return float(np.asarray(outputs[0]).flatten()[0])
   308	        except Exception:
   309	            logger.warning("Failed to score file (CNN path): %s", wav_path, exc_info=True)
   310	            return None
   311	
   312	    return _score_file_cnn
   313	
   314	
   315	def _build_temporal_oww_scorer(session, input_name: str):
   316	    """Build a scorer for temporal OpenWakeWord embedding models."""
   317	    try:
   318	        from openwakeword.model import Model as OWWModel  # type: ignore[import]
   319	    except ImportError as e:
   320	        raise ImportError(
   321	            "openwakeword required for temporal OWW evaluation. pip install openwakeword"
   322	        ) from e
   323	
   324	    from violawake_sdk._constants import CLIP_SAMPLES
   325	    from violawake_sdk.audio import center_crop, load_audio
   326	
   327	    oww = OWWModel(inference_framework="onnx")
   328	    preprocessor = oww.preprocessor
   329	    if not hasattr(preprocessor, "onnx_execution_provider"):
   330	        preprocessor.onnx_execution_provider = "CPUExecutionProvider"
   331	
   332	    seq_len = _infer_temporal_seq_len(session.get_inputs()[0].shape)

 succeeded in 1525ms:
     1	#!/bin/sh
     2	set -e
     3	
     4	# Run database migrations before serving. Idempotent — alembic handles
     5	# already-applied migrations gracefully. Tolerates connection blips by
     6	# retrying briefly while postgres warms up.
     7	echo "[entrypoint] running alembic upgrade head"
     8	for i in 1 2 3 4 5; do
     9	    if alembic upgrade head; then
    10	        break
    11	    fi
    12	    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
    13	    [ "$i" = "5" ] && exit 1
    14	    sleep 3
    15	done
    16	
    17	# Ensure OpenWakeWord backbone resources are present. The openwakeword
    18	# wheel ships without ONNX/TFLite resources; download_models() fetches
    19	# them on first call. Idempotent — short-circuits if files exist.
    20	# Required for the training pipeline (OWWModel construction in
    21	# violawake_sdk.tools.train).
    22	echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
    23	python -c "from openwakeword.utils import download_models; download_models()"
    24	
    25	if [ "$(id -u)" = "0" ]; then
    26	    chown -R app:app /app/data
    27	    exec gosu app python run.py
    28	else
    29	    exec python run.py
    30	fi

 succeeded in 1516ms:
9cbb8e6 Fail fast on backend startup prerequisites
0c3923b Pin OpenWakeWord evaluation helpers to ONNX
196bfdd fix(training): four runtime bugs blocking the actual product

 succeeded in 1523ms:
   100	    import numpy as np
   101	
   102	    try:
   103	        from openwakeword.model import Model as OWWModel
   104	    except ImportError:
   105	        return {
   106	            "method": "embedding",
   107	            "error": "openwakeword not installed. Install with: pip install openwakeword",
   108	            "overlap_count": -1,
   109	        }
   110	
   111	    from violawake_sdk._constants import CLIP_SAMPLES
   112	    from violawake_sdk.audio import center_crop, load_audio
   113	
   114	    oww = OWWModel(inference_framework="onnx")
   115	    preprocessor = oww.preprocessor
   116	
   117	    def _embed_file(path: Path) -> np.ndarray | None:
   118	        audio = load_audio(path)

diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c88348827c34555b836785f894660bba10d..6ca9a2b24e4b434dd890c3ac79f92b9b9f0cfe19
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849b57da804d87d79e1e8a6e42b3d3b6d3a2..bc134136aaa00bd5e32a5e3ff573d854b0b6f2ae
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577f5e01cb4c3dceffcb9bca8aaf607ca497..508ecdcb73617a30016d4efb4e3e5a1916281b1b
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"

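The `entrypoint.sh` hunk above removes `2>&1 | tail -3 || true` from the `download_models()` call. A minimal sketch of why that matters, assuming a POSIX `sh` without `pipefail`: the pipeline's exit status is that of its last command, so a failing producer is reported as success. Here `false` is only a hypothetical stand-in for a failing download, and `sh_status` is an illustrative helper, not part of the project.

```python
import subprocess


def sh_status(script: str) -> int:
    """Run a script under POSIX sh and return its exit status."""
    return subprocess.run(["sh", "-c", script], check=False).returncode


# `false` is a stand-in for a failing download command (assumption for illustration).
failing = "false"

direct = sh_status(failing)                         # failure is visible
piped = sh_status(f"{failing} | tail -3")           # status now comes from `tail`
masked = sh_status(f"{failing} | tail -3 || true")  # `|| true` hides anything left

print("direct:", direct, "piped:", piped, "masked:", masked)
```

Under `set -e`, only the direct form would stop the entrypoint; the piped forms let startup continue after a failed download.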
codex
The audit has enough evidence now: startup failures, backend/library exception handling, network-bound TTS, corpus assumptions, queue persistence, quota behavior, and remaining SDK format defaults. I’m writing the doc with fixed items marked by commit SHA and unfixed items scoped to proposed changes only.
  ✓ Map repo state and relevant training/backend files
  ✓ Audit framework defaults, lazy imports, network calls, filesystem paths, startup, quota behavior
  ✓ Apply only obvious one-line fixes if any, with separate commits
  → Write audit doc and verify with requested commands/tests
apply patch
patch: completed
J:\CLAUDE\PROJECTS\Wakeword\docs\TRAINING_PIPELINE_AUDIT_2026-05-07.md
diff --git a/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1483b9a90ebcd7189686dc752d589bbe022d08d
--- /dev/null
+++ b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
@@ -0,0 +1,186 @@
+# Training Pipeline Audit - 2026-05-07
+
+Scope: `src/violawake_sdk/` and `console/backend/`, focused on latent fresh-deploy failures in the synchronous training path and adjacent SDK evaluation/runtime paths.
+
+Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.
+
+One-line fixes applied:
+- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
+- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.
+
+## Findings
+
+### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
+
+File/lines: `console/backend/entrypoint.sh:8-15`
+
+Evidence:
+```sh
+for i in 1 2 3 4 5; do
+    if alembic upgrade head; then
+        break
+    fi
+    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
+    sleep 3
+done
+```
+
+Before `9cbb8e6`, the loop had no final failure exit, so the fifth failed try fell through and the backend could start against an unmigrated or unavailable database. Proposed fix: applied in `9cbb8e6`; keep this fail-fast behavior and consider logging a clearer final "migration failed" line. Estimated effort: done; optional polish 15 minutes.
+
+### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
+
+File/lines: `src/violawake_sdk/tools/train.py:1060-1063`, `src/violawake_sdk/tools/train.py:1159-1165`, `src/violawake_sdk/tools/train.py:1203-1208`, `console/backend/app/job_queue.py:834`
+
+Evidence:
+```py
+except ImportError as e:
+    print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
+    print("Install with: pip install 'violawake[training]'", file=sys.stderr)
+    sys.exit(1)
+
+if len(pos_embs) < 5:
+    ...
+    sys.exit(1)
+
+if len(all_neg_embs) < 5:
+    ...
+    sys.exit(1)
+```
+```py
+except Exception as exc:
+    ...
+```
+
+`SystemExit` inherits from `BaseException`, not `Exception`, so these library-path exits bypass the job failure handler. A fresh deploy with missing training deps, bad embeddings, or no usable negatives can leave the task exception unhandled instead of marking the job failed cleanly. Proposed fix: split CLI exits from library behavior. Make `_train_temporal_cnn` and other callable helpers raise `RuntimeError`/typed exceptions, and let `main()` convert them to exit codes. As a short-term guard, catch `SystemExit` at the backend boundary and mark the job failed without swallowing `asyncio.CancelledError`. Estimated effort: 0.5-1 day with tests.
+
+### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
+
+File/lines: `src/violawake_sdk/training/evaluate.py:264`, `src/violawake_sdk/training/evaluate.py:327`, `src/violawake_sdk/tools/contamination_check.py:114`
+
+Evidence:
+```py
+oww = OWWModel(inference_framework="onnx")
+```
+
+Pre-fix grep found default constructors at those sites. They are adjacent to training/evaluation workflows and would have failed the same way as `tools/train.py` when the container has a `tflite_runtime` that cannot read the current OpenWakeWord `.tflite` schema. Proposed fix: applied in `0c3923b`; keep `grep -R "OWWModel()" src/violawake_sdk console/backend` as a regression check. Estimated effort: done.
+
+### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
+
+File/lines: `console/backend/entrypoint.sh:22-23`
+
+Evidence:
+```sh
+echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
+python -c "from openwakeword.utils import download_models; download_models()"
+```
+
+Before `9cbb8e6`, this command was piped through `tail -3 || true`. In POSIX `sh`, pipeline status comes from `tail`, so a failing Python download could still return success and the app would become healthy until first training. Proposed fix: applied in `9cbb8e6`; if output trimming is needed later, use a shell that supports `pipefail` or capture logs without hiding exit status. Estimated effort: done.
+
+### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
+
+File/lines: `src/violawake_sdk/tools/train.py:278`, `src/violawake_sdk/tools/train.py:547`, `src/violawake_sdk/tools/train.py:587`, `src/violawake_sdk/tools/train.py:418-434`, `console/Dockerfile.backend:20`, `pyproject.toml:87-100`
+
+Evidence:
+```py
+communicate = edge_tts.Communicate(text, voice)
+ok = _edge_tts_synthesize(word, voice, out_path)
+ok = _edge_tts_synthesize(phrase, voice, out_path)
+```
+```dockerfile
+RUN pip install --no-cache-dir "/sdk[training]"
+```
+
+The production image installs `violawake[training]`, which includes `edge-tts` but not the `tts` extra (`kokoro-onnx`). `_generate_tts_positives()` has a Kokoro fallback probe, but `TTSEngine` only loads `kokoro-onnx` and its large model files on first synthesis. Confusable and speech negatives do not try Kokoro at all. On a customer machine without internet, behind a proxy, or during Edge rate limiting, quality drops and training can fail if the mounted corpus is absent or too small. Proposed fix: make offline training explicit. Either ship/pre-download an offline TTS backend and its model assets into the image/cache, or remove online TTS from the synchronous path and require a mounted negative/positive corpus with a startup readiness check. Estimated effort: 1-2 days.
+
+### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
+
+File/lines: `console/backend/app/database.py:17-23`, `console/backend/app/job_queue.py:151`, `docker-compose.production.yml:36`
+
+Evidence:
+```py
+DATABASE_URL = settings.db_url.strip() if settings.db_url and settings.db_url.strip() else ...
+self._db_path = db_path or (settings.data_dir / "job_queue.db")
+```
+
+The app data model can use `VIOLAWAKE_DB_URL=postgresql+asyncpg://...`, but training job state is always an `aiosqlite` file under `/app/data`. The production compose file has a volume, so single-node Docker is survivable. Fresh deploys without a persistent `/app/data` volume, horizontal replicas, or platform restarts can lose pending/running job state while the main database remains intact. Proposed fix: move the queue tables into Postgres or add an explicit deployment guard that refuses multi-replica/ephemeral-volume production mode. Estimated effort: 1-2 days.
+
+### 7. MEDIUM - Universal negative corpus is optional in the console path but quality-critical
+
+File/lines: `console/backend/app/services/training_service.py:230-270`, `src/violawake_sdk/tools/train.py:2472-2485`, `docker-compose.production.yml:25-32`
+
+Evidence:
+```py
+_CORPUS_SEARCH_PATHS = [
+    Path(__file__).resolve().parent.parent.parent.parent / "corpus",
+    Path.home() / ".violawake" / "corpus",
+    Path("corpus"),
+]
+if total_neg < 5:
+    raise RuntimeError("Only %s negative files generated. edge-tts may not be installed or network unavailable." % total_neg)
+```
+
+The CLI warns when no universal corpus exists, but the backend only hard-fails when total negatives fall below five. If Edge TTS produces enough synthetic negatives, training can succeed without LibriSpeech/MUSAN and create high false-positive models. Production compose mounts `./corpus:/app/corpus:ro`, but Docker will create an empty host directory if it is missing. Proposed fix: add a backend warning/progress event and a startup/health check for minimum corpus counts, or require `VIOLAWAKE_NEGATIVES_CORPUS_DIR`/`/app/corpus` in production. Estimated effort: 0.5 day.
+
+### 8. MEDIUM - Quota enforcement is clean at limit but not atomic with job submission
+
+File/lines: `console/backend/app/routes/billing.py:252-308`, `console/backend/app/routes/jobs.py:101-120`
+
+Evidence:
+```py
+if used >= limit:
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, ...)
+...
+job_id = await queue.submit_job(...)
+await record_usage(db, current_user.id, action="training_job")
+```
+
+If a user is already at quota, the route returns a clean 403, not a 500. The latent issue is concurrency and partial failure: two requests at `limit - 1` can both pass `check_training_quota()`, and a queue insert can succeed before `record_usage()` fails or increments. Proposed fix: reserve usage atomically before queue submission, or put queue submission and usage reservation behind one database transaction/outbox. Estimated effort: 0.5-1 day.
+
+### 9. MEDIUM - Health/readiness does not validate training prerequisites
+
+File/lines: `console/backend/app/health.py:122-155`, `console/backend/app/config.py:195-198`
+
+Evidence:
+```py
+upload_dir = _check_directory(settings.upload_dir)
+models_dir = _check_directory(settings.models_dir)
+component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
+```
+
+Runtime dirs are created at import, and health checks database, queue, uploads, and models. It does not check `tmp_dir` writability, OpenWakeWord resource presence, universal corpus counts, or importability of training-only dependencies. After the entrypoint fixes, OWW download is a startup prerequisite in Docker, but local/fresh SDK starts and non-Docker runs can still report healthy before first training fails. Proposed fix: add a training readiness component with cheap checks and a more expensive admin-only preflight. Estimated effort: 0.5 day.
+
+### 10. MEDIUM - SDK model cache defaults to `Path.home()` and is not bound to the backend data volume
+
+File/lines: `src/violawake_sdk/models.py:27`, `src/violawake_sdk/models.py:152-158`, `src/violawake_sdk/wake_detector.py:631-637`, `src/violawake_sdk/tts.py:128-129`
+
+Evidence:
+```py
+DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
+model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
+model_dir.mkdir(parents=True, exist_ok=True)
+```
+
+SDK inference and Kokoro TTS auto-download into the process home directory unless `VIOLAWAKE_MODEL_DIR` is set. The backend Dockerfile/compose do not set that env var to `/app/data/models`, so downloaded SDK assets can be ephemeral or unwritable depending on platform user/home behavior. Proposed fix: set `VIOLAWAKE_MODEL_DIR=/app/data/sdk-models` in backend production config and document the same for fresh SDK deploys that disable internet access. Estimated effort: 15-30 minutes plus deploy validation.
+
+## Lazy Import Inventory
+
+`src/violawake_sdk/tools/train.py`: lazy imports at lines 68, 227, 267-272, 290, 304, 322, 343-344, 363-366, 400-418, 472-473, 522, 604-606, 634-637, 686-689, 721-724, 820, 865-871, 950, 1056-1071, 1087-1088, 1396-1397, 1447, 1538-1543, 1728-1750, 2066-2068, 2451, 2586. Risky runtime-only deps here: `openwakeword`, `edge_tts`, `pydub`, `torchaudio`, `audiomentations`, `torch`, `onnx`, `onnxruntime`, `scipy`, `violawake_sdk.tts`/Kokoro.
+
+`src/violawake_sdk/oww_backbone.py`: lazy imports at lines 126 (`openwakeword.utils.download_models`) and 187 (`MODEL_REGISTRY`). The resource download is now enforced in Docker entrypoint, but SDK/local usage still auto-downloads on first missing-resource access.
+
+Backend services/platform lazy imports: `console/backend/app/services/training_service.py:100` imports SDK training helpers only when a job runs; `app/storage.py:151-152` and `198` import `boto3`/`botocore` only when R2 is configured or queried; `app/routes/billing.py:69`, `273`, `277` import Stripe/email/asyncio lazily; `app/routes/recordings.py:128-156` imports `io`, `numpy`, and `scipy` during upload validation/resampling; `app/middleware.py:111-112` imports Sentry only when configured; `app/main.py:42`, `76`, `170` imports retention/email services at startup/admin execution; `app/job_queue.py:117`, `791`, `1170`, `1195` imports subscription/email/retention helpers at priority, completion, and cleanup time.
+
+## External Network Dependencies
+
+Synchronous training path: Edge TTS at `src/violawake_sdk/tools/train.py:278`, OpenWakeWord `download_models()` at `src/violawake_sdk/oww_backbone.py:126-132` and `console/backend/entrypoint.sh:23`, optional Kokoro model downloads via `get_model_path()` in `src/violawake_sdk/tts.py:128-129`, and optional object storage calls through R2 in `console/backend/app/storage.py:151-217`.
+
+SDK/download path: model auto-download via `urllib.request.urlopen()` in `src/violawake_sdk/models.py:197-267`, explicit downloads via `requests.get()` in `src/violawake_sdk/models.py:483`, corpus expansion via `requests.get()` in `src/violawake_sdk/tools/expand_corpus.py:89`, and certificate-pinned downloads in `src/violawake_sdk/security/cert_pinning.py:664-706`.
+
+## Filesystem Expectations
+
+Required backend paths: `console/backend/app/config.py:30-35` defines `data_dir`, `db_path`, `upload_dir`, `models_dir`, and `tmp_dir`; `config.py:195-198` creates them at import. `job_queue.py:151` requires a writable local `data/job_queue.db`; `training_service.py:72`, `107`, and `job_queue.py:707` require `settings.tmp_dir` for temporary training work.
+
+Optional but quality-critical corpus paths: `training_service.py:231-234`, `tools/train.py:2424-2428`, and `console/backend/scripts/train_full_pipeline.py:198-202` search repo `corpus`, `~/.violawake/corpus`, and CWD `corpus`. Absence is not fatal unless too few negatives remain.
+
+SDK model cache path: `src/violawake_sdk/models.py:27` and `152-158` use `~/.violawake/models` unless `VIOLAWAKE_MODEL_DIR` is set.

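Finding 2 of the audit document rests on `SystemExit` deriving from `BaseException` rather than `Exception`. A minimal sketch of that behaviour and of the short-term guard the finding proposes follows. All names here (`train_helper_stub`, `run_job_unguarded`, `run_job_guarded`) are hypothetical stand-ins, not the project's actual worker code.

```python
import sys
from typing import Callable


def train_helper_stub() -> None:
    """Hypothetical library helper that exits the way a CLI would."""
    sys.exit(1)


def run_job_unguarded(helper: Callable[[], None]) -> str:
    """Worker boundary that only handles Exception; SystemExit escapes it."""
    try:
        helper()
    except Exception as exc:
        return f"failed: {exc}"
    return "completed"


def run_job_guarded(helper: Callable[[], None]) -> str:
    """Same boundary with an explicit SystemExit guard.

    Only SystemExit is added, so other BaseException subclasses such as
    asyncio.CancelledError still propagate.
    """
    try:
        helper()
    except Exception as exc:
        return f"failed: {exc}"
    except SystemExit as exc:
        return f"failed: helper exited with code {exc.code}"
    return "completed"


if __name__ == "__main__":
    print(issubclass(SystemExit, Exception))   # False
    print(run_job_guarded(train_helper_stub))  # failed: helper exited with code 1
```

The longer-term fix proposed in the finding (raise typed exceptions in helpers and convert them to exit codes only in `main()`) would make the extra `except SystemExit` clause unnecessary.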
diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c88348827c34555b836785f894660bba10d..6ca9a2b24e4b434dd890c3ac79f92b9b9f0cfe19
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data
diff --git a/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1483b9a90ebcd7189686dc752d589bbe022d08d
--- /dev/null
+++ b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
@@ -0,0 +1,186 @@
+# Training Pipeline Audit - 2026-05-07
+
+Scope: `src/violawake_sdk/` and `console/backend/`, focused on latent fresh-deploy failures in the synchronous training path and adjacent SDK evaluation/runtime paths.
+
+Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.
+
+One-line fixes applied:
+- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
+- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.
+
+## Findings
+
+### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
+
+File/lines: `console/backend/entrypoint.sh:8-15`
+
+Evidence:
+```sh
+for i in 1 2 3 4 5; do
+    if alembic upgrade head; then
+        break
+    fi
+    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
+    sleep 3
+done
+```
+
+Before `9cbb8e6`, the loop had no final failure exit, so the fifth failed try fell through and the backend could start against an unmigrated or unavailable database. Proposed fix: applied in `9cbb8e6`; keep this fail-fast behavior and consider logging a clearer final "migration failed" line. Estimated effort: done; optional polish 15 minutes.
+
+### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
+
+File/lines: `src/violawake_sdk/tools/train.py:1060-1063`, `src/violawake_sdk/tools/train.py:1159-1165`, `src/violawake_sdk/tools/train.py:1203-1208`, `console/backend/app/job_queue.py:834`
+
+Evidence:
+```py
+except ImportError as e:
+    print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
+    print("Install with: pip install 'violawake[training]'", file=sys.stderr)
+    sys.exit(1)
+
+if len(pos_embs) < 5:
+    ...
+    sys.exit(1)
+
+if len(all_neg_embs) < 5:
+    ...
+    sys.exit(1)
+```
+```py
+except Exception as exc:
+    ...
+```
+
+`SystemExit` inherits from `BaseException`, not `Exception`, so these library-path exits bypass the job failure handler. A fresh deploy with missing training deps, bad embeddings, or no usable negatives can leave the task exception unhandled instead of marking the job failed cleanly. Proposed fix: split CLI exits from library behavior. Make `_train_temporal_cnn` and other callable helpers raise `RuntimeError`/typed exceptions, and let `main()` convert them to exit codes. As a short-term guard, catch `SystemExit` at the backend boundary and mark the job failed without swallowing `asyncio.CancelledError`. Estimated effort: 0.5-1 day with tests.
+
+### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
+
+File/lines: `src/violawake_sdk/training/evaluate.py:264`, `src/violawake_sdk/training/evaluate.py:327`, `src/violawake_sdk/tools/contamination_check.py:114`
+
+Evidence:
+```py
+oww = OWWModel(inference_framework="onnx")
+```
+
+Pre-fix grep found default constructors at those sites. They are adjacent to training/evaluation workflows and would have failed the same way as `tools/train.py` when the container has a `tflite_runtime` that cannot read the current OpenWakeWord `.tflite` schema. Proposed fix: applied in `0c3923b`; keep `grep -R "OWWModel()" src/violawake_sdk console/backend` as a regression check. Estimated effort: done.
+
+### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
+
+File/lines: `console/backend/entrypoint.sh:22-23`
+
+Evidence:
+```sh
+echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
+python -c "from openwakeword.utils import download_models; download_models()"
+```
+
+Before `9cbb8e6`, this command was piped through `tail -3 || true`. In POSIX `sh`, pipeline status comes from `tail`, so a failing Python download could still return success and the app would become healthy until first training. Proposed fix: applied in `9cbb8e6`; if output trimming is needed later, use a shell that supports `pipefail` or capture logs without hiding exit status. Estimated effort: done.
+
+### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
+
+File/lines: `src/violawake_sdk/tools/train.py:278`, `src/violawake_sdk/tools/train.py:547`, `src/violawake_sdk/tools/train.py:587`, `src/violawake_sdk/tools/train.py:418-434`, `console/Dockerfile.backend:20`, `pyproject.toml:87-100`
+
+Evidence:
+```py
+communicate = edge_tts.Communicate(text, voice)
+ok = _edge_tts_synthesize(word, voice, out_path)
+ok = _edge_tts_synthesize(phrase, voice, out_path)
+```
+```dockerfile
+RUN pip install --no-cache-dir "/sdk[training]"
+```
+
+The production image installs `violawake[training]`, which includes `edge-tts` but not the `tts` extra (`kokoro-onnx`). `_generate_tts_positives()` has a Kokoro fallback probe, but `TTSEngine` only loads `kokoro-onnx` and its large model files on first synthesis. Confusable and speech negatives do not try Kokoro at all. On a customer machine without internet, behind a proxy, or during Edge rate limiting, quality drops and training can fail if the mounted corpus is absent or too small. Proposed fix: make offline training explicit. Either ship/pre-download an offline TTS backend and its model assets into the image/cache, or remove online TTS from the synchronous path and require a mounted negative/positive corpus with a startup readiness check. Estimated effort: 1-2 days.
+
+### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
+
+File/lines: `console/backend/app/database.py:17-23`, `console/backend/app/job_queue.py:151`, `docker-compose.production.yml:36`
+
+Evidence:
+```py
+DATABASE_URL = settings.db_url.strip() if settings.db_url and settings.db_url.strip() else ...
+self._db_path = db_path or (settings.data_dir / "job_queue.db")
+```
+
+The app data model can use `VIOLAWAKE_DB_URL=postgresql+asyncpg://...`, but training job state is always an `aiosqlite` file under `/app/data`. The production compose file has a volume, so single-node Docker is survivable. Fresh deploys without a persistent `/app/data` volume, horizontal replicas, or platform restarts can lose pending/running job state while the main database remains intact. Proposed fix: move the queue tables into Postgres or add an explicit deployment guard that refuses multi-replica/ephemeral-volume production mode. Estimated effort: 1-2 days.
+
+### 7. MEDIUM - Universal negative corpus is optional in the console path but quality-critical
+
+File/lines: `console/backend/app/services/training_service.py:230-270`, `src/violawake_sdk/tools/train.py:2472-2485`, `docker-compose.production.yml:25-32`
+
+Evidence:
+```py
+_CORPUS_SEARCH_PATHS = [
+    Path(__file__).resolve().parent.parent.parent.parent / "corpus",
+    Path.home() / ".violawake" / "corpus",
+    Path("corpus"),
+]
+if total_neg < 5:
+    raise RuntimeError("Only %s negative files generated. edge-tts may not be installed or network unavailable." % total_neg)
+```
+
+The CLI warns when no universal corpus exists, but the backend only hard-fails when total negatives fall below five. If Edge TTS produces enough synthetic negatives, training can succeed without LibriSpeech/MUSAN and create high false-positive models. Production compose mounts `./corpus:/app/corpus:ro`, but Docker will create an empty host directory if it is missing. Proposed fix: add a backend warning/progress event and a startup/health check for minimum corpus counts, or require `VIOLAWAKE_NEGATIVES_CORPUS_DIR`/`/app/corpus` in production. Estimated effort: 0.5 day.
+
+### 8. MEDIUM - Quota enforcement is clean at limit but not atomic with job submission
+
+File/lines: `console/backend/app/routes/billing.py:252-308`, `console/backend/app/routes/jobs.py:101-120`
+
+Evidence:
+```py
+if used >= limit:
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, ...)
+...
+job_id = await queue.submit_job(...)
+await record_usage(db, current_user.id, action="training_job")
+```
+
+If a user is already at quota, the route returns a clean 403, not a 500. The latent issue is concurrency and partial failure: two requests at `limit - 1` can both pass `check_training_quota()`, and a queue insert can succeed before `record_usage()` fails or increments. Proposed fix: reserve usage atomically before queue submission, or put queue submission and usage reservation behind one database transaction/outbox. Estimated effort: 0.5-1 day.
+
+### 9. MEDIUM - Health/readiness does not validate training prerequisites
+
+File/lines: `console/backend/app/health.py:122-155`, `console/backend/app/config.py:195-198`
+
+Evidence:
+```py
+upload_dir = _check_directory(settings.upload_dir)
+models_dir = _check_directory(settings.models_dir)
+component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
+```
+
+Runtime dirs are created at import, and health checks database, queue, uploads, and models. It does not check `tmp_dir` writability, OpenWakeWord resource presence, universal corpus counts, or importability of training-only dependencies. After the entrypoint fixes, OWW download is a startup prerequisite in Docker, but local/fresh SDK starts and non-Docker runs can still report healthy before first training fails. Proposed fix: add a training readiness component with cheap checks and a more expensive admin-only preflight. Estimated effort: 0.5 day.
+
+### 10. MEDIUM - SDK model cache defaults to `Path.home()` and is not bound to the backend data volume
+
+File/lines: `src/violawake_sdk/models.py:27`, `src/violawake_sdk/models.py:152-158`, `src/violawake_sdk/wake_detector.py:631-637`, `src/violawake_sdk/tts.py:128-129`
+
+Evidence:
+```py
+DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
+model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
+model_dir.mkdir(parents=True, exist_ok=True)
+```
+
+SDK inference and Kokoro TTS auto-download into the process home directory unless `VIOLAWAKE_MODEL_DIR` is set. The backend Dockerfile/compose do not set that env var to `/app/data/models`, so downloaded SDK assets can be ephemeral or unwritable depending on platform user/home behavior. Proposed fix: set `VIOLAWAKE_MODEL_DIR=/app/data/sdk-models` in backend production config and document the same for fresh SDK deploys that disable internet access. Estimated effort: 15-30 minutes plus deploy validation.
+
+## Lazy Import Inventory
+
+`src/violawake_sdk/tools/train.py`: lazy imports at lines 68, 227, 267-272, 290, 304, 322, 343-344, 363-366, 400-418, 472-473, 522, 604-606, 634-637, 686-689, 721-724, 820, 865-871, 950, 1056-1071, 1087-1088, 1396-1397, 1447, 1538-1543, 1728-1750, 2066-2068, 2451, 2586. Risky runtime-only deps here: `openwakeword`, `edge_tts`, `pydub`, `torchaudio`, `audiomentations`, `torch`, `onnx`, `onnxruntime`, `scipy`, `violawake_sdk.tts`/Kokoro.
+
+`src/violawake_sdk/oww_backbone.py`: lazy imports at lines 126 (`openwakeword.utils.download_models`) and 187 (`MODEL_REGISTRY`). The resource download is now enforced in Docker entrypoint, but SDK/local usage still auto-downloads on first missing-resource access.
+
+Backend services/platform lazy imports: `console/backend/app/services/training_service.py:100` imports SDK training helpers only when a job runs; `app/storage.py:151-152` and `198` import `boto3`/`botocore` only when R2 is configured or queried; `app/routes/billing.py:69`, `273`, `277` import Stripe/email/asyncio lazily; `app/routes/recordings.py:128-156` imports `io`, `numpy`, and `scipy` during upload validation/resampling; `app/middleware.py:111-112` imports Sentry only when configured; `app/main.py:42`, `76`, `170` imports retention/email services at startup/admin execution; `app/job_queue.py:117`, `791`, `1170`, `1195` imports subscription/email/retention helpers at priority, completion, and cleanup time.
+
+## External Network Dependencies
+
+Synchronous training path: Edge TTS at `src/violawake_sdk/tools/train.py:278`, OpenWakeWord `download_models()` at `src/violawake_sdk/oww_backbone.py:126-132` and `console/backend/entrypoint.sh:23`, optional Kokoro model downloads via `get_model_path()` in `src/violawake_sdk/tts.py:128-129`, and optional object storage calls through R2 in `console/backend/app/storage.py:151-217`.
+
+SDK/download path: model auto-download via `urllib.request.urlopen()` in `src/violawake_sdk/models.py:197-267`, explicit downloads via `requests.get()` in `src/violawake_sdk/models.py:483`, corpus expansion via `requests.get()` in `src/violawake_sdk/tools/expand_corpus.py:89`, and certificate-pinned downloads in `src/violawake_sdk/security/cert_pinning.py:664-706`.
+
+## Filesystem Expectations
+
+Required backend paths: `console/backend/app/config.py:30-35` defines `data_dir`, `db_path`, `upload_dir`, `models_dir`, and `tmp_dir`; `config.py:195-198` creates them at import. `job_queue.py:151` requires a writable local `data/job_queue.db`; `training_service.py:72`, `107`, and `job_queue.py:707` require `settings.tmp_dir` for temporary training work.
+
+Optional but quality-critical corpus paths: `training_service.py:231-234`, `tools/train.py:2424-2428`, and `console/backend/scripts/train_full_pipeline.py:198-202` search repo `corpus`, `~/.violawake/corpus`, and CWD `corpus`. Absence is not fatal unless too few negatives remain.
+
+SDK model cache path: `src/violawake_sdk/models.py:27` and `152-158` use `~/.violawake/models` unless `VIOLAWAKE_MODEL_DIR` is set.
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849b57da804d87d79e1e8a6e42b3d3b6d3a2..bc134136aaa00bd5e32a5e3ff573d854b0b6f2ae
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577f5e01cb4c3dceffcb9bca8aaf607ca497..508ecdcb73617a30016d4efb4e3e5a1916281b1b
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "head -100 docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "bash -n console/backend/entrypoint.sh"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git status --short --branch"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -RInI --include='*.py' 'OWWModel()' src/violawake_sdk console/backend | head -20\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1483ms:
# Training Pipeline Audit - 2026-05-07

Scope: `src/violawake_sdk/` and `console/backend/`, focused on latent fresh-deploy failures in the synchronous training path and adjacent SDK evaluation/runtime paths.

Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.

One-line fixes applied:
- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.

## Findings

### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed

File/lines: `console/backend/entrypoint.sh:8-15`

Evidence:
```sh
for i in 1 2 3 4 5; do
    if alembic upgrade head; then
        break
    fi
    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
    [ "$i" = "5" ] && exit 1
    sleep 3
done
```

Before `9cbb8e6`, the loop had no final failure exit, so the fifth failed try fell through and the backend could start against an unmigrated or unavailable database. Proposed fix: applied in `9cbb8e6`; keep this fail-fast behavior and consider logging a clearer final "migration failed" line. Estimated effort: done; optional polish 15 minutes.

### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path

File/lines: `src/violawake_sdk/tools/train.py:1060-1063`, `src/violawake_sdk/tools/train.py:1159-1165`, `src/violawake_sdk/tools/train.py:1203-1208`, `console/backend/app/job_queue.py:834`

Evidence:
```py
except ImportError as e:
    print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
    print("Install with: pip install 'violawake[training]'", file=sys.stderr)
    sys.exit(1)

if len(pos_embs) < 5:
    ...
    sys.exit(1)

if len(all_neg_embs) < 5:
    ...
    sys.exit(1)
```
```py
except Exception as exc:
    ...
```

`SystemExit` inherits from `BaseException`, not `Exception`, so these library-path exits bypass the job failure handler. A fresh deploy with missing training deps, bad embeddings, or no usable negatives can leave the task exception unhandled instead of marking the job failed cleanly. Proposed fix: split CLI exits from library behavior. Make `_train_temporal_cnn` and other callable helpers raise `RuntimeError`/typed exceptions, and let `main()` convert them to exit codes. As a short-term guard, catch `SystemExit` at the backend boundary and mark the job failed without swallowing `asyncio.CancelledError`. Estimated effort: 0.5-1 day with tests.
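
The short-term guard described above could look roughly like the following. This is a minimal sketch, not the backend's actual code: `TrainingAborted`, `run_training_step`, `run_job`, and the `mark_failed` callback are hypothetical names introduced for illustration, and the real handler in `job_queue.py` may be structured differently.

```python
import asyncio


class TrainingAborted(RuntimeError):
    """Hypothetical error type: a library helper tried to exit the process."""


def run_training_step(step, *args, **kwargs):
    """Call a synchronous training helper, converting SystemExit to an exception."""
    try:
        return step(*args, **kwargs)
    except SystemExit as exc:
        # SystemExit derives from BaseException, so `except Exception` misses it.
        raise TrainingAborted(
            f"training helper exited with code {exc.code!r}"
        ) from exc


async def run_job(step, mark_failed):
    """Run one job; `mark_failed` is a hypothetical callback that records failure."""
    try:
        # Run the blocking helper off the event loop (Python 3.9+).
        return await asyncio.to_thread(run_training_step, step)
    except asyncio.CancelledError:
        raise  # never swallow task cancellation
    except Exception as exc:
        mark_failed(str(exc))
        return None
```

Converting at the call boundary keeps the CLI behavior intact while the longer-term split between `main()` and the library helpers is still pending.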

### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path

File/lines: `src/violawake_sdk/training/evaluate.py:264`, `src/violawake_sdk/training/evaluate.py:327`, `src/violawake_sdk/tools/contamination_check.py:114`

Evidence:
```py
oww = OWWModel(inference_framework="onnx")
```

Pre-fix grep found default constructors at those sites. They are adjacent to training/evaluation workflows and would have failed the same way as `tools/train.py` when the container has a `tflite_runtime` that cannot read the current OpenWakeWord `.tflite` schema. Proposed fix: applied in `0c3923b`; keep `grep -R "OWWModel()" src/violawake_sdk console/backend` as a regression check. Estimated effort: done.
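
As an alternative to the grep, the regression check could be expressed as a small script or unit test. The sketch below is an assumption about how one might do it, not part of the repository: `find_unpinned_oww_calls` is a hypothetical helper, and it only inspects single-line calls, so a constructor whose arguments span several lines would not be detected.

```python
import re
from pathlib import Path

# Matches OWWModel(...) calls on one line; group 1 is the argument text.
_OWW_CALL = re.compile(r"OWWModel\(([^()]*)\)")


def find_unpinned_oww_calls(roots):
    """Return (path, line number, line) for calls missing inference_framework=."""
    hits = []
    for root in roots:
        for path in sorted(Path(root).rglob("*.py")):
            text = path.read_text(encoding="utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                for match in _OWW_CALL.finditer(line):
                    if "inference_framework" not in match.group(1):
                        hits.append((str(path), lineno, line.strip()))
    return hits
```

A test could then assert that `find_unpinned_oww_calls(["src/violawake_sdk", "console/backend"])` returns an empty list when run from the project root.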

### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior

File/lines: `console/backend/entrypoint.sh:22-23`

Evidence:
```sh
echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
python -c "from openwakeword.utils import download_models; download_models()"
```

Before `9cbb8e6`, this command was piped through `tail -3 || true`. In POSIX `sh`, pipeline status comes from `tail`, so a failing Python download could still return success and the app would become healthy until first training. Proposed fix: applied in `9cbb8e6`; if output trimming is needed later, use a shell that supports `pipefail` or capture logs without hiding exit status. Estimated effort: done.
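
If trimmed output is wanted again, one option is to do the trimming in Python rather than in a shell pipeline, so the exit status of the download is never replaced by that of `tail`. This is a rough sketch under that assumption; `run_and_tail` is a hypothetical helper and is not in the entrypoint today.

```python
import subprocess
import sys


def run_and_tail(cmd, keep=3):
    """Run cmd, echo only its last `keep` output lines, and return its exit code."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr, like `2>&1`
        text=True,
    )
    for line in proc.stdout.splitlines()[-keep:]:
        print(line)
    # The caller decides what to do with a non-zero status; nothing is masked.
    return proc.returncode
```

A wrapper script could call this with `[sys.executable, "-c", "from openwakeword.utils import download_models; download_models()"]` and pass the result to `sys.exit()`, which keeps a failed download visible to `set -e`.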

### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed

File/lines: `src/violawake_sdk/tools/train.py:278`, `src/violawake_sdk/tools/train.py:547`, `src/violawake_sdk/tools/train.py:587`, `src/violawake_sdk/tools/train.py:418-434`, `console/Dockerfile.backend:20`, `pyproject.toml:87-100`

Evidence:
```py
communicate = edge_tts.Communicate(text, voice)
ok = _edge_tts_synthesize(word, voice, out_path)
ok = _edge_tts_synthesize(phrase, voice, out_path)
```
```dockerfile
RUN pip install --no-cache-dir "/sdk[training]"
```

The production image installs `violawake[training]`, which includes `edge-tts` but not the `tts` extra (`kokoro-onnx`). `_generate_tts_positives()` has a Kokoro fallback probe, but `TTSEngine` only loads `kokoro-onnx` and its large model files on first synthesis. Confusable and speech negatives do not try Kokoro at all. On a customer machine without internet, behind a proxy, or during Edge rate limiting, quality drops and training can fail if the mounted corpus is absent or too small. Proposed fix: make offline training explicit. Either ship/pre-download an offline TTS backend and its model assets into the image/cache, or remove online TTS from the synchronous path and require a mounted negative/positive corpus with a startup readiness check. Estimated effort: 1-2 days.

### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres

File/lines: `console/backend/app/database.py:17-23`, `console/backend/app/job_queue.py:151`, `docker-compose.production.yml:36`

Evidence:

 succeeded in 1530ms:
 succeeded in 1633ms:
## master...origin/master [ahead 2]
 M pyproject.toml
 M src/violawake_sdk/tools/train.py
 M tests/unit/test_train.py
?? .codex_log_a.txt
?? .codex_log_alpha.txt
?? .codex_log_b.txt
?? .codex_log_c.txt
?? .codex_log_delta.txt
?? .codex_prompt_a.txt
?? .codex_prompt_alpha.txt
?? .codex_prompt_b.txt
?? .codex_prompt_c.txt
?? .codex_prompt_delta.txt
?? docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md

 succeeded in 2448ms:
diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c88348827c34555b836785f894660bba10d..6ca9a2b24e4b434dd890c3ac79f92b9b9f0cfe19
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data
diff --git a/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1483b9a90ebcd7189686dc752d589bbe022d08d
--- /dev/null
+++ b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
@@ -0,0 +1,186 @@
+# Training Pipeline Audit - 2026-05-07
+
+Scope: `src/violawake_sdk/` and `console/backend/`, focused on latent fresh-deploy failures in the synchronous training path and adjacent SDK evaluation/runtime paths.
+
+Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.
+
+One-line fixes applied:
+- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
+- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.
+
+## Findings
+
+### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
+
+File/lines: `console/backend/entrypoint.sh:8-15`
+
+Evidence:
+```sh
+for i in 1 2 3 4 5; do
+    if alembic upgrade head; then
+        break
+    fi
+    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
+    sleep 3
+done
+```
+
+Before `9cbb8e6`, the loop had no final failure exit, so the fifth failed try fell through and the backend could start against an unmigrated or unavailable database. Proposed fix: applied in `9cbb8e6`; keep this fail-fast behavior and consider logging a clearer final "migration failed" line. Estimated effort: done; optional polish 15 minutes.
+
+### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
+
+File/lines: `src/violawake_sdk/tools/train.py:1060-1063`, `src/violawake_sdk/tools/train.py:1159-1165`, `src/violawake_sdk/tools/train.py:1203-1208`, `console/backend/app/job_queue.py:834`
+
+Evidence:
+```py
+except ImportError as e:
+    print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
+    print("Install with: pip install 'violawake[training]'", file=sys.stderr)
+    sys.exit(1)
+
+if len(pos_embs) < 5:
+    ...
+    sys.exit(1)
+
+if len(all_neg_embs) < 5:
+    ...
+    sys.exit(1)
+```
+```py
+except Exception as exc:
+    ...
+```
+
+`SystemExit` inherits from `BaseException`, not `Exception`, so these library-path exits bypass the job failure handler. A fresh deploy with missing training deps, bad embeddings, or no usable negatives can leave the task exception unhandled instead of marking the job failed cleanly. Proposed fix: split CLI exits from library behavior. Make `_train_temporal_cnn` and other callable helpers raise `RuntimeError`/typed exceptions, and let `main()` convert them to exit codes. As a short-term guard, catch `SystemExit` at the backend boundary and mark the job failed without swallowing `asyncio.CancelledError`. Estimated effort: 0.5-1 day with tests.
+
+### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
+
+File/lines: `src/violawake_sdk/training/evaluate.py:264`, `src/violawake_sdk/training/evaluate.py:327`, `src/violawake_sdk/tools/contamination_check.py:114`
+
+Evidence:
+```py
+oww = OWWModel(inference_framework="onnx")
+```
+
+Pre-fix grep found default constructors at those sites. They are adjacent to training/evaluation workflows and would have failed the same way as `tools/train.py` when the container has a `tflite_runtime` that cannot read the current OpenWakeWord `.tflite` schema. Proposed fix: applied in `0c3923b`; keep `grep -R "OWWModel()" src/violawake_sdk console/backend` as a regression check. Estimated effort: done.
+
+### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
+
+File/lines: `console/backend/entrypoint.sh:22-23`
+
+Evidence:
+```sh
+echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
+python -c "from openwakeword.utils import download_models; download_models()"
+```
+
+Before `9cbb8e6`, this command was piped through `tail -3 || true`. In POSIX `sh`, pipeline status comes from `tail`, so a failing Python download could still return success and the app would become healthy until first training. Proposed fix: applied in `9cbb8e6`; if output trimming is needed later, use a shell that supports `pipefail` or capture logs without hiding exit status. Estimated effort: done.
+
+### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
+
+File/lines: `src/violawake_sdk/tools/train.py:278`, `src/violawake_sdk/tools/train.py:547`, `src/violawake_sdk/tools/train.py:587`, `src/violawake_sdk/tools/train.py:418-434`, `console/Dockerfile.backend:20`, `pyproject.toml:87-100`
+
+Evidence:
+```py
+communicate = edge_tts.Communicate(text, voice)
+ok = _edge_tts_synthesize(word, voice, out_path)
+ok = _edge_tts_synthesize(phrase, voice, out_path)
+```
+```dockerfile
+RUN pip install --no-cache-dir "/sdk[training]"
+```
+
+The production image installs `violawake[training]`, which includes `edge-tts` but not the `tts` extra (`kokoro-onnx`). `_generate_tts_positives()` has a Kokoro fallback probe, but `TTSEngine` only loads `kokoro-onnx` and its large model files on first synthesis. Confusable and speech negatives do not try Kokoro at all. On a customer machine without internet, behind a proxy, or during Edge rate limiting, quality drops and training can fail if the mounted corpus is absent or too small. Proposed fix: make offline training explicit. Either ship/pre-download an offline TTS backend and its model assets into the image/cache, or remove online TTS from the synchronous path and require a mounted negative/positive corpus with a startup readiness check. Estimated effort: 1-2 days.
+
+### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
+
+File/lines: `console/backend/app/database.py:17-23`, `console/backend/app/job_queue.py:151`, `docker-compose.production.yml:36`
+
+Evidence:
+```py
+DATABASE_URL = settings.db_url.strip() if settings.db_url and settings.db_url.strip() else ...
+self._db_path = db_path or (settings.data_dir / "job_queue.db")
+```
+
+The app data model can use `VIOLAWAKE_DB_URL=postgresql+asyncpg://...`, but training job state is always an `aiosqlite` file under `/app/data`. The production compose file has a volume, so single-node Docker is survivable. Fresh deploys without a persistent `/app/data` volume, horizontal replicas, or platform restarts can lose pending/running job state while the main database remains intact. Proposed fix: move the queue tables into Postgres or add an explicit deployment guard that refuses multi-replica/ephemeral-volume production mode. Estimated effort: 1-2 days.
+
+### 7. MEDIUM - Universal negative corpus is optional in the console path but quality-critical
+
+File/lines: `console/backend/app/services/training_service.py:230-270`, `src/violawake_sdk/tools/train.py:2472-2485`, `docker-compose.production.yml:25-32`
+
+Evidence:
+```py
+_CORPUS_SEARCH_PATHS = [
+    Path(__file__).resolve().parent.parent.parent.parent / "corpus",
+    Path.home() / ".violawake" / "corpus",
+    Path("corpus"),
+]
+if total_neg < 5:
+    raise RuntimeError("Only %s negative files generated. edge-tts may not be installed or network unavailable." % total_neg)
+```
+
+The CLI warns when no universal corpus exists, but the backend only hard-fails when total negatives fall below five. If Edge TTS produces enough synthetic negatives, training can succeed without LibriSpeech/MUSAN and create high false-positive models. Production compose mounts `./corpus:/app/corpus:ro`, but Docker will create an empty host directory if it is missing. Proposed fix: add a backend warning/progress event and a startup/health check for minimum corpus counts, or require `VIOLAWAKE_NEGATIVES_CORPUS_DIR`/`/app/corpus` in production. Estimated effort: 0.5 day.
+
+### 8. MEDIUM - Quota enforcement is clean at limit but not atomic with job submission
+
+File/lines: `console/backend/app/routes/billing.py:252-308`, `console/backend/app/routes/jobs.py:101-120`
+
+Evidence:
+```py
+if used >= limit:
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, ...)
+...
+job_id = await queue.submit_job(...)
+await record_usage(db, current_user.id, action="training_job")
+```
+
+If a user is already at quota, the route returns a clean 403, not a 500. The latent issue is concurrency and partial failure: two requests at `limit - 1` can both pass `check_training_quota()`, and a queue insert can succeed before `record_usage()` fails or increments. Proposed fix: reserve usage atomically before queue submission, or put queue submission and usage reservation behind one database transaction/outbox. Estimated effort: 0.5-1 day.
+
+### 9. MEDIUM - Health/readiness does not validate training prerequisites
+
+File/lines: `console/backend/app/health.py:122-155`, `console/backend/app/config.py:195-198`
+
+Evidence:
+```py
+upload_dir = _check_directory(settings.upload_dir)
+models_dir = _check_directory(settings.models_dir)
+component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
+```
+
+Runtime dirs are created at import, and health checks database, queue, uploads, and models. It does not check `tmp_dir` writability, OpenWakeWord resource presence, universal corpus counts, or importability of training-only dependencies. After the entrypoint fixes, OWW download is a startup prerequisite in Docker, but local/fresh SDK starts and non-Docker runs can still report healthy before first training fails. Proposed fix: add a training readiness component with cheap checks and a more expensive admin-only preflight. Estimated effort: 0.5 day.
+
+### 10. MEDIUM - SDK model cache defaults to `Path.home()` and is not bound to the backend data volume
+
+File/lines: `src/violawake_sdk/models.py:27`, `src/violawake_sdk/models.py:152-158`, `src/violawake_sdk/wake_detector.py:631-637`, `src/violawake_sdk/tts.py:128-129`
+
+Evidence:
+```py
+DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
+model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
+model_dir.mkdir(parents=True, exist_ok=True)
+```
+
+SDK inference and Kokoro TTS auto-download into the process home directory unless `VIOLAWAKE_MODEL_DIR` is set. The backend Dockerfile/compose do not set that env var to `/app/data/models`, so downloaded SDK assets can be ephemeral or unwritable depending on platform user/home behavior. Proposed fix: set `VIOLAWAKE_MODEL_DIR=/app/data/sdk-models` in backend production config and document the same for fresh SDK deploys that disable internet access. Estimated effort: 15-30 minutes plus deploy validation.
+
+## Lazy Import Inventory
+
+`src/violawake_sdk/tools/train.py`: lazy imports at lines 68, 227, 267-272, 290, 304, 322, 343-344, 363-366, 400-418, 472-473, 522, 604-606, 634-637, 686-689, 721-724, 820, 865-871, 950, 1056-1071, 1087-1088, 1396-1397, 1447, 1538-1543, 1728-1750, 2066-2068, 2451, 2586. Risky runtime-only deps here: `openwakeword`, `edge_tts`, `pydub`, `torchaudio`, `audiomentations`, `torch`, `onnx`, `onnxruntime`, `scipy`, `violawake_sdk.tts`/Kokoro.
+
+`src/violawake_sdk/oww_backbone.py`: lazy imports at lines 126 (`openwakeword.utils.download_models`) and 187 (`MODEL_REGISTRY`). The resource download is now enforced in Docker entrypoint, but SDK/local usage still auto-downloads on first missing-resource access.
+
+Backend services/platform lazy imports: `console/backend/app/services/training_service.py:100` imports SDK training helpers only when a job runs; `app/storage.py:151-152` and `198` import `boto3`/`botocore` only when R2 is configured or queried; `app/routes/billing.py:69`, `273`, `277` import Stripe/email/asyncio lazily; `app/routes/recordings.py:128-156` imports `io`, `numpy`, and `scipy` during upload validation/resampling; `app/middleware.py:111-112` imports Sentry only when configured; `app/main.py:42`, `76`, `170` imports retention/email services at startup/admin execution; `app/job_queue.py:117`, `791`, `1170`, `1195` imports subscription/email/retention helpers at priority, completion, and cleanup time.
+
+## External Network Dependencies
+
+Synchronous training path: Edge TTS at `src/violawake_sdk/tools/train.py:278`, OpenWakeWord `download_models()` at `src/violawake_sdk/oww_backbone.py:126-132` and `console/backend/entrypoint.sh:23`, optional Kokoro model downloads via `get_model_path()` in `src/violawake_sdk/tts.py:128-129`, and optional object storage calls through R2 in `console/backend/app/storage.py:151-217`.
+
+SDK/download path: model auto-download via `urllib.request.urlopen()` in `src/violawake_sdk/models.py:197-267`, explicit downloads via `requests.get()` in `src/violawake_sdk/models.py:483`, corpus expansion via `requests.get()` in `src/violawake_sdk/tools/expand_corpus.py:89`, and certificate-pinned downloads in `src/violawake_sdk/security/cert_pinning.py:664-706`.
+
+## Filesystem Expectations
+
+Required backend paths: `console/backend/app/config.py:30-35` defines `data_dir`, `db_path`, `upload_dir`, `models_dir`, and `tmp_dir`; `config.py:195-198` creates them at import. `job_queue.py:151` requires a writable local `data/job_queue.db`; `training_service.py:72`, `107`, and `job_queue.py:707` require `settings.tmp_dir` for temporary training work.
+
+Optional but quality-critical corpus paths: `training_service.py:231-234`, `tools/train.py:2424-2428`, and `console/backend/scripts/train_full_pipeline.py:198-202` search repo `corpus`, `~/.violawake/corpus`, and CWD `corpus`. Absence is not fatal unless too few negatives remain.
+
+SDK model cache path: `src/violawake_sdk/models.py:27` and `152-158` use `~/.violawake/models` unless `VIOLAWAKE_MODEL_DIR` is set.
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849b57da804d87d79e1e8a6e42b3d3b6d3a2..bc134136aaa00bd5e32a5e3ff573d854b0b6f2ae
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577f5e01cb4c3dceffcb9bca8aaf607ca497..508ecdcb73617a30016d4efb4e3e5a1916281b1b
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"

codex
The audit doc is written and `head -100` shows the summary plus the first six findings. I also noticed new unrelated modifications in `pyproject.toml`, `src/violawake_sdk/tools/train.py`, and `tests/unit/test_train.py` after the earlier clean status; I’m treating those as user/workspace changes and will not revert them.
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git diff --stat"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git diff -- pyproject.toml | head -120"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git diff -- src/violawake_sdk/tools/train.py | head -160"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git diff -- tests/unit/test_train.py | head -160"' in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1560ms:
diff --git a/tests/unit/test_train.py b/tests/unit/test_train.py
index 9c3521e..be62dbf 100644
--- a/tests/unit/test_train.py
+++ b/tests/unit/test_train.py
@@ -51,6 +51,59 @@ class TestTrainHelpers:
         assert saved["wake_word"] == "viola"
         assert saved["auto_eval"]["status"] == "ok"
 
+    def test_edge_tts_synthesize_decodes_with_soundfile(self, tmp_path: Path) -> None:
+        import numpy as np
+
+        class FakeCommunicate:
+            def __init__(self, text: str, voice: str) -> None:
+                self.text = text
+                self.voice = voice
+
+            async def stream(self):
+                yield {"type": "audio", "data": b"fake mp3 bytes" * 20}
+
+        edge_tts_module = ModuleType("edge_tts")
+        edge_tts_module.Communicate = FakeCommunicate
+        soundfile_module = ModuleType("soundfile")
+        soundfile_module.read = MagicMock(
+            return_value=(np.zeros(16000, dtype=np.float32), 16000)
+        )
+
+        out_path = tmp_path / "tts.wav"
+        with patch.dict(
+            sys.modules,
+            {"edge_tts": edge_tts_module, "soundfile": soundfile_module},
+        ):
+            assert train._edge_tts_synthesize("hello", "en-US-JennyNeural", out_path)
+
+        assert out_path.stat().st_size > 44
+        soundfile_module.read.assert_called_once()
+
+    def test_confusable_generation_logs_zero_edge_tts_outputs(
+        self, caplog: pytest.LogCaptureFixture, tmp_path: Path
+    ) -> None:
+        train._LAST_EDGE_TTS_ERROR = "pydub decode failed: missing ffprobe"
+
+        with (
+            caplog.at_level("ERROR", logger=train.logger.name),
+            patch(
+                "violawake_sdk.tools.confusables.generate_confusables",
+                return_value=["violas"],
+            ),
+            patch("violawake_sdk.tools.train._edge_tts_synthesize", return_value=False),
+        ):
+            generated = train._generate_confusable_negatives(
+                "viola",
+                tmp_path,
+                n_confusables=1,
+                voices_per_word=1,
+                verbose=False,
+            )
+
+        assert generated == []
+        assert "edge-tts confusable negative generation produced 0 files" in caplog.text
+        assert "missing ffprobe" in caplog.text
+
 
 class TestTrainMainValidation:
     def test_main_exits_when_positives_dir_is_missing(
warning: in the working copy of 'tests/unit/test_train.py', LF will be replaced by CRLF the next time Git touches it

 succeeded in 1564ms:
 pyproject.toml                   |   1 +
 src/violawake_sdk/tools/train.py | 187 ++++++++++++++++++++++++++++++---------
 tests/unit/test_train.py         |  53 +++++++++++
 3 files changed, 199 insertions(+), 42 deletions(-)
warning: in the working copy of 'pyproject.toml', LF will be replaced by CRLF the next time Git touches it
warning: in the working copy of 'tests/unit/test_train.py', LF will be replaced by CRLF the next time Git touches it

 succeeded in 1565ms:
diff --git a/pyproject.toml b/pyproject.toml
index dc3daa9..e437145 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -91,6 +91,7 @@ training = [
     "openwakeword>=0.6",
     "audiomentations>=0.37",
     "librosa>=0.10",
+    "soundfile>=0.12",
     "scikit-learn>=1.3",
     "matplotlib>=3.8",
     "pandas>=2.1",
warning: in the working copy of 'pyproject.toml', LF will be replaced by CRLF the next time Git touches it

 succeeded in 1565ms:
diff --git a/src/violawake_sdk/tools/train.py b/src/violawake_sdk/tools/train.py
index 68b6e1e..558f771 100644
--- a/src/violawake_sdk/tools/train.py
+++ b/src/violawake_sdk/tools/train.py
@@ -54,6 +54,7 @@ from __future__ import annotations
 
 import argparse
 import json
+import logging
 import math
 import os
 import shutil
@@ -67,10 +68,14 @@ from typing import TYPE_CHECKING, Any
 if TYPE_CHECKING:
     import numpy as np
 
+logger = logging.getLogger(__name__)
+
 # Module-level temp directory override. When set, all tempfile operations use
 # this instead of the OS default (which may be on a small system drive).
 # Set by _train_temporal_cnn() via its tmp_dir parameter.
 _TMP_DIR: str | None = None
+_LAST_EDGE_TTS_ERROR: str | None = None
+_REPORTED_EDGE_TTS_ERRORS: set[str] = set()
 
 # ---------------------------------------------------------------------------
 # Edge-TTS voice pool for diverse positive and negative generation
@@ -259,6 +264,29 @@ def get_best_provider(device: str | None = None) -> str:
 # ---------------------------------------------------------------------------
 
 
+def _edge_tts_fail(text: str, voice: str, detail: str | BaseException) -> bool:
+    """Record and log an edge-tts failure while preserving the bool API."""
+    global _LAST_EDGE_TTS_ERROR
+
+    if isinstance(detail, BaseException):
+        summary = f"{type(detail).__name__}: {detail}"
+    else:
+        summary = detail
+    _LAST_EDGE_TTS_ERROR = summary
+
+    # A missing decoder causes hundreds of identical per-sample failures. Log
+    # the actual exception once, then the generator summary logs the zero count.
+    if summary not in _REPORTED_EDGE_TTS_ERRORS:
+        _REPORTED_EDGE_TTS_ERRORS.add(summary)
+        logger.error(
+            "edge-tts synthesis failed for voice %s text %.80r: %s",
+            voice,
+            text,
+            summary,
+        )
+    return False
+
+
 def _edge_tts_synthesize(text: str, voice: str, output_path: Path) -> bool:
     """Synthesize a single phrase with edge-tts and save as WAV at 16kHz.
 
@@ -268,11 +296,20 @@ def _edge_tts_synthesize(text: str, voice: str, output_path: Path) -> bool:
     import io
     import tempfile
 
+    global _LAST_EDGE_TTS_ERROR
+    _LAST_EDGE_TTS_ERROR = None
+
     try:
         import edge_tts
-    except ImportError:
-        print("WARNING: edge-tts not installed. pip install edge-tts", file=sys.stderr)
-        return False
+    except ImportError as exc:
+        message = "edge-tts is not installed. Install with: pip install edge-tts"
+        if exc:
+            message = f"{message} ({type(exc).__name__}: {exc})"
+        return _edge_tts_fail(
+            text,
+            voice,
+            message,
+        )
 
     async def _synth():
         communicate = edge_tts.Communicate(text, voice)
@@ -295,47 +332,82 @@ def _edge_tts_synthesize(text: str, voice: str, output_path: Path) -> bool:
                 mp3_data = loop.run_until_complete(_synth())
         except RuntimeError:
             mp3_data = asyncio.run(_synth())
+    except Exception as exc:
+        return _edge_tts_fail(text, voice, exc)
+
+    if not mp3_data or len(mp3_data) < 100:
+        return _edge_tts_fail(
+            text,
+            voice,
+            f"edge-tts returned too little audio data ({len(mp3_data) if mp3_data else 0} bytes)",
+        )
 
-        if not mp3_data or len(mp3_data) < 100:
-            return False
+    conversion_errors: list[str] = []
 
-        # Convert MP3 to WAV at 16kHz using pydub or ffmpeg
-        try:
-            from pydub import AudioSegment
-
-            seg = AudioSegment.from_mp3(io.BytesIO(mp3_data))
-            seg = seg.set_channels(1).set_frame_rate(16000).set_sample_width(2)
-            seg.export(str(output_path), format="wav")
-            return True
-        except ImportError:
-            pass
-
-        # Fallback: write MP3 to temp, load with torchaudio/scipy
-        tmp_fd, tmp_path = tempfile.mkstemp(suffix=".mp3", dir=_TMP_DIR)
-        try:
-            os.write(tmp_fd, mp3_data)
-        finally:
-            os.close(tmp_fd)
-        os.chmod(tmp_path, 0o600)
+    # First try libsndfile via soundfile. The backend image already gets this
+    # through the training stack, and it avoids a hard ffmpeg dependency.
+    try:
+        import numpy as np
+        import soundfile as sf
 
-        try:
-            import torchaudio
-
-            waveform, sr = torchaudio.load(tmp_path)
-            if waveform.shape[0] > 1:
-                waveform = waveform.mean(dim=0, keepdim=True)
-            if sr != 16000:
-                waveform = torchaudio.transforms.Resample(sr, 16000)(waveform)
-            torchaudio.save(str(output_path), waveform, 16000)
-            return True
-        except Exception:
-            pass
-        finally:
-            Path(tmp_path).unlink(missing_ok=True)
+        audio, sr = sf.read(io.BytesIO(mp3_data), dtype="float32")
+        audio = np.asarray(audio, dtype=np.float32)
+        if audio.size == 0:
+            raise RuntimeError("decoded MP3 contained no audio samples")
+        if audio.ndim > 1:
+            audio = audio.mean(axis=1)
+        if sr != 16000:
+            audio = _resample_audio(audio, sr, 16000)
+        _save_wav(audio, output_path, 16000)
+        return output_path.exists() and output_path.stat().st_size > 44
+    except ImportError as exc:
+        conversion_errors.append(f"soundfile unavailable: {type(exc).__name__}: {exc}")
+    except Exception as exc:
+        conversion_errors.append(f"soundfile decode failed: {type(exc).__name__}: {exc}")
+
+    # Fallback: pydub with ffmpeg/ffprobe when available.
+    try:
+        from pydub import AudioSegment
+
+        seg = AudioSegment.from_mp3(io.BytesIO(mp3_data))
+        seg = seg.set_channels(1).set_frame_rate(16000).set_sample_width(2)
+        seg.export(str(output_path), format="wav")
+        return output_path.exists() and output_path.stat().st_size > 44

diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c88348827c34555b836785f894660bba10d..6ca9a2b24e4b434dd890c3ac79f92b9b9f0cfe19
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data
diff --git a/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1483b9a90ebcd7189686dc752d589bbe022d08d
--- /dev/null
+++ b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
@@ -0,0 +1,186 @@
+# Training Pipeline Audit - 2026-05-07
+
+Scope: `src/violawake_sdk/` and `console/backend/`, focused on latent fresh-deploy failures in the synchronous training path and adjacent SDK evaluation/runtime paths.
+
+Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.
+
+One-line fixes applied:
+- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
+- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.
+
+## Findings
+
+### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
+
+File/lines: `console/backend/entrypoint.sh:8-15`
+
+Evidence:
+```sh
+for i in 1 2 3 4 5; do
+    if alembic upgrade head; then
+        break
+    fi
+    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
+    sleep 3
+done
+```
+
+Before `9cbb8e6`, the loop had no final failure exit, so the fifth failed try fell through and the backend could start against an unmigrated or unavailable database. Proposed fix: applied in `9cbb8e6`; keep this fail-fast behavior and consider logging a clearer final "migration failed" line. Estimated effort: done; optional polish 15 minutes.
+
+### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
+
+File/lines: `src/violawake_sdk/tools/train.py:1060-1063`, `src/violawake_sdk/tools/train.py:1159-1165`, `src/violawake_sdk/tools/train.py:1203-1208`, `console/backend/app/job_queue.py:834`
+
+Evidence:
+```py
+except ImportError as e:
+    print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
+    print("Install with: pip install 'violawake[training]'", file=sys.stderr)
+    sys.exit(1)
+
+if len(pos_embs) < 5:
+    ...
+    sys.exit(1)
+
+if len(all_neg_embs) < 5:
+    ...
+    sys.exit(1)
+```
+```py
+except Exception as exc:
+    ...
+```
+
+`SystemExit` inherits from `BaseException`, not `Exception`, so these library-path exits bypass the job failure handler. A fresh deploy with missing training deps, bad embeddings, or no usable negatives can leave the task exception unhandled instead of marking the job failed cleanly. Proposed fix: split CLI exits from library behavior. Make `_train_temporal_cnn` and other callable helpers raise `RuntimeError`/typed exceptions, and let `main()` convert them to exit codes. As a short-term guard, catch `SystemExit` at the backend boundary and mark the job failed without swallowing `asyncio.CancelledError`. Estimated effort: 0.5-1 day with tests.
+
+### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
+
+File/lines: `src/violawake_sdk/training/evaluate.py:264`, `src/violawake_sdk/training/evaluate.py:327`, `src/violawake_sdk/tools/contamination_check.py:114`
+
+Evidence:
+```py
+oww = OWWModel(inference_framework="onnx")
+```
+
+Pre-fix grep found default constructors at those sites. They are adjacent to training/evaluation workflows and would have failed the same way as `tools/train.py` when the container has a `tflite_runtime` that cannot read the current OpenWakeWord `.tflite` schema. Proposed fix: applied in `0c3923b`; keep `grep -R "OWWModel()" src/violawake_sdk console/backend` as a regression check. Estimated effort: done.
+
+### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
+
+File/lines: `console/backend/entrypoint.sh:22-23`
+
+Evidence:
+```sh
+echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
+python -c "from openwakeword.utils import download_models; download_models()"
+```
+
+Before `9cbb8e6`, this command was piped through `tail -3 || true`. In POSIX `sh`, pipeline status comes from `tail`, so a failing Python download could still return success and the app would become healthy until first training. Proposed fix: applied in `9cbb8e6`; if output trimming is needed later, use a shell that supports `pipefail` or capture logs without hiding exit status. Estimated effort: done.
+
+### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
+
+File/lines: `src/violawake_sdk/tools/train.py:278`, `src/violawake_sdk/tools/train.py:547`, `src/violawake_sdk/tools/train.py:587`, `src/violawake_sdk/tools/train.py:418-434`, `console/Dockerfile.backend:20`, `pyproject.toml:87-100`
+
+Evidence:
+```py
+communicate = edge_tts.Communicate(text, voice)
+ok = _edge_tts_synthesize(word, voice, out_path)
+ok = _edge_tts_synthesize(phrase, voice, out_path)
+```
+```dockerfile
+RUN pip install --no-cache-dir "/sdk[training]"
+```
+
+The production image installs `violawake[training]`, which includes `edge-tts` but not the `tts` extra (`kokoro-onnx`). `_generate_tts_positives()` has a Kokoro fallback probe, but `TTSEngine` only loads `kokoro-onnx` and its large model files on first synthesis. Confusable and speech negatives do not try Kokoro at all. On a customer machine without internet, behind a proxy, or during Edge rate limiting, quality drops and training can fail if the mounted corpus is absent or too small. Proposed fix: make offline training explicit. Either ship/pre-download an offline TTS backend and its model assets into the image/cache, or remove online TTS from the synchronous path and require a mounted negative/positive corpus with a startup readiness check. Estimated effort: 1-2 days.
+
+### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
+
+File/lines: `console/backend/app/database.py:17-23`, `console/backend/app/job_queue.py:151`, `docker-compose.production.yml:36`
+
+Evidence:
+```py
+DATABASE_URL = settings.db_url.strip() if settings.db_url and settings.db_url.strip() else ...
+self._db_path = db_path or (settings.data_dir / "job_queue.db")
+```
+
+The app data model can use `VIOLAWAKE_DB_URL=postgresql+asyncpg://...`, but training job state is always an `aiosqlite` file under `/app/data`. The production compose file has a volume, so single-node Docker is survivable. Fresh deploys without a persistent `/app/data` volume, horizontal replicas, or platform restarts can lose pending/running job state while the main database remains intact. Proposed fix: move the queue tables into Postgres or add an explicit deployment guard that refuses multi-replica/ephemeral-volume production mode. Estimated effort: 1-2 days.
+
+### 7. MEDIUM - Universal negative corpus is optional in the console path but quality-critical
+
+File/lines: `console/backend/app/services/training_service.py:230-270`, `src/violawake_sdk/tools/train.py:2472-2485`, `docker-compose.production.yml:25-32`
+
+Evidence:
+```py
+_CORPUS_SEARCH_PATHS = [
+    Path(__file__).resolve().parent.parent.parent.parent / "corpus",
+    Path.home() / ".violawake" / "corpus",
+    Path("corpus"),
+]
+if total_neg < 5:
+    raise RuntimeError("Only %s negative files generated. edge-tts may not be installed or network unavailable." % total_neg)
+```
+
+The CLI warns when no universal corpus exists, but the backend only hard-fails when total negatives fall below five. If Edge TTS produces enough synthetic negatives, training can succeed without LibriSpeech/MUSAN and create high false-positive models. Production compose mounts `./corpus:/app/corpus:ro`, but Docker will create an empty host directory if it is missing. Proposed fix: add a backend warning/progress event and a startup/health check for minimum corpus counts, or require `VIOLAWAKE_NEGATIVES_CORPUS_DIR`/`/app/corpus` in production. Estimated effort: 0.5 day.
+
+### 8. MEDIUM - Quota enforcement is clean at limit but not atomic with job submission
+
+File/lines: `console/backend/app/routes/billing.py:252-308`, `console/backend/app/routes/jobs.py:101-120`
+
+Evidence:
+```py
+if used >= limit:
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, ...)
+...
+job_id = await queue.submit_job(...)
+await record_usage(db, current_user.id, action="training_job")
+```
+
+If a user is already at quota, the route returns a clean 403, not a 500. The latent issue is concurrency and partial failure: two requests at `limit - 1` can both pass `check_training_quota()`, and a queue insert can succeed before `record_usage()` fails or increments. Proposed fix: reserve usage atomically before queue submission, or put queue submission and usage reservation behind one database transaction/outbox. Estimated effort: 0.5-1 day.
+
+### 9. MEDIUM - Health/readiness does not validate training prerequisites
+
+File/lines: `console/backend/app/health.py:122-155`, `console/backend/app/config.py:195-198`
+
+Evidence:
+```py
+upload_dir = _check_directory(settings.upload_dir)
+models_dir = _check_directory(settings.models_dir)
+component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
+```
+
+Runtime dirs are created at import, and health checks database, queue, uploads, and models. It does not check `tmp_dir` writability, OpenWakeWord resource presence, universal corpus counts, or importability of training-only dependencies. After the entrypoint fixes, OWW download is a startup prerequisite in Docker, but local/fresh SDK starts and non-Docker runs can still report healthy before first training fails. Proposed fix: add a training readiness component with cheap checks and a more expensive admin-only preflight. Estimated effort: 0.5 day.
+
+### 10. MEDIUM - SDK model cache defaults to `Path.home()` and is not bound to the backend data volume
+
+File/lines: `src/violawake_sdk/models.py:27`, `src/violawake_sdk/models.py:152-158`, `src/violawake_sdk/wake_detector.py:631-637`, `src/violawake_sdk/tts.py:128-129`
+
+Evidence:
+```py
+DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
+model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
+model_dir.mkdir(parents=True, exist_ok=True)
+```
+
+SDK inference and Kokoro TTS auto-download into the process home directory unless `VIOLAWAKE_MODEL_DIR` is set. The backend Dockerfile/compose do not set that env var to `/app/data/models`, so downloaded SDK assets can be ephemeral or unwritable depending on platform user/home behavior. Proposed fix: set `VIOLAWAKE_MODEL_DIR=/app/data/sdk-models` in backend production config and document the same for fresh SDK deploys that disable internet access. Estimated effort: 15-30 minutes plus deploy validation.
+
+## Lazy Import Inventory
+
+`src/violawake_sdk/tools/train.py`: lazy imports at lines 68, 227, 267-272, 290, 304, 322, 343-344, 363-366, 400-418, 472-473, 522, 604-606, 634-637, 686-689, 721-724, 820, 865-871, 950, 1056-1071, 1087-1088, 1396-1397, 1447, 1538-1543, 1728-1750, 2066-2068, 2451, 2586. Risky runtime-only deps here: `openwakeword`, `edge_tts`, `pydub`, `torchaudio`, `audiomentations`, `torch`, `onnx`, `onnxruntime`, `scipy`, `violawake_sdk.tts`/Kokoro.
+
+`src/violawake_sdk/oww_backbone.py`: lazy imports at lines 126 (`openwakeword.utils.download_models`) and 187 (`MODEL_REGISTRY`). The resource download is now enforced in Docker entrypoint, but SDK/local usage still auto-downloads on first missing-resource access.
+
+Backend services/platform lazy imports: `console/backend/app/services/training_service.py:100` imports SDK training helpers only when a job runs; `app/storage.py:151-152` and `198` import `boto3`/`botocore` only when R2 is configured or queried; `app/routes/billing.py:69`, `273`, `277` import Stripe/email/asyncio lazily; `app/routes/recordings.py:128-156` imports `io`, `numpy`, and `scipy` during upload validation/resampling; `app/middleware.py:111-112` imports Sentry only when configured; `app/main.py:42`, `76`, `170` imports retention/email services at startup/admin execution; `app/job_queue.py:117`, `791`, `1170`, `1195` imports subscription/email/retention helpers at priority, completion, and cleanup time.
+
+## External Network Dependencies
+
+Synchronous training path: Edge TTS at `src/violawake_sdk/tools/train.py:278`, OpenWakeWord `download_models()` at `src/violawake_sdk/oww_backbone.py:126-132` and `console/backend/entrypoint.sh:23`, optional Kokoro model downloads via `get_model_path()` in `src/violawake_sdk/tts.py:128-129`, and optional object storage calls through R2 in `console/backend/app/storage.py:151-217`.
+
+SDK/download path: model auto-download via `urllib.request.urlopen()` in `src/violawake_sdk/models.py:197-267`, explicit downloads via `requests.get()` in `src/violawake_sdk/models.py:483`, corpus expansion via `requests.get()` in `src/violawake_sdk/tools/expand_corpus.py:89`, and certificate-pinned downloads in `src/violawake_sdk/security/cert_pinning.py:664-706`.
+
+## Filesystem Expectations
+
+Required backend paths: `console/backend/app/config.py:30-35` defines `data_dir`, `db_path`, `upload_dir`, `models_dir`, and `tmp_dir`; `config.py:195-198` creates them at import. `job_queue.py:151` requires a writable local `data/job_queue.db`; `training_service.py:72`, `107`, and `job_queue.py:707` require `settings.tmp_dir` for temporary training work.
+
+Optional but quality-critical corpus paths: `training_service.py:231-234`, `tools/train.py:2424-2428`, and `console/backend/scripts/train_full_pipeline.py:198-202` search repo `corpus`, `~/.violawake/corpus`, and CWD `corpus`. Absence is not fatal unless too few negatives remain.
+
+SDK model cache path: `src/violawake_sdk/models.py:27` and `152-158` use `~/.violawake/models` unless `VIOLAWAKE_MODEL_DIR` is set.
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849b57da804d87d79e1e8a6e42b3d3b6d3a2..bc134136aaa00bd5e32a5e3ff573d854b0b6f2ae
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577f5e01cb4c3dceffcb9bca8aaf607ca497..508ecdcb73617a30016d4efb4e3e5a1916281b1b
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"

codex
The requested test will run against the current working tree, which includes those unrelated training changes. If it fails, I’ll determine whether the failure comes from my two committed startup/OWW fixes, the audit doc, or the pre-existing, unrelated diffs.
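
For reference, the mechanism behind finding 2 in the audit doc above (`sys.exit(1)` escaping an `except Exception` handler) can be reproduced in isolation. The following is a minimal sketch, not the project's code: `train_helper`, `run_job_unguarded`, and `run_job_guarded` are hypothetical stand-ins for the SDK helper and the backend worker boundary, and the threshold of 5 only mirrors the checks quoted in the audit.

```python
import sys


def train_helper(n_embeddings: int) -> str:
    """Hypothetical stand-in for a library helper that exits like a CLI."""
    if n_embeddings < 5:
        # sys.exit raises SystemExit, which derives from BaseException.
        sys.exit(1)
    return "trained"


def run_job_unguarded(n_embeddings: int) -> str:
    """Mirrors a worker that only handles Exception: SystemExit escapes."""
    try:
        return train_helper(n_embeddings)
    except Exception as exc:
        return f"failed: {exc}"


def run_job_guarded(n_embeddings: int) -> str:
    """Short-term guard: turn SystemExit into an ordinary job failure."""
    try:
        return train_helper(n_embeddings)
    except SystemExit as exc:
        # Only SystemExit is caught here, so cancellation and
        # KeyboardInterrupt still propagate to the caller.
        return f"failed: training helper exited with code {exc.code}"
    except Exception as exc:
        return f"failed: {exc}"
```

With these definitions, `run_job_guarded(1)` returns a failure string, while `run_job_unguarded(1)` lets `SystemExit` propagate past its handler. The guard is only the stopgap described in the audit; the longer-term fix there is to have library helpers raise typed exceptions and leave exit codes to `main()`.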
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "cd console && python -m pytest tests/ --no-cov --timeout=30 --ignore=tests/e2e -q 2>&1 | tail -3"' in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 45518ms:
tests\test_teams.py .........................                            [100%]

======================= 147 passed, 2 skipped in 41.03s =======================

diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c88348827c34555b836785f894660bba10d..6ca9a2b24e4b434dd890c3ac79f92b9b9f0cfe19
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data
diff --git a/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1483b9a90ebcd7189686dc752d589bbe022d08d
--- /dev/null
+++ b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
@@ -0,0 +1,186 @@
+# Training Pipeline Audit - 2026-05-07
+
+Scope: `src/violawake_sdk/` and `console/backend/`, focused on latent fresh-deploy failures in the synchronous training path and adjacent SDK evaluation/runtime paths.
+
+Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.
+
+One-line fixes applied:
+- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
+- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.
+
+## Findings
+
+### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
+
+File/lines: `console/backend/entrypoint.sh:8-15`
+
+Evidence:
+```sh
+for i in 1 2 3 4 5; do
+    if alembic upgrade head; then
+        break
+    fi
+    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
+    sleep 3
+done
+```
+
+Before `9cbb8e6`, the loop had no final failure exit, so the fifth failed try fell through and the backend could start against an unmigrated or unavailable database. Proposed fix: applied in `9cbb8e6`; keep this fail-fast behavior and consider logging a clearer final "migration failed" line. Estimated effort: done; optional polish 15 minutes.
+
+### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
+
+File/lines: `src/violawake_sdk/tools/train.py:1060-1063`, `src/violawake_sdk/tools/train.py:1159-1165`, `src/violawake_sdk/tools/train.py:1203-1208`, `console/backend/app/job_queue.py:834`
+
+Evidence:
+```py
+except ImportError as e:
+    print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
+    print("Install with: pip install 'violawake[training]'", file=sys.stderr)
+    sys.exit(1)
+
+if len(pos_embs) < 5:
+    ...
+    sys.exit(1)
+
+if len(all_neg_embs) < 5:
+    ...
+    sys.exit(1)
+```
+```py
+except Exception as exc:
+    ...
+```
+
+`SystemExit` inherits from `BaseException`, not `Exception`, so these library-path exits bypass the job failure handler. A fresh deploy with missing training deps, bad embeddings, or no usable negatives can leave the task exception unhandled instead of marking the job failed cleanly. Proposed fix: split CLI exits from library behavior. Make `_train_temporal_cnn` and other callable helpers raise `RuntimeError`/typed exceptions, and let `main()` convert them to exit codes. As a short-term guard, catch `SystemExit` at the backend boundary and mark the job failed without swallowing `asyncio.CancelledError`. Estimated effort: 0.5-1 day with tests.
+
+### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
+
+File/lines: `src/violawake_sdk/training/evaluate.py:264`, `src/violawake_sdk/training/evaluate.py:327`, `src/violawake_sdk/tools/contamination_check.py:114`
+
+Evidence:
+```py
+oww = OWWModel(inference_framework="onnx")
+```
+
+Pre-fix grep found default constructors at those sites. They are adjacent to training/evaluation workflows and would have failed the same way as `tools/train.py` when the container has a `tflite_runtime` that cannot read the current OpenWakeWord `.tflite` schema. Proposed fix: applied in `0c3923b`; keep `grep -R "OWWModel()" src/violawake_sdk console/backend` as a regression check. Estimated effort: done.
+
+### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
+
+File/lines: `console/backend/entrypoint.sh:22-23`
+
+Evidence:
+```sh
+echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
+python -c "from openwakeword.utils import download_models; download_models()"
+```
+
+Before `9cbb8e6`, this command was piped through `tail -3 || true`. In POSIX `sh`, pipeline status comes from `tail`, so a failing Python download could still return success and the app would become healthy until first training. Proposed fix: applied in `9cbb8e6`; if output trimming is needed later, use a shell that supports `pipefail` or capture logs without hiding exit status. Estimated effort: done.
+
+### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
+
+File/lines: `src/violawake_sdk/tools/train.py:278`, `src/violawake_sdk/tools/train.py:547`, `src/violawake_sdk/tools/train.py:587`, `src/violawake_sdk/tools/train.py:418-434`, `console/Dockerfile.backend:20`, `pyproject.toml:87-100`
+
+Evidence:
+```py
+communicate = edge_tts.Communicate(text, voice)
+ok = _edge_tts_synthesize(word, voice, out_path)
+ok = _edge_tts_synthesize(phrase, voice, out_path)
+```
+```dockerfile
+RUN pip install --no-cache-dir "/sdk[training]"
+```
+
+The production image installs `violawake[training]`, which includes `edge-tts` but not the `tts` extra (`kokoro-onnx`). `_generate_tts_positives()` has a Kokoro fallback probe, but `TTSEngine` only loads `kokoro-onnx` and its large model files on first synthesis. Confusable and speech negatives do not try Kokoro at all. On a customer machine without internet, behind a proxy, or during Edge rate limiting, quality drops and training can fail if the mounted corpus is absent or too small. Proposed fix: make offline training explicit. Either ship/pre-download an offline TTS backend and its model assets into the image/cache, or remove online TTS from the synchronous path and require a mounted negative/positive corpus with a startup readiness check. Estimated effort: 1-2 days.
+
+### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
+
+File/lines: `console/backend/app/database.py:17-23`, `console/backend/app/job_queue.py:151`, `docker-compose.production.yml:36`
+
+Evidence:
+```py
+DATABASE_URL = settings.db_url.strip() if settings.db_url and settings.db_url.strip() else ...
+self._db_path = db_path or (settings.data_dir / "job_queue.db")
+```
+
+The app data model can use `VIOLAWAKE_DB_URL=postgresql+asyncpg://...`, but training job state is always an `aiosqlite` file under `/app/data`. The production compose file has a volume, so single-node Docker is survivable. Fresh deploys without a persistent `/app/data` volume, horizontal replicas, or platform restarts can lose pending/running job state while the main database remains intact. Proposed fix: move the queue tables into Postgres or add an explicit deployment guard that refuses multi-replica/ephemeral-volume production mode. Estimated effort: 1-2 days.
+
+### 7. MEDIUM - Universal negative corpus is optional in the console path but quality-critical
+
+File/lines: `console/backend/app/services/training_service.py:230-270`, `src/violawake_sdk/tools/train.py:2472-2485`, `docker-compose.production.yml:25-32`
+
+Evidence:
+```py
+_CORPUS_SEARCH_PATHS = [
+    Path(__file__).resolve().parent.parent.parent.parent / "corpus",
+    Path.home() / ".violawake" / "corpus",
+    Path("corpus"),
+]
+if total_neg < 5:
+    raise RuntimeError("Only %s negative files generated. edge-tts may not be installed or network unavailable." % total_neg)
+```
+
+The CLI warns when no universal corpus exists, but the backend only hard-fails when total negatives fall below five. If Edge TTS produces enough synthetic negatives, training can succeed without LibriSpeech/MUSAN and create high false-positive models. Production compose mounts `./corpus:/app/corpus:ro`, but Docker will create an empty host directory if it is missing. Proposed fix: add a backend warning/progress event and a startup/health check for minimum corpus counts, or require `VIOLAWAKE_NEGATIVES_CORPUS_DIR`/`/app/corpus` in production. Estimated effort: 0.5 day.
+
+### 8. MEDIUM - Quota enforcement is clean at limit but not atomic with job submission
+
+File/lines: `console/backend/app/routes/billing.py:252-308`, `console/backend/app/routes/jobs.py:101-120`
+
+Evidence:
+```py
+if used >= limit:
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, ...)
+...
+job_id = await queue.submit_job(...)
+await record_usage(db, current_user.id, action="training_job")
+```
+
+If a user is already at quota, the route returns a clean 403, not a 500. The latent issue is concurrency and partial failure: two requests at `limit - 1` can both pass `check_training_quota()`, and a queue insert can succeed before `record_usage()` fails or increments. Proposed fix: reserve usage atomically before queue submission, or put queue submission and usage reservation behind one database transaction/outbox. Estimated effort: 0.5-1 day.
+
+### 9. MEDIUM - Health/readiness does not validate training prerequisites
+
+File/lines: `console/backend/app/health.py:122-155`, `console/backend/app/config.py:195-198`
+
+Evidence:
+```py
+upload_dir = _check_directory(settings.upload_dir)
+models_dir = _check_directory(settings.models_dir)
+component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
+```
+
+Runtime dirs are created at import, and health checks database, queue, uploads, and models. It does not check `tmp_dir` writability, OpenWakeWord resource presence, universal corpus counts, or importability of training-only dependencies. After the entrypoint fixes, OWW download is a startup prerequisite in Docker, but local/fresh SDK starts and non-Docker runs can still report healthy before first training fails. Proposed fix: add a training readiness component with cheap checks and a more expensive admin-only preflight. Estimated effort: 0.5 day.
+
+### 10. MEDIUM - SDK model cache defaults to `Path.home()` and is not bound to the backend data volume
+
+File/lines: `src/violawake_sdk/models.py:27`, `src/violawake_sdk/models.py:152-158`, `src/violawake_sdk/wake_detector.py:631-637`, `src/violawake_sdk/tts.py:128-129`
+
+Evidence:
+```py
+DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
+model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
+model_dir.mkdir(parents=True, exist_ok=True)
+```
+
+SDK inference and Kokoro TTS auto-download into the process home directory unless `VIOLAWAKE_MODEL_DIR` is set. The backend Dockerfile/compose do not point that env var at the persistent `/app/data` volume, so downloaded SDK assets can be ephemeral or unwritable depending on platform user/home behavior. Proposed fix: set `VIOLAWAKE_MODEL_DIR=/app/data/sdk-models` in backend production config and document the same for fresh SDK deploys that disable internet access. Estimated effort: 15-30 minutes plus deploy validation.
+
+## Lazy Import Inventory
+
+`src/violawake_sdk/tools/train.py`: lazy imports at lines 68, 227, 267-272, 290, 304, 322, 343-344, 363-366, 400-418, 472-473, 522, 604-606, 634-637, 686-689, 721-724, 820, 865-871, 950, 1056-1071, 1087-1088, 1396-1397, 1447, 1538-1543, 1728-1750, 2066-2068, 2451, 2586. Risky runtime-only deps here: `openwakeword`, `edge_tts`, `pydub`, `torchaudio`, `audiomentations`, `torch`, `onnx`, `onnxruntime`, `scipy`, `violawake_sdk.tts`/Kokoro.
+
+`src/violawake_sdk/oww_backbone.py`: lazy imports at lines 126 (`openwakeword.utils.download_models`) and 187 (`MODEL_REGISTRY`). The resource download is now enforced in Docker entrypoint, but SDK/local usage still auto-downloads on first missing-resource access.
+
+Backend services/platform lazy imports: `console/backend/app/services/training_service.py:100` imports SDK training helpers only when a job runs; `app/storage.py:151-152` and `198` import `boto3`/`botocore` only when R2 is configured or queried; `app/routes/billing.py:69`, `273`, `277` import Stripe/email/asyncio lazily; `app/routes/recordings.py:128-156` imports `io`, `numpy`, and `scipy` during upload validation/resampling; `app/middleware.py:111-112` imports Sentry only when configured; `app/main.py:42`, `76`, `170` imports retention/email services at startup/admin execution; `app/job_queue.py:117`, `791`, `1170`, `1195` imports subscription/email/retention helpers at priority, completion, and cleanup time.
+
+## External Network Dependencies
+
+Synchronous training path: Edge TTS at `src/violawake_sdk/tools/train.py:278`, OpenWakeWord `download_models()` at `src/violawake_sdk/oww_backbone.py:126-132` and `console/backend/entrypoint.sh:23`, optional Kokoro model downloads via `get_model_path()` in `src/violawake_sdk/tts.py:128-129`, and optional object storage calls through R2 in `console/backend/app/storage.py:151-217`.
+
+SDK/download path: model auto-download via `urllib.request.urlopen()` in `src/violawake_sdk/models.py:197-267`, explicit downloads via `requests.get()` in `src/violawake_sdk/models.py:483`, corpus expansion via `requests.get()` in `src/violawake_sdk/tools/expand_corpus.py:89`, and certificate-pinned downloads in `src/violawake_sdk/security/cert_pinning.py:664-706`.
+
+## Filesystem Expectations
+
+Required backend paths: `console/backend/app/config.py:30-35` defines `data_dir`, `db_path`, `upload_dir`, `models_dir`, and `tmp_dir`; `config.py:195-198` creates them at import. `job_queue.py:151` requires a writable local `data/job_queue.db`; `training_service.py:72`, `107`, and `job_queue.py:707` require `settings.tmp_dir` for temporary training work.
+
+Optional but quality-critical corpus paths: `training_service.py:231-234`, `tools/train.py:2424-2428`, and `console/backend/scripts/train_full_pipeline.py:198-202` search repo `corpus`, `~/.violawake/corpus`, and CWD `corpus`. Absence is not fatal unless too few negatives remain.
+
+SDK model cache path: `src/violawake_sdk/models.py:27` and `152-158` use `~/.violawake/models` unless `VIOLAWAKE_MODEL_DIR` is set.
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849b57da804d87d79e1e8a6e42b3d3b6d3a2..bc134136aaa00bd5e32a5e3ff573d854b0b6f2ae
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577f5e01cb4c3dceffcb9bca8aaf607ca497..508ecdcb73617a30016d4efb4e3e5a1916281b1b
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"

codex
The backend test suite passed with the requested command. I’m going to commit the audit document as its own non-fix commit, leaving the unrelated modified files untouched.
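For reference, the short-term guard proposed in finding 2 could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the code in `job_queue.py`: `train_helper_that_exits`, `run_job_narrow`, `run_job_guarded`, and the `job` dict are hypothetical stand-ins for the SDK helper and the worker boundary.

```python
import sys


def train_helper_that_exits() -> None:
    """Hypothetical stand-in for an SDK helper that calls sys.exit(1)."""
    sys.exit(1)


def run_job_narrow(job: dict) -> dict:
    """Mirrors the current handler shape: `except Exception` only."""
    try:
        train_helper_that_exits()
        job["status"] = "completed"
    except Exception as exc:  # SystemExit is a BaseException, so it is not caught here
        job["status"] = "failed"
        job["error"] = str(exc)
    return job


def run_job_guarded(job: dict) -> dict:
    """Hypothetical boundary guard: convert SystemExit into a recorded job failure."""
    try:
        train_helper_that_exits()
        job["status"] = "completed"
    except SystemExit as exc:
        # Catch SystemExit by name instead of BaseException so that
        # asyncio.CancelledError and KeyboardInterrupt still propagate.
        job["status"] = "failed"
        job["error"] = f"training helper exited with code {exc.code}"
    except Exception as exc:
        job["status"] = "failed"
        job["error"] = str(exc)
    return job
```

Calling `run_job_narrow({"status": "running"})` lets `SystemExit` escape the handler, while `run_job_guarded` returns the job marked as failed. The longer-term fix in the audit (raise typed exceptions from library helpers and let `main()` map them to exit codes) would make the guard unnecessary.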
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"git add docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md && git commit -m 'Add training pipeline latent bug audit'\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 4045ms:
[master 12e9a50] Add training pipeline latent bug audit
 1 file changed, 186 insertions(+)
 create mode 100644 docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
warning: in the working copy of 'docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md', LF will be replaced by CRLF the next time Git touches it

diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c88348827c34555b836785f894660bba10d..6ca9a2b24e4b434dd890c3ac79f92b9b9f0cfe19
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data
diff --git a/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1483b9a90ebcd7189686dc752d589bbe022d08d
--- /dev/null
+++ b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
@@ -0,0 +1,186 @@
+# Training Pipeline Audit - 2026-05-07
+
+Scope: `src/violawake_sdk/` and `console/backend/`, focused on latent fresh-deploy failures in the synchronous training path and adjacent SDK evaluation/runtime paths.
+
+Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.
+
+One-line fixes applied:
+- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
+- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.
+
+## Findings
+
+### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
+
+File/lines: `console/backend/entrypoint.sh:8-15`
+
+Evidence:
+```sh
+for i in 1 2 3 4 5; do
+    if alembic upgrade head; then
+        break
+    fi
+    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
+    sleep 3
+done
+```
+
+Before `9cbb8e6`, the loop had no final failure exit, so the fifth failed try fell through and the backend could start against an unmigrated or unavailable database. Proposed fix: applied in `9cbb8e6`; keep this fail-fast behavior and consider logging a clearer final "migration failed" line. Estimated effort: done; optional polish 15 minutes.
+
+### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
+
+File/lines: `src/violawake_sdk/tools/train.py:1060-1063`, `src/violawake_sdk/tools/train.py:1159-1165`, `src/violawake_sdk/tools/train.py:1203-1208`, `console/backend/app/job_queue.py:834`
+
+Evidence:
+```py
+except ImportError as e:
+    print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
+    print("Install with: pip install 'violawake[training]'", file=sys.stderr)
+    sys.exit(1)
+
+if len(pos_embs) < 5:
+    ...
+    sys.exit(1)
+
+if len(all_neg_embs) < 5:
+    ...
+    sys.exit(1)
+```
+```py
+except Exception as exc:
+    ...
+```
+
+`SystemExit` inherits from `BaseException`, not `Exception`, so these library-path exits bypass the job failure handler. A fresh deploy with missing training deps, bad embeddings, or no usable negatives can leave the task exception unhandled instead of marking the job failed cleanly. Proposed fix: split CLI exits from library behavior. Make `_train_temporal_cnn` and other callable helpers raise `RuntimeError`/typed exceptions, and let `main()` convert them to exit codes. As a short-term guard, catch `SystemExit` at the backend boundary and mark the job failed without swallowing `asyncio.CancelledError`. Estimated effort: 0.5-1 day with tests.
+
+### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
+
+File/lines: `src/violawake_sdk/training/evaluate.py:264`, `src/violawake_sdk/training/evaluate.py:327`, `src/violawake_sdk/tools/contamination_check.py:114`
+
+Evidence:
+```py
+oww = OWWModel(inference_framework="onnx")
+```
+
+Pre-fix grep found default constructors at those sites. They are adjacent to training/evaluation workflows and would have failed the same way as `tools/train.py` when the container has a `tflite_runtime` that cannot read the current OpenWakeWord `.tflite` schema. Proposed fix: applied in `0c3923b`; keep `grep -R "OWWModel()" src/violawake_sdk console/backend` as a regression check. Estimated effort: done.
+
+### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
+
+File/lines: `console/backend/entrypoint.sh:22-23`
+
+Evidence:
+```sh
+echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
+python -c "from openwakeword.utils import download_models; download_models()"
+```
+
+Before `9cbb8e6`, this command was piped through `tail -3 || true`. In POSIX `sh`, a pipeline's exit status is that of its last command (`tail`), and `|| true` discarded whatever failure remained, so a failing Python download still returned success and the app would report healthy until the first training job failed. Proposed fix: applied in `9cbb8e6`; if output trimming is needed later, use a shell that supports `pipefail` or capture logs without hiding exit status. Estimated effort: done.
+
+### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
+
+File/lines: `src/violawake_sdk/tools/train.py:278`, `src/violawake_sdk/tools/train.py:547`, `src/violawake_sdk/tools/train.py:587`, `src/violawake_sdk/tools/train.py:418-434`, `console/Dockerfile.backend:20`, `pyproject.toml:87-100`
+
+Evidence:
+```py
+communicate = edge_tts.Communicate(text, voice)
+ok = _edge_tts_synthesize(word, voice, out_path)
+ok = _edge_tts_synthesize(phrase, voice, out_path)
+```
+```dockerfile
+RUN pip install --no-cache-dir "/sdk[training]"
+```
+
+The production image installs `violawake[training]`, which includes `edge-tts` but not the `tts` extra (`kokoro-onnx`). `_generate_tts_positives()` has a Kokoro fallback probe, but `TTSEngine` only loads `kokoro-onnx` and its large model files on first synthesis. Confusable and speech negatives do not try Kokoro at all. On a customer machine without internet, behind a proxy, or during Edge rate limiting, quality drops and training can fail if the mounted corpus is absent or too small. Proposed fix: make offline training explicit. Either ship/pre-download an offline TTS backend and its model assets into the image/cache, or remove online TTS from the synchronous path and require a mounted negative/positive corpus with a startup readiness check. Estimated effort: 1-2 days.
+
+### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
+
+File/lines: `console/backend/app/database.py:17-23`, `console/backend/app/job_queue.py:151`, `docker-compose.production.yml:36`
+
+Evidence:
+```py
+DATABASE_URL = settings.db_url.strip() if settings.db_url and settings.db_url.strip() else ...
+self._db_path = db_path or (settings.data_dir / "job_queue.db")
+```
+
+The app data model can use `VIOLAWAKE_DB_URL=postgresql+asyncpg://...`, but training job state is always an `aiosqlite` file under `/app/data`. The production compose file has a volume, so single-node Docker is survivable. Fresh deploys without a persistent `/app/data` volume, horizontal replicas, or platform restarts can lose pending/running job state while the main database remains intact. Proposed fix: move the queue tables into Postgres or add an explicit deployment guard that refuses multi-replica/ephemeral-volume production mode. Estimated effort: 1-2 days.
+
+### 7. MEDIUM - Universal negative corpus is optional in the console path but quality-critical
+
+File/lines: `console/backend/app/services/training_service.py:230-270`, `src/violawake_sdk/tools/train.py:2472-2485`, `docker-compose.production.yml:25-32`
+
+Evidence:
+```py
+_CORPUS_SEARCH_PATHS = [
+    Path(__file__).resolve().parent.parent.parent.parent / "corpus",
+    Path.home() / ".violawake" / "corpus",
+    Path("corpus"),
+]
+if total_neg < 5:
+    raise RuntimeError("Only %s negative files generated. edge-tts may not be installed or network unavailable." % total_neg)
+```
+
+The CLI warns when no universal corpus exists, but the backend only hard-fails when total negatives fall below five. If Edge TTS produces enough synthetic negatives, training can succeed without LibriSpeech/MUSAN and create high false-positive models. Production compose mounts `./corpus:/app/corpus:ro`, but Docker will create an empty host directory if it is missing. Proposed fix: add a backend warning/progress event and a startup/health check for minimum corpus counts, or require `VIOLAWAKE_NEGATIVES_CORPUS_DIR`/`/app/corpus` in production. Estimated effort: 0.5 day.
+
+### 8. MEDIUM - Quota enforcement is clean at limit but not atomic with job submission
+
+File/lines: `console/backend/app/routes/billing.py:252-308`, `console/backend/app/routes/jobs.py:101-120`
+
+Evidence:
+```py
+if used >= limit:
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, ...)
+...
+job_id = await queue.submit_job(...)
+await record_usage(db, current_user.id, action="training_job")
+```
+
+If a user is already at quota, the route returns a clean 403, not a 500. The latent issue is concurrency and partial failure: two requests at `limit - 1` can both pass `check_training_quota()`, and a queue insert can succeed before `record_usage()` fails or increments. Proposed fix: reserve usage atomically before queue submission, or put queue submission and usage reservation behind one database transaction/outbox. Estimated effort: 0.5-1 day.
+
+### 9. MEDIUM - Health/readiness does not validate training prerequisites
+
+File/lines: `console/backend/app/health.py:122-155`, `console/backend/app/config.py:195-198`
+
+Evidence:
+```py
+upload_dir = _check_directory(settings.upload_dir)
+models_dir = _check_directory(settings.models_dir)
+component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
+```
+
+Runtime dirs are created at import, and health checks database, queue, uploads, and models. It does not check `tmp_dir` writability, OpenWakeWord resource presence, universal corpus counts, or importability of training-only dependencies. After the entrypoint fixes, OWW download is a startup prerequisite in Docker, but local/fresh SDK starts and non-Docker runs can still report healthy before first training fails. Proposed fix: add a training readiness component with cheap checks and a more expensive admin-only preflight. Estimated effort: 0.5 day.
+
+### 10. MEDIUM - SDK model cache defaults to `Path.home()` and is not bound to the backend data volume
+
+File/lines: `src/violawake_sdk/models.py:27`, `src/violawake_sdk/models.py:152-158`, `src/violawake_sdk/wake_detector.py:631-637`, `src/violawake_sdk/tts.py:128-129`
+
+Evidence:
+```py
+DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
+model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
+model_dir.mkdir(parents=True, exist_ok=True)
+```
+
+SDK inference and Kokoro TTS auto-download into the process home directory unless `VIOLAWAKE_MODEL_DIR` is set. The backend Dockerfile/compose do not point that env var at the persistent `/app/data` volume, so downloaded SDK assets can be ephemeral or unwritable depending on platform user/home behavior. Proposed fix: set `VIOLAWAKE_MODEL_DIR=/app/data/sdk-models` in backend production config and document the same for fresh SDK deploys that disable internet access. Estimated effort: 15-30 minutes plus deploy validation.
+
+## Lazy Import Inventory
+
+`src/violawake_sdk/tools/train.py`: lazy imports at lines 68, 227, 267-272, 290, 304, 322, 343-344, 363-366, 400-418, 472-473, 522, 604-606, 634-637, 686-689, 721-724, 820, 865-871, 950, 1056-1071, 1087-1088, 1396-1397, 1447, 1538-1543, 1728-1750, 2066-2068, 2451, 2586. Risky runtime-only deps here: `openwakeword`, `edge_tts`, `pydub`, `torchaudio`, `audiomentations`, `torch`, `onnx`, `onnxruntime`, `scipy`, `violawake_sdk.tts`/Kokoro.
+
+`src/violawake_sdk/oww_backbone.py`: lazy imports at lines 126 (`openwakeword.utils.download_models`) and 187 (`MODEL_REGISTRY`). The resource download is now enforced in Docker entrypoint, but SDK/local usage still auto-downloads on first missing-resource access.
+
+Backend services/platform lazy imports: `console/backend/app/services/training_service.py:100` imports SDK training helpers only when a job runs; `app/storage.py:151-152` and `198` import `boto3`/`botocore` only when R2 is configured or queried; `app/routes/billing.py:69`, `273`, `277` import Stripe/email/asyncio lazily; `app/routes/recordings.py:128-156` imports `io`, `numpy`, and `scipy` during upload validation/resampling; `app/middleware.py:111-112` imports Sentry only when configured; `app/main.py:42`, `76`, `170` imports retention/email services at startup/admin execution; `app/job_queue.py:117`, `791`, `1170`, `1195` imports subscription/email/retention helpers at priority, completion, and cleanup time.
+
+## External Network Dependencies
+
+Synchronous training path: Edge TTS at `src/violawake_sdk/tools/train.py:278`, OpenWakeWord `download_models()` at `src/violawake_sdk/oww_backbone.py:126-132` and `console/backend/entrypoint.sh:23`, optional Kokoro model downloads via `get_model_path()` in `src/violawake_sdk/tts.py:128-129`, and optional object storage calls through R2 in `console/backend/app/storage.py:151-217`.
+
+SDK/download path: model auto-download via `urllib.request.urlopen()` in `src/violawake_sdk/models.py:197-267`, explicit downloads via `requests.get()` in `src/violawake_sdk/models.py:483`, corpus expansion via `requests.get()` in `src/violawake_sdk/tools/expand_corpus.py:89`, and certificate-pinned downloads in `src/violawake_sdk/security/cert_pinning.py:664-706`.
+
+## Filesystem Expectations
+
+Required backend paths: `console/backend/app/config.py:30-35` defines `data_dir`, `db_path`, `upload_dir`, `models_dir`, and `tmp_dir`; `config.py:195-198` creates them at import. `job_queue.py:151` requires a writable local `data/job_queue.db`; `training_service.py:72`, `107`, and `job_queue.py:707` require `settings.tmp_dir` for temporary training work.
+
+Optional but quality-critical corpus paths: `training_service.py:231-234`, `tools/train.py:2424-2428`, and `console/backend/scripts/train_full_pipeline.py:198-202` search repo `corpus`, `~/.violawake/corpus`, and CWD `corpus`. Absence is not fatal unless too few negatives remain.
+
+SDK model cache path: `src/violawake_sdk/models.py:27` and `152-158` use `~/.violawake/models` unless `VIOLAWAKE_MODEL_DIR` is set.
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849b57da804d87d79e1e8a6e42b3d3b6d3a2..bc134136aaa00bd5e32a5e3ff573d854b0b6f2ae
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577f5e01cb4c3dceffcb9bca8aaf607ca497..508ecdcb73617a30016d4efb4e3e5a1916281b1b
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"

exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git status --short --branch"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command 'bash -lc "git log --oneline -5"' in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -n '"'^### '"' docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md\"" in J:\CLAUDE\PROJECTS\Wakeword
exec
"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" -Command "bash -lc \"grep -n 'Summary counts' docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md\"" in J:\CLAUDE\PROJECTS\Wakeword
 succeeded in 1455ms:
13:### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
31:### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
57:### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
68:### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
80:### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
96:### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
108:### 7. MEDIUM - Universal negative corpus is optional in the console path but quality-critical
125:### 8. MEDIUM - Quota enforcement is clean at limit but not atomic with job submission
140:### 9. MEDIUM - Health/readiness does not validate training prerequisites
153:### 10. MEDIUM - SDK model cache defaults to `Path.home()` and is not bound to the backend data volume

 succeeded in 1466ms:
5:Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.

 succeeded in 1572ms:
12e9a50 Add training pipeline latent bug audit
9cbb8e6 Fail fast on backend startup prerequisites
0c3923b Pin OpenWakeWord evaluation helpers to ONNX
196bfdd fix(training): four runtime bugs blocking the actual product
2320f2d ci: gate deploy-pages workflow behind manual dispatch

 succeeded in 1622ms:
## master...origin/master [ahead 3]
 M pyproject.toml
 M src/violawake_sdk/tools/train.py
 M tests/unit/test_train.py
?? .codex_log_a.txt
?? .codex_log_alpha.txt
?? .codex_log_b.txt
?? .codex_log_c.txt
?? .codex_log_delta.txt
?? .codex_prompt_a.txt
?? .codex_prompt_alpha.txt
?? .codex_prompt_b.txt
?? .codex_prompt_c.txt
?? .codex_prompt_delta.txt

diff --git a/console/backend/entrypoint.sh b/console/backend/entrypoint.sh
index 92893c88348827c34555b836785f894660bba10d..6ca9a2b24e4b434dd890c3ac79f92b9b9f0cfe19
--- a/console/backend/entrypoint.sh
+++ b/console/backend/entrypoint.sh
@@ -10,6 +10,7 @@
         break
     fi
     echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
     sleep 3
 done
 
@@ -19,8 +20,7 @@
 # Required for the training pipeline (OWWModel construction in
 # violawake_sdk.tools.train).
 echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
-python -c "from openwakeword.utils import download_models; download_models()" 2>&1 \
-    | tail -3 || true
+python -c "from openwakeword.utils import download_models; download_models()"
 
 if [ "$(id -u)" = "0" ]; then
     chown -R app:app /app/data
diff --git a/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1483b9a90ebcd7189686dc752d589bbe022d08d
--- /dev/null
+++ b/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md
@@ -0,0 +1,186 @@
+# Training Pipeline Audit - 2026-05-07
+
+Scope: `src/violawake_sdk/` and `console/backend/`, focused on latent fresh-deploy failures in the synchronous training path and adjacent SDK evaluation/runtime paths.
+
+Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.
+
+One-line fixes applied:
+- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
+- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.
+
+## Findings
+
+### 1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
+
+File/lines: `console/backend/entrypoint.sh:8-15`
+
+Evidence:
+```sh
+for i in 1 2 3 4 5; do
+    if alembic upgrade head; then
+        break
+    fi
+    echo "[entrypoint] alembic try $i failed; retrying in 3s..."
+    [ "$i" = "5" ] && exit 1
+    sleep 3
+done
+```
+
+Before `9cbb8e6`, the loop had no final failure exit, so the fifth failed try fell through and the backend could start against an unmigrated or unavailable database. Proposed fix: applied in `9cbb8e6`; keep this fail-fast behavior and consider logging a clearer final "migration failed" line. Estimated effort: done; optional polish 15 minutes.
+
+### 2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
+
+File/lines: `src/violawake_sdk/tools/train.py:1060-1063`, `src/violawake_sdk/tools/train.py:1159-1165`, `src/violawake_sdk/tools/train.py:1203-1208`, `console/backend/app/job_queue.py:834`
+
+Evidence:
+```py
+except ImportError as e:
+    print(f"ERROR: PyTorch required for training: {e}", file=sys.stderr)
+    print("Install with: pip install 'violawake[training]'", file=sys.stderr)
+    sys.exit(1)
+
+if len(pos_embs) < 5:
+    ...
+    sys.exit(1)
+
+if len(all_neg_embs) < 5:
+    ...
+    sys.exit(1)
+```
+```py
+except Exception as exc:
+    ...
+```
+
+`SystemExit` inherits from `BaseException`, not `Exception`, so these library-path exits bypass the job failure handler. A fresh deploy with missing training deps, bad embeddings, or no usable negatives can leave the task exception unhandled instead of marking the job failed cleanly. Proposed fix: split CLI exits from library behavior. Make `_train_temporal_cnn` and other callable helpers raise `RuntimeError`/typed exceptions, and let `main()` convert them to exit codes. As a short-term guard, catch `SystemExit` at the backend boundary and mark the job failed without swallowing `asyncio.CancelledError`. Estimated effort: 0.5-1 day with tests.
+
+### 3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
+
+File/lines: `src/violawake_sdk/training/evaluate.py:264`, `src/violawake_sdk/training/evaluate.py:327`, `src/violawake_sdk/tools/contamination_check.py:114`
+
+Evidence:
+```py
+oww = OWWModel(inference_framework="onnx")
+```
+
+Pre-fix grep found default constructors at those sites. They are adjacent to training/evaluation workflows and would have failed the same way as `tools/train.py` when the container has a `tflite_runtime` that cannot read the current OpenWakeWord `.tflite` schema. Proposed fix: applied in `0c3923b`; keep `grep -R "OWWModel()" src/violawake_sdk console/backend` as a regression check. Estimated effort: done.
+
+### 4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
+
+File/lines: `console/backend/entrypoint.sh:22-23`
+
+Evidence:
+```sh
+echo "[entrypoint] ensuring openwakeword backbone models are downloaded"
+python -c "from openwakeword.utils import download_models; download_models()"
+```
+
+Before `9cbb8e6`, this command was piped through `tail -3 || true`. In POSIX `sh`, a pipeline's exit status is that of its last command (`tail`), and `|| true` discarded whatever failure remained, so a failing Python download still returned success and the app would report healthy until the first training job failed. Proposed fix: applied in `9cbb8e6`; if output trimming is needed later, use a shell that supports `pipefail` or capture logs without hiding exit status. Estimated effort: done.
+
+### 5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
+
+File/lines: `src/violawake_sdk/tools/train.py:278`, `src/violawake_sdk/tools/train.py:547`, `src/violawake_sdk/tools/train.py:587`, `src/violawake_sdk/tools/train.py:418-434`, `console/Dockerfile.backend:20`, `pyproject.toml:87-100`
+
+Evidence:
+```py
+communicate = edge_tts.Communicate(text, voice)
+ok = _edge_tts_synthesize(word, voice, out_path)
+ok = _edge_tts_synthesize(phrase, voice, out_path)
+```
+```dockerfile
+RUN pip install --no-cache-dir "/sdk[training]"
+```
+
+The production image installs `violawake[training]`, which includes `edge-tts` but not the `tts` extra (`kokoro-onnx`). `_generate_tts_positives()` has a Kokoro fallback probe, but `TTSEngine` only loads `kokoro-onnx` and its large model files on first synthesis. Confusable and speech negatives do not try Kokoro at all. On a customer machine without internet, behind a proxy, or during Edge rate limiting, quality drops and training can fail if the mounted corpus is absent or too small. Proposed fix: make offline training explicit. Either ship/pre-download an offline TTS backend and its model assets into the image/cache, or remove online TTS from the synchronous path and require a mounted negative/positive corpus with a startup readiness check. Estimated effort: 1-2 days.
+
+### 6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
+
+File/lines: `console/backend/app/database.py:17-23`, `console/backend/app/job_queue.py:151`, `docker-compose.production.yml:36`
+
+Evidence:
+```py
+DATABASE_URL = settings.db_url.strip() if settings.db_url and settings.db_url.strip() else ...
+self._db_path = db_path or (settings.data_dir / "job_queue.db")
+```
+
+The app data model can use `VIOLAWAKE_DB_URL=postgresql+asyncpg://...`, but training job state is always an `aiosqlite` file under `/app/data`. The production compose file has a volume, so single-node Docker is survivable. Fresh deploys without a persistent `/app/data` volume, horizontal replicas, or platform restarts can lose pending/running job state while the main database remains intact. Proposed fix: move the queue tables into Postgres or add an explicit deployment guard that refuses multi-replica/ephemeral-volume production mode. Estimated effort: 1-2 days.
+
+### 7. MEDIUM - Universal negative corpus is optional in the console path but quality-critical
+
+File/lines: `console/backend/app/services/training_service.py:230-270`, `src/violawake_sdk/tools/train.py:2472-2485`, `docker-compose.production.yml:25-32`
+
+Evidence:
+```py
+_CORPUS_SEARCH_PATHS = [
+    Path(__file__).resolve().parent.parent.parent.parent / "corpus",
+    Path.home() / ".violawake" / "corpus",
+    Path("corpus"),
+]
+if total_neg < 5:
+    raise RuntimeError("Only %s negative files generated. edge-tts may not be installed or network unavailable." % total_neg)
+```
+
+The CLI warns when no universal corpus exists, but the backend only hard-fails when total negatives fall below five. If Edge TTS produces enough synthetic negatives, training can succeed without LibriSpeech/MUSAN and create high false-positive models. Production compose mounts `./corpus:/app/corpus:ro`, but Docker will create an empty host directory if it is missing. Proposed fix: add a backend warning/progress event and a startup/health check for minimum corpus counts, or require `VIOLAWAKE_NEGATIVES_CORPUS_DIR`/`/app/corpus` in production. Estimated effort: 0.5 day.
+
+### 8. MEDIUM - Quota enforcement is clean at limit but not atomic with job submission
+
+File/lines: `console/backend/app/routes/billing.py:252-308`, `console/backend/app/routes/jobs.py:101-120`
+
+Evidence:
+```py
+if used >= limit:
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, ...)
+...
+job_id = await queue.submit_job(...)
+await record_usage(db, current_user.id, action="training_job")
+```
+
+If a user is already at quota, the route returns a clean 403, not a 500. The latent issue is concurrency and partial failure: two requests at `limit - 1` can both pass `check_training_quota()`, and a queue insert can succeed before `record_usage()` fails or increments. Proposed fix: reserve usage atomically before queue submission, or put queue submission and usage reservation behind one database transaction/outbox. Estimated effort: 0.5-1 day.
+
+### 9. MEDIUM - Health/readiness does not validate training prerequisites
+
+File/lines: `console/backend/app/health.py:122-155`, `console/backend/app/config.py:195-198`
+
+Evidence:
+```py
+upload_dir = _check_directory(settings.upload_dir)
+models_dir = _check_directory(settings.models_dir)
+component_status = _combine_statuses(upload_dir["status"], models_dir["status"])
+```
+
+Runtime dirs are created at import, and health checks database, queue, uploads, and models. It does not check `tmp_dir` writability, OpenWakeWord resource presence, universal corpus counts, or importability of training-only dependencies. After the entrypoint fixes, OWW download is a startup prerequisite in Docker, but local/fresh SDK starts and non-Docker runs can still report healthy before first training fails. Proposed fix: add a training readiness component with cheap checks and a more expensive admin-only preflight. Estimated effort: 0.5 day.
+
+### 10. MEDIUM - SDK model cache defaults to `Path.home()` and is not bound to the backend data volume
+
+File/lines: `src/violawake_sdk/models.py:27`, `src/violawake_sdk/models.py:152-158`, `src/violawake_sdk/wake_detector.py:631-637`, `src/violawake_sdk/tts.py:128-129`
+
+Evidence:
+```py
+DEFAULT_MODEL_DIR = Path.home() / ".violawake" / "models"
+model_dir = Path(os.environ.get("VIOLAWAKE_MODEL_DIR", str(DEFAULT_MODEL_DIR)))
+model_dir.mkdir(parents=True, exist_ok=True)
+```
+
+SDK inference and Kokoro TTS auto-download into the process home directory unless `VIOLAWAKE_MODEL_DIR` is set. The backend Dockerfile/compose do not set that env var to `/app/data/models`, so downloaded SDK assets can be ephemeral or unwritable depending on platform user/home behavior. Proposed fix: set `VIOLAWAKE_MODEL_DIR=/app/data/sdk-models` in backend production config and document the same for fresh SDK deploys that disable internet access. Estimated effort: 15-30 minutes plus deploy validation.
+
+## Lazy Import Inventory
+
+`src/violawake_sdk/tools/train.py`: lazy imports at lines 68, 227, 267-272, 290, 304, 322, 343-344, 363-366, 400-418, 472-473, 522, 604-606, 634-637, 686-689, 721-724, 820, 865-871, 950, 1056-1071, 1087-1088, 1396-1397, 1447, 1538-1543, 1728-1750, 2066-2068, 2451, 2586. Risky runtime-only deps here: `openwakeword`, `edge_tts`, `pydub`, `torchaudio`, `audiomentations`, `torch`, `onnx`, `onnxruntime`, `scipy`, `violawake_sdk.tts`/Kokoro.
+
+`src/violawake_sdk/oww_backbone.py`: lazy imports at lines 126 (`openwakeword.utils.download_models`) and 187 (`MODEL_REGISTRY`). The resource download is now enforced in Docker entrypoint, but SDK/local usage still auto-downloads on first missing-resource access.
+
+Backend services/platform lazy imports: `console/backend/app/services/training_service.py:100` imports SDK training helpers only when a job runs; `app/storage.py:151-152` and `198` import `boto3`/`botocore` only when R2 is configured or queried; `app/routes/billing.py:69`, `273`, `277` import Stripe/email/asyncio lazily; `app/routes/recordings.py:128-156` imports `io`, `numpy`, and `scipy` during upload validation/resampling; `app/middleware.py:111-112` imports Sentry only when configured; `app/main.py:42`, `76`, `170` imports retention/email services at startup/admin execution; `app/job_queue.py:117`, `791`, `1170`, `1195` imports subscription/email/retention helpers at priority, completion, and cleanup time.
+
+## External Network Dependencies
+
+Synchronous training path: Edge TTS at `src/violawake_sdk/tools/train.py:278`, OpenWakeWord `download_models()` at `src/violawake_sdk/oww_backbone.py:126-132` and `console/backend/entrypoint.sh:23`, optional Kokoro model downloads via `get_model_path()` in `src/violawake_sdk/tts.py:128-129`, and optional object storage calls through R2 in `console/backend/app/storage.py:151-217`.
+
+SDK/download path: model auto-download via `urllib.request.urlopen()` in `src/violawake_sdk/models.py:197-267`, explicit downloads via `requests.get()` in `src/violawake_sdk/models.py:483`, corpus expansion via `requests.get()` in `src/violawake_sdk/tools/expand_corpus.py:89`, and certificate-pinned downloads in `src/violawake_sdk/security/cert_pinning.py:664-706`.
+
+## Filesystem Expectations
+
+Required backend paths: `console/backend/app/config.py:30-35` defines `data_dir`, `db_path`, `upload_dir`, `models_dir`, and `tmp_dir`; `config.py:195-198` creates them at import. `job_queue.py:151` requires a writable local `data/job_queue.db`; `training_service.py:72`, `107`, and `job_queue.py:707` require `settings.tmp_dir` for temporary training work.
+
+Optional but quality-critical corpus paths: `training_service.py:231-234`, `tools/train.py:2424-2428`, and `console/backend/scripts/train_full_pipeline.py:198-202` search repo `corpus`, `~/.violawake/corpus`, and CWD `corpus`. Absence is not fatal unless too few negatives remain.
+
+SDK model cache path: `src/violawake_sdk/models.py:27` and `152-158` use `~/.violawake/models` unless `VIOLAWAKE_MODEL_DIR` is set.
diff --git a/src/violawake_sdk/tools/contamination_check.py b/src/violawake_sdk/tools/contamination_check.py
index bfb5849b57da804d87d79e1e8a6e42b3d3b6d3a2..bc134136aaa00bd5e32a5e3ff573d854b0b6f2ae
--- a/src/violawake_sdk/tools/contamination_check.py
+++ b/src/violawake_sdk/tools/contamination_check.py
@@ -111,7 +111,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
 
     def _embed_file(path: Path) -> np.ndarray | None:
diff --git a/src/violawake_sdk/training/evaluate.py b/src/violawake_sdk/training/evaluate.py
index 9bb4577f5e01cb4c3dceffcb9bca8aaf607ca497..508ecdcb73617a30016d4efb4e3e5a1916281b1b
--- a/src/violawake_sdk/training/evaluate.py
+++ b/src/violawake_sdk/training/evaluate.py
@@ -261,7 +261,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"
@@ -324,7 +324,7 @@
     from violawake_sdk._constants import CLIP_SAMPLES
     from violawake_sdk.audio import center_crop, load_audio
 
-    oww = OWWModel()
+    oww = OWWModel(inference_framework="onnx")
     preprocessor = oww.preprocessor
     if not hasattr(preprocessor, "onnx_execution_provider"):
         preprocessor.onnx_execution_provider = "CPUExecutionProvider"

codex
Done. Audit doc is committed at [docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md](J:/CLAUDE/PROJECTS/Wakeword/docs/TRAINING_PIPELINE_AUDIT_2026-05-07.md).

`head -100` was verified. It starts with:
```md
# Training Pipeline Audit - 2026-05-07

Summary counts: CRITICAL 2, HIGH 4, MEDIUM 4, LOW 0.

One-line fixes applied:
- `0c3923b` - pinned remaining SDK `OWWModel` evaluation/contamination helpers to ONNX.
- `9cbb8e6` - made backend startup fail after failed Alembic retries and made OpenWakeWord resource download failures visible to `set -e`.

Findings:
1. CRITICAL - Alembic retry loop could start the app after every migration attempt failed
2. CRITICAL - SDK training helpers can call `sys.exit(1)` inside the backend worker path
3. HIGH - Remaining `OWWModel()` defaults would still select the broken TFLite path
4. HIGH - OpenWakeWord resource pre-download was masked by shell pipeline behavior
5. HIGH - Synchronous training still depends on external Edge TTS, and the offline fallback is not deployed
6. HIGH - Production job queue persists to local SQLite even when the app database is Postgres
```

Commits made:
- `0c3923b` - `Pin OpenWakeWord evaluation helpers to ONNX`
- `9cbb8e6` - `Fail fast on backend startup prerequisites`
- `12e9a50` - `Add training pipeline latent bug audit`
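
A regression check for the `0c3923b` pin could look like the sketch below. It is an illustration, not part of the commits above: the helper name `find_unpinned_calls` is hypothetical, and it only assumes that call sites use the name `OWWModel` as shown in the diffs. It parses Python source with the standard `ast` module and reports every `OWWModel(...)` call that does not pass `inference_framework` explicitly.

```python
import ast


def find_unpinned_calls(source: str) -> list[int]:
    """Return 1-based line numbers of OWWModel(...) calls that do not
    pass an explicit inference_framework keyword.

    Hypothetical helper for illustration; a call that only forwards
    **kwargs is reported too, since the backend cannot be verified.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both `OWWModel(...)` and `module.OWWModel(...)`.
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            continue
        if name != "OWWModel":
            continue
        if not any(kw.arg == "inference_framework" for kw in node.keywords):
            hits.append(node.lineno)
    return sorted(hits)
```

Unlike a plain `grep "OWWModel()"`, this also flags calls that pass other arguments but still leave the backend unpinned. It could be run over every `.py` file under `src/violawake_sdk` and `console/backend` in a unit test.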

Requested test:
```text
tests\test_teams.py .........................                            [100%]
======================= 147 passed, 2 skipped in 41.03s =======================
```

One-liner takeaway: the next biggest landmine is `sys.exit(1)` inside SDK training helpers used by the backend worker. `SystemExit` bypasses the job queue’s `except Exception` failure path, so a runtime training failure may never be recorded as a cleanly failed job.
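
The short-term guard for the `sys.exit(1)` hazard could be sketched as follows. This is a minimal illustration under assumptions, not the backend's actual code: `run_training_step` and `run_job` are hypothetical names, and the returned dict stands in for whatever the job queue uses to record status. The point is that `SystemExit` is converted to a regular exception at the boundary, while `asyncio.CancelledError` (also a `BaseException`) is left to propagate.

```python
def run_training_step(train_fn, *args, **kwargs):
    """Call a training helper, converting SystemExit into RuntimeError.

    Only SystemExit is intercepted; other BaseException subclasses such
    as asyncio.CancelledError and KeyboardInterrupt pass through untouched.
    """
    try:
        return train_fn(*args, **kwargs)
    except SystemExit as exc:
        raise RuntimeError(
            f"training helper exited with code {exc.code!r}"
        ) from exc


def run_job(train_fn) -> dict:
    """Hypothetical worker boundary: map the outcome to a job status record."""
    try:
        result = run_training_step(train_fn)
    except Exception as exc:  # now also sees the converted SystemExit
        return {"status": "failed", "error": str(exc)}
    return {"status": "completed", "result": result}
```

With this wrapper, a helper that calls `sys.exit(1)` produces a `failed` record instead of escaping the handler. The longer-term fix described in the audit, raising typed exceptions from the helpers and leaving exit codes to `main()`, would make the wrapper unnecessary.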

Note: the worktree still has unrelated modified files I did not touch or revert: `pyproject.toml`, `src/violawake_sdk/tools/train.py`, and `tests/unit/test_train.py`, plus untracked Codex log/prompt files.
SUCCESS: The process with PID 82296 (child process of PID 9672) has been terminated.
SUCCESS: The process with PID 9672 (child process of PID 73196) has been terminated.
tokens used
291,315
