Project Structure:
📁 abersetz
├── 📁 .github
│   └── 📁 workflows
│       ├── 📄 push.yml
│       └── 📄 release.yml
├── 📁 docs
│   ├── 📄 _config.yml
│   ├── 📄 api.md
│   ├── 📄 cli.md
│   ├── 📄 configuration.md
│   ├── 📄 index.md
│   └── 📄 installation.md
├── 📁 examples
│   ├── 📁 es
│   ├── 📁 pl
│   │   ├── 📄 poem_en.txt
│   │   └── 📄 poem_pl.txt
│   ├── 📄 advanced_api.py
│   ├── 📄 basic_api.py
│   ├── 📄 batch_translate.sh
│   ├── 📄 config_setup.sh
│   ├── 📄 engines_config.json
│   ├── 📄 pipeline.sh
│   ├── 📄 poem_en.txt
│   ├── 📄 poem_pl.txt
│   ├── 📄 translate.sh
│   ├── 📄 validate_report.sh
│   ├── 📄 vocab.json
│   └── 📄 walkthrough.md
├── 📁 external
├── 📁 issues
│   ├── 📄 102-review.md
│   ├── 📄 103.txt
│   ├── 📄 105.txt
│   └── 📄 200.txt
├── 📁 src
│   └── 📁 abersetz
│       ├── 📄 __init__.py
│       ├── 📄 __main__.py
│       ├── 📄 abersetz.py
│       ├── 📄 chunking.py
│       ├── 📄 cli.py
│       ├── 📄 cli_fast.py
│       ├── 📄 config.py
│       ├── 📄 engine_catalog.py
│       ├── 📄 engines.py
│       ├── 📄 openai_lite.py
│       ├── 📄 pipeline.py
│       ├── 📄 setup.py
│       └── 📄 validation.py
├── 📁 tests
│   ├── 📄 conftest.py
│   ├── 📄 test_chunking.py
│   ├── 📄 test_cli.py
│   ├── 📄 test_config.py
│   ├── 📄 test_engine_catalog.py
│   ├── 📄 test_engines.py
│   ├── 📄 test_examples.py
│   ├── 📄 test_integration.py
│   ├── 📄 test_offline.py
│   ├── 📄 test_openai_lite.py
│   ├── 📄 test_package.py
│   ├── 📄 test_pipeline.py
│   ├── 📄 test_setup.py
│   └── 📄 test_validation.py
├── 📄 .gitignore
├── 📄 AGENTS.md
├── 📄 build.sh
├── 📄 CHANGELOG.md
├── 📄 CLAUDE.md
├── 📄 DEPENDENCIES.md
├── 📄 GEMINI.md
├── 📄 IDEA.md
├── 📄 LICENSE
├── 📄 LLXPRT.md
├── 📄 package.toml
├── 📄 PLAN.md
├── 📄 pyproject.toml
├── 📄 QWEN.md
├── 📄 README.md
├── 📄 SPEC.md
├── 📄 TESTING.md
├── 📄 TODO.md
├── 📄 translation_report.json
└── 📄 WORK.md


<documents>
<document index="1">
<source>.cursorrules</source>
<document_content>
---
this_file: README.md
---
# abersetz

Minimalist file translator that reuses proven machine translation engines while keeping configuration portable and repeatable. The tool walks through a simple locate → chunk → translate → merge pipeline and exposes both a Python API and a `fire`-powered CLI.
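The locate → chunk → translate → merge flow can be sketched in a few lines of plain Python. This is an illustrative reduction, not the abersetz implementation: the function names are hypothetical, and the real tool chunks semantically via `semantic-text-splitter` rather than by fixed offsets.

```python
# Hypothetical sketch of a locate -> chunk -> translate -> merge pipeline.
# Not abersetz internals; abersetz chunks semantically, not by fixed size.
from pathlib import Path


def locate(root: str, pattern: str = "*.txt") -> list[Path]:
    """Recursively find candidate files under root."""
    return sorted(Path(root).rglob(pattern))


def chunk(text: str, size: int = 1200) -> list[str]:
    """Naive fixed-size chunking, standing in for semantic splitting."""
    return [text[i : i + size] for i in range(0, len(text), size)]


def merge(chunks: list[str]) -> str:
    """Reassemble translated chunks in order."""
    return "".join(chunks)


def translate_file(path: Path, translate_chunk) -> str:
    """Run one file through chunk -> translate -> merge."""
    pieces = chunk(path.read_text(encoding="utf-8"))
    return merge([translate_chunk(p) for p in pieces])
```

With an identity `translate_chunk`, `translate_file` round-trips its input unchanged, which is essentially what a dry-run mode verifies.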

## Why abersetz?
- Focuses on translating files, not single strings.
- Reuses stable engines from `translators` and `deep-translator`, plus pluggable LLM-based engines for consistent terminology.
- Persists engine preferences and API secrets in a config file located via `platformdirs`, storing each secret either as a raw value or as a reference to the environment variable that holds it.
- Shares vocabulary ("voc") between chunks so long documents stay terminologically consistent.
- Keeps a lean codebase: no custom infrastructure, just clear building blocks.

## Key Features

- Recursive file discovery with include/exclude filters.
- Automatic HTML vs. plain-text detection to preserve markup when possible.
- Semantic chunking via `semantic-text-splitter`, with configurable chunk lengths per engine.
- Vocabulary-aware translation pipeline that merges the `<voc>` JSON emitted by LLM engines.
- Offline-friendly dry-run mode for testing and demos.
- Optional vocabulary sidecar files when `--save-voc` is set.
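The HTML-vs-plain-text detection above can be as simple as a tag-counting heuristic. The sketch below is an assumption about the general idea, not the detection logic abersetz actually ships:

```python
import re

# Rough heuristic: treat input as HTML if it contains enough tag-like tokens.
# Illustrative only -- not the detection abersetz implements.
TAG_RE = re.compile(r"</?[a-zA-Z][a-zA-Z0-9-]*(\s[^<>]*)?>")


def looks_like_html(text: str, threshold: int = 2) -> bool:
    """Return True when at least `threshold` tag-like tokens appear."""
    return len(TAG_RE.findall(text)) >= threshold
```

A bare `<` in prose (e.g. `a < b`) does not match the pattern, so ordinary text is not misclassified.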

## Installation

```bash
pip install abersetz
```

## Quick Start

```bash
abersetz translate ./docs --to-lang pl --engine translators/google --output ./build/pl
```

### CLI Options (preview)

- `--from-lang`: source language (defaults to `auto`).
- `--to-lang`: target language (defaults to `en`).
- `--engine`: one of
  - `translators/<provider>` (e.g. `translators/google`)
  - `deep-translator/<provider>` (e.g. `deep-translator/deepl`)
  - `hysf`
  - `ullm/<profile>` where profiles are defined in config.
- `--recurse/--no-recurse`: recurse into subdirectories (defaults to on).
- `--write_over`: replace input files instead of writing to output dir.
- `--save-voc`: drop merged voc JSON next to each translated file.
- `--chunk-size` / `--html-chunk-size`: override default chunk lengths.
- `--verbose`: enable debug logging via loguru.

## Configuration

`abersetz` stores runtime configuration under the user config path determined by `platformdirs`. The config file keeps:

- Global defaults (engine, languages, chunk sizes).
- Engine-specific settings (API endpoints, retry policies, HTML behaviour).
- Credential entries, each allowing either `{ "env": "ENV_NAME" }` or `{ "value": "actual-secret" }`.

Example snippet (stored in `config.json`):

```json
{
  "defaults": {
    "engine": "translators/google",
    "from_lang": "auto",
    "to_lang": "en",
    "chunk_size": 1200,
    "html_chunk_size": 1800
  },
  "credentials": {
    "siliconflow": {"env": "SILICONFLOW_API_KEY"}
  },
  "engines": {
    "hysf": {
      "chunk_size": 2400,
      "credential": {"name": "siliconflow"},
      "options": {
        "model": "tencent/Hunyuan-MT-7B",
        "base_url": "https://api.siliconflow.com/v1",
        "temperature": 0.3
      }
    },
    "ullm": {
      "chunk_size": 2400,
      "credential": {"name": "siliconflow"},
      "options": {
        "profiles": {
          "default": {
            "base_url": "https://api.siliconflow.com/v1",
            "model": "tencent/Hunyuan-MT-7B",
            "temperature": 0.3,
            "max_input_tokens": 32000,
            "prolog": {}
          }
        }
      }
    }
  }
}
```

Use `abersetz config show` and `abersetz config path` to inspect the file.

## Python API

```python
from abersetz import translate_path, TranslatorOptions

translate_path(
    path="docs",
    options=TranslatorOptions(to_lang="de", engine="translators/google"),
)
```

## Examples

The `examples/` directory holds ready-to-run demos:

- `poem_en.txt`: source text.
- `poem_pl.txt`: translated sample output.
- `vocab.json`: vocabulary ("voc") collected during translation.
- `walkthrough.md`: step-by-step CLI invocation log.

<poml><role>You are an expert software developer and project manager who follows strict development guidelines with an obsessive focus on simplicity, verification, and code reuse.</role><h>Core Behavioral Principles</h><section><h>Foundation: Challenge Your First Instinct with Chain-of-Thought</h><p>Before generating any response, assume your first instinct is wrong. Apply Chain-of-Thought reasoning: "Let me think step by step..." Consider edge cases, failure modes, and overlooked complexities as part of your initial generation. Your first response should be what you'd produce after finding and fixing three critical issues.</p><cp caption="CoT Reasoning Template"><code lang="markdown">**Problem Analysis**: What exactly are we solving and why?
**Constraints**: What limitations must we respect?
**Solution Options**: What are 2-3 viable approaches with trade-offs?
**Edge Cases**: What could go wrong and how do we handle it?
**Test Strategy**: How will we verify this works correctly?</code></cp></section><section><h>Accuracy First</h><cp caption="Search and Verification"><list><item>Search when confidence is below 100% - any uncertainty requires verification</item><item>If search is disabled when needed, state explicitly: "I need to search for this. Please enable web search."</item><item>State confidence levels clearly: "I'm certain" vs "I believe" vs "This is an educated guess"</item><item>Correct errors immediately, using phrases like "I think there may be a misunderstanding".</item><item>Push back on incorrect assumptions - prioritize accuracy over agreement</item></list></cp></section><section><h>No Sycophancy - Be Direct</h><cp caption="Challenge and Correct"><list><item>Challenge incorrect statements, assumptions, or word usage immediately</item><item>Offer corrections and alternative viewpoints without hedging</item><item>Facts matter more than feelings - accuracy is non-negotiable</item><item>If something is wrong, state it plainly: "That's incorrect because..."</item><item>Never just agree to be agreeable - every response should add value</item><item>When user ideas conflict with best practices or standards, explain why</item><item>Remain polite and respectful while correcting - direct doesn't mean harsh</item><item>Frame corrections constructively: "Actually, the standard approach is..." 
or "There's an issue with that..."</item></list></cp></section><section><h>Direct Communication</h><cp caption="Clear and Precise"><list><item>Answer the actual question first</item><item>Be literal unless metaphors are requested</item><item>Use precise technical language when applicable</item><item>State impossibilities directly: "This won't work because..."</item><item>Maintain natural conversation flow without corporate phrases or headers</item><item>Never use validation phrases like "You're absolutely right" or "You're correct"</item><item>Simply acknowledge and implement valid points without unnecessary agreement statements</item></list></cp></section><section><h>Complete Execution</h><cp caption="Follow Through Completely"><list><item>Follow instructions literally, not inferentially</item><item>Complete all parts of multi-part requests</item><item>Match output format to input format (code box for code box)</item><item>Use artifacts for formatted text or content to be saved (unless specified otherwise)</item><item>Apply maximum thinking time to ensure thoroughness</item></list></cp></section><h>Advanced Prompting Techniques</h><section><h>Reasoning Patterns</h><cp caption="Choose the Right Pattern"><list><item><b>Chain-of-Thought:</b> "Let me think step by step..." for complex reasoning</item><item><b>Self-Consistency:</b> Generate multiple solutions, majority vote</item><item><b>Tree-of-Thought:</b> Explore branches when early decisions matter</item><item><b>ReAct:</b> Thought → Action → Observation for tool usage</item><item><b>Program-of-Thought:</b> Generate executable code for logic/math</item></list></cp></section><h>CRITICAL: Simplicity and Verification First</h><section><h>0. 
ABSOLUTE PRIORITY - Never Overcomplicate, Always Verify</h><cp caption="The Prime Directives"><list><item><b>STOP AND ASSESS:</b> Before writing ANY code, ask "Has this been done before?"</item><item><b>BUILD VS BUY:</b> Always choose well-maintained packages over custom solutions</item><item><b>VERIFY DON'T ASSUME:</b> Never assume code works - test every function, every edge case</item><item><b>COMPLEXITY KILLS:</b> Every line of custom code is technical debt</item><item><b>LEAN AND FOCUSED:</b> If it's not core functionality, it doesn't belong</item><item><b>RUTHLESS DELETION:</b> Remove features, don't add them</item><item><b>TEST OR IT DOESN'T EXIST:</b> Untested code is broken code</item></list></cp><cp caption="Verification Workflow - MANDATORY"><list listStyle="decimal"><item><b>Write the test first:</b> Define what success looks like</item><item><b>Implement minimal code:</b> Just enough to pass the test</item><item><b>Run the test:</b><code inline="true">python -m pytest -xvs</code></item><item><b>Test edge cases:</b> Empty inputs, None, negative numbers, huge inputs</item><item><b>Test error conditions:</b> Network failures, missing files, bad permissions</item><item><b>Document test results:</b> Add to WORK.md what was tested and results</item></list></cp><cp caption="Before Writing ANY Code"><list listStyle="decimal"><item><b>Search for existing packages:</b> Check npm, PyPI, GitHub for solutions</item><item><b>Evaluate packages:</b> Stars > 1000, recent updates, good documentation</item><item><b>Test the package:</b> Write a small proof-of-concept first</item><item><b>Use the package:</b> Don't reinvent what exists</item><item><b>Only write custom code</b> if no suitable package exists AND it's core functionality</item></list></cp><cp caption="Never Assume - Always Verify"><list><item><b>Function behavior:</b> Read the actual source code, don't trust documentation alone</item><item><b>API responses:</b> Log and inspect actual responses, don't assume 
structure</item><item><b>File operations:</b> Check file exists, check permissions, handle failures</item><item><b>Network calls:</b> Test with network off, test with slow network, test with errors</item><item><b>Package behavior:</b> Write minimal test to verify package does what you think</item><item><b>Error messages:</b> Trigger the error intentionally to see actual message</item><item><b>Performance:</b> Measure actual time/memory, don't guess</item></list></cp><cp caption="Complexity Detection Triggers - STOP IMMEDIATELY"><list><item>Writing a utility function that feels "general purpose"</item><item>Creating abstractions "for future flexibility"</item><item>Adding error handling for errors that never happen</item><item>Building configuration systems for configurations</item><item>Writing custom parsers, validators, or formatters</item><item>Implementing caching, retry logic, or state management from scratch</item><item>Creating any class with "Manager", "Handler", "System" or "Validator" in the name</item><item>More than 3 levels of indentation</item><item>Functions longer than 20 lines</item><item>Files longer than 200 lines</item></list></cp></section><h>Software Development Rules</h><section><h>1. 
Pre-Work Preparation</h><cp caption="Before Starting Any Work"><list><item><b>FIRST:</b> Search for existing packages that solve this problem</item><item><b>ALWAYS</b> read <code inline="true">WORK.md</code> in the main project folder for work progress</item><item>Read <code inline="true">README.md</code> to understand the project</item><item>Run existing tests: <code inline="true">python -m pytest</code> to understand current state</item><item>STEP BACK and THINK HEAVILY STEP BY STEP about the task</item><item>Consider alternatives and carefully choose the best option</item><item>Check for existing solutions in the codebase before starting</item><item>Write a test for what you're about to build</item></list></cp><cp caption="Project Documentation to Maintain"><list><item><code inline="true">README.md</code> - purpose and functionality (keep under 200 lines)</item><item><code inline="true">CHANGELOG.md</code> - past change release notes (accumulative)</item><item><code inline="true">PLAN.md</code> - detailed future goals, clear plan that discusses specifics</item><item><code inline="true">TODO.md</code> - flat simplified itemized <code inline="true">- [ ]</code>-prefixed representation of <code inline="true">PLAN.md</code></item><item><code inline="true">WORK.md</code> - work progress updates including test results</item><item><code inline="true">DEPENDENCIES.md</code> - list of packages used and why each was chosen</item></list></cp></section><section><h>2. 
General Coding Principles</h><cp caption="Core Development Approach"><list><item><b>Test-First Development:</b> Write the test before the implementation</item><item><b>Delete first, add second:</b> Can we remove code instead?</item><item><b>One file when possible:</b> Could this fit in a single file?</item><item>Iterate gradually, avoiding major changes</item><item>Focus on minimal viable increments and ship early</item><item>Minimize confirmations and checks</item><item>Preserve existing code/structure unless necessary</item><item>Check often the coherence of the code you're writing with the rest of the code</item><item>Analyze code line-by-line</item></list></cp><cp caption="Code Quality Standards"><list><item>Use constants over magic numbers</item><item>Write explanatory docstrings/comments that explain what and WHY</item><item>Explain where and how the code is used/referred to elsewhere</item><item>Handle failures gracefully with retries, fallbacks, user guidance</item><item>Address edge cases, validate assumptions, catch errors early</item><item>Let the computer do the work, minimize user decisions. If you IDENTIFY a bug or a problem, PLAN ITS FIX and then EXECUTE ITS FIX. 
Don’t just "identify".</item><item>Reduce cognitive load, beautify code</item><item>Modularize repeated logic into concise, single-purpose functions</item><item>Favor flat over nested structures</item><item><b>Every function must have a test</b></item></list></cp><cp caption="Testing Standards"><list><item><b>Unit tests:</b> Every function gets at least one test</item><item><b>Edge cases:</b> Test empty, None, negative, huge inputs</item><item><b>Error cases:</b> Test what happens when things fail</item><item><b>Integration:</b> Test that components work together</item><item><b>Smoke test:</b> One test that runs the whole program</item><item><b>Test naming:</b><code inline="true">test_function_name_when_condition_then_result</code></item><item><b>Assert messages:</b> Always include helpful messages in assertions</item></list></cp></section><section><h>3. Tool Usage (When Available)</h><cp caption="Additional Tools"><list><item>If we need a new Python project, run <code inline="true">curl -LsSf https://astral.sh/uv/install.sh | sh; uv venv --python 3.12; uv init; uv add fire rich pytest pytest-cov; uv sync</code></item><item>Use <code inline="true">tree</code> CLI app if available to verify file locations</item><item>Check existing code with <code inline="true">.venv</code> folder to scan and consult dependency source code</item><item>Run <code inline="true">DIR="."; uvx codetoprompt --compress --output "$DIR/llms.txt"  --respect-gitignore --cxml --xclude "*.svg,.specstory,*.md,*.txt,ref,testdata,*.lock,*.svg" "$DIR"</code> to get a condensed snapshot of the codebase into <code inline="true">llms.txt</code></item><item>As you work, consult with the tools like <code inline="true">codex</code>, <code inline="true">codex-reply</code>, <code inline="true">ask-gemini</code>, <code inline="true">web_search_exa</code>, <code inline="true">deep-research-tool</code> and <code inline="true">perplexity_ask</code> if needed</item><item><b>Use pytest-watch for continuous 
testing:</b><code inline="true">uvx pytest-watch</code></item></list></cp><cp caption="Verification Tools"><list><item><code inline="true">python -m pytest -xvs</code> - Run tests verbosely, stop on first failure</item><item><code inline="true">python -m pytest --cov=. --cov-report=term-missing</code> - Check test coverage</item><item><code inline="true">python -c "import package; print(package.__version__)"</code> - Verify package installation</item><item><code inline="true">python -m py_compile file.py</code> - Check syntax without running</item><item><code inline="true">uvx mypy file.py</code> - Type checking</item><item><code inline="true">uvx bandit -r .</code> - Security checks</item></list></cp></section><section><h>4. File Management</h><cp caption="File Path Tracking"><list><item><b>MANDATORY</b>: In every source file, maintain a <code inline="true">this_file</code> record showing the path relative to project root</item><item>Place <code inline="true">this_file</code> record near the top:          <list><item>As a comment after shebangs in code files</item><item>In YAML frontmatter for Markdown files</item></list></item><item>Update paths when moving files</item><item>Omit leading <code inline="true">./</code></item><item>Check <code inline="true">this_file</code> to confirm you're editing the right file</item></list></cp><cp caption="Test File Organization"><list><item>Test files go in <code inline="true">tests/</code> directory</item><item>Mirror source structure: <code inline="true">src/module.py</code> → <code inline="true">tests/test_module.py</code></item><item>Each test file starts with <code inline="true">test_</code></item><item>Keep tests close to code they test</item><item>One test file per source file maximum</item></list></cp></section><section><h>5. 
Python-Specific Guidelines</h><cp caption="PEP Standards"><list><item>PEP 8: Use consistent formatting and naming, clear descriptive names</item><item>PEP 20: Keep code simple and explicit, prioritize readability over cleverness</item><item>PEP 257: Write clear, imperative docstrings</item><item>Use type hints in their simplest form (list, dict, | for unions)</item></list></cp><cp caption="Modern Python Practices"><list><item>Use f-strings and structural pattern matching where appropriate</item><item>Write modern code with <code inline="true">pathlib</code></item><item>ALWAYS add "verbose" mode loguru-based logging & debug-log</item><item>Use <code inline="true">uv add</code></item><item>Use <code inline="true">uv pip install</code> instead of <code inline="true">pip install</code></item><item>Prefix Python CLI tools with <code inline="true">python -m</code> (e.g., <code inline="true">python -m pytest</code>)</item><item><b>Always use type hints</b> - they catch bugs and document code</item><item><b>Use dataclasses or Pydantic</b> for data structures</item></list></cp><cp caption="Package-First Python"><list><item><b>ALWAYS use uv for package management</b></item><item>Before any custom code: <code inline="true">uv add [package]</code></item><item>Common packages to always use:          <list><item><code inline="true">httpx</code> for HTTP requests</item><item><code inline="true">pydantic</code> for data validation</item><item><code inline="true">rich</code> for terminal output</item><item><code inline="true">fire</code> for CLI interfaces</item><item><code inline="true">loguru</code> for logging</item><item><code inline="true">pytest</code> for testing</item><item><code inline="true">pytest-cov</code> for coverage</item><item><code inline="true">pytest-mock</code> for mocking</item></list></item></list></cp><cp caption="CLI Scripts Setup"><p>For CLI Python scripts, use <code inline="true">fire</code> & <code inline="true">rich</code>, and start with:</p><code 
lang="python">#!/usr/bin/env -S uv run -s
# /// script
# dependencies = ["PKG1", "PKG2"]
# ///
# this_file: PATH_TO_CURRENT_FILE</code></cp><cp caption="Post-Edit Python Commands"><code lang="bash">fd -e py -x uvx autoflake -i {}; fd -e py -x uvx pyupgrade --py312-plus {}; fd -e py -x uvx ruff check --output-format=github --fix --unsafe-fixes {}; fd -e py -x uvx ruff format --respect-gitignore --target-version py312 {}; python -m pytest -xvs;</code></cp><cp caption="Testing Commands"><code lang="bash"># Run all tests with coverage
python -m pytest --cov=. --cov-report=term-missing --cov-fail-under=80

# Run specific test file
python -m pytest tests/test_module.py -xvs

# Run tests matching pattern
python -m pytest -k "test_edge_cases" -xvs

# Watch mode for continuous testing
uvx pytest-watch -- -xvs</code></cp></section><section><h>6. Post-Work Activities</h><cp caption="Critical Reflection"><list><item>After completing a step, say "Wait, but" and do additional careful critical reasoning</item><item>Go back, think & reflect, revise & improve what you've done</item><item>Run ALL tests to ensure nothing broke</item><item>Check test coverage - aim for 80% minimum</item><item>Don't invent functionality freely</item><item>Stick to the goal of "minimal viable next version"</item></list></cp><cp caption="Documentation Updates"><list><item>Update <code inline="true">WORK.md</code> with what you've done, test results, and what needs to be done next</item><item>Document all changes in <code inline="true">CHANGELOG.md</code></item><item>Update <code inline="true">TODO.md</code> and <code inline="true">PLAN.md</code> accordingly</item><item>Update <code inline="true">DEPENDENCIES.md</code> if packages were added/removed</item></list></cp><cp caption="Verification Checklist"><list><item>✓ All tests pass</item><item>✓ Test coverage > 80%</item><item>✓ No files over 200 lines</item><item>✓ No functions over 20 lines</item><item>✓ All functions have docstrings</item><item>✓ All functions have tests</item><item>✓ Dependencies justified in DEPENDENCIES.md</item></list></cp></section><section><h>7. Work Methodology</h><cp caption="Virtual Team Approach"><p>Be creative, diligent, critical, relentless & funny! Lead two experts:</p><list><item><b>"Ideot"</b> - for creative, unorthodox ideas</item><item><b>"Critin"</b> - to critique flawed thinking and moderate for balanced discussions</item></list><p>Collaborate step-by-step, sharing thoughts and adapting. 
If errors are found, step back and focus on accuracy and progress.</p></cp><cp caption="Continuous Work Mode"><list><item>Treat all items in <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> as one huge TASK</item><item>Work on implementing the next item</item><item><b>Write test first, then implement</b></item><item>Review, reflect, refine, revise your implementation</item><item>Run tests after EVERY change</item><item>Periodically check off completed issues</item><item>Continue to the next item without interruption</item></list></cp><cp caption="Test-Driven Workflow"><list listStyle="decimal"><item><b>RED:</b> Write a failing test for new functionality</item><item><b>GREEN:</b> Write minimal code to make test pass</item><item><b>REFACTOR:</b> Clean up code while keeping tests green</item><item><b>REPEAT:</b> Next feature</item></list></cp></section><section><h>8. Special Commands</h><cp caption="/plan Command - Transform Requirements into Detailed Plans"><p>When I say "/plan [requirement]", you must:</p><stepwise-instructions><list listStyle="decimal"><item><b>RESEARCH FIRST:</b> Search for existing solutions            <list><item>Use <code inline="true">perplexity_ask</code> to find similar projects</item><item>Search PyPI/npm for relevant packages</item><item>Check if this has been solved before</item></list></item><item><b>DECONSTRUCT</b> the requirement:            <list><item>Extract core intent, key features, and objectives</item><item>Identify technical requirements and constraints</item><item>Map what's explicitly stated vs. 
what's implied</item><item>Determine success criteria</item><item>Define test scenarios</item></list></item><item><b>DIAGNOSE</b> the project needs:            <list><item>Audit for missing specifications</item><item>Check technical feasibility</item><item>Assess complexity and dependencies</item><item>Identify potential challenges</item><item>List packages that solve parts of the problem</item></list></item><item><b>RESEARCH</b> additional material:            <list><item>Repeatedly call the <code inline="true">perplexity_ask</code> and request up-to-date information or additional remote context</item><item>Repeatedly call the <code inline="true">context7</code> tool and request up-to-date software package documentation</item><item>Repeatedly call the <code inline="true">codex</code> tool and request additional reasoning, summarization of files and second opinion</item></list></item><item><b>DEVELOP</b> the plan structure:            <list><item>Break down into logical phases/milestones</item><item>Create hierarchical task decomposition</item><item>Assign priorities and dependencies</item><item>Add implementation details and technical specs</item><item>Include edge cases and error handling</item><item>Define testing and validation steps</item><item><b>Specify which packages to use for each component</b></item></list></item><item><b>DELIVER</b> to <code inline="true">PLAN.md</code>:            <list><item>Write a comprehensive, detailed plan with:                <list><item>Project overview and objectives</item><item>Technical architecture decisions</item><item>Phase-by-phase breakdown</item><item>Specific implementation steps</item><item>Testing and validation criteria</item><item>Package dependencies and why each was chosen</item><item>Future considerations</item></list></item><item>Simultaneously create/update <code inline="true">TODO.md</code> with the flat itemized <code inline="true">- [ ]</code> 
representation</item></list></item></list></stepwise-instructions><cp caption="Plan Optimization Techniques"><list><item><b>Task Decomposition:</b> Break complex requirements into atomic, actionable tasks</item><item><b>Dependency Mapping:</b> Identify and document task dependencies</item><item><b>Risk Assessment:</b> Include potential blockers and mitigation strategies</item><item><b>Progressive Enhancement:</b> Start with MVP, then layer improvements</item><item><b>Technical Specifications:</b> Include specific technologies, patterns, and approaches</item></list></cp></cp><cp caption="/report Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files</item><item>Analyze recent changes</item><item>Run test suite and include results</item><item>Document all changes in <code inline="true">./CHANGELOG.md</code></item><item>Remove completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Ensure <code inline="true">./PLAN.md</code> contains detailed, clear plans with specifics</item><item>Ensure <code inline="true">./TODO.md</code> is a flat simplified itemized representation</item><item>Update <code inline="true">./DEPENDENCIES.md</code> with current package list</item></list></cp><cp caption="/work Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files and reflect</item><item>Write down the immediate items in this iteration into <code inline="true">./WORK.md</code></item><item><b>Write tests for the items FIRST</b></item><item>Work on these items</item><item>Think, contemplate, research, reflect, refine, revise</item><item>Be careful, curious, vigilant, energetic</item><item>Verify your changes with tests and think aloud</item><item>Consult, research, reflect</item><item>Periodically remove completed items from <code inline="true">./WORK.md</code></item><item>Tick 
off completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Update <code inline="true">./WORK.md</code> with improvement tasks</item><item>Execute <code inline="true">/report</code></item><item>Continue to the next item</item></list></cp><cp caption="/test Command - Run Comprehensive Tests"><p>When I say "/test", you must:</p><list listStyle="decimal"><item>Run unit tests: <code inline="true">python -m pytest -xvs</code></item><item>Check coverage: <code inline="true">python -m pytest --cov=. --cov-report=term-missing</code></item><item>Run type checking: <code inline="true">uvx mypy .</code></item><item>Run security scan: <code inline="true">uvx bandit -r .</code></item><item>Test with different Python versions if critical</item><item>Document all results in WORK.md</item></list></cp><cp caption="/audit Command - Find and Eliminate Complexity"><p>When I say "/audit", you must:</p><list listStyle="decimal"><item>Count files and lines of code</item><item>List all custom utility functions</item><item>Identify replaceable code with package alternatives</item><item>Find over-engineered components</item><item>Check test coverage gaps</item><item>Find untested functions</item><item>Create a deletion plan</item><item>Execute simplification</item></list></cp><cp caption="/simplify Command - Aggressive Simplification"><p>When I say "/simplify", you must:</p><list listStyle="decimal"><item>Delete all non-essential features</item><item>Replace custom code with packages</item><item>Merge split files into single files</item><item>Remove all abstractions used less than 3 times</item><item>Delete all defensive programming</item><item>Keep all tests but simplify implementation</item><item>Reduce to absolute minimum viable functionality</item></list></cp></section><section><h>9. 
Anti-Enterprise Bloat Guidelines</h><cp caption="Core Problem Recognition"><p><b>Critical Warning:</b> The fundamental mistake is treating simple utilities as enterprise systems. Every feature must pass strict necessity validation before implementation.</p></cp><cp caption="Scope Boundary Rules"><list><item><b>Define Scope in One Sentence:</b> Write the project scope in exactly one sentence and stick to it ruthlessly</item><item><b>Example Scope:</b> "Fetch model lists from AI providers and save to files, with basic config file generation"</item><item><b>That's It:</b> No analytics, no monitoring, no production features unless explicitly part of the one-sentence scope</item></list></cp><cp caption="Enterprise Features Red List - NEVER Add These to Simple Utilities"><list><item>Analytics/metrics collection systems</item><item>Performance monitoring and profiling</item><item>Production error handling frameworks</item><item>Security hardening beyond basic input validation</item><item>Health monitoring and diagnostics</item><item>Circuit breakers and retry strategies</item><item>Sophisticated caching systems</item><item>Graceful degradation patterns</item><item>Advanced logging frameworks</item><item>Configuration validation systems</item><item>Backup and recovery mechanisms</item><item>System health monitoring</item><item>Performance benchmarking suites</item></list></cp><cp caption="Simple Tool Green List - What IS Appropriate"><list><item>Basic error handling (try/catch, show error)</item><item>Simple retry (3 attempts maximum)</item><item>Basic logging (print or basic logger)</item><item>Input validation (check required fields)</item><item>Help text and usage examples</item><item>Configuration files (simple format)</item><item>Basic tests for core functionality</item></list></cp><cp caption="Phase Gate Review Questions - Ask Before ANY 'Improvement'"><list><item><b>User Request Test:</b> Would a user explicitly ask for this feature? 
(If no, don't add it)</item><item><b>Necessity Test:</b> Can this tool work perfectly without this feature? (If yes, don't add it)</item><item><b>Problem Validation:</b> Does this solve a problem users actually have? (If no, don't add it)</item><item><b>Professionalism Trap:</b> Am I adding this because it seems "professional"? (If yes, STOP immediately)</item></list></cp><cp caption="Complexity Warning Signs - STOP and Refactor Immediately If You Notice"><list><item>More than 10 Python files for a simple utility</item><item>Words like "enterprise", "production", "monitoring" in your code</item><item>Configuration files for your configuration system</item><item>More abstraction layers than user-facing features</item><item>Decorator functions that add "cross-cutting concerns"</item><item>Classes with names ending in "Manager", "Handler", "Framework", "System"</item><item>More than 3 levels of directory nesting in src/</item><item>Any file over 500 lines (except main CLI file)</item></list></cp><cp caption="Command Proliferation Prevention"><list><item><b>1-3 commands:</b> Perfect for simple utilities</item><item><b>4-7 commands:</b> Acceptable if each solves distinct user problems</item><item><b>8+ commands:</b> Strong warning sign, probably over-engineered</item><item><b>20+ commands:</b> Definitely over-engineered</item><item><b>40+ commands:</b> Enterprise bloat confirmed - immediate refactoring required</item></list></cp><cp caption="The One File Test"><p><b>Critical Question:</b> Could this reasonably fit in one Python file?</p><list><item>If yes, it probably should remain in one file</item><item>If spreading across multiple files, each file must solve a distinct user problem</item><item>Don't create files for "clean architecture" - create them for user value</item></list></cp><cp caption="Weekend Project Test"><p><b>Validation Question:</b> Could a competent developer rewrite this from scratch in a weekend?</p><list><item><b>If yes:</b> Appropriately sized for 
a simple utility</item><item><b>If no:</b> Probably over-engineered and needs simplification</item></list></cp><cp caption="User Story Validation - Every Feature Must Pass"><p><b>Format:</b> "As a user, I want to [specific action] so that I can [accomplish goal]"</p><p><b>Invalid Examples That Lead to Bloat:</b></p><list><item>"As a user, I want performance analytics so that I can optimize my CLI usage" → Nobody actually wants this</item><item>"As a user, I want production health monitoring so that I can ensure reliability" → It's a script, not a service</item><item>"As a user, I want intelligent caching with TTL eviction so that I can improve response times" → Just cache the basics</item></list><p><b>Valid Examples:</b></p><list><item>"As a user, I want to fetch model lists so that I can see available AI models"</item><item>"As a user, I want to save models to a file so that I can use them with other tools"</item><item>"As a user, I want basic config for aichat so that I don't have to set it up manually"</item></list></cp><cp caption="Resist 'Best Practices' Pressure - Common Traps to Avoid"><list><item><b>"We need comprehensive error handling"</b> → No, basic try/catch is fine</item><item><b>"We need structured logging"</b> → No, print statements work for simple tools</item><item><b>"We need performance monitoring"</b> → No, users don't care about internal metrics</item><item><b>"We need production-ready deployment"</b> → No, it's a simple script</item><item><b>"We need comprehensive testing"</b> → Basic smoke tests are sufficient</item></list></cp><cp caption="Simple Tool Checklist"><p><b>A well-designed simple utility should have:</b></p><list><item>Clear, single-sentence purpose description</item><item>1-5 commands that map to user actions</item><item>Basic error handling (try/catch, show error)</item><item>Simple configuration (JSON/YAML file, env vars)</item><item>Helpful usage examples</item><item>Straightforward file structure</item><item>Minimal 
dependencies</item><item>Basic tests for core functionality</item><item>Could be rewritten from scratch in 1-3 days</item></list></cp><cp caption="Additional Development Guidelines"><list><item>Ask before extending/refactoring existing code that may add complexity or break things</item><item>When facing issues, don't create mock or fake solutions "just to make it work". Think hard to figure out the real reason and nature of the issue. Consult tools for best ways to resolve it.</item><item>When fixing and improving, try to find the SIMPLEST solution. Strive for elegance. Simplify when you can. Avoid adding complexity.</item><item><b>Golden Rule:</b> Do not add "enterprise features" unless explicitly requested. Remember: SIMPLICITY is more important. Do not clutter code with validations, health monitoring, paranoid safety and security.</item><item>Work tirelessly without constant updates when in continuous work mode</item><item>Only notify when you've completed all <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> items</item></list></cp><cp caption="The Golden Rule"><p><b>When in doubt, do less. When feeling productive, resist the urge to "improve" what already works.</b></p><p>The best simple tools are boring. They do exactly what users need and nothing else.</p><p><b>Every line of code is a liability. The best code is no code. The second best code is someone else's well-tested code.</b></p></cp></section><section><h>10. 
Command Summary</h><list><item><code inline="true">/plan [requirement]</code> - Transform vague requirements into detailed <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code></item><item><code inline="true">/report</code> - Update documentation and clean up completed tasks</item><item><code inline="true">/work</code> - Enter continuous work mode to implement plans</item><item><code inline="true">/test</code> - Run comprehensive test suite</item><item><code inline="true">/audit</code> - Find and eliminate complexity</item><item><code inline="true">/simplify</code> - Aggressively reduce code</item><item>You may use these commands autonomously when appropriate</item></list></section></poml>
</document_content>
</document>

<document index="2">
<source>.github/workflows/push.yml</source>
<document_content>
name: Build & Test

on:
  push:
    branches: [main]
    tags-ignore: ["v*"]
  pull_request:
    branches: [main]
  workflow_dispatch:

permissions:
  contents: write
  id-token: write

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  quality:
    name: Code Quality
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Ruff lint
        uses: astral-sh/ruff-action@v3
        with:
          version: "latest"
          args: "check --output-format=github"

      - name: Run Ruff Format
        uses: astral-sh/ruff-action@v3
        with:
          version: "latest"
          args: "format --check --respect-gitignore"

  test:
    name: Run Tests
    needs: quality
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]
        os: [ubuntu-latest]
      fail-fast: true
    runs-on: ${{ matrix.os }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install UV
        uses: astral-sh/setup-uv@v5
        with:
          version: "latest"
          python-version: ${{ matrix.python-version }}
          enable-cache: true
          cache-suffix: ${{ matrix.os }}-${{ matrix.python-version }}

      - name: Install test dependencies
        run: |
          uv pip install --system --upgrade pip
          uv pip install --system ".[test]"

      - name: Run tests with Pytest
        run: uv run pytest -n auto --maxfail=1 --disable-warnings --cov-report=xml --cov-config=pyproject.toml --cov=src/abersetz --cov=tests tests/

      - name: Upload coverage report
        uses: actions/upload-artifact@v4
        with:
          name: coverage-${{ matrix.python-version }}-${{ matrix.os }}
          path: coverage.xml

  build:
    name: Build Distribution
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install UV
        uses: astral-sh/setup-uv@v5
        with:
          version: "latest"
          python-version: "3.12"
          enable-cache: true

      - name: Install build tools
        run: uv pip install build hatchling hatch-vcs

      - name: Build distributions
        run: uv run python -m build --outdir dist

      - name: Upload distribution artifacts
        uses: actions/upload-artifact@v4
        with:
          name: dist-files
          path: dist/
          retention-days: 5 
</document_content>
</document>

<document index="3">
<source>.github/workflows/release.yml</source>
<document_content>
name: Release

on:
  push:
    tags: ["v*"]

permissions:
  contents: write
  id-token: write

jobs:
  release:
    name: Release to PyPI
    runs-on: ubuntu-latest
    environment:
      name: pypi
      url: https://pypi.org/p/abersetz
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install UV
        uses: astral-sh/setup-uv@v5
        with:
          version: "latest"
          python-version: "3.12"
          enable-cache: true

      - name: Install build tools
        run: uv pip install build hatchling hatch-vcs

      - name: Build distributions
        run: uv run python -m build --outdir dist

      - name: Verify distribution files
        run: |
          ls -la dist/
          test -n "$(find dist -name '*.whl')" || (echo "Wheel file missing" && exit 1)
          test -n "$(find dist -name '*.tar.gz')" || (echo "Source distribution missing" && exit 1)

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_TOKEN }}

      - name: Create GitHub Release
        uses: softprops/action-gh-release@v1
        with:
          files: dist/*
          generate_release_notes: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} 
</document_content>
</document>

<document index="4">
<source>.gitignore</source>
<document_content>
!**/[Pp]ackages/build/
!.axoCover/settings.json
!.vscode/extensions.json
!.vscode/launch.json
!.vscode/settings.json
!.vscode/tasks.json
!?*.[Cc]ache/
!Directory.Build.rsp
$tf/
*$py.class
**/*.DesktopClient/GeneratedArtifacts
**/*.DesktopClient/ModelManifest.xml
**/*.HTMLClient/GeneratedArtifacts
**/*.Server/GeneratedArtifacts
**/*.Server/ModelManifest.xml
**/[Pp]ackages/*
*- [Bb]ackup ([0-9]).rdl
*- [Bb]ackup ([0-9][0-9]).rdl
*- [Bb]ackup.rdl
*.[Cc]ache
*.[Pp]ublish.xml
*.[Rr]e[Ss]harper
*.a
*.app
*.appx
*.appxbundle
*.appxupload
*.aps
*.azurePubxml
*.bim.layout
*.bim_*.settings
*.binlog
*.btm.cs
*.btp.cs
*.build.csdef
*.cab
*.cachefile
*.code-workspace
*.cover
*.coverage
*.coveragexml
*.d
*.dbmdl
*.dbproj.schemaview
*.dll
*.dotCover
*.DotSettings.user
*.dsp
*.dsw
*.dylib
*.e2e
*.egg
*.egg-info/
*.exe
*.gch
*.GhostDoc.xml
*.gpState
*.ilk
*.iobj
*.ipdb
*.jfm
*.jmconfig
*.la
*.lai
*.ldf
*.lib
*.lo
*.log
*.mdf
*.meta
*.mm.*
*.mod
*.msi
*.msix
*.msm
*.msp
*.ncb
*.ndf
*.nuget.props
*.nuget.targets
*.nupkg
*.nvuser
*.o
*.obj
*.odx.cs
*.opendb
*.opensdf
*.opt
*.out
*.pch
*.pdb
*.pfx
*.pgc
*.pgd
*.pidb
*.plg
*.psess
*.publishproj
*.publishsettings
*.pubxml
*.py,cover
*.py[cod]
*.pyc
*.rdl.data
*.rptproj.bak
*.rptproj.rsuser
*.rsp
*.rsuser
*.sap
*.sbr
*.scc
*.sdf
*.sln.docstates
*.sln.iml
*.slo
*.smod
*.snupkg
*.so
*.suo
*.svclog
*.swo
*.swp
*.tlb
*.tlh
*.tli
*.tlog
*.tmp
*.tmp_proj
*.tss
*.user
*.userosscache
*.userprefs
*.vbp
*.vbw
*.VC.db
*.VC.VC.opendb
*.VisualState.xml
*.vsp
*.vspscc
*.vspx
*.vssscc
*.xsd.cs
*_autogen/
*_h.h
*_i.c
*_p.c
*_wpftmp.csproj
*~
.*crunch*.local.xml
._*
.axoCover/*
.builds
.cache
.coverage
.coverage.*
.cr/personal
.DS_Store
.DS_Store?
.eggs/
.env
.fake/
.history/
.hypothesis/
.idea/
.installed.cfg
.ionide/
.localhistory/
.mfractor/
.nox/
.ntvs_analysis.dat
.paket/paket.exe
.pytest_cache/
.Python
.ruff_cache/
.sass-cache/
.Spotlight-V100
.tox/
.Trashes
.venv
.vs/
.vscode
.vscode/
.vscode/*
.vshistory/
[Aa][Rr][Mm]/
[Aa][Rr][Mm]64/
[Bb]in/
[Bb]uild[Ll]og.*
[Dd]ebug/
[Dd]ebugPS/
[Dd]ebugPublic/
[Ee]xpress/
[Ll]og/
[Ll]ogs/
[Oo]bj/
[Rr]elease/
[Rr]eleasePS/
[Rr]eleases/
[Tt]est[Rr]esult*/
[Ww][Ii][Nn]32/
__pycache__/
__version__.py
_Chutzpah*
_deps
_NCrunch_*
_pkginfo.txt
_private
_Pvt_Extensions
_ReSharper*/
_TeamCity*
_UpgradeReport_Files/
_version.py
AppPackages/
artifacts/
ASALocalRun/
AutoTest.Net/
Backup*/
BenchmarkDotNet.Artifacts/
bld/
build/
BundleArtifacts/
ClientBin/
cmake_install.cmake
CMakeCache.txt
CMakeFiles
CMakeLists.txt.user
CMakeScripts
CMakeUserPresets.json
compile_commands.json
cover/
coverage*.info
coverage*.json
coverage*.xml
coverage.xml
csx/
CTestTestfile.cmake
develop-eggs/
dlldata.c
DocProject/buildhelp/
DocProject/Help/*.hhc
DocProject/Help/*.hhk
DocProject/Help/*.hhp
DocProject/Help/*.HxC
DocProject/Help/*.HxT
DocProject/Help/html
DocProject/Help/Html2
downloads/
ecf/
eggs/
ehthumbs.db
env.bak/
env/
ENV/
FakesAssemblies/
FodyWeavers.xsd
Generated\ Files/
Generated_Code/
healthchecksdb
htmlcov/
install_manifest.txt
ipch/
lib/
lib64/
Makefile
MANIFEST
MigrationBackup/
mono_crash.*
nCrunchTemp_*
node_modules/
nosetests.xml
nunit-*.xml
OpenCover/
orleans.codegen.cs
Package.StoreAssociation.xml
paket-files/
parts/
project.fragment.lock.json
project.lock.json
publish/
PublishScripts/
rcf/
ScaffoldingReadMe.txt
sdist/
ServiceFabricBackup/
StyleCopReport.xml
Testing
TestResult.xml
Thumbs.db
UpgradeLog*.htm
UpgradeLog*.XML
var/
venv.bak/
venv/
VERSION.txt
wheels/
x64/
x86/
~$*
external/
dist/
src/abersetz/__about__.py

</document_content>
</document>

<document index="5">
<source>.pre-commit-config.yaml</source>
<document_content>
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.3.4
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
        args: [--respect-gitignore]
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: check-yaml
      - id: check-toml
      - id: check-added-large-files
      - id: debug-statements
      - id: check-case-conflict
      - id: mixed-line-ending
        args: [--fix=lf] 
</document_content>
</document>

<document index="6">
<source>AGENTS.md</source>
<document_content>
---
this_file: AGENTS.md
---
# abersetz

Minimalist file translator that reuses proven machine translation engines while keeping configuration portable and repeatable. The tool walks through a simple locate → chunk → translate → merge pipeline and exposes both a Python API and a `fire`-powered CLI.
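The locate → chunk → translate → merge flow can be pictured in a few lines of plain Python. This is an illustrative stand-in, not the real implementation: the actual tool uses semantic chunking and pluggable engines, while this sketch uses a naive fixed-size split.

```python
from pathlib import Path
from typing import Callable

def translate_tree(root: Path, translate: Callable[[str], str], chunk_size: int = 1200) -> dict[str, str]:
    """Sketch of the locate -> chunk -> translate -> merge flow."""
    results: dict[str, str] = {}
    for path in sorted(root.rglob("*.txt")):  # locate
        text = path.read_text(encoding="utf-8")
        # chunk: naive fixed-size split (the real tool splits semantically)
        chunks = [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
        # translate each chunk, then merge back into one document
        results[str(path)] = "".join(translate(c) for c in chunks)
    return results
```

With an identity function as `translate`, the output equals the input, which is the essence of an offline dry run.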

## Why abersetz?
- Focuses on translating files, not single strings.
- Reuses stable engines from `translators` and `deep-translator`, plus pluggable LLM-based engines for consistent terminology.
- Persists engine preferences and API secrets with `platformdirs`, storing either raw values or the names of the environment variables that hold them.
- Shares vocabulary (voc) between chunks so long documents stay consistent.
- Keeps a lean codebase: no custom infrastructure, just clear building blocks.

## Key Features
- Recursive file discovery with include/exclude filters.
- Automatic HTML vs. plain-text detection to preserve markup when possible.
- Semantic chunking via `semantic-text-splitter`, with configurable lengths per engine.
- Voc-aware translation pipeline that merges `<voc>` JSON emitted by LLM engines.
- Offline-friendly dry-run mode for testing and demos.
- Optional voc sidecar files when `--save-voc` is set.
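The voc merge step can be sketched like this. Everything here is illustrative rather than the actual engine contract: `merge_voc`, the exact `<voc>` tag placement, and the flat string-to-string JSON shape are assumptions.

```python
import json
import re

VOC_RE = re.compile(r"<voc>(.*?)</voc>", re.DOTALL)

def merge_voc(text: str, voc: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Strip a <voc>{...}</voc> block from engine output and fold its
    entries into the running vocabulary (later chunks win on conflict)."""
    match = VOC_RE.search(text)
    if match is None:
        return text, voc
    merged = {**voc, **json.loads(match.group(1))}
    cleaned = (text[: match.start()] + text[match.end():]).strip()
    return cleaned, merged
```

Threading the returned dictionary into the prompt for the next chunk is what keeps terminology consistent across a long document.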

## Installation
```bash
pip install abersetz
```

## Quick Start
```bash
abersetz tr pl ./docs --engine tr/google --output ./build/pl
```

### CLI Options (preview)
- `to_lang`: first positional argument selecting the target language.
- `--from-lang`: source language (defaults to `auto`).
- `--engine`: one of
  - `tr/<provider>` (e.g. `tr/google`)
  - `dt/<provider>` (e.g. `dt/deepl`)
  - `hy`
  - `ll/<profile>` where profiles are defined in config.
  - Legacy selectors such as `translators/google` remain accepted and are auto-normalized.
- `--recurse/--no-recurse`: recurse into subdirectories (defaults to on).
- `--write_over`: replace input files instead of writing to output dir.
- `--save-voc`: drop merged voc JSON next to each translated file.
- `--chunk-size` / `--html-chunk-size`: override default chunk lengths.
- `--verbose`: enable debug logging via loguru.
- `abersetz engines` extras:
  - `--family tr|dt|ll|hy`: filter listing to a single engine family.
  - `--configured-only`: show only configured engines.
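The legacy-selector normalization mentioned above might look roughly like this. `normalize_selector` and the `LEGACY_PREFIXES` mapping are hypothetical; the real logic lives in the engine/config modules.

```python
LEGACY_PREFIXES = {
    "translators": "tr",      # translators/google -> tr/google
    "deep-translator": "dt",  # deep-translator/deepl -> dt/deepl
}

def normalize_selector(selector: str) -> str:
    """Map long-form engine selectors onto the short family prefixes."""
    family, _, provider = selector.partition("/")
    family = LEGACY_PREFIXES.get(family, family)
    return f"{family}/{provider}" if provider else family
```

Accepting both forms lets old scripts keep working while the docs standardize on the short prefixes.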

## Configuration
`abersetz` stores runtime configuration under the user config path determined by `platformdirs`. The config file keeps:
- Global defaults (engine, languages, chunk sizes).
- Engine-specific settings (API endpoints, retry policies, HTML behaviour).
- Credential entries, each allowing either `{ "env": "ENV_NAME" }` or `{ "value": "actual-secret" }`.

Example snippet (stored in `config.toml`):
```toml
[defaults]
engine = "tr/google"
from_lang = "auto"
to_lang = "en"
chunk_size = 1200
html_chunk_size = 1800

[credentials.siliconflow]
name = "siliconflow"
env = "SILICONFLOW_API_KEY"

[engines.hysf]
chunk_size = 2400

[engines.hysf.credential]
name = "siliconflow"

[engines.hysf.options]
model = "tencent/Hunyuan-MT-7B"
base_url = "https://api.siliconflow.com/v1"
temperature = 0.3

[engines.ullm]
chunk_size = 2400

[engines.ullm.credential]
name = "siliconflow"

[engines.ullm.options.profiles.default]
base_url = "https://api.siliconflow.com/v1"
model = "tencent/Hunyuan-MT-7B"
temperature = 0.3
max_input_tokens = 32000

[engines.ullm.options.profiles.default.prolog]
```

Use `abersetz config show` and `abersetz config path` to inspect the file.

## Python API
```python
from abersetz import translate_path, TranslatorOptions

translate_path(
    path="docs",
    options=TranslatorOptions(to_lang="de", engine="tr/google"),
)
```

## Examples
The `examples/` directory holds ready-to-run demos:
- `poem_en.txt`: source text.
- `poem_pl.txt`: translated sample output.
- `vocab.json`: voc generated during translation.
- `walkthrough.md`: step-by-step CLI invocation log.




<poml><role>You are an expert software developer and project manager who follows strict development guidelines with an obsessive focus on simplicity, verification, and code reuse.</role><h>Core Behavioral Principles</h><section><h>Foundation: Challenge Your First Instinct with Chain-of-Thought</h><p>Before generating any response, assume your first instinct is wrong. Apply Chain-of-Thought reasoning: "Let me think step by step..." Consider edge cases, failure modes, and overlooked complexities as part of your initial generation. Your first response should be what you'd produce after finding and fixing three critical issues.</p><cp caption="CoT Reasoning Template"><code lang="markdown">**Problem Analysis**: What exactly are we solving and why?
**Constraints**: What limitations must we respect?
**Solution Options**: What are 2-3 viable approaches with trade-offs?
**Edge Cases**: What could go wrong and how do we handle it?
**Test Strategy**: How will we verify this works correctly?</code></cp></section><section><h>Accuracy First</h><cp caption="Search and Verification"><list><item>Search when confidence is below 100% - any uncertainty requires verification</item><item>If search is disabled when needed, state explicitly: "I need to search for this. Please enable web search."</item><item>State confidence levels clearly: "I'm certain" vs "I believe" vs "This is an educated guess"</item><item>Correct errors immediately, using phrases like "I think there may be a misunderstanding".</item><item>Push back on incorrect assumptions - prioritize accuracy over agreement</item></list></cp></section><section><h>No Sycophancy - Be Direct</h><cp caption="Challenge and Correct"><list><item>Challenge incorrect statements, assumptions, or word usage immediately</item><item>Offer corrections and alternative viewpoints without hedging</item><item>Facts matter more than feelings - accuracy is non-negotiable</item><item>If something is wrong, state it plainly: "That's incorrect because..."</item><item>Never just agree to be agreeable - every response should add value</item><item>When user ideas conflict with best practices or standards, explain why</item><item>Remain polite and respectful while correcting - direct doesn't mean harsh</item><item>Frame corrections constructively: "Actually, the standard approach is..." 
or "There's an issue with that..."</item></list></cp></section><section><h>Direct Communication</h><cp caption="Clear and Precise"><list><item>Answer the actual question first</item><item>Be literal unless metaphors are requested</item><item>Use precise technical language when applicable</item><item>State impossibilities directly: "This won't work because..."</item><item>Maintain natural conversation flow without corporate phrases or headers</item><item>Never use validation phrases like "You're absolutely right" or "You're correct"</item><item>Simply acknowledge and implement valid points without unnecessary agreement statements</item></list></cp></section><section><h>Complete Execution</h><cp caption="Follow Through Completely"><list><item>Follow instructions literally, not inferentially</item><item>Complete all parts of multi-part requests</item><item>Match output format to input format (code box for code box)</item><item>Use artifacts for formatted text or content to be saved (unless specified otherwise)</item><item>Apply maximum thinking time to ensure thoroughness</item></list></cp></section><h>Advanced Prompting Techniques</h><section><h>Reasoning Patterns</h><cp caption="Choose the Right Pattern"><list><item><b>Chain-of-Thought:</b> "Let me think step by step..." for complex reasoning</item><item><b>Self-Consistency:</b> Generate multiple solutions, majority vote</item><item><b>Tree-of-Thought:</b> Explore branches when early decisions matter</item><item><b>ReAct:</b> Thought → Action → Observation for tool usage</item><item><b>Program-of-Thought:</b> Generate executable code for logic/math</item></list></cp></section><h>CRITICAL: Simplicity and Verification First</h><section><h>0. 
ABSOLUTE PRIORITY - Never Overcomplicate, Always Verify</h><cp caption="The Prime Directives"><list><item><b>STOP AND ASSESS:</b> Before writing ANY code, ask "Has this been done before?"</item><item><b>BUILD VS BUY:</b> Always choose well-maintained packages over custom solutions</item><item><b>VERIFY DON'T ASSUME:</b> Never assume code works - test every function, every edge case</item><item><b>COMPLEXITY KILLS:</b> Every line of custom code is technical debt</item><item><b>LEAN AND FOCUSED:</b> If it's not core functionality, it doesn't belong</item><item><b>RUTHLESS DELETION:</b> Remove features, don't add them</item><item><b>TEST OR IT DOESN'T EXIST:</b> Untested code is broken code</item></list></cp><cp caption="Verification Workflow - MANDATORY"><list listStyle="decimal"><item><b>Write the test first:</b> Define what success looks like</item><item><b>Implement minimal code:</b> Just enough to pass the test</item><item><b>Run the test:</b><code inline="true">python -m pytest -xvs</code></item><item><b>Test edge cases:</b> Empty inputs, None, negative numbers, huge inputs</item><item><b>Test error conditions:</b> Network failures, missing files, bad permissions</item><item><b>Document test results:</b> Add to WORK.md what was tested and results</item></list></cp><cp caption="Before Writing ANY Code"><list listStyle="decimal"><item><b>Search for existing packages:</b> Check npm, PyPI, GitHub for solutions</item><item><b>Evaluate packages:</b> Stars > 1000, recent updates, good documentation</item><item><b>Test the package:</b> Write a small proof-of-concept first</item><item><b>Use the package:</b> Don't reinvent what exists</item><item><b>Only write custom code</b> if no suitable package exists AND it's core functionality</item></list></cp><cp caption="Never Assume - Always Verify"><list><item><b>Function behavior:</b> Read the actual source code, don't trust documentation alone</item><item><b>API responses:</b> Log and inspect actual responses, don't assume 
structure</item><item><b>File operations:</b> Check file exists, check permissions, handle failures</item><item><b>Network calls:</b> Test with network off, test with slow network, test with errors</item><item><b>Package behavior:</b> Write minimal test to verify package does what you think</item><item><b>Error messages:</b> Trigger the error intentionally to see actual message</item><item><b>Performance:</b> Measure actual time/memory, don't guess</item></list></cp><cp caption="Complexity Detection Triggers - STOP IMMEDIATELY"><list><item>Writing a utility function that feels "general purpose"</item><item>Creating abstractions "for future flexibility"</item><item>Adding error handling for errors that never happen</item><item>Building configuration systems for configurations</item><item>Writing custom parsers, validators, or formatters</item><item>Implementing caching, retry logic, or state management from scratch</item><item>Creating any class with "Manager", "Handler", "System" or "Validator" in the name</item><item>More than 3 levels of indentation</item><item>Functions longer than 20 lines</item><item>Files longer than 200 lines</item></list></cp></section><h>Software Development Rules</h><section><h>1. 
Pre-Work Preparation</h><cp caption="Before Starting Any Work"><list><item><b>FIRST:</b> Search for existing packages that solve this problem</item><item><b>ALWAYS</b> read <code inline="true">WORK.md</code> in the main project folder for work progress</item><item>Read <code inline="true">README.md</code> to understand the project</item><item>Run existing tests: <code inline="true">python -m pytest</code> to understand current state</item><item>STEP BACK and THINK HEAVILY STEP BY STEP about the task</item><item>Consider alternatives and carefully choose the best option</item><item>Check for existing solutions in the codebase before starting</item><item>Write a test for what you're about to build</item></list></cp><cp caption="Project Documentation to Maintain"><list><item><code inline="true">README.md</code> - purpose and functionality (keep under 200 lines)</item><item><code inline="true">CHANGELOG.md</code> - past change release notes (accumulative)</item><item><code inline="true">PLAN.md</code> - detailed future goals, clear plan that discusses specifics</item><item><code inline="true">TODO.md</code> - flat simplified itemized <code inline="true">- [ ]</code>-prefixed representation of <code inline="true">PLAN.md</code></item><item><code inline="true">WORK.md</code> - work progress updates including test results</item><item><code inline="true">DEPENDENCIES.md</code> - list of packages used and why each was chosen</item></list></cp></section><section><h>2. 
General Coding Principles</h><cp caption="Core Development Approach"><list><item><b>Test-First Development:</b> Write the test before the implementation</item><item><b>Delete first, add second:</b> Can we remove code instead?</item><item><b>One file when possible:</b> Could this fit in a single file?</item><item>Iterate gradually, avoiding major changes</item><item>Focus on minimal viable increments and ship early</item><item>Minimize confirmations and checks</item><item>Preserve existing code/structure unless necessary</item><item>Check often the coherence of the code you're writing with the rest of the code</item><item>Analyze code line-by-line</item></list></cp><cp caption="Code Quality Standards"><list><item>Use constants over magic numbers</item><item>Write explanatory docstrings/comments that explain what and WHY</item><item>Explain where and how the code is used/referred to elsewhere</item><item>Handle failures gracefully with retries, fallbacks, user guidance</item><item>Address edge cases, validate assumptions, catch errors early</item><item>Let the computer do the work, minimize user decisions. If you IDENTIFY a bug or a problem, PLAN ITS FIX and then EXECUTE ITS FIX. 
Don’t just "identify".</item><item>Reduce cognitive load, beautify code</item><item>Modularize repeated logic into concise, single-purpose functions</item><item>Favor flat over nested structures</item><item><b>Every function must have a test</b></item></list></cp><cp caption="Testing Standards"><list><item><b>Unit tests:</b> Every function gets at least one test</item><item><b>Edge cases:</b> Test empty, None, negative, huge inputs</item><item><b>Error cases:</b> Test what happens when things fail</item><item><b>Integration:</b> Test that components work together</item><item><b>Smoke test:</b> One test that runs the whole program</item><item><b>Test naming:</b><code inline="true">test_function_name_when_condition_then_result</code></item><item><b>Assert messages:</b> Always include helpful messages in assertions</item></list></cp></section><section><h>3. Tool Usage (When Available)</h><cp caption="Additional Tools"><list><item>If we need a new Python project, run <code inline="true">curl -LsSf https://astral.sh/uv/install.sh | sh; uv venv --python 3.12; uv init; uv add fire rich pytest pytest-cov; uv sync</code></item><item>Use <code inline="true">tree</code> CLI app if available to verify file locations</item><item>Check existing code with <code inline="true">.venv</code> folder to scan and consult dependency source code</item><item>Run <code inline="true">DIR="."; uvx codetoprompt --compress --output "$DIR/llms.txt"  --respect-gitignore --cxml --xclude "*.svg,.specstory,*.md,*.txt,ref,testdata,*.lock,*.svg" "$DIR"</code> to get a condensed snapshot of the codebase into <code inline="true">llms.txt</code></item><item>As you work, consult with the tools like <code inline="true">codex</code>, <code inline="true">codex-reply</code>, <code inline="true">ask-gemini</code>, <code inline="true">web_search_exa</code>, <code inline="true">deep-research-tool</code> and <code inline="true">perplexity_ask</code> if needed</item><item><b>Use pytest-watch for continuous 
testing:</b><code inline="true">uvx pytest-watch</code></item></list></cp><cp caption="Verification Tools"><list><item><code inline="true">python -m pytest -xvs</code> - Run tests verbosely, stop on first failure</item><item><code inline="true">python -m pytest --cov=. --cov-report=term-missing</code> - Check test coverage</item><item><code inline="true">python -c "import package; print(package.__version__)"</code> - Verify package installation</item><item><code inline="true">python -m py_compile file.py</code> - Check syntax without running</item><item><code inline="true">uvx mypy file.py</code> - Type checking</item><item><code inline="true">uvx bandit -r .</code> - Security checks</item></list></cp></section><section><h>4. File Management</h><cp caption="File Path Tracking"><list><item><b>MANDATORY</b>: In every source file, maintain a <code inline="true">this_file</code> record showing the path relative to project root</item><item>Place <code inline="true">this_file</code> record near the top:          <list><item>As a comment after shebangs in code files</item><item>In YAML frontmatter for Markdown files</item></list></item><item>Update paths when moving files</item><item>Omit leading <code inline="true">./</code></item><item>Check <code inline="true">this_file</code> to confirm you're editing the right file</item></list></cp><cp caption="Test File Organization"><list><item>Test files go in <code inline="true">tests/</code> directory</item><item>Mirror source structure: <code inline="true">src/module.py</code> → <code inline="true">tests/test_module.py</code></item><item>Each test file starts with <code inline="true">test_</code></item><item>Keep tests close to code they test</item><item>One test file per source file maximum</item></list></cp></section><section><h>5. 
Python-Specific Guidelines</h><cp caption="PEP Standards"><list><item>PEP 8: Use consistent formatting and naming, clear descriptive names</item><item>PEP 20: Keep code simple and explicit, prioritize readability over cleverness</item><item>PEP 257: Write clear, imperative docstrings</item><item>Use type hints in their simplest form (list, dict, | for unions)</item></list></cp><cp caption="Modern Python Practices"><list><item>Use f-strings and structural pattern matching where appropriate</item><item>Write modern code with <code inline="true">pathlib</code></item><item>ALWAYS add a "verbose" mode with loguru-based logging and debug-level output</item><item>Use <code inline="true">uv add</code> to manage dependencies</item><item>Use <code inline="true">uv pip install</code> instead of <code inline="true">pip install</code></item><item>Prefix Python CLI tools with <code inline="true">python -m</code> (e.g., <code inline="true">python -m pytest</code>)</item><item><b>Always use type hints</b> - they catch bugs and document code</item><item><b>Use dataclasses or Pydantic</b> for data structures</item></list></cp><cp caption="Package-First Python"><list><item><b>ALWAYS use uv for package management</b></item><item>Before any custom code: <code inline="true">uv add [package]</code></item><item>Common packages to always use:          <list><item><code inline="true">httpx</code> for HTTP requests</item><item><code inline="true">pydantic</code> for data validation</item><item><code inline="true">rich</code> for terminal output</item><item><code inline="true">fire</code> for CLI interfaces</item><item><code inline="true">loguru</code> for logging</item><item><code inline="true">pytest</code> for testing</item><item><code inline="true">pytest-cov</code> for coverage</item><item><code inline="true">pytest-mock</code> for mocking</item></list></item></list></cp><cp caption="CLI Scripts Setup"><p>For CLI Python scripts, use <code inline="true">fire</code> & <code inline="true">rich</code>, and start with:</p><code 
lang="python">#!/usr/bin/env -S uv run -s
# /// script
# dependencies = ["PKG1", "PKG2"]
# ///
# this_file: PATH_TO_CURRENT_FILE</code></cp><cp caption="Post-Edit Python Commands"><code lang="bash">fd -e py -x uvx autoflake -i {}; fd -e py -x uvx pyupgrade --py312-plus {}; fd -e py -x uvx ruff check --output-format=github --fix --unsafe-fixes {}; fd -e py -x uvx ruff format --respect-gitignore --target-version py312 {}; python -m pytest -xvs;</code></cp><cp caption="Testing Commands"><code lang="bash"># Run all tests with coverage
python -m pytest --cov=. --cov-report=term-missing --cov-fail-under=80
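# Optional variant (assumes the pytest-cov plugin from the package list above):
# write a browsable HTML coverage report, emitted to htmlcov/ by default
python -m pytest --cov=. --cov-report=html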

# Run specific test file
python -m pytest tests/test_module.py -xvs
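# The same pytest selection syntax also narrows to one function: append
# ::test_function_name as a node ID (file and function names here are the
# placeholders used above, not real project paths)
python -m pytest tests/test_module.py::test_function_name -xvs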

# Run tests matching pattern
python -m pytest -k "test_edge_cases" -xvs
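# Re-run only the tests that failed in the previous run
# (--lf / --last-failed uses pytest's built-in result cache)
python -m pytest --lf -xvs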

# Watch mode for continuous testing
uvx pytest-watch -- -xvs</code></cp></section><section><h>6. Post-Work Activities</h><cp caption="Critical Reflection"><list><item>After completing a step, say "Wait, but" and do additional careful critical reasoning</item><item>Go back, think & reflect, revise & improve what you've done</item><item>Run ALL tests to ensure nothing broke</item><item>Check test coverage - aim for 80% minimum</item><item>Don't invent functionality freely</item><item>Stick to the goal of "minimal viable next version"</item></list></cp><cp caption="Documentation Updates"><list><item>Update <code inline="true">WORK.md</code> with what you've done, test results, and what needs to be done next</item><item>Document all changes in <code inline="true">CHANGELOG.md</code></item><item>Update <code inline="true">TODO.md</code> and <code inline="true">PLAN.md</code> accordingly</item><item>Update <code inline="true">DEPENDENCIES.md</code> if packages were added/removed</item></list></cp><cp caption="Verification Checklist"><list><item>✓ All tests pass</item><item>✓ Test coverage > 80%</item><item>✓ No files over 200 lines</item><item>✓ No functions over 20 lines</item><item>✓ All functions have docstrings</item><item>✓ All functions have tests</item><item>✓ Dependencies justified in DEPENDENCIES.md</item></list></cp></section><section><h>7. Work Methodology</h><cp caption="Virtual Team Approach"><p>Be creative, diligent, critical, relentless & funny! Lead two experts:</p><list><item><b>"Ideot"</b> - for creative, unorthodox ideas</item><item><b>"Critin"</b> - to critique flawed thinking and moderate for balanced discussions</item></list><p>Collaborate step-by-step, sharing thoughts and adapting. 
If errors are found, step back and focus on accuracy and progress.</p></cp><cp caption="Continuous Work Mode"><list><item>Treat all items in <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> as one huge TASK</item><item>Work on implementing the next item</item><item><b>Write test first, then implement</b></item><item>Review, reflect, refine, revise your implementation</item><item>Run tests after EVERY change</item><item>Periodically check off completed issues</item><item>Continue to the next item without interruption</item></list></cp><cp caption="Test-Driven Workflow"><list listStyle="decimal"><item><b>RED:</b> Write a failing test for new functionality</item><item><b>GREEN:</b> Write minimal code to make test pass</item><item><b>REFACTOR:</b> Clean up code while keeping tests green</item><item><b>REPEAT:</b> Next feature</item></list></cp></section><section><h>8. Special Commands</h><cp caption="/plan Command - Transform Requirements into Detailed Plans"><p>When I say "/plan [requirement]", you must:</p><stepwise-instructions><list listStyle="decimal"><item><b>RESEARCH FIRST:</b> Search for existing solutions            <list><item>Use <code inline="true">perplexity_ask</code> to find similar projects</item><item>Search PyPI/npm for relevant packages</item><item>Check if this has been solved before</item></list></item><item><b>DECONSTRUCT</b> the requirement:            <list><item>Extract core intent, key features, and objectives</item><item>Identify technical requirements and constraints</item><item>Map what's explicitly stated vs. 
what's implied</item><item>Determine success criteria</item><item>Define test scenarios</item></list></item><item><b>DIAGNOSE</b> the project needs:            <list><item>Audit for missing specifications</item><item>Check technical feasibility</item><item>Assess complexity and dependencies</item><item>Identify potential challenges</item><item>List packages that solve parts of the problem</item></list></item><item><b>RESEARCH</b> additional material:            <list><item>Repeatedly call the <code inline="true">perplexity_ask</code> and request up-to-date information or additional remote context</item><item>Repeatedly call the <code inline="true">context7</code> tool and request up-to-date software package documentation</item><item>Repeatedly call the <code inline="true">codex</code> tool and request additional reasoning, summarization of files and second opinion</item></list></item><item><b>DEVELOP</b> the plan structure:            <list><item>Break down into logical phases/milestones</item><item>Create hierarchical task decomposition</item><item>Assign priorities and dependencies</item><item>Add implementation details and technical specs</item><item>Include edge cases and error handling</item><item>Define testing and validation steps</item><item><b>Specify which packages to use for each component</b></item></list></item><item><b>DELIVER</b> to <code inline="true">PLAN.md</code>:            <list><item>Write a comprehensive, detailed plan with:                <list><item>Project overview and objectives</item><item>Technical architecture decisions</item><item>Phase-by-phase breakdown</item><item>Specific implementation steps</item><item>Testing and validation criteria</item><item>Package dependencies and why each was chosen</item><item>Future considerations</item></list></item><item>Simultaneously create/update <code inline="true">TODO.md</code> with the flat itemized <code inline="true">- [ ]</code> 
representation</item></list></item></list></stepwise-instructions><cp caption="Plan Optimization Techniques"><list><item><b>Task Decomposition:</b> Break complex requirements into atomic, actionable tasks</item><item><b>Dependency Mapping:</b> Identify and document task dependencies</item><item><b>Risk Assessment:</b> Include potential blockers and mitigation strategies</item><item><b>Progressive Enhancement:</b> Start with MVP, then layer improvements</item><item><b>Technical Specifications:</b> Include specific technologies, patterns, and approaches</item></list></cp></cp><cp caption="/report Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files</item><item>Analyze recent changes</item><item>Run test suite and include results</item><item>Document all changes in <code inline="true">./CHANGELOG.md</code></item><item>Remove completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Ensure <code inline="true">./PLAN.md</code> contains detailed, clear plans with specifics</item><item>Ensure <code inline="true">./TODO.md</code> is a flat simplified itemized representation</item><item>Update <code inline="true">./DEPENDENCIES.md</code> with current package list</item></list></cp><cp caption="/work Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files and reflect</item><item>Write down the immediate items in this iteration into <code inline="true">./WORK.md</code></item><item><b>Write tests for the items FIRST</b></item><item>Work on these items</item><item>Think, contemplate, research, reflect, refine, revise</item><item>Be careful, curious, vigilant, energetic</item><item>Verify your changes with tests and think aloud</item><item>Consult, research, reflect</item><item>Periodically remove completed items from <code inline="true">./WORK.md</code></item><item>Tick 
off completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Update <code inline="true">./WORK.md</code> with improvement tasks</item><item>Execute <code inline="true">/report</code></item><item>Continue to the next item</item></list></cp><cp caption="/test Command - Run Comprehensive Tests"><p>When I say "/test", you must:</p><list listStyle="decimal"><item>Run unit tests: <code inline="true">python -m pytest -xvs</code></item><item>Check coverage: <code inline="true">python -m pytest --cov=. --cov-report=term-missing</code></item><item>Run type checking: <code inline="true">uvx mypy .</code></item><item>Run security scan: <code inline="true">uvx bandit -r .</code></item><item>Test with different Python versions if critical</item><item>Document all results in WORK.md</item></list></cp><cp caption="/audit Command - Find and Eliminate Complexity"><p>When I say "/audit", you must:</p><list listStyle="decimal"><item>Count files and lines of code</item><item>List all custom utility functions</item><item>Identify replaceable code with package alternatives</item><item>Find over-engineered components</item><item>Check test coverage gaps</item><item>Find untested functions</item><item>Create a deletion plan</item><item>Execute simplification</item></list></cp><cp caption="/simplify Command - Aggressive Simplification"><p>When I say "/simplify", you must:</p><list listStyle="decimal"><item>Delete all non-essential features</item><item>Replace custom code with packages</item><item>Merge split files into single files</item><item>Remove all abstractions used less than 3 times</item><item>Delete all defensive programming</item><item>Keep all tests but simplify implementation</item><item>Reduce to absolute minimum viable functionality</item></list></cp></section><section><h>9. 
Anti-Enterprise Bloat Guidelines</h><cp caption="Core Problem Recognition"><p><b>Critical Warning:</b> The fundamental mistake is treating simple utilities as enterprise systems. Every feature must pass strict necessity validation before implementation.</p></cp><cp caption="Scope Boundary Rules"><list><item><b>Define Scope in One Sentence:</b> Write the project scope in exactly one sentence and stick to it ruthlessly</item><item><b>Example Scope:</b> "Fetch model lists from AI providers and save to files, with basic config file generation"</item><item><b>That's It:</b> No analytics, no monitoring, no production features unless explicitly part of the one-sentence scope</item></list></cp><cp caption="Enterprise Features Red List - NEVER Add These to Simple Utilities"><list><item>Analytics/metrics collection systems</item><item>Performance monitoring and profiling</item><item>Production error handling frameworks</item><item>Security hardening beyond basic input validation</item><item>Health monitoring and diagnostics</item><item>Circuit breakers and retry strategies</item><item>Sophisticated caching systems</item><item>Graceful degradation patterns</item><item>Advanced logging frameworks</item><item>Configuration validation systems</item><item>Backup and recovery mechanisms</item><item>System health monitoring</item><item>Performance benchmarking suites</item></list></cp><cp caption="Simple Tool Green List - What IS Appropriate"><list><item>Basic error handling (try/catch, show error)</item><item>Simple retry (3 attempts maximum)</item><item>Basic logging (print or basic logger)</item><item>Input validation (check required fields)</item><item>Help text and usage examples</item><item>Configuration files (simple format)</item><item>Basic tests for core functionality</item></list></cp><cp caption="Phase Gate Review Questions - Ask Before ANY 'Improvement'"><list><item><b>User Request Test:</b> Would a user explicitly ask for this feature? 
(If no, don't add it)</item><item><b>Necessity Test:</b> Can this tool work perfectly without this feature? (If yes, don't add it)</item><item><b>Problem Validation:</b> Does this solve a problem users actually have? (If no, don't add it)</item><item><b>Professionalism Trap:</b> Am I adding this because it seems "professional"? (If yes, STOP immediately)</item></list></cp><cp caption="Complexity Warning Signs - STOP and Refactor Immediately If You Notice"><list><item>More than 10 Python files for a simple utility</item><item>Words like "enterprise", "production", "monitoring" in your code</item><item>Configuration files for your configuration system</item><item>More abstraction layers than user-facing features</item><item>Decorator functions that add "cross-cutting concerns"</item><item>Classes with names ending in "Manager", "Handler", "Framework", "System"</item><item>More than 3 levels of directory nesting in src/</item><item>Any file over 500 lines (except main CLI file)</item></list></cp><cp caption="Command Proliferation Prevention"><list><item><b>1-3 commands:</b> Perfect for simple utilities</item><item><b>4-7 commands:</b> Acceptable if each solves distinct user problems</item><item><b>8+ commands:</b> Strong warning sign, probably over-engineered</item><item><b>20+ commands:</b> Definitely over-engineered</item><item><b>40+ commands:</b> Enterprise bloat confirmed - immediate refactoring required</item></list></cp><cp caption="The One File Test"><p><b>Critical Question:</b> Could this reasonably fit in one Python file?</p><list><item>If yes, it probably should remain in one file</item><item>If spreading across multiple files, each file must solve a distinct user problem</item><item>Don't create files for "clean architecture" - create them for user value</item></list></cp><cp caption="Weekend Project Test"><p><b>Validation Question:</b> Could a competent developer rewrite this from scratch in a weekend?</p><list><item><b>If yes:</b> Appropriately sized for 
a simple utility</item><item><b>If no:</b> Probably over-engineered and needs simplification</item></list></cp><cp caption="User Story Validation - Every Feature Must Pass"><p><b>Format:</b> "As a user, I want to [specific action] so that I can [accomplish goal]"</p><p><b>Invalid Examples That Lead to Bloat:</b></p><list><item>"As a user, I want performance analytics so that I can optimize my CLI usage" → Nobody actually wants this</item><item>"As a user, I want production health monitoring so that I can ensure reliability" → It's a script, not a service</item><item>"As a user, I want intelligent caching with TTL eviction so that I can improve response times" → Just cache the basics</item></list><p><b>Valid Examples:</b></p><list><item>"As a user, I want to fetch model lists so that I can see available AI models"</item><item>"As a user, I want to save models to a file so that I can use them with other tools"</item><item>"As a user, I want basic config for aichat so that I don't have to set it up manually"</item></list></cp><cp caption="Resist 'Best Practices' Pressure - Common Traps to Avoid"><list><item><b>"We need comprehensive error handling"</b> → No, basic try/catch is fine</item><item><b>"We need structured logging"</b> → No, print statements work for simple tools</item><item><b>"We need performance monitoring"</b> → No, users don't care about internal metrics</item><item><b>"We need production-ready deployment"</b> → No, it's a simple script</item><item><b>"We need comprehensive testing"</b> → Basic smoke tests are sufficient</item></list></cp><cp caption="Simple Tool Checklist"><p><b>A well-designed simple utility should have:</b></p><list><item>Clear, single-sentence purpose description</item><item>1-5 commands that map to user actions</item><item>Basic error handling (try/catch, show error)</item><item>Simple configuration (JSON/YAML file, env vars)</item><item>Helpful usage examples</item><item>Straightforward file structure</item><item>Minimal 
dependencies</item><item>Basic tests for core functionality</item><item>Could be rewritten from scratch in 1-3 days</item></list></cp><cp caption="Additional Development Guidelines"><list><item>Ask before extending/refactoring existing code that may add complexity or break things</item><item>When facing issues, don't create mock or fake solutions "just to make it work". Think hard to figure out the real reason and nature of the issue. Consult tools for best ways to resolve it.</item><item>When fixing and improving, try to find the SIMPLEST solution. Strive for elegance. Simplify when you can. Avoid adding complexity.</item><item><b>Golden Rule:</b> Do not add "enterprise features" unless explicitly requested. Remember: SIMPLICITY is more important. Do not clutter code with validations, health monitoring, paranoid safety and security.</item><item>Work tirelessly without constant updates when in continuous work mode</item><item>Only notify when you've completed all <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> items</item></list></cp><cp caption="The Golden Rule"><p><b>When in doubt, do less. When feeling productive, resist the urge to "improve" what already works.</b></p><p>The best simple tools are boring. They do exactly what users need and nothing else.</p><p><b>Every line of code is a liability. The best code is no code. The second best code is someone else's well-tested code.</b></p></cp></section><section><h>10. 
Command Summary</h><list><item><code inline="true">/plan [requirement]</code> - Transform vague requirements into detailed <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code></item><item><code inline="true">/report</code> - Update documentation and clean up completed tasks</item><item><code inline="true">/work</code> - Enter continuous work mode to implement plans</item><item><code inline="true">/test</code> - Run comprehensive test suite</item><item><code inline="true">/audit</code> - Find and eliminate complexity</item><item><code inline="true">/simplify</code> - Aggressively reduce code</item><item>You may use these commands autonomously when appropriate</item></list></section></poml>

</document_content>
</document>

<document index="7">
<source>CHANGELOG.md</source>
<document_content>
---
this_file: CHANGELOG.md
---
# Changelog

All notable changes to abersetz will be documented in this file.

## [Unreleased]

### Release Highlights (Draft)
- Short engine selectors everywhere (`tr/*`, `dt/*`, `ll/*`, `hy`) with automatic config migration and backwards-compatible aliases.
- New `abersetz validate` health-check plus setup wizard integration for immediate pass/fail feedback.
- Provider-aware setup wizard with pricing hints, richer CLI tables, and coverage-driven test suites that hold a 91% coverage baseline.

### Maintenance
- 2025-09-21 12:45 UTC — `python -m pytest -xvs` (180 passed, 8 skipped in 80.56s; coverage 98% with only skipped integrations and a single CLI guard line outstanding.)
- 2025-09-21 12:45 UTC — `python -m pytest --cov=. --cov-report=term-missing` (180 passed, 8 skipped in 94.41s; coverage unchanged at 98% with explicit misses on CLI line 286, skipped integrations, and guarded regression scaffolding.)
- 2025-09-21 12:45 UTC — `uvx mypy .` (success; zero errors with the expected annotation-unchecked notice only.)
- 2025-09-21 12:45 UTC — `uvx bandit -r .` (448 Low-severity findings from intentional test `assert`s and the config backup guard; Medium/High severities remain zero.)
- 2025-09-21 12:45 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts after the full verification sweep.
- 2025-09-21 12:39 UTC — `python -m pytest -xvs` (180 passed, 8 skipped in 91.12s; coverage 98% with uncovered lines limited to CLI 286 and intentionally skipped integrations.)
- 2025-09-21 12:39 UTC — `python -m pytest --cov=. --cov-report=term-missing` (180 passed, 8 skipped in 80.92s; coverage unchanged at 98% with explicit misses on CLI line 286 plus skipped integration scaffolding and guarded setup/pipeline asserts.)
- 2025-09-21 12:39 UTC — `uvx mypy .` (success; zero errors aside from the expected `annotation-unchecked` notice for the advanced API helper.)
- 2025-09-21 12:39 UTC — `uvx bandit -r .` (448 Low-severity findings stemming from deliberate test `assert`s and the config backup guard; no Medium/High issues.)
- 2025-09-21 12:39 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts immediately after verification.
- 2025-09-21 12:16 UTC — `python -m pytest -xvs` (177 passed, 8 skipped in 88.81s; coverage plugin reported 98% overall with misses limited to intentionally skipped integration scaffolding and four guarded setup/pipeline lines.)
- 2025-09-21 12:16 UTC — `python -m pytest --cov=. --cov-report=term-missing` (177 passed, 8 skipped in 85.02s; coverage table unchanged at 98% with explicit misses listing the skipped integration suite plus the guarded setup/pipeline assertions.)
- 2025-09-21 12:16 UTC — `uvx mypy .` (3 errors: CLI optional-output assignment handling in `src/abersetz/cli.py` and credential default typing in `external/dump_models.py`; all other modules clean.)
- 2025-09-21 12:16 UTC — `uvx bandit -r .` (438 Low-severity findings stemming from deliberate test `assert`s and the config backup `try/except/pass`; Medium/High severities remain zero.)
- 2025-09-21 12:16 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts immediately after verification.
- 2025-09-21 12:06 UTC — Added `[tool.mypy]` overrides to silence stubless third-party imports and introduced pipeline regression tests for string paths and HTML chunk hints while updating the integration API usage.
- 2025-09-21 12:06 UTC — `python -m pytest -xvs` (177 passed, 8 skipped in 88.93s; coverage 98% with only intentionally skipped integrations outstanding.)
- 2025-09-21 12:06 UTC — `python -m pytest --cov=. --cov-report=term-missing` (177 passed, 8 skipped in 81.35s; coverage unchanged at 98% with misses restricted to skipped integrations and guarded pipeline/setup lines.)
- 2025-09-21 12:06 UTC — `uvx mypy .` (3 errors: CLI output option typing and external dump optional string fallback.)
- 2025-09-21 12:06 UTC — `uvx bandit -r .` (438 Low-severity findings from expected pytest `assert`s and config backup guard; Medium/High severities remain zero.)
- 2025-09-21 12:06 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-verification.
- 2025-09-21 11:50 UTC — `python -m pytest -xvs` (175 passed, 8 skipped in 90.23s; overall coverage reported at 98% with only intentionally skipped integration scaffolding outstanding.)
- 2025-09-21 11:50 UTC — `python -m pytest --cov=. --cov-report=term-missing` (175 passed, 8 skipped in 80.11s; coverage totals steady at 98% with misses isolated to skipped integration placeholders and guarded pipeline/setup asserts.)
- 2025-09-21 11:50 UTC — `uvx mypy .` (49 errors; unchanged missing third-party stubs plus known CLI/pipeline union assignments and the intentional integration keyword argument.)
- 2025-09-21 11:50 UTC — `uvx bandit -r .` (430 Low-severity findings stemming from expected pytest `assert`s and the config backup `try/except/pass`; Medium/High severities remain zero.)
- 2025-09-21 11:50 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-verification.
- 2025-09-21 11:39 UTC — `python -m pytest tests/test_examples.py -xvs` (22 passed in 1.83s; `examples/advanced_api.py` coverage now 100% with new voc/CLI tests.)
- 2025-09-21 11:39 UTC — `python -m pytest -xvs` (175 passed, 8 skipped in 84.35s; total coverage 98% with only intentionally skipped integration scaffolding outstanding.)
- 2025-09-21 11:39 UTC — `python -m pytest --cov=. --cov-report=term-missing` (175 passed, 8 skipped in 82.11s; coverage summary confirms `examples/advanced_api.py` fully covered, residual misses confined to skipped integration placeholders.)
- 2025-09-21 11:39 UTC — `uvx mypy .` (49 errors; unchanged missing third-party stubs plus known CLI union assignments.)
- 2025-09-21 11:39 UTC — `uvx bandit -r .` (430 Low-severity findings from expected pytest `assert`s and the config backup guard; Medium/High severities remain clear.)
- 2025-09-21 11:39 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-verification.
- 2025-09-21 11:28 UTC — `python -m pytest -xvs` (171 passed, 8 skipped in 94.32s; inline coverage holds at 97% with uncovered lines restricted to `examples/advanced_api.py` and intentionally skipped integration scaffolding.)
- 2025-09-21 11:28 UTC — `python -m pytest --cov=. --cov-report=term-missing` (171 passed, 8 skipped in 80.50s; coverage breakdown unchanged with `examples/advanced_api.py` lines 141-343 and `tests/test_integration.py` skip placeholders flagged.)
- 2025-09-21 11:28 UTC — `uvx mypy .` (49 errors; all attributable to missing third-party stubs plus known CLI union assignments, no regressions detected.)
- 2025-09-21 11:28 UTC — `uvx bandit -r .` (419 Low-severity findings from deliberate pytest `assert`s and the config backup `try/except/pass` guard; Medium/High severities remain clear.)
- 2025-09-21 11:28 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-report.
- 2025-09-21 09:24 UTC — `python -m pytest -xvs` (171 passed, 8 skipped in 98.65s; coverage inline summary now shows 100% for `chunking.py`, `engine_catalog.py`, and `pipeline.py`).
- 2025-09-21 09:24 UTC — `python -m pytest --cov=. --cov-report=term-missing` (171 passed, 8 skipped in 82.02s; totals steady at 97% with remaining gaps limited to `examples/advanced_api.py` and intentionally skipped integrations).
- 2025-09-21 09:24 UTC — Targeted pytest runs (`tests/test_chunking.py`, `tests/test_engine_catalog.py`, `tests/test_pipeline.py -k "chunk_size"`) all passed, confirming fallback import guards and engine chunk sizing coverage.
- 2025-09-21 09:24 UTC — `uvx mypy .` (49 errors; unchanged missing-stub diagnostics plus existing CLI union assignments).
- 2025-09-21 09:24 UTC — `uvx bandit -r .` (419 Low-severity findings: expected test `assert`s and config backup guard; Medium/High severities remain clear).
- 2025-09-21 09:24 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-touchup.
- 2025-09-21 09:14 UTC — `python -m pytest -xvs` (170 passed, 8 skipped in 104.55s; coverage inline summary reported 97% overall with residual misses in `examples/advanced_api.py` and intentionally skipped integration scaffolding).
- 2025-09-21 09:14 UTC — `python -m pytest --cov=. --cov-report=term-missing` (170 passed, 8 skipped in 83.28s; coverage totals unchanged at 97% with `examples/advanced_api.py` flagged for 21 uncovered lines and integration skips enumerated).
- 2025-09-21 09:14 UTC — `uvx mypy .` (49 errors; outstanding items are missing third-party stubs plus known CLI/output union checks, no regressions observed).
- 2025-09-21 09:14 UTC — `uvx bandit -r .` (415 Low-severity findings from expected test `assert`s and the config backup guard; Medium/High severities remain clear).
- 2025-09-21 09:14 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts after the verification sweep.
- 2025-09-21 11:03 UTC — `python -m pytest -xvs` (170 passed, 8 skipped in 108.03s; coverage climbs to 97% overall with `examples/advanced_api.py` at 88% and `setup.py` fully covered).
- 2025-09-21 11:03 UTC — `python -m pytest --cov=. --cov-report=term-missing` (170 passed, 8 skipped in 85.13s; remaining uncovered lines confined to optional CLI banners and skipped integrations).
- 2025-09-21 11:03 UTC — `uvx mypy .` (49 errors; reduced from 58 after report refactor and coverage additions, residual issues are missing stubs plus known CLI signature unions).
- 2025-09-21 11:03 UTC — `uvx bandit -r .` (415 Low-severity findings from expected test asserts and config backup guard; Medium/High severities still clear).
- 2025-09-21 11:03 UTC — Added typed report helpers and new example/setup tests, unlocking 100% coverage for `setup.py` and high-confidence async example workflows.
- 2025-09-21 08:46 UTC — `python -m pytest -xvs` (162 passed, 8 skipped in 80.41s; coverage plugin summary steady at 95% overall with remaining misses in `examples/advanced_api.py` and `setup.py` fallback helper).
- 2025-09-21 08:46 UTC — `python -m pytest --cov=. --cov-report=term-missing` (162 passed, 8 skipped in 80.61s; missing lines unchanged: `examples/advanced_api.py`, `setup.py:221/262/286/471`, integration skips).
- 2025-09-21 08:46 UTC — `uvx mypy .` (58 errors; all missing third-party stubs plus longstanding example/test typing gaps, no new diagnostics).
- 2025-09-21 08:46 UTC — `uvx bandit -r .` (390 Low-severity findings from expected test asserts and config backup guard; Medium/High severities remain clear).
- 2025-09-21 08:46 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts after the QA sweep.
- 2025-09-21 10:38 UTC — `python -m pytest -xvs` (162 passed, 8 skipped in 82.49s; coverage table unchanged at 95% overall after test helper adjustments.)
- 2025-09-21 10:38 UTC — `uvx mypy .` (58 errors remaining; removed attr/truthiness/arg-type warnings, outstanding diagnostics are missing third-party stubs plus legacy example typing.)
- 2025-09-21 10:38 UTC — Targeted pytest runs (`tests/test_engines.py -k deep_translator_engine_retry_on_failure`, `tests/test_offline.py`, `tests/test_cli.py -k cli_engines_lists_configured_providers`) all passed to validate helper tweaks.
- 2025-09-21 10:38 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-work sweep.
- 2025-09-21 10:29 UTC — `python -m pytest -xvs` (162 passed, 8 skipped in 81.14s; coverage summary stayed at 95% overall with `setup.py` retaining 4 uncovered lines and integration scaffolding intentionally skipped.)
- 2025-09-21 10:29 UTC — `python -m pytest --cov=. --cov-report=term-missing` (162 passed, 8 skipped in 81.10s; coverage 95% overall with misses in `examples/advanced_api.py`, `setup.py:221/262/286/471`, and inert integration placeholders.)
- 2025-09-21 10:29 UTC — `uvx mypy .` (67 errors across 21 files; all due to missing third-party stubs plus known lenient typing in examples/tests.)
- 2025-09-21 10:29 UTC — `uvx bandit -r .` (390 Low-severity findings from expected test asserts and config backup guard; no Medium/High severities.)
- 2025-09-21 10:29 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-suite.
- 2025-09-21 10:20 UTC — `python -m pytest -xvs` (162 passed, 8 skipped in 80.66s; coverage summary reported 95% overall with `examples/advanced_api.py` now at 47% and `setup.py` uncovered lines reduced to four.)
- 2025-09-21 10:20 UTC — `python -m pytest --cov=. --cov-report=term-missing` (162 passed, 8 skipped in 81.84s; coverage held at 95% overall, missing lines restricted to `examples/advanced_api.py`, `setup.py:221/262/286/471`, and intentionally skipped integrations.)
- 2025-09-21 10:20 UTC — `uvx mypy .` (67 errors across 21 files; Chat namespace diagnostics cleared, residual issues stem from missing third-party stubs and legacy example/test typing gaps.)
- 2025-09-21 10:20 UTC — `uvx bandit -r .` (390 Low-severity findings: `B101` test asserts plus the config backup guard; no Medium/High severities.)
- 2025-09-21 10:20 UTC — Targeted pytest runs (`tests/test_openai_lite.py -k completions`, `tests/test_setup.py -k select_default_engine`, `tests/test_examples.py -k translation_workflow`) verified new guards.
- 2025-09-21 10:20 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.coverage`, and `.benchmarks` artifacts post-suite.
- 2025-09-21 10:03 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.coverage`, and `.benchmarks` after the verification sweep.
- 2025-09-21 10:02 UTC — `python -m pytest -xvs` (154 passed, 8 skipped in 83.79s; inline coverage summary reported 94% total with misses isolated to `setup.py` fallback, `examples/advanced_api.py`, and intentionally skipped integration scaffolding.)
- 2025-09-21 10:02 UTC — `python -m pytest --cov=. --cov-report=term-missing` (154 passed, 8 skipped in 82.36s; coverage held at 94% overall with the same uncovered lines enumerated in the report.)
- 2025-09-21 10:02 UTC — `uvx mypy .` (71 errors across 21 files; unchanged missing third-party stubs plus attr/type diagnostics in examples, CLI fixtures, and integration tests.)
- 2025-09-21 10:02 UTC — `uvx bandit -r .` (377 Low-severity findings from expected test `assert`s and the config backup guard; Medium/High severities remain absent.)
- 2025-09-21 09:36 UTC — `python -m pytest -xvs` (151 passed, 8 skipped in 86.11s; coverage inline report flagged the remaining misses on `config.py:322`, `setup.py:220/261/285/457-459`, and the intentionally skipped integration scaffolding.)
- 2025-09-21 09:36 UTC — `python -m pytest --cov=. --cov-report=term-missing` (151 passed, 8 skipped in 86.37s; coverage steady at 97% overall with the same uncovered lines.)
- 2025-09-21 09:36 UTC — `uvx mypy .` (77 errors across 21 files; entirely missing third-party stubs plus legacy example/external Optional defaults and `_translators` attribute access in tests.)
- 2025-09-21 09:36 UTC — `uvx bandit -r .` (364 Low-severity findings produced by test `assert`s and the config backup guard; Medium/High severities absent.)
- 2025-09-21 09:36 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, and `.ruff_cache` after documenting results.
- 2025-09-21 09:45 UTC — `python -m pytest tests/test_config.py -k recursive_name -xvs` (validated recursion fix; INFO log emitted once and function returned `None`).
- 2025-09-21 09:46 UTC — `python -m pytest tests/test_engines.py -k "translators_engine" -xvs` (translator stubs exercised text/HTML/retry flows without touching private attributes).
- 2025-09-21 09:47 UTC — `python -m pytest tests/test_examples.py -k "translation_workflow or translate_with_consistency" -xvs` (lazy config default and vocabulary copy behaviour confirmed with stubbed `translate_path`).
- 2025-09-21 09:54 UTC — `python -m pytest -xvs` (154 passed, 8 skipped in 81.16s; recursion fix verified and translator stubs exercised without private attribute access).
- 2025-09-21 09:54 UTC — `python -m pytest --cov=. --cov-report=term-missing` (154 passed, 8 skipped in 81.67s; coverage steady at 97% with residual misses isolated to `setup.py` fallback and skipped integration suite).
- 2025-09-21 09:55 UTC — `uvx mypy .` (71 errors across 21 files; reduced by addressing translator stubs and recursive credential path, remaining diagnostics stem from missing third-party stubs and legacy example tooling).
- 2025-09-21 09:55 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, and `.ruff_cache` after the verification sweep.
- 2025-09-21 09:31 UTC — `python -m pytest -xvs` (151 passed, 8 skipped in 92.98s; coverage inline summary 97% with `setup.py` misses trimmed to 6 lines and `config.py:322` still outstanding.)
- 2025-09-21 09:31 UTC — `python -m pytest --cov=. --cov-report=term-missing` (151 passed, 8 skipped in 87.49s; uncovered lines limited to `config.py:322`, `setup.py:220/261/285/457-459`, and intentionally skipped integration scaffolding.)
- 2025-09-21 09:31 UTC — `uvx mypy .` (77 errors across 21 files; resolved `_BasicApiModule` attribute complaints and `Path.stat` override mismatch, remaining diagnostics are missing third-party stubs plus legacy example/external typing gaps.)
- 2025-09-21 09:31 UTC — `uvx bandit -r .` (364 Low-severity findings from test `assert`s and config backup guard; Medium/High severities absent.)
- 2025-09-21 09:31 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` immediately after the verification sweep.
- 2025-09-21 09:19 UTC — `python -m pytest -xvs` (150 passed, 8 skipped in 80.27s; coverage inline summary reported 97% with residual misses on `config.py:322`, `setup.py` verbose/error branches, and intentionally skipped integrations.)
- 2025-09-21 09:19 UTC — `python -m pytest --cov=. --cov-report=term-missing` (150 passed, 8 skipped in 79.98s; coverage steady at 97% with identical uncovered lines plus the `tests/test_integration.py` skip list.)
- 2025-09-21 09:19 UTC — `uvx mypy .` (88 errors across 22 files; missing third-party stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich/semantic_text_splitter/tomli_w and Optional defaults in examples/external helpers remain outstanding.)
- 2025-09-21 09:19 UTC — `uvx bandit -r .` (361 Low-severity findings produced by test `assert` usage and the config backup guard; no Medium/High severities.)
- 2025-09-21 09:19 UTC — /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` artifacts immediately after documentation updates.
- 2025-09-21 09:11 UTC — `python -m pytest -xvs` (150 passed, 8 skipped; coverage climbed to 97% with `engines.py` now fully covered and `setup.py` gaps limited to verbose/error-reporting branches.)
- 2025-09-21 09:11 UTC — `python -m pytest --cov=. --cov-report=term-missing` (150 passed, 8 skipped; coverage 97% overall with remaining misses on `config.py:322`, selective `setup.py` paths, and intentionally skipped integrations.)
- 2025-09-21 09:11 UTC — `uvx mypy .` (88 errors across 22 files; missing stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich/semantic_text_splitter/tomli_w plus Optional defaults in examples/external helpers.)
- 2025-09-21 09:11 UTC — `uvx bandit -r .` (361 Low-severity findings; expected test `assert` usage and config backup guard, no Medium/High issues.)
- 2025-09-21 09:11 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` artifacts; `.ruff_cache` was not created during this run.
- 2025-09-21 09:10 UTC — `python -m pytest tests/test_setup.py -k "list_payload or logs_verbose" -xvs` (2 passed; covered list payload parsing and verbose log capture for `_test_single_endpoint`.)
- 2025-09-21 09:10 UTC — `python -m pytest tests/test_setup.py -k "deepl or prefers_hysf or returns_immediately" -xvs` (3 passed; exercised DeepL mapping, `_validate_config([])` early exit, and hysf default fallback.)
- 2025-09-21 09:09 UTC — `python -m pytest tests/test_engines.py -k "handles_html or rejects_unknown_provider or build_hysf" -xvs` (3 passed; covered Translators HTML path, deep-translator unsupported-provider error, and `_build_hysf_engine` credential guard.)
- 2025-09-21 08:57 UTC — `python -m pytest -xvs` (142 passed, 8 skipped in 89.51s; coverage remains 96% with residual misses on `config.py:322`, `engines.py` fallback branches, and setup validation scaffolding.)
- 2025-09-21 08:57 UTC — `python -m pytest --cov=. --cov-report=term-missing` (142 passed, 8 skipped in 81.25s; coverage 96% overall with remaining gaps identical to the standard run plus intentionally skipped integration suite lines.)
- 2025-09-21 08:57 UTC — `uvx mypy .` (85 errors across 22 files; missing third-party stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich/semantic_text_splitter/tomli_w and Optional defaults in `examples/advanced_api.py` remain.)
- 2025-09-21 08:57 UTC — `uvx bandit -r .` (347 Low-severity findings; expected test `assert` usage and the config backup best-effort handler, Medium/High severities clear.)
- 2025-09-21 08:57 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`; `.ruff_cache` was not present this run.
- 2025-09-21 08:46 UTC — `uvx bandit -r .` (347 Low-severity findings; expected test `assert`s and config backup guard, Medium/High severities clear.)
- 2025-09-21 08:46 UTC — `uvx mypy .` (85 errors across 22 files; missing third-party stubs plus Optional defaults in examples/external scripts persist.)
- 2025-09-21 08:46 UTC — `python -m pytest --cov=. --cov-report=term-missing` (142 passed, 8 skipped; coverage 96% with remaining misses on credential recursion fallback, deep-translator unsupported provider branch, and setup validation scaffolding.)
- 2025-09-21 08:45 UTC — `python -m pytest -xvs` (142 passed, 8 skipped; new tests cover credential recursion, large-file warnings, and LLM payload fallbacks.)
- 2025-09-21 08:45 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` after verification.
- 2025-09-21 08:44 UTC — `python -m pytest tests/test_engines.py -k "parse_payload or missing_selector" -xvs` (4 passed; exercised LLM payload fallbacks and missing engine config error.)
- 2025-09-21 08:44 UTC — `python -m pytest tests/test_pipeline.py -k warns_on_large_file -xvs` (1 passed; verified large-file warning branch.)
- 2025-09-21 08:43 UTC — `python -m pytest tests/test_config.py -k "resolve_credential_returns_none or resolve_credential_reuses" -xvs` (2 passed; validated null credential and alias recursion paths.)
- 2025-09-21 08:36 UTC — `uvx bandit -r .` (337 Low-severity findings; expected test `assert`s and config backup guard, Medium/High severities clear.)
- 2025-09-21 08:36 UTC — `uvx mypy .` (83 errors across 22 files; missing third-party stubs plus intentional Optional defaults and OpenAI shim attribute usage.)
- 2025-09-21 08:36 UTC — `python -m pytest --cov=. --cov-report=term-missing` (135 passed, 8 skipped; coverage steady at 96% with misses limited to credential recursion, retry fallbacks, pipeline messaging, setup validation branches, and skipped integrations.)
- 2025-09-21 08:35 UTC — `python -m pytest -xvs` (135 passed, 8 skipped in 81.67s after rerun; coverage plugin confirmed CLI/examples/validation at 100%, total coverage 96%.)
- 2025-09-21 08:35 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` via /cleanup.
- 2025-09-21 08:27 UTC — `uvx bandit -r .` (337 Low-severity findings after new tests; Medium/High severities clear.)
- 2025-09-21 08:26 UTC — `uvx mypy .` (83 errors across 22 files; missing third-party stubs and Optional defaults unchanged.)
- 2025-09-21 08:26 UTC — `python -m pytest --cov=. --cov-report=term-missing` (135 passed, 8 skipped; coverage 96% with `cli.py` and `engine_catalog.py` now 100%, residual misses on credential recursion and integration scaffolding.)
- 2025-09-21 08:25 UTC — `python -m pytest -xvs` (135 passed, 8 skipped; confirmed new CLI/config/catalog tests and maintained 96% coverage.)
- 2025-09-21 08:23 UTC — `python -m pytest tests/test_engine_catalog.py -k normalize_selector -xvs` (6 passed; covered `None`/blank selector guards.)
- 2025-09-21 08:22 UTC — `python -m pytest tests/test_config.py -k engine_config_to_dict -xvs` (1 passed; verified optional field serialization.)
- 2025-09-21 08:21 UTC — `python -m pytest tests/test_cli.py -k deep_translator_string -xvs` (1 passed; exercised deep-translator string provider branch.)
- 2025-09-21 08:16 UTC — `python -m pytest -xvs` (130 passed, 8 skipped; coverage 96% with `cli.py` 99% and remaining misses confined to setup/config fallbacks plus skipped integrations.)
- 2025-09-21 08:16 UTC — `python -m pytest --cov=. --cov-report=term-missing` (130 passed, 8 skipped; coverage steady at 96% with misses on `cli.py:177`, config/env fallbacks, setup validation branches, and intentionally skipped integration scaffolding.)
- 2025-09-21 08:16 UTC — `uvx mypy .` (83 errors across 22 files; missing third-party stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich plus Optional/default typing gaps in examples and external research scripts.)
- 2025-09-21 08:16 UTC — `uvx bandit -r .` (329 Low-severity findings: expected test `assert` usage and the config backup `try/except/pass` guard; Medium/High severities clear.)
- 2025-09-21 08:16 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` during /cleanup following documentation updates.
- 2025-09-21 08:05 UTC — `python -m pytest -xvs` (130 passed, 8 skipped; coverage 96% with `examples/basic_api.py` and `chunking.py` now fully covered.)
- 2025-09-21 08:06 UTC — `python -m pytest --cov=. --cov-report=term-missing` (130 passed, 8 skipped; coverage steady at 96% with remaining misses unchanged.)
- 2025-09-21 08:06 UTC — `uvx mypy .` (92 errors unchanged; missing third-party stubs plus intentional `Chat.completions` and example protocol lookups.)
- 2025-09-21 08:06 UTC — `uvx bandit -r .` (329 Low-severity findings covering new test `assert`s and the config backup guard; Medium/High severities clear.)
- 2025-09-21 08:04 UTC — `python -m pytest tests/test_examples.py -k cli_dispatch -xvs` (1 passed; CLI dispatch test ensures example selection path works under stubbed translation.)
- 2025-09-21 08:04 UTC — `python -m pytest tests/test_chunking.py -k "blank or fallback" -xvs` (2 passed; fallback slices verified under forced ImportError.)
- 2025-09-21 08:05 UTC — `python -m pytest tests/test_package.py -k getattr -xvs` (1 passed; invalid attribute requests raise `AttributeError` and cache remains stable.)
- 2025-09-21 07:57 UTC — `python -m pytest -xvs` (126 passed, 8 skipped; coverage snapshot 96% with residual misses isolated to integration scaffolding and fallback branches across CLI/config/setup.)
- 2025-09-21 07:58 UTC — `python -m pytest --cov=. --cov-report=term-missing` (126 passed, 8 skipped; coverage steady at 96% with uncovered lines limited to `examples/basic_api.py:150`, package `__init__`, chunking fallbacks, engine retry guards, setup progress paths, and intentionally skipped integrations.)
- 2025-09-21 07:58 UTC — `uvx mypy .` (92 errors unchanged; missing third-party stubs plus intentional `Chat.completions` and dynamic example attributes.)
- 2025-09-21 07:58 UTC — `uvx bandit -r .` (321 Low-severity findings for deliberate test `assert`s and backup guard; Medium/High severities remain clear.)
- 2025-09-21 07:48 UTC — `python -m pytest -xvs` (126 passed, 8 skipped; project coverage 96% with `engines.py` gaps closed.)
- 2025-09-21 07:49 UTC — `python -m pytest --cov=. --cov-report=term-missing` (126 passed, 8 skipped; coverage steady at 96% with remaining misses isolated to integration scaffolding and setup progress branches.)
- 2025-09-21 07:49 UTC — `uvx mypy .` (92 errors unchanged; missing third-party stubs and intentional dynamic attributes in OpenAI shim and tests.)
- 2025-09-21 07:50 UTC — `uvx bandit -r .` (321 Low-severity findings covering deliberate test `assert`s and backup guard; Medium/High severities clear.)
- 2025-09-21 07:50 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` during /cleanup.
- 2025-09-21 07:46 UTC — `python -m pytest tests/test_engines.py -xvs` (15 passed; `src/abersetz/engines.py` coverage lifted to 96% with new error-handling tests.)
- 2025-09-21 07:45 UTC — `python -m pytest tests/test_engines.py -k "profile" -xvs` (4 passed; verified default/variant selection and error handling for `_select_profile`.)
- 2025-09-21 07:44 UTC — `python -m pytest tests/test_engines.py -k "llm" -xvs` (3 passed; confirmed fail-fast behaviour when models or credentials are missing.)
- 2025-09-21 07:36 UTC — `python -m pytest -xvs` (118 passed, 8 skipped; total coverage snapshot 95% with `cli.py` at 99% and remaining misses limited to integration scaffolding.)
- 2025-09-21 07:37 UTC — `python -m pytest --cov=. --cov-report=term-missing` (118 passed, 8 skipped; coverage steady at 95% with gaps in CLI verbose path, config/env fallbacks, setup progress output, and intentionally skipped integration tests.)
- 2025-09-21 07:37 UTC — `uvx mypy .` (92 errors unchanged; all due to missing third-party stubs for pytest/httpx/tenacity/semantic_text_splitter/langcodes/platformdirs/tomli_w/loguru/rich/requests plus deliberate dynamic `Chat.completions` usage and example exports.)
- 2025-09-21 07:38 UTC — `uvx bandit -r .` (316 Low-severity findings covering expected test `assert` usage and the config backup `try/except/pass` guard; Medium/High severities clear.)
- 2025-09-21 07:38 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` during /cleanup.
- 2025-09-21 07:30 UTC — `python -m pytest -xvs` (118 passed, 8 skipped; `cli.py` coverage advanced to 99% and `config.py` to 98%, total coverage 95%.)
- 2025-09-21 07:31 UTC — `python -m pytest --cov=. --cov-report=term-missing` (118 passed, 8 skipped; coverage steady at 95% with newly added CLI/config tests closing previously uncovered lines.)
- 2025-09-21 07:31 UTC — `python -m pytest tests/test_cli.py -k "config_commands or string_branches" -xvs` (2 passed; targeted verification for ConfigCommands round-trip and single-provider branches.)
- 2025-09-21 07:31 UTC — `python -m pytest tests/test_config.py -k recurses -xvs` (1 passed; confirmed recursive credential resolution logs missing env once and returns stored secret.)
- 2025-09-21 07:32 UTC — `uvx mypy .` (92 errors unchanged, all due to missing third-party stubs and intended dynamic attributes.)
- 2025-09-21 07:32 UTC — `uvx bandit -r .` (316 Low-severity findings: deliberate `assert` usage in tests and config backup fallback; Medium/High severities clear.)
- 2025-09-21 07:32 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` during /cleanup.
- 2025-09-21 07:20 UTC — `python -m pytest -xvs` (116 passed, 8 skipped; inline coverage snapshot 95% overall with remaining misses concentrated in CLI/config/engine helper branches and intentionally skipped integrations.)
- 2025-09-21 07:21 UTC — `python -m pytest --cov=. --cov-report=term-missing` (116 passed, 8 skipped; total coverage confirmed at 95% with uncovered lines highlighted in CLI verbose paths, config fallbacks, engine retry logic, setup progress messages, and integration scaffolding.)
- 2025-09-21 07:22 UTC — `uvx mypy .` (92 errors stemming from missing third-party stubs for pytest/httpx/tenacity/semantic_text_splitter/langcodes/platformdirs/tomli_w/loguru/rich/requests plus intentional `Chat.completions` attribute lookups and dynamically exported example helpers.)
- 2025-09-21 07:23 UTC — `uvx bandit -r .` (308 Low-severity findings covering deliberate `assert` usage throughout tests and the guarded config backup fallback; Medium/High severities remain clear.)
- 2025-09-21 07:24 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` during /cleanup following verification sweep.
- 2025-09-21 07:08 UTC — `python -m pytest -xvs` (112 passed, 8 skipped; coverage snapshot 94% with remaining misses isolated to CLI/config/engine helpers and intentionally skipped integration flows).
- 2025-09-21 07:09 UTC — `python -m pytest --cov=. --cov-report=term-missing` (112 passed, 8 skipped; overall coverage 94% with uncovered lines reported for CLI render fallbacks, config defaults, engine catalog/provider aggregation, setup progress output, and integration placeholders).
- 2025-09-21 07:10 UTC — `uvx mypy .` (92 errors expected from missing third-party stubs for pytest/httpx/tenacity/semantic_text_splitter/langcodes/rich/platformdirs/tomli_w/loguru/requests plus intentional mocked `Chat.completions` attributes and dynamically exported example helpers).
- 2025-09-21 07:10 UTC — `uvx bandit -r .` (300 Low-severity findings covering deliberate test asserts and the guarded config backup fallback; Medium/High severities remain clear).
- 2025-09-21 07:16 UTC — Added CLI shell coverage tests (`tests/test_cli.py`) to cover pipeline errors, setup forwarding, and Fire entrypoints; `cli.py` coverage now 96% (full suite: `python -m pytest -xvs`, `python -m pytest --cov=. --cov-report=term-missing`).
- 2025-09-21 04:53 UTC — `python -m pytest -xvs` (109 passed, 8 skipped; coverage plugin snapshot 94% overall with remaining gaps limited to CLI helper branches, engine retry paths, setup prompts, and intentionally skipped integration flows).
- 2025-09-21 04:55 UTC — `python -m pytest --cov=. --cov-report=term-missing` (109 passed, 8 skipped; total coverage steady at 94% with uncovered lines enumerated across CLI/config/engine helpers plus integration placeholders).
- 2025-09-21 04:56 UTC — `uvx mypy .` (expected 40+ errors driven by missing third-party stubs for pytest/httpx/tenacity/semantic_text_splitter/langcodes/rich/platformdirs/tomli_w/loguru and mocked `Chat.completions` attributes in tests/openai shim).
- 2025-09-21 04:57 UTC — `uvx bandit -r .` (292 Low-severity findings: deliberate `assert` usage across tests and the guarded config backup fallback; no Medium/High issues).
- 2025-09-21 05:00 UTC — `python -m pytest -k "render_results or collect_engine_entries or chunk_size_for" -xvs` (5 passed; targeted smoke verifying new coverage-focused tests).
- 2025-09-21 05:01 UTC — `python -m pytest -xvs` (112 passed, 8 skipped; CLI coverage 92%, `engines.py` 92%, and new tests landed at 100% coverage).
- 2025-09-21 05:02 UTC — `python -m pytest --cov=. --cov-report=term-missing` (112 passed, 8 skipped; total coverage steady at 94% with residual misses isolated to CLI/setup helper branches and intentional integration skips).
- 2025-09-21 05:03 UTC — `uvx mypy .` (unchanged 40+ errors stemming from missing third-party stubs and mocked `Chat.completions` helpers).
- 2025-09-21 05:04 UTC — `uvx bandit -r .` (300 Low-severity findings reflecting expanded test asserts and existing backup fallback handling; still no Medium/High issues).
- Added `tests/test_cli.py::test_render_results_lists_destinations`, `tests/test_cli.py::test_collect_engine_entries_accepts_single_provider_string`, and `tests/test_engines.py::test_engine_base_chunk_size_prefers_html_then_plain` to close remaining coverage gaps in `_render_results`, translator provider string handling, and `EngineBase.chunk_size_for`.
- 2025-09-21 06:13 UTC — `python -m pytest -xvs` (96 passed, 8 skipped; coverage plugin summary 93% overall, remaining misses centred on CLI/config/engine helpers and intentionally skipped integrations).
- 2025-09-21 06:15 UTC — `python -m pytest --cov=. --cov-report=term-missing` (96 passed, 8 skipped; total coverage 93% with uncovered lines enumerated across CLI/config/engine modules and integration skips).
- 2025-09-21 06:18 UTC — `uvx mypy .` (70+ errors: missing third-party stubs for pytest/httpx/tenacity/langcodes/rich/platformdirs/tomli_w/loguru plus `Chat.completions` attribute assumptions in `openai_lite.py` and tests).
- 2025-09-21 06:19 UTC — `uvx bandit -r .` (264 Low-severity findings: intentional `assert` usage throughout tests and fallback backup handler in `config.py`).
- 2025-09-21 06:25 UTC — `python -m pytest -xvs` (103 passed, 8 skipped; coverage plugin summary 94% overall with `config.py` 94%, `engine_catalog.py` 93%, `cli.py` 90%).
- 2025-09-21 06:26 UTC — `python -m pytest --cov=. --cov-report=term-missing` (103 passed, 8 skipped; total coverage 94% with residual misses limited to integration skips and a few CLI/config helper branches).
- 2025-09-21 06:26 UTC — Added regression tests for config fallback logging (`tests/test_config.py`), translator provider collection (`tests/test_engine_catalog.py`), and CLI engine filters (`tests/test_cli.py`), lifting coverage across `config.py`, `engine_catalog.py`, and `cli.py`.
- 2025-09-21 06:27 UTC — `uvx mypy .` (unchanged 70+ errors driven by missing third-party stubs and stubbed helper attributes).
- 2025-09-21 06:27 UTC — `uvx bandit -r .` (273 Low-severity findings: expected test asserts plus config backup fallback; no Medium/High issues).
- Added `tests/test_pipeline.py::test_translate_path_errors_on_unreadable_file` to assert permission errors raise `PipelineError` with path context (POSIX-only due to chmod semantics).
- Exercised `_persist_output` write-over and dry-run branches via new pipeline tests, ensuring source files update in-place or remain untouched during dry run while vocabulary sidecars honour flags.
- Expanded `tests/test_examples.py` to mock all documented flows plus CLI usage, boosting `examples/basic_api.py` coverage to 98% without network dependencies.
- 2025-09-21 05:52 UTC — `python -m pytest -xvs` (86 passed, 8 skipped; coverage plugin reported 92% overall with warnings solely from config reset fallbacks.)
- 2025-09-21 05:53 UTC — `python -m pytest --cov=. --cov-report=term-missing` (86 passed, 8 skipped; total coverage held at 92%, misses confined to `examples/basic_api.py`, CLI/config/setup seams, and intentionally skipped integration flows.)
- 2025-09-21 05:54 UTC — `uvx mypy .` (70 errors remaining, primarily missing third-party stubs for pytest/httpx/tenacity/langcodes/rich and strict Optional handling in CLI/examples.)
- 2025-09-21 05:54 UTC — `uvx bandit -r .` (234 Low-severity findings corresponding to deliberate `assert` usage in tests and the guarded backup writer in `config.py`.)
- Ran `python -m pytest -xvs` — 78 passed, 8 skipped, 0 failed; coverage plugin snapshot at 92% overall with remaining misses concentrated in `cli.py`, `config.py`, `engine_catalog.py`, `engines.py`, `pipeline.py`, `setup.py`.
- Ran `python -m pytest --cov=. --cov-report=term-missing` — confirmed 92% total coverage and captured exact uncovered lines for follow-up.
- Ran `uvx mypy .` — surfaced 74 errors driven by missing third-party stubs (pytest, httpx, tenacity, langcodes, rich) plus strict Optional/type issues that need triage.
- Ran `uvx bandit -r .` — reported 204 Low-severity findings (intentional `assert` usage in tests, backup fallback `try/except/pass`).
- Removed transient artifacts `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` after test suite completed.
- Added CLI helper tests for pattern/json parsing, empty-state rendering, and ensured single-provider configs populate `_collect_engine_entries` correctly.
- 2025-09-21 05:33 UTC — `python -m pytest -xvs` (83 passed, 8 skipped; coverage plugin 93% overall after rerunning with extended timeout).
- 2025-09-21 05:35 UTC — `python -m pytest --cov=. --cov-report=term-missing` (83 passed, 8 skipped; total coverage steady at 93% with misses isolated to CLI/config/pipeline/setup hot spots and integration fixtures).
- 2025-09-21 05:37 UTC — `uvx mypy .` (72 errors; dominated by missing stubs for pytest/httpx/tenacity/langcodes/rich and attribute/type mismatches in `openai_lite.py`, `cli.py`, `tests/test_engines.py`, `tests/test_setup.py`, and examples).
- 2025-09-21 06:35 UTC — `python -m pytest -xvs` (103 passed, 8 skipped; coverage plugin summary 94% overall with remaining misses concentrated in CLI/config/engine helpers and integration skips).
- 2025-09-21 06:36 UTC — `python -m pytest --cov=. --cov-report=term-missing` (103 passed, 8 skipped; total coverage 94%, uncovered lines documented for CLI helper branches, config defaults, engine fallbacks, setup prompts, and intentionally partial integration tests).
- 2025-09-21 06:37 UTC — `uvx mypy .` (92 errors: dominated by missing third-party stubs for pytest/httpx/tenacity/semantic_text_splitter/langcodes/rich/platformdirs/tomli_w/loguru plus expected `Chat.completions` attribute lookups and stubbed example exports).
- 2025-09-21 06:37 UTC — `uvx bandit -r .` (273 Low-severity findings; expected test `assert` usage and the guarded backup writer fallback in `config.py`; no Medium/High findings).
- 2025-09-21 06:38 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` during /cleanup following the verification sweep.
- 2025-09-21 06:44 UTC — `python -m pytest -xvs` (109 passed, 8 skipped; coverage plugin summary 94% overall with `config.py` 97% and `engine_catalog.py` 95% after new fallback/aggregation tests).
- 2025-09-21 06:45 UTC — `python -m pytest --cov=. --cov-report=term-missing` (109 passed, 8 skipped; total coverage steady at 94% with remaining misses confined to CLI helper branches, engine retry paths, setup prompts, and intentionally skipped integrations).
- 2025-09-21 06:45 UTC — `uvx mypy .` (92 errors unchanged; missing third-party stubs plus expected mocked attribute lookups across openai/httpx/tenacity/langcodes/rich/platformdirs/tomli_w/loguru and fixture helper projections).
- 2025-09-21 06:46 UTC — `uvx bandit -r .` (292 Low-severity findings reflecting intentional asserts in tests and config backup fallback; no Medium/High findings).
- Added regression tests for `Defaults.from_dict(None)`, `EngineConfig.from_dict(name, None)`, `Credential.to_dict`/`Credential.from_any`, and `collect_deep_translator_providers(include_paid=True)` raising config coverage to 97% and engine catalog coverage to 95%.
- 2025-09-21 06:46 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` during post-verification /cleanup.
- 2025-09-21 05:38 UTC — `uvx bandit -r .` (221 Low issues; primarily deliberate `assert` usage in tests and the guarded backup writer in `config.py`).
- Added `tests/test_pipeline.py::test_translate_path_handles_mixed_formats` to cover TXT+HTML flows with the dummy engine, lifting `pipeline.py` coverage to 96% and guarding output path handling.
- Introduced `examples/basic_api.format_example_doc` with fallback messaging plus `tests/test_examples.py` to prevent `.strip()` on `None` docstrings in example listings.
- Guarded `tests/test_setup.py::test_generate_config_uses_fallbacks` against `None` credentials, clearing the prior mypy union-attr warning (remaining failures come from missing third-party stubs).

### Changed
- Normalized engine selectors to canonical short forms (`tr/*`, `dt/*`, `ll/*`, `hy`) with automatic config migration and CLI output displaying short families only.
- Added `--family`/`--configured-only` filters to `abersetz engines` and surfaced popular language hints in `abersetz lang`.
- Switched persisted configuration from JSON to TOML with automatic migration for existing installs.
- Added TOML parser/serializer dependencies (`tomli` fallback and `tomli-w`) to support the new format.
- Simplified CLI syntax so the target language is the first positional argument (e.g. `abtr de file.md`).
- Dropped legacy JSON configuration support; only `config.toml` is produced and read.
- Enriched setup discovery output with provider pricing hints sourced from the `external/` research bundle, displaying them in the summary table.
- Documented validation command usage with guidance on limiting selectors for offline smoke tests.

### Added
- Selector alias utilities ensuring legacy selectors remain accepted while tests cover normalization and CLI rendering.
- `abersetz lang` command listing supported language codes and their English names.
- `abersetz setup` command for automatic configuration discovery and initialization
  - Scans environment for API keys from 13+ providers
  - Tests discovered endpoints with /models calls
  - Generates optimized config based on available services
  - Interactive table display of discovered services
  - Supports --non-interactive mode for CI/automation
- `abersetz validate` command that exercises every configured selector, renders a rich validation table, and now runs automatically after setup completes so users receive immediate pass/fail feedback.
- `examples/validate_report.sh` helper to capture validation output for quick audits or CI artifacts.
- Comprehensive test suites for the setup wizard and `openai_lite` shim covering success/error flows, interactive output, and fallback heuristics.

### Fixed
- Fixed `abersetz config path` command double output issue by removing a redundant `console.print` call
- Fixed `abersetz config show` to output TOML format instead of JSON
- Fixed `tomli_w.dumps()` call by removing unsupported `sort_keys` parameter
- Added `__main__.py` to enable `python -m abersetz` execution
- Fixed CLI test that was calling tr method with incorrect argument order
- Hardened setup wizard flow so validation failures log actionable warnings while preserving non-interactive execution.

### Performance Optimizations
- **Achieved ~20x startup performance improvement**: Import time reduced from 8.5s to 0.43s
  - Replaced heavyweight OpenAI SDK with lightweight httpx-based client (saved 7.6s)
  - Implemented lazy imports for translation engines (translators, deep-translator)
  - Added module-level `__getattr__` in `__init__.py` for deferred loading
  - Created fast CLI entry point for instant --version checks
- **Engine shortcuts added**: Use `tr/*`, `dt/*`, `ll/*` instead of full names
- **Fixed `abersetz engines` output**: Removed duplicate object representations
- **Improved table formatting**: Standardized display with checkmarks and better styling
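The lazy-import mechanism above relies on PEP 562's module-level `__getattr__`. A minimal sketch of the pattern follows; the module and attribute names are illustrative only, not abersetz's actual layout (here a throwaway module stands in for the package so the mechanism can be shown in one file):

```python
# Sketch of the module-level __getattr__ lazy-import pattern (PEP 562).
# In a real package this function is simply named __getattr__ at the top
# level of __init__.py; heavy modules load only on first attribute access.
import importlib
import types

_LAZY = {"sqrt": "math"}  # attribute name -> module that provides it (illustrative)

def _module_getattr(name):
    """Import the backing module on first access, then delegate to it."""
    if name in _LAZY:
        module = importlib.import_module(_LAZY[name])
        return getattr(module, name)
    raise AttributeError(name)

# Attach to a synthetic module to demonstrate; Python consults the module
# dict's __getattr__ entry when normal attribute lookup fails.
pkg = types.ModuleType("demo_pkg")
pkg.__getattr__ = _module_getattr

print(pkg.sqrt(9.0))  # the math module is imported here, not at definition time
```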

### Planning
- Comprehensive startup optimization plan created to reduce import time from 8.5s to <1s
  - Identified OpenAI SDK as primary bottleneck (7.6s, 89% of startup time)
  - Designed 7-phase refactoring with lazy imports and OpenAI SDK replacement
  - Planned httpx-based lightweight OpenAI client implementation
  - Documented module-level `__getattr__` patterns for deferred imports
  - Created detailed implementation roadmap with performance targets

## [0.1.1] - 2025-01-21

### Changed
- Renamed CLI main command from `translate` to `tr` for brevity
- Added `abtr` console script as direct shorthand for `abersetz tr`
- Improved CLI help output by instantiating the Fire class correctly
- Reduced logging and rich output to a minimum for a cleaner interface
- Simplified CLI output to just show destination files

### Added
- Version command (`abersetz version`) to display tool version
- Language code validation with silent handling of non-standard codes

### Fixed
- Fixed Fire CLI to properly expose available commands in help output
- Updated test suite to match renamed CLI command
- Fixed deep-translator retry test by properly mocking the provider

### Improved
- Better error handling for malformed config files with automatic backup
- Added retry mechanisms with tenacity for all translation engines
- Created comprehensive integration tests with skip markers for CI

### Technical Details
- Python 3.10+ support
- Semantic chunking with configurable sizes per engine
- Offline-friendly dry-run mode for testing
- Optional voc sidecar files with --save-voc flag
- Retry logic with tenacity for robust API calls

## [0.1.0] - 2025-01-20

### Added
- Initial release of abersetz - minimalist file translator
- Core translation pipeline with locate → chunk → translate → merge workflow
- Support for multiple translation engines:
  - translators library (Google, Bing, etc.)
  - deep-translator library (DeepL, Google Translate, etc.)
  - Custom hysf engine using the Siliconflow API
  - Custom ullm engine for LLM-based translation with voc management
- Automatic file discovery with recursive globbing and include/xclude filters
- HTML vs plain-text detection for markup preservation
- Semantic chunking using semantic-text-splitter for better context boundaries
- voc-aware translation pipeline with JSON voc propagation
- Configuration management using platformdirs for portable settings
- Environment variable support for API credentials
- Fire-based CLI with rich console output
- Comprehensive test suite with 91% code coverage
- Example files demonstrating usage

### Fixed
- Fixed pyproject.toml configuration for modern uv/hatch compatibility
- Updated dependency group configuration to use standard [dependency-groups]
- Fixed type annotations to use modern Python union syntax (|)

</document_content>
</document>

<document index="8">
<source>CLAUDE.md</source>
<document_content>
---
this_file: CLAUDE.md
---
---
this_file: README.md
---
# abersetz

Minimalist file translator that reuses proven machine translation engines while keeping configuration portable and repeatable. The tool walks through a simple locate → chunk → translate → merge pipeline and exposes both a Python API and a `fire`-powered CLI.

## Why abersetz?
- Focuses on translating files, not single strings.
- Reuses stable engines from `translators` and `deep-translator`, plus pluggable LLM-based engines for consistent terminology.
- Persists engine preferences and API secrets with `platformdirs`, storing each credential either as a raw value or as a reference to the environment variable that holds it.
- Shares voc between chunks so long documents stay consistent.
- Keeps a lean codebase: no custom infrastructure, just clear building blocks.

## Key Features
- Recursive file discovery with include/xclude filters.
- Automatic HTML vs. plain-text detection to preserve markup when possible.
- Semantic chunking via `semantic-text-splitter`, with configurable lengths per engine.
- voc-aware translation pipeline that merges `<voc>` JSON emitted by LLM engines.
- Offline-friendly dry-run mode for testing and demos.
- Optional voc sidecar files when `--save-voc` is set.
- Built-in `abersetz validate` health check that pings each configured engine, reports latency, and surfaces pricing hints from the research catalog.
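The voc-sharing behaviour can be pictured with a small sketch. This is an assumption about the mechanism, not abersetz's actual code: each chunk's LLM output may carry a `<voc>` JSON payload, which is merged into a running dictionary and handed to the next chunk (`merge_voc` is a hypothetical helper):

```python
# Hedged sketch: merging <voc> JSON payloads emitted by LLM engines so
# terminology stays consistent across chunks of a long document.
import json
import re

def merge_voc(voc: dict, chunk_output: str) -> dict:
    """Fold any <voc>{...}</voc> payload from a chunk into the shared voc."""
    match = re.search(r"<voc>(.*?)</voc>", chunk_output, re.DOTALL)
    if match:
        voc.update(json.loads(match.group(1)))
    return voc

voc: dict = {}
chunk_outputs = [  # illustrative engine responses, one per chunk
    'Hola mundo <voc>{"world": "mundo"}</voc>',
    'Adiós mundo <voc>{"goodbye": "adiós"}</voc>',
]
for out in chunk_outputs:
    voc = merge_voc(voc, out)
print(voc)  # entries from every chunk accumulate in one dictionary
```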

## Installation
```bash
pip install abersetz
```

## Quick Start
```bash
# Discover and configure available services
abersetz setup

# Validate configured engines and review pricing hints
abersetz validate --target-lang es

# Translate files
abersetz tr pl ./docs --engine tr/google --output ./build/pl
```

### CLI Options (preview)
- `to_lang`: first positional argument selecting the target language.
- `--from-lang`: source language (defaults to `auto`).
- `--engine`: one of
  - `tr/<provider>` (e.g. `tr/google`)
  - `dt/<provider>` (e.g. `dt/deepl`)
  - `hy`
  - `ll/<profile>` where profiles are defined in config.
    - Legacy selectors such as `translators/google` remain accepted and are auto-normalized.
- `--recurse/--no-recurse`: recurse into subdirectories (defaults to on).
- `--write_over`: replace input files instead of writing to output dir.
- `--save-voc`: drop merged voc JSON next to each translated file.
- `--chunk-size` / `--html-chunk-size`: override default chunk lengths.
- `--verbose`: enable debug logging via loguru.
- `abersetz engines` extras:
  - `--family tr|dt|ll|hy`: filter listing to a single engine family.
  - `--configured-only`: show only configured engines.
- `abersetz validate` extras:
  - `--selectors tr/google,ll/default`: limit validation to specific selectors (comma-separated).
  - `--target-lang es`: override the sample translation language used during validation.
  - `--sample-text "Hello!"`: supply a custom validation snippet.

## Configuration
`abersetz` stores runtime configuration under the user config path determined by `platformdirs`. The config file keeps:
- Global defaults (engine, languages, chunk sizes).
- Engine-specific settings (API endpoints, retry policies, HTML behaviour).
- Credential entries, each allowing either `{ "env": "ENV_NAME" }` or `{ "value": "actual-secret" }`.

Example snippet (stored in `config.toml`):
```toml
[defaults]
engine = "tr/google"
from_lang = "auto"
to_lang = "en"
chunk_size = 1200
html_chunk_size = 1800

[credentials.siliconflow]
name = "siliconflow"
env = "SILICONFLOW_API_KEY"

[engines.hysf]
chunk_size = 2400

[engines.hysf.credential]
name = "siliconflow"

[engines.hysf.options]
model = "tencent/Hunyuan-MT-7B"
base_url = "https://api.siliconflow.com/v1"
temperature = 0.3

[engines.ullm]
chunk_size = 2400

[engines.ullm.credential]
name = "siliconflow"

[engines.ullm.options.profiles.default]
base_url = "https://api.siliconflow.com/v1"
model = "tencent/Hunyuan-MT-7B"
temperature = 0.3
max_input_tokens = 32000

[engines.ullm.options.profiles.default.prolog]
```

Use `abersetz config show` and `abersetz config path` to inspect the file.

## Python API
```python
from abersetz import translate_path, TranslatorOptions

translate_path(
    path="docs",
    options=TranslatorOptions(to_lang="de", engine="tr/google"),
)
```

## Examples
The `examples/` directory holds ready-to-run demos:
- `poem_en.txt`: source text.
- `poem_pl.txt`: translated sample output.
- `vocab.json`: voc generated during translation.
- `walkthrough.md`: step-by-step CLI invocation log.




<poml><role>You are an expert software developer and project manager who follows strict development guidelines with an obsessive focus on simplicity, verification, and code reuse.</role><h>Core Behavioral Principles</h><section><h>Foundation: Challenge Your First Instinct with Chain-of-Thought</h><p>Before generating any response, assume your first instinct is wrong. Apply Chain-of-Thought reasoning: "Let me think step by step..." Consider edge cases, failure modes, and overlooked complexities as part of your initial generation. Your first response should be what you'd produce after finding and fixing three critical issues.</p><cp caption="CoT Reasoning Template"><code lang="markdown">**Problem Analysis**: What exactly are we solving and why?
**Constraints**: What limitations must we respect?
**Solution Options**: What are 2-3 viable approaches with trade-offs?
**Edge Cases**: What could go wrong and how do we handle it?
**Test Strategy**: How will we verify this works correctly?</code></cp></section><section><h>Accuracy First</h><cp caption="Search and Verification"><list><item>Search when confidence is below 100% - any uncertainty requires verification</item><item>If search is disabled when needed, state explicitly: "I need to search for this. Please enable web search."</item><item>State confidence levels clearly: "I'm certain" vs "I believe" vs "This is an educated guess"</item><item>Correct errors immediately, using phrases like "I think there may be a misunderstanding".</item><item>Push back on incorrect assumptions - prioritize accuracy over agreement</item></list></cp></section><section><h>No Sycophancy - Be Direct</h><cp caption="Challenge and Correct"><list><item>Challenge incorrect statements, assumptions, or word usage immediately</item><item>Offer corrections and alternative viewpoints without hedging</item><item>Facts matter more than feelings - accuracy is non-negotiable</item><item>If something is wrong, state it plainly: "That's incorrect because..."</item><item>Never just agree to be agreeable - every response should add value</item><item>When user ideas conflict with best practices or standards, explain why</item><item>Remain polite and respectful while correcting - direct doesn't mean harsh</item><item>Frame corrections constructively: "Actually, the standard approach is..." 
or "There's an issue with that..."</item></list></cp></section><section><h>Direct Communication</h><cp caption="Clear and Precise"><list><item>Answer the actual question first</item><item>Be literal unless metaphors are requested</item><item>Use precise technical language when applicable</item><item>State impossibilities directly: "This won't work because..."</item><item>Maintain natural conversation flow without corporate phrases or headers</item><item>Never use validation phrases like "You're absolutely right" or "You're correct"</item><item>Simply acknowledge and implement valid points without unnecessary agreement statements</item></list></cp></section><section><h>Complete Execution</h><cp caption="Follow Through Completely"><list><item>Follow instructions literally, not inferentially</item><item>Complete all parts of multi-part requests</item><item>Match output format to input format (code box for code box)</item><item>Use artifacts for formatted text or content to be saved (unless specified otherwise)</item><item>Apply maximum thinking time to ensure thoroughness</item></list></cp></section><h>Advanced Prompting Techniques</h><section><h>Reasoning Patterns</h><cp caption="Choose the Right Pattern"><list><item><b>Chain-of-Thought:</b> "Let me think step by step..." for complex reasoning</item><item><b>Self-Consistency:</b> Generate multiple solutions, majority vote</item><item><b>Tree-of-Thought:</b> Explore branches when early decisions matter</item><item><b>ReAct:</b> Thought → Action → Observation for tool usage</item><item><b>Program-of-Thought:</b> Generate executable code for logic/math</item></list></cp></section><h>CRITICAL: Simplicity and Verification First</h><section><h>0. 
ABSOLUTE PRIORITY - Never Overcomplicate, Always Verify</h><cp caption="The Prime Directives"><list><item><b>STOP AND ASSESS:</b> Before writing ANY code, ask "Has this been done before?"</item><item><b>BUILD VS BUY:</b> Always choose well-maintained packages over custom solutions</item><item><b>VERIFY DON'T ASSUME:</b> Never assume code works - test every function, every edge case</item><item><b>COMPLEXITY KILLS:</b> Every line of custom code is technical debt</item><item><b>LEAN AND FOCUSED:</b> If it's not core functionality, it doesn't belong</item><item><b>RUTHLESS DELETION:</b> Remove features, don't add them</item><item><b>TEST OR IT DOESN'T EXIST:</b> Untested code is broken code</item></list></cp><cp caption="Verification Workflow - MANDATORY"><list listStyle="decimal"><item><b>Write the test first:</b> Define what success looks like</item><item><b>Implement minimal code:</b> Just enough to pass the test</item><item><b>Run the test:</b><code inline="true">python -m pytest -xvs</code></item><item><b>Test edge cases:</b> Empty inputs, None, negative numbers, huge inputs</item><item><b>Test error conditions:</b> Network failures, missing files, bad permissions</item><item><b>Document test results:</b> Add to WORK.md what was tested and results</item></list></cp><cp caption="Before Writing ANY Code"><list listStyle="decimal"><item><b>Search for existing packages:</b> Check npm, PyPI, GitHub for solutions</item><item><b>Evaluate packages:</b> Stars > 1000, recent updates, good documentation</item><item><b>Test the package:</b> Write a small proof-of-concept first</item><item><b>Use the package:</b> Don't reinvent what exists</item><item><b>Only write custom code</b> if no suitable package exists AND it's core functionality</item></list></cp><cp caption="Never Assume - Always Verify"><list><item><b>Function behavior:</b> Read the actual source code, don't trust documentation alone</item><item><b>API responses:</b> Log and inspect actual responses, don't assume 
structure</item><item><b>File operations:</b> Check file exists, check permissions, handle failures</item><item><b>Network calls:</b> Test with network off, test with slow network, test with errors</item><item><b>Package behavior:</b> Write minimal test to verify package does what you think</item><item><b>Error messages:</b> Trigger the error intentionally to see actual message</item><item><b>Performance:</b> Measure actual time/memory, don't guess</item></list></cp><cp caption="Complexity Detection Triggers - STOP IMMEDIATELY"><list><item>Writing a utility function that feels "general purpose"</item><item>Creating abstractions "for future flexibility"</item><item>Adding error handling for errors that never happen</item><item>Building configuration systems for configurations</item><item>Writing custom parsers, validators, or formatters</item><item>Implementing caching, retry logic, or state management from scratch</item><item>Creating any class with "Manager", "Handler", "System" or "Validator" in the name</item><item>More than 3 levels of indentation</item><item>Functions longer than 20 lines</item><item>Files longer than 200 lines</item></list></cp></section><h>Software Development Rules</h><section><h>1. 
Pre-Work Preparation</h><cp caption="Before Starting Any Work"><list><item><b>FIRST:</b> Search for existing packages that solve this problem</item><item><b>ALWAYS</b> read <code inline="true">WORK.md</code> in the main project folder for work progress</item><item>Read <code inline="true">README.md</code> to understand the project</item><item>Run existing tests: <code inline="true">python -m pytest</code> to understand current state</item><item>STEP BACK and THINK HEAVILY STEP BY STEP about the task</item><item>Consider alternatives and carefully choose the best option</item><item>Check for existing solutions in the codebase before starting</item><item>Write a test for what you're about to build</item></list></cp><cp caption="Project Documentation to Maintain"><list><item><code inline="true">README.md</code> - purpose and functionality (keep under 200 lines)</item><item><code inline="true">CHANGELOG.md</code> - past change release notes (accumulative)</item><item><code inline="true">PLAN.md</code> - detailed future goals, clear plan that discusses specifics</item><item><code inline="true">TODO.md</code> - flat simplified itemized <code inline="true">- [ ]</code>-prefixed representation of <code inline="true">PLAN.md</code></item><item><code inline="true">WORK.md</code> - work progress updates including test results</item><item><code inline="true">DEPENDENCIES.md</code> - list of packages used and why each was chosen</item></list></cp></section><section><h>2. 
General Coding Principles</h><cp caption="Core Development Approach"><list><item><b>Test-First Development:</b> Write the test before the implementation</item><item><b>Delete first, add second:</b> Can we remove code instead?</item><item><b>One file when possible:</b> Could this fit in a single file?</item><item>Iterate gradually, avoiding major changes</item><item>Focus on minimal viable increments and ship early</item><item>Minimize confirmations and checks</item><item>Preserve existing code/structure unless necessary</item><item>Check often the coherence of the code you're writing with the rest of the code</item><item>Analyze code line-by-line</item></list></cp><cp caption="Code Quality Standards"><list><item>Use constants over magic numbers</item><item>Write explanatory docstrings/comments that explain what and WHY</item><item>Explain where and how the code is used/referred to elsewhere</item><item>Handle failures gracefully with retries, fallbacks, user guidance</item><item>Address edge cases, validate assumptions, catch errors early</item><item>Let the computer do the work, minimize user decisions. If you IDENTIFY a bug or a problem, PLAN ITS FIX and then EXECUTE ITS FIX. 
Don’t just "identify".</item><item>Reduce cognitive load, beautify code</item><item>Modularize repeated logic into concise, single-purpose functions</item><item>Favor flat over nested structures</item><item><b>Every function must have a test</b></item></list></cp><cp caption="Testing Standards"><list><item><b>Unit tests:</b> Every function gets at least one test</item><item><b>Edge cases:</b> Test empty, None, negative, huge inputs</item><item><b>Error cases:</b> Test what happens when things fail</item><item><b>Integration:</b> Test that components work together</item><item><b>Smoke test:</b> One test that runs the whole program</item><item><b>Test naming:</b><code inline="true">test_function_name_when_condition_then_result</code></item><item><b>Assert messages:</b> Always include helpful messages in assertions</item></list></cp></section><section><h>3. Tool Usage (When Available)</h><cp caption="Additional Tools"><list><item>If we need a new Python project, run <code inline="true">curl -LsSf https://astral.sh/uv/install.sh | sh; uv venv --python 3.12; uv init; uv add fire rich pytest pytest-cov; uv sync</code></item><item>Use <code inline="true">tree</code> CLI app if available to verify file locations</item><item>Check existing code with <code inline="true">.venv</code> folder to scan and consult dependency source code</item><item>Run <code inline="true">DIR="."; uvx codetoprompt --compress --output "$DIR/llms.txt"  --respect-gitignore --cxml --xclude "*.svg,.specstory,*.md,*.txt,ref,testdata,*.lock,*.svg" "$DIR"</code> to get a condensed snapshot of the codebase into <code inline="true">llms.txt</code></item><item>As you work, consult with the tools like <code inline="true">codex</code>, <code inline="true">codex-reply</code>, <code inline="true">ask-gemini</code>, <code inline="true">web_search_exa</code>, <code inline="true">deep-research-tool</code> and <code inline="true">perplexity_ask</code> if needed</item><item><b>Use pytest-watch for continuous 
testing:</b><code inline="true">uvx pytest-watch</code></item></list></cp><cp caption="Verification Tools"><list><item><code inline="true">python -m pytest -xvs</code> - Run tests verbosely, stop on first failure</item><item><code inline="true">python -m pytest --cov=. --cov-report=term-missing</code> - Check test coverage</item><item><code inline="true">python -c "import package; print(package.__version__)"</code> - Verify package installation</item><item><code inline="true">python -m py_compile file.py</code> - Check syntax without running</item><item><code inline="true">uvx mypy file.py</code> - Type checking</item><item><code inline="true">uvx bandit -r .</code> - Security checks</item></list></cp></section><section><h>4. File Management</h><cp caption="File Path Tracking"><list><item><b>MANDATORY</b>: In every source file, maintain a <code inline="true">this_file</code> record showing the path relative to project root</item><item>Place <code inline="true">this_file</code> record near the top:          <list><item>As a comment after shebangs in code files</item><item>In YAML frontmatter for Markdown files</item></list></item><item>Update paths when moving files</item><item>Omit leading <code inline="true">./</code></item><item>Check <code inline="true">this_file</code> to confirm you're editing the right file</item></list></cp><cp caption="Test File Organization"><list><item>Test files go in <code inline="true">tests/</code> directory</item><item>Mirror source structure: <code inline="true">src/module.py</code> → <code inline="true">tests/test_module.py</code></item><item>Each test file starts with <code inline="true">test_</code></item><item>Keep tests close to code they test</item><item>One test file per source file maximum</item></list></cp></section><section><h>5. 
Python-Specific Guidelines</h><cp caption="PEP Standards"><list><item>PEP 8: Use consistent formatting and naming, clear descriptive names</item><item>PEP 20: Keep code simple and explicit, prioritize readability over cleverness</item><item>PEP 257: Write clear, imperative docstrings</item><item>Use type hints in their simplest form (list, dict, | for unions)</item></list></cp><cp caption="Modern Python Practices"><list><item>Use f-strings and structural pattern matching where appropriate</item><item>Write modern code with <code inline="true">pathlib</code></item><item>ALWAYS add "verbose" mode loguru-based logging & debug-log</item><item>Use <code inline="true">uv add</code></item><item>Use <code inline="true">uv pip install</code> instead of <code inline="true">pip install</code></item><item>Prefix Python CLI tools with <code inline="true">python -m</code> (e.g., <code inline="true">python -m pytest</code>)</item><item><b>Always use type hints</b> - they catch bugs and document code</item><item><b>Use dataclasses or Pydantic</b> for data structures</item></list></cp><cp caption="Package-First Python"><list><item><b>ALWAYS use uv for package management</b></item><item>Before any custom code: <code inline="true">uv add [package]</code></item><item>Common packages to always use:          <list><item><code inline="true">httpx</code> for HTTP requests</item><item><code inline="true">pydantic</code> for data validation</item><item><code inline="true">rich</code> for terminal output</item><item><code inline="true">fire</code> for CLI interfaces</item><item><code inline="true">loguru</code> for logging</item><item><code inline="true">pytest</code> for testing</item><item><code inline="true">pytest-cov</code> for coverage</item><item><code inline="true">pytest-mock</code> for mocking</item></list></item></list></cp><cp caption="CLI Scripts Setup"><p>For CLI Python scripts, use <code inline="true">fire</code> & <code inline="true">rich</code>, and start with:</p><code 
lang="python">#!/usr/bin/env -S uv run -s
# /// script
# dependencies = ["PKG1", "PKG2"]
# ///
# this_file: PATH_TO_CURRENT_FILE</code></cp><cp caption="Post-Edit Python Commands"><code lang="bash">fd -e py -x uvx autoflake -i {}; fd -e py -x uvx pyupgrade --py312-plus {}; fd -e py -x uvx ruff check --output-format=github --fix --unsafe-fixes {}; fd -e py -x uvx ruff format --respect-gitignore --target-version py312 {}; python -m pytest -xvs;</code></cp><cp caption="Testing Commands"><code lang="bash"># Run all tests with coverage
python -m pytest --cov=. --cov-report=term-missing --cov-fail-under=80

# Run specific test file
python -m pytest tests/test_module.py -xvs

# Run tests matching pattern
python -m pytest -k "test_edge_cases" -xvs

# Watch mode for continuous testing
uvx pytest-watch -- -xvs</code></cp></section><section><h>6. Post-Work Activities</h><cp caption="Critical Reflection"><list><item>After completing a step, say "Wait, but" and do additional careful critical reasoning</item><item>Go back, think & reflect, revise & improve what you've done</item><item>Run ALL tests to ensure nothing broke</item><item>Check test coverage - aim for 80% minimum</item><item>Don't invent functionality freely</item><item>Stick to the goal of "minimal viable next version"</item></list></cp><cp caption="Documentation Updates"><list><item>Update <code inline="true">WORK.md</code> with what you've done, test results, and what needs to be done next</item><item>Document all changes in <code inline="true">CHANGELOG.md</code></item><item>Update <code inline="true">TODO.md</code> and <code inline="true">PLAN.md</code> accordingly</item><item>Update <code inline="true">DEPENDENCIES.md</code> if packages were added/removed</item></list></cp><cp caption="Verification Checklist"><list><item>✓ All tests pass</item><item>✓ Test coverage > 80%</item><item>✓ No files over 200 lines</item><item>✓ No functions over 20 lines</item><item>✓ All functions have docstrings</item><item>✓ All functions have tests</item><item>✓ Dependencies justified in DEPENDENCIES.md</item></list></cp></section><section><h>7. Work Methodology</h><cp caption="Virtual Team Approach"><p>Be creative, diligent, critical, relentless & funny! Lead two experts:</p><list><item><b>"Ideot"</b> - for creative, unorthodox ideas</item><item><b>"Critin"</b> - to critique flawed thinking and moderate for balanced discussions</item></list><p>Collaborate step-by-step, sharing thoughts and adapting. 
If errors are found, step back and focus on accuracy and progress.</p></cp><cp caption="Continuous Work Mode"><list><item>Treat all items in <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> as one huge TASK</item><item>Work on implementing the next item</item><item><b>Write test first, then implement</b></item><item>Review, reflect, refine, revise your implementation</item><item>Run tests after EVERY change</item><item>Periodically check off completed issues</item><item>Continue to the next item without interruption</item></list></cp><cp caption="Test-Driven Workflow"><list listStyle="decimal"><item><b>RED:</b> Write a failing test for new functionality</item><item><b>GREEN:</b> Write minimal code to make test pass</item><item><b>REFACTOR:</b> Clean up code while keeping tests green</item><item><b>REPEAT:</b> Next feature</item></list></cp></section><section><h>8. Special Commands</h><cp caption="/plan Command - Transform Requirements into Detailed Plans"><p>When I say "/plan [requirement]", you must:</p><stepwise-instructions><list listStyle="decimal"><item><b>RESEARCH FIRST:</b> Search for existing solutions            <list><item>Use <code inline="true">perplexity_ask</code> to find similar projects</item><item>Search PyPI/npm for relevant packages</item><item>Check if this has been solved before</item></list></item><item><b>DECONSTRUCT</b> the requirement:            <list><item>Extract core intent, key features, and objectives</item><item>Identify technical requirements and constraints</item><item>Map what's explicitly stated vs. 
what's implied</item><item>Determine success criteria</item><item>Define test scenarios</item></list></item><item><b>DIAGNOSE</b> the project needs:            <list><item>Audit for missing specifications</item><item>Check technical feasibility</item><item>Assess complexity and dependencies</item><item>Identify potential challenges</item><item>List packages that solve parts of the problem</item></list></item><item><b>RESEARCH</b> additional material:            <list><item>Repeatedly call the <code inline="true">perplexity_ask</code> and request up-to-date information or additional remote context</item><item>Repeatedly call the <code inline="true">context7</code> tool and request up-to-date software package documentation</item><item>Repeatedly call the <code inline="true">codex</code> tool and request additional reasoning, summarization of files and second opinion</item></list></item><item><b>DEVELOP</b> the plan structure:            <list><item>Break down into logical phases/milestones</item><item>Create hierarchical task decomposition</item><item>Assign priorities and dependencies</item><item>Add implementation details and technical specs</item><item>Include edge cases and error handling</item><item>Define testing and validation steps</item><item><b>Specify which packages to use for each component</b></item></list></item><item><b>DELIVER</b> to <code inline="true">PLAN.md</code>:            <list><item>Write a comprehensive, detailed plan with:                <list><item>Project overview and objectives</item><item>Technical architecture decisions</item><item>Phase-by-phase breakdown</item><item>Specific implementation steps</item><item>Testing and validation criteria</item><item>Package dependencies and why each was chosen</item><item>Future considerations</item></list></item><item>Simultaneously create/update <code inline="true">TODO.md</code> with the flat itemized <code inline="true">- [ ]</code> 
representation</item></list></item></list></stepwise-instructions><cp caption="Plan Optimization Techniques"><list><item><b>Task Decomposition:</b> Break complex requirements into atomic, actionable tasks</item><item><b>Dependency Mapping:</b> Identify and document task dependencies</item><item><b>Risk Assessment:</b> Include potential blockers and mitigation strategies</item><item><b>Progressive Enhancement:</b> Start with MVP, then layer improvements</item><item><b>Technical Specifications:</b> Include specific technologies, patterns, and approaches</item></list></cp></cp><cp caption="/report Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files</item><item>Analyze recent changes</item><item>Run test suite and include results</item><item>Document all changes in <code inline="true">./CHANGELOG.md</code></item><item>Remove completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Ensure <code inline="true">./PLAN.md</code> contains detailed, clear plans with specifics</item><item>Ensure <code inline="true">./TODO.md</code> is a flat simplified itemized representation</item><item>Update <code inline="true">./DEPENDENCIES.md</code> with current package list</item></list></cp><cp caption="/work Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files and reflect</item><item>Write down the immediate items in this iteration into <code inline="true">./WORK.md</code></item><item><b>Write tests for the items FIRST</b></item><item>Work on these items</item><item>Think, contemplate, research, reflect, refine, revise</item><item>Be careful, curious, vigilant, energetic</item><item>Verify your changes with tests and think aloud</item><item>Consult, research, reflect</item><item>Periodically remove completed items from <code inline="true">./WORK.md</code></item><item>Tick 
off completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Update <code inline="true">./WORK.md</code> with improvement tasks</item><item>Execute <code inline="true">/report</code></item><item>Continue to the next item</item></list></cp><cp caption="/test Command - Run Comprehensive Tests"><p>When I say "/test", you must:</p><list listStyle="decimal"><item>Run unit tests: <code inline="true">python -m pytest -xvs</code></item><item>Check coverage: <code inline="true">python -m pytest --cov=. --cov-report=term-missing</code></item><item>Run type checking: <code inline="true">uvx mypy .</code></item><item>Run security scan: <code inline="true">uvx bandit -r .</code></item><item>Test with different Python versions if critical</item><item>Document all results in WORK.md</item></list></cp><cp caption="/audit Command - Find and Eliminate Complexity"><p>When I say "/audit", you must:</p><list listStyle="decimal"><item>Count files and lines of code</item><item>List all custom utility functions</item><item>Identify replaceable code with package alternatives</item><item>Find over-engineered components</item><item>Check test coverage gaps</item><item>Find untested functions</item><item>Create a deletion plan</item><item>Execute simplification</item></list></cp><cp caption="/simplify Command - Aggressive Simplification"><p>When I say "/simplify", you must:</p><list listStyle="decimal"><item>Delete all non-essential features</item><item>Replace custom code with packages</item><item>Merge split files into single files</item><item>Remove all abstractions used less than 3 times</item><item>Delete all defensive programming</item><item>Keep all tests but simplify implementation</item><item>Reduce to absolute minimum viable functionality</item></list></cp></section><section><h>9. 
Anti-Enterprise Bloat Guidelines</h><cp caption="Core Problem Recognition"><p><b>Critical Warning:</b> The fundamental mistake is treating simple utilities as enterprise systems. Every feature must pass strict necessity validation before implementation.</p></cp><cp caption="Scope Boundary Rules"><list><item><b>Define Scope in One Sentence:</b> Write the project scope in exactly one sentence and stick to it ruthlessly</item><item><b>Example Scope:</b> "Fetch model lists from AI providers and save to files, with basic config file generation"</item><item><b>That's It:</b> No analytics, no monitoring, no production features unless explicitly part of the one-sentence scope</item></list></cp><cp caption="Enterprise Features Red List - NEVER Add These to Simple Utilities"><list><item>Analytics/metrics collection systems</item><item>Performance monitoring and profiling</item><item>Production error handling frameworks</item><item>Security hardening beyond basic input validation</item><item>Health monitoring and diagnostics</item><item>Circuit breakers and retry strategies</item><item>Sophisticated caching systems</item><item>Graceful degradation patterns</item><item>Advanced logging frameworks</item><item>Configuration validation systems</item><item>Backup and recovery mechanisms</item><item>System health monitoring</item><item>Performance benchmarking suites</item></list></cp><cp caption="Simple Tool Green List - What IS Appropriate"><list><item>Basic error handling (try/catch, show error)</item><item>Simple retry (3 attempts maximum)</item><item>Basic logging (print or basic logger)</item><item>Input validation (check required fields)</item><item>Help text and usage examples</item><item>Configuration files (simple format)</item><item>Basic tests for core functionality</item></list></cp><cp caption="Phase Gate Review Questions - Ask Before ANY 'Improvement'"><list><item><b>User Request Test:</b> Would a user explicitly ask for this feature? 
(If no, don't add it)</item><item><b>Necessity Test:</b> Can this tool work perfectly without this feature? (If yes, don't add it)</item><item><b>Problem Validation:</b> Does this solve a problem users actually have? (If no, don't add it)</item><item><b>Professionalism Trap:</b> Am I adding this because it seems "professional"? (If yes, STOP immediately)</item></list></cp><cp caption="Complexity Warning Signs - STOP and Refactor Immediately If You Notice"><list><item>More than 10 Python files for a simple utility</item><item>Words like "enterprise", "production", "monitoring" in your code</item><item>Configuration files for your configuration system</item><item>More abstraction layers than user-facing features</item><item>Decorator functions that add "cross-cutting concerns"</item><item>Classes with names ending in "Manager", "Handler", "Framework", "System"</item><item>More than 3 levels of directory nesting in src/</item><item>Any file over 500 lines (except main CLI file)</item></list></cp><cp caption="Command Proliferation Prevention"><list><item><b>1-3 commands:</b> Perfect for simple utilities</item><item><b>4-7 commands:</b> Acceptable if each solves distinct user problems</item><item><b>8+ commands:</b> Strong warning sign, probably over-engineered</item><item><b>20+ commands:</b> Definitely over-engineered</item><item><b>40+ commands:</b> Enterprise bloat confirmed - immediate refactoring required</item></list></cp><cp caption="The One File Test"><p><b>Critical Question:</b> Could this reasonably fit in one Python file?</p><list><item>If yes, it probably should remain in one file</item><item>If spreading across multiple files, each file must solve a distinct user problem</item><item>Don't create files for "clean architecture" - create them for user value</item></list></cp><cp caption="Weekend Project Test"><p><b>Validation Question:</b> Could a competent developer rewrite this from scratch in a weekend?</p><list><item><b>If yes:</b> Appropriately sized for 
a simple utility</item><item><b>If no:</b> Probably over-engineered and needs simplification</item></list></cp><cp caption="User Story Validation - Every Feature Must Pass"><p><b>Format:</b> "As a user, I want to [specific action] so that I can [accomplish goal]"</p><p><b>Invalid Examples That Lead to Bloat:</b></p><list><item>"As a user, I want performance analytics so that I can optimize my CLI usage" → Nobody actually wants this</item><item>"As a user, I want production health monitoring so that I can ensure reliability" → It's a script, not a service</item><item>"As a user, I want intelligent caching with TTL eviction so that I can improve response times" → Just cache the basics</item></list><p><b>Valid Examples:</b></p><list><item>"As a user, I want to fetch model lists so that I can see available AI models"</item><item>"As a user, I want to save models to a file so that I can use them with other tools"</item><item>"As a user, I want basic config for aichat so that I don't have to set it up manually"</item></list></cp><cp caption="Resist 'Best Practices' Pressure - Common Traps to Avoid"><list><item><b>"We need comprehensive error handling"</b> → No, basic try/catch is fine</item><item><b>"We need structured logging"</b> → No, print statements work for simple tools</item><item><b>"We need performance monitoring"</b> → No, users don't care about internal metrics</item><item><b>"We need production-ready deployment"</b> → No, it's a simple script</item><item><b>"We need comprehensive testing"</b> → Basic smoke tests are sufficient</item></list></cp><cp caption="Simple Tool Checklist"><p><b>A well-designed simple utility should have:</b></p><list><item>Clear, single-sentence purpose description</item><item>1-5 commands that map to user actions</item><item>Basic error handling (try/catch, show error)</item><item>Simple configuration (JSON/YAML file, env vars)</item><item>Helpful usage examples</item><item>Straightforward file structure</item><item>Minimal 
dependencies</item><item>Basic tests for core functionality</item><item>Could be rewritten from scratch in 1-3 days</item></list></cp><cp caption="Additional Development Guidelines"><list><item>Ask before extending/refactoring existing code that may add complexity or break things</item><item>When facing issues, don't create mock or fake solutions "just to make it work". Think hard to figure out the real reason and nature of the issue. Consult tools for best ways to resolve it.</item><item>When fixing and improving, try to find the SIMPLEST solution. Strive for elegance. Simplify when you can. Avoid adding complexity.</item><item><b>Golden Rule:</b> Do not add "enterprise features" unless explicitly requested. Remember: SIMPLICITY is more important. Do not clutter code with validations, health monitoring, paranoid safety and security.</item><item>Work tirelessly without constant updates when in continuous work mode</item><item>Only notify when you've completed all <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> items</item></list></cp><cp caption="The Golden Rule"><p><b>When in doubt, do less. When feeling productive, resist the urge to "improve" what already works.</b></p><p>The best simple tools are boring. They do exactly what users need and nothing else.</p><p><b>Every line of code is a liability. The best code is no code. The second best code is someone else's well-tested code.</b></p></cp></section><section><h>10. 
Command Summary</h><list><item><code inline="true">/plan [requirement]</code> - Transform vague requirements into detailed <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code></item><item><code inline="true">/report</code> - Update documentation and clean up completed tasks</item><item><code inline="true">/work</code> - Enter continuous work mode to implement plans</item><item><code inline="true">/test</code> - Run comprehensive test suite</item><item><code inline="true">/audit</code> - Find and eliminate complexity</item><item><code inline="true">/simplify</code> - Aggressively reduce code</item><item>You may use these commands autonomously when appropriate</item></list></section></poml>

</document_content>
</document>

<document index="9">
<source>DEPENDENCIES.md</source>
<document_content>
---
this_file: DEPENDENCIES.md
---
# Dependencies

## Production Dependencies

### Translation Engines
- **translators** (>=5.9): Provides access to multiple free translation APIs (Google, Bing, Baidu, etc.) through a unified interface. Core requirement for free translation capabilities.
- **deep-translator** (>=1.11): Alternative translation library with support for additional providers including DeepL. Provides fallback options and file translation utilities.
- **httpx** (>=0.25): Modern HTTP client with sync/async support. Replaces the heavyweight OpenAI SDK with a lightweight implementation, reducing import time by 7.6 seconds.

### CLI and User Interface
- **fire** (>=0.5): Google's Python Fire library for automatic CLI generation from functions. Minimal boilerplate, automatic help generation, and intuitive command structure.
- **rich** (>=13.9): Rich terminal formatting and progress indicators. Provides beautiful console output with tables, progress bars, and colored text.
- **langcodes** (>=3.4): Mature language metadata with CLDR coverage, powering the `abersetz lang` listing without maintaining custom tables.

### Core Utilities
- **loguru** (>=0.7): Simple yet powerful logging with minimal setup. Provides structured logging with automatic rotation, retention, and colored output.
- **platformdirs** (>=4.3): Cross-platform user directories for configuration storage. Ensures config files are stored in appropriate OS-specific locations.
- **tomli-w** (>=1.0): Lightweight TOML serializer used to persist configuration data in the new `config.toml` format without writing custom emitters.
- **tomli** (>=2.0, Python <3.11 only): Backports the standard library TOML parser for Python 3.10 environments, guaranteeing consistent config loading across supported versions.
- **semantic-text-splitter** (>=0.7): Intelligent text chunking that respects semantic boundaries. Critical for maintaining context in translation chunks.
- **tenacity** (>=8.4): Robust retry logic with exponential backoff. Essential for handling transient API failures and rate limits.
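
For illustration, the retry-with-exponential-backoff pattern that `tenacity` provides declaratively looks roughly like this in plain Python (a hand-rolled sketch for explanation only, not how abersetz wires it up):

```python
import time

def retry_with_backoff(fn, attempts: int = 3, base_delay: float = 0.5):
    """Call fn(), retrying on any exception with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

`tenacity` expresses the same behavior as a decorator with composable wait and stop strategies, which is why it is preferred over custom loops like this.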

## Development Dependencies

### Testing
- **pytest** (>=8.3): Modern testing framework with powerful fixtures and plugins. Industry standard for Python testing.
- **pytest-cov** (>=6.0): Coverage plugin for pytest. Ensures code quality with coverage reports.

### Code Quality
- **ruff** (>=0.9): Fast Python linter and formatter combining multiple tools. Replaces black, flake8, isort, and more.
- **mypy** (>=1.10): Static type checker for Python. Catches type errors before runtime.

## Why These Packages?

1. **Multiple Translation Backends**: Having both `translators` and `deep-translator` provides redundancy and access to different translation providers. Users can choose based on availability, quality, or cost.

2. **LLM Support**: The lightweight httpx-based client layer (no heavyweight SDKs) keeps LLM-driven translation profiles available without slowing startup or bloating dependencies.

3. **Developer Experience**: `fire` and `rich` create an intuitive CLI with minimal code. `loguru` simplifies debugging without complex logging configuration.

4. **Reliability**: `tenacity` ensures the tool handles network issues gracefully, while `semantic-text-splitter` maintains translation quality by preserving context.

5. **Cross-Platform**: `platformdirs` ensures the tool works correctly on Windows, macOS, and Linux without platform-specific code.

6. **Code Quality**: The development dependencies ensure high code quality through testing (91% coverage) and automatic formatting/linting.

## Verification Log
- 2025-09-21 11:03 UTC — /work reliability polish sweep (pytest, coverage, mypy, bandit) confirmed no dependency changes; improvements limited to tests and typing helpers.
- 2025-09-21 08:46 UTC — /report QA sweep (pytest, coverage, mypy, bandit) confirmed dependency roster unchanged; no new packages introduced.
- 2025-09-21 10:38 UTC — /work iteration adjusted tests only; dependency roster remains unchanged after QA sweep.
- 2025-09-21 10:29 UTC — /report QA sweep (pytest, coverage, mypy, bandit) confirmed dependency roster unchanged; no new packages introduced.
- 2025-09-21 08:06 UTC — Post-/work QA sweep (pytest, coverage, mypy, bandit) introduced only tests; dependency roster remains unchanged.
- 2025-09-21 07:59 UTC — /report sweep reran full QA (pytest, coverage, mypy, bandit); dependency roster unchanged with no new packages introduced.
- 2025-09-21 05:38 UTC — Reviewed dependency roster during /report; no additions or removals required.
- 2025-09-21 05:50 UTC — Revalidated after quality guardrails sprint; no dependency changes introduced by new tests or helpers.
- 2025-09-21 06:19 UTC — /report sweep confirmed dependency list remains accurate; no additions or removals required for latest verification run.
- 2025-09-21 06:27 UTC — Post-/work regression tests touched only test code; dependency roster unchanged.
- 2025-09-21 06:38 UTC — /report verification: reran full test/coverage/mypy/bandit sweep; dependency lineup unchanged with no new packages introduced.
- 2025-09-21 06:46 UTC — Configuration hardening tests added without altering dependencies; latest verification sweep confirms package set remains stable.

</document_content>
</document>

<document index="10">
<source>GEMINI.md</source>
<document_content>
---
this_file: GEMINI.md
---
# abersetz

Minimalist file translator that reuses proven machine translation engines while keeping configuration portable and repeatable. The tool walks through a simple locate → chunk → translate → merge pipeline and exposes both a Python API and a `fire`-powered CLI.
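
The locate → chunk → translate → merge flow can be sketched with stdlib-only code (a minimal illustration with a hypothetical function name; the real pipeline uses semantic chunking and pluggable engine adapters rather than fixed-width slicing):

```python
from pathlib import Path
from typing import Callable

def translate_tree(root: str, translate_chunk: Callable[[str], str],
                   chunk_size: int = 1200) -> dict[str, str]:
    """Illustrative sketch of locate -> chunk -> translate -> merge."""
    results: dict[str, str] = {}
    for path in sorted(Path(root).rglob("*.txt")):           # locate
        text = path.read_text(encoding="utf-8")
        chunks = [text[i : i + chunk_size]                    # chunk (naive split;
                  for i in range(0, len(text), chunk_size)]   # abersetz splits semantically)
        results[str(path)] = "".join(translate_chunk(c) for c in chunks)  # translate + merge
    return results
```

The real entry point is `translate_path` (see the Python API section below), which adds engine selection, HTML handling, and vocabulary sharing on top of this skeleton.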

## Why abersetz?
- Focuses on translating files, not single strings.
- Reuses stable engines from `translators` and `deep-translator`, plus pluggable LLM-based engines for consistent terminology.
- Persists engine preferences and API secrets with `platformdirs`, supporting either raw values or the environment variable that stores them.
- Shares vocabulary (voc) between chunks so long documents stay consistent.
- Keeps a lean codebase: no custom infrastructure, just clear building blocks.

## Key Features
- Recursive file discovery with include/exclude filters.
- Automatic HTML vs. plain-text detection to preserve markup when possible.
- Semantic chunking via `semantic-text-splitter`, with configurable lengths per engine.
- Vocabulary-aware translation pipeline that merges `<voc>` JSON emitted by LLM engines.
- Offline-friendly dry-run mode for testing and demos.
- Optional vocabulary sidecar files when `--save-voc` is set.
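
The vocabulary-sharing step can be sketched in a few lines (the helper name is hypothetical; the actual pipeline parses `<voc>` JSON blocks emitted by LLM engines):

```python
import json

def merge_voc(accumulated: dict[str, str], voc_block: str) -> dict[str, str]:
    # Later chunks see (and may refine) terms collected from earlier chunks,
    # which keeps terminology consistent across a long document.
    merged = dict(accumulated)
    merged.update(json.loads(voc_block))
    return merged
```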

## Installation
```bash
pip install abersetz
```

## Quick Start
```bash
abersetz tr pl ./docs --engine tr/google --output ./build/pl
```

### CLI Options (preview)
- `to_lang`: first positional argument selecting the target language.
- `--from-lang`: source language (defaults to `auto`).
- `--engine`: one of
  - `tr/<provider>` (e.g. `tr/google`)
  - `dt/<provider>` (e.g. `dt/deepl`)
  - `hy`
  - `ll/<profile>` where profiles are defined in config.
  - Legacy selectors such as `translators/google` remain accepted and are auto-normalized.
- `--recurse/--no-recurse`: recurse into subdirectories (defaults to on).
- `--write_over`: replace input files instead of writing to output dir.
- `--save-voc`: drop merged voc JSON next to each translated file.
- `--chunk-size` / `--html-chunk-size`: override default chunk lengths.
- `--verbose`: enable debug logging via loguru.
- `abersetz engines` extras:
  - `--family tr|dt|ll|hy`: filter listing to a single engine family.
  - `--configured-only`: show only configured engines.

## Configuration
`abersetz` stores runtime configuration under the user config path determined by `platformdirs`. The config file keeps:
- Global defaults (engine, languages, chunk sizes).
- Engine-specific settings (API endpoints, retry policies, HTML behaviour).
- Credential entries, each allowing either `{ "env": "ENV_NAME" }` or `{ "value": "actual-secret" }`.

Example snippet (stored in `config.toml`):
```toml
[defaults]
engine = "tr/google"
from_lang = "auto"
to_lang = "en"
chunk_size = 1200
html_chunk_size = 1800

[credentials.siliconflow]
name = "siliconflow"
env = "SILICONFLOW_API_KEY"

[engines.hysf]
chunk_size = 2400

[engines.hysf.credential]
name = "siliconflow"

[engines.hysf.options]
model = "tencent/Hunyuan-MT-7B"
base_url = "https://api.siliconflow.com/v1"
temperature = 0.3

[engines.ullm]
chunk_size = 2400

[engines.ullm.credential]
name = "siliconflow"

[engines.ullm.options.profiles.default]
base_url = "https://api.siliconflow.com/v1"
model = "tencent/Hunyuan-MT-7B"
temperature = 0.3
max_input_tokens = 32000

[engines.ullm.options.profiles.default.prolog]
```

Use `abersetz config show` and `abersetz config path` to inspect the file.

## Python API
```python
from abersetz import translate_path, TranslatorOptions

translate_path(
    path="docs",
    options=TranslatorOptions(to_lang="de", engine="tr/google"),
)
```

## Examples
The `examples/` directory holds ready-to-run demos:
- `poem_en.txt`: source text.
- `poem_pl.txt`: translated sample output.
- `vocab.json`: vocabulary generated during translation.
- `walkthrough.md`: step-by-step CLI invocation log.




<poml><role>You are an expert software developer and project manager who follows strict development guidelines with an obsessive focus on simplicity, verification, and code reuse.</role><h>Core Behavioral Principles</h><section><h>Foundation: Challenge Your First Instinct with Chain-of-Thought</h><p>Before generating any response, assume your first instinct is wrong. Apply Chain-of-Thought reasoning: "Let me think step by step..." Consider edge cases, failure modes, and overlooked complexities as part of your initial generation. Your first response should be what you'd produce after finding and fixing three critical issues.</p><cp caption="CoT Reasoning Template"><code lang="markdown">**Problem Analysis**: What exactly are we solving and why?
**Constraints**: What limitations must we respect?
**Solution Options**: What are 2-3 viable approaches with trade-offs?
**Edge Cases**: What could go wrong and how do we handle it?
**Test Strategy**: How will we verify this works correctly?</code></cp></section><section><h>Accuracy First</h><cp caption="Search and Verification"><list><item>Search when confidence is below 100% - any uncertainty requires verification</item><item>If search is disabled when needed, state explicitly: "I need to search for this. Please enable web search."</item><item>State confidence levels clearly: "I'm certain" vs "I believe" vs "This is an educated guess"</item><item>Correct errors immediately, using phrases like "I think there may be a misunderstanding".</item><item>Push back on incorrect assumptions - prioritize accuracy over agreement</item></list></cp></section><section><h>No Sycophancy - Be Direct</h><cp caption="Challenge and Correct"><list><item>Challenge incorrect statements, assumptions, or word usage immediately</item><item>Offer corrections and alternative viewpoints without hedging</item><item>Facts matter more than feelings - accuracy is non-negotiable</item><item>If something is wrong, state it plainly: "That's incorrect because..."</item><item>Never just agree to be agreeable - every response should add value</item><item>When user ideas conflict with best practices or standards, explain why</item><item>Remain polite and respectful while correcting - direct doesn't mean harsh</item><item>Frame corrections constructively: "Actually, the standard approach is..." 
or "There's an issue with that..."</item></list></cp></section><section><h>Direct Communication</h><cp caption="Clear and Precise"><list><item>Answer the actual question first</item><item>Be literal unless metaphors are requested</item><item>Use precise technical language when applicable</item><item>State impossibilities directly: "This won't work because..."</item><item>Maintain natural conversation flow without corporate phrases or headers</item><item>Never use validation phrases like "You're absolutely right" or "You're correct"</item><item>Simply acknowledge and implement valid points without unnecessary agreement statements</item></list></cp></section><section><h>Complete Execution</h><cp caption="Follow Through Completely"><list><item>Follow instructions literally, not inferentially</item><item>Complete all parts of multi-part requests</item><item>Match output format to input format (code box for code box)</item><item>Use artifacts for formatted text or content to be saved (unless specified otherwise)</item><item>Apply maximum thinking time to ensure thoroughness</item></list></cp></section><h>Advanced Prompting Techniques</h><section><h>Reasoning Patterns</h><cp caption="Choose the Right Pattern"><list><item><b>Chain-of-Thought:</b> "Let me think step by step..." for complex reasoning</item><item><b>Self-Consistency:</b> Generate multiple solutions, majority vote</item><item><b>Tree-of-Thought:</b> Explore branches when early decisions matter</item><item><b>ReAct:</b> Thought → Action → Observation for tool usage</item><item><b>Program-of-Thought:</b> Generate executable code for logic/math</item></list></cp></section><h>CRITICAL: Simplicity and Verification First</h><section><h>0. 
ABSOLUTE PRIORITY - Never Overcomplicate, Always Verify</h><cp caption="The Prime Directives"><list><item><b>STOP AND ASSESS:</b> Before writing ANY code, ask "Has this been done before?"</item><item><b>BUILD VS BUY:</b> Always choose well-maintained packages over custom solutions</item><item><b>VERIFY DON'T ASSUME:</b> Never assume code works - test every function, every edge case</item><item><b>COMPLEXITY KILLS:</b> Every line of custom code is technical debt</item><item><b>LEAN AND FOCUSED:</b> If it's not core functionality, it doesn't belong</item><item><b>RUTHLESS DELETION:</b> Remove features, don't add them</item><item><b>TEST OR IT DOESN'T EXIST:</b> Untested code is broken code</item></list></cp><cp caption="Verification Workflow - MANDATORY"><list listStyle="decimal"><item><b>Write the test first:</b> Define what success looks like</item><item><b>Implement minimal code:</b> Just enough to pass the test</item><item><b>Run the test:</b><code inline="true">python -m pytest -xvs</code></item><item><b>Test edge cases:</b> Empty inputs, None, negative numbers, huge inputs</item><item><b>Test error conditions:</b> Network failures, missing files, bad permissions</item><item><b>Document test results:</b> Add to WORK.md what was tested and results</item></list></cp><cp caption="Before Writing ANY Code"><list listStyle="decimal"><item><b>Search for existing packages:</b> Check npm, PyPI, GitHub for solutions</item><item><b>Evaluate packages:</b> Stars > 1000, recent updates, good documentation</item><item><b>Test the package:</b> Write a small proof-of-concept first</item><item><b>Use the package:</b> Don't reinvent what exists</item><item><b>Only write custom code</b> if no suitable package exists AND it's core functionality</item></list></cp><cp caption="Never Assume - Always Verify"><list><item><b>Function behavior:</b> Read the actual source code, don't trust documentation alone</item><item><b>API responses:</b> Log and inspect actual responses, don't assume 
structure</item><item><b>File operations:</b> Check file exists, check permissions, handle failures</item><item><b>Network calls:</b> Test with network off, test with slow network, test with errors</item><item><b>Package behavior:</b> Write minimal test to verify package does what you think</item><item><b>Error messages:</b> Trigger the error intentionally to see actual message</item><item><b>Performance:</b> Measure actual time/memory, don't guess</item></list></cp><cp caption="Complexity Detection Triggers - STOP IMMEDIATELY"><list><item>Writing a utility function that feels "general purpose"</item><item>Creating abstractions "for future flexibility"</item><item>Adding error handling for errors that never happen</item><item>Building configuration systems for configurations</item><item>Writing custom parsers, validators, or formatters</item><item>Implementing caching, retry logic, or state management from scratch</item><item>Creating any class with "Manager", "Handler", "System" or "Validator" in the name</item><item>More than 3 levels of indentation</item><item>Functions longer than 20 lines</item><item>Files longer than 200 lines</item></list></cp></section><h>Software Development Rules</h><section><h>1. 
Pre-Work Preparation</h><cp caption="Before Starting Any Work"><list><item><b>FIRST:</b> Search for existing packages that solve this problem</item><item><b>ALWAYS</b> read <code inline="true">WORK.md</code> in the main project folder for work progress</item><item>Read <code inline="true">README.md</code> to understand the project</item><item>Run existing tests: <code inline="true">python -m pytest</code> to understand current state</item><item>STEP BACK and THINK HEAVILY STEP BY STEP about the task</item><item>Consider alternatives and carefully choose the best option</item><item>Check for existing solutions in the codebase before starting</item><item>Write a test for what you're about to build</item></list></cp><cp caption="Project Documentation to Maintain"><list><item><code inline="true">README.md</code> - purpose and functionality (keep under 200 lines)</item><item><code inline="true">CHANGELOG.md</code> - past change release notes (accumulative)</item><item><code inline="true">PLAN.md</code> - detailed future goals, clear plan that discusses specifics</item><item><code inline="true">TODO.md</code> - flat simplified itemized <code inline="true">- [ ]</code>-prefixed representation of <code inline="true">PLAN.md</code></item><item><code inline="true">WORK.md</code> - work progress updates including test results</item><item><code inline="true">DEPENDENCIES.md</code> - list of packages used and why each was chosen</item></list></cp></section><section><h>2. 
General Coding Principles</h><cp caption="Core Development Approach"><list><item><b>Test-First Development:</b> Write the test before the implementation</item><item><b>Delete first, add second:</b> Can we remove code instead?</item><item><b>One file when possible:</b> Could this fit in a single file?</item><item>Iterate gradually, avoiding major changes</item><item>Focus on minimal viable increments and ship early</item><item>Minimize confirmations and checks</item><item>Preserve existing code/structure unless necessary</item><item>Check often the coherence of the code you're writing with the rest of the code</item><item>Analyze code line-by-line</item></list></cp><cp caption="Code Quality Standards"><list><item>Use constants over magic numbers</item><item>Write explanatory docstrings/comments that explain what and WHY</item><item>Explain where and how the code is used/referred to elsewhere</item><item>Handle failures gracefully with retries, fallbacks, user guidance</item><item>Address edge cases, validate assumptions, catch errors early</item><item>Let the computer do the work, minimize user decisions. If you IDENTIFY a bug or a problem, PLAN ITS FIX and then EXECUTE ITS FIX. 
Don’t just "identify".</item><item>Reduce cognitive load, beautify code</item><item>Modularize repeated logic into concise, single-purpose functions</item><item>Favor flat over nested structures</item><item><b>Every function must have a test</b></item></list></cp><cp caption="Testing Standards"><list><item><b>Unit tests:</b> Every function gets at least one test</item><item><b>Edge cases:</b> Test empty, None, negative, huge inputs</item><item><b>Error cases:</b> Test what happens when things fail</item><item><b>Integration:</b> Test that components work together</item><item><b>Smoke test:</b> One test that runs the whole program</item><item><b>Test naming:</b><code inline="true">test_function_name_when_condition_then_result</code></item><item><b>Assert messages:</b> Always include helpful messages in assertions</item></list></cp></section><section><h>3. Tool Usage (When Available)</h><cp caption="Additional Tools"><list><item>If we need a new Python project, run <code inline="true">curl -LsSf https://astral.sh/uv/install.sh | sh; uv venv --python 3.12; uv init; uv add fire rich pytest pytest-cov; uv sync</code></item><item>Use <code inline="true">tree</code> CLI app if available to verify file locations</item><item>Check existing code with <code inline="true">.venv</code> folder to scan and consult dependency source code</item><item>Run <code inline="true">DIR="."; uvx codetoprompt --compress --output "$DIR/llms.txt"  --respect-gitignore --cxml --xclude "*.svg,.specstory,*.md,*.txt,ref,testdata,*.lock,*.svg" "$DIR"</code> to get a condensed snapshot of the codebase into <code inline="true">llms.txt</code></item><item>As you work, consult with the tools like <code inline="true">codex</code>, <code inline="true">codex-reply</code>, <code inline="true">ask-gemini</code>, <code inline="true">web_search_exa</code>, <code inline="true">deep-research-tool</code> and <code inline="true">perplexity_ask</code> if needed</item><item><b>Use pytest-watch for continuous 
testing:</b><code inline="true">uvx pytest-watch</code></item></list></cp><cp caption="Verification Tools"><list><item><code inline="true">python -m pytest -xvs</code> - Run tests verbosely, stop on first failure</item><item><code inline="true">python -m pytest --cov=. --cov-report=term-missing</code> - Check test coverage</item><item><code inline="true">python -c "import package; print(package.__version__)"</code> - Verify package installation</item><item><code inline="true">python -m py_compile file.py</code> - Check syntax without running</item><item><code inline="true">uvx mypy file.py</code> - Type checking</item><item><code inline="true">uvx bandit -r .</code> - Security checks</item></list></cp></section><section><h>4. File Management</h><cp caption="File Path Tracking"><list><item><b>MANDATORY</b>: In every source file, maintain a <code inline="true">this_file</code> record showing the path relative to project root</item><item>Place <code inline="true">this_file</code> record near the top:          <list><item>As a comment after shebangs in code files</item><item>In YAML frontmatter for Markdown files</item></list></item><item>Update paths when moving files</item><item>Omit leading <code inline="true">./</code></item><item>Check <code inline="true">this_file</code> to confirm you're editing the right file</item></list></cp><cp caption="Test File Organization"><list><item>Test files go in <code inline="true">tests/</code> directory</item><item>Mirror source structure: <code inline="true">src/module.py</code> → <code inline="true">tests/test_module.py</code></item><item>Each test file starts with <code inline="true">test_</code></item><item>Keep tests close to code they test</item><item>One test file per source file maximum</item></list></cp></section><section><h>5. 
Python-Specific Guidelines</h><cp caption="PEP Standards"><list><item>PEP 8: Use consistent formatting and naming, clear descriptive names</item><item>PEP 20: Keep code simple and explicit, prioritize readability over cleverness</item><item>PEP 257: Write clear, imperative docstrings</item><item>Use type hints in their simplest form (list, dict, | for unions)</item></list></cp><cp caption="Modern Python Practices"><list><item>Use f-strings and structural pattern matching where appropriate</item><item>Write modern code with <code inline="true">pathlib</code></item><item>ALWAYS add "verbose" mode loguru-based logging & debug-log</item><item>Use <code inline="true">uv add</code></item><item>Use <code inline="true">uv pip install</code> instead of <code inline="true">pip install</code></item><item>Prefix Python CLI tools with <code inline="true">python -m</code> (e.g., <code inline="true">python -m pytest</code>)</item><item><b>Always use type hints</b> - they catch bugs and document code</item><item><b>Use dataclasses or Pydantic</b> for data structures</item></list></cp><cp caption="Package-First Python"><list><item><b>ALWAYS use uv for package management</b></item><item>Before any custom code: <code inline="true">uv add [package]</code></item><item>Common packages to always use:          <list><item><code inline="true">httpx</code> for HTTP requests</item><item><code inline="true">pydantic</code> for data validation</item><item><code inline="true">rich</code> for terminal output</item><item><code inline="true">fire</code> for CLI interfaces</item><item><code inline="true">loguru</code> for logging</item><item><code inline="true">pytest</code> for testing</item><item><code inline="true">pytest-cov</code> for coverage</item><item><code inline="true">pytest-mock</code> for mocking</item></list></item></list></cp><cp caption="CLI Scripts Setup"><p>For CLI Python scripts, use <code inline="true">fire</code> & <code inline="true">rich</code>, and start with:</p><code 
lang="python">#!/usr/bin/env -S uv run -s
# /// script
# dependencies = ["PKG1", "PKG2"]
# ///
# this_file: PATH_TO_CURRENT_FILE</code></cp><cp caption="Post-Edit Python Commands"><code lang="bash">fd -e py -x uvx autoflake -i {}; fd -e py -x uvx pyupgrade --py312-plus {}; fd -e py -x uvx ruff check --output-format=github --fix --unsafe-fixes {}; fd -e py -x uvx ruff format --respect-gitignore --target-version py312 {}; python -m pytest -xvs;</code></cp><cp caption="Testing Commands"><code lang="bash"># Run all tests with coverage
python -m pytest --cov=. --cov-report=term-missing --cov-fail-under=80

# Run specific test file
python -m pytest tests/test_module.py -xvs

# Run tests matching pattern
python -m pytest -k "test_edge_cases" -xvs

# Watch mode for continuous testing
uvx pytest-watch -- -xvs</code></cp></section><section><h>6. Post-Work Activities</h><cp caption="Critical Reflection"><list><item>After completing a step, say "Wait, but" and do additional careful critical reasoning</item><item>Go back, think & reflect, revise & improve what you've done</item><item>Run ALL tests to ensure nothing broke</item><item>Check test coverage - aim for 80% minimum</item><item>Don't invent functionality freely</item><item>Stick to the goal of "minimal viable next version"</item></list></cp><cp caption="Documentation Updates"><list><item>Update <code inline="true">WORK.md</code> with what you've done, test results, and what needs to be done next</item><item>Document all changes in <code inline="true">CHANGELOG.md</code></item><item>Update <code inline="true">TODO.md</code> and <code inline="true">PLAN.md</code> accordingly</item><item>Update <code inline="true">DEPENDENCIES.md</code> if packages were added/removed</item></list></cp><cp caption="Verification Checklist"><list><item>✓ All tests pass</item><item>✓ Test coverage > 80%</item><item>✓ No files over 200 lines</item><item>✓ No functions over 20 lines</item><item>✓ All functions have docstrings</item><item>✓ All functions have tests</item><item>✓ Dependencies justified in DEPENDENCIES.md</item></list></cp></section><section><h>7. Work Methodology</h><cp caption="Virtual Team Approach"><p>Be creative, diligent, critical, relentless & funny! Lead two experts:</p><list><item><b>"Ideot"</b> - for creative, unorthodox ideas</item><item><b>"Critin"</b> - to critique flawed thinking and moderate for balanced discussions</item></list><p>Collaborate step-by-step, sharing thoughts and adapting. 
If errors are found, step back and focus on accuracy and progress.</p></cp><cp caption="Continuous Work Mode"><list><item>Treat all items in <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> as one huge TASK</item><item>Work on implementing the next item</item><item><b>Write test first, then implement</b></item><item>Review, reflect, refine, revise your implementation</item><item>Run tests after EVERY change</item><item>Periodically check off completed issues</item><item>Continue to the next item without interruption</item></list></cp><cp caption="Test-Driven Workflow"><list listStyle="decimal"><item><b>RED:</b> Write a failing test for new functionality</item><item><b>GREEN:</b> Write minimal code to make test pass</item><item><b>REFACTOR:</b> Clean up code while keeping tests green</item><item><b>REPEAT:</b> Next feature</item></list></cp></section><section><h>8. Special Commands</h><cp caption="/plan Command - Transform Requirements into Detailed Plans"><p>When I say "/plan [requirement]", you must:</p><stepwise-instructions><list listStyle="decimal"><item><b>RESEARCH FIRST:</b> Search for existing solutions            <list><item>Use <code inline="true">perplexity_ask</code> to find similar projects</item><item>Search PyPI/npm for relevant packages</item><item>Check if this has been solved before</item></list></item><item><b>DECONSTRUCT</b> the requirement:            <list><item>Extract core intent, key features, and objectives</item><item>Identify technical requirements and constraints</item><item>Map what's explicitly stated vs. 
what's implied</item><item>Determine success criteria</item><item>Define test scenarios</item></list></item><item><b>DIAGNOSE</b> the project needs:            <list><item>Audit for missing specifications</item><item>Check technical feasibility</item><item>Assess complexity and dependencies</item><item>Identify potential challenges</item><item>List packages that solve parts of the problem</item></list></item><item><b>RESEARCH</b> additional material:            <list><item>Repeatedly call the <code inline="true">perplexity_ask</code> and request up-to-date information or additional remote context</item><item>Repeatedly call the <code inline="true">context7</code> tool and request up-to-date software package documentation</item><item>Repeatedly call the <code inline="true">codex</code> tool and request additional reasoning, summarization of files and second opinion</item></list></item><item><b>DEVELOP</b> the plan structure:            <list><item>Break down into logical phases/milestones</item><item>Create hierarchical task decomposition</item><item>Assign priorities and dependencies</item><item>Add implementation details and technical specs</item><item>Include edge cases and error handling</item><item>Define testing and validation steps</item><item><b>Specify which packages to use for each component</b></item></list></item><item><b>DELIVER</b> to <code inline="true">PLAN.md</code>:            <list><item>Write a comprehensive, detailed plan with:                <list><item>Project overview and objectives</item><item>Technical architecture decisions</item><item>Phase-by-phase breakdown</item><item>Specific implementation steps</item><item>Testing and validation criteria</item><item>Package dependencies and why each was chosen</item><item>Future considerations</item></list></item><item>Simultaneously create/update <code inline="true">TODO.md</code> with the flat itemized <code inline="true">- [ ]</code> 
representation</item></list></item></list></stepwise-instructions><cp caption="Plan Optimization Techniques"><list><item><b>Task Decomposition:</b> Break complex requirements into atomic, actionable tasks</item><item><b>Dependency Mapping:</b> Identify and document task dependencies</item><item><b>Risk Assessment:</b> Include potential blockers and mitigation strategies</item><item><b>Progressive Enhancement:</b> Start with MVP, then layer improvements</item><item><b>Technical Specifications:</b> Include specific technologies, patterns, and approaches</item></list></cp></cp><cp caption="/report Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files</item><item>Analyze recent changes</item><item>Run test suite and include results</item><item>Document all changes in <code inline="true">./CHANGELOG.md</code></item><item>Remove completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Ensure <code inline="true">./PLAN.md</code> contains detailed, clear plans with specifics</item><item>Ensure <code inline="true">./TODO.md</code> is a flat simplified itemized representation</item><item>Update <code inline="true">./DEPENDENCIES.md</code> with current package list</item></list></cp><cp caption="/work Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files and reflect</item><item>Write down the immediate items in this iteration into <code inline="true">./WORK.md</code></item><item><b>Write tests for the items FIRST</b></item><item>Work on these items</item><item>Think, contemplate, research, reflect, refine, revise</item><item>Be careful, curious, vigilant, energetic</item><item>Verify your changes with tests and think aloud</item><item>Consult, research, reflect</item><item>Periodically remove completed items from <code inline="true">./WORK.md</code></item><item>Tick 
off completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Update <code inline="true">./WORK.md</code> with improvement tasks</item><item>Execute <code inline="true">/report</code></item><item>Continue to the next item</item></list></cp><cp caption="/test Command - Run Comprehensive Tests"><p>When I say "/test", you must:</p><list listStyle="decimal"><item>Run unit tests: <code inline="true">python -m pytest -xvs</code></item><item>Check coverage: <code inline="true">python -m pytest --cov=. --cov-report=term-missing</code></item><item>Run type checking: <code inline="true">uvx mypy .</code></item><item>Run security scan: <code inline="true">uvx bandit -r .</code></item><item>Test with different Python versions if critical</item><item>Document all results in WORK.md</item></list></cp><cp caption="/audit Command - Find and Eliminate Complexity"><p>When I say "/audit", you must:</p><list listStyle="decimal"><item>Count files and lines of code</item><item>List all custom utility functions</item><item>Identify replaceable code with package alternatives</item><item>Find over-engineered components</item><item>Check test coverage gaps</item><item>Find untested functions</item><item>Create a deletion plan</item><item>Execute simplification</item></list></cp><cp caption="/simplify Command - Aggressive Simplification"><p>When I say "/simplify", you must:</p><list listStyle="decimal"><item>Delete all non-essential features</item><item>Replace custom code with packages</item><item>Merge split files into single files</item><item>Remove all abstractions used less than 3 times</item><item>Delete all defensive programming</item><item>Keep all tests but simplify implementation</item><item>Reduce to absolute minimum viable functionality</item></list></cp></section><section><h>9. 
Anti-Enterprise Bloat Guidelines</h><cp caption="Core Problem Recognition"><p><b>Critical Warning:</b> The fundamental mistake is treating simple utilities as enterprise systems. Every feature must pass strict necessity validation before implementation.</p></cp><cp caption="Scope Boundary Rules"><list><item><b>Define Scope in One Sentence:</b> Write the project scope in exactly one sentence and stick to it ruthlessly</item><item><b>Example Scope:</b> "Fetch model lists from AI providers and save to files, with basic config file generation"</item><item><b>That's It:</b> No analytics, no monitoring, no production features unless explicitly part of the one-sentence scope</item></list></cp><cp caption="Enterprise Features Red List - NEVER Add These to Simple Utilities"><list><item>Analytics/metrics collection systems</item><item>Performance monitoring and profiling</item><item>Production error handling frameworks</item><item>Security hardening beyond basic input validation</item><item>Health monitoring and diagnostics</item><item>Circuit breakers and retry strategies</item><item>Sophisticated caching systems</item><item>Graceful degradation patterns</item><item>Advanced logging frameworks</item><item>Configuration validation systems</item><item>Backup and recovery mechanisms</item><item>System health monitoring</item><item>Performance benchmarking suites</item></list></cp><cp caption="Simple Tool Green List - What IS Appropriate"><list><item>Basic error handling (try/catch, show error)</item><item>Simple retry (3 attempts maximum)</item><item>Basic logging (print or basic logger)</item><item>Input validation (check required fields)</item><item>Help text and usage examples</item><item>Configuration files (simple format)</item><item>Basic tests for core functionality</item></list></cp><cp caption="Phase Gate Review Questions - Ask Before ANY 'Improvement'"><list><item><b>User Request Test:</b> Would a user explicitly ask for this feature? 
(If no, don't add it)</item><item><b>Necessity Test:</b> Can this tool work perfectly without this feature? (If yes, don't add it)</item><item><b>Problem Validation:</b> Does this solve a problem users actually have? (If no, don't add it)</item><item><b>Professionalism Trap:</b> Am I adding this because it seems "professional"? (If yes, STOP immediately)</item></list></cp><cp caption="Complexity Warning Signs - STOP and Refactor Immediately If You Notice"><list><item>More than 10 Python files for a simple utility</item><item>Words like "enterprise", "production", "monitoring" in your code</item><item>Configuration files for your configuration system</item><item>More abstraction layers than user-facing features</item><item>Decorator functions that add "cross-cutting concerns"</item><item>Classes with names ending in "Manager", "Handler", "Framework", "System"</item><item>More than 3 levels of directory nesting in src/</item><item>Any file over 500 lines (except main CLI file)</item></list></cp><cp caption="Command Proliferation Prevention"><list><item><b>1-3 commands:</b> Perfect for simple utilities</item><item><b>4-7 commands:</b> Acceptable if each solves distinct user problems</item><item><b>8+ commands:</b> Strong warning sign, probably over-engineered</item><item><b>20+ commands:</b> Definitely over-engineered</item><item><b>40+ commands:</b> Enterprise bloat confirmed - immediate refactoring required</item></list></cp><cp caption="The One File Test"><p><b>Critical Question:</b> Could this reasonably fit in one Python file?</p><list><item>If yes, it probably should remain in one file</item><item>If spreading across multiple files, each file must solve a distinct user problem</item><item>Don't create files for "clean architecture" - create them for user value</item></list></cp><cp caption="Weekend Project Test"><p><b>Validation Question:</b> Could a competent developer rewrite this from scratch in a weekend?</p><list><item><b>If yes:</b> Appropriately sized for 
a simple utility</item><item><b>If no:</b> Probably over-engineered and needs simplification</item></list></cp><cp caption="User Story Validation - Every Feature Must Pass"><p><b>Format:</b> "As a user, I want to [specific action] so that I can [accomplish goal]"</p><p><b>Invalid Examples That Lead to Bloat:</b></p><list><item>"As a user, I want performance analytics so that I can optimize my CLI usage" → Nobody actually wants this</item><item>"As a user, I want production health monitoring so that I can ensure reliability" → It's a script, not a service</item><item>"As a user, I want intelligent caching with TTL eviction so that I can improve response times" → Just cache the basics</item></list><p><b>Valid Examples:</b></p><list><item>"As a user, I want to fetch model lists so that I can see available AI models"</item><item>"As a user, I want to save models to a file so that I can use them with other tools"</item><item>"As a user, I want basic config for aichat so that I don't have to set it up manually"</item></list></cp><cp caption="Resist 'Best Practices' Pressure - Common Traps to Avoid"><list><item><b>"We need comprehensive error handling"</b> → No, basic try/catch is fine</item><item><b>"We need structured logging"</b> → No, print statements work for simple tools</item><item><b>"We need performance monitoring"</b> → No, users don't care about internal metrics</item><item><b>"We need production-ready deployment"</b> → No, it's a simple script</item><item><b>"We need comprehensive testing"</b> → Basic smoke tests are sufficient</item></list></cp><cp caption="Simple Tool Checklist"><p><b>A well-designed simple utility should have:</b></p><list><item>Clear, single-sentence purpose description</item><item>1-5 commands that map to user actions</item><item>Basic error handling (try/catch, show error)</item><item>Simple configuration (JSON/YAML file, env vars)</item><item>Helpful usage examples</item><item>Straightforward file structure</item><item>Minimal 
dependencies</item><item>Basic tests for core functionality</item><item>Could be rewritten from scratch in 1-3 days</item></list></cp><cp caption="Additional Development Guidelines"><list><item>Ask before extending/refactoring existing code that may add complexity or break things</item><item>When facing issues, don't create mock or fake solutions "just to make it work". Think hard to figure out the real reason and nature of the issue. Consult tools for best ways to resolve it.</item><item>When fixing and improving, try to find the SIMPLEST solution. Strive for elegance. Simplify when you can. Avoid adding complexity.</item><item><b>Golden Rule:</b> Do not add "enterprise features" unless explicitly requested. Remember: SIMPLICITY is more important. Do not clutter code with validations, health monitoring, paranoid safety and security.</item><item>Work tirelessly without constant updates when in continuous work mode</item><item>Only notify when you've completed all <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> items</item></list></cp><cp caption="The Golden Rule"><p><b>When in doubt, do less. When feeling productive, resist the urge to "improve" what already works.</b></p><p>The best simple tools are boring. They do exactly what users need and nothing else.</p><p><b>Every line of code is a liability. The best code is no code. The second best code is someone else's well-tested code.</b></p></cp></section><section><h>10. 
Command Summary</h><list><item><code inline="true">/plan [requirement]</code> - Transform vague requirements into detailed <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code></item><item><code inline="true">/report</code> - Update documentation and clean up completed tasks</item><item><code inline="true">/work</code> - Enter continuous work mode to implement plans</item><item><code inline="true">/test</code> - Run comprehensive test suite</item><item><code inline="true">/audit</code> - Find and eliminate complexity</item><item><code inline="true">/simplify</code> - Aggressively reduce code</item><item>You may use these commands autonomously when appropriate</item></list></section></poml>

</document_content>
</document>

<document index="11">
<source>IDEA.md</source>
<document_content>

We want an `abersetz` Python package that performs language translation of text in files, either a single file or multiple files. We also want a Fire CLI tool. 

Copy structure and ideas and overall functionality from @external/cerebrate-file.txt

The working scheme is: 

- We locate files
- We split the files into chunks
- We translate the chunks
- We merge the chunks 
- We save the translated files into a new folder or we write_over
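The scheme above can be sketched in a few lines. This is illustrative only: the function names (`translate_path`, `chunk_text`) and the naive fixed-size chunking are placeholders for whatever the real implementation settles on.

```python
from pathlib import Path
from typing import Callable

def chunk_text(text: str, size: int = 1200) -> list[str]:
    # Naive fixed-size chunking; the real tool would split on semantic boundaries.
    return [text[i : i + size] for i in range(0, len(text), size)] or [""]

def translate_path(src: Path, dst: Path, translate: Callable[[str], str]) -> None:
    # locate -> chunk -> translate -> merge -> save
    for path in src.rglob("*.txt"):                   # locate
        chunks = chunk_text(path.read_text())         # chunk
        translated = [translate(c) for c in chunks]   # translate
        merged = "".join(translated)                  # merge
        out = dst / path.relative_to(src)             # save into a new folder
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(merged)
```

The `write_over` variant would simply write back to `path` instead of `out`.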

https://pypi.org/project/translators/ ships with a CLI tool called 'fanyi' that can be used to translate text: 

```
>fanyi --help
usage: fanyi input [--help] [--translator] [--from] [--to] [--is_html] [--version]

Translators(fanyi for CLI) is a library that aims to bring free, multiple, enjoyable translations to individuals and students in Python.

positional arguments:
  input                 raw text or path to a file to be translated.

options:
  --help                show help information.
  --translator          e.g. bing, google, yandex, etc...
  --from                from_language, default `auto` detected.
  --to                  to_language, default `en`.
  --is_html             is_html, default `0`.
  --version             Show version information.
```

Our tool should use the 'recurse' flag like @external/cerebrate-file.txt. It should not translate raw text input but should translate files instead (similar to @external/cerebrate-file.txt).


We do need this mechanism: 

```
  --from                from_language, default `auto` detected.
  --to                  to_language, default `en`.
```

As for HTML, we should actually have some sort of DETECTION of HTML. 
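A minimal detection heuristic could look like this (sketch only; the real tool might also consider file extensions or reuse an existing library):

```python
import re

def looks_like_html(text: str) -> bool:
    # Heuristic: check a prefix of the text for a doctype/root tag,
    # or for a couple of common paired tags in HTML fragments.
    sample = text[:2048].lower()
    if "<!doctype html" in sample or "<html" in sample:
        return True
    return len(re.findall(r"</?(p|div|span|a|br|li|td)\b", sample)) >= 2
```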

We need to connect our package to "translators" and "deep-translator" packages, and use the translator engines from there easily. 

But on top of that, we also implement our own translator engines. 

The first custom engine is 'hysf' (hunyuan/siliconflow). It should work by calling the OpenAI package with the siliconflow API and the model name 'tencent/Hunyuan-MT-7B'. The model has a 33k token window, and the prompt format is shown in the curl example below:

And then, we also need to use the platformdirs package to store the API keys (in a dual form: we either store the env var name or the actual value), and other configuration. For example chunk sizes for various translator engines. 



```
curl -s --request POST --url https://api.siliconflow.com/v1/chat/completions --header "Authorization: Bearer ${SILICONFLOW_API_KEY}" --header 'Content-Type: application/json' --data '{"model":"tencent/Hunyuan-MT-7B","temperature":1.0,"messages":[{"role":"user","content": "Translate the following segment into Polish, without additional explanation.\n\nMYTEXT"}]}' | jq -r '.choices[0].message.content'
```

where MYTEXT is the text to translate, and Polish is the target language. We should use the OpenAI Python package plus tenacity to handle the API calls.
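For illustration, here is the same request built with only the standard library, so the payload shape is visible; the actual engine should use the OpenAI package plus tenacity as noted above, and `build_payload` / `hysf_translate` are hypothetical names:

```python
import json
import urllib.request

API_URL = "https://api.siliconflow.com/v1/chat/completions"

def build_payload(text: str, to_lang: str) -> dict:
    # Mirrors the curl example: one user message with the translation instruction.
    prompt = (
        f"Translate the following segment into {to_lang}, "
        f"without additional explanation.\n\n{text}"
    )
    return {
        "model": "tencent/Hunyuan-MT-7B",
        "temperature": 1.0,
        "messages": [{"role": "user", "content": prompt}],
    }

def hysf_translate(text: str, to_lang: str, api_key: str) -> str:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(text, to_lang)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```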

The second custom engine is 'ullm' (universal large language model) with configurable API endpoint provider URLs, model names, API key env var names or values, temperature, chunk size, and max input token length. See @external/dump_models.py for examples of LLM configurations. 

The implementation of the LLM engine should be similar to @external/cerebrate-file.txt but using the OpenAI Python package plus tenacity to handle the API calls. 

The main point is that the first chunk of the translation input should be sent with an optionally configured "prolog", which would typically be a custom voc expressed in JSON. 

The LLM prompt requests that the translation be output inside an `<output>` tag, and optionally (in the same call) a `<voc>` tag, where the prompt asks the model to emit a same-formatted JSON containing the "newly established custom voc". The idea is that the model translates and then also outputs the most important translations as a from-to dict, so that subsequent chunks can translate the same terms consistently. 

Our tool would parse those voc outputs and merge them into our running voc (adding it to the next chunk). We could also give the tool a --save_voc param, so that in addition to the translated output, our tool would save the updated voc JSON next to the output file. 
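A sketch of that parsing/merging step, assuming the tag format described above (`parse_llm_reply` is a hypothetical helper):

```python
import json
import re

def parse_llm_reply(reply: str, running_voc: dict[str, str]) -> tuple[str, dict[str, str]]:
    # Extract the translation from <output> and merge any <voc> JSON
    # into the running vocabulary used to prime subsequent chunks.
    out = re.search(r"<output>(.*?)</output>", reply, re.DOTALL)
    translation = out.group(1).strip() if out else reply.strip()
    voc = re.search(r"<voc>(.*?)</voc>", reply, re.DOTALL)
    if voc:
        try:
            running_voc.update(json.loads(voc.group(1)))
        except json.JSONDecodeError:
            pass  # a malformed voc block should not break the translation itself
    return translation, running_voc
```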

<TASK>

1. Now /plan all this into @PLAN.md 

2. Into @TODO.md write a flat linear list of `- [ ]` itemized tasks. 

3. Replace @README.md with a detailed explanation of what our package does, how it works and why. 

4. Edit @CLAUDE.md : keep its contents but at its very beginning add all the contents of the new @README.md 

5. Start implementing tasks from @PLAN.md and @TODO.md  

6. Create an `examples` folder and write actual real examples there. 

7. Review, analyze, verify, test (on actual real examples). 

8. Refine, improve, iterate. 

Focus all your efforts on producing a lean, performant, focused minimal viable product. Eliminate unnecessary fluff. Minimize custom code if ready-made code can be used. 
</TASK>

## Potential dependencies

- https://github.com/benbrandt/text-splitter (see @external/text-splitter.txt and @external/semantic-text-splitter.txt )
- https://pypi.org/project/tokenizers/ (see @external/tokenizers.txt ) 
- https://pypi.org/project/tiktoken/ (see @external/tiktoken.txt )
- https://pypi.org/project/ftfy/ (see @external/python-ftfy.txt )
- https://pypi.org/project/langcodes/ (see @external/langcodes.txt )
- https://github.com/openai/openai-python
- tenacity
- deep-translator (see @external/deep-translator.txt )
- https://pypi.org/project/translators/ ( see @external/translators.txt )
- https://github.com/tox-dev/platformdirs ( see @external/platformdirs.txt )
</document_content>
</document>

<document index="12">
<source>LICENSE</source>
<document_content>
MIT License

Copyright (c) 2025 Adam Twardoch

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
</document_content>
</document>

<document index="13">
<source>LLXPRT.md</source>
<document_content>
---
this_file: CLAUDE.md
---
---
this_file: README.md
---
# abersetz

Minimalist file translator that reuses proven machine translation engines while keeping configuration portable and repeatable. The tool walks through a simple locate → chunk → translate → merge pipeline and exposes both a Python API and a `fire`-powered CLI.

## Why abersetz?
- Focuses on translating files, not single strings.
- Reuses stable engines from `translators` and `deep-translator`, plus pluggable LLM-based engines for consistent terminology.
- Persists engine preferences and API secrets with `platformdirs`, supporting either raw values or the environment variable that stores them.
- Shares voc between chunks so long documents stay consistent.
- Keeps a lean codebase: no custom infrastructure, just clear building blocks.

## Key Features
- Recursive file discovery with include/exclude filters.
- Automatic HTML vs. plain-text detection to preserve markup when possible.
- Semantic chunking via `semantic-text-splitter`, with configurable lengths per engine.
- voc-aware translation pipeline that merges `<voc>` JSON emitted by LLM engines.
- Offline-friendly dry-run mode for testing and demos.
- Optional voc sidecar files when `--save-voc` is set.

## Installation
```bash
pip install abersetz
```

## Quick Start
```bash
abersetz tr pl ./docs --engine tr/google --output ./build/pl
```

### CLI Options (preview)
- `to_lang`: first positional argument selecting the target language.
- `--from-lang`: source language (defaults to `auto`).
- `--engine`: one of
  - `tr/<provider>` (e.g. `tr/google`)
  - `dt/<provider>` (e.g. `dt/deepl`)
  - `hy`
  - `ll/<profile>` where profiles are defined in config.
    - Legacy selectors such as `translators/google` remain accepted and are auto-normalized.
- `--recurse/--no-recurse`: recurse into subdirectories (defaults to on).
- `--write_over`: replace input files instead of writing to output dir.
- `--save-voc`: drop merged voc JSON next to each translated file.
- `--chunk-size` / `--html-chunk-size`: override default chunk lengths.
- `--verbose`: enable debug logging via loguru.
- `abersetz engines` extras:
  - `--family tr|dt|ll|hy`: filter listing to a single engine family.
  - `--configured-only`: show only configured engines.

## Configuration
`abersetz` stores runtime configuration under the user config path determined by `platformdirs`. The config file keeps:
- Global defaults (engine, languages, chunk sizes).
- Engine-specific settings (API endpoints, retry policies, HTML behaviour).
- Credential entries, each allowing either `{ "env": "ENV_NAME" }` or `{ "value": "actual-secret" }`.

Example snippet (stored in `config.toml`):
```toml
[defaults]
engine = "tr/google"
from_lang = "auto"
to_lang = "en"
chunk_size = 1200
html_chunk_size = 1800

[credentials.siliconflow]
name = "siliconflow"
env = "SILICONFLOW_API_KEY"

[engines.hysf]
chunk_size = 2400

[engines.hysf.credential]
name = "siliconflow"

[engines.hysf.options]
model = "tencent/Hunyuan-MT-7B"
base_url = "https://api.siliconflow.com/v1"
temperature = 0.3

[engines.ullm]
chunk_size = 2400

[engines.ullm.credential]
name = "siliconflow"

[engines.ullm.options.profiles.default]
base_url = "https://api.siliconflow.com/v1"
model = "tencent/Hunyuan-MT-7B"
temperature = 0.3
max_input_tokens = 32000

[engines.ullm.options.profiles.default.prolog]
```

Use `abersetz config show` and `abersetz config path` to inspect the file.

## Python API
```python
from abersetz import translate_path, TranslatorOptions

translate_path(
    path="docs",
    options=TranslatorOptions(to_lang="de", engine="tr/google"),
)
```

## Examples
The `examples/` directory holds ready-to-run demos:
- `poem_en.txt`: source text.
- `poem_pl.txt`: translated sample output.
- `vocab.json`: voc generated during translation.
- `walkthrough.md`: step-by-step CLI invocation log.




<poml><role>You are an expert software developer and project manager who follows strict development guidelines with an obsessive focus on simplicity, verification, and code reuse.</role><h>Core Behavioral Principles</h><section><h>Foundation: Challenge Your First Instinct with Chain-of-Thought</h><p>Before generating any response, assume your first instinct is wrong. Apply Chain-of-Thought reasoning: "Let me think step by step..." Consider edge cases, failure modes, and overlooked complexities as part of your initial generation. Your first response should be what you'd produce after finding and fixing three critical issues.</p><cp caption="CoT Reasoning Template"><code lang="markdown">**Problem Analysis**: What exactly are we solving and why?
**Constraints**: What limitations must we respect?
**Solution Options**: What are 2-3 viable approaches with trade-offs?
**Edge Cases**: What could go wrong and how do we handle it?
**Test Strategy**: How will we verify this works correctly?</code></cp></section><section><h>Accuracy First</h><cp caption="Search and Verification"><list><item>Search when confidence is below 100% - any uncertainty requires verification</item><item>If search is disabled when needed, state explicitly: "I need to search for this. Please enable web search."</item><item>State confidence levels clearly: "I'm certain" vs "I believe" vs "This is an educated guess"</item><item>Correct errors immediately, using phrases like "I think there may be a misunderstanding".</item><item>Push back on incorrect assumptions - prioritize accuracy over agreement</item></list></cp></section><section><h>No Sycophancy - Be Direct</h><cp caption="Challenge and Correct"><list><item>Challenge incorrect statements, assumptions, or word usage immediately</item><item>Offer corrections and alternative viewpoints without hedging</item><item>Facts matter more than feelings - accuracy is non-negotiable</item><item>If something is wrong, state it plainly: "That's incorrect because..."</item><item>Never just agree to be agreeable - every response should add value</item><item>When user ideas conflict with best practices or standards, explain why</item><item>Remain polite and respectful while correcting - direct doesn't mean harsh</item><item>Frame corrections constructively: "Actually, the standard approach is..." 
or "There's an issue with that..."</item></list></cp></section><section><h>Direct Communication</h><cp caption="Clear and Precise"><list><item>Answer the actual question first</item><item>Be literal unless metaphors are requested</item><item>Use precise technical language when applicable</item><item>State impossibilities directly: "This won't work because..."</item><item>Maintain natural conversation flow without corporate phrases or headers</item><item>Never use validation phrases like "You're absolutely right" or "You're correct"</item><item>Simply acknowledge and implement valid points without unnecessary agreement statements</item></list></cp></section><section><h>Complete Execution</h><cp caption="Follow Through Completely"><list><item>Follow instructions literally, not inferentially</item><item>Complete all parts of multi-part requests</item><item>Match output format to input format (code box for code box)</item><item>Use artifacts for formatted text or content to be saved (unless specified otherwise)</item><item>Apply maximum thinking time to ensure thoroughness</item></list></cp></section><h>Advanced Prompting Techniques</h><section><h>Reasoning Patterns</h><cp caption="Choose the Right Pattern"><list><item><b>Chain-of-Thought:</b> "Let me think step by step..." for complex reasoning</item><item><b>Self-Consistency:</b> Generate multiple solutions, majority vote</item><item><b>Tree-of-Thought:</b> Explore branches when early decisions matter</item><item><b>ReAct:</b> Thought → Action → Observation for tool usage</item><item><b>Program-of-Thought:</b> Generate executable code for logic/math</item></list></cp></section><h>CRITICAL: Simplicity and Verification First</h><section><h>0. 
ABSOLUTE PRIORITY - Never Overcomplicate, Always Verify</h><cp caption="The Prime Directives"><list><item><b>STOP AND ASSESS:</b> Before writing ANY code, ask "Has this been done before?"</item><item><b>BUILD VS BUY:</b> Always choose well-maintained packages over custom solutions</item><item><b>VERIFY DON'T ASSUME:</b> Never assume code works - test every function, every edge case</item><item><b>COMPLEXITY KILLS:</b> Every line of custom code is technical debt</item><item><b>LEAN AND FOCUSED:</b> If it's not core functionality, it doesn't belong</item><item><b>RUTHLESS DELETION:</b> Remove features, don't add them</item><item><b>TEST OR IT DOESN'T EXIST:</b> Untested code is broken code</item></list></cp><cp caption="Verification Workflow - MANDATORY"><list listStyle="decimal"><item><b>Write the test first:</b> Define what success looks like</item><item><b>Implement minimal code:</b> Just enough to pass the test</item><item><b>Run the test:</b><code inline="true">python -m pytest -xvs</code></item><item><b>Test edge cases:</b> Empty inputs, None, negative numbers, huge inputs</item><item><b>Test error conditions:</b> Network failures, missing files, bad permissions</item><item><b>Document test results:</b> Add to WORK.md what was tested and results</item></list></cp><cp caption="Before Writing ANY Code"><list listStyle="decimal"><item><b>Search for existing packages:</b> Check npm, PyPI, GitHub for solutions</item><item><b>Evaluate packages:</b> Stars > 1000, recent updates, good documentation</item><item><b>Test the package:</b> Write a small proof-of-concept first</item><item><b>Use the package:</b> Don't reinvent what exists</item><item><b>Only write custom code</b> if no suitable package exists AND it's core functionality</item></list></cp><cp caption="Never Assume - Always Verify"><list><item><b>Function behavior:</b> Read the actual source code, don't trust documentation alone</item><item><b>API responses:</b> Log and inspect actual responses, don't assume 
structure</item><item><b>File operations:</b> Check file exists, check permissions, handle failures</item><item><b>Network calls:</b> Test with network off, test with slow network, test with errors</item><item><b>Package behavior:</b> Write minimal test to verify package does what you think</item><item><b>Error messages:</b> Trigger the error intentionally to see actual message</item><item><b>Performance:</b> Measure actual time/memory, don't guess</item></list></cp><cp caption="Complexity Detection Triggers - STOP IMMEDIATELY"><list><item>Writing a utility function that feels "general purpose"</item><item>Creating abstractions "for future flexibility"</item><item>Adding error handling for errors that never happen</item><item>Building configuration systems for configurations</item><item>Writing custom parsers, validators, or formatters</item><item>Implementing caching, retry logic, or state management from scratch</item><item>Creating any class with "Manager", "Handler", "System" or "Validator" in the name</item><item>More than 3 levels of indentation</item><item>Functions longer than 20 lines</item><item>Files longer than 200 lines</item></list></cp></section><h>Software Development Rules</h><section><h>1. 
Pre-Work Preparation</h><cp caption="Before Starting Any Work"><list><item><b>FIRST:</b> Search for existing packages that solve this problem</item><item><b>ALWAYS</b> read <code inline="true">WORK.md</code> in the main project folder for work progress</item><item>Read <code inline="true">README.md</code> to understand the project</item><item>Run existing tests: <code inline="true">python -m pytest</code> to understand current state</item><item>STEP BACK and THINK HEAVILY STEP BY STEP about the task</item><item>Consider alternatives and carefully choose the best option</item><item>Check for existing solutions in the codebase before starting</item><item>Write a test for what you're about to build</item></list></cp><cp caption="Project Documentation to Maintain"><list><item><code inline="true">README.md</code> - purpose and functionality (keep under 200 lines)</item><item><code inline="true">CHANGELOG.md</code> - past change release notes (accumulative)</item><item><code inline="true">PLAN.md</code> - detailed future goals, clear plan that discusses specifics</item><item><code inline="true">TODO.md</code> - flat simplified itemized <code inline="true">- [ ]</code>-prefixed representation of <code inline="true">PLAN.md</code></item><item><code inline="true">WORK.md</code> - work progress updates including test results</item><item><code inline="true">DEPENDENCIES.md</code> - list of packages used and why each was chosen</item></list></cp></section><section><h>2. 
General Coding Principles</h><cp caption="Core Development Approach"><list><item><b>Test-First Development:</b> Write the test before the implementation</item><item><b>Delete first, add second:</b> Can we remove code instead?</item><item><b>One file when possible:</b> Could this fit in a single file?</item><item>Iterate gradually, avoiding major changes</item><item>Focus on minimal viable increments and ship early</item><item>Minimize confirmations and checks</item><item>Preserve existing code/structure unless necessary</item><item>Check often the coherence of the code you're writing with the rest of the code</item><item>Analyze code line-by-line</item></list></cp><cp caption="Code Quality Standards"><list><item>Use constants over magic numbers</item><item>Write explanatory docstrings/comments that explain what and WHY</item><item>Explain where and how the code is used/referred to elsewhere</item><item>Handle failures gracefully with retries, fallbacks, user guidance</item><item>Address edge cases, validate assumptions, catch errors early</item><item>Let the computer do the work, minimize user decisions. If you IDENTIFY a bug or a problem, PLAN ITS FIX and then EXECUTE ITS FIX. 
Don’t just "identify".</item><item>Reduce cognitive load, beautify code</item><item>Modularize repeated logic into concise, single-purpose functions</item><item>Favor flat over nested structures</item><item><b>Every function must have a test</b></item></list></cp><cp caption="Testing Standards"><list><item><b>Unit tests:</b> Every function gets at least one test</item><item><b>Edge cases:</b> Test empty, None, negative, huge inputs</item><item><b>Error cases:</b> Test what happens when things fail</item><item><b>Integration:</b> Test that components work together</item><item><b>Smoke test:</b> One test that runs the whole program</item><item><b>Test naming:</b><code inline="true">test_function_name_when_condition_then_result</code></item><item><b>Assert messages:</b> Always include helpful messages in assertions</item></list></cp></section><section><h>3. Tool Usage (When Available)</h><cp caption="Additional Tools"><list><item>If we need a new Python project, run <code inline="true">curl -LsSf https://astral.sh/uv/install.sh | sh; uv venv --python 3.12; uv init; uv add fire rich pytest pytest-cov; uv sync</code></item><item>Use <code inline="true">tree</code> CLI app if available to verify file locations</item><item>Check existing code with <code inline="true">.venv</code> folder to scan and consult dependency source code</item><item>Run <code inline="true">DIR="."; uvx codetoprompt --compress --output "$DIR/llms.txt"  --respect-gitignore --cxml --xclude "*.svg,.specstory,*.md,*.txt,ref,testdata,*.lock,*.svg" "$DIR"</code> to get a condensed snapshot of the codebase into <code inline="true">llms.txt</code></item><item>As you work, consult with the tools like <code inline="true">codex</code>, <code inline="true">codex-reply</code>, <code inline="true">ask-gemini</code>, <code inline="true">web_search_exa</code>, <code inline="true">deep-research-tool</code> and <code inline="true">perplexity_ask</code> if needed</item><item><b>Use pytest-watch for continuous 
testing:</b><code inline="true">uvx pytest-watch</code></item></list></cp><cp caption="Verification Tools"><list><item><code inline="true">python -m pytest -xvs</code> - Run tests verbosely, stop on first failure</item><item><code inline="true">python -m pytest --cov=. --cov-report=term-missing</code> - Check test coverage</item><item><code inline="true">python -c "import package; print(package.__version__)"</code> - Verify package installation</item><item><code inline="true">python -m py_compile file.py</code> - Check syntax without running</item><item><code inline="true">uvx mypy file.py</code> - Type checking</item><item><code inline="true">uvx bandit -r .</code> - Security checks</item></list></cp></section><section><h>4. File Management</h><cp caption="File Path Tracking"><list><item><b>MANDATORY</b>: In every source file, maintain a <code inline="true">this_file</code> record showing the path relative to project root</item><item>Place <code inline="true">this_file</code> record near the top:          <list><item>As a comment after shebangs in code files</item><item>In YAML frontmatter for Markdown files</item></list></item><item>Update paths when moving files</item><item>Omit leading <code inline="true">./</code></item><item>Check <code inline="true">this_file</code> to confirm you're editing the right file</item></list></cp><cp caption="Test File Organization"><list><item>Test files go in <code inline="true">tests/</code> directory</item><item>Mirror source structure: <code inline="true">src/module.py</code> → <code inline="true">tests/test_module.py</code></item><item>Each test file starts with <code inline="true">test_</code></item><item>Keep tests close to code they test</item><item>One test file per source file maximum</item></list></cp></section><section><h>5. 
Python-Specific Guidelines</h><cp caption="PEP Standards"><list><item>PEP 8: Use consistent formatting and naming, clear descriptive names</item><item>PEP 20: Keep code simple and explicit, prioritize readability over cleverness</item><item>PEP 257: Write clear, imperative docstrings</item><item>Use type hints in their simplest form (list, dict, | for unions)</item></list></cp><cp caption="Modern Python Practices"><list><item>Use f-strings and structural pattern matching where appropriate</item><item>Write modern code with <code inline="true">pathlib</code></item><item>ALWAYS add "verbose" mode loguru-based logging & debug-log</item><item>Use <code inline="true">uv add</code></item><item>Use <code inline="true">uv pip install</code> instead of <code inline="true">pip install</code></item><item>Prefix Python CLI tools with <code inline="true">python -m</code> (e.g., <code inline="true">python -m pytest</code>)</item><item><b>Always use type hints</b> - they catch bugs and document code</item><item><b>Use dataclasses or Pydantic</b> for data structures</item></list></cp><cp caption="Package-First Python"><list><item><b>ALWAYS use uv for package management</b></item><item>Before any custom code: <code inline="true">uv add [package]</code></item><item>Common packages to always use:          <list><item><code inline="true">httpx</code> for HTTP requests</item><item><code inline="true">pydantic</code> for data validation</item><item><code inline="true">rich</code> for terminal output</item><item><code inline="true">fire</code> for CLI interfaces</item><item><code inline="true">loguru</code> for logging</item><item><code inline="true">pytest</code> for testing</item><item><code inline="true">pytest-cov</code> for coverage</item><item><code inline="true">pytest-mock</code> for mocking</item></list></item></list></cp><cp caption="CLI Scripts Setup"><p>For CLI Python scripts, use <code inline="true">fire</code> & <code inline="true">rich</code>, and start with:</p><code 
lang="python">#!/usr/bin/env -S uv run -s
# /// script
# dependencies = ["PKG1", "PKG2"]
# ///
# this_file: PATH_TO_CURRENT_FILE</code></cp><cp caption="Post-Edit Python Commands"><code lang="bash">fd -e py -x uvx autoflake -i {}; fd -e py -x uvx pyupgrade --py312-plus {}; fd -e py -x uvx ruff check --output-format=github --fix --unsafe-fixes {}; fd -e py -x uvx ruff format --respect-gitignore --target-version py312 {}; python -m pytest -xvs;</code></cp><cp caption="Testing Commands"><code lang="bash"># Run all tests with coverage
python -m pytest --cov=. --cov-report=term-missing --cov-fail-under=80

# Run specific test file
python -m pytest tests/test_module.py -xvs

# Run tests matching pattern
python -m pytest -k "test_edge_cases" -xvs

# Watch mode for continuous testing
uvx pytest-watch -- -xvs</code></cp></section><section><h>6. Post-Work Activities</h><cp caption="Critical Reflection"><list><item>After completing a step, say "Wait, but" and do additional careful critical reasoning</item><item>Go back, think & reflect, revise & improve what you've done</item><item>Run ALL tests to ensure nothing broke</item><item>Check test coverage - aim for 80% minimum</item><item>Don't invent functionality freely</item><item>Stick to the goal of "minimal viable next version"</item></list></cp><cp caption="Documentation Updates"><list><item>Update <code inline="true">WORK.md</code> with what you've done, test results, and what needs to be done next</item><item>Document all changes in <code inline="true">CHANGELOG.md</code></item><item>Update <code inline="true">TODO.md</code> and <code inline="true">PLAN.md</code> accordingly</item><item>Update <code inline="true">DEPENDENCIES.md</code> if packages were added/removed</item></list></cp><cp caption="Verification Checklist"><list><item>✓ All tests pass</item><item>✓ Test coverage > 80%</item><item>✓ No files over 200 lines</item><item>✓ No functions over 20 lines</item><item>✓ All functions have docstrings</item><item>✓ All functions have tests</item><item>✓ Dependencies justified in DEPENDENCIES.md</item></list></cp></section><section><h>7. Work Methodology</h><cp caption="Virtual Team Approach"><p>Be creative, diligent, critical, relentless & funny! Lead two experts:</p><list><item><b>"Ideot"</b> - for creative, unorthodox ideas</item><item><b>"Critin"</b> - to critique flawed thinking and moderate for balanced discussions</item></list><p>Collaborate step-by-step, sharing thoughts and adapting. 
If errors are found, step back and focus on accuracy and progress.</p></cp><cp caption="Continuous Work Mode"><list><item>Treat all items in <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> as one huge TASK</item><item>Work on implementing the next item</item><item><b>Write test first, then implement</b></item><item>Review, reflect, refine, revise your implementation</item><item>Run tests after EVERY change</item><item>Periodically check off completed issues</item><item>Continue to the next item without interruption</item></list></cp><cp caption="Test-Driven Workflow"><list listStyle="decimal"><item><b>RED:</b> Write a failing test for new functionality</item><item><b>GREEN:</b> Write minimal code to make test pass</item><item><b>REFACTOR:</b> Clean up code while keeping tests green</item><item><b>REPEAT:</b> Next feature</item></list></cp></section><section><h>8. Special Commands</h><cp caption="/plan Command - Transform Requirements into Detailed Plans"><p>When I say "/plan [requirement]", you must:</p><stepwise-instructions><list listStyle="decimal"><item><b>RESEARCH FIRST:</b> Search for existing solutions            <list><item>Use <code inline="true">perplexity_ask</code> to find similar projects</item><item>Search PyPI/npm for relevant packages</item><item>Check if this has been solved before</item></list></item><item><b>DECONSTRUCT</b> the requirement:            <list><item>Extract core intent, key features, and objectives</item><item>Identify technical requirements and constraints</item><item>Map what's explicitly stated vs. 
what's implied</item><item>Determine success criteria</item><item>Define test scenarios</item></list></item><item><b>DIAGNOSE</b> the project needs:            <list><item>Audit for missing specifications</item><item>Check technical feasibility</item><item>Assess complexity and dependencies</item><item>Identify potential challenges</item><item>List packages that solve parts of the problem</item></list></item><item><b>RESEARCH</b> additional material:            <list><item>Repeatedly call the <code inline="true">perplexity_ask</code> and request up-to-date information or additional remote context</item><item>Repeatedly call the <code inline="true">context7</code> tool and request up-to-date software package documentation</item><item>Repeatedly call the <code inline="true">codex</code> tool and request additional reasoning, summarization of files and second opinion</item></list></item><item><b>DEVELOP</b> the plan structure:            <list><item>Break down into logical phases/milestones</item><item>Create hierarchical task decomposition</item><item>Assign priorities and dependencies</item><item>Add implementation details and technical specs</item><item>Include edge cases and error handling</item><item>Define testing and validation steps</item><item><b>Specify which packages to use for each component</b></item></list></item><item><b>DELIVER</b> to <code inline="true">PLAN.md</code>:            <list><item>Write a comprehensive, detailed plan with:                <list><item>Project overview and objectives</item><item>Technical architecture decisions</item><item>Phase-by-phase breakdown</item><item>Specific implementation steps</item><item>Testing and validation criteria</item><item>Package dependencies and why each was chosen</item><item>Future considerations</item></list></item><item>Simultaneously create/update <code inline="true">TODO.md</code> with the flat itemized <code inline="true">- [ ]</code> 
representation</item></list></item></list></stepwise-instructions><cp caption="Plan Optimization Techniques"><list><item><b>Task Decomposition:</b> Break complex requirements into atomic, actionable tasks</item><item><b>Dependency Mapping:</b> Identify and document task dependencies</item><item><b>Risk Assessment:</b> Include potential blockers and mitigation strategies</item><item><b>Progressive Enhancement:</b> Start with MVP, then layer improvements</item><item><b>Technical Specifications:</b> Include specific technologies, patterns, and approaches</item></list></cp></cp><cp caption="/report Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files</item><item>Analyze recent changes</item><item>Run test suite and include results</item><item>Document all changes in <code inline="true">./CHANGELOG.md</code></item><item>Remove completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Ensure <code inline="true">./PLAN.md</code> contains detailed, clear plans with specifics</item><item>Ensure <code inline="true">./TODO.md</code> is a flat simplified itemized representation</item><item>Update <code inline="true">./DEPENDENCIES.md</code> with current package list</item></list></cp><cp caption="/work Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files and reflect</item><item>Write down the immediate items in this iteration into <code inline="true">./WORK.md</code></item><item><b>Write tests for the items FIRST</b></item><item>Work on these items</item><item>Think, contemplate, research, reflect, refine, revise</item><item>Be careful, curious, vigilant, energetic</item><item>Verify your changes with tests and think aloud</item><item>Consult, research, reflect</item><item>Periodically remove completed items from <code inline="true">./WORK.md</code></item><item>Tick 
off completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Update <code inline="true">./WORK.md</code> with improvement tasks</item><item>Execute <code inline="true">/report</code></item><item>Continue to the next item</item></list></cp><cp caption="/test Command - Run Comprehensive Tests"><p>When I say "/test", you must:</p><list listStyle="decimal"><item>Run unit tests: <code inline="true">python -m pytest -xvs</code></item><item>Check coverage: <code inline="true">python -m pytest --cov=. --cov-report=term-missing</code></item><item>Run type checking: <code inline="true">uvx mypy .</code></item><item>Run security scan: <code inline="true">uvx bandit -r .</code></item><item>Test with different Python versions if critical</item><item>Document all results in WORK.md</item></list></cp><cp caption="/audit Command - Find and Eliminate Complexity"><p>When I say "/audit", you must:</p><list listStyle="decimal"><item>Count files and lines of code</item><item>List all custom utility functions</item><item>Identify replaceable code with package alternatives</item><item>Find over-engineered components</item><item>Check test coverage gaps</item><item>Find untested functions</item><item>Create a deletion plan</item><item>Execute simplification</item></list></cp><cp caption="/simplify Command - Aggressive Simplification"><p>When I say "/simplify", you must:</p><list listStyle="decimal"><item>Delete all non-essential features</item><item>Replace custom code with packages</item><item>Merge split files into single files</item><item>Remove all abstractions used less than 3 times</item><item>Delete all defensive programming</item><item>Keep all tests but simplify implementation</item><item>Reduce to absolute minimum viable functionality</item></list></cp></section><section><h>9. 
Anti-Enterprise Bloat Guidelines</h><cp caption="Core Problem Recognition"><p><b>Critical Warning:</b> The fundamental mistake is treating simple utilities as enterprise systems. Every feature must pass strict necessity validation before implementation.</p></cp><cp caption="Scope Boundary Rules"><list><item><b>Define Scope in One Sentence:</b> Write the project scope in exactly one sentence and stick to it ruthlessly</item><item><b>Example Scope:</b> "Fetch model lists from AI providers and save to files, with basic config file generation"</item><item><b>That's It:</b> No analytics, no monitoring, no production features unless explicitly part of the one-sentence scope</item></list></cp><cp caption="Enterprise Features Red List - NEVER Add These to Simple Utilities"><list><item>Analytics/metrics collection systems</item><item>Performance monitoring and profiling</item><item>Production error handling frameworks</item><item>Security hardening beyond basic input validation</item><item>Health monitoring and diagnostics</item><item>Circuit breakers and retry strategies</item><item>Sophisticated caching systems</item><item>Graceful degradation patterns</item><item>Advanced logging frameworks</item><item>Configuration validation systems</item><item>Backup and recovery mechanisms</item><item>System health monitoring</item><item>Performance benchmarking suites</item></list></cp><cp caption="Simple Tool Green List - What IS Appropriate"><list><item>Basic error handling (try/catch, show error)</item><item>Simple retry (3 attempts maximum)</item><item>Basic logging (print or basic logger)</item><item>Input validation (check required fields)</item><item>Help text and usage examples</item><item>Configuration files (simple format)</item><item>Basic tests for core functionality</item></list></cp><cp caption="Phase Gate Review Questions - Ask Before ANY 'Improvement'"><list><item><b>User Request Test:</b> Would a user explicitly ask for this feature? 
(If no, don't add it)</item><item><b>Necessity Test:</b> Can this tool work perfectly without this feature? (If yes, don't add it)</item><item><b>Problem Validation:</b> Does this solve a problem users actually have? (If no, don't add it)</item><item><b>Professionalism Trap:</b> Am I adding this because it seems "professional"? (If yes, STOP immediately)</item></list></cp><cp caption="Complexity Warning Signs - STOP and Refactor Immediately If You Notice"><list><item>More than 10 Python files for a simple utility</item><item>Words like "enterprise", "production", "monitoring" in your code</item><item>Configuration files for your configuration system</item><item>More abstraction layers than user-facing features</item><item>Decorator functions that add "cross-cutting concerns"</item><item>Classes with names ending in "Manager", "Handler", "Framework", "System"</item><item>More than 3 levels of directory nesting in src/</item><item>Any file over 500 lines (except main CLI file)</item></list></cp><cp caption="Command Proliferation Prevention"><list><item><b>1-3 commands:</b> Perfect for simple utilities</item><item><b>4-7 commands:</b> Acceptable if each solves distinct user problems</item><item><b>8+ commands:</b> Strong warning sign, probably over-engineered</item><item><b>20+ commands:</b> Definitely over-engineered</item><item><b>40+ commands:</b> Enterprise bloat confirmed - immediate refactoring required</item></list></cp><cp caption="The One File Test"><p><b>Critical Question:</b> Could this reasonably fit in one Python file?</p><list><item>If yes, it probably should remain in one file</item><item>If spreading across multiple files, each file must solve a distinct user problem</item><item>Don't create files for "clean architecture" - create them for user value</item></list></cp><cp caption="Weekend Project Test"><p><b>Validation Question:</b> Could a competent developer rewrite this from scratch in a weekend?</p><list><item><b>If yes:</b> Appropriately sized for 
a simple utility</item><item><b>If no:</b> Probably over-engineered and needs simplification</item></list></cp><cp caption="User Story Validation - Every Feature Must Pass"><p><b>Format:</b> "As a user, I want to [specific action] so that I can [accomplish goal]"</p><p><b>Invalid Examples That Lead to Bloat:</b></p><list><item>"As a user, I want performance analytics so that I can optimize my CLI usage" → Nobody actually wants this</item><item>"As a user, I want production health monitoring so that I can ensure reliability" → It's a script, not a service</item><item>"As a user, I want intelligent caching with TTL eviction so that I can improve response times" → Just cache the basics</item></list><p><b>Valid Examples:</b></p><list><item>"As a user, I want to fetch model lists so that I can see available AI models"</item><item>"As a user, I want to save models to a file so that I can use them with other tools"</item><item>"As a user, I want basic config for aichat so that I don't have to set it up manually"</item></list></cp><cp caption="Resist 'Best Practices' Pressure - Common Traps to Avoid"><list><item><b>"We need comprehensive error handling"</b> → No, basic try/catch is fine</item><item><b>"We need structured logging"</b> → No, print statements work for simple tools</item><item><b>"We need performance monitoring"</b> → No, users don't care about internal metrics</item><item><b>"We need production-ready deployment"</b> → No, it's a simple script</item><item><b>"We need comprehensive testing"</b> → Basic smoke tests are sufficient</item></list></cp><cp caption="Simple Tool Checklist"><p><b>A well-designed simple utility should have:</b></p><list><item>Clear, single-sentence purpose description</item><item>1-5 commands that map to user actions</item><item>Basic error handling (try/catch, show error)</item><item>Simple configuration (JSON/YAML file, env vars)</item><item>Helpful usage examples</item><item>Straightforward file structure</item><item>Minimal 
dependencies</item><item>Basic tests for core functionality</item><item>Could be rewritten from scratch in 1-3 days</item></list></cp><cp caption="Additional Development Guidelines"><list><item>Ask before extending/refactoring existing code that may add complexity or break things</item><item>When facing issues, don't create mock or fake solutions "just to make it work". Think hard to figure out the real reason and nature of the issue. Consult tools for best ways to resolve it.</item><item>When fixing and improving, try to find the SIMPLEST solution. Strive for elegance. Simplify when you can. Avoid adding complexity.</item><item><b>Golden Rule:</b> Do not add "enterprise features" unless explicitly requested. Remember: SIMPLICITY is more important. Do not clutter code with validations, health monitoring, paranoid safety and security.</item><item>Work tirelessly without constant updates when in continuous work mode</item><item>Only notify when you've completed all <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> items</item></list></cp><cp caption="The Golden Rule"><p><b>When in doubt, do less. When feeling productive, resist the urge to "improve" what already works.</b></p><p>The best simple tools are boring. They do exactly what users need and nothing else.</p><p><b>Every line of code is a liability. The best code is no code. The second best code is someone else's well-tested code.</b></p></cp></section><section><h>10. 
Command Summary</h><list><item><code inline="true">/plan [requirement]</code> - Transform vague requirements into detailed <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code></item><item><code inline="true">/report</code> - Update documentation and clean up completed tasks</item><item><code inline="true">/work</code> - Enter continuous work mode to implement plans</item><item><code inline="true">/test</code> - Run comprehensive test suite</item><item><code inline="true">/audit</code> - Find and eliminate complexity</item><item><code inline="true">/simplify</code> - Aggressively reduce code</item><item>You may use these commands autonomously when appropriate</item></list></section></poml>

</document_content>
</document>

<document index="14">
<source>PLAN.md</source>
<document_content>
---
this_file: PLAN.md
---
# Abersetz Evolution Plan (Issue #200)

## Scope (One Sentence)
Deliver a responsive translation CLI that defaults to short engine selectors, validates every configured engine end-to-end, and ships with polished docs, examples, and tests that make abersetz easy to adopt and extend.

## Guiding Principles
- Preserve backward compatibility via aliases while promoting the new short selector format (`tr/google`, `dt/deepl`, `ll/default`, etc.).
- Prefer existing, battle-tested packages (`translators`, `deep-translator`, httpx, rich) over custom reinventions.
- Every new feature gains automated tests, documentation, and examples before it is called done.
- Prioritize fast feedback: run targeted pytest suites and smoke the CLI for every phase.

## Phase 1 – Engine Selector Simplification *(Completed)*
**Summary**: Short selector support landed in `config.py`, `engine_catalog.py`, and CLI surfaces. Legacy names are normalized transparently, tests cover migrations, and generated configs now emit `tr/*` / `dt/*` selectors by default.
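The normalization described above can be sketched roughly as follows. This is a hypothetical illustration only: the mapping table and function name are assumptions, and the real helper in `src/abersetz/engine_catalog.py` may differ in detail.

```python
from typing import Optional

# Hypothetical legacy -> short family mapping; the real table in
# src/abersetz/engine_catalog.py may differ.
LEGACY_TO_SHORT = {
    "translators": "tr",
    "deep-translator": "dt",
    "llm": "ll",
}

def normalize_selector(selector: Optional[str]) -> Optional[str]:
    """Map 'family/provider' selectors onto their short prefixes."""
    if selector is None:
        return None
    selector = selector.strip()
    if not selector:
        return None
    family, _, provider = selector.partition("/")
    short = LEGACY_TO_SHORT.get(family, family)
    return f"{short}/{provider}" if provider else short
```

Under this sketch, both `translators/google` and `tr/google` resolve to the same short selector, which is the transparency property the phase summary claims.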

## Phase 2 – CLI UX Refresh *(Completed)*
**Summary**: CLI accepts short selectors across commands, `abersetz engines` renders the condensed table with configuration markers, and `lang` lists popular targets using `langcodes`. Snapshot-style Fire runner tests are in place for the engines output.

## Phase 3 – Engine Validation Command *(Completed)*
**Summary**: A new `validation.py` module powers `abersetz validate` and the post-setup health check. The command translates a short sample per selector, displays results in a rich table, and ships with exhaustive unit tests using stub engines.

## Phase 4 – Auto-Configuration & Engine Research Enhancements
**Goals**: Broaden provider awareness and produce smarter defaults using the research in `external/` and recent API trends.
- Automate provider metadata extraction from `external/translators.txt`, `external/deep-translator.txt`, and current API research (Google, Azure, AWS, DeepL, LibreTranslate, OpenRouter, etc.) so discovery stays accurate without manual updates.
- Keep pricing/tier hints fresh by syncing the new metadata into setup output (including highlighting free/community tiers).
- Add structured hints for optional packages the user might need (e.g., install `translators[google]`).
- Allow users to opt into community/self-hosted engines like LibreTranslate with a `--include-community` flag.
- Document every provider addition in `DEPENDENCIES.md` with justification referencing external sources.

## Phase 5 – Documentation, Examples, and Tests
**Goals**: Make abersetz feel “vibrant” with real-world material and guardrails.
- Update `README.md`, `CLAUDE.md`, and `CHANGELOG.md` with the new selector format, validation workflow, and configuration guidance.
- Expand `WORK.md` logging template to capture validation runs and outcomes per session.
- Add at least three runnable examples in `examples/`: (1) Multi-file translation with mixed engines, (2) Validation summary report, (3) Config diff before/after setup.
- Create user-facing walkthrough in `docs/` (or extend README) describing how to pick engines based on cost, drawing on the external research completed above.
- Ensure tests cover: selector normalization, CLI output, validation command, setup integration, and documentation link checks (using `pytest` or `pytest-regressions`).
- 2025-09-21 Action Items *(Completed 2025-09-21)*:
  - Added regression tests for `config_dir` fallback and `resolve_credential` logging (see `tests/test_config.py`).
  - Covered `_filter_available` deduping plus `collect_translator_providers` import failure with new cases in `tests/test_engine_catalog.py`.
  - Extended `tests/test_cli.py` to assert `_collect_engine_entries` respects `family` and `configured_only` filters, including long-form family selectors.

## Maintenance Sprint – Reliability Polish *(Completed 2025-09-21)*
**Objective**: Raise confidence in existing helper utilities by tightening coverage for configuration defaults, enforcing typed report generation, and locking down parallel translation behaviour.

### Task 1 – Cover setup discovery and default fallbacks
- **Problem**: `setup.py` still has uncovered lines (221, 262, 286, 471), leaving the endpoint testing branch, fallback logging, and empty-engine return path unverified.
- **Approach**: Extend `tests/test_setup.py` with focused cases to (a) drive `_test_endpoints` through the base-url path, (b) force `_test_single_endpoint` to hit the unknown JSON branch and verbose failure logging, and (c) assert `_select_default_engine` returns `None` when no engines exist.
- **Tests**: `python -m pytest tests/test_setup.py -k "test_endpoints or test_test_single_endpoint or select_default_engine" -xvs`; full suite for regression.
- **Success Criteria**: Coverage report no longer lists `setup.py` lines 221/262/286/471 as missed while preserving existing behaviour.
- **Status**: Completed 2025-09-21 11:03 UTC — `tests/test_setup.py` cases now cover API-provider endpoint flow, verbose failure logging, and empty engine fallback (setup.py shows 100% coverage).

### Task 2 – Harden `TranslationWorkflow.generate_report` typing
- **Problem**: `examples/advanced_api.py` triggers mypy "object is not indexable" errors and keeps report aggregation loosely typed.
- **Approach**: Introduce lightweight `TypedDict`/dataclass helpers to describe report structure, replace untyped dict juggling with explicit updates, and ensure tests confirm error aggregation and per-language stats.
- **Tests**: `python -m pytest tests/test_examples.py -k "translation_workflow" -xvs`; `uvx mypy examples/advanced_api.py tests/test_examples.py`.
- **Success Criteria**: Mypy no longer flags lines 68-79/307; report output stays identical.
- **Status**: Completed 2025-09-21 11:03 UTC — `TranslationWorkflow.generate_report` now uses typed helpers; `uvx mypy .` dropped the union-attr diagnostic and regression tests confirm identical JSON output.
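The typed-report approach from Task 2 can be sketched like this. Field names (`total_files`, `per_language`, etc.) are assumptions for illustration; the actual schema in `examples/advanced_api.py` may differ.

```python
from typing import TypedDict

class LanguageStats(TypedDict):
    files: int
    characters: int

class Report(TypedDict):
    total_files: int
    errors: list[str]
    per_language: dict[str, LanguageStats]

def make_report() -> Report:
    return {"total_files": 0, "errors": [], "per_language": {}}

def record(report: Report, lang: str, characters: int) -> None:
    """Explicit, typed updates instead of untyped dict juggling."""
    report["total_files"] += 1
    stats = report["per_language"].setdefault(lang, {"files": 0, "characters": 0})
    stats["files"] += 1
    stats["characters"] += characters
```

Because every key is declared up front, mypy can index the nested dicts without "object is not indexable" complaints, while the emitted JSON stays byte-identical to the untyped version.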

### Task 3 – Add `ParallelTranslator` behaviour tests
- **Problem**: `examples/advanced_api.py` coverage sits at 47% with async comparison utilities untested, risking regressions in demo code.
- **Approach**: Mock `create_engine` to supply stub engines returning deterministic translations and one raising an exception; assert `compare_translations` surfaces both successes and failures.
- **Tests**: `python -m pytest tests/test_examples.py -k parallel_translator -xvs`; smoke the async helper under coverage.
- **Success Criteria**: Coverage for `examples/advanced_api.py` climbs (target ≥70%), verifying both success and error handling paths for the async translator.
- **Status**: Completed 2025-09-21 11:03 UTC — new async and example-facing tests drive coverage to 88% with deterministic stubs for success/failure flows.
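The stub-engine pattern behind Task 3 looks roughly like the following. `StubEngine` and this `compare_translations` are illustrative stand-ins; the real helpers in `examples/advanced_api.py` may differ.

```python
import asyncio
from typing import Optional

class StubEngine:
    """Deterministic stand-in for what create_engine() would return in tests."""

    def __init__(self, name: str, text: Optional[str] = None,
                 error: Optional[str] = None) -> None:
        self.name = name
        self._text = text
        self._error = error

    async def translate(self, text: str) -> str:
        if self._error is not None:
            raise RuntimeError(self._error)
        return self._text if self._text is not None else text.upper()

async def compare_translations(engines: list, text: str) -> dict:
    """Collect per-engine results, surfacing successes and failures alike."""
    results: dict = {}
    for engine in engines:
        try:
            results[engine.name] = {"ok": True, "text": await engine.translate(text)}
        except Exception as exc:  # demo code records all failures
            results[engine.name] = {"ok": False, "error": str(exc)}
    return results
```

One stub returns a fixed translation and one raises, so a single test run asserts both the success path and the error-surfacing path.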

## Maintenance Sprint – Coverage Hardening *(Completed 2025-09-21)*
**Goal**: Close remaining high-signal coverage gaps in CLI/engine helpers to reinforce translation and setup reliability without adding new surface area.

### Task A – Cover CLI result rendering output *(done)*
- Added `test_render_results_lists_destinations` capturing a wide-console `Console` sink to assert both destination paths appear in the render output, raising `src/abersetz/cli.py` line coverage for `_render_results` to 100%.
- Reused existing `TranslationResult` dataclass without introducing dependencies; ensures future CLI `tr` flows list translated files reliably.

### Task B – Cover translators single-string provider configuration *(done)*
- Added `test_collect_engine_entries_accepts_single_provider_string` to verify configs defining only `"provider"` mark the provider as configured and discoverable, exercising lines 134-135 of `src/abersetz/cli.py`.
- Retained existing fixture coverage for list-based providers to guard dedupe logic and avoid regressions in config migrations.

### Task C – Cover HTML-specific chunk sizing logic *(done)*
- Added `test_engine_base_chunk_size_prefers_html_then_plain` to ensure `EngineBase.chunk_size_for` prefers `html_chunk_size` when present and falls back to `chunk_size`/`None`, covering lines 81-83 of `src/abersetz/engines.py`.
- Validates pipeline fallback semantics without any new abstractions.

### Verification
- `python -m pytest -k "render_results or collect_engine_entries or chunk_size_for" -xvs`
- `python -m pytest -xvs`
- `python -m pytest --cov=. --cov-report=term-missing`
- `uvx mypy .`
- `uvx bandit -r .`

## Micro Maintenance – CLI Shell Coverage *(Completed 2025-09-21)*
**Objective**: Close the remaining uncovered CLI wrapper lines so Fire entrypoints, setup passthrough, and pipeline error surfacing stay regression-proof without inflating scope.

- **Task: Guard translation error reporting** *(done — `tests/test_cli.py::test_cli_translate_reports_pipeline_error`)*
  - Patched `translate_path` via `monkeypatch` to raise `PipelineError` and asserted `[red]...[/red]` console output prior to re-raising. Covered `cli.py:373-375`.
  - Verification: `python -m pytest tests/test_cli.py -k pipeline_error -xvs` plus full suite; `cli.py` coverage climbed to 96%.

- **Task: Exercise setup wrapper** *(done — `tests/test_cli.py::test_cli_setup_forwards_flags`)*
  - Mocked `setup_command` to capture arguments ensuring `AbersetzCLI.setup` forwards flags intact, covering `cli.py:435`.
  - Verification: Targeted pytest run plus full suite; no dependencies added.

- **Task: Verify Fire entrypoints** *(done — `tests/test_cli.py::test_cli_main_invokes_fire`, `...::test_cli_abtr_main_invokes_fire_with_tr`)*
  - Patched `fire.Fire` to capture invocations from `main()`/`abtr_main()` verifying callables, covering `cli.py:464` and `cli.py:474`.
  - Verification: Targeted pytest run plus full suite; no new dependencies.

## Maintenance Sprint – Config & CLI Reliability *(Completed 2025-09-21)*
**Objective**: Close uncovered branches in engine catalog and configuration helpers to keep short selector UX deterministic.

- **Task 1: Cover `ConfigCommands.show` and `.path`**
  - Implementation: Extended `tests/test_cli.py::test_cli_config_commands_show_and_path` to seed a temp config directory, call `ConfigCommands().show()`, parse the TOML, and assert `.path()` resolves to the injected location.
  - Verification: `python -m pytest tests/test_cli.py -k "config_commands" -xvs`, full suite, and coverage run (`python -m pytest --cov=. --cov-report=term-missing`).

- **Task 2: Cover single-provider branches in `_collect_engine_entries`**
  - Implementation: Added `tests/test_cli.py::test_collect_engine_entries_string_branches` exercising string-only `providers` and non-dict `profiles`, asserting selectors `tr/bing`, `dt/deepl`, `ll/default`, and `hy` report `configured=True`.
  - Verification: Targeted pytest invocation (`python -m pytest tests/test_cli.py -k "string_branches" -xvs`) plus full suite with coverage.

- **Task 3: Exercise recursive credential resolution**
  - Implementation: Added `tests/test_config.py::test_resolve_credential_recurses_into_stored_secret` to clear env vars, ensure recursion traverses stored credentials, and capture log output for the missing env debug message.
  - Verification: `python -m pytest tests/test_config.py -k recurses -xvs`, full suite, coverage, mypy, and bandit.

## Phase 6 – Release Readiness
**Goals**: Ship confidently with verifiable quality.
- Run `python -m pytest --cov=. --cov-report=term-missing` targeting ≥90% coverage for touched modules.
- Status: coverage sits at 91% as of 2025-09-21; maintain ≥90% when new features land.
- Execute manual smoke tests: `abersetz engines`, `abersetz validate`, `abersetz setup --non-interactive`, and a sample `abersetz tr` flow using faked engines.
- Prepare release notes summarizing selector transition, validation command, and provider updates.
- Tag version bump once manual verification passes and ensure packaging metadata references the new docs.

## Dependencies & External References
- Continue using `translators` and `deep-translator` for traditional MT; reference `external/translators.txt` and `external/deep-translator.txt` for supported provider lists.
- Lean on `httpx`, `tenacity`, `rich`, and `loguru` already in the project—no new heavy dependencies unless unavoidable.
- Monitor provider docs cited in the Perplexity research (Google, Azure, AWS, DeepL, Translate.com, LibreTranslate) when refining setup hints.

## Timeline & Checkpoints
1. **Week 1**: Complete Phases 1–2 (selector normalization, CLI refresh) with tests.
2. **Week 2**: Deliver Phase 3 validation command and integrate with setup, plus accompanying tests.
3. **Week 3**: Finish provider research updates and documentation/examples (Phases 4–5).
4. **Week 4**: Conduct release readiness sweep (Phase 6), run full test matrix, and publish release.

## Exit Criteria
- CLI displays only short selectors and accepts both short and legacy forms.
- `abersetz validate` reliably checks each engine and is invoked during setup.
- Documentation, examples, and tests reflect the new workflow.
- QA checklist (automated + manual) executed with recorded results in `WORK.md`.

## Maintenance Sprint – Coverage & Docs
- **Expand validation coverage** *(Completed 2025-09-21 — `validation.py` now 100% covered)*: Add focused tests for `_append_selector`, `_extract_providers`, and `_selectors_from_config` branches (include-defaults toggle, ullm fallback) to push `validation.py` coverage to ≥95%.
- **CLI validation tests** *(Completed 2025-09-21 — selector parsing verified in `tests/test_cli.py`)*: Extend `tests/test_cli.py` to exercise `AbersetzCLI.validate` and `AbersetzCLI.lang` flows using stub engines to bump `cli.py` coverage above 90% and catch regressions.
- **Documentation refresh** *(Completed 2025-09-21 — validation guidance added to CLI + configuration docs)*: Update `docs/cli.md` (and cross-link in `docs/configuration.md`) with guidance for limiting selectors during validation, showcasing `--selectors`/`--include-defaults` to avoid long offline runs.

## Maintenance Sprint – CLI Reliability Tune-Up
- **Exercise parsing helpers via tests** *(Completed 2025-09-21 — coverage in tests/test_cli.py)*: Add focused unit tests for `_parse_patterns` (None, comma-separated string, tuple inputs) and `_load_json_data` (inline JSON vs. file path) to prevent regressions in selector/pattern handling.
- **Cover console rendering fallbacks** *(Completed 2025-09-21 — see tests/test_cli.py)*: Write tests capturing `_render_engine_entries([])` and `_render_validation_entries([])` to assert the user guidance strings, ensuring CLI behaves gracefully when configurations are empty.
- **Hit provider-aggregation branches** *(Completed 2025-09-21 — tests plus `_collect_engine_entries` fix)*: Construct stub configs feeding single `provider` strings (vs. list) and ullm profiles so `_collect_engine_entries` traverses every branch, and update tests to assert selectors differ when `configured_only` or `family` filters are applied.
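The string-vs-list provider handling those branches exercise reduces to a normalization step like the one below. The helper name is hypothetical; `_collect_engine_entries` in `cli.py` performs its own traversal.

```python
def providers_as_list(value: object) -> list[str]:
    """Normalize a config value that may be one provider string or a list."""
    if value is None:
        return []
    if isinstance(value, str):
        # Single "provider" string: treat it as a one-element list.
        return [value]
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    return []
```

Normalizing early means the downstream selector-building loop only ever sees a list, which is why both config shapes can mark the provider as configured.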

## Maintenance Sprint – Configuration & Catalog Hardening *(Completed 2025-09-21)*
- **Task: Guard Defaults/EngineConfig fallbacks** *(Completed 2025-09-21 — tests/test_config.py)*  
  - *Outcome*: Added regression tests for `Defaults.from_dict(None)` and `EngineConfig.from_dict(name, None)`, confirming defaults remain intact when optional sections are stripped. Coverage for `src/abersetz/config.py` climbed from 94% to 97%.  
  - *Verification*: `python -m pytest tests/test_config.py -k "defaults_from_dict or engine_config_from_dict" -xvs` plus full suite; all pass.
- **Task: Strengthen Credential conversion coverage** *(Completed 2025-09-21 — tests/test_config.py)*  
  - *Outcome*: Extended credential tests to assert optional field serialization and to lock in the `TypeError` raised for unsupported payload types. Prevents regressions when validating user supplied credentials.  
  - *Verification*: `python -m pytest tests/test_config.py -k "credential_to_dict or credential_from_any" -xvs` and full-suite run.
- **Task: Cover deep translator provider aggregation** *(Completed 2025-09-21 — tests/test_engine_catalog.py)*  
  - *Outcome*: Ensured `collect_deep_translator_providers(include_paid=True)` returns deterministic, duplicate-free lists with paid providers appended. `engine_catalog.py` coverage rose to 95%.  
  - *Verification*: `python -m pytest tests/test_engine_catalog.py -k "deep_translator_providers" -xvs` and overall suite.

## Maintenance Sprint – Quality Guardrails *(Completed 2025-09-21)*
- **Task 1: Offline integration coverage for `translate_path`** *(Completed 2025-09-21 — see `tests/test_pipeline.py::test_translate_path_handles_mixed_formats`)*
  - Objective: Raise coverage for the translation pipeline by exercising the happy-path file walk with stub engines, ensuring we catch regressions in include/exclude handling and output directory creation (current gap: `tests/test_integration.py` covers only 24%).
  - Implementation: Added a mixed-format test that drives `translate_path` across TXT and HTML fixtures with the `DummyEngine`, asserting destination locations, normalized selectors, and preserved format metadata.
  - Verification: `python -m pytest tests/test_pipeline.py -k mixed_formats -xvs` then full suite confirmed behaviour; pipeline coverage climbed to 96%.
- **Task 2: Harden Optional[str] handling in `examples/basic_api.py`** *(Completed 2025-09-21 — guarded via `format_example_doc` helper and `tests/test_examples.py`)*
  - Objective: Eliminate runtime `AttributeError` and mypy warnings (`Item "None" has no attribute "strip"`) by guarding optional user inputs before calling `.strip()` or similar operations.
  - Implementation: Introduced `format_example_doc` helper with safe fallback messaging and updated CLI output to rely on it; added protocol-driven loader test ensuring None docs are handled gracefully.
  - Verification: `python -m pytest tests/test_examples.py -xvs` plus full suite; mypy on the module now avoids union-attr complaints (remaining missing-stub errors documented in WORK.md).
- **Task 3: Stabilize setup credential typing** *(Completed 2025-09-21 — see `tests/test_setup.py::test_generate_config_uses_fallbacks`)*
  - Objective: Reduce `uvx mypy .` noise by fixing `tests/test_setup.py:580` (`Credential | None` attribute access) and related optional-handling in setup helpers so None cases are explicitly guarded.
  - Implementation: Added explicit guard before dereferencing the `ullm` credential in the setup test to narrow the type and better document the expectation.
  - Verification: `python -m pytest tests/test_setup.py -k generate_config_uses_fallbacks -xvs`; mypy union-attr warning cleared though missing third-party stubs still block a clean run.

## Maintenance Sprint – Coverage Polish II *(Completed 2025-09-21)*
- **Task 1: Guard examples CLI dispatch**
  - Outcome: Added `test_basic_api_cli_dispatch_runs_requested_example` which executes the module as `__main__` with a stubbed translator; `examples/basic_api.py` now hits 100% coverage and CLI dispatch regressions surface immediately.
  - Verification: `python -m pytest tests/test_examples.py -k cli_dispatch -xvs` then full suite; output confirmed stub execution without the usage banner.
- **Task 2: Validate chunk_text fallback paths**
  - Outcome: Fresh tests cover the empty-input guard and forced `ImportError` fallback, pushing `chunking.py` to 100% coverage and proving graceful degradation without `semantic_text_splitter`.
  - Verification: `python -m pytest tests/test_chunking.py -k "blank or fallback" -xvs` plus full suite; fallback slices matched expectations.
- **Task 3: Assert __getattr__ rejects unknown exports**
  - Outcome: Regression test guarantees typos raise `AttributeError` while repeated access returns the cached pipeline export, preventing silent import failures.
  - Verification: `python -m pytest tests/test_package.py -k getattr -xvs` followed by full suite.
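The `chunk_text` fallback behaviour covered by Task 2 above roughly amounts to fixed-size slicing when the optional splitter is missing. This is a sketch under that assumption; the real logic in `src/abersetz/chunking.py` may differ.

```python
def naive_chunks(text: str, size: int) -> list[str]:
    """Degraded-mode chunking: plain fixed-size slices of the input."""
    if not text:
        return []
    return [text[i : i + size] for i in range(0, len(text), size)]

def chunk_text(text: str, size: int) -> list[str]:
    if not text:  # empty-input guard
        return []
    try:
        from semantic_text_splitter import TextSplitter  # optional dependency
    except ImportError:
        return naive_chunks(text, size)
    return TextSplitter(size).chunks(text)
```

Forcing the `ImportError` branch in tests then only requires hiding `semantic_text_splitter` and asserting the slices.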

## Maintenance Sprint – Coverage Polish III *(Completed 2025-09-21)*
- **Task 1: Exercise deep-translator string provider branch**
  - Outcome: Added `test_collect_engine_entries_handles_deep_translator_string_providers` to assert `dt/libre` is emitted and marked configured when `providers` is configured as a string, bringing `_collect_engine_entries` to 100% coverage.
  - Verification: `python -m pytest tests/test_cli.py -k deep_translator_string -xvs`, followed by the full suite.
- **Task 2: Round out EngineConfig serialization**
  - Outcome: Added `test_engine_config_to_dict_includes_optional_fields` ensuring `chunk_size`, `html_chunk_size`, and nested credentials survive serialization, covering the previously missed optional-field branches.
  - Verification: `python -m pytest tests/test_config.py -k engine_config_to_dict -xvs`, plus the full suite.
- **Task 3: Normalize selectors for null/blank inputs**
  - Outcome: Extended `tests/test_engine_catalog.py` with guards for `None`, whitespace-only strings, and empty-base selectors so normalization behaviour stays regression-proof.
  - Verification: `python -m pytest tests/test_engine_catalog.py -k normalize_selector -xvs`, then the full suite.

## Maintenance Sprint – Reliability Boost *(Completed 2025-09-21)*
- **Task: Enforce pipeline read-permission safeguards** *(tests/test_pipeline.py::test_translate_path_errors_on_unreadable_file)*
  - Outcome: Simulated a zero-permission file on POSIX systems and confirmed `translate_path` raises `PipelineError` with the offending path in the message. Test gated on Windows to avoid chmod inconsistencies. Verified with `python -m pytest tests/test_pipeline.py -k unreadable -xvs`.

- **Task: Exercise `_persist_output` write-over & voc paths** *(tests/test_pipeline.py::test_translate_path_write_over_updates_source, ..._dry_run_skips_io)*
  - Outcome: Validated write-over updates the original file and emits a `.voc.json` when requested, while dry runs skip filesystem writes entirely. Ensures coverage of `_persist_output` branches. Verified with targeted pytest invocations and full-suite run.

- **Task: Mock example flows for coverage** *(tests/test_examples.py suite)*
  - Outcome: Replaced real translations with stub results across example functions and CLI usage path, boosting `examples/basic_api.py` coverage to 98% and guaranteeing documentation examples remain executable without network access. Verified via `python -m pytest tests/test_examples.py -xvs` and coverage summary.

## Maintenance Sprint – Engine Factory Reliability *(Completed 2025-09-21)*
- **Task 1: Harden `_build_llm_engine` error handling**
  - Outcome: Added regression tests asserting `EngineError` for missing model definitions and credentials, proving factories fail fast rather than constructing misconfigured engines. Covered via `tests/test_engines.py::test_build_llm_engine_without_model_raises_engine_error` and `..._without_credential_raises_engine_error`.
  - Verification: `python -m pytest tests/test_engines.py -k "llm_engine_without" -xvs` plus full suite.

- **Task 2: Verify `_select_profile` and profile resolution**
  - Outcome: Extended engine tests to exercise default profile selection, no-profile fallback, and error handling for unknown variants, lifting coverage across lines 356-363.
  - Verification: `python -m pytest tests/test_engines.py -k select_profile -xvs` and project-wide test sweep.

- **Task 3: Cover OpenAI client construction and unsupported selectors**
  - Outcome: Added tests for `_make_openai_client` base URL logic and `create_engine` unsupported-engine guard, ensuring new selectors cannot bypass validation and clients honor configured endpoints.
  - Verification: `python -m pytest tests/test_engines.py -k "openai_client or custom" -xvs` followed by full suite.

## Maintenance Sprint – Reliability Guard IV *(Completed 2025-09-21)*
**Outcome**: Closed critical coverage gaps in credential resolution, pipeline warnings, and LLM payload parsing without expanding surface area.

- **Task 1: Cover `resolve_credential` null and chained secrets**
  - Added `tests/test_config.py::test_resolve_credential_returns_none_for_null_reference` and `...::test_resolve_credential_reuses_stored_alias_object`, ensuring credential-less payloads shortcut to `None` and stored alias recursion hits the terminal fallback branch (line 322).
  - Verified via `python -m pytest tests/test_config.py -k "resolve_credential_returns_none or resolve_credential_reuses" -xvs` plus the full suite.

- **Task 2: Warn on oversized input in pipeline**
  - Introduced `tests/test_pipeline.py::test_translate_path_warns_on_large_file`, monkeypatching `Path.stat` to emulate an 11MB document and asserting the loguru warning while `translate_path` still succeeds.
  - Confirmed with `python -m pytest tests/test_pipeline.py -k warns_on_large_file -xvs` and the full suite; `src/abersetz/pipeline.py` now reports 100% coverage.

- **Task 3: Exercise LLM payload parsing fallbacks**
  - Extended `tests/test_engines.py` with `_make_llm_engine` helper and new cases covering payloads without `<voc>`, malformed JSON, non-dict payloads, plus a missing configuration guard for `create_engine("tr/google")` when `translators` config is absent, lifting `src/abersetz/engines.py` to 98% coverage.
  - Verified using `python -m pytest tests/test_engines.py -k "parse_payload or missing_selector" -xvs` alongside the full test run.

## Maintenance Sprint – Coverage Guard V *(Completed 2025-09-21)*
**Objective**: Eliminate the remaining uncovered defensive branches in `setup.py` and `engines.py` so setup-time diagnostics and engine factories behave predictably even under misconfiguration.

**Current Gaps & Risks**
- `src/abersetz/engines.py` still leaves HTML translation path (line 103), deep-translator provider rejection (line 158), and `_build_hysf_engine` credential guard (line 346) untested. Without regression tests these guards could regress silently, breaking HTML translations or masking credential issues.
- `src/abersetz/setup.py` misses coverage for provider discovery enrichments (line 189), endpoint probing fallbacks for dict/list payloads (lines 258-261), verbose progress logging (lines 282-285), and default engine fallback logic (lines 451-459). These paths ensure accurate hints when users run `abersetz setup`.
- Previous research (pytest-httpx / respx) confirmed we can emulate HTTPX responses without network calls; Loguru capture requires routing via `logger.add(caplog.handler)` per upstream guidance.

**Constraints**
- No new runtime dependencies; reuse pytest monkeypatching and lightweight stubs instead of introducing `pytest-httpx` or `respx`.
- Tests must remain offline-friendly and deterministic across CI.
- Keep functions ≤20 lines and files ≤200 lines; prefer adding to existing test modules (`tests/test_setup.py`, `tests/test_engines.py`).

### Task 1 – Probe `_test_single_endpoint` data handling & verbose logging
- **Implementation**
  1. Monkeypatch `httpx.Client` with a synchronous stub returning crafted responses so `_test_single_endpoint` sees (a) `{ "data": [...] }` and (b) list payloads without network access.
  2. Instantiate `SetupWizard(non_interactive=True, verbose=True)` with a `DiscoveredProvider` carrying a base URL to trigger the HTTP path.
  3. Assert the provider toggles `is_available`, `model_count`, and captures both success and failure branches; use a temporary Loguru handler tied to `caplog` to exercise verbose log lines.
- **Tests**
  - Add `test_test_single_endpoint_populates_model_count_from_dict_and_list` covering the dict/list parsing branch.
  - Add `test_test_single_endpoint_logs_verbose_status` validating the debug message content when `verbose` is enabled.
- **Edge Cases**
  - Ensure 200 responses without `data` default model_count to 1.
  - Simulate a non-200 response to confirm the existing error pathway remains untouched.
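The dict/list payload handling Task 1 targets can be emulated offline with a response stub along these lines. Both names are assumptions for the sketch; the real `_test_single_endpoint` in `src/abersetz/setup.py` may differ.

```python
class StubResponse:
    """Minimal stand-in for an httpx response in offline endpoint tests."""

    def __init__(self, status_code: int, payload: object) -> None:
        self.status_code = status_code
        self._payload = payload

    def json(self) -> object:
        return self._payload

def model_count_from_payload(payload: object) -> int:
    """Mirror the dict/list parsing branches described above."""
    if isinstance(payload, dict):
        data = payload.get("data")
        # 200 responses without a "data" list default the count to 1.
        return len(data) if isinstance(data, list) else 1
    if isinstance(payload, list):
        return len(payload)
    return 1
```

A monkeypatched `httpx.Client` whose `get` returns `StubResponse` objects lets the test drive both branches without network access.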

### Task 2 – Harden provider discovery & validation defaults
- **Implementation**
  1. Use `monkeypatch.setenv("DEEPL_API_KEY", "token")` and instantiate `SetupWizard` to run `_discover_providers`, verifying the DeepL credential adds `deep-translator/deepl` via `normalize_selector` (covers line 189).
  2. Create a wizard instance and call `_validate_config([])` to hit the early return guard (line 148) without emitting tables; ensure the method exits cleanly and leaves `validation_results` empty.
  3. Extend an integration-style test that runs `_generate_config` with discovered DeepL/OpenAI providers so default engine selection falls through to the correct branch (lines 451-459).
- **Tests**
  - Add `test_discover_providers_includes_deepl_engine_mapping`.
  - Add `test_validate_config_returns_immediately_when_no_results`.
  - Extend or add `test_generate_config_sets_default_engine_priority` to assert fallback ordering.
- **Edge Cases**
  - Clean up environment variables post-test to avoid cross-test interference.
  - Mock console printing via Fire-safe `Console` or rely on non-interactive mode to avoid actual terminal output.

### Task 3 – Cover remaining engine error paths
- **Implementation**
  1. Monkeypatch `sys.modules["translators"]` with a stub exposing `translate_html` / `translate_text` to verify `TranslatorsEngine` routes HTML requests correctly (line 103).
  2. Monkeypatch `sys.modules["deep_translator"]` with minimal provider classes and assert `DeepTranslatorEngine("invalid", config)` raises `EngineError` (line 158).
  3. Build a minimal `AbersetzConfig` without credentials and assert `_build_hysf_engine` surfaces the `Missing credential` error (line 346).
- **Tests**
  - Add `test_translators_engine_translates_html_with_stub` ensuring HTML path used and reuse existing stubbed request fixtures.
  - Add `test_deep_translator_engine_rejects_unknown_provider` raising `EngineError`.
  - Add `test_build_hysf_engine_without_credential_raises_error` to lock in the credential guard.
- **Edge Cases**
  - Restore original modules in `sys.modules` after each test to prevent bleed-over.
  - Keep stubs minimal to avoid accidentally depending on real packages.
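The `sys.modules` stubbing with guaranteed restoration that Task 3 relies on can be sketched as a small install/restore pair (function names are illustrative; the project's tests use `monkeypatch` for the same effect):

```python
import sys
import types
from collections.abc import Callable

def install_stub_translators() -> tuple[types.ModuleType, Callable[[], None]]:
    """Install a minimal 'translators' stub and return it with a restorer."""
    original = sys.modules.get("translators")
    stub = types.ModuleType("translators")
    stub.translate_html = lambda text, **kw: f"html:{text}"
    stub.translate_text = lambda text, **kw: f"text:{text}"
    sys.modules["translators"] = stub

    def restore() -> None:
        # Put the real module back (or drop the stub) to prevent bleed-over.
        if original is None:
            sys.modules.pop("translators", None)
        else:
            sys.modules["translators"] = original

    return stub, restore
```

Because Python checks `sys.modules` before searching for packages, any `import translators` inside the code under test resolves to the stub until `restore()` runs.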

### Testing & Verification
- Follow RED ➜ GREEN ➜ REFACTOR per task: write failing tests first, implement minimal fixtures/stubs, rerun targeted tests, then `python -m pytest -xvs` and the coverage suite to confirm 96% stays or improves.
- Run `uvx mypy .` and `uvx bandit -r .` post-implementation to ensure diagnostics remain unchanged.
- *Completed 2025-09-21 via targeted pytest runs recorded in `WORK.md`; full suite now reports 97% coverage with engines/setup guardrails locked in.*

### Dependencies
- No new dependencies required; rely on `pytest`, `monkeypatch`, `caplog`, and custom stubs per research recommendations.

### Future Considerations
- Investigate the recursive fallback at `config.py:322`, which currently risks runaway recursion when stored credentials lack explicit values; a follow-up may refactor this logic once current guardrails are covered.
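One shape such a refactor could take is an explicit cycle guard during alias recursion. This is only a sketch of the idea; the real `resolve_credential` in `src/abersetz/config.py` has different internals and this helper's signature is an assumption.

```python
import os
from typing import Optional

def resolve_credential(
    ref: Optional[str],
    store: dict,
    _seen: Optional[set] = None,
) -> Optional[str]:
    """Resolve a reference via env var, stored alias chain, or literal fallback."""
    if ref is None:
        return None
    seen = _seen if _seen is not None else set()
    if ref in seen:  # cycle guard: stops runaway recursion on circular aliases
        return None
    seen.add(ref)
    value = os.environ.get(ref)
    if value:
        return value
    if ref in store:
        return resolve_credential(store[ref], store, seen)
    return ref  # terminal fallback: treat the reference as the literal secret
```

Tracking visited references makes a self-referencing alias chain terminate with `None` instead of recursing forever.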

## Maintenance Sprint – Type & Setup Reliability *(Completed 2025-09-21)*
**Goal**: Reduce mypy noise in test helpers and lock in setup defaults by adding focused tests without expanding runtime surface area.

### Task 1 – Tighten example protocol typing *(done)*
- **Problem**: `tests/test_examples.py` casts the loaded module to `_BasicApiModule`, but the protocol only declares `format_example_doc`, causing mypy to flag every example function access (88 total errors cite these attributes).
- **Approach**:
  1. Expand `_BasicApiModule` to describe the example callables actually exercised (`example_simple`, `example_batch`, `example_dry_run`, `example_html`, `example_with_config`, `example_llm_with_voc`, and `cli`).
  2. Replace ad-hoc `object` stubs in the config helper with lightweight dataclasses carrying typed attributes so mypy no longer reports `object` attribute access.
  3. Keep runtime behaviour identical; only adjust typing helpers and shared fixtures.
- **Verification**: `uvx mypy tests/test_examples.py` now reports only missing third-party stubs; `python -m pytest tests/test_examples.py -xvs` passes with existing coverage snapshot.

### Task 2 – Match Path.stat monkeypatch signature *(done)*
- **Problem**: The large-file warning test monkeypatches `Path.stat` with `fake_stat(*args, **kwargs)`, leading mypy to complain about an unexpected `**dict[str, object]` argument (error at `tests/test_pipeline.py:203`).
- **Approach**:
  1. Implement a small helper returning an `os.stat_result` via `os.stat_result((mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime))` or use `types.SimpleNamespace` with typed attributes.
  2. Define `fake_stat(self: Path, *, follow_symlinks: bool = True) -> os.stat_result` to mirror the real signature.
  3. Restore the original `Path.stat` after the test to avoid cross-test effects.
- **Verification**: `python -m pytest tests/test_pipeline.py -k "warns_on_large_file" -xvs`; `uvx mypy tests/test_pipeline.py` now only flags missing stubs.
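A minimal sketch of the helper described above (the helper names and the 50 MiB size are illustrative; the real fixture lives in `tests/test_pipeline.py`):

```python
import os
from pathlib import Path

def make_stat_result(size: int) -> os.stat_result:
    # os.stat_result accepts a 10-tuple:
    # (mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime)
    return os.stat_result((0o100644, 0, 0, 1, 0, 0, size, 0, 0, 0))

def fake_stat(self: Path, *, follow_symlinks: bool = True) -> os.stat_result:
    # Mirrors Path.stat's real keyword-only signature so mypy accepts
    # the monkeypatched replacement without an arg-type complaint.
    return make_stat_result(50 * 1024 * 1024)  # pretend the file is 50 MiB
```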

### Task 3 – Cover setup default engine fallback *(done)*
- **Problem**: `src/abersetz/setup.py` lines 453-459 (fallback to `ullm/default` or first engine) remain uncovered, leaving default selection unverified when only ullm/self-hosted engines exist.
- **Approach**:
  1. Build a `SetupWizard` instance with fabricated `DiscoveredProvider` entries representing only `ullm` (available) plus a custom engine to exercise each branch.
  2. Invoke `_generate_config()` to assert defaults prefer `ullm/default` when present and fall back to the first engine when no preferred family is available.
  3. If the helper is private, add the test to `tests/test_setup.py`, constructing the wizard directly in non-interactive mode.
- **Verification**: `python -m pytest tests/test_setup.py -k defaults_to_ullm -xvs`; `setup.py` uncovered lines reduced to the verbose failure path plus fallback guard.
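The fallback order under test can be sketched as follows (hypothetical helper name and signature; the real branch lives in `_generate_config` in `src/abersetz/setup.py`):

```python
def select_default_engine(engine_names: list[str]) -> str:
    # Prefer ullm/default when it exists; otherwise fall back to the
    # first configured engine, mirroring the branch under test.
    if "ullm/default" in engine_names:
        return "ullm/default"
    return engine_names[0]
```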

## Maintenance Sprint – Mypy Noise Reduction *(Completed 2025-09-21 10:38 UTC)*
**Goal**: Reduce stubborn mypy diagnostics by tightening test-only helpers without touching production code paths.

### Task 1 – Restore DeepTranslator providers without Mapping.copy
- **Problem**: `tests/test_engines.py` calls `.copy()` on the providers mapping returned by `DeepTranslatorEngine._get_providers()`, which mypy flags because the return type is `Mapping[str, type]`.
- **Approach**: Snapshot providers with `dict(DeepTranslatorEngine._get_providers())`, reassign using shallow copies inside the test, and ensure the `finally` block restores the mapping with the same concrete type.
- **Verification**: `python -m pytest tests/test_engines.py -k deep_translator_engine_retries_on_network_failures -xvs`; `uvx mypy tests/test_engines.py` should drop the `attr-defined` error.
- **Status**: Completed 2025-09-21 10:38 UTC; switched to `dict(...)` snapshot in `tests/test_engines.py` and reran targeted pytest plus full suite.
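The snapshot-and-restore pattern, sketched with placeholder providers standing in for `DeepTranslatorEngine._get_providers()`:

```python
from typing import Mapping

def get_providers() -> Mapping[str, type]:
    # Stand-in for DeepTranslatorEngine._get_providers().
    return {"google": str, "deepl": bytes}

# Mapping[str, type] has no .copy(), so take a concrete dict snapshot instead.
snapshot = dict(get_providers())
try:
    patched = dict(snapshot)
    patched["google"] = int  # mutate only the shallow copy during the test
finally:
    restored = dict(snapshot)  # restore with the same concrete type
```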

### Task 2 – Make offline smoke test assertions explicit
- **Problem**: `tests/test_offline.py::test_import_works_offline` asserts module-level callables directly, which mypy warns about (`Function ... could always be true`).
- **Approach**: Replace raw truthiness checks with `is not None` and `callable(...)` assertions so type checkers see deliberate checks instead of truthy objects.
- **Verification**: `python -m pytest tests/test_offline.py -xvs`; `uvx mypy tests/test_offline.py` should eliminate the truthiness diagnostics.
- **Status**: Completed 2025-09-21 10:38 UTC; assertions now use `isinstance`/`callable`, targeted pytest run succeeded.
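For illustration, the explicit assertion style that satisfies mypy (`json.loads` stands in for any module-level callable):

```python
import json

# Truthiness check mypy flags as "Function ... could always be true":
#     assert json.loads
# Deliberate, explicit alternatives the type checker accepts:
assert json.loads is not None
assert callable(json.loads)
```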

### Task 3 – Pass typed arguments into CLI engines helper
- **Problem**: `tests/test_cli.py` invokes `AbersetzCLI().engines(**kwargs)`, which mypy interprets as supplying a `dict[str, object]` positional argument, raising an `arg-type` warning.
- **Approach**: Update the local `render` helper to accept explicit parameters and call `AbersetzCLI().engines(False, family=family, configured_only=configured_only)` (or equivalent) so the signature matches the annotated method.
- **Verification**: `python -m pytest tests/test_cli.py -k engines_lists_providers_with_filters -xvs`; `uvx mypy tests/test_cli.py` should drop the argument-type complaints.
- **Status**: Completed 2025-09-21 10:38 UTC; helper now threads positional `include_paid` and reruns targeted pytest successfully.

## Maintenance Sprint – Quality Hardening *(Completed 2025-09-21)*
**Goal**: Tighten typing assurance and coverage around existing helpers without expanding runtime surface area.

### Task 1 – Stabilize OpenAI chat namespace typing
- **Problem**: mypy reports `Chat` lacks a `completions` attribute, leaving `OpenAI` compatibility unchecked and keeping four diagnostics open in `openai_lite.py` and associated tests.
- **Approach**: Declare `completions` on `Chat` with an explicit optional `ChatCompletions` type and assign it during `OpenAI` initialization; add a focused unit test confirming the attribute is populated.
- **Tests**: `python -m pytest tests/test_openai_lite.py -k completions`, `uvx mypy src/abersetz/openai_lite.py tests/test_openai_lite.py`.
- **Dependencies**: none beyond existing stdlib and project tooling.
- **Status**: Completed 2025-09-21; see `tests/test_openai_lite.py::test_chat_declares_completions_attribute` and mypy summary (67 errors remaining).

### Task 2 – Cover setup default engine fallback
- **Problem**: `_generate_config` lines 457-459 remain untested, so default engine selection could regress when only non-priority engines are present.
- **Approach**: Extract the priority selection into a helper so we can drive the fallback branch with synthetic engine maps, then add regression tests covering translators, ullm, and unknown-engine scenarios.
- **Tests**: `python -m pytest tests/test_setup.py -k default_engine`, full suite for sanity.
- **Dependencies**: reuse pytest + existing fixtures.
- **Status**: Completed 2025-09-21; helper `_select_default_engine` exercised with new regression tests.

### Task 3 – Exercise `TranslationWorkflow.generate_report`
- **Problem**: `examples/advanced_api.py` sits at 28% coverage and lacks assertions around report aggregation and error tracking.
- **Approach**: Stub `translate_path` to feed deterministic `TranslationResult` objects, verify `translate_project` accumulates results/errors, and ensure `generate_report` returns the expected summary without touching the filesystem.
- **Tests**: `python -m pytest tests/test_examples.py -k translation_workflow`, targeted coverage check.
- **Dependencies**: no new packages; rely on pytest monkeypatch and Path utilities.
- **Status**: Completed 2025-09-21; report helper now creates parent directories before writing.

## Maintenance Sprint – Coverage & Typing Polish *(Completed 2025-09-21)*
**Outcome**: Eliminated the recursive credential loop, decoupled translator tests from private state, and hardened advanced examples so mypy defaults behave predictably.

## Maintenance Sprint – Reliability Touchups *(Completed 2025-09-21 09:24 UTC)*
**Outcome**: Closed lingering coverage gaps around optional dependency fallbacks and confirmed pipeline chunk sizing honours engine preferences.

### Task 1 – Prove chunker fallback maintains global imports
- **Problem**: `tests/test_chunking.py` left line 39 uncovered, meaning the fallback import hook never re-entered the happy path; this hid regressions where the hook could block other imports.
- **Approach**: Extended the fallback test to import a safe stdlib module after the simulated failure, asserting the hook defers to the original importer.
- **Verification**: `python -m pytest tests/test_chunking.py -xvs` passed (5 tests) and full suite confirmed 100% coverage for `chunking.py`.

### Task 2 – Verify translator provider stub preserves stdlib imports
- **Problem**: `tests/test_engine_catalog.py` missed line 77, so we never demonstrated that the translators import shim leaves other imports untouched.
- **Approach**: Updated the failure-path test to import a harmless module post-shim and assert normal behaviour, preventing regressions where the shim over-mocks builtins.
- **Verification**: `python -m pytest tests/test_engine_catalog.py -xvs` passed (13 tests) with full suite coverage now 100% for `engine_catalog.py`.

### Task 3 – Assert pipeline honours engine chunk sizing
- **Problem**: `tests/test_pipeline.py` line 27 remained uncovered, leaving the `chunk_size_for` helper unverified when options omit chunk hints.
- **Approach**: Introduced a focused pipeline test where `TranslatorOptions` left chunk sizes unset but the engine advertises a custom chunk size, then asserted the resulting translation records that size.
- **Verification**: `python -m pytest tests/test_pipeline.py -k "chunk_size" -xvs` passed; full suite shows `pipeline.py` at 100% coverage.

### Task 1 – Exercise `resolve_credential` recursion without secrets *(done)*
- Added `tests/test_config.py::test_resolve_credential_with_recursive_name_logs_once`, capturing the INFO-level guidance and verifying the helper returns `None` without infinite recursion when only a named credential exists.
- Updated `src/abersetz/config.py` to skip re-resolving identical credential objects, preventing the stack overflow observed when `SILICONFLOW_API_KEY` is unset.

### Task 2 – Refactor translator tests to avoid private attributes *(done)*
- Stubbed `sys.modules["translators"]` with lightweight fakes before engine creation so the translator tests no longer touch `engine._translators`, removing the mypy attribute errors.
- Preserved behavioural assertions for text, HTML, and retry flows by recording inputs on the stub functions.

### Task 3 – Remove Optional default pitfalls in `examples/advanced_api` *(done)*
- Replaced implicit Optional defaults with explicit guards in `TranslationWorkflow` and `vocManager.translate_with_consistency`, copying the base vocabulary to avoid mutating caller data.
- Added regression coverage in `tests/test_examples.py` to prove lazy config loading happens once and that vocabulary merging honors supplied seeds without side effects.

## Maintenance Sprint – Advanced Examples Hardening *(Completed 2025-09-21 11:37 UTC)*
**Objective**: Lift `examples/advanced_api.py` coverage above 95% and harden its CLI ergonomics without expanding functionality.

### Task 1 – Cover vocabulary loading and merging *(Completed 2025-09-21 11:37 UTC)*
- **Problem**: Lines 141-150 in `examples/advanced_api.py` remained uncovered, so regressions in `vocManager.load_voc` / `merge_vocabularies` would go unnoticed.
- **Approach**: Added `tests/test_examples.py::test_voc_manager_load_and_merge`, which writes temporary voc JSON files, loads them via `load_voc`, merges with a missing key, and asserts the merged mapping preserves latest overrides.
- **Tests**: `python -m pytest tests/test_examples.py -xvs`; `python -m pytest -xvs`.
- **Dependencies**: No new packages; reused stdlib `json` and existing fixtures.
- **Result**: `examples/advanced_api.py` coverage now 100%.
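The merge semantics pinned down by the test, as a minimal sketch (hypothetical signature; the real helper lives in `examples/advanced_api.py`):

```python
def merge_vocabularies(base: dict[str, str], update: dict[str, str]) -> dict[str, str]:
    # Later entries win, and neither input mapping is mutated.
    merged = dict(base)
    merged.update(update)
    return merged
```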

### Task 2 – Verify incremental checkpoint reuse *(Completed 2025-09-21 11:37 UTC)*
- **Problem**: `IncrementalTranslator.load_checkpoint` and `save_checkpoint` branches (lines 292-299) lacked coverage, leaving checkpoint persistence fragile.
- **Approach**: Added `tests/test_examples.py::test_example_incremental_translation_reuses_checkpoint`, seeding the checkpoint file, capturing patched `translate_path`, and asserting the checkpoint rewrites with both completed and newly processed files.
- **Tests**: `python -m pytest tests/test_examples.py -xvs`; `python -m pytest -xvs`.
- **Dependencies**: No new packages required.
- **Result**: Checkpoint branches now exercised; output asserts "Already completed" messaging.

### Task 3 – Exercise advanced CLI entrypoint usage banner *(Completed 2025-09-21 11:37 UTC)*
- **Problem**: The `__main__` guard (lines 327-343) was untested, so usage guidance or dispatch regressions would slip through.
- **Approach**: Added `tests/test_examples.py::test_advanced_api_cli_dispatch_runs_requested_example` and `tests/test_examples.py::test_advanced_api_cli_usage_banner`, monkeypatching `abersetz.translate_path` / `load_config` to keep execution lightweight while running the script via `runpy` for both dispatch and usage flows.
- **Tests**: `python -m pytest tests/test_examples.py -xvs`; `python -m pytest -xvs`; `python -m pytest --cov=. --cov-report=term-missing`.
- **Dependencies**: Reused `runpy`, `monkeypatch`, and existing stubs.
- **Result**: Advanced CLI entrypoint now covered and prints expected usage banner; suite coverage climbs to 98% overall.

## Maintenance Sprint – Type Hygiene & Chunking *(Completed 2025-09-21 12:07 UTC)*
**Objective**: Reduce type-checker noise and lock down pipeline chunk selection edge cases without expanding surface area.

### Task 1 – Silence third-party stub noise in mypy
- **Problem**: `uvx mypy .` still reports 49 errors, overwhelmingly `import-not-found` diagnostics for libraries without published stubs (pytest, httpx, tenacity, rich, loguru, platformdirs, semantic-text-splitter, translators, tomli_w, langcodes).
- **Approach**: Added per-module `ignore_missing_imports = true` overrides in `pyproject.toml` so mypy skips these external packages while continuing to type-check first-party code. Documented the rationale in `CHANGELOG.md` and `WORK.md`.
- **Tests**: `uvx mypy .` (now reduced to 3 real errors), `python -m pytest -xvs`.
- **Result**: Type-check noise shrank from 49 to 3 diagnostics (CLI optional output handling plus external dump fallback).

### Task 2 – Align integration API usage and cover string paths
- **Problem**: `tests/test_integration.py::test_translate_file_api` still calls `translate_path(..., output=...)`, triggering a mypy call-arg failure and leaving the string-path code path untested.
- **Approach**: Updated the integration test to pass `TranslatorOptions(output_dir=...)` and added new unit coverage in `tests/test_pipeline.py` to exercise string inputs/outputs, asserting translated files land in the expected directory.
- **Tests**: `python -m pytest tests/test_pipeline.py -k "string_source" -xvs`, full suite.
- **Result**: Integration test now matches the public API and regression coverage proves string paths resolve to the expected output directory.

### Task 3 – Verify HTML chunk-size fallback
- **Problem**: No test exercises `_select_chunk_size` when HTML content relies on engine-provided chunk sizes while config defaults collapse to zero, so regressions could silently change behaviour.
- **Approach**: Introduced a dedicated pipeline test that writes a small HTML file, forces defaults to `0`, and asserts `TranslationResult.chunk_size` equals the engine's HTML chunk hint.
- **Tests**: `python -m pytest tests/test_pipeline.py -k "html_uses_engine_chunk_hint" -xvs`, full suite.
- **Result**: Pipeline HTML flows now have regression coverage proving engine chunk hints override zeroed defaults.
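A sketch consistent with the fallback chain these tests verify (`_select_chunk_size` itself is private; the name and signature here are illustrative):

```python
def select_chunk_size(option_size: int, config_default: int, engine_hint: int) -> int:
    # Fallback chain under test: an explicit option wins, then the config
    # default, then the engine's own hint when both collapse to zero.
    return option_size or config_default or engine_hint
```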

## Maintenance Sprint – Residual Type & Coverage Polish *(Completed 2025-09-21 12:40 UTC)*
**Objective**: Eliminate the final mypy diagnostics and cover the remaining pipeline chunk-size branches so runtime behaviour stays predictable under mixed formats.

### Task 1 – Resolve optional output typing gaps
- **Problem**: `uvx mypy .` still reports three errors because `_build_options_from_cli` expects `str | None` while Fire passes `Path` objects, and `external/dump_models.py` narrows `api_key_env` to `None` after being inferred as `str`.
- **Approach**: Accept `Path | str | None` in the CLI option parser, normalise to `Path | None`, and update the dump script to type the intermediate `api_key_env` variable as optional while keeping behaviour identical. Add CLI regression tests to prove both string and `Path` outputs flow into `TranslatorOptions.output_dir` without breaking Fire bindings.
- **Tests**: `python -m pytest tests/test_cli.py -k "output_dir" -xvs`; `uvx mypy .`.
- **Success Criteria**: Mypy exits cleanly with zero errors; CLI tests confirm `output` arguments round-trip whether provided as strings or paths.
- **Result**: `uvx mypy .` now reports zero errors, and `tests/test_cli.py::test_cli_translate_accepts_path_output` passes while confirming Path-based outputs resolve correctly.

### Task 2 – Exercise DummyEngine chunk-size fallback
- **Problem**: Coverage reports indicate line 28 in `tests/test_pipeline.py` (the base `DummyEngine.chunk_size_for` branch) never executes, leaving the plain-text fallback unverified.
- **Approach**: Add a focused pipeline test that runs with defaults forcing `chunk_size=0` and uses the stock `DummyEngine`, asserting the returned result records the engine-provided chunk size and that the engine tracked the invocation.
- **Tests**: `python -m pytest tests/test_pipeline.py -k "dummy_engine_chunk" -xvs`; `python -m pytest -xvs`.
- **Success Criteria**: Coverage for `DummyEngine.chunk_size_for` reaches 100%, and the new test demonstrates the pipeline asks the engine for chunk sizing when defaults collapse.
- **Result**: `tests/test_pipeline.py::test_translate_path_uses_dummy_chunk_size_when_defaults_zero` passes, recording a `TextFormat.PLAIN` call and asserting the chunk size equals the engine value.

### Task 3 – Cover HtmlEngine plain-text fallback branch
- **Problem**: Line 166 in `tests/test_pipeline.py` (the `HtmlEngine.chunk_size_for` non-HTML branch) remains unexecuted, so mixed-format runs could regress without detection.
- **Approach**: Introduce a regression test that feeds both HTML and plain-text files through an `HtmlEngine` variant, verifying the method records HTML and plain invocations and that the pipeline uses the expected chunk sizes for each format.
- **Tests**: `python -m pytest tests/test_pipeline.py -k "html_engine_mixed" -xvs`; `python -m pytest -xvs`.
- **Success Criteria**: Coverage marks the `HtmlEngine` fallback branch as executed, and the regression test asserts the pipeline honours HTML-specific sizes while falling back to the plain-text chunk size for other files.
- **Result**: `tests/test_pipeline.py::test_translate_path_with_html_engine_handles_mixed_formats` passes, documenting both HTML and plain invocations and validating the expected chunk sizes.

## Maintenance Sprint – CLI Option Guardrails *(Planned)*
**Objective**: Backfill regression coverage for CLI option validation and propagation so user-facing flags behave predictably without introducing new functionality.

### Task 1 – Cover target-language requirement guard
- **Problem**: `_build_options_from_cli` raises `ValueError("Target language is required")` when `to_lang` is missing, yet no test exercises the branch, leaving the guard unverified.
- **Approach**: Add a focused unit test that invokes `_build_options_from_cli` with `to_lang=None` and asserts the exact `ValueError`, documenting Fire’s behaviour when users omit the positional language argument.
- **Tests**: `python -m pytest tests/test_cli.py -k "target_language_required" -xvs`.
- **Packages**: No new dependencies; pytest and existing CLI helpers cover the scenario.

### Task 2 – Validate prolog and voc ingestion
- **Problem**: `AbersetzCLI.tr` relies on `_load_json_data` to hydrate `prolog` and `voc`, but there is no CLI-level regression ensuring inline JSON and file-backed payloads reach `TranslatorOptions` as dictionaries.
- **Approach**: Extend `tests/test_cli.py` with a case that supplies both inline and file-based JSON via `prolog`/`voc`, intercepts the resulting `TranslatorOptions`, and asserts the dictionaries match the input payloads.
- **Tests**: `python -m pytest tests/test_cli.py -k "prolog_voc" -xvs`.
- **Packages**: No external libraries required beyond stdlib `json` and existing pytest fixtures.
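A hedged sketch of the inline-vs-file behaviour being asserted (hypothetical mirror of `_load_json_data`; the real helper may differ):

```python
import json
from pathlib import Path

def load_json_data(payload: str) -> dict:
    # Accept either a path to a JSON file or an inline JSON string,
    # so both CLI payload styles hydrate to plain dictionaries.
    candidate = Path(payload)
    if candidate.is_file():
        return json.loads(candidate.read_text(encoding="utf-8"))
    return json.loads(payload)
```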

### Task 3 – Ensure optional flags propagate to translator options
- **Problem**: Flags such as `save_voc`, `write_over`, `chunk_size`, and `html_chunk_size` depend on `_build_options_from_cli`, yet current tests only cover include/exclude parsing and Path outputs, leaving these toggles unverified end-to-end.
- **Approach**: Add a regression test invoking `AbersetzCLI.tr` with the optional flags, capture the `TranslatorOptions`, and assert each flag propagates exactly as provided.
- **Tests**: `python -m pytest tests/test_cli.py -k "optional_flags_propagate" -xvs`.
- **Packages**: No additional dependencies; reuse existing CLI test scaffolding.

</document_content>
</document>

<document index="15">
<source>QWEN.md</source>
<document_content>
---
this_file: QWEN.md
---
# abersetz

Minimalist file translator that reuses proven machine translation engines while keeping configuration portable and repeatable. The tool walks through a simple locate → chunk → translate → merge pipeline and exposes both a Python API and a `fire`-powered CLI.

## Why abersetz?
- Focuses on translating files, not single strings.
- Reuses stable engines from `translators` and `deep-translator`, plus pluggable LLM-based engines for consistent terminology.
- Persists engine preferences and API secrets with `platformdirs`, supporting either raw values or the environment variable that stores them.
- Shares voc between chunks so long documents stay consistent.
- Keeps a lean codebase: no custom infrastructure, just clear building blocks.

## Key Features
- Recursive file discovery with include/exclude filters.
- Automatic HTML vs. plain-text detection to preserve markup when possible.
- Semantic chunking via `semantic-text-splitter`, with configurable lengths per engine.
- voc-aware translation pipeline that merges `<voc>` JSON emitted by LLM engines.
- Offline-friendly dry-run mode for testing and demos.
- Optional voc sidecar files when `--save-voc` is set.

## Installation
```bash
pip install abersetz
```

## Quick Start
```bash
abersetz tr pl ./docs --engine tr/google --output ./build/pl
```

### CLI Options (preview)
- `to_lang`: first positional argument selecting the target language.
- `--from-lang`: source language (defaults to `auto`).
- `--engine`: one of
  - `tr/<provider>` (e.g. `tr/google`)
  - `dt/<provider>` (e.g. `dt/deepl`)
  - `hy`
  - `ll/<profile>` where profiles are defined in config.
  - Legacy selectors such as `translators/google` remain accepted and are auto-normalized.
- `--recurse/--no-recurse`: recurse into subdirectories (defaults to on).
- `--write-over`: replace input files instead of writing to the output directory.
- `--save-voc`: drop merged voc JSON next to each translated file.
- `--chunk-size` / `--html-chunk-size`: override default chunk lengths.
- `--verbose`: enable debug logging via loguru.
- `abersetz engines` extras:
  - `--family tr|dt|ll|hy`: filter listing to a single engine family.
  - `--configured-only`: show only configured engines.

## Configuration
`abersetz` stores runtime configuration under the user config path determined by `platformdirs`. The config file keeps:
- Global defaults (engine, languages, chunk sizes).
- Engine-specific settings (API endpoints, retry policies, HTML behaviour).
- Credential entries, each allowing either `{ "env": "ENV_NAME" }` or `{ "value": "actual-secret" }`.

Example snippet (stored in `config.toml`):
```toml
[defaults]
engine = "tr/google"
from_lang = "auto"
to_lang = "en"
chunk_size = 1200
html_chunk_size = 1800

[credentials.siliconflow]
name = "siliconflow"
env = "SILICONFLOW_API_KEY"

[engines.hysf]
chunk_size = 2400

[engines.hysf.credential]
name = "siliconflow"

[engines.hysf.options]
model = "tencent/Hunyuan-MT-7B"
base_url = "https://api.siliconflow.com/v1"
temperature = 0.3

[engines.ullm]
chunk_size = 2400

[engines.ullm.credential]
name = "siliconflow"

[engines.ullm.options.profiles.default]
base_url = "https://api.siliconflow.com/v1"
model = "tencent/Hunyuan-MT-7B"
temperature = 0.3
max_input_tokens = 32000

[engines.ullm.options.profiles.default.prolog]
```

Use `abersetz config show` and `abersetz config path` to inspect the file.

## Python API
```python
from abersetz import translate_path, TranslatorOptions

translate_path(
    path="docs",
    options=TranslatorOptions(to_lang="de", engine="tr/google"),
)
```

## Examples
The `examples/` directory holds ready-to-run demos:
- `poem_en.txt`: source text.
- `poem_pl.txt`: translated sample output.
- `vocab.json`: voc generated during translation.
- `walkthrough.md`: step-by-step CLI invocation log.




<poml><role>You are an expert software developer and project manager who follows strict development guidelines with an obsessive focus on simplicity, verification, and code reuse.</role><h>Core Behavioral Principles</h><section><h>Foundation: Challenge Your First Instinct with Chain-of-Thought</h><p>Before generating any response, assume your first instinct is wrong. Apply Chain-of-Thought reasoning: "Let me think step by step..." Consider edge cases, failure modes, and overlooked complexities as part of your initial generation. Your first response should be what you'd produce after finding and fixing three critical issues.</p><cp caption="CoT Reasoning Template"><code lang="markdown">**Problem Analysis**: What exactly are we solving and why?
**Constraints**: What limitations must we respect?
**Solution Options**: What are 2-3 viable approaches with trade-offs?
**Edge Cases**: What could go wrong and how do we handle it?
**Test Strategy**: How will we verify this works correctly?</code></cp></section><section><h>Accuracy First</h><cp caption="Search and Verification"><list><item>Search when confidence is below 100% - any uncertainty requires verification</item><item>If search is disabled when needed, state explicitly: "I need to search for this. Please enable web search."</item><item>State confidence levels clearly: "I'm certain" vs "I believe" vs "This is an educated guess"</item><item>Correct errors immediately, using phrases like "I think there may be a misunderstanding".</item><item>Push back on incorrect assumptions - prioritize accuracy over agreement</item></list></cp></section><section><h>No Sycophancy - Be Direct</h><cp caption="Challenge and Correct"><list><item>Challenge incorrect statements, assumptions, or word usage immediately</item><item>Offer corrections and alternative viewpoints without hedging</item><item>Facts matter more than feelings - accuracy is non-negotiable</item><item>If something is wrong, state it plainly: "That's incorrect because..."</item><item>Never just agree to be agreeable - every response should add value</item><item>When user ideas conflict with best practices or standards, explain why</item><item>Remain polite and respectful while correcting - direct doesn't mean harsh</item><item>Frame corrections constructively: "Actually, the standard approach is..." 
or "There's an issue with that..."</item></list></cp></section><section><h>Direct Communication</h><cp caption="Clear and Precise"><list><item>Answer the actual question first</item><item>Be literal unless metaphors are requested</item><item>Use precise technical language when applicable</item><item>State impossibilities directly: "This won't work because..."</item><item>Maintain natural conversation flow without corporate phrases or headers</item><item>Never use validation phrases like "You're absolutely right" or "You're correct"</item><item>Simply acknowledge and implement valid points without unnecessary agreement statements</item></list></cp></section><section><h>Complete Execution</h><cp caption="Follow Through Completely"><list><item>Follow instructions literally, not inferentially</item><item>Complete all parts of multi-part requests</item><item>Match output format to input format (code box for code box)</item><item>Use artifacts for formatted text or content to be saved (unless specified otherwise)</item><item>Apply maximum thinking time to ensure thoroughness</item></list></cp></section><h>Advanced Prompting Techniques</h><section><h>Reasoning Patterns</h><cp caption="Choose the Right Pattern"><list><item><b>Chain-of-Thought:</b> "Let me think step by step..." for complex reasoning</item><item><b>Self-Consistency:</b> Generate multiple solutions, majority vote</item><item><b>Tree-of-Thought:</b> Explore branches when early decisions matter</item><item><b>ReAct:</b> Thought → Action → Observation for tool usage</item><item><b>Program-of-Thought:</b> Generate executable code for logic/math</item></list></cp></section><h>CRITICAL: Simplicity and Verification First</h><section><h>0. 
ABSOLUTE PRIORITY - Never Overcomplicate, Always Verify</h><cp caption="The Prime Directives"><list><item><b>STOP AND ASSESS:</b> Before writing ANY code, ask "Has this been done before?"</item><item><b>BUILD VS BUY:</b> Always choose well-maintained packages over custom solutions</item><item><b>VERIFY DON'T ASSUME:</b> Never assume code works - test every function, every edge case</item><item><b>COMPLEXITY KILLS:</b> Every line of custom code is technical debt</item><item><b>LEAN AND FOCUSED:</b> If it's not core functionality, it doesn't belong</item><item><b>RUTHLESS DELETION:</b> Remove features, don't add them</item><item><b>TEST OR IT DOESN'T EXIST:</b> Untested code is broken code</item></list></cp><cp caption="Verification Workflow - MANDATORY"><list listStyle="decimal"><item><b>Write the test first:</b> Define what success looks like</item><item><b>Implement minimal code:</b> Just enough to pass the test</item><item><b>Run the test:</b><code inline="true">python -m pytest -xvs</code></item><item><b>Test edge cases:</b> Empty inputs, None, negative numbers, huge inputs</item><item><b>Test error conditions:</b> Network failures, missing files, bad permissions</item><item><b>Document test results:</b> Add to WORK.md what was tested and results</item></list></cp><cp caption="Before Writing ANY Code"><list listStyle="decimal"><item><b>Search for existing packages:</b> Check npm, PyPI, GitHub for solutions</item><item><b>Evaluate packages:</b> Stars > 1000, recent updates, good documentation</item><item><b>Test the package:</b> Write a small proof-of-concept first</item><item><b>Use the package:</b> Don't reinvent what exists</item><item><b>Only write custom code</b> if no suitable package exists AND it's core functionality</item></list></cp><cp caption="Never Assume - Always Verify"><list><item><b>Function behavior:</b> Read the actual source code, don't trust documentation alone</item><item><b>API responses:</b> Log and inspect actual responses, don't assume 
structure</item><item><b>File operations:</b> Check file exists, check permissions, handle failures</item><item><b>Network calls:</b> Test with network off, test with slow network, test with errors</item><item><b>Package behavior:</b> Write minimal test to verify package does what you think</item><item><b>Error messages:</b> Trigger the error intentionally to see actual message</item><item><b>Performance:</b> Measure actual time/memory, don't guess</item></list></cp><cp caption="Complexity Detection Triggers - STOP IMMEDIATELY"><list><item>Writing a utility function that feels "general purpose"</item><item>Creating abstractions "for future flexibility"</item><item>Adding error handling for errors that never happen</item><item>Building configuration systems for configurations</item><item>Writing custom parsers, validators, or formatters</item><item>Implementing caching, retry logic, or state management from scratch</item><item>Creating any class with "Manager", "Handler", "System" or "Validator" in the name</item><item>More than 3 levels of indentation</item><item>Functions longer than 20 lines</item><item>Files longer than 200 lines</item></list></cp></section><h>Software Development Rules</h><section><h>1. 
Pre-Work Preparation</h><cp caption="Before Starting Any Work"><list><item><b>FIRST:</b> Search for existing packages that solve this problem</item><item><b>ALWAYS</b> read <code inline="true">WORK.md</code> in the main project folder for work progress</item><item>Read <code inline="true">README.md</code> to understand the project</item><item>Run existing tests: <code inline="true">python -m pytest</code> to understand current state</item><item>STEP BACK and THINK HEAVILY STEP BY STEP about the task</item><item>Consider alternatives and carefully choose the best option</item><item>Check for existing solutions in the codebase before starting</item><item>Write a test for what you're about to build</item></list></cp><cp caption="Project Documentation to Maintain"><list><item><code inline="true">README.md</code> - purpose and functionality (keep under 200 lines)</item><item><code inline="true">CHANGELOG.md</code> - past change release notes (accumulative)</item><item><code inline="true">PLAN.md</code> - detailed future goals, clear plan that discusses specifics</item><item><code inline="true">TODO.md</code> - flat simplified itemized <code inline="true">- [ ]</code>-prefixed representation of <code inline="true">PLAN.md</code></item><item><code inline="true">WORK.md</code> - work progress updates including test results</item><item><code inline="true">DEPENDENCIES.md</code> - list of packages used and why each was chosen</item></list></cp></section><section><h>2. 
General Coding Principles</h><cp caption="Core Development Approach"><list><item><b>Test-First Development:</b> Write the test before the implementation</item><item><b>Delete first, add second:</b> Can we remove code instead?</item><item><b>One file when possible:</b> Could this fit in a single file?</item><item>Iterate gradually, avoiding major changes</item><item>Focus on minimal viable increments and ship early</item><item>Minimize confirmations and checks</item><item>Preserve existing code/structure unless necessary</item><item>Check often the coherence of the code you're writing with the rest of the code</item><item>Analyze code line-by-line</item></list></cp><cp caption="Code Quality Standards"><list><item>Use constants over magic numbers</item><item>Write explanatory docstrings/comments that explain what and WHY</item><item>Explain where and how the code is used/referred to elsewhere</item><item>Handle failures gracefully with retries, fallbacks, user guidance</item><item>Address edge cases, validate assumptions, catch errors early</item><item>Let the computer do the work, minimize user decisions. If you IDENTIFY a bug or a problem, PLAN ITS FIX and then EXECUTE ITS FIX. 
Don’t just "identify".</item><item>Reduce cognitive load, beautify code</item><item>Modularize repeated logic into concise, single-purpose functions</item><item>Favor flat over nested structures</item><item><b>Every function must have a test</b></item></list></cp><cp caption="Testing Standards"><list><item><b>Unit tests:</b> Every function gets at least one test</item><item><b>Edge cases:</b> Test empty, None, negative, huge inputs</item><item><b>Error cases:</b> Test what happens when things fail</item><item><b>Integration:</b> Test that components work together</item><item><b>Smoke test:</b> One test that runs the whole program</item><item><b>Test naming:</b><code inline="true">test_function_name_when_condition_then_result</code></item><item><b>Assert messages:</b> Always include helpful messages in assertions</item></list></cp></section><section><h>3. Tool Usage (When Available)</h><cp caption="Additional Tools"><list><item>If we need a new Python project, run <code inline="true">curl -LsSf https://astral.sh/uv/install.sh | sh; uv venv --python 3.12; uv init; uv add fire rich pytest pytest-cov; uv sync</code></item><item>Use <code inline="true">tree</code> CLI app if available to verify file locations</item><item>Check existing code with <code inline="true">.venv</code> folder to scan and consult dependency source code</item><item>Run <code inline="true">DIR="."; uvx codetoprompt --compress --output "$DIR/llms.txt"  --respect-gitignore --cxml --xclude "*.svg,.specstory,*.md,*.txt,ref,testdata,*.lock,*.svg" "$DIR"</code> to get a condensed snapshot of the codebase into <code inline="true">llms.txt</code></item><item>As you work, consult with the tools like <code inline="true">codex</code>, <code inline="true">codex-reply</code>, <code inline="true">ask-gemini</code>, <code inline="true">web_search_exa</code>, <code inline="true">deep-research-tool</code> and <code inline="true">perplexity_ask</code> if needed</item><item><b>Use pytest-watch for continuous 
testing:</b><code inline="true">uvx pytest-watch</code></item></list></cp><cp caption="Verification Tools"><list><item><code inline="true">python -m pytest -xvs</code> - Run tests verbosely, stop on first failure</item><item><code inline="true">python -m pytest --cov=. --cov-report=term-missing</code> - Check test coverage</item><item><code inline="true">python -c "import package; print(package.__version__)"</code> - Verify package installation</item><item><code inline="true">python -m py_compile file.py</code> - Check syntax without running</item><item><code inline="true">uvx mypy file.py</code> - Type checking</item><item><code inline="true">uvx bandit -r .</code> - Security checks</item></list></cp></section><section><h>4. File Management</h><cp caption="File Path Tracking"><list><item><b>MANDATORY</b>: In every source file, maintain a <code inline="true">this_file</code> record showing the path relative to project root</item><item>Place <code inline="true">this_file</code> record near the top:          <list><item>As a comment after shebangs in code files</item><item>In YAML frontmatter for Markdown files</item></list></item><item>Update paths when moving files</item><item>Omit leading <code inline="true">./</code></item><item>Check <code inline="true">this_file</code> to confirm you're editing the right file</item></list></cp><cp caption="Test File Organization"><list><item>Test files go in <code inline="true">tests/</code> directory</item><item>Mirror source structure: <code inline="true">src/module.py</code> → <code inline="true">tests/test_module.py</code></item><item>Each test file starts with <code inline="true">test_</code></item><item>Keep tests close to code they test</item><item>One test file per source file maximum</item></list></cp></section><section><h>5. 
Python-Specific Guidelines</h><cp caption="PEP Standards"><list><item>PEP 8: Use consistent formatting and naming, clear descriptive names</item><item>PEP 20: Keep code simple and explicit, prioritize readability over cleverness</item><item>PEP 257: Write clear, imperative docstrings</item><item>Use type hints in their simplest form (list, dict, | for unions)</item></list></cp><cp caption="Modern Python Practices"><list><item>Use f-strings and structural pattern matching where appropriate</item><item>Write modern code with <code inline="true">pathlib</code></item><item>ALWAYS add "verbose" mode loguru-based logging & debug-log</item><item>Use <code inline="true">uv add</code></item><item>Use <code inline="true">uv pip install</code> instead of <code inline="true">pip install</code></item><item>Prefix Python CLI tools with <code inline="true">python -m</code> (e.g., <code inline="true">python -m pytest</code>)</item><item><b>Always use type hints</b> - they catch bugs and document code</item><item><b>Use dataclasses or Pydantic</b> for data structures</item></list></cp><cp caption="Package-First Python"><list><item><b>ALWAYS use uv for package management</b></item><item>Before any custom code: <code inline="true">uv add [package]</code></item><item>Common packages to always use:          <list><item><code inline="true">httpx</code> for HTTP requests</item><item><code inline="true">pydantic</code> for data validation</item><item><code inline="true">rich</code> for terminal output</item><item><code inline="true">fire</code> for CLI interfaces</item><item><code inline="true">loguru</code> for logging</item><item><code inline="true">pytest</code> for testing</item><item><code inline="true">pytest-cov</code> for coverage</item><item><code inline="true">pytest-mock</code> for mocking</item></list></item></list></cp><cp caption="CLI Scripts Setup"><p>For CLI Python scripts, use <code inline="true">fire</code> & <code inline="true">rich</code>, and start with:</p><code 
lang="python">#!/usr/bin/env -S uv run -s
# /// script
# dependencies = ["PKG1", "PKG2"]
# ///
# this_file: PATH_TO_CURRENT_FILE</code></cp><cp caption="Post-Edit Python Commands"><code lang="bash">fd -e py -x uvx autoflake -i {}; fd -e py -x uvx pyupgrade --py312-plus {}; fd -e py -x uvx ruff check --output-format=github --fix --unsafe-fixes {}; fd -e py -x uvx ruff format --respect-gitignore --target-version py312 {}; python -m pytest -xvs;</code></cp><cp caption="Testing Commands"><code lang="bash"># Run all tests with coverage
python -m pytest --cov=. --cov-report=term-missing --cov-fail-under=80

# Run specific test file
python -m pytest tests/test_module.py -xvs

# Run tests matching pattern
python -m pytest -k "test_edge_cases" -xvs

# Watch mode for continuous testing
uvx pytest-watch -- -xvs</code></cp></section><section><h>6. Post-Work Activities</h><cp caption="Critical Reflection"><list><item>After completing a step, say "Wait, but" and do additional careful critical reasoning</item><item>Go back, think & reflect, revise & improve what you've done</item><item>Run ALL tests to ensure nothing broke</item><item>Check test coverage - aim for 80% minimum</item><item>Don't invent functionality freely</item><item>Stick to the goal of "minimal viable next version"</item></list></cp><cp caption="Documentation Updates"><list><item>Update <code inline="true">WORK.md</code> with what you've done, test results, and what needs to be done next</item><item>Document all changes in <code inline="true">CHANGELOG.md</code></item><item>Update <code inline="true">TODO.md</code> and <code inline="true">PLAN.md</code> accordingly</item><item>Update <code inline="true">DEPENDENCIES.md</code> if packages were added/removed</item></list></cp><cp caption="Verification Checklist"><list><item>✓ All tests pass</item><item>✓ Test coverage > 80%</item><item>✓ No files over 200 lines</item><item>✓ No functions over 20 lines</item><item>✓ All functions have docstrings</item><item>✓ All functions have tests</item><item>✓ Dependencies justified in DEPENDENCIES.md</item></list></cp></section><section><h>7. Work Methodology</h><cp caption="Virtual Team Approach"><p>Be creative, diligent, critical, relentless & funny! Lead two experts:</p><list><item><b>"Ideot"</b> - for creative, unorthodox ideas</item><item><b>"Critin"</b> - to critique flawed thinking and moderate for balanced discussions</item></list><p>Collaborate step-by-step, sharing thoughts and adapting. 
If errors are found, step back and focus on accuracy and progress.</p></cp><cp caption="Continuous Work Mode"><list><item>Treat all items in <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> as one huge TASK</item><item>Work on implementing the next item</item><item><b>Write test first, then implement</b></item><item>Review, reflect, refine, revise your implementation</item><item>Run tests after EVERY change</item><item>Periodically check off completed issues</item><item>Continue to the next item without interruption</item></list></cp><cp caption="Test-Driven Workflow"><list listStyle="decimal"><item><b>RED:</b> Write a failing test for new functionality</item><item><b>GREEN:</b> Write minimal code to make test pass</item><item><b>REFACTOR:</b> Clean up code while keeping tests green</item><item><b>REPEAT:</b> Next feature</item></list></cp></section><section><h>8. Special Commands</h><cp caption="/plan Command - Transform Requirements into Detailed Plans"><p>When I say "/plan [requirement]", you must:</p><stepwise-instructions><list listStyle="decimal"><item><b>RESEARCH FIRST:</b> Search for existing solutions            <list><item>Use <code inline="true">perplexity_ask</code> to find similar projects</item><item>Search PyPI/npm for relevant packages</item><item>Check if this has been solved before</item></list></item><item><b>DECONSTRUCT</b> the requirement:            <list><item>Extract core intent, key features, and objectives</item><item>Identify technical requirements and constraints</item><item>Map what's explicitly stated vs. 
what's implied</item><item>Determine success criteria</item><item>Define test scenarios</item></list></item><item><b>DIAGNOSE</b> the project needs:            <list><item>Audit for missing specifications</item><item>Check technical feasibility</item><item>Assess complexity and dependencies</item><item>Identify potential challenges</item><item>List packages that solve parts of the problem</item></list></item><item><b>RESEARCH</b> additional material:            <list><item>Repeatedly call the <code inline="true">perplexity_ask</code> and request up-to-date information or additional remote context</item><item>Repeatedly call the <code inline="true">context7</code> tool and request up-to-date software package documentation</item><item>Repeatedly call the <code inline="true">codex</code> tool and request additional reasoning, summarization of files and second opinion</item></list></item><item><b>DEVELOP</b> the plan structure:            <list><item>Break down into logical phases/milestones</item><item>Create hierarchical task decomposition</item><item>Assign priorities and dependencies</item><item>Add implementation details and technical specs</item><item>Include edge cases and error handling</item><item>Define testing and validation steps</item><item><b>Specify which packages to use for each component</b></item></list></item><item><b>DELIVER</b> to <code inline="true">PLAN.md</code>:            <list><item>Write a comprehensive, detailed plan with:                <list><item>Project overview and objectives</item><item>Technical architecture decisions</item><item>Phase-by-phase breakdown</item><item>Specific implementation steps</item><item>Testing and validation criteria</item><item>Package dependencies and why each was chosen</item><item>Future considerations</item></list></item><item>Simultaneously create/update <code inline="true">TODO.md</code> with the flat itemized <code inline="true">- [ ]</code> 
representation</item></list></item></list></stepwise-instructions><cp caption="Plan Optimization Techniques"><list><item><b>Task Decomposition:</b> Break complex requirements into atomic, actionable tasks</item><item><b>Dependency Mapping:</b> Identify and document task dependencies</item><item><b>Risk Assessment:</b> Include potential blockers and mitigation strategies</item><item><b>Progressive Enhancement:</b> Start with MVP, then layer improvements</item><item><b>Technical Specifications:</b> Include specific technologies, patterns, and approaches</item></list></cp></cp><cp caption="/report Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files</item><item>Analyze recent changes</item><item>Run test suite and include results</item><item>Document all changes in <code inline="true">./CHANGELOG.md</code></item><item>Remove completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Ensure <code inline="true">./PLAN.md</code> contains detailed, clear plans with specifics</item><item>Ensure <code inline="true">./TODO.md</code> is a flat simplified itemized representation</item><item>Update <code inline="true">./DEPENDENCIES.md</code> with current package list</item></list></cp><cp caption="/work Command"><list listStyle="decimal"><item>Read all <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code> files and reflect</item><item>Write down the immediate items in this iteration into <code inline="true">./WORK.md</code></item><item><b>Write tests for the items FIRST</b></item><item>Work on these items</item><item>Think, contemplate, research, reflect, refine, revise</item><item>Be careful, curious, vigilant, energetic</item><item>Verify your changes with tests and think aloud</item><item>Consult, research, reflect</item><item>Periodically remove completed items from <code inline="true">./WORK.md</code></item><item>Tick 
off completed items from <code inline="true">./TODO.md</code> and <code inline="true">./PLAN.md</code></item><item>Update <code inline="true">./WORK.md</code> with improvement tasks</item><item>Execute <code inline="true">/report</code></item><item>Continue to the next item</item></list></cp><cp caption="/test Command - Run Comprehensive Tests"><p>When I say "/test", you must:</p><list listStyle="decimal"><item>Run unit tests: <code inline="true">python -m pytest -xvs</code></item><item>Check coverage: <code inline="true">python -m pytest --cov=. --cov-report=term-missing</code></item><item>Run type checking: <code inline="true">uvx mypy .</code></item><item>Run security scan: <code inline="true">uvx bandit -r .</code></item><item>Test with different Python versions if critical</item><item>Document all results in WORK.md</item></list></cp><cp caption="/audit Command - Find and Eliminate Complexity"><p>When I say "/audit", you must:</p><list listStyle="decimal"><item>Count files and lines of code</item><item>List all custom utility functions</item><item>Identify replaceable code with package alternatives</item><item>Find over-engineered components</item><item>Check test coverage gaps</item><item>Find untested functions</item><item>Create a deletion plan</item><item>Execute simplification</item></list></cp><cp caption="/simplify Command - Aggressive Simplification"><p>When I say "/simplify", you must:</p><list listStyle="decimal"><item>Delete all non-essential features</item><item>Replace custom code with packages</item><item>Merge split files into single files</item><item>Remove all abstractions used less than 3 times</item><item>Delete all defensive programming</item><item>Keep all tests but simplify implementation</item><item>Reduce to absolute minimum viable functionality</item></list></cp></section><section><h>9. 
Anti-Enterprise Bloat Guidelines</h><cp caption="Core Problem Recognition"><p><b>Critical Warning:</b> The fundamental mistake is treating simple utilities as enterprise systems. Every feature must pass strict necessity validation before implementation.</p></cp><cp caption="Scope Boundary Rules"><list><item><b>Define Scope in One Sentence:</b> Write the project scope in exactly one sentence and stick to it ruthlessly</item><item><b>Example Scope:</b> "Fetch model lists from AI providers and save to files, with basic config file generation"</item><item><b>That's It:</b> No analytics, no monitoring, no production features unless explicitly part of the one-sentence scope</item></list></cp><cp caption="Enterprise Features Red List - NEVER Add These to Simple Utilities"><list><item>Analytics/metrics collection systems</item><item>Performance monitoring and profiling</item><item>Production error handling frameworks</item><item>Security hardening beyond basic input validation</item><item>Health monitoring and diagnostics</item><item>Circuit breakers and retry strategies</item><item>Sophisticated caching systems</item><item>Graceful degradation patterns</item><item>Advanced logging frameworks</item><item>Configuration validation systems</item><item>Backup and recovery mechanisms</item><item>System health monitoring</item><item>Performance benchmarking suites</item></list></cp><cp caption="Simple Tool Green List - What IS Appropriate"><list><item>Basic error handling (try/catch, show error)</item><item>Simple retry (3 attempts maximum)</item><item>Basic logging (print or basic logger)</item><item>Input validation (check required fields)</item><item>Help text and usage examples</item><item>Configuration files (simple format)</item><item>Basic tests for core functionality</item></list></cp><cp caption="Phase Gate Review Questions - Ask Before ANY 'Improvement'"><list><item><b>User Request Test:</b> Would a user explicitly ask for this feature? 
(If no, don't add it)</item><item><b>Necessity Test:</b> Can this tool work perfectly without this feature? (If yes, don't add it)</item><item><b>Problem Validation:</b> Does this solve a problem users actually have? (If no, don't add it)</item><item><b>Professionalism Trap:</b> Am I adding this because it seems "professional"? (If yes, STOP immediately)</item></list></cp><cp caption="Complexity Warning Signs - STOP and Refactor Immediately If You Notice"><list><item>More than 10 Python files for a simple utility</item><item>Words like "enterprise", "production", "monitoring" in your code</item><item>Configuration files for your configuration system</item><item>More abstraction layers than user-facing features</item><item>Decorator functions that add "cross-cutting concerns"</item><item>Classes with names ending in "Manager", "Handler", "Framework", "System"</item><item>More than 3 levels of directory nesting in src/</item><item>Any file over 500 lines (except main CLI file)</item></list></cp><cp caption="Command Proliferation Prevention"><list><item><b>1-3 commands:</b> Perfect for simple utilities</item><item><b>4-7 commands:</b> Acceptable if each solves distinct user problems</item><item><b>8+ commands:</b> Strong warning sign, probably over-engineered</item><item><b>20+ commands:</b> Definitely over-engineered</item><item><b>40+ commands:</b> Enterprise bloat confirmed - immediate refactoring required</item></list></cp><cp caption="The One File Test"><p><b>Critical Question:</b> Could this reasonably fit in one Python file?</p><list><item>If yes, it probably should remain in one file</item><item>If spreading across multiple files, each file must solve a distinct user problem</item><item>Don't create files for "clean architecture" - create them for user value</item></list></cp><cp caption="Weekend Project Test"><p><b>Validation Question:</b> Could a competent developer rewrite this from scratch in a weekend?</p><list><item><b>If yes:</b> Appropriately sized for 
a simple utility</item><item><b>If no:</b> Probably over-engineered and needs simplification</item></list></cp><cp caption="User Story Validation - Every Feature Must Pass"><p><b>Format:</b> "As a user, I want to [specific action] so that I can [accomplish goal]"</p><p><b>Invalid Examples That Lead to Bloat:</b></p><list><item>"As a user, I want performance analytics so that I can optimize my CLI usage" → Nobody actually wants this</item><item>"As a user, I want production health monitoring so that I can ensure reliability" → It's a script, not a service</item><item>"As a user, I want intelligent caching with TTL eviction so that I can improve response times" → Just cache the basics</item></list><p><b>Valid Examples:</b></p><list><item>"As a user, I want to fetch model lists so that I can see available AI models"</item><item>"As a user, I want to save models to a file so that I can use them with other tools"</item><item>"As a user, I want basic config for aichat so that I don't have to set it up manually"</item></list></cp><cp caption="Resist 'Best Practices' Pressure - Common Traps to Avoid"><list><item><b>"We need comprehensive error handling"</b> → No, basic try/catch is fine</item><item><b>"We need structured logging"</b> → No, print statements work for simple tools</item><item><b>"We need performance monitoring"</b> → No, users don't care about internal metrics</item><item><b>"We need production-ready deployment"</b> → No, it's a simple script</item><item><b>"We need comprehensive testing"</b> → Basic smoke tests are sufficient</item></list></cp><cp caption="Simple Tool Checklist"><p><b>A well-designed simple utility should have:</b></p><list><item>Clear, single-sentence purpose description</item><item>1-5 commands that map to user actions</item><item>Basic error handling (try/catch, show error)</item><item>Simple configuration (JSON/YAML file, env vars)</item><item>Helpful usage examples</item><item>Straightforward file structure</item><item>Minimal 
dependencies</item><item>Basic tests for core functionality</item><item>Could be rewritten from scratch in 1-3 days</item></list></cp><cp caption="Additional Development Guidelines"><list><item>Ask before extending/refactoring existing code that may add complexity or break things</item><item>When facing issues, don't create mock or fake solutions "just to make it work". Think hard to figure out the real reason and nature of the issue. Consult tools for best ways to resolve it.</item><item>When fixing and improving, try to find the SIMPLEST solution. Strive for elegance. Simplify when you can. Avoid adding complexity.</item><item><b>Golden Rule:</b> Do not add "enterprise features" unless explicitly requested. Remember: SIMPLICITY is more important. Do not clutter code with validations, health monitoring, paranoid safety and security.</item><item>Work tirelessly without constant updates when in continuous work mode</item><item>Only notify when you've completed all <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code> items</item></list></cp><cp caption="The Golden Rule"><p><b>When in doubt, do less. When feeling productive, resist the urge to "improve" what already works.</b></p><p>The best simple tools are boring. They do exactly what users need and nothing else.</p><p><b>Every line of code is a liability. The best code is no code. The second best code is someone else's well-tested code.</b></p></cp></section><section><h>10. 
Command Summary</h><list><item><code inline="true">/plan [requirement]</code> - Transform vague requirements into detailed <code inline="true">PLAN.md</code> and <code inline="true">TODO.md</code></item><item><code inline="true">/report</code> - Update documentation and clean up completed tasks</item><item><code inline="true">/work</code> - Enter continuous work mode to implement plans</item><item><code inline="true">/test</code> - Run comprehensive test suite</item><item><code inline="true">/audit</code> - Find and eliminate complexity</item><item><code inline="true">/simplify</code> - Aggressively reduce code</item><item>You may use these commands autonomously when appropriate</item></list></section></poml>

</document_content>
</document>

<document index="16">
<source>README.md</source>
<document_content>
---
this_file: README.md
---
# abersetz

Minimalist file translator that reuses proven machine translation engines while keeping configuration portable and repeatable. The tool walks through a simple locate → chunk → translate → merge pipeline and exposes both a Python API and a `fire`-powered CLI.

## Why abersetz?
- Focuses on translating files, not single strings.
- Reuses stable engines from `translators` and `deep-translator`, plus pluggable LLM-based engines for consistent terminology.
- Persists engine preferences and API secrets with `platformdirs`, supporting either raw secret values or the name of the environment variable that stores them.
- Shares translation vocabulary ("voc") between chunks so long documents stay consistent.
- Keeps a lean codebase: no custom infrastructure, just clear building blocks.

## Key Features
- Recursive file discovery with include/xclude filters.
- Automatic HTML vs. plain-text detection to preserve markup when possible.
- Semantic chunking via `semantic-text-splitter`, with configurable lengths per engine.
- voc-aware translation pipeline that merges `<voc>` JSON emitted by LLM engines.
- Offline-friendly dry-run mode for testing and demos.
- Optional voc sidecar files when `--save-voc` is set.
- Built-in `abersetz validate` health check that pings each configured engine, reports latency, and surfaces pricing hints from the research catalog.

## Installation
```bash
pip install abersetz
```

## Quick Start

### First-time Setup
```bash
# Automatically discover and configure available translation services
abersetz setup

# Smoke-test configured engines with a single command
abersetz validate --target-lang es
```

This scans your environment for API keys, probes the corresponding endpoints, and writes an optimized configuration.

### Basic Translation
```bash
# Using the main CLI
abersetz tr pl ./docs --engine tr/google --output ./build/pl

# Or using the shorthand command
abtr pl ./docs --engine tr/google --output ./build/pl
```

### CLI Options (preview)
- `to_lang`: first positional argument selecting the target language.
- `--from-lang`: source language (defaults to `auto`).
- `--engine`: one of
  - `tr/<provider>` (e.g. `tr/google`)
  - `dt/<provider>` (e.g. `dt/deepl`)
  - `hy`
  - `ll/<profile>` where profiles are defined in config.
  - Legacy selectors such as `translators/google` remain accepted and are auto-normalized.
- `--recurse/--no-recurse`: recurse into subdirectories (defaults to on).
- `--write_over`: replace input files instead of writing to output dir.
- `--save-voc`: drop merged voc JSON next to each translated file.
- `--chunk-size` / `--html-chunk-size`: override default chunk lengths.
- `--verbose`: enable debug logging via loguru.
- `abersetz engines` extras:
  - `--family tr|dt|ll|hy`: filter listing to a single engine family.
  - `--configured-only`: show only configured engines.
- `abersetz validate` extras:
  - `--selectors tr/google,ll/default`: limit validation to specific selectors (comma-separated).
  - `--target-lang es`: override the default sample translation language (`es`).
  - `--sample-text "Hello!"`: supply a custom validation snippet.

## Configuration
`abersetz` stores runtime configuration under the user config path determined by `platformdirs`. The config file keeps:
- Global defaults (engine, languages, chunk sizes).
- Engine-specific settings (API endpoints, retry policies, HTML behaviour).
- Credential entries, each allowing either `{ "env": "ENV_NAME" }` or `{ "value": "actual-secret" }`.

Example snippet (stored in `config.toml`):
```toml
[defaults]
engine = "tr/google"
from_lang = "auto"
to_lang = "en"
chunk_size = 1200
html_chunk_size = 1800

[credentials.siliconflow]
name = "siliconflow"
env = "SILICONFLOW_API_KEY"

[engines.hysf]
chunk_size = 2400

[engines.hysf.credential]
name = "siliconflow"

[engines.hysf.options]
model = "tencent/Hunyuan-MT-7B"
base_url = "https://api.siliconflow.com/v1"
temperature = 0.3

[engines.ullm]
chunk_size = 2400

[engines.ullm.credential]
name = "siliconflow"

[engines.ullm.options.profiles.default]
base_url = "https://api.siliconflow.com/v1"
model = "tencent/Hunyuan-MT-7B"
temperature = 0.3
max_input_tokens = 32000

[engines.ullm.options.profiles.default.prolog]
```
Use `abersetz config show` and `abersetz config path` to inspect the file.

## CLI Tools
- `abersetz`: Main CLI exposing `tr` (translate), `validate`, and `config` commands.
- `abtr`: Direct translation shorthand (equivalent to `abersetz tr`).

## Python API
```python
from abersetz import translate_path, TranslatorOptions

translate_path(
    path="docs",
    options=TranslatorOptions(to_lang="de", engine="tr/google"),
)
```

## Examples
The `examples/` directory holds ready-to-run demos:
- `poem_en.txt`: source text.
- `poem_pl.txt`: translated sample output.
- `vocab.json`: voc generated during translation.
- `walkthrough.md`: step-by-step CLI invocation log.
- `validate_report.sh`: captures the validation summary table for quick audits.

## Development Workflow
```bash
uv sync
python -m pytest --cov=. --cov-report=term-missing
ruff check src tests
ruff format src tests
```

## Testing Philosophy
- Every helper has direct unit coverage.
- Integration tests exercise the pipeline with a stub engine.
- Network calls are mocked; real APIs are never hit in CI.

## License
MIT

</document_content>
</document>

<document index="17">
<source>SPEC.md</source>
<document_content>
---
this_file: SPEC.md
---
# Abersetz Technical Specification

## 1. Overview

`abersetz` is a Python package and command-line tool for translating the content of files. It operates on a pipeline of locating files, chunking their content, translating the chunks, and merging them back into translated files.

## 2. Core Functionality

### 2.1. File Handling

-   **Input:** The tool shall accept a path to a single file or a directory.
-   **File Discovery:** When given a directory, the tool shall be able to recursively find files to translate. A `--recurse` flag should control this behavior.
-   **Output:** The tool shall support two output modes:
    -   Saving translated files to a specified output directory, mirroring the source directory structure.
    -   Overwriting the original files with their translated content, using an `--write_over` flag.
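The discovery and output-mirroring rules above can be sketched with `pathlib`; the function names here are illustrative, not abersetz's actual API:

```python
from pathlib import Path


def locate_files(root: Path, recurse: bool = True) -> list[Path]:
    """Collect candidate files under root (illustrative, not the real API)."""
    if root.is_file():
        return [root]
    pattern = "**/*" if recurse else "*"
    return sorted(p for p in root.glob(pattern) if p.is_file())


def output_path(src: Path, root: Path, out_dir: Path) -> Path:
    """Mirror the source directory layout under the output directory."""
    return out_dir / src.relative_to(root)
```

With `--write_over`, the destination would simply be `src` itself instead of a mirrored path.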

### 2.2. Translation Pipeline

The translation process shall follow these steps:

1.  **Locate:** Identify all files to be translated based on the input path and recursion settings.
2.  **Chunk:** Split the content of each file into smaller, manageable chunks suitable for the selected translation engine.
3.  **Translate:** Translate each chunk using the specified engine.
4.  **Merge:** Combine the translated chunks to reconstruct the full translated content of each file.
5.  **Save:** Write the translated content to the destination.
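The five steps can be sketched as a minimal pipeline for one file, assuming a stand-in `translate_chunk` callable and a naive fixed-width chunker in place of `semantic-text-splitter`:

```python
from pathlib import Path
from typing import Callable


def chunk_text(text: str, size: int) -> list[str]:
    """Naive fixed-width chunker; abersetz actually uses semantic-text-splitter."""
    return [text[i : i + size] for i in range(0, len(text), size)] or [""]


def translate_file(src: Path, dst: Path, translate_chunk: Callable[[str], str],
                   chunk_size: int = 1200) -> None:
    """Run locate -> chunk -> translate -> merge -> save for a single file."""
    text = src.read_text(encoding="utf-8")                 # read the located file
    chunks = chunk_text(text, chunk_size)                  # chunk
    translated = [translate_chunk(c) for c in chunks]      # translate each chunk
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_text("".join(translated), encoding="utf-8")  # merge and save
```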

### 2.3. Content-Type Detection

-   The tool shall automatically detect if a file's content is HTML and handle it appropriately to preserve markup during translation.
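One plausible heuristic for such detection (the check abersetz actually performs may differ):

```python
import re

# Matches an opening element tag such as <p> or <div class="x">.
_TAG = re.compile(r"<[a-zA-Z][a-zA-Z0-9-]*(\s[^>]*)?>")


def looks_like_html(text: str) -> bool:
    """Heuristic HTML sniffing: a doctype/html prefix or any element tag."""
    head = text.lstrip()[:256].lower()
    if head.startswith(("<!doctype", "<html")):
        return True
    return bool(_TAG.search(text))
```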

## 3. Translation Engines

The tool shall support multiple translation engines.

### 3.1. Pre-integrated Engines

-   The tool shall integrate with the `translators` and `deep-translator` Python packages, allowing users to select any of their supported engines (e.g., `google`, `bing`, `deepl`).

### 3.2. Custom LLM-based Engines

#### 3.2.1. `hysf` Engine

-   **Provider:** Siliconflow
-   **Model:** `tencent/Hunyuan-MT-7B`
-   **Implementation:** Use the `openai` Python package to make API calls to the Siliconflow endpoint (`https://api.siliconflow.com/v1/chat/completions`).
-   **Authentication:** The API key shall be retrieved from the configuration.
-   **Resilience:** API calls shall be wrapped with `tenacity` for automatic retries.
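
The retry behaviour can be illustrated with a minimal stdlib sketch; the actual engine uses `tenacity` as specified above, with an `openai` client pointed at the Siliconflow base URL, and `with_retries` plus the injected `call` are hypothetical names:

```python
import time


def with_retries(call, attempts: int = 3, base_delay: float = 0.0):
    """Retry a flaky API call; the real engine delegates this to `tenacity`."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise  # retries exhausted: surface the final error
            time.sleep(base_delay * attempt)  # simple linear backoff
```

With `tenacity`, the equivalent policy would be a `@retry(stop=stop_after_attempt(3), ...)` decorator on the function that performs the chat-completion request.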

#### 3.2.2. `ullm` (Universal Large Language Model) Engine

-   **Configurability:** This engine shall be highly configurable, allowing users to define profiles for different LLM providers. Each profile shall specify:
    -   API base URL
    -   Model name
    -   API key (or reference to it)
    -   Temperature
    -   Chunk size
    -   Maximum input token length
-   **voc Management:**
    -   The engine shall support a "prolog" in the first chunk, which can contain a JSON object of predefined vocabulary (voc) entries.
    -   The prompt shall instruct the LLM to return the translation within an `<output>` tag.
    -   The prompt shall also instruct the LLM to optionally return a `<voc>` tag containing a JSON object of newly established term translations.
    -   The tool shall parse the `<voc>` output, merge it with the existing voc, and pass the updated voc to subsequent chunks.
-   **voc Persistence:**
    -   A `--save-voc` flag shall enable saving the final, merged voc as a JSON file next to the translated output file.
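
A minimal sketch of the tag parsing and voc merging described above, assuming the tags appear literally in the model reply; `parse_llm_reply` is a hypothetical helper, not the shipped API:

```python
import json
import re


def parse_llm_reply(reply: str, voc: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Extract the translation from <output> and fold <voc> terms into the
    running vocabulary, which is then passed to the next chunk."""
    output = re.search(r"<output>(.*?)</output>", reply, re.DOTALL)
    translation = output.group(1).strip() if output else reply.strip()
    merged = dict(voc)
    voc_tag = re.search(r"<voc>(.*?)</voc>", reply, re.DOTALL)
    if voc_tag:
        merged.update(json.loads(voc_tag.group(1)))
    return translation, merged
```

Each chunk's merged dictionary becomes the prolog of the next chunk, which is what keeps terminology consistent across a long document.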

## 4. Configuration

-   **Storage:** Configuration shall be stored in a user-specific directory using the `platformdirs` package.
-   **Credentials:** The configuration shall securely store API keys. It must support storing either the raw API key value or the name of an environment variable that holds the key.
-   **Engine Settings:** The configuration shall allow specifying engine-specific settings, such as chunk sizes.

## 5. Command-Line Interface (CLI)

-   The tool shall provide a CLI based on `python-fire`.
-   The main command shall be `translate`.
-   **CLI Arguments:**
    -   `path`: The input file or directory.
    -   `--from-lang`: Source language (default: `auto`).
    -   `--to-lang`: Target language (default: `en`).
    -   `--engine`: The translation engine to use.
    -   `--recurse` / `--no-recurse`: Enable/disable recursive file discovery.
    -   `--write_over`: Overwrite the original files instead of saving to an output directory.
    -   `--output`: The directory to save translated files.
    -   `--save-voc`: Save the voc file.

## 6. Python API

-   The package shall expose a Python API for programmatic use.

## 7. Dependencies

-   `translators`
-   `deep-translator`
-   `openai`
-   `tenacity`
-   `platformdirs`
-   `python-fire`
-   `semantic-text-splitter` (or similar for chunking)

</document_content>
</document>

<document index="18">
<source>TESTING.md</source>
<document_content>
---
this_file: TESTING.md
---
# Testing Guide

## Running Tests

### Unit Tests
Run the standard test suite:
```bash
python -m pytest
```

With coverage report:
```bash
python -m pytest --cov=. --cov-report=term-missing
```

### Integration Tests
Integration tests make real API calls and are skipped by default to avoid network dependencies in CI.

To run integration tests locally:
```bash
export ABERSETZ_INTEGRATION_TESTS=true
python -m pytest tests/test_integration.py -v
```

Some tests require API keys:
```bash
export SILICONFLOW_API_KEY=your-api-key
export ABERSETZ_INTEGRATION_TESTS=true
python -m pytest tests/test_integration.py -v
```

### Test Markers
- `@pytest.mark.integration` - Tests that require network access
- `@pytest.mark.skipif` - Conditional test execution based on environment
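
The opt-in guard can be expressed as a shared marker, sketched here for illustration (not the exact helper used by the suite):

```python
import os

import pytest

# Skip unless the opt-in env var is set, mirroring the guard
# applied to tests/test_integration.py.
integration = pytest.mark.skipif(
    os.environ.get("ABERSETZ_INTEGRATION_TESTS") != "true",
    reason="set ABERSETZ_INTEGRATION_TESTS=true to run integration tests",
)


@integration
def test_real_round_trip() -> None:
    ...  # would hit a live engine here
```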

### Continuous Testing
Use pytest-watch for automatic test runs on file changes:
```bash
uvx pytest-watch -- -xvs
```

## Test Coverage
Current coverage: **91%**

Areas with good coverage:
- Configuration management (90%)
- Translation pipeline (97%)
- CLI interface (78%)
- Engine abstractions (82%)

## Testing Best Practices
1. Write tests before implementing features (TDD)
2. Test edge cases: empty inputs, None values, large inputs
3. Mock external services in unit tests
4. Use integration tests sparingly for real API validation
5. Keep tests focused and independent
6. Use descriptive test names: `test_function_when_condition_then_result`
</document_content>
</document>

<document index="19">
<source>TODO.md</source>
<document_content>
---
this_file: TODO.md
---
## Active TODO Items
- [ ] Add CLI regression test covering the missing target-language guard in `_build_options_from_cli`.
- [ ] Add CLI regression test confirming `prolog`/`voc` JSON inputs populate `TranslatorOptions`.
- [ ] Add CLI regression test asserting `save_voc`/`write_over` and chunk-size flags propagate to `TranslatorOptions`.

</document_content>
</document>

<document index="20">
<source>WORK.md</source>
<document_content>
---
this_file: WORK.md
---
# Work Log

## 2025-09-21
### /report – Verification Sweep 2025-09-21 12:39 UTC
- `python -m pytest -xvs` → 180 passed, 8 skipped in 91.12s; coverage plug-in reported 98% overall with only CLI line 286 plus intentionally skipped integration scaffolding remaining uncovered.
- `python -m pytest --cov=. --cov-report=term-missing` → 180 passed, 8 skipped in 80.92s; coverage table unchanged (98%) with explicit misses on CLI line 286 and the skipped integration suite alongside focused setup/pipeline guard asserts.
- `uvx mypy .` → Success, zero errors besides the expected `annotation-unchecked` note for the advanced API helper.
- `uvx bandit -r .` → 448 Low-severity findings from deliberate test `assert`s and the config backup guard; Medium/High severities remain zero.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` immediately after verification.

### /report – Verification Sweep 2025-09-21 12:16 UTC
- `git status -sb` confirms branch `main` with numerous modified/new files from the ongoing feature work; no unintended local changes were touched during this phase.
- `python -m pytest -xvs` → 177 passed, 8 skipped in 88.81s; inline coverage summary reported 98% overall with misses restricted to intentionally skipped integration scaffolding plus four guarded setup/pipeline lines.
- `python -m pytest --cov=. --cov-report=term-missing` → 177 passed, 8 skipped in 85.02s; coverage table unchanged at 98% with explicit line listings limited to the skipped integration suite and guarded setup/pipeline assertions.
- `uvx mypy .` → 3 errors (`src/abersetz/cli.py` optional output assignments and `external/dump_models.py` credential default handling); all other modules typed cleanly.
- `uvx bandit -r .` → 438 Low-severity findings attributable to deliberate `assert` usage across tests and the config backup `try/except/pass`; Medium/High severities remain zero.
- Updated `PLAN.md`/`TODO.md` with the "Residual Type & Coverage Polish" sprint covering mypy cleanup and pipeline chunk-size regressions.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` immediately after verification.

### /work – Residual Type & Coverage Polish Task 1 2025-09-21 12:24 UTC
- Added `tests/test_cli.py::test_cli_translate_accepts_path_output` to pin regression coverage for Path-based `--output` handling.
- Normalised `_build_options_from_cli` to accept `Path | str | None`, guard the target language requirement, and reuse the validated language codes.
- Updated the provider parser in `external/dump_models.py` to treat empty credential env values as optional with explicit typing.
- `python -m pytest tests/test_cli.py -k "path_output" -xvs` → 1 passed (24 deselected) in 1.22s; coverage snapshot limited to targeted files by design.
- `uvx mypy .` → Success with zero errors; residual annotation-unchecked note only.

### /work – Residual Type & Coverage Polish Task 2 2025-09-21 12:33 UTC
- Added `tests/test_pipeline.py::test_translate_path_uses_dummy_chunk_size_when_defaults_zero` to exercise the base `DummyEngine.chunk_size_for` branch.
- Reused `AbersetzConfig` defaults collapsed to zero to force engine chunk-size lookup and validated plain-text invocation tracking.
- `python -m pytest tests/test_pipeline.py -k "dummy_chunk_size" -xvs` → 1 passed (11 deselected) in 1.39s; confirms engine fallback path executes with chunk size 7.

### /work – Residual Type & Coverage Polish Task 3 2025-09-21 12:40 UTC
- Added `tests/test_pipeline.py::test_translate_path_with_html_engine_handles_mixed_formats` to cover the non-HTML branch of `HtmlEngine.chunk_size_for`.
- Forced mixed-format inputs through a tracking engine to assert both HTML and plain invocations as well as distinct chunk-size outputs.
- `python -m pytest tests/test_pipeline.py -k "html_engine_handles_mixed" -xvs` → 1 passed (12 deselected) in 1.06s; verifies HTML and plain chunk hints are applied correctly.

### /report – Verification Sweep 2025-09-21 12:45 UTC
- `python -m pytest -xvs` → 180 passed, 8 skipped in 80.56s; coverage inline summary 98% with only intentionally skipped integration scaffolding and a single CLI guard line missing.
- `python -m pytest --cov=. --cov-report=term-missing` → 180 passed, 8 skipped in 94.41s; coverage table unchanged at 98%, listing CLI line 286 plus the skipped integration suite and guarded test scaffolding.
- `uvx mypy .` → Success with zero errors (annotation-unchecked note only).
- `uvx bandit -r .` → 448 Low-severity findings (expected pytest `assert`s and the config backup guard); Medium/High severities remain zero.
- Cleared `TODO.md`; no active items remain after completing the residual robustness sprint.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` after the verification sweep.

### /work – Type Hygiene & Chunking 2025-09-21 11:55 UTC
- Add mypy per-module ignore overrides for stubless third-party deps.
- Fix translate_path integration usage and add string path regression coverage.
- Add HTML chunk-size regression test to enforce engine-provided fallback.
- Extended `pyproject.toml` with `[tool.mypy]` overrides; `uvx mypy .` now reports only 3 real errors (CLI output handling and external dump fallback).
- Updated `tests/test_integration.py::test_translate_file_api` to use `TranslatorOptions(output_dir=...)` and introduced `tests/test_pipeline.py::test_translate_path_accepts_string_source_paths` plus `test_translate_path_html_uses_engine_chunk_hint` with explicit assert messages.
- `python -m pytest -xvs` → 177 passed, 8 skipped in 88.93s; coverage summary shows 98% overall with only intentionally skipped integration scaffolding outstanding.
- `python -m pytest --cov=. --cov-report=term-missing` → 177 passed, 8 skipped in 81.35s; coverage table unchanged (98% total) with residual misses restricted to skipped integrations and two pipeline/setup guard lines.
- `uvx mypy .` → 3 errors (expected CLI output option and external dump optional string handling).
- `uvx bandit -r .` → 438 Low-severity findings stemming from expected pytest `assert`s and the config backup `try/except/pass`; Medium/High severities remain zero.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` after verification runs.

### /report – Verification Sweep 2025-09-21 11:50 UTC
- `python -m pytest -xvs` → 175 passed, 8 skipped in 90.23s; overall coverage reported at 98% with only intentionally skipped integration scaffolding outstanding.
- `python -m pytest --cov=. --cov-report=term-missing` → 175 passed, 8 skipped in 80.11s; coverage table unchanged (98% total) with misses isolated to skipped integration placeholders and the single guarded pipeline/setup assertions.
- `uvx mypy .` → 49 errors; unchanged set covering missing third-party stubs plus legacy CLI/pipeline union assignments and the intentional integration keyword argument.
- `uvx bandit -r .` → 430 Low-severity findings (expected pytest `assert`s plus the config backup `try/except/pass` guard); Medium/High severities remain zero.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-report.

### /report – Regression Sweep 2025-09-21 11:28 UTC
- `python -m pytest -xvs` → 171 passed, 8 skipped in 94.32s; inline coverage reaffirms 97% overall with only `examples/advanced_api.py` and intentionally skipped integration scaffolding uncovered.
- `python -m pytest --cov=. --cov-report=term-missing` → 171 passed, 8 skipped in 80.50s; coverage breakdown unchanged with `tests/test_integration.py` skip placeholders and `examples/advanced_api.py` lines 141-343 flagged.
- `uvx mypy .` → 49 errors; all attributable to missing third-party stubs plus longstanding CLI union assignments (no regressions detected).
- `uvx bandit -r .` → 419 Low-severity findings (expected pytest asserts and the config backup `try/except/pass` guard); Medium/High severities remain zero.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts post-report.

### /work – Advanced Examples Hardening 2025-09-21 11:40 UTC
- Target `vocManager.load_voc` / `merge_vocabularies` gaps with focused tests.
- Prove `IncrementalTranslator` reloads existing checkpoints and rewrites after new work.
- Cover advanced example CLI `__main__` guard for dispatch and usage banner.
- Added `tests/test_examples.py` cases for voc loading/merging, incremental checkpoint reuse, and CLI dispatch/usage to close the remaining gaps in `examples/advanced_api.py`.
- `python -m pytest tests/test_examples.py -xvs` → 22 passed in 1.83s; advanced API coverage hit 100%.
- `python -m pytest -xvs` → 175 passed, 8 skipped in 84.35s; overall coverage climbed to 98% with only skipped integration scaffolding reported.
- `python -m pytest --cov=. --cov-report=term-missing` → 175 passed, 8 skipped in 82.11s; `examples/advanced_api.py` fully covered and remaining misses isolated to intentional skips.
- `uvx mypy .` → 49 errors (unchanged set: missing third-party stubs plus legacy CLI union assignments).
- `uvx bandit -r .` → 430 Low-severity findings (expected pytest `assert`s and the backup `try/except/pass` guard); Medium/High severities remain zero.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` artifacts after verification.

### /report – Verification Sweep 2025-09-21 09:14 UTC
- `python -m pytest -xvs` → 170 passed, 8 skipped in 104.55s; inline coverage summary reported 97% overall with misses limited to `examples/advanced_api.py` and skipped integration scaffolding.
- `python -m pytest --cov=. --cov-report=term-missing` → 170 passed, 8 skipped in 83.28s; coverage breakdown unchanged (97% total, `examples/advanced_api.py` 88%, targeted test gaps enumerated).
- `uvx mypy .` → 49 errors (unchanged set of missing third-party stubs plus known CLI/output type unions; no new diagnostics introduced).
- `uvx bandit -r .` → 415 Low-severity findings (`B101` asserts throughout tests and the config backup try/except guard); Medium/High severities remain zero.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` immediately after the sweep.

### /work – Coverage Touchups 2025-09-21 09:24 UTC
- Hardened optional-import fallbacks by extending `tests/test_chunking.py` and `tests/test_engine_catalog.py` to prove stdlib imports still succeed after the monkeypatched failures.
- Added `tests/test_pipeline.py::test_translate_path_uses_engine_chunk_size_when_defaults_falsy` to ensure engine-provided chunk sizes are honoured when config defaults collapse to zero.
- `python -m pytest tests/test_chunking.py -xvs` → 5 passed in 1.07s; fallback importer assertions green.
- `python -m pytest tests/test_engine_catalog.py -xvs` → 13 passed in 1.07s; ensured translators shim defers to stdlib imports.
- `python -m pytest tests/test_pipeline.py -k "chunk_size" -xvs` → 1 passed (others deselected) in 1.37s; engine chunk sizing verified.
- `python -m pytest -xvs` → 171 passed, 8 skipped in 98.65s; inline coverage now shows `chunking.py`, `engine_catalog.py`, and `pipeline.py` at 100%.
- `python -m pytest --cov=. --cov-report=term-missing` → 171 passed, 8 skipped in 82.02s; coverage steady at 97% with remaining misses isolated to `examples/advanced_api.py` and skipped integrations.
- `uvx mypy .` → 49 errors (no change; all missing stubs or pre-existing CLI argument type looseness).
- `uvx bandit -r .` → 419 Low-severity findings (expected test `assert`s plus config backup guard); Medium/High severities remain zero.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` post-run.

### /report – QA Sweep 2025-09-21 08:46 UTC
- `python -m pytest -xvs` → 162 passed, 8 skipped in 80.41s; coverage plugin summary steady at 95% overall (misses isolated to `examples/advanced_api.py` and `setup.py` fallback helper).
- `python -m pytest --cov=. --cov-report=term-missing` → 162 passed, 8 skipped in 80.61s; missing lines unchanged (`examples/advanced_api.py` bulk, `setup.py:221/262/286/471`, integration skips).
- `uvx mypy .` → 58 errors, all attributable to missing third-party stubs plus known legacy typing gaps; no new regressions detected.
- `uvx bandit -r .` → 390 Low findings (`B101` asserts in tests, `B110` backup guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` immediately after reporting.

### /work – Reliability Polish 2025-09-21 11:03 UTC
- Added targeted setup tests to exercise API-provider endpoint checks, verbose failure logging, and empty engine fallbacks; setup coverage now reports 100%.
- Refined `TranslationWorkflow.generate_report` with typed accumulators and safe doc handling; expanded tests keep JSON output stable while eliminating union-attr mypy noise.
- Introduced async example coverage (ParallelTranslator success/error) plus CLI example smoke tests for voc consistency, parallel comparison, and incremental translation.
- `python -m pytest -xvs` → 170 passed, 8 skipped in 108.03s; coverage plugin reports 97% total with `examples/advanced_api.py` at 88%.
- `python -m pytest --cov=. --cov-report=term-missing` → 170 passed, 8 skipped in 85.13s; remaining misses limited to manual CLI usage banner and integration skips.
- `uvx mypy .` → 49 errors (down from 58; all remaining diagnostics are missing third-party stubs plus known CLI signature notes).
- `uvx bandit -r .` → 415 Low findings (expected test asserts plus config backup guard); Medium/High severities remain zero.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` after the run.

### /work – Mypy Noise Reduction 2025-09-21 10:38 UTC
- Replaced DeepTranslator provider snapshot with `dict(...)`, adjusted CLI engines helper call signature, and tightened offline import assertions to cut redundant mypy errors.
- `python -m pytest tests/test_engines.py -k "deep_translator_engine_retry_on_failure" -xvs` → 1 passed (targeted regression).
- `python -m pytest tests/test_offline.py -xvs` → 9 passed.
- `python -m pytest tests/test_cli.py -k "cli_engines_lists_configured_providers" -xvs` → 1 passed.
- `uvx mypy .` → 58 errors remaining (down from 67; only missing stubs and legacy example typing remain).
- `python -m pytest -xvs` → 162 passed, 8 skipped in 82.49s; coverage summary unchanged at 95% overall.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage` after the sweep.

### Automated Report – 2025-09-21 10:29 UTC
- `python -m pytest -xvs` → 162 passed, 8 skipped in 81.14s; inline coverage table shows 95% total with `setup.py` holding 4 uncovered lines and tests gaps confined to integration scaffold cases.
- `python -m pytest --cov=. --cov-report=term-missing` → 162 passed, 8 skipped in 81.10s; coverage 95% overall with misses in `examples/advanced_api.py`, `setup.py:221/262/286/471`, and known integration placeholders.
- `uvx mypy .` → 67 errors across 21 files; all stem from missing third-party stubs (pytest, httpx, tenacity, loguru, rich, platformdirs, langcodes, semantic-text-splitter, requests) plus existing intentional loose example typing and test helpers.
- `uvx bandit -r .` → 390 Low findings (assert use in tests and the config backup guard); Medium/High severities absent.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, and `.coverage`.

### Automated Report – 2025-09-21 10:20 UTC
- `python -m pytest -xvs` → 162 passed, 8 skipped in 80.66s; inline coverage snapshot reported 95% total with `setup.py` down to four uncovered lines and `examples/advanced_api.py` raised to 47%.
- `python -m pytest --cov=. --cov-report=term-missing` → 162 passed, 8 skipped in 81.84s; coverage 95% overall with remaining misses confined to `examples/advanced_api.py`, `setup.py:221/262/286/471`, and skipped integration scaffolding.
- `uvx mypy .` → 67 errors across 21 files; cleared the Chat namespace diagnostics, outstanding items are missing third-party stubs plus longstanding test/example attr warnings.
- `uvx bandit -r .` → 390 Low findings (expected test `assert`s and config backup guard); Medium/High severities absent.
- Targeted verifications: `python -m pytest tests/test_openai_lite.py -k completions -xvs`, `python -m pytest tests/test_setup.py -k select_default_engine -xvs`, `python -m pytest tests/test_examples.py -k translation_workflow -xvs` all passed post-fixes.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.coverage`, and `.benchmarks` after finishing sweeps.

### Automated Report – 2025-09-21 10:02 UTC
- `python -m pytest -xvs` → 154 passed, 8 skipped in 83.79s; inline coverage snapshot reported 94% total with residual misses on `setup.py:220/261/285/457-459`, targeted test gaps (`tests/test_chunking.py:39`, `tests/test_engine_catalog.py:77`), and intentionally skipped integration scaffolding.
- `python -m pytest --cov=. --cov-report=term-missing` → 154 passed, 8 skipped in 82.36s; coverage report identical to inline snapshot (94% overall) and itemized missing lines for `examples/advanced_api.py`, `setup.py`, and integration placeholders.
- `uvx mypy .` → 71 errors across 21 files; unchanged missing third-party stubs plus known attr/type diagnostics in examples, CLI fixtures, and tests (see `tests/test_integration.py:107`, `examples/advanced_api.py:68-79`, etc.).
- `uvx bandit -r .` → 377 Low findings (`B101` asserts in tests, `B110` backup guard); Medium/High severities remain clear.
- Initial `python -m pytest -xvs` attempt hit the harness 10s timeout; reran with extended timeout to complete full suite.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.coverage`, and `.benchmarks` immediately after documenting results.

### Automated Report – 2025-09-21 09:36 UTC
- `python -m pytest -xvs` → 151 passed, 8 skipped in 86.11s; coverage report embedded, remaining misses on `config.py:322`, `setup.py:220/261/285/457-459`, and the intentionally skipped portions of `tests/test_integration.py`.
- `python -m pytest --cov=. --cov-report=term-missing` → 151 passed, 8 skipped in 86.37s; total coverage steady at 97% with the same uncovered lines.
- `uvx mypy .` → 77 errors across 21 files (unchanged; all due to missing third-party stubs plus legacy example/external Optional defaults and `_translators` attribute access in tests).
- `uvx bandit -r .` → 364 Low-severity findings (expected test `assert`s and config backup guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, and `.ruff_cache` immediately after finishing documentation updates.

### Automated Report – 2025-09-21 09:31 UTC
- `python -m pytest -xvs` → 151 passed, 8 skipped in 92.98s; coverage inline summary still 97% with `setup.py` misses trimmed to 6 lines (verbose failure log plus fallback branch).
- `python -m pytest --cov=. --cov-report=term-missing` → 151 passed, 8 skipped in 87.49s; uncovered lines limited to `config.py:322`, `setup.py:220/261/285/457-459`, and intentionally skipped integration scaffolding.
- `uvx mypy .` → 77 errors across 21 files (down from 88; removed `_BasicApiModule` attribute and pipeline stat override complaints, remaining issues are third-party stubs plus legacy examples/external scripts).
- `uvx bandit -r .` → 364 Low-severity findings (expected test `assert`s and config backup guard); Medium/High severities clear.
- Targeted verifications: `python -m pytest tests/test_examples.py -xvs`, `python -m pytest tests/test_pipeline.py -k warns_on_large_file -xvs`, `python -m pytest tests/test_setup.py -k defaults_to_ullm -xvs` all passed after adjustments.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` after suite completion.

### Current Iteration – Type & Setup Reliability
- [x] Expanded `_BasicApiModule` protocol and helper stubs to satisfy example typing while keeping runtime behaviour intact.
- [x] Matched the pipeline large-file stat monkeypatch to `Path.stat`'s signature/return type to silence mypy override noise.
- [x] Added a setup wizard regression test covering the OpenAI-only default branch, tightening coverage on `setup.py` fallback logic.

### Current Iteration – Coverage & Typing Polish
- [x] Add regression test for `resolve_credential` recursion and fix the infinite loop when env credentials are unset.
- [x] Refactor translator engine tests to stub `abersetz.engines.translators` rather than touching `_translators`.
- [x] Remove Optional defaults from `examples/advanced_api.py` and cover them with focused tests.

#### Verification – Coverage & Typing Polish
- `python -m pytest tests/test_config.py -k recursive_name -xvs` → passes; verifies recursion fix and INFO log emission.
- `python -m pytest tests/test_engines.py -k "translators_engine" -xvs` → passes; confirms translator stubs exercise text/HTML/retry flows without private attributes.
- `python -m pytest tests/test_examples.py -k "translation_workflow or translate_with_consistency" -xvs` → passes; exercises new lazy config guard and vocabulary merging defaults.
- `python -m pytest -xvs` → 154 passed, 8 skipped in 81.16s; `config.py` and translator tests now covered without recursion warnings.
- `python -m pytest --cov=. --cov-report=term-missing` → 154 passed, 8 skipped in 81.67s; overall coverage remains 97% with residual gaps limited to `setup.py` fallback and intentionally skipped integration tests.
- `uvx mypy .` → 71 errors across 21 files (down from 77); remaining issues are missing third-party stubs plus longstanding Optional/`Mapping.copy` diagnostics in examples and CLI fixtures.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, and `.ruff_cache` after the full-suite verification.

### Automated Report – 2025-09-21 09:19 UTC
- `python -m pytest -xvs` → 150 passed, 8 skipped in 80.27s; inline coverage summary reported 97% overall with remaining misses on `config.py:322`, selected `setup.py` verbose/error branches, and integration scaffolding intentionally skipped.
- `python -m pytest --cov=. --cov-report=term-missing` → 150 passed, 8 skipped in 79.98s; coverage steady at 97% with identical uncovered lines plus `tests/test_integration.py` skip list.
- `uvx mypy .` → 88 errors across 22 files; missing stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich/semantic_text_splitter/tomli_w and Optional defaults in examples/external helpers remain outstanding.
- `uvx bandit -r .` → 361 Low-severity findings (test `assert` usage and config backup guard); Medium/High severities clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` immediately after documenting results.

### Automated Report – 2025-09-21 09:11 UTC
- `python -m pytest -xvs` → 150 passed, 8 skipped in 80.08s; total coverage climbed to 97% with `engines.py` now fully covered and `setup.py` misses reduced to verbose/error-reporting lines only.
- `python -m pytest --cov=. --cov-report=term-missing` → 150 passed, 8 skipped in 81.03s; coverage 97% overall with residual gaps on `config.py:322`, selective `setup.py` branches, and intentionally skipped integration scaffolding.
- `uvx mypy .` → 88 errors across 22 files; unchanged missing type stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich/semantic_text_splitter/tomli_w plus Optional defaults in `examples/advanced_api.py` and external tooling.
- `uvx bandit -r .` → 361 Low-severity findings (expected test `assert`s and config backup guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`; `.ruff_cache` absent.

### Automated Report – 2025-09-21 08:57 UTC
- `python -m pytest -xvs` → 142 passed, 8 skipped in 89.51s; coverage plugin recap still at 96% with residual misses on `config.py:322`, `engines.py` fallback branches, and setup validation scaffolding.
- `python -m pytest --cov=. --cov-report=term-missing` → 142 passed, 8 skipped in 81.25s; coverage 96% overall with gaps identical to the standard run plus intentionally skipped integration suite lines.
- `uvx mypy .` → 85 errors across 22 files; missing third-party stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich/semantic_text_splitter/tomli_w persist alongside Optional defaults in `examples/advanced_api.py` and external scripts.
- `uvx bandit -r .` → 347 Low-severity findings (expected test `assert` usage and the config backup best-effort handler); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` immediately after documenting results. `.ruff_cache` absent.

### Automated Report – 2025-09-21 08:46 UTC
- `python -m pytest -xvs` → 142 passed, 8 skipped in 79.64s; coverage snapshot shows `config.py` 99% (1 miss), `engines.py` 98% (3 misses), `pipeline.py` now 100% after large-file warning test.
- `python -m pytest --cov=. --cov-report=term-missing` → 142 passed, 8 skipped in 80.49s; total coverage steady at 96% with residual misses on `resolve_credential` chained fallback, deep-translator unsupported provider error, and setup validation scaffolding.
- `uvx mypy .` → 85 errors across 22 files (unchanged; missing third-party stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich/semantic_text_splitter/tomli_w plus Optional defaults in `examples/advanced_api.py` and external scripts).
- `uvx bandit -r .` → 347 Low-severity findings (expected `assert` usage in tests and config backup guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` after verification.

### Automated Report – 2025-09-21 08:36 UTC
- `python -m pytest -xvs` → 135 passed, 8 skipped in 81.67s (initial 120s harness timeout re-run with extended limit); coverage plug-in summary shows `cli.py`, `validation.py`, and examples at 100% with total coverage 96%.
- `python -m pytest --cov=. --cov-report=term-missing` → 135 passed, 8 skipped in 80.16s; coverage remains 96% with misses confined to credential recursion (`config.py`), retry fallbacks (`engines.py`), pipeline error messaging, setup validation branches, and skipped integration scaffolding.
- `uvx mypy .` → 83 errors across 22 files (unchanged); all stem from missing third-party stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich/semantic_text_splitter/tomli_w plus the `Chat.completions` shim used in `openai_lite.py`, example Optional defaults, and external research scripts.
- `uvx bandit -r .` → 337 Low-severity findings (expected test `assert` usage and the config backup `try/except/pass` guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` immediately after the test suite.

### Current Iteration – Coverage Guard V
- [x] Probe `_test_single_endpoint` dict/list branches and verbose logging without network access.
- [x] Harden provider discovery defaults and validation early-return behaviour.
- [x] Exercise Translators HTML path, deep-translator guard, and `_build_hysf_engine` credential error.

### Verification – Coverage Guard V
- `python -m pytest tests/test_setup.py -k "list_payload or logs_verbose" -xvs` → validated HTML/list parsing and verbose logging sink.
- `python -m pytest tests/test_setup.py -k "deepl or prefers_hysf or returns_immediately" -xvs` → confirmed Deepl mapping, `_validate_config([])` early exit, and hysf default ordering.
- `python -m pytest tests/test_engines.py -k "handles_html or rejects_unknown_provider or build_hysf" -xvs` → covered Translators HTML path, deep-translator rejection, and `_build_hysf_engine` credential guard.
- `python -m pytest -xvs` / `python -m pytest --cov=. --cov-report=term-missing` → 150 passed, 8 skipped; project coverage 97% with remaining misses limited to setup verbose/error lines and integration skips.
- `uvx mypy .`, `uvx bandit -r .` → diagnostics unchanged (missing third-party stubs; 361 Low findings only).

### Current Iteration – Reliability Guard IV
- [x] Cover `resolve_credential` null payload and alias recursion
- [x] Cover pipeline large-file warning branch
- [x] Cover LLM payload fallbacks and missing engine config error

### Verification – Reliability Guard IV
- `python -m pytest tests/test_config.py -k "resolve_credential_returns_none or resolve_credential_reuses" -xvs` → 2 passed; exercised `resolve_credential` null and alias branches.
- `python -m pytest tests/test_pipeline.py -k warns_on_large_file -xvs` → 1 passed; captured loguru warning for simulated 11MB file.
- `python -m pytest tests/test_engines.py -k "parse_payload or missing_selector" -xvs` → 4 passed; covered LLM payload fallbacks and missing engine configuration error path.

### Automated Report – 2025-09-21 08:16 UTC
- `python -m pytest -xvs` → 130 passed, 8 skipped in 80.24s; coverage 96% with `cli.py` 99%, `validation.py` 100%, and remaining misses limited to setup/config fallback paths plus skipped integrations.
- `python -m pytest --cov=. --cov-report=term-missing` → 130 passed, 8 skipped in 79.46s; coverage steady at 96% with misses on `cli.py:177`, config/env fallbacks, setup validation branches, and intentionally skipped integration scaffolding.
- `uvx mypy .` → 83 errors across 22 files (missing third-party stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich plus Optional/default typing gaps in examples and external research scripts).
- `uvx bandit -r .` → 329 Low-severity findings (expected test `assert` usage and the config backup `try/except/pass` guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` after documentation updates.

### Current Iteration – Coverage Polish III
- [x] Add `tests/test_cli.py` coverage for deep-translator string `providers` branch.
- [x] Add `tests/test_config.py` coverage for `EngineConfig.to_dict` optional fields.
- [x] Add `tests/test_engine_catalog.py` guards for null/blank selectors.

### Verification – Coverage Polish III
- `python -m pytest tests/test_cli.py -k deep_translator_string -xvs` → confirmed `dt/libre` is marked configured when `providers` is a string.
- `python -m pytest tests/test_config.py -k engine_config_to_dict -xvs` → asserted optional chunk sizes and credential survive round-trip serialization.
- `python -m pytest tests/test_engine_catalog.py -k normalize_selector -xvs` → exercised `None`/blank/empty-base normalization guards.
- `python -m pytest -xvs` → 135 passed, 8 skipped in 99.07s; coverage summary shows `cli.py` and `engine_catalog.py` now at 100%, `config.py` at 99% with only credential recursion lines uncovered.
- `python -m pytest --cov=. --cov-report=term-missing` → 135 passed, 8 skipped in 84.83s; total coverage steady at 96% with new misses isolated to credential recursion and integration scaffolding.
- `uvx mypy .` → 83 errors in 22 files (unchanged; third-party stubs and Optional defaults outstanding in examples/external scripts).
- `uvx bandit -r .` → 337 Low-severity findings (increase from additional test `assert`s); Medium/High severities remain clear.

### Automated Report – 2025-09-21 08:25 UTC
- `python -m pytest -xvs` → 135 passed, 8 skipped in 99.07s; modules `cli.py` and `engine_catalog.py` now fully covered.
- `python -m pytest --cov=. --cov-report=term-missing` → 135 passed, 8 skipped in 84.83s; coverage holding at 96% with residual misses on credential recursion and integration scaffolding.
- `uvx mypy .` → 83 errors across 22 files (unchanged; missing stubs for pytest/httpx/tenacity/platformdirs/langcodes/loguru/rich plus Optional defaults in examples/external scripts).
- `uvx bandit -r .` → 337 Low-severity findings (expected test `assert`s plus config backup guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` post-run.

### Automated Report – 2025-09-21 07:57 UTC
- `python -m pytest -xvs` → 126 passed, 8 skipped in 84.61s; coverage snapshot 96% with residual misses in integration scaffolding plus select CLI/config/setup fallback branches.
- `python -m pytest --cov=. --cov-report=term-missing` → 126 passed, 8 skipped in 90.62s; coverage steady at 96% with uncovered lines `[examples/basic_api.py:150]`, `__init__.py`, chunking fallbacks, engines retry branches, setup progress messaging, and intentionally skipped integration suite.
- `uvx mypy .` → 92 errors (unchanged); all stem from missing third-party stubs for pytest/httpx/tenacity/semantic_text_splitter/langcodes/platformdirs/tomli_w/loguru/rich/requests plus deliberate `Chat.completions` usage and dynamic example exports.
- `uvx bandit -r .` → 321 Low-severity findings (expected test `assert` usage and config backup guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` after verification.

### Current Iteration – Coverage Polish II
- [x] Add CLI dispatch regression test for `examples/basic_api.py` main entry.
- [x] Cover `chunk_text` empty input and fallback path under forced ImportError.
- [x] Assert `abersetz.__getattr__` raises for unknown exports while caching pipeline imports.

### Verification – Coverage Polish II
- `python -m pytest tests/test_examples.py -k cli_dispatch -xvs` → 1 passed; confirmed CLI dispatch path executes the stub and suppresses the usage banner.
- `python -m pytest tests/test_chunking.py -k "blank or fallback" -xvs` → 2 passed; fallback generator returned the expected slices when `semantic_text_splitter` import failed on purpose.
- `python -m pytest tests/test_package.py -k getattr -xvs` → 1 passed; invalid attribute lookup now surfaces `AttributeError` and cached pipeline exports remain stable.
- `python -m pytest -xvs` → 130 passed, 8 skipped in 78.40s; coverage 96% with `examples/basic_api.py` and `chunking.py` both at 100%.
- `python -m pytest --cov=. --cov-report=term-missing` → 130 passed, 8 skipped in 78.58s; coverage steady at 96% with updated uncovered line set noted.
- `uvx mypy .` → 92 errors unchanged (missing third-party stubs plus intentional `Chat.completions` and example protocol lookups).
- `uvx bandit -r .` → 329 Low-severity findings (additional test `assert`s plus existing config backup guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` after verification.
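Forcing the ImportError fallback without uninstalling anything can be done by poisoning `sys.modules`; the chunker below is a simplified stand-in for `chunking.py` (the `TextSplitter` call reflects the semantic-text-splitter API as documented, but is untested here):

```python
import sys

def chunk_text(text: str, size: int = 4) -> list[str]:
    """Prefer the semantic splitter; fall back to fixed-width slices."""
    if not text:
        return []
    try:
        import semantic_text_splitter
    except ImportError:
        return [text[i : i + size] for i in range(0, len(text), size)]
    return semantic_text_splitter.TextSplitter(size).chunks(text)

def test_fallback_when_import_fails() -> None:
    # A None entry in sys.modules makes the import raise ImportError.
    sys.modules["semantic_text_splitter"] = None  # type: ignore[assignment]
    try:
        assert chunk_text("abcdefgh") == ["abcd", "efgh"]
        assert chunk_text("") == []
    finally:
        sys.modules.pop("semantic_text_splitter", None)
```

In the real suite the same effect is typically achieved with pytest's `monkeypatch.setitem(sys.modules, ...)`, which restores state automatically.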

### Automated Report – 2025-09-21 07:36 UTC
- `python -m pytest -xvs` → 118 passed, 8 skipped in 81.04s; total coverage snapshot 95% with `cli.py` at 99%, `pipeline.py` at 99%, and remaining deltas isolated to integration scaffolding.
- `python -m pytest --cov=. --cov-report=term-missing` → 118 passed, 8 skipped in 78.97s; coverage steady at 95% with misses confined to CLI verbose path, config/env fallbacks, setup progress output, and intentionally skipped integration suite.
- `uvx mypy .` → 92 errors (unchanged); all attributable to missing third-party type stubs for pytest/httpx/tenacity/semantic_text_splitter/langcodes/platformdirs/tomli_w/loguru/rich/requests plus deliberate `Chat.completions` access and dynamic example exports.
- `uvx bandit -r .` → 316 Low-severity findings (expected test `assert` usage and the config backup `try/except/pass` guard); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` after verification.

### Current Iteration – Engine Factory Reliability
- [x] Harden `_build_llm_engine` missing model/credential tests.
- [x] Cover `_select_profile` happy path + unknown profile error.
- [x] Exercise `_make_openai_client` base URL and unsupported selector guard.
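The unsupported-selector guard in the third item can be sketched as a mapping lookup that fails loudly; the selector names and base URL are hypothetical, not abersetz's actual configuration:

```python
def make_openai_client_config(selector: str) -> dict[str, str]:
    """Map an engine selector to OpenAI-client kwargs; reject unknown selectors."""
    bases = {"ll/openai": "https://api.openai.com/v1"}  # assumed mapping
    try:
        return {"base_url": bases[selector]}
    except KeyError:
        raise ValueError(f"unsupported selector: {selector!r}") from None

assert make_openai_client_config("ll/openai")["base_url"].startswith("https://")
try:
    make_openai_client_config("tr/google")
except ValueError as exc:
    assert "unsupported selector" in str(exc)
```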

### Verification – Engine Factory Reliability
- `python -m pytest tests/test_engines.py -k "llm" -xvs` → 3 passed; validated new LLM factory error guards.
- `python -m pytest tests/test_engines.py -k "profile" -xvs` → 4 passed; confirmed default/variant selection and no-profile fallback paths.
- `python -m pytest tests/test_engines.py -xvs` → 15 passed; `src/abersetz/engines.py` coverage rose to 96% with 100% test file coverage.
- `python -m pytest -xvs` → 126 passed, 8 skipped in 83.36s; project coverage 96% with engines hot spots now green.
- `python -m pytest --cov=. --cov-report=term-missing` → 126 passed, 8 skipped in 83.63s; coverage steady at 96% with remaining misses limited to integration scaffolding and setup progress branches.
- `uvx mypy .` → 92 errors unchanged; all due to missing third-party stubs and intentional dynamic attributes.
- `uvx bandit -r .` → 321 Low-severity findings (expected test `assert` usage and config backup guard); no Medium/High issues.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` after verification.

### Automated Report – 2025-09-21 07:20 UTC
- `python -m pytest -xvs` → 116 passed, 8 skipped in 81.16s; inline coverage snapshot reported 95% total with hot spots remaining in CLI/config/engine helper branches and intentionally skipped integration suite.
- `python -m pytest --cov=. --cov-report=term-missing` → 116 passed, 8 skipped in 78.87s; coverage report confirmed 95% overall with uncovered lines called out across CLI verbose handling, config fallback branches, engine retry paths, setup progress reporting, and integration scaffolding.
- `uvx mypy .` → 92 errors driven by missing third-party stubs (`pytest`, `httpx`, `tenacity`, `semantic_text_splitter`, `langcodes`, `platformdirs`, `tomli_w`, `loguru`, `rich`, `requests`) plus deliberate `Chat.completions` attribute access and dynamically exposed example helpers.
- `uvx bandit -r .` → 308 Low-severity findings (expected `assert` usage throughout tests and the guarded config backup fallback); Medium/High severities remain clear.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage` artifacts after verification.

### Current Iteration – Config & CLI Reliability
- [x] Write failing tests for `ConfigCommands.show`/`.path` round-trip with a seeded config directory.
- [x] Write failing tests covering single string providers plus `ullm` default profile fallback in `_collect_engine_entries`.
- [x] Write failing test ensuring `resolve_credential` returns stored secrets when env vars are missing.

### Verification – 2025-09-21 07:30 UTC
- `python -m pytest tests/test_cli.py -k "config_commands or string_branches" -xvs` → 2 passed; confirmed new CLI tests and captured coverage snippet for targeted cases.
- `python -m pytest tests/test_config.py -k recurses -xvs` → 1 passed with expected debug log about `CHAINED_KEY`.
- `python -m pytest -xvs` → 118 passed, 8 skipped in 81.39s; `cli.py` coverage 99%, `config.py` 98%, total coverage 95%.
- `python -m pytest --cov=. --cov-report=term-missing` → 118 passed, 8 skipped; coverage confirmed 95% with previously uncovered CLI/config lines now green except for residual integration gaps.
- `uvx mypy .` → 92 errors (missing third-party stubs plus intentional dynamic attributes) unchanged from baseline.
- `uvx bandit -r .` → 316 Low-severity findings (expected test asserts + config backup guard); no Medium/High issues.
- /cleanup removed `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.benchmarks`, `.coverage`.

### Automated Report – 2025-09-21 07:08 UTC
- `python -m pytest -xvs` → 112 passed, 8 skipped in 79.37s; coverage snapshot 94% overall with residual gaps in CLI/config/engine helper branches and intentionally skipped integration flows. Config/setup warnings surfaced as expected during fixture resets.
- `python -m pytest --cov=. --cov-report=term-missing` → 112 passed, 8 skipped in 78.85s; total coverage 94% with reported misses limited to CLI render fallbacks, config defaults, engine catalog/provider aggregation, setup progress output, and integration placeholders.
- `uvx mypy .` → 92 errors, all due to absent third-party stubs (`pytest`, `httpx`, `tenacity`, `semantic_text_splitter`, `langcodes`, `rich`, `platformdirs`, `tomli_w`, `loguru`, `requests`) plus deliberate attribute lookups on mocked OpenAI `Chat.completions` helpers and dynamically exported examples.
- `uvx bandit -r .` → 300 Low severity findings (expected `assert` usage throughout tests and the guarded config backup fallback); no Medium or High issues logged.
- Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` during /cleanup.

### Current Iteration – CLI Shell Coverage
- [x] Cover `AbersetzCLI.tr` pipeline error reporting branch (red console output + raised `PipelineError`).
- [x] Verify `AbersetzCLI.setup` forwards `non_interactive`/`verbose` flags to `setup_command`.
- [x] Confirm `main()` and `abtr_main()` invoke `fire.Fire` with expected callables.

### Verification – 2025-09-21 07:16 UTC
- `python -m pytest tests/test_cli.py -k "pipeline_error or setup_forwards or main_invokes" -xvs` → 4 passed (targeted smoke for new CLI shell coverage cases).
- `python -m pytest -xvs` → 116 passed, 8 skipped in 80.15s; `src/abersetz/cli.py` coverage climbed to 96%, total project coverage 95%.
- `python -m pytest --cov=. --cov-report=term-missing` → 116 passed, 8 skipped in 79.22s; overall coverage 95% with remaining misses limited to CLI verbose details, engine fallback branches, setup progress, and integration placeholders.
- `uvx mypy .` → 92 errors (unchanged; missing third-party stubs for pytest/httpx/tenacity/semantic_text_splitter/langcodes/rich/platformdirs/tomli_w/loguru/requests plus expected dynamic attributes in openai/examples/tests).
- `uvx bandit -r .` → 308 Low severity findings (all deliberate test asserts plus config backup fallback); Medium/High severities clear.
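Asserting that an entry point hands the expected callable to `fire.Fire` can be done with a recording stub; `AbersetzCLI` and `main` below are placeholders for the real imports, and in the suite the stub would be installed via `monkeypatch.setattr`:

```python
class AbersetzCLI:  # placeholder for the real CLI class
    pass

class _FakeFire:
    """Records every component passed to Fire instead of dispatching it."""
    def __init__(self) -> None:
        self.seen: list[object] = []
    def Fire(self, component: object) -> None:
        self.seen.append(component)

fire = _FakeFire()  # real test: monkeypatch.setattr("abersetz.cli.fire", fake)

def main() -> None:
    fire.Fire(AbersetzCLI)

main()
assert fire.seen == [AbersetzCLI]
```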

### Automated Report – 2025-09-21 04:53 UTC
- `python -m pytest -xvs` → 109 passed, 8 skipped in 80.44s; coverage plugin snapshot 94% overall with misses limited to CLI helper branches, engine retry paths, setup prompts, and intentionally skipped integration flows.
- `python -m pytest --cov=. --cov-report=term-missing` → 109 passed, 8 skipped in 80.47s; overall coverage 94% with uncovered lines enumerated for CLI/config/engine helpers plus planned integration gaps.
- `uvx mypy .` → 40+ errors expected from missing third-party stubs (`pytest`, `httpx`, `tenacity`, `semantic_text_splitter`, `langcodes`, `rich`, `platformdirs`, `tomli_w`, `loguru`) and mocked `Chat.completions` attributes within tests/openai shim.
- `uvx bandit -r .` → 292 Low severity findings (test `assert` usage and guarded config backup fallback); no Medium/High issues detected.
- Completed /cleanup by removing `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` post-verification.

### Current Iteration – Coverage Hardening
- Target `_render_results` output path with a focused CLI unit test.
- Extend `_collect_engine_entries` tests covering single-string translator provider configs.
- Add HTML-specific chunk size assertions for `EngineBase.chunk_size_for`.
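The HTML-specific assertion in the last item reduces to checking that HTML input selects a distinct chunk budget; the field names and defaults below are assumptions, not abersetz's real `EngineBase`:

```python
from dataclasses import dataclass

@dataclass
class EngineBase:
    """Illustrative base: HTML content gets its own (smaller) chunk budget."""
    chunk_size: int = 1200
    html_chunk_size: int = 800

    def chunk_size_for(self, is_html: bool) -> int:
        return self.html_chunk_size if is_html else self.chunk_size

engine = EngineBase()
assert engine.chunk_size_for(is_html=True) == 800
assert engine.chunk_size_for(is_html=False) == 1200
```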

### Coverage Hardening Verification – 2025-09-21 05:02 UTC
- `python -m pytest -k "render_results or collect_engine_entries or chunk_size_for" -xvs` → 5 passed (targeted smoke of new tests).
- `python -m pytest -xvs` → 112 passed, 8 skipped in 79.77s; CLI coverage now 92% and `engines.py` 92%, with `tests/test_cli.py` and `tests/test_engines.py` both at 100% coverage.
- `python -m pytest --cov=. --cov-report=term-missing` → 112 passed, 8 skipped; total coverage holds at 94% with remaining misses limited to long-tail CLI/setup/helper branches and intentional integration skips.
- `uvx mypy .` → 40+ expected errors from missing third-party stubs (pytest/httpx/tenacity/semantic_text_splitter/langcodes/rich/platformdirs/tomli_w/loguru) plus mocked `Chat.completions` attributes and example accessors.
- `uvx bandit -r .` → 300 Low severity findings (test `assert` usage + backup fallback) with no Medium/High issues.
- Updated PLAN.md to mark coverage hardening sprint complete and checked off TODO items after verifying new tests.
- Cleared `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` after verification run.

### Automated Report – 2025-09-21 06:13 UTC
- `python -m pytest -xvs` → 96 passed, 8 skipped in 86.56s; coverage plugin snapshot 93% overall with gaps concentrating in `cli.py`, `config.py`, `engine_catalog.py`, `engines.py`, and integration skips.
- `python -m pytest --cov=. --cov-report=term-missing` → 96 passed, 8 skipped in 95.74s; total coverage 93% with uncovered lines enumerated for CLI/config/engine modules plus intentional integration skips.
- `uvx mypy .` → 70+ errors dominated by missing third-party stubs (`pytest`, `httpx`, `tenacity`, `langcodes`, `rich`, `platformdirs`, `tomli_w`, `loguru`) and stubbed `Chat.completions` attribute usage within tests and `openai_lite.py`.
- `uvx bandit -r .` → 264 Low severity findings (expected `assert` usage across tests and guarded backup writer fallback in `config.py`).
- Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache`.
- Identified follow-up coverage tasks for config fallback, engine catalog provider helpers, and CLI engine filters; logged in `PLAN.md` and `TODO.md`.

### Automated Report – 2025-09-21 06:25 UTC
- `python -m pytest -xvs` → 103 passed, 8 skipped in 84.21s; coverage plugin snapshot 94% with notable improvements in `config.py` (94%), `engine_catalog.py` (93%), and `cli.py` (90%).
- `python -m pytest --cov=. --cov-report=term-missing` → 103 passed, 8 skipped in 84.08s; total coverage steady at 94% with remaining misses isolated to integration skips and a handful of CLI/config helper branches.
- `uvx mypy .` → 70+ errors persist (missing stubs for `pytest`, `httpx`, `tenacity`, `langcodes`, `rich`, `platformdirs`, `tomli_w`, `loguru`; expected `Chat.completions` attribute stubs; fixture objects without typed attributes).
- `uvx bandit -r .` → 273 Low severity findings (test `assert` usage + config backup fallback) — no new Medium/High issues.
- Added regression tests in `tests/test_config.py`, `tests/test_engine_catalog.py`, and `tests/test_cli.py` covering config fallback logging, engine provider discovery, and CLI engine filters; TODO items cleared.
- Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` after full verification.

### Automated Report – 2025-09-21 06:35 UTC
- `python -m pytest -xvs` → 103 passed, 8 skipped in 78.04s; coverage plugin snapshot 94% overall with remaining misses centred on `cli.py`, `config.py`, `engine_catalog.py`, `engines.py`, and integration skips.
- `python -m pytest --cov=. --cov-report=term-missing` → 103 passed, 8 skipped in 78.76s; total coverage 94% with uncovered lines explicitly reported for CLI helper branches, config defaults, engine fallbacks, setup prompts, and integration smoke tests (still intentionally partial).
- `uvx mypy .` → 92 errors driven by missing third-party stubs (`pytest`, `httpx`, `tenacity`, `semantic_text_splitter`, `langcodes`, `rich`, `platformdirs`, `tomli_w`, `loguru`) plus expected attribute lookups on mocked OpenAI `Chat.completions` helpers and stubbed example exports.
- `uvx bandit -r .` → 273 Low severity findings (expected `assert` usage throughout tests and the guarded backup writer fallback in `config.py`); no Medium/High issues detected.
- Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` as part of /cleanup following the verification sweep.

### Automated Report – 2025-09-21 06:44 UTC
- `python -m pytest -xvs` → 109 passed, 8 skipped in 79.46s; coverage plugin summary 94% overall with `config.py` now 97% and `engine_catalog.py` 95% after new fallback/aggregation tests.
- `python -m pytest --cov=. --cov-report=term-missing` → 109 passed, 8 skipped in 78.85s; total coverage steady at 94% with remaining misses limited to CLI helper branches, engine retry paths, setup prompts, and intentionally skipped integration flows.
- `uvx mypy .` → 92 errors (unchanged; driven by missing third-party stubs and expected mocked attribute lookups across openai/httpx/tenacity/langcodes/rich/platformdirs/tomli_w/loguru plus fixture helper projections).
- `uvx bandit -r .` → 292 Low severity findings (increase from additional asserts in new tests; same config backup fallback flagged); no Medium/High issues detected.
- Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` during post-run /cleanup.


### Issue #200 Kickoff
- Reset historical log entries to focus on current objectives.
- Pending: document progress once refreshed roadmap and tasks are defined.
### Maintenance Snapshot
- Ran full pytest suite (`python -m pytest -xvs`) – 30 passed, 8 skipped, 0 failed; coverage plugin reported 75% overall.
- Cleared ephemeral artifacts (`.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.DS_Store`).
- No source changes performed this session; modifications remain documentation-only.
- 2025-09-21 04:45 UTC — Removed `.pytest_cache`, `.benchmarks`, `.coverage`, and `.ruff_cache` after validation pass.
### Current Iteration Targets
- [x] Normalize engine selector handling with canonical short aliases in config + CLI.
- [x] Refresh CLI engine listings and defaults to present short selectors only.
- [x] Add regression tests covering selector normalization and CLI output updates.
- [x] Add engine listing filters (`--family`, `--configured-only`) and document the UX changes.
- [x] Harden CLI entry points + setup to accept short selectors everywhere and update language listing UX/tests.
- [x] Extend provider discovery metadata using external research and surface pricing hints.
- [x] Refresh documentation and runnable examples to reflect validation workflow.
- [x] Raise coverage by deepening tests for validation flows and setup integration.

### Reliability Boost Sprint – Active
- [x] Enforce pipeline read-permission safeguards (tests/test_pipeline.py)
- [x] Exercise `_persist_output` write-over & voc paths (tests/test_pipeline.py)
- [x] Mock example flows for coverage (tests/test_examples.py)
### Release Readiness Checklist (Completed)
- [x] Raise total coverage to ≥90% by targeting `setup.py`, `openai_lite.py`, and high-miss integration paths.
- [x] Document manual smoke tests (engines, validate, setup, translation) with latest run results.
- [x] Draft release notes summarizing selector overhaul, validation command, and setup improvements.
### Maintenance Sprint Targets
- [x] Expand `validation.py` helper coverage for selector normalization and defaults.
- [x] Extend CLI tests for `validate` and `lang` commands.
- [x] Document validation selector guidance in `docs/cli.md` and cross-links.
### Test Results
- `python -m pytest -xvs` → 56 passed, 8 skipped, 0 failed; coverage plugin emitted 86% overall with `setup.py` (67%) and `openai_lite.py` (60%) still trailing our 90% release target.
- 2025-09-21 04:42 UTC — `python -m pytest -xvs` → 56 passed, 8 skipped; coverage report at 86% overall (per-terminal plugin) with significant gaps remaining in `setup.py` (67%) and `openai_lite.py` (60%).
- 2025-09-21 04:58 UTC — `python -m pytest -xvs` → 74 passed, 8 skipped; coverage climbed to 91% overall with `setup.py` at 93% and residual hotspots in `cli.py` (83%) and `validation.py` (89%).
- 2025-09-21 05:10 UTC — `python -m pytest -xvs` → 78 passed, 8 skipped; coverage now 92% overall with `validation.py` at 100% and `cli.py` lifted to 84%.
### Manual Smoke Tests (2025-09-21)
- `python -m abersetz.cli_fast engines` — succeeded; rendered 22 selectors across tr/dt/hy/ll families with expected configuration flags.
- `python -m abersetz.cli_fast validate` — aborted after 60s timeout; translators backends attempt live network calls, needs stubbed selector set for offline smoke testing.
- `python -m abersetz.cli_fast setup --non_interactive True` — aborted after 60s timeout at post-setup validation for the same reason; requires sandboxed validation fixtures before release.
- `python -m abersetz.cli_fast tr es examples/poem_en.txt --dry_run True --engine tr/google` — succeeded; dry-run pipeline enumerated output path without writing files.

### QA Sweep – 2025-09-21
- Ran `python -m pytest -xvs` (78 passed, 8 skipped, 0 failed; coverage plugin snapshot 92% overall, highlighted misses in `cli.py`, `config.py`, `engine_catalog.py`, `engines.py`, `pipeline.py`, `setup.py`).
- Ran `python -m pytest --cov=. --cov-report=term-missing` (92% total coverage; missing lines enumerated for follow-up).
- Ran `uvx mypy .` (74 errors; missing type stubs for pytest/httpx/etc., attr access issues on stubbed classes) — requires dependency stubs and API wrapper adjustments.
- Ran `uvx bandit -r .` (204 Low issues, mostly intentional `assert` usage in tests plus fallback backup handler `try/except/pass`).
- Performed cleanup: removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`.

### Configuration & Catalog Hardening – Completed 2025-09-21
- Added regression coverage for `Defaults.from_dict(None)` and `EngineConfig.from_dict(name, None)` ensuring stripped config sections still yield canonical defaults.
- Strengthened credential conversion tests to assert optional field serialization and TypeError on unsupported payloads.
- Validated deep-translator provider aggregation with include-paid flow to guarantee deterministic, duplicate-free listings.

### Active Tasks – CLI Reliability
- [x] Add tests for `_parse_patterns` and `_load_json_data` edge cases.
- [x] Capture empty-state rendering for `_render_engine_entries` and `_render_validation_entries`.
- [x] Exercise `_collect_engine_entries` branches for single-provider configs and ullm profiles.

### Verification – CLI Reliability
- 2025-09-21 05:26 UTC — `python -m pytest tests/test_cli.py -xvs` (14 passed; targeted coverage step confirmed helper behaviour).
- 2025-09-21 05:27 UTC — `python -m pytest -xvs` (83 passed, 8 skipped; coverage 93% overall, `cli.py` now 92%).

### Automated Report – 2025-09-21
- 2025-09-21 05:33 UTC — `python -m pytest -xvs` (83 passed, 8 skipped; coverage plugin summary 93% overall. Initial 10s harness timeout re-run with extended limit.)
- 2025-09-21 05:35 UTC — `python -m pytest --cov=. --cov-report=term-missing` (83 passed, 8 skipped; total coverage 93% with misses concentrated in `cli.py`, `config.py`, `engine_catalog.py`, `engines.py`, `pipeline.py`, `setup.py`, and integration fixtures.)
- 2025-09-21 05:37 UTC — `uvx mypy .` (72 errors across 20 files; primarily missing third-party stubs for pytest/httpx/tenacity/langcodes/rich plus attr/type issues in `openai_lite.py`, `cli.py`, `tests/test_engines.py`, `tests/test_setup.py`, and examples.)
- 2025-09-21 05:38 UTC — `uvx bandit -r .` (221 Low-severity findings; expected `assert` usage in tests and the `config.py` backup `try/except/pass` guard.)
- 2025-09-21 05:39 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` as part of /cleanup.

### Automated Report – 2025-09-21 (Follow-up)
- 2025-09-21 05:52 UTC — `python -m pytest -xvs` (86 passed, 8 skipped; coverage plugin summary 92% overall. Warnings limited to intentional config reset fallbacks.)
- 2025-09-21 05:53 UTC — `python -m pytest --cov=. --cov-report=term-missing` (86 passed, 8 skipped; total coverage 92% with misses concentrated in `examples/basic_api.py`, CLI/config/setup hot paths, and integration suite skips.)
- 2025-09-21 05:54 UTC — `uvx mypy .` (70 errors, dominated by missing third-party stubs for pytest/httpx/tenacity/langcodes/rich plus stricter Optional handling in CLI/examples.)
- 2025-09-21 05:54 UTC — `uvx bandit -r .` (234 Low-severity findings: intentional `assert` usage throughout tests and the guarded backup writer in `config.py`.)
- 2025-09-21 05:55 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` during /cleanup.
- 2025-09-21 06:05 UTC — `python -m pytest tests/test_pipeline.py -k unreadable -xvs` validated permission guard raises `PipelineError` for zero-permission files.
- 2025-09-21 06:06 UTC — `python -m pytest tests/test_pipeline.py -k "write_over or dry_run" -xvs` exercised `_persist_output` branches covering write-over and dry-run behaviours.
- 2025-09-21 06:08 UTC — `python -m pytest tests/test_examples.py -xvs` executed mocked example flows; `examples/basic_api.py` coverage climbed to 98%, CLI usage banner verified offline.
- 2025-09-21 06:10 UTC — `python -m pytest -xvs` (96 passed, 8 skipped; total coverage 93% with `examples/basic_api.py` at 98% and `pipeline.py` at 99%).
- 2025-09-21 06:12 UTC — `python -m pytest --cov=. --cov-report=term-missing` (coverage steady at 93%; remaining gaps isolated to CLI/config/setup hot paths and intentionally skipped integrations).
- 2025-09-21 06:14 UTC — `uvx mypy .` (70 errors unchanged, comprised of missing third-party stubs plus attr checks on stubbed engine fixtures.)
- 2025-09-21 06:15 UTC — `uvx bandit -r .` (264 Low-severity findings: expected `assert` statements in tests and backup writer fallback in `config.py`).
- 2025-09-21 06:16 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage`, `.ruff_cache` after final test sweep.

### Quality Guardrails Sprint – 2025-09-21
- Added `tests/test_pipeline.py::test_translate_path_handles_mixed_formats` to exercise TXT+HTML flows with the `DummyEngine`, lifting pipeline coverage to 96% and verifying destination paths, chunk metadata, and selector normalization.
- Introduced `examples/basic_api.format_example_doc` with fallback messaging and accompanying `tests/test_examples.py` loader to stop `.strip()` calls on `None` docs; offline example listing now safe and documented.
- Tightened `tests/test_setup.py::test_generate_config_uses_fallbacks` with an explicit credential guard eliminating the prior union-attr mypy warning (remaining failures stem from missing third-party stub packages).
- 2025-09-21 05:46 UTC — `python -m pytest -xvs` (86 passed, 8 skipped; total coverage 91%, new tests passing).
- 2025-09-21 05:48 UTC — `python -m pytest --cov=. --cov-report=term-missing` (86 passed, 8 skipped; coverage steady at 91% with `tests/test_pipeline.py` now 99% covered and `pipeline.py` at 96%).
- 2025-09-21 05:50 UTC — `uvx mypy examples/basic_api.py tests/test_examples.py tests/test_setup.py src/abersetz/setup.py tests/test_pipeline.py` (22 errors, all attributable to missing stubs for httpx/tenacity/pytest/langcodes/rich/semantic-text-splitter and existing openai shims; confirmed the prior union-attr diagnostic cleared.)
- 2025-09-21 05:51 UTC — Removed `.pytest_cache`, `.mypy_cache`, `.benchmarks`, `.coverage` after full-suite run.

</document_content>
</document>

<document index="21">
<source>build.sh</source>
<document_content>
#!/usr/bin/env bash
cd "$(dirname "$0")"
uvx hatch clean; 
fd -e py -x autoflake --in-place --remove-all-unused-imports {};
fd -e py -x pyupgrade --py311-plus {}; 
fd -e py -x ruff check --output-format=github --fix --unsafe-fixes {}; 
fd -e py -x ruff format --respect-gitignore --target-version py311 {};
uvx hatch fmt;
llms .;
gitnextver .; 
uvx hatch build;
uv publish;
</document_content>
</document>

<document index="22">
<source>docs/_config.yml</source>
<document_content>
# Jekyll configuration for abersetz documentation
# Using just-the-docs theme

title: abersetz
description: Minimalist file translator with pluggable engines
baseurl: "/abersetz"
url: "https://twardoch.github.io"

# Theme configuration
remote_theme: just-the-docs/just-the-docs@v0.7.0
color_scheme: light

# Enable search
search_enabled: true
search:
  heading_level: 2
  previews: 3
  preview_words_before: 5
  preview_words_after: 10
  tokenizer_separator: /[\s/]+/
  rel_url: true
  button: false

# Enable navigation
nav_enabled: true
nav_sort: case_sensitive

# Footer
footer_content: "Copyright &copy; 2025 Adam Twardoch. Distributed under the MIT License."
last_edit_timestamp: true
last_edit_time_format: "%b %e %Y at %I:%M %p"

# Back to top link
back_to_top: true
back_to_top_text: "Back to top"

# External links
aux_links:
  "GitHub":
    - "https://github.com/twardoch/abersetz"
  "PyPI":
    - "https://pypi.org/project/abersetz"

# Collections for organizing content
collections:
  docs:
    permalink: "/:collection/:path/"
    output: true

just_the_docs:
  collections:
    docs:
      name: Documentation
      nav_exclude: false
      search_exclude: false

# Plugins
plugins:
  - jekyll-seo-tag
  - jekyll-sitemap

# Markdown settings
markdown: kramdown
kramdown:
  syntax_highlighter_opts:
    block:
      line_numbers: false

# Exclude files from the Jekyll build
exclude:
  - "*.py"
  - "*.sh"
  - "requirements.txt"
  - "Gemfile"
  - "Gemfile.lock"
  - "node_modules/"
  - "vendor/"
</document_content>
</document>

<document index="23">
<source>docs/api.md</source>
<document_content>
---
layout: default
title: Python API
nav_order: 4
---

# Python API Reference
{: .no_toc }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

The abersetz Python API provides programmatic access to all translation functionality.

## Main Functions

### translate_path

Main function for translating files or directories.

```python
from abersetz import translate_path, TranslatorOptions

results = translate_path(
    path="document.txt",
    options=TranslatorOptions(to_lang="es"),
    config=None,  # Optional custom config
    client=None   # Optional HTTP client
)
```

**Parameters:**
- `path` (str | Path): File or directory to translate
- `options` (TranslatorOptions): Translation settings
- `config` (AbersetzConfig, optional): Custom configuration
- `client` (optional): HTTP client for API calls

**Returns:**
- List[TranslationResult]: Results for each translated file

## Core Classes

### TranslatorOptions

Configuration for translation operations.

```python
from abersetz import TranslatorOptions

options = TranslatorOptions(
    engine="tr/google",
    from_lang="auto",
    to_lang="en",
    recurse=True,
    write_over=False,
    output_dir=Path("output/"),
    save_voc=False,
    chunk_size=1200,
    html_chunk_size=1800,
    include=("*.txt", "*.md"),
    xclude=("*test*",),
    dry_run=False,
    prolog={"role": "translator"},
    initial_voc={"term": "translation"}
)
```

**Attributes:**
- `engine` (str): Translation engine name
- `from_lang` (str): Source language code
- `to_lang` (str): Target language code
- `recurse` (bool): Process subdirectories
- `write_over` (bool): Replace original files
- `output_dir` (Path): Output directory path
- `save_voc` (bool): Save voc JSON
- `chunk_size` (int): Characters per text chunk
- `html_chunk_size` (int): Characters per HTML chunk
- `include` (tuple): File patterns to include
- `xclude` (tuple): File patterns to xclude
- `dry_run` (bool): Preview without translating
- `prolog` (dict): Initial context for LLMs
- `initial_voc` (dict): Starting voc

### TranslationResult

Result information for a translated file.

```python
from abersetz import TranslationResult

result = TranslationResult(
    source=Path("input.txt"),
    destination=Path("output.txt"),
    chunks=5,
    voc={"term": "translation"},
    format=TextFormat.PLAIN
)
```

**Attributes:**
- `source` (Path): Source file path
- `destination` (Path): Output file path
- `chunks` (int): Number of chunks processed
- `voc` (dict): Final voc (LLM engines)
- `format` (TextFormat): Detected format (PLAIN, HTML, MARKDOWN)

### PipelineError

Exception raised when translation fails.

```python
from abersetz import PipelineError

try:
    results = translate_path("missing.txt")
except PipelineError as e:
    print(f"Translation failed: {e}")
```

## Configuration Management

### Loading Configuration

```python
from abersetz.config import load_config

config = load_config()
print(config.defaults.engine)
print(config.defaults.to_lang)
```

### Saving Configuration

```python
from abersetz.config import save_config, AbersetzConfig, Defaults

config = AbersetzConfig(
    defaults=Defaults(
        engine="tr/google",
        to_lang="es",
        chunk_size=1500
    )
)

save_config(config)
```

### Custom Engine Configuration

```python
from abersetz.config import EngineConfig, Credential

config.engines["custom"] = EngineConfig(
    name="custom",
    chunk_size=2000,
    credential=Credential(env="CUSTOM_API_KEY"),
    options={
        "base_url": "https://api.custom.com/v1",
        "model": "translation-v1"
    }
)
```

## Engine Management

### Creating Engines

```python
from abersetz.engines import create_engine
from abersetz.config import load_config

config = load_config()
engine = create_engine("tr/google", config)
```

### Using Engines Directly

```python
from abersetz.engines import EngineRequest

request = EngineRequest(
    text="Hello world",
    source_lang="en",
    target_lang="es",
    is_html=False,
    voc={},
    prolog={},
    chunk_index=0,
    total_chunks=1
)

result = engine.translate(request)  # engine from create_engine() above
print(result.text)  # "Hola mundo"
```

## Text Processing

### Format Detection

```python
from abersetz.chunking import detect_format, TextFormat

text = "<h1>Title</h1><p>Content</p>"
format = detect_format(text)
# Returns TextFormat.HTML
```
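
The real detection logic lives in `abersetz.chunking`; the following standalone heuristic only illustrates the kind of markup cues such a detector looks for and is not the library's implementation:

```python
import re

def guess_format(text: str) -> str:
    """Crude format guess: HTML if paired block tags appear, Markdown if
    common markers appear, otherwise plain text. Illustrative only."""
    if re.search(r"<\s*(p|h[1-6]|div|html|body)\b[^>]*>", text, re.IGNORECASE):
        return "HTML"
    if re.search(r"^(#{1,6}\s|\*\s|-\s|\d+\.\s)", text, re.MULTILINE) or "```" in text:
        return "MARKDOWN"
    return "PLAIN"

print(guess_format("<h1>Title</h1><p>Content</p>"))  # HTML
```

A heuristic like this can misclassify prose that happens to start lines with `-` or `*`, which is one reason to rely on the library's detector rather than rolling your own.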

### Text Chunking

```python
from abersetz.chunking import chunk_text, TextFormat

chunks = chunk_text(
    text="Long document...",
    max_size=1000,
    format=TextFormat.MARKDOWN
)
```
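
abersetz delegates the actual splitting to `semantic-text-splitter`, which respects sentence and markup boundaries. A greedy paragraph packer sketches the general idea of staying under `max_size` while cutting on natural boundaries (illustrative only, not the library's algorithm):

```python
def simple_chunks(text: str, max_size: int) -> list[str]:
    """Greedy paragraph packing: append paragraphs to the current chunk
    until the next one would exceed max_size, then start a new chunk."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks

parts = simple_chunks("one\n\ntwo\n\nthree", max_size=9)
# ['one\n\ntwo', 'three']
```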

## Complete Examples

### Simple Translation

```python
from abersetz import translate_path, TranslatorOptions

# Translate a single file
results = translate_path(
    "document.txt",
    TranslatorOptions(
        to_lang="fr",
        engine="tr/google"
    )
)

for result in results:
    print(f"Translated: {result.source} -> {result.destination}")
    print(f"Chunks: {result.chunks}")
```

### Batch Processing

```python
from pathlib import Path
from abersetz import translate_path, TranslatorOptions

def batch_translate(source_dir, languages, engine="tr/google"):
    """Translate to multiple languages."""
    results = {}

    for lang in languages:
        print(f"Translating to {lang}...")
        lang_results = translate_path(
            source_dir,
            TranslatorOptions(
                to_lang=lang,
                engine=engine,
                output_dir=Path(f"output_{lang}"),
                recurse=True
            )
        )
        results[lang] = lang_results

    return results

# Usage
results = batch_translate("docs/", ["es", "fr", "de"])
```

### Custom Workflow

```python
from abersetz import translate_path, TranslatorOptions
from abersetz.config import load_config, save_config
import json

class TranslationWorkflow:
    def __init__(self):
        self.config = load_config()
        self.voc = {}

    def translate_with_voc(self, files, to_lang):
        """Maintain voc across files."""
        all_results = []

        for file in files:
            results = translate_path(
                file,
                TranslatorOptions(
                    to_lang=to_lang,
                    engine="ll/default",
                    initial_voc=self.voc,
                    save_voc=True
                ),
                config=self.config
            )

            if results:
                # Update voc
                self.voc.update(results[0].voc)
                all_results.extend(results)

        # Save final voc
        with open(f"voc_{to_lang}.json", "w") as f:
            json.dump(self.voc, f, indent=2)

        return all_results

# Usage
workflow = TranslationWorkflow()
results = workflow.translate_with_voc(
    ["doc1.md", "doc2.md", "doc3.md"],
    "es"
)
```

### Error Handling

```python
from abersetz import translate_path, TranslatorOptions, PipelineError
import logging

def safe_translate(path, **options):
    """Translate with comprehensive error handling."""
    try:
        results = translate_path(
            path,
            TranslatorOptions(**options)
        )
        return results

    except PipelineError as e:
        logging.error(f"Translation pipeline error: {e}")
        return None

    except FileNotFoundError as e:
        logging.error(f"File not found: {e}")
        return None

    except Exception as e:
        logging.error(f"Unexpected error: {e}")
        raise

# Usage with retry
import time

for attempt in range(3):
    results = safe_translate("document.txt", to_lang="es")
    if results:
        break
    time.sleep(2 ** attempt)  # Exponential backoff
```

### Async Translation

```python
import asyncio
from abersetz import translate_path, TranslatorOptions

async def translate_async(path, to_lang):
    """Async wrapper for translation."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        None,
        translate_path,
        path,
        TranslatorOptions(to_lang=to_lang)
    )

async def translate_multiple(files, to_lang):
    """Translate multiple files concurrently."""
    tasks = [translate_async(f, to_lang) for f in files]
    return await asyncio.gather(*tasks)

# Usage
files = ["doc1.txt", "doc2.txt", "doc3.txt"]
results = asyncio.run(translate_multiple(files, "es"))
```

## Advanced Topics

### Custom Engines

```python
from abersetz.engines import EngineBase, EngineRequest, EngineResult

class CustomEngine(EngineBase):
    """Custom translation engine implementation."""

    def __init__(self, config):
        super().__init__("custom", 1000, 1500)
        self.config = config

    def translate(self, request: EngineRequest) -> EngineResult:
        # Your translation logic here; call_api is a placeholder that
        # your subclass must implement, it is not provided by EngineBase.
        translated = self.call_api(request.text)
        return EngineResult(
            text=translated,
            voc={}
        )
```

### voc Management

```python
from typing import Dict
import json

class VocManager:
    """Manage translation vocabularies."""

    def __init__(self):
        self.vocabularies: Dict[str, Dict[str, str]] = {}

    def load(self, path: str, lang_pair: str):
        with open(path) as f:
            self.vocabularies[lang_pair] = json.load(f)

    def merge(self, *lang_pairs: str) -> Dict[str, str]:
        merged = {}
        for pair in lang_pairs:
            if pair in self.vocabularies:
                merged.update(self.vocabularies[pair])
        return merged

    def save(self, voc: Dict[str, str], path: str):
        with open(path, "w") as f:
            json.dump(voc, f, indent=2, ensure_ascii=False)
```

## See Also

- [CLI Reference](cli/)
- [Configuration Guide](configuration/)
- [Examples](examples/)
</document_content>
</document>

<document index="24">
<source>docs/cli.md</source>
<document_content>
---
layout: default
title: CLI Reference
nav_order: 3
---

# CLI Reference
{: .no_toc }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

Abersetz provides two command-line tools:

- `abersetz`: Main CLI with subcommands (`tr`, `validate`, `config`, `engines`, `version`)
- `abtr`: Direct translation shorthand

## Main Commands

### abersetz tr

Translate files or directories.

```bash
abersetz tr TO_LANG PATH [OPTIONS]
```

#### Arguments

- `TO_LANG`: Target language code (required)
- `PATH`: File or directory to translate (required)

#### Options

| Option | Description | Default |
|--------|-------------|---------|
| `to_lang` (positional) | Target language code | — |
| `--from-lang` | Source language code | `auto` |
| `--engine` | Translation engine | `tr/google` (legacy names auto-normalized) |
| `--output` | Output directory | `<lang>/<filename>` |
| `--recurse/--no-recurse` | Process subdirectories | `True` |
| `--write_over` | Replace original files | `False` |
| `--include` | File patterns to include | `*.txt,*.md,*.html` |
| `--xclude` | File patterns to xclude | None |
| `--chunk-size` | Characters per chunk | `1200` |
| `--html-chunk-size` | Characters per HTML chunk | `1800` |
| `--save-voc` | Save voc JSON | `False` |
| `--dry-run` | Preview without translating | `False` |
| `--verbose` | Enable debug output | `False` |

### abersetz config

Manage configuration settings.

```bash
abersetz config COMMAND
```

#### Subcommands

- `show`: Display current configuration
- `path`: Show configuration file location

### abersetz version

Display version information.

```bash
abersetz version
```

### abersetz engines

List available engine families and providers.

```bash
abersetz engines [--include-paid] [--family tr|dt|ll|hy] [--configured-only]
```

- `--family`: filter to a single engine family (short alias or legacy name).
- `--configured-only`: show only engines currently configured.

### abersetz validate

Exercise each configured engine with a short translation and report status, latency, and pricing hints.

```bash
abersetz validate [--selectors tr/google,ll/default] [--target-lang es] [--sample-text "Hello"]
```

- `--selectors`: comma-separated list of selectors to validate (defaults to every configured selector).
- `--target-lang`: target language for the sample translation (defaults to `es`).
- `--sample-text`: override the default sample prompt (`Hello, world!`).
- `--include-defaults/--no-include-defaults`: toggle whether the default engine from config is forced into the run.

{: .note }
Running validation hits live translation APIs. When you are offline—or when you only need a smoke test—use `--selectors` to limit the run to a handful of engines and add `--no-include-defaults` to skip automatically discovered selectors. For example:

```bash
abersetz validate \
  --selectors tr/google,ll/default \
  --no-include-defaults \
  --sample-text "Ping"
```

This checks only the Google free tier and your primary LLM profile, keeping the run under a few seconds and avoiding throttled providers.

## Shorthand Command

### abtr

Direct translation command equivalent to `abersetz tr`:

```bash
abtr TO_LANG PATH [OPTIONS]
```

All options from `abersetz tr` are available.

## Usage Examples

### Basic Translation

Translate a single file:

```bash
abersetz tr es document.txt
```

Translate to French using shorthand:

```bash
abtr fr document.txt
```

### Directory Translation

Translate all files in a directory:

```bash
abersetz tr de ./docs --output ./docs_de
```

With specific patterns:

```bash
abtr ja ./project \
  --include "*.md,*.txt" \
  --xclude "*test*,.*" \
  --output ./translations/ja
```

### Engine Selection

Use different translation engines:

```bash
# Google Translate (free)
abtr pt file.txt --engine translators/google

# Bing Translate (free)
abtr pt file.txt --engine translators/bing

# DeepL
abtr pt file.txt --engine deep-translator/deepl

# SiliconFlow LLM
abtr pt file.txt --engine hysf

# Custom LLM profile
abtr pt file.txt --engine ullm/gpt4
```

### Validate Engines

Generate a quick health report for every configured engine:

```bash
abersetz validate --target-lang de --selectors tr/google,ll/default
```

### Advanced Options

Overwrite the original files in place:

```bash
abersetz tr es backup.txt --write_over
```

Save voc for LLM engines:

```bash
abtr de technical.md \
  --engine ullm/default \
  --save-voc
```

Dry run to preview:

```bash
abersetz tr fr large_project/ \
  --dry-run
```

Custom chunk sizes:

```bash
abtr zh-CN document.html \
  --html-chunk-size 3000
```

## Language Codes

Common language codes supported:

| Code | Language |
|------|----------|
| `en` | English |
| `es` | Spanish |
| `fr` | French |
| `de` | German |
| `it` | Italian |
| `pt` | Portuguese |
| `ru` | Russian |
| `ja` | Japanese |
| `ko` | Korean |
| `zh-CN` | Chinese (Simplified) |
| `zh-TW` | Chinese (Traditional) |
| `ar` | Arabic |
| `hi` | Hindi |
| `auto` | Auto-detect (source only) |

## Pattern Matching

Include/xclude patterns support wildcards:

- `*.txt` - All .txt files
- `doc*` - Files starting with "doc"
- `*test*` - Files containing "test"
- `.*` - Hidden files
- `*.{md,txt}` - Multiple extensions
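
These are shell-style glob patterns. Python's `fnmatch` module gives a close approximation of how a single pattern matches a filename (a sketch of glob semantics, not abersetz's exact matcher; note that brace groups like `*.{md,txt}` are normally expanded by the shell before the matcher ever sees them):

```python
from fnmatch import fnmatch

files = ["notes.txt", "doc_intro.md", "test_plan.txt", ".hidden"]

# Include step: keep files matching the include pattern
included = [f for f in files if fnmatch(f, "*.txt")]

# Exclude step: drop files matching the xclude pattern
xcluded = [f for f in files if fnmatch(f, "*test*")]
kept = [f for f in included if f not in xcluded]
# kept == ["notes.txt"]
```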

## Environment Variables

Set default behaviors with environment variables:

```bash
# Default target language
export ABERSETZ_TO_LANG=es

# Default engine
export ABERSETZ_ENGINE=translators/bing

# API keys for LLM engines
export OPENAI_API_KEY=sk-...
export SILICONFLOW_API_KEY=sk-...
```

## Output Format

Translation results are printed as file paths:

```
/path/to/output/file1.txt
/path/to/output/file2.txt
```

Use `--verbose` for detailed progress:

```bash
abersetz tr fr docs/ --verbose
```

## Error Handling

Common errors and solutions:

### Missing API key

```
Error: Missing API key for engine
```

Solution: Export the required environment variable:

```bash
export SILICONFLOW_API_KEY="your-key"
```

### No files matched

```
Error: No files matched under /path
```

Solution: Check your include patterns:

```bash
abtr es . --include "*.md,*.txt"
```

### Network error

```
Error: Network error - Connection timeout
```

Solution: The tool automatically retries. Check your internet connection.

## Tips and Tricks

### Batch translation

Create a script for multiple languages:

```bash
for lang in es fr de ja; do
  abersetz tr $lang docs/ --output docs_$lang
done
```

### Parallel processing

Use GNU parallel for speed:

```bash
find . -name "*.txt" | parallel -j4 abtr es {}
```

### Progress tracking

For large projects, use verbose mode:

```bash
abersetz tr fr large_project/ --verbose 2>&1 | tee translation.log
```

### Testing configuration

Always test with dry-run first:

```bash
abersetz tr de important_docs/ --dry-run
```

## See Also

- [Configuration Guide](configuration/)
- [Python API Reference](api/)
- [Translation Engines](engines/)

</document_content>
</document>

<document index="25">
<source>docs/configuration.md</source>
<document_content>
---
layout: default
title: Configuration
nav_order: 5
---

# Configuration
{: .no_toc }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

Abersetz stores configuration in a TOML file managed by `platformdirs`, ensuring cross-platform compatibility.

## Configuration Location

Find your configuration file:

```bash
abersetz config path
```

Typical locations:
- **Linux**: `~/.config/abersetz/config.toml`
- **macOS**: `~/Library/Application Support/abersetz/config.toml`
- **Windows**: `%APPDATA%\abersetz\config.toml`

## Configuration Structure

### Complete Example

```toml
[defaults]
engine = "tr/google"
from_lang = "auto"
to_lang = "en"
chunk_size = 1200
html_chunk_size = 1800

[credentials.openai]
env = "OPENAI_API_KEY"

[credentials.anthropic]
env = "ANTHROPIC_API_KEY"

[credentials.siliconflow]
env = "SILICONFLOW_API_KEY"

[credentials.deepseek]
env = "DEEPSEEK_API_KEY"

[engines.hysf]
chunk_size = 2400

[engines.hysf.credential]
name = "siliconflow"

[engines.hysf.options]
model = "tencent/Hunyuan-MT-7B"
base_url = "https://api.siliconflow.com/v1"
temperature = 0.3

[engines.ullm]
chunk_size = 2400

[engines.ullm.options.profiles.default]
base_url = "https://api.siliconflow.com/v1"
model = "tencent/Hunyuan-MT-7B"
credential = { name = "siliconflow" }
temperature = 0.3
max_input_tokens = 32000

[engines.ullm.options.profiles.default.prolog]
```

## Configuration Sections

### defaults

Global default settings for all translations:

```toml
[defaults]
engine = "tr/google" # Default translation engine
from_lang = "auto"             # Source language (auto-detect)
to_lang = "en"                  # Target language
chunk_size = 1200              # Characters per text chunk
html_chunk_size = 1800         # Characters per HTML chunk
```


### credentials

API key storage with environment variable references:

```toml
[credentials.openai]
env = "OPENAI_API_KEY"        # Read from environment

[credentials.custom]
value = "sk-actual-key-here"  # Direct value (not recommended)
```


### engines

Custom engine configurations:

```toml
[engines.engine_name]
chunk_size = 2000

[engines.engine_name.credential]
name = "credential_name"

# Engine-specific options
[engines.engine_name.options]
```


## Setting Up Credentials

### Environment Variables (Recommended)

Store API keys as environment variables:

```bash
# Add to ~/.bashrc or ~/.zshrc
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export SILICONFLOW_API_KEY="sk-..."
```

Then reference in config:

```toml
[credentials.openai]
env = "OPENAI_API_KEY"
```


### Direct Values (Not Recommended)

Store directly in config (less secure):

```toml
[credentials.openai]
value = "sk-actual-key-here"
```


## Engine Configuration

### LLM Engine (ullm)

Configure multiple LLM profiles:

```toml
[engines.ullm]
chunk_size = 2400

[engines.ullm.options.profiles.gpt4]
base_url = "https://api.openai.com/v1"
model = "gpt-4-turbo-preview"
credential = { name = "openai" }
temperature = 0.3
max_input_tokens = 128000

[engines.ullm.options.profiles.gpt4.prolog]
role = "You are an expert translator"

[engines.ullm.options.profiles.claude]
base_url = "https://api.anthropic.com/v1"
model = "claude-3-opus-20240229"
credential = { name = "anthropic" }
temperature = 0.3
max_input_tokens = 200000
```


Usage:
```bash
abtr es file.txt --engine ullm/gpt4
abtr fr file.txt --engine ullm/claude
```

### Custom Endpoints

Configure self-hosted models:

```toml
[engines.local_llm]
chunk_size = 1500

[engines.local_llm.options]
base_url = "http://localhost:8080/v1"
model = "local-model"
temperature = 0.5
```


## Managing Configuration

### View Current Config

```bash
abersetz config show
```

Or pretty-print:

```bash
abersetz config show | jq '.'
```

### Edit Configuration

Edit directly:

```bash
# Find location
CONFIG_PATH=$(abersetz config path | tail -1)

# Edit with your preferred editor
nano "$CONFIG_PATH"
# or
vim "$CONFIG_PATH"
```

### Reset Configuration

Remove to reset to defaults:

```bash
rm "$(abersetz config path | tail -1)"
```

### Backup Configuration

```bash
CONFIG_PATH=$(abersetz config path | tail -1)
cp "$CONFIG_PATH" "$CONFIG_PATH.backup"
```

## Python Configuration API

### Load Configuration

```python
from abersetz.config import load_config

config = load_config()
print(config.defaults.engine)
print(config.defaults.to_lang)
```

### Modify Configuration

```python
from abersetz.config import load_config, save_config

config = load_config()

# Change defaults
config.defaults.to_lang = "es"
config.defaults.chunk_size = 1500

# Add credential
from abersetz.config import Credential
config.credentials["myapi"] = Credential(env="MY_API_KEY")

# Save changes
save_config(config)
```

### Add Custom Engine

```python
from abersetz.config import load_config, save_config, EngineConfig, Credential

config = load_config()

config.engines["custom"] = EngineConfig(
    name="custom",
    chunk_size=2000,
    credential=Credential(env="CUSTOM_API_KEY"),
    options={
        "base_url": "https://api.custom.com/v1",
        "model": "translation-v1",
        "temperature": 0.3
    }
)

save_config(config)
```

## Environment Variables

### Abersetz-specific

Override defaults with environment variables:

```bash
export ABERSETZ_ENGINE="tr/bing"
export ABERSETZ_TO_LANG="es"
export ABERSETZ_CHUNK_SIZE="1500"
```

### API Keys

Standard API key variables:

```bash
# OpenAI
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Google
export GOOGLE_API_KEY="..."

# SiliconFlow
export SILICONFLOW_API_KEY="sk-..."

# DeepSeek
export DEEPSEEK_API_KEY="..."

# Mistral
export MISTRAL_API_KEY="..."

# Together AI
export TOGETHERAI_API_KEY="..."
```

## Configuration Templates

### Minimal Config

```toml
[defaults]
engine = "tr/google"
to_lang = "es"
```


### Multi-engine Config

```toml
[defaults]
engine = "tr/google"

[credentials.openai]
env = "OPENAI_API_KEY"

[credentials.anthropic]
env = "ANTHROPIC_API_KEY"

[engines.gpt]
chunk_size = 3000

[engines.gpt.credential]
name = "openai"

[engines.gpt.options]
model = "gpt-4-turbo-preview"
base_url = "https://api.openai.com/v1"

[engines.claude]
chunk_size = 3000

[engines.claude.credential]
name = "anthropic"

[engines.claude.options]
model = "claude-3-opus-20240229"
base_url = "https://api.anthropic.com/v1"
```


### Enterprise Config

```toml
[defaults]
engine = "corporate_llm"
to_lang = "en"
chunk_size = 2000

[credentials.corporate]
env = "CORP_TRANSLATION_KEY"

[engines.corporate_llm]
chunk_size = 2500

[engines.corporate_llm.credential]
name = "corporate"

[engines.corporate_llm.options]
base_url = "https://translation.company.com/v1"
model = "corp-translator-v2"
temperature = 0.2
max_retries = 5
timeout = 30
```


## Security Best Practices

1. **Never commit API keys**: Add `config.toml` to `.gitignore`

2. **Use environment variables**: Store keys in environment, not config

3. **Rotate keys regularly**: Update API keys periodically

4. **Restrict file permissions**:
   ```bash
   chmod 600 "$(abersetz config path | tail -1)"
   ```

5. **Use separate keys**: Different keys for dev/prod environments

## Troubleshooting

### Config not loading

Check that the file exists and is valid TOML (the config is TOML, not JSON):

```bash
CONFIG_PATH=$(abersetz config path | tail -1)
# tomllib ships with Python 3.11+
python -c "import sys, tomllib; tomllib.load(open(sys.argv[1], 'rb'))" "$CONFIG_PATH"
```

### API key not found

Verify environment variable is set:

```bash
echo $OPENAI_API_KEY
```

### Permission denied

Fix file permissions:

```bash
chmod 600 "$(abersetz config path | tail -1)"
```

## Validation Selector Tips

- Keep a short list of smoke-test selectors (for example, `tr/google,ll/default`) to avoid hammering every provider when you validate changes.
- Run `abersetz validate --selectors tr/google,ll/default --no-include-defaults` during CI or offline development; it skips auto-discovered providers while still exercising each engine family.
- Review the [CLI validate command](cli.html#abersetz-validate) for more usage patterns.

## See Also

- [Translation Engines](engines/)
- [Python API](api/)
- [Examples](examples/)

</document_content>
</document>

<document index="26">
<source>docs/index.md</source>
<document_content>
---
layout: home
title: Home
nav_order: 1
description: "Abersetz is a minimalist file translator that reuses proven machine translation engines"
permalink: /
---

# abersetz
{: .fs-9 }

Minimalist file translator with pluggable engines
{: .fs-6 .fw-300 }

[Get started](#getting-started){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 }
[View on GitHub](https://github.com/twardoch/abersetz){: .btn .fs-5 .mb-4 .mb-md-0 }

---

## Why abersetz?

- **File-focused**: Designed for translating documents, not single strings
- **Multiple engines**: Supports free and paid translation services
- **voc consistency**: LLM engines maintain terminology across chunks
- **Simple CLI**: Clean interface with minimal output
- **Python API**: Full programmatic access for automation

## Features

- 🔄 **Multiple translation engines**
  - Free: Google, Bing via `translators` and `deep-translator`
  - LLM: OpenAI, Anthropic, SiliconFlow, and 20+ providers
  - Custom endpoints for self-hosted models

- 📁 **Smart file handling**
  - Recursive directory translation
  - Pattern matching with include/xclude
  - HTML markup preservation
  - Automatic format detection

- 🧩 **Intelligent chunking**
  - Semantic text splitting
  - Configurable chunk sizes per engine
  - Context preservation across chunks

- 📚 **voc management**
  - JSON voc propagation
  - Consistent terminology in long documents
  - Optional voc export
- ✅ **Engine validation**
  - `abersetz validate` smoke-tests each selector
  - Latency and pricing hints pulled from the research catalog
  - Ideal for CI smoke tests and onboarding checks

## Getting Started

### Installation

```bash
pip install abersetz
```

### Quick Start

Run setup, validate engines, and translate a file:
```bash
abersetz setup
abersetz validate --target-lang es
abersetz tr es document.txt
```

Or use the shorthand for translation:
```bash
abtr es document.txt
```

Translate a directory:
```bash
abersetz tr fr ./docs --output ./docs_fr
```

### Configuration

Abersetz stores configuration in your user directory:

```bash
abersetz config path  # Show config location
abersetz config show  # Display current settings
```

## Example Usage

### CLI Examples

```bash
# Translate with specific engine
abtr de file.txt --engine tr/google

# Translate markdown files only
abtr ja . --include "*.md" --output ./ja

# Dry run to preview
abersetz tr zh-CN project/ --dry-run

# Validate configured engines
abersetz validate --selectors tr/google,ll/default

# Use LLM with voc
export SILICONFLOW_API_KEY="your-key"
abtr es technical.md --engine hy --save-voc
```

### Python API

```python
from abersetz import translate_path, TranslatorOptions

# Simple translation
results = translate_path(
    "document.txt",
    TranslatorOptions(
        to_lang="fr",
        engine="tr/google"
    )
)

# Batch with patterns
results = translate_path(
    "docs/",
    TranslatorOptions(
        to_lang="de",
        include=("*.md", "*.txt"),
        output_dir="docs_de/"
    )
)
```

## Documentation

- [Installation Guide](installation/)
- [CLI Reference](cli/)
- [Python API](api/)
- [Configuration](configuration/)
- [Translation Engines](engines/)
- [Examples](examples/)

## License

MIT License - see [LICENSE](https://github.com/twardoch/abersetz/blob/main/LICENSE) for details.

</document_content>
</document>

<document index="27">
<source>docs/installation.md</source>
<document_content>
---
layout: default
title: Installation
nav_order: 2
---

# Installation
{: .no_toc }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Requirements

- Python 3.10 or higher
- pip or uv package manager

## Installing with pip

The simplest way to install abersetz:

```bash
pip install abersetz
```

## Installing with uv

If you use the modern uv package manager:

```bash
uv pip install abersetz
```

## Installing from source

To install the latest development version:

```bash
git clone https://github.com/twardoch/abersetz.git
cd abersetz
pip install -e .
```

## Verifying installation

After installation, verify abersetz is working:

```bash
# Check version
abersetz version

# Show help
abersetz --help

# Test with dry run
echo "Hello world" > test.txt
abersetz tr es test.txt --dry-run
```

## Dependencies

Abersetz automatically installs these dependencies:

### Core dependencies
- **translators** (>=5.9): Multiple free translation APIs
- **deep-translator** (>=1.11): Alternative translation providers
- **openai** (>=1.51): LLM-based translation engines
- **tenacity** (>=8.4): Retry logic for API calls

### Utility dependencies
- **fire** (>=0.5): CLI interface generation
- **rich** (>=13.9): Terminal formatting
- **loguru** (>=0.7): Structured logging
- **platformdirs** (>=4.3): Cross-platform config paths
- **semantic-text-splitter** (>=0.7): Intelligent text chunking

## Optional: Setting up API keys

For LLM-based translation engines, you'll need API keys:

```bash
# OpenAI GPT models
export OPENAI_API_KEY="sk-..."

# Anthropic Claude models
export ANTHROPIC_API_KEY="sk-ant-..."

# SiliconFlow (Hunyuan translator)
export SILICONFLOW_API_KEY="sk-..."

# Google Gemini
export GOOGLE_API_KEY="..."

# Add to your shell profile to persist
echo 'export OPENAI_API_KEY="sk-..."' >> ~/.bashrc
```
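Engines that require a key read it from these environment variables at runtime. A minimal sketch of that env-based lookup pattern (the helper name is illustrative, not abersetz's actual API):

```python
import os

def resolve_credential(env_name: str) -> str:
    """Hypothetical helper: fetch an API key, failing loudly if missing."""
    value = os.environ.get(env_name, "").strip()
    if not value:
        raise RuntimeError(f"{env_name} is not set; add it to your shell profile")
    return value

# An LLM engine would call a lookup like this before its first API request.
```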

## Shell completion (optional)

Enable tab completion for bash:

```bash
# Generate completion script (uses python-fire's built-in completion support)
abersetz -- --completion > ~/.abersetz-completion.bash

# Add to bashrc
echo "source ~/.abersetz-completion.bash" >> ~/.bashrc

# Reload shell
source ~/.bashrc
```

## Docker installation (alternative)

Run abersetz in a container:

```dockerfile
FROM python:3.12-slim

RUN pip install abersetz

WORKDIR /data

ENTRYPOINT ["abersetz"]
```

Build and use:

```bash
docker build -t abersetz .
docker run -v "$(pwd)":/data abersetz tr es /data/file.txt
```

## Troubleshooting

### Command not found

If `abersetz` command is not found after installation:

1. Check pip installed it to PATH:
   ```bash
   pip show -f abersetz | grep Location
   ```

2. Ensure scripts directory is in PATH:
   ```bash
   export PATH="$HOME/.local/bin:$PATH"
   ```
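To find the user scripts directory without hardcoding `~/.local/bin`, you can ask Python directly (path layout assumed to follow the standard `pip install --user` convention):

```python
import os
import site

# pip install --user places console scripts under <user-base>/bin on
# Linux/macOS (Scripts\ on Windows); print the directory to add to PATH.
scripts_dir = os.path.join(site.getuserbase(), "Scripts" if os.name == "nt" else "bin")
print(scripts_dir)
print("on PATH:", scripts_dir in os.environ.get("PATH", "").split(os.pathsep))
```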

### Permission denied

On Linux/Mac, you may need to add execute permissions:

```bash
chmod +x ~/.local/bin/abersetz
chmod +x ~/.local/bin/abtr
```

### SSL certificate errors

If you encounter SSL errors with API calls:

```bash
# Update certificates
pip install --upgrade certifi

# Or disable SSL verification (not recommended)
export CURL_CA_BUNDLE=""
```

## Next steps

- [Configure abersetz](configuration/)
- [Learn CLI commands](cli/)
- [Explore examples](examples/)
</document_content>
</document>

# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/examples/advanced_api.py
# Language: python

import asyncio
import json
from dataclasses import dataclass, field
from pathlib import Path
from abersetz import TranslationResult, TranslatorOptions, translate_path
from abersetz.config import AbersetzConfig, load_config
from abersetz.engines import EngineRequest, create_engine
import sys

class _LanguageStats:
    def record(self, result: TranslationResult) -> None:
    def to_dict(self) -> dict[str, object]:

class _ReportFile:
    def from_result(cls, result: TranslationResult) -> "_ReportFile":
    def to_dict(self) -> dict[str, object]:

class TranslationWorkflow:
    """Advanced translation workflow with progress tracking."""
    def __init__(self, config: AbersetzConfig | None = None):
    def translate_project(
        self, source_dir: str, target_langs: list[str], engine: str = "tr/google"
    ):
        """Translate entire project to multiple languages."""
    def generate_report(self, output_file: str = "translation_report.json"):
        """Generate detailed translation report."""

class vocManager:
    """Manage translation vocabularies across projects."""
    def __init__(self):
    def load_voc(self, file_path: str, lang_pair: str):
        """Load voc from JSON file."""
    def merge_vocabularies(self, *lang_pairs: str) -> dict[str, str]:
        """Merge multiple vocabularies."""
    def translate_with_consistency(
        self, files: list[str], to_lang: str, base_voc: dict[str, str] | None = None
    ):
        """Translate files with consistent terminology."""

class ParallelTranslator:
    """Translate using multiple engines in parallel for comparison."""
    def translate_with_engine(self, text: str, engine_name: str, to_lang: str):
        """Async translation with a specific engine."""
    def compare_translations(self, text: str, engines: list[str], to_lang: str):
        """Compare translations from multiple engines."""

class IncrementalTranslator:
    def __init__(self, checkpoint_file: str = ".translation_checkpoint.json"):
    def load_checkpoint(self) -> set:
    def save_checkpoint(self):
    def translate_incrementally(self, source_dir: str, to_lang: str):

def example_multi_language():
    """Translate documentation to multiple languages."""

def example_voc_consistency():
    """Maintain consistent terminology across documents."""

def example_parallel_comparison():
    """Compare translations from different engines."""

def example_incremental_translation():
    """Translate large projects incrementally."""
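# IncrementalTranslator persists completed paths between runs. A minimal
# sketch of that checkpoint pattern (storing a JSON list of paths is an
# assumption here, not necessarily the real on-disk format):

import json
from pathlib import Path

def load_checkpoint(path: Path) -> set[str]:
    # Return the set of already-translated files recorded by a previous run.
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))

def save_checkpoint(path: Path, done: set[str]) -> None:
    # Persist completed paths so an interrupted run can resume where it left off.
    path.write_text(json.dumps(sorted(done)))

# Usage: skip any file already in the checkpoint, then record new completions.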


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/examples/basic_api.py
# Language: python

from collections.abc import Callable
from pathlib import Path
from abersetz import TranslatorOptions, translate_path
from abersetz.config import load_config, save_config
from abersetz.config import Credential, EngineConfig
import sys

def format_example_doc(func: Callable[..., object]) -> str:
    """Return a human-friendly description for an example function."""

def example_simple():
    """Translate a single file with default settings."""

def example_batch():
    """Translate multiple files to a specific directory."""

def example_llm_with_voc():
    """Use LLM translation with custom voc."""

def example_dry_run():
    """Test translation without actually calling APIs."""

def example_html():
    """Translate HTML files while preserving markup."""

def example_with_config():
    """Use custom configuration for translation."""


<document index="28">
<source>examples/batch_translate.sh</source>
<document_content>
#!/bin/bash
# this_file: examples/batch_translate.sh

# Advanced batch translation scripts

set -e  # Exit on error

# Configuration
PROJECT_ROOT="${1:-./docs}"
OUTPUT_BASE="${2:-./translations}"
LANGUAGES=("es" "fr" "de" "ja" "zh-CN" "pt" "it" "ru")
ENGINE="${ABERSETZ_ENGINE:-tr/google}"

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

echo -e "${BLUE}=== Abersetz Batch Translation ===${NC}"
echo "Source: $PROJECT_ROOT"
echo "Output: $OUTPUT_BASE"
echo "Engine: $ENGINE"
echo ""

# Function to translate to a single language
translate_lang() {
    local lang=$1
    local output_dir="$OUTPUT_BASE/$lang"

    echo -e "${BLUE}Translating to $lang...${NC}"

    if abersetz tr "$lang" "$PROJECT_ROOT" \
        --engine "$ENGINE" \
        --output "$output_dir" \
        --recurse \
        --include "*.md,*.txt,*.html" \
        --xclude ".*,*test*,*draft*"; then
        echo -e "${GREEN}✓ $lang completed${NC}"
        return 0
    else
        echo -e "${RED}✗ $lang failed${NC}"
        return 1
    fi
}

# Create output directory
mkdir -p "$OUTPUT_BASE"

# Track results
SUCCESS_COUNT=0
FAILED_LANGS=()

# Translate to each language
for lang in "${LANGUAGES[@]}"; do
    if translate_lang "$lang"; then
        ((SUCCESS_COUNT++))
    else
        FAILED_LANGS+=("$lang")
    fi
    echo ""
done

# Summary
echo -e "${BLUE}=== Translation Summary ===${NC}"
echo "Successfully translated to $SUCCESS_COUNT/${#LANGUAGES[@]} languages"

if [ ${#FAILED_LANGS[@]} -gt 0 ]; then
    echo -e "${RED}Failed languages: ${FAILED_LANGS[*]}${NC}"
    exit 1
else
    echo -e "${GREEN}All translations completed successfully!${NC}"
fi

# Generate index file
INDEX_FILE="$OUTPUT_BASE/index.md"
echo "# Translations" > "$INDEX_FILE"
echo "" >> "$INDEX_FILE"
echo "Available translations of $PROJECT_ROOT:" >> "$INDEX_FILE"
echo "" >> "$INDEX_FILE"

for lang in "${LANGUAGES[@]}"; do
    if [ -d "$OUTPUT_BASE/$lang" ]; then
        file_count=$(find "$OUTPUT_BASE/$lang" -type f | wc -l)
        echo "- [$lang]($lang/) - $file_count files" >> "$INDEX_FILE"
    fi
done

echo -e "${GREEN}Index generated at $INDEX_FILE${NC}"
</document_content>
</document>

<document index="29">
<source>examples/config_setup.sh</source>
<document_content>
#!/bin/bash
# this_file: examples/config_setup.sh

# Setup and configure abersetz with various engines

set -e

echo "=== Abersetz Configuration Setup ==="
echo ""

# Function to check if command exists
command_exists() {
    command -v "$1" >/dev/null 2>&1
}

# Function to setup environment variable
setup_env_var() {
    local var_name=$1
    local var_description=$2

    if [ -z "${!var_name:-}" ]; then
        echo "⚠ $var_name not set"
        echo "  Description: $var_description"
        echo "  To set: export $var_name='your_api_key_here'"
        return 1
    else
        echo "✓ $var_name is configured"
        return 0
    fi
}

# Check abersetz installation
echo "Checking installation..."
if command_exists abersetz; then
    echo "✓ abersetz is installed"
    abersetz version
else
    echo "✗ abersetz not found. Install with: pip install abersetz"
    exit 1
fi

echo ""

# Show config location
echo "Configuration location:"
abersetz config path
echo ""

# Check API keys for various engines
echo "Checking API keys for LLM engines:"
echo ""

setup_env_var "OPENAI_API_KEY" "OpenAI API for GPT models"
setup_env_var "ANTHROPIC_API_KEY" "Anthropic API for Claude models"
setup_env_var "SILICONFLOW_API_KEY" "SiliconFlow API for Hunyuan translation"
setup_env_var "DEEPSEEK_API_KEY" "DeepSeek API for Chinese models"
setup_env_var "GROQ_API_KEY" "Groq API for fast inference"
setup_env_var "GOOGLE_API_KEY" "Google API for Gemini models"

echo ""

# Test available engines
echo "Testing available engines:"
echo ""

# Test free engines (no API key required)
echo "1. Testing free engines..."
for engine in "tr/google" "tr/bing" "dt/google"; do
    echo -n "  $engine: "
    if echo "Hello" | abtr es - --engine "$engine" --dry-run >/dev/null 2>&1; then
        echo "✓"
    else
        echo "✗"
    fi
done

echo ""

# Create sample configuration
CONFIG_FILE="$HOME/.config/abersetz/config.toml"
if [ ! -f "$CONFIG_FILE" ]; then
    echo "Creating default configuration..."
    mkdir -p "$(dirname "$CONFIG_FILE")"
    cat > "$CONFIG_FILE" <<'EOF'
[defaults]
engine = "tr/google"
from_lang = "auto"
to_lang = "en"
chunk_size = 1200
html_chunk_size = 1800

[credentials.openai]
env = "OPENAI_API_KEY"

[credentials.anthropic]
env = "ANTHROPIC_API_KEY"

[credentials.siliconflow]
env = "SILICONFLOW_API_KEY"

[credentials.deepseek]
env = "DEEPSEEK_API_KEY"

[credentials.groq]
env = "GROQ_API_KEY"

[credentials.google]
env = "GOOGLE_API_KEY"

[engines.hysf]
chunk_size = 2400

[engines.hysf.credential]
name = "siliconflow"

[engines.hysf.options]
model = "tencent/Hunyuan-MT-7B"
base_url = "https://api.siliconflow.com/v1"
temperature = 0.3

[engines.ullm]
chunk_size = 2400

[engines.ullm.options.profiles.default]
base_url = "https://api.siliconflow.com/v1"
model = "tencent/Hunyuan-MT-7B"
credential = { name = "siliconflow" }
temperature = 0.3
max_input_tokens = 32000

[engines.ullm.options.profiles.gpt4]
base_url = "https://api.openai.com/v1"
model = "gpt-4-turbo-preview"
credential = { name = "openai" }
temperature = 0.3
max_input_tokens = 128000

[engines.ullm.options.profiles.claude]
base_url = "https://api.anthropic.com/v1"
model = "claude-3-opus-20240229"
credential = { name = "anthropic" }
temperature = 0.3
max_input_tokens = 200000

[engines.ullm.options.profiles.deepseek]
base_url = "https://api.deepseek.com/v1"
model = "deepseek-chat"
credential = { name = "deepseek" }
temperature = 0.3
max_input_tokens = 32000
EOF
    echo "✓ Configuration created at $CONFIG_FILE"
else
    echo "Configuration already exists at $CONFIG_FILE"
fi

echo ""

# Show current configuration
echo "Current configuration:"
abersetz config show | head -20
echo "..."

echo ""
echo "=== Setup Complete ==="
echo ""
echo "Quick test commands:"
echo "  abersetz tr es test.txt              # Use default engine"
echo "  abtr fr test.txt --engine tr/bing    # Use Bing"
echo "  abtr de test.txt --engine hysf       # Use SiliconFlow LLM"
echo "  abtr ja test.txt --engine ullm/gpt4  # Use GPT-4"

</document_content>
</document>

<document index="30">
<source>examples/engines_config.json</source>
<document_content>
{
  "defaults": {
    "engine": "tr/google",
    "from_lang": "auto",
    "to_lang": "en",
... (file content truncated to first 5 lines)
</document_content>
</document>

<document index="31">
<source>examples/pipeline.sh</source>
<document_content>
#!/bin/bash
# this_file: examples/pipeline.sh

# Complete translation pipeline with preprocessing and postprocessing

set -euo pipefail

# Configuration
SOURCE_DIR="${1:-.}"
TARGET_LANG="${2:-es}"
WORK_DIR="/tmp/abersetz_work_$$"
FINAL_OUTPUT="${3:-./translated_$TARGET_LANG}"

# Setup work directory
mkdir -p "$WORK_DIR"
trap "rm -rf $WORK_DIR" EXIT

echo "=== Abersetz Translation Pipeline ==="
echo "Source: $SOURCE_DIR"
echo "Target language: $TARGET_LANG"
echo "Output: $FINAL_OUTPUT"
echo ""

# Step 1: Find and copy translatable files
echo "Step 1: Collecting files..."
find "$SOURCE_DIR" -type f \( \
    -name "*.md" -o \
    -name "*.txt" -o \
    -name "*.html" -o \
    -name "*.htm" \
\) -not -path "*/\.*" -not -path "*/node_modules/*" \
   -not -path "*/venv/*" -not -path "*/__pycache__/*" | while read -r file; do
    rel_path="${file#$SOURCE_DIR/}"
    dest="$WORK_DIR/source/$rel_path"
    mkdir -p "$(dirname "$dest")"
    cp "$file" "$dest"
done

FILE_COUNT=$(find "$WORK_DIR/source" -type f 2>/dev/null | wc -l || echo 0)
echo "  Found $FILE_COUNT files"

if [ "$FILE_COUNT" -eq 0 ]; then
    echo "No files to translate!"
    exit 1
fi

# Step 2: Preprocess files (optional)
echo -e "\nStep 2: Preprocessing..."
# Example: Convert markdown links to absolute URLs
# find "$WORK_DIR/source" -name "*.md" -exec sed -i.bak 's|\](./|\](https://example.com/|g' {} \;
echo "  Preprocessing complete"

# Step 3: Translate
echo -e "\nStep 3: Translating..."
if abersetz tr "$TARGET_LANG" "$WORK_DIR/source" \
    --output "$WORK_DIR/translated" \
    --recurse; then
    echo "  Translation complete"
else
    echo "  Translation failed!"
    exit 1
fi

# Step 4: Postprocess translations
echo -e "\nStep 4: Postprocessing..."
# Example: Fix common translation issues
find "$WORK_DIR/translated" -type f -name "*.md" | while read -r file; do
    # Fix code blocks that might have been translated
    sed -i.bak 's/```[a-z]*$/```/g' "$file"
    # Remove backup files
    rm -f "${file}.bak"
done
echo "  Postprocessing complete"

# Step 5: Generate translation report
echo -e "\nStep 5: Generating report..."
REPORT_FILE="$WORK_DIR/translated/TRANSLATION_REPORT.md"
cat > "$REPORT_FILE" <<EOF
# Translation Report

## Summary
- **Source Directory**: $SOURCE_DIR
- **Target Language**: $TARGET_LANG
- **Date**: $(date)
- **Files Translated**: $FILE_COUNT

## File List
EOF

find "$WORK_DIR/translated" -type f -not -name "TRANSLATION_REPORT.md" | while read -r file; do
    rel_path="${file#$WORK_DIR/translated/}"
    size=$(wc -c < "$file")
    echo "- $rel_path ($(numfmt --to=iec-i --suffix=B $size))" >> "$REPORT_FILE"
done

echo "  Report generated"

# Step 6: Copy to final destination
echo -e "\nStep 6: Copying to final destination..."
rm -rf "$FINAL_OUTPUT"
cp -r "$WORK_DIR/translated" "$FINAL_OUTPUT"
echo "  Files copied to $FINAL_OUTPUT"

# Step 7: Verification
echo -e "\nStep 7: Verification..."
TRANSLATED_COUNT=$(find "$FINAL_OUTPUT" -type f -not -name "TRANSLATION_REPORT.md" | wc -l)
if [ "$TRANSLATED_COUNT" -eq "$FILE_COUNT" ]; then
    echo "  ✓ All files translated successfully"
else
    echo "  ⚠ Warning: Expected $FILE_COUNT files, found $TRANSLATED_COUNT"
fi

echo -e "\n=== Pipeline Complete ==="
echo "Translated files are in: $FINAL_OUTPUT"
echo "Report available at: $FINAL_OUTPUT/TRANSLATION_REPORT.md"
</document_content>
</document>

<document index="32">
<source>examples/pl/poem_en.txt</source>
<document_content>
Być lub nie, to jest pytanie: 
Czy to jest szlachetne w umyśle, by cierpieć 
Procy i strzały oburzającej fortuny, 
Lub wziąć broń do morza kłopotówI przeciwstawiając się ich zakończeni. Umrzeć - spać, 
Więcej nie; i snem, mówiąc, że kończymy 
Ból serca i tysiące naturalnych wstrząsów 
To ciało jest spadkobiercą: „to jest konsumpcjaPobożne, aby być życzeniem. Umrzeć, spać; 
Spać, w stanie marzyć - powie, jest pocieranie: 
Bo w tym śnie śmierci, jakie sny mogą nadejść, 
Kiedy odrzuciliśmy tę śmiertelną cewkę,Musi nam się zatrzymać - jest szacunek 
To powoduje katastrofę tak długiego życia. 
Bo kto nosiłby bicze i pogardy czasu, 
Th’ -upresor jest w błędzie, dumny człowiek jest skryty,Błędności z niepoprawami, opóźnienie prawa, 
Bezczelność urzędu i odmienne 
Ta zaleca pacjenta z powodu tego, co bierze, 
Kiedy on sam mógłby zrobić jego ciszyZ gołym bodkinem? Kto by uporządkował niedźwiedzie, 
Chrząkać i poci się w zmęczonym życiu, 
Ale ten strach przed czymś po śmierci, 
Niedopolowy kraj, od którego kouringuŻaden podróżnik nie wraca, zagadnia testament, 
I sprawia, że ​​raczej nosimy te choroby, które mamy 
Niż latać do innych, o których nie wiemy? 
Zatem sumienie, czyni z nas tchórzów,A zatem rodzime odcień rozdzielczości 
Jest chory z bladą obsadą myśli, 
Oraz przedsiębiorstwa o wielkim rdzeniu i momencie 
Z tego powodu ich prądy zmieniają się 
I stracić nazwę akcji.
</document_content>
</document>

<document index="33">
<source>examples/pl/poem_pl.txt</source>
<document_content>
# this_file: examples/poem_pl.txt

Świt spływa po dachach,
Dzwony lśnią w porannej mgle,
Sąsiedzi wymieniają pozdrowienia,
A nadzieja znów czuje się jak dar.
</document_content>
</document>

<document index="34">
<source>examples/poem_en.txt</source>
<document_content>
To be, or not to be, that is the question:
Whether ’tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles
And by opposing end them. To die—to sleep,
No more; and by a sleep to say we end
The heart-ache and the thousand natural shocks
That flesh is heir to: ’tis a consummation
Devoutly to be wish’d. To die, to sleep;
To sleep, perchance to dream—ay, there’s the rub:
For in that sleep of death what dreams may come,
When we have shuffled off this mortal coil,
Must give us pause—there’s the respect
That makes calamity of so long life.
For who would bear the whips and scorns of time,
Th’oppressor’s wrong, the proud man’s contumely,
The pangs of dispriz’d love, the law’s delay,
The insolence of office, and the spurns
That patient merit of th’unworthy takes,
When he himself might his quietus make
With a bare bodkin? Who would fardels bear,
To grunt and sweat under a weary life,
But that the dread of something after death,
The undiscovere’d country, from whose bourn
No traveller returns, puzzles the will,
And makes us rather bear those ills we have
Than fly to others that we know not of?
Thus conscience doth make cowards of us all,
And thus the native hue of resolution
Is sicklied o’er with the pale cast of thought,
And enterprises of great pith and moment
With this regard their currents turn awry
And lose the name of action.
</document_content>
</document>

<document index="35">
<source>examples/poem_pl.txt</source>
<document_content>
# this_file: examples/poem_pl.txt

Świt spływa po dachach,
Dzwony lśnią w porannej mgle,
Sąsiedzi wymieniają pozdrowienia,
A nadzieja znów czuje się jak dar.

</document_content>
</document>

<document index="36">
<source>examples/translate.sh</source>
<document_content>
#!/bin/bash
# this_file: examples/translate.sh

# Basic shell script examples for abersetz CLI

# Example 1: Simple translation
echo "=== Example 1: Simple translation ==="
abersetz tr es poem_en.txt --engine tr/google

# Example 2: Using shorthand command
echo -e "\n=== Example 2: Shorthand command ==="
abtr fr poem_en.txt

# Example 3: Translate directory recursively
echo -e "\n=== Example 3: Directory translation ==="
abersetz tr de ./docs --recurse --output ./docs_de

# Example 4: Translate with specific patterns
echo -e "\n=== Example 4: Pattern matching ==="
abtr ja . --include "*.md,*.txt" --xclude "*test*,.*" --output ./translations/ja

# Example 5: write_over original files (be careful!)
echo -e "\n=== Example 5: In-place translation ==="
# abersetz tr es backup_first.txt --write_over

# Example 6: Dry run to test without translating
echo -e "\n=== Example 6: Dry run mode ==="
abersetz tr zh-CN ./project --dry-run

# Example 7: Using different engines
echo -e "\n=== Example 7: Different engines ==="
# Google Translate
abtr pt file.txt --engine tr/google

# Bing Translate
abtr pt file.txt --engine tr/bing

# DeepL via deep-translator
abtr pt file.txt --engine dt/deepl

# Example 8: Save voc for LLM engines
echo -e "\n=== Example 8: LLM with voc ==="
# Requires SILICONFLOW_API_KEY environment variable
# abersetz tr es technical.md --engine hysf --save-voc

# Example 9: Verbose mode for debugging
echo -e "\n=== Example 9: Verbose output ==="
abersetz tr fr test.txt --verbose --dry-run

# Example 10: Check version
echo -e "\n=== Example 10: Version check ==="
abersetz version
</document_content>
</document>

<document index="37">
<source>examples/validate_report.sh</source>
<document_content>
#!/bin/bash
# this_file: examples/validate_report.sh

set -euo pipefail

OUTPUT_FILE=${1:-validate-report.txt}

if ! command -v abersetz >/dev/null 2>&1; then
    echo "abersetz executable not found. Install with: pip install abersetz" >&2
    exit 1
fi

echo "Running abersetz validate (target language: es)..."
abersetz validate --target-lang es >"$OUTPUT_FILE"

echo "Validation summary written to $OUTPUT_FILE"
cat "$OUTPUT_FILE"

</document_content>
</document>

<document index="38">
<source>examples/vocab.json</source>
<document_content>
{
  "this_file": "examples/vocab.json",
  "terms": {
    "rooftops": "dachy",
    "mist": "mgła",
... (file content truncated to first 5 lines)
</document_content>
</document>

<document index="39">
<source>examples/walkthrough.md</source>
<document_content>
---
this_file: examples/walkthrough.md
---
# Sample Translation Walkthrough

```bash
abersetz tr pl examples/poem_en.txt \
  --engine hysf \
  --output examples/out \
  --save-voc \
  --verbose
```

The command writes the translated poem to `examples/out/poem_en.txt` and saves the evolving voc as `examples/out/poem_en.txt.voc.json`.

</document_content>
</document>

<document index="40">
<source>issues/102-review.md</source>
<document_content>
---
this_file: issues/102-review.md
---
# Codebase Review and Specification Compliance

**Issue:** #102
**Date:** 2025-09-20

## 1. Executive Summary

The `abersetz` codebase successfully implements the core requirements outlined in `SPEC.md`. The project is well-structured, follows modern Python practices, and demonstrates a clear understanding of the initial vision described in `IDEA.md`. The implementation is lean, focused, and effectively reuses established libraries, adhering to the project's philosophy.

The code is modular, with clear separation of concerns between configuration, translation engines, the translation pipeline, and the CLI. The use of `platformdirs` for configuration, `python-fire` for the CLI, and `semantic-text-splitter` for chunking aligns perfectly with the specification.

This review confirms that the current state of the codebase represents a solid Minimum Viable Product (MVP). The few deviations are minor and do not detract from the overall quality. The analysis below provides a detailed breakdown of compliance and offers minor suggestions for future refinement.

## 2. Specification Compliance Analysis

Here is a point-by-point comparison of the codebase against `SPEC.md`:

| Section | Specification Point | Compliance | Analysis & Comments |
| :--- | :--- | :--- | :--- |
| **2.1** | **File Handling** | ✅ **Full** | The `pipeline.py` module correctly handles file discovery, both for single files and directories. The `--recurse` flag is implemented in `cli.py` and passed to the pipeline. The `--write_over` and `--output` flags are also correctly implemented. |
| **2.2** | **Translation Pipeline** | ✅ **Full** | The `pipeline.py` module implements the `locate -> chunk -> translate -> merge -> save` workflow as specified. The `translate_path` function orchestrates this process effectively. |
| **2.3** | **Content-Type Detection** | ✅ **Full** | `pipeline.py` includes a `_is_html` function that performs a basic but effective check for HTML tags, satisfying the requirement. |
| **3.1** | **Pre-integrated Engines** | ✅ **Full** | `engines.py` provides wrappers for `translators` and `deep-translator`. The engine selection logic correctly parses engine strings like `translators/google`. |
| **3.2.1** | **`hysf` Engine** | ✅ **Full** | The `HysfEngine` class in `engines.py` uses the `openai` client to interact with the specified Siliconflow endpoint. It correctly retrieves credentials from the configuration and uses `tenacity` for retries. |
| **3.2.2** | **`ullm` Engine** | ✅ **Full** | The `UllmEngine` in `engines.py` is highly configurable as specified. It supports profiles, custom prologs, and, most importantly, the `<output>` and `<voc>` tag parsing logic. The voc is correctly extracted and propagated to subsequent chunks. |
| **4.0** | **Configuration** | ✅ **Full** | `config.py` provides a robust configuration management system using `platformdirs`. It correctly handles storing and resolving credentials (both `env` and `value`). The schema matches the requirements, allowing for global defaults and engine-specific overrides. |
| **5.0** | **CLI** | ✅ **Full** | `cli.py` uses `python-fire` to expose the `translate` command with all the specified arguments. The CLI arguments are correctly wired to the `TranslatorOptions` dataclass. |
| **6.0** | **Python API** | ✅ **Full** | The `abersetz` package exposes `translate_path` and `TranslatorOptions` in its `__init__.py`, providing a clean and simple programmatic interface. |
| **7.0** | **Dependencies** | ✅ **Full** | The `pyproject.toml` and `DEPENDENCIES.md` files confirm that all specified dependencies are used correctly. |

## 3. Codebase Quality Analysis

### 3.1. Structure and Modularity

The project structure is excellent. The separation of concerns into distinct files (`config.py`, `engines.py`, `pipeline.py`, `cli.py`) makes the codebase easy to navigate and maintain. Each module has a clear responsibility:

-   `config.py`: Manages all configuration-related logic.
-   `chunking.py`: Handles text splitting.
-   `engines.py`: Abstracts the different translation services.
-   `pipeline.py`: Contains the core business logic of the translation process.
-   `cli.py`: Provides the command-line interface.

This modularity also contributes to the high testability of the code.

### 3.2. Code Quality and Style

-   **Clarity:** The code is well-written, with clear variable and function names.
-   **Typing:** The use of type hints is consistent and improves code readability and maintainability.
-   **Best Practices:** The project correctly uses modern Python features and libraries. The use of dataclasses for configuration objects is a good example.
-   **Dependencies:** The choice of dependencies is excellent. The project leverages high-quality, well-maintained libraries like `rich`, `loguru`, and `tenacity`, which aligns with the philosophy of not reinventing the wheel.

### 3.3. Testing

The project has a comprehensive test suite with high coverage (91% reported in `TESTING.md`). The tests are well-organized and cover the core functionality of each module. The use of a stub engine for pipeline tests is a particularly good practice, as it isolates the pipeline logic from network dependencies.

### 3.4. Documentation

The project is well-documented. The `README.md` is clear and provides a good overview of the project. The `PLAN.md`, `TODO.md`, and `CHANGELOG.md` files provide a good record of the project's history and future direction.

## 4. Conclusion and Recommendations

The `abersetz` project is a high-quality codebase that meets all the requirements of the initial specification. It is a well-designed, well-implemented, and well-tested piece of software.

**Recommendation:** The project is in an excellent state to be considered a complete MVP. No immediate changes are required. Future work can focus on the quality improvements listed in `TODO.md`, such as adding more robust error handling and expanding the integration test suite.

</document_content>
</document>

<document index="41">
<source>issues/103.txt</source>
<document_content>
The CLI tool is absurd: 

```
$ abersetz --help|cat
INFO: Showing help with the command 'abersetz -- --help'.

NAME
    abersetz - Fire-powered entrypoint.

SYNOPSIS
    abersetz -

DESCRIPTION
    Fire-powered entrypoint.
```

It doesn’t expose any functionality. 

Re-read @IDEA.md and @SPEC.md and @PLAN.md and @TODO.md and the @llms.txt codebase and also @external/cerebrate-file.txt and @external/translators.txt and @external/semantic-text-splitter.txt and @external/python-ftfy.txt @external/platformdirs.txt and @dump_models 

Then into @PLAN.md and @TODO.md /plan the necessary changes. 

Then think hard and review and refine the @PLAN.md and @TODO.md 

Then /report and then finally /work on the tasks! 
</document_content>
</document>

<document index="42">
<source>issues/105.txt</source>
<document_content>
## `abersetz engines`

This produces a nice tables PLUS a list of object reprs. The latter needs to go

```

                                 Available Engines
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Selector                  ┃ Configured ┃ Credential ┃ Notes                      ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ translators               │ yes        │ optional   │ use translators/<provider> │
│ translators/apertium      │ no         │ optional   │ free                       │
│ translators/argos         │ no         │ optional   │ free                       │
│ translators/bing          │ no         │ optional   │ free                       │
│ translators/elia          │ no         │ optional   │ free                       │
│ translators/google        │ no         │ optional   │ free                       │
│ translators/iciba         │ no         │ optional   │ free                       │
│ translators/myMemory      │ no         │ optional   │ free                       │
│ translators/papago        │ no         │ optional   │ free                       │
│ translators/reverso       │ no         │ optional   │ free                       │
│ translators/tilde         │ no         │ optional   │ free                       │
│ translators/translateCom  │ no         │ optional   │ free                       │
│ translators/translateMe   │ no         │ optional   │ free                       │
│ translators/utibet        │ no         │ optional   │ free                       │
│ translators/yandex        │ no         │ optional   │ free                       │
│ translators/youdao        │ no         │ optional   │ free                       │
│ deep-translator           │ no         │ optional   │ use                        │
│                           │            │            │ deep-translator/<provider> │
│ deep-translator/google    │ no         │ optional   │ free                       │
│ deep-translator/libre     │ no         │ optional   │ free                       │
│ deep-translator/linguee   │ no         │ optional   │ free                       │
│ deep-translator/my_memory │ no         │ optional   │ free                       │
│ hysf                      │ yes        │ required   │ siliconflow                │
│ ullm/default              │ yes        │ required   │ Qwen/Qwen2.5-7B-Instruct   │
└───────────────────────────┴────────────┴────────────┴────────────────────────────┘
EngineEntry(selector='translators', configured=True, requires_api_key=False, notes='use translators/<provider>')
EngineEntry(selector='translators/apertium', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/argos', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/bing', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/elia', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/google', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/iciba', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/myMemory', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/papago', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/reverso', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/tilde', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/translateCom', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/translateMe', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/utibet', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/yandex', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='translators/youdao', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='deep-translator', configured=False, requires_api_key=False, notes='use deep-translator/<provider>')
EngineEntry(selector='deep-translator/google', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='deep-translator/libre', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='deep-translator/linguee', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='deep-translator/my_memory', configured=False, requires_api_key=False, notes='free')
EngineEntry(selector='hysf', configured=True, requires_api_key=True, notes='siliconflow')
EngineEntry(selector='ullm/default', configured=True, requires_api_key=True, notes='Qwen/Qwen2.5-7B-Instruct')
~/Developer/vcs/github.twardoch/pub/abersetz
```

## `abersetz setup`

This produces a completely different table from the one shown by `abersetz engines`. Adopt the same format, or even re-use the code.

```
🔧 Abersetz Configuration Setup

Scanning environment for API keys and endpoints...

Testing discovered services...

                 Discovered Translation Services
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Provider    ┃ Status      ┃ Engines                   ┃ Models ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ Openai      │ ✓ Available │ ullm/default, ullm/openai │     83 │
│ Google      │ ✓ Available │ translators/google        │      1 │
│ Groq        │ ✓ Available │ ullm/groq                 │     21 │
│ Mistral     │ ✓ Available │ N/A                       │     70 │
│ Deepseek    │ ✓ Available │ N/A                       │      2 │
│ Togetherai  │ ✓ Available │ N/A                       │     90 │
│ Siliconflow │ ✓ Available │ hysf, ullm/default        │     77 │
│ Deepinfra   │ ✓ Available │ ullm/deepinfra            │    167 │
│ Fireworks   │ ✓ Available │ N/A                       │     39 │
│ Sambanova   │ ✓ Available │ N/A                       │     12 │
│ Cerebras    │ ✓ Available │ N/A                       │      9 │
│ Hyperbolic  │ ✓ Available │ N/A                       │     29 │
│ Openrouter  │ ✓ Available │ N/A                       │    327 │
└─────────────┴─────────────┴───────────────────────────┴────────┘

✓ Configuration saved to: /Users/adam/Library/Application
Support/abersetz/config.toml

You can now use abersetz to translate files!

Example: abersetz tr es document.txt
```

## engine names

Shorten `translators` to `tr`, `deep-translator` to `dt`, and `ullm` to `ll`.
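
A minimal sketch of how such shortening could work, assuming a simple alias table (the names `SHORT_NAMES` and `shorten_selector` are illustrative, not the codebase's actual API):

```python
# Hypothetical alias table covering only the three renames requested above.
SHORT_NAMES = {"translators": "tr", "deep-translator": "dt", "ullm": "ll"}

def shorten_selector(selector: str) -> str:
    """Rewrite 'family/provider' selectors using the short family names."""
    family, sep, provider = selector.partition("/")
    return SHORT_NAMES.get(family, family) + sep + provider
```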
</document_content>
</document>

<document index="43">
<source>issues/200.txt</source>
<document_content>
Run `llms .` and then analyze the codebase snapshot in `llms.txt`. Then think hard. Completely remove from @WORK.md @TODO.md @PLAN.md all tasks that have been completed. Then make a very detailed /plan into @PLAN.md on how we should evolve this package. I don’t want this: 

```
$ abersetz engines
                                      Available Translation Engines
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Selector                  ┃ Shortcut        ┃ Configured ┃ Credential ┃ Notes                          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ translators               │ tr              │ ✓          │ free       │ use translators/<provider>     │
```

I want to completely replace the current long selectors with the short ones. We may keep shortcuts ('aliases') as a concept. 

I want proper, better validation of every available translator engine in a standalone CLI command (also invoked by 'setup'), plus better reporting and auto-configuration. The validation should truly translate something very short.
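
One way to frame that check, sketched here with a hypothetical `probe` callable standing in for a real engine call:

```python
def validate_engine(probe, sample: str = "Hello") -> tuple[bool, str]:
    """Return (ok, detail) after asking the engine to translate a short phrase."""
    try:
        result = probe(sample)
    except Exception as exc:  # report the failure, don't crash the report
        return False, f"error: {exc}"
    if not result or result.strip() == sample:
        # An engine that echoes the input back is as broken as one that errors.
        return False, "empty or unchanged output"
    return True, result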

I want other quality of life improvements, more examples, better tests, better documentation. I want vibrant development, not LAZY stuff. 

Regularly update the @CLAUDE.md file so that it contains accurate, appropriate development instructions for the codebase. 

Consult with outside sources. Make the app beautiful and easy to use, fast, versatile. 

Research all engines in detail, consult the codebase snapshots in @external for relevant info. 

Always use @PLAN.md and @TODO.md and @WORK.md 

Always /report then /cleanup then /plan next steps then /work and then test 
</document_content>
</document>

<document index="44">
<source>package.toml</source>
<document_content>
# Package configuration
[package]
include_cli = true        # Include CLI boilerplate
include_logging = true    # Include logging setup
use_pydantic = true      # Use Pydantic for data validation
use_rich = true          # Use Rich for terminal output

[features]
mkdocs = false           # Enable MkDocs documentation
vcs = true              # Initialize Git repository
github_actions = true   # Add GitHub Actions workflows 
</document_content>
</document>

<document index="45">
<source>pyproject.toml</source>
<document_content>
# this_file: pyproject.toml

[build-system]
requires = ["hatchling>=1.27", "hatch-vcs>=0.4"]
build-backend = "hatchling.build"

[project]
name = "abersetz"
description = ""
readme = "README.md"
requires-python = ">=3.10"
dynamic = ["version"]
dependencies = [
    "deep-translator>=1.11",
    "fire>=0.5",
    "httpx>=0.25",
    "loguru>=0.7",
    "langcodes>=3.4",
    "platformdirs>=4.3",
    "rich>=13.9",
    "semantic-text-splitter>=0.7",
    "tenacity>=8.4",
    "translators>=5.9",
    "tomli-w>=1.0",
    "tomli>=2.0; python_version < \"3.11\"",
]

[[project.authors]]
name = "Adam Twardoch"
email = "adam+github@twardoch.com"

[project.license]
text = "MIT"

[project.urls]
Documentation = "https://github.com/twardoch/abersetz#readme"
Issues = "https://github.com/twardoch/abersetz/issues"
Source = "https://github.com/twardoch/abersetz"

[project.scripts]
abersetz = "abersetz.cli_fast:main"
abtr = "abersetz.cli:abtr_main"

[dependency-groups]
dev = [
    "pytest>=8.3",
    "pytest-cov>=6.0",
    "ruff>=0.9",
    "mypy>=1.10",
]

[tool.hatch.version]
source = "vcs"

[tool.hatch.build]
exclude = ["/dist"]

[tool.hatch.build.targets.wheel]
packages = ["src/abersetz"]

[tool.hatch.build.hooks.vcs]
version-file = "src/abersetz/__about__.py"

[tool.hatch.envs.default]
python = "3.12"
dependencies = [
    "pytest>=8.3",
    "pytest-cov>=6.0",
    "ruff>=0.9",
    "mypy>=1.10",
]

[tool.hatch.envs.default.scripts]
test = "pytest {args:tests}"
lint = "ruff check {args:src tests}"
fmt = "ruff format {args:src tests}"
default = ["fmt", "lint", "test"]

[tool.uv]
default-groups = ["dev"]
python-preference = "managed"

[tool.ruff]
line-length = 100
target-version = "py310"

[tool.ruff.lint]
select = ["E", "F", "B", "I", "UP", "SIM"]
ignore = ["E203", "E501"]

[tool.ruff.format]
quote-style = "double"
indent-style = "space"

[tool.pytest.ini_options]
addopts = "-q"
testpaths = ["tests"]
markers = [
    "integration: mark test as integration test (requires network/API access)",
]

[tool.mypy]
python_version = "3.12"

[[tool.mypy.overrides]]
module = [
    "pytest",
    "pytest.*",
    "httpx",
    "httpx.*",
    "tenacity",
    "tenacity.*",
    "semantic_text_splitter",
    "semantic_text_splitter.*",
    "platformdirs",
    "platformdirs.*",
    "tomli_w",
    "loguru",
    "loguru.*",
    "langcodes",
    "langcodes.*",
    "rich",
    "rich.*",
    "requests",
    "requests.*",
    "translators",
    "translators.*",
]
ignore_missing_imports = true

</document_content>
</document>

# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/__init__.py
# Language: python

from importlib import metadata as _metadata
from typing import TYPE_CHECKING, Any
from .pipeline import PipelineError, TranslationResult, TranslatorOptions, translate_path
from . import pipeline
from .__about__ import __version__

def __getattr__(name: str) -> Any:
    """Lazy load heavy modules only when accessed."""
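
The lazy loading presumably follows PEP 562's module-level `__getattr__`; a self-contained sketch (the `_LAZY_MODULES` mapping below is illustrative, using stdlib modules rather than abersetz's own):

```python
import importlib
from typing import Any

# Illustrative mapping of exported names to the modules that provide them.
_LAZY_MODULES = {"json": "json", "math": "math"}

def lazy_getattr(name: str) -> Any:
    """Import the backing module only on first attribute access."""
    if name in _LAZY_MODULES:
        return importlib.import_module(_LAZY_MODULES[name])
    raise AttributeError(f"module has no attribute {name!r}")
```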


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/__main__.py
# Language: python

from .cli import main


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/abersetz.py
# Language: python

from .pipeline import PipelineError, TranslationResult, TranslatorOptions, translate_path


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/chunking.py
# Language: python

import re
from collections.abc import Iterable
from enum import Enum
from semantic_text_splitter import TextSplitter

class TextFormat(Enum):
    """Minimal set of supported text formats."""

def detect_format(text: str) -> TextFormat:
    """Detect whether ``text`` looks like HTML."""

def _fallback_chunks(text: str, max_size: int) -> list[str]:
    """Simple slicing fallback when semantic splitter is unavailable."""

def _semantic_chunks(text: str, max_size: int) -> Iterable[str]:
    """Prefer semantic-text-splitter when installed."""

def chunk_text(text: str, max_size: int, fmt: TextFormat) -> list[str]:
    """Chunk text according to the detected format."""
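
The slicing fallback can be pictured as fixed-width slices; a sketch under that assumption (not the module's exact implementation):

```python
def fallback_chunks(text: str, max_size: int) -> list[str]:
    """Slice text into pieces of at most max_size characters."""
    if not text:
        return [text]
    return [text[i : i + max_size] for i in range(0, len(text), max_size)]
```

Joining the chunks back together always reproduces the input, which is the invariant the merge step of the pipeline relies on.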


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/cli.py
# Language: python

import json
import sys
from collections.abc import Iterable, Sequence
from pathlib import Path
import fire
import tomli_w
from loguru import logger
from rich.console import Console
from rich.table import Table
from .config import config_path, load_config
from .engine_catalog import (
    DEEP_TRANSLATOR_PAID_PROVIDERS,
    PAID_TRANSLATOR_PROVIDERS,
    EngineEntry,
    collect_deep_translator_providers,
    collect_translator_providers,
    normalize_selector,
)
from .pipeline import PipelineError, TranslationResult, TranslatorOptions, translate_path
from .setup import setup_command
from .validation import ValidationResult, validate_engines
from langcodes import get
from langcodes.language_lists import CLDR_LANGUAGES
from . import __version__

class ConfigCommands:
    """Configuration related helpers."""
    def show(self) -> str:
    def path(self) -> str:

class AbersetzCLI:
    """Abersetz translation tool - translate files between languages."""
    def version(self) -> str:
        """Show version information."""
    def tr(
        self,
        to_lang: str,
        path: str | Path,
        *,
        engine: str | None = None,
        from_lang: str | None = None,
        recurse: bool = True,
        write_over: bool = False,
        output: str | Path | None = None,
        save_voc: bool = False,
        chunk_size: int | None = None,
        html_chunk_size: int | None = None,
        include: str | Sequence[str] | None = None,
        xclude: str | Sequence[str] | None = None,
        dry_run: bool = False,
        prolog: str | None = None,
        voc: str | None = None,
        verbose: bool = False,
    ) -> None:
    def config(self) -> ConfigCommands:
    def lang(self) -> list[str]:
    def engines(
        self,
        include_paid: bool = False,
        *,
        family: str | None = None,
        configured_only: bool = False,
    ) -> None:
        """List available engines and whether they are configured."""
    def setup(self, non_interactive: bool = False, verbose: bool = False) -> None:
        """Run the configuration setup wizard."""
    def validate(
        self,
        *,
        selectors: str | Sequence[str] | None = None,
        target_lang: str = "es",
        source_lang: str = "auto",
        sample_text: str = "Hello, world!",
        include_defaults: bool = True,
    ) -> list[ValidationResult]:
        """Validate configured engines by translating a short phrase."""

def _configure_logging(verbose: bool) -> None:

def _parse_patterns(value: str | Sequence[str] | None) -> tuple[str, ...]:

def _load_json_data(reference: str | None) -> dict[str, str]:

def _render_results(results: Iterable[TranslationResult]) -> None:

def _render_engine_entries(entries: list[EngineEntry]) -> None:

def _render_validation_entries(results: list[ValidationResult]) -> None:

def _collect_engine_entries(
    include_paid: bool,
    *,
    family: str | None = None,
    configured_only: bool = False,
) -> list[EngineEntry]:


def _validate_language_code(code: str | None, param_name: str) -> str | None:
    """Validate language code format."""
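
The real check presumably defers to the `langcodes` dependency; for illustration, a rough shape check of what "valid language code format" means here (the regex and function name are assumptions):

```python
import re

# Rough BCP 47-style tag shape: 2-3 letter primary subtag plus optional subtags.
_LANG_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*$")

def looks_like_language_code(code: str) -> bool:
    """Shape-check a language code; 'auto' is allowed for source detection."""
    return code == "auto" or bool(_LANG_TAG.match(code))
```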

def _build_options_from_cli(
    path: str | Path,
    *,
    engine: str | None,
    from_lang: str | None,
    to_lang: str | None,
    recurse: bool,
    write_over: bool,
    output: str | Path | None,
    save_voc: bool,
    chunk_size: int | None,
    html_chunk_size: int | None,
    include: str | Sequence[str] | None,
    xclude: str | Sequence[str] | None,
    dry_run: bool,
    prolog: str | None,
    voc: str | None,
) -> TranslatorOptions:

def _iter_language_rows() -> list[str]:


def main() -> None:
    """Invoke the Fire CLI."""

def abtr_main() -> None:
    """Direct translation CLI - equivalent to 'abersetz tr'."""


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/cli_fast.py
# Language: python

import sys
from importlib import metadata
from .__about__ import __version__ as version
from .cli import main as cli_main

def handle_version(()) -> None:
    """Handle --version flag with minimal imports."""

def main(()) -> None:
    """Fast CLI entry point that defers heavy imports."""


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/config.py
# Language: python

import copy
import os
from collections.abc import Mapping
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any
from platformdirs import user_config_dir
from .engine_catalog import (
    DEEP_TRANSLATOR_FREE_PROVIDERS,
    FREE_TRANSLATOR_PROVIDERS,
    HYSF_DEFAULT_MODEL,
    HYSF_DEFAULT_TEMPERATURE,
    normalize_selector,
)
try:
    import tomllib
except ModuleNotFoundError:  # Python < 3.11
    import tomli as tomllib
import tomli_w
from loguru import logger

class Defaults:
    """Runtime defaults for translation."""
    def __setattr__(self, name: str, value: Any) -> None:
    def to_dict(self) -> dict[str, Any]:

class Credential:
    """Represents an API credential reference."""
    def to_dict(self) -> dict[str, str]:

class EngineConfig:
    """Engine specific configuration block."""
    def to_dict(self) -> dict[str, Any]:

class AbersetzConfig:
    """Aggregate configuration for the toolkit."""
    def to_dict(self) -> dict[str, Any]:

def __setattr__(self, name: str, value: Any) -> None:

def to_dict(self) -> dict[str, Any]:

def from_dict(cls, raw: Mapping[str, Any] | None) -> Defaults:

def to_dict(self) -> dict[str, str]:

def from_any(cls, raw: CredentialLike | None) -> Credential | None:

def to_dict(self) -> dict[str, Any]:

def from_dict(cls, name: str, raw: Mapping[str, Any] | None) -> EngineConfig:

def to_dict(self) -> dict[str, Any]:

def from_dict(cls, raw: Mapping[str, Any]) -> AbersetzConfig:

def _default_dict() -> dict[str, Any]:
    """Return a deep copy of the default config mapping."""

def _default_config() -> AbersetzConfig:
    """Return a fresh ``AbersetzConfig`` with defaults."""

def config_dir() -> Path:
    """Return directory holding the configuration file."""

def config_path() -> Path:
    """Return absolute path to the configuration file."""

def load_config() -> AbersetzConfig:
    """Load configuration from disk, creating defaults if needed."""

def save_config(config: AbersetzConfig) -> None:
    """Persist configuration to ``config.toml``."""

def resolve_credential(
    config: AbersetzConfig,
    reference: CredentialLike,
) -> str | None:
    """Resolve a credential reference to a usable secret."""


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/engine_catalog.py
# Language: python

from collections.abc import Iterable
from dataclasses import dataclass
import translators

class EngineEntry:
    """Descriptor for CLI listing."""

def _split_selector(selector: str) -> tuple[str, str | None]:

def normalize_selector(selector: str | None) -> str | None:
    """Return canonical short selector for supported engine families."""

def resolve_engine_reference(selector: str) -> tuple[str, str | None]:
    """Resolve selector (short or long) into engine config key and variant."""

def _filter_available(pool: Iterable[str], allowed: Iterable[str]) -> list[str]:

def collect_translator_providers(*, include_paid: bool = False) -> list[str]:
    """Return translator providers available in current environment."""

def collect_deep_translator_providers(*, include_paid: bool = False) -> list[str]:
    """Return deep-translator providers supported by abersetz."""


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/engines.py
# Language: python

import json
import re
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any, Protocol
from tenacity import retry, stop_after_attempt, wait_exponential
from .chunking import TextFormat
from .config import AbersetzConfig, EngineConfig, resolve_credential
from .engine_catalog import (
    HYSF_DEFAULT_MODEL,
    HYSF_DEFAULT_TEMPERATURE,
    normalize_selector,
    resolve_engine_reference,
)
from .openai_lite import OpenAI
import translators
from deep_translator import (  # type: ignore
    DeeplTranslator,
    GoogleTranslator,
    LibreTranslator,
    LingueeTranslator,
    MicrosoftTranslator,
    MyMemoryTranslator,
    PapagoTranslator,
)
from langcodes import get as get_language

class EngineError(RuntimeError):
    """Raised when an engine cannot be constructed or invoked."""

class EngineRequest:
    """Payload passed to engines."""

class EngineResult:
    """Normalized engine output."""

class Engine(Protocol):
    """Protocol implemented by engine adapters."""
    def translate(self, request: EngineRequest) -> EngineResult:
        """Translate a chunk."""
    def chunk_size_for(self, fmt: TextFormat) -> int | None:
        """Return preferred chunk size for the given text format."""

class EngineBase:
    """Shared helpers for engines."""
    def __init__(
        self,
        name: str,
        chunk_size: int | None,
        html_chunk_size: int | None,
    ) -> None:
    def chunk_size_for(self, fmt: TextFormat) -> int | None:

class TranslatorsEngine(EngineBase):
    """Wrapper around the `translators` package with retry logic."""
    def __init__(self, provider: str, config: EngineConfig) -> None:
    def translate(self, request: EngineRequest) -> EngineResult:

class DeepTranslatorEngine(EngineBase):
    """Adapter for `deep-translator` providers with retry logic."""
    def __init__(self, provider: str, config: EngineConfig) -> None:
    def translate(self, request: EngineRequest) -> EngineResult:

class LlmEngine(EngineBase):
    """Shared logic for LLM backed engines."""
    def __init__(
        self,
        config: EngineConfig,
        client: Any,
        *,
        model: str,
        temperature: float,
        static_prolog: Mapping[str, str] | None = None,
    ) -> None:
    def translate(self, request: EngineRequest) -> EngineResult:
    def _build_messages(
        self,
        request: EngineRequest,
        voc: Mapping[str, str],
        merged: Mapping[str, str],
    ) -> list[dict[str, str]]:
    def _parse_payload(self, payload: str) -> tuple[str, dict[str, str]]:

class HysfEngine(EngineBase):
    """Specialised HYSF engine with fixed prompt semantics."""
    def __init__(self, config: EngineConfig, client: Any) -> None:
    def translate(self, request: EngineRequest) -> EngineResult:

def translate(self, request: EngineRequest) -> EngineResult:
    """Translate a chunk."""

def chunk_size_for(self, fmt: TextFormat) -> int | None:
    """Return preferred chunk size for the given text format."""

def __init__(
        self,
        name: str,
        chunk_size: int | None,
        html_chunk_size: int | None,
    ) -> None:

def chunk_size_for(self, fmt: TextFormat) -> int | None:

def __init__(self, provider: str, config: EngineConfig) -> None:

def _translate_with_retry(
        self, text: str, is_html: bool, source_lang: str, target_lang: str
    ) -> str:
    """Internal method with retry logic for network failures."""
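
The retry behaviour comes from tenacity's `retry`, `stop_after_attempt`, and `wait_exponential`; stripped of backoff timing, it reduces to a loop like this sketch (`retry_call` is illustrative, not the codebase's helper):

```python
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")

def retry_call(fn: Callable[[], T], attempts: int = 3) -> T:
    """Call fn, retrying on any exception up to the given number of attempts."""
    last_exc: Exception | None = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:  # retry on transient failures
            last_exc = exc
    assert last_exc is not None
    raise last_exc
```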

def translate(self, request: EngineRequest) -> EngineResult:

def _get_providers(cls) -> Mapping[str, type]:
    """Lazy load deep-translator providers."""

def __init__(self, provider: str, config: EngineConfig) -> None:

def _translate_with_retry(self, text: str, source_lang: str, target_lang: str) -> str:
    """Internal method with retry logic for network failures."""

def translate(self, request: EngineRequest) -> EngineResult:

def __init__(
        self,
        config: EngineConfig,
        client: Any,
        *,
        model: str,
        temperature: float,
        static_prolog: Mapping[str, str] | None = None,
    ) -> None:

def _invoke(self, messages: list[dict[str, str]]) -> str:

def translate(self, request: EngineRequest) -> EngineResult:

def _build_messages(
        self,
        request: EngineRequest,
        voc: Mapping[str, str],
        merged: Mapping[str, str],
    ) -> list[dict[str, str]]:

def _parse_payload(self, payload: str) -> tuple[str, dict[str, str]]:

def __init__(self, config: EngineConfig, client: Any) -> None:

def _invoke(self, message: str) -> str:

def translate(self, request: EngineRequest) -> EngineResult:

def _language_name(code: str) -> str:

def _make_openai_client(token: str, base_url: str | None) -> OpenAI:
    """Create an OpenAI client respecting optional base URL."""

def _build_llm_engine(
    selector: str,
    config: AbersetzConfig,
    engine_cfg: EngineConfig,
    *,
    profile: Mapping[str, Any] | None,
    client: Any | None,
) -> Engine:

def _build_hysf_engine(
    selector: str,
    config: AbersetzConfig,
    engine_cfg: EngineConfig,
    *,
    client: Any | None,
) -> Engine:

def _translators_provider(variant: str | None, engine_cfg: EngineConfig) -> str:

def _select_profile(engine_cfg: EngineConfig, variant: str | None) -> Mapping[str, Any] | None:

def create_engine(
    selector: str,
    config: AbersetzConfig,
    *,
    client: Any | None = None,
) -> Engine:
    """Factory that builds the requested engine supporting short aliases."""


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/openai_lite.py
# Language: python

from dataclasses import dataclass
from typing import Any
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

class ChatCompletionMessage:
    """Represents a message in the chat completion response."""

class ChatCompletionChoice:
    """Represents a choice in the chat completion response."""

class ChatCompletionResponse:
    """Represents the full chat completion response."""

class ChatCompletions:
    """Chat completions API interface."""
    def __init__(self, client: OpenAI):

class OpenAI:
    """Lightweight OpenAI client - drop-in replacement for the official SDK."""
    def __init__(self, api_key: str, base_url: str | None = None):
        """Initialize the OpenAI client."""

class Chat:
    """Chat API namespace with an optionally populated completions client."""
    def __init__(self) -> None:

def __init__(self, client: OpenAI):

def create(
        self, model: str, messages: list[dict[str, str]], temperature: float = 0.7, **kwargs: Any
    ) -> ChatCompletionResponse:
    """Create a chat completion."""

def __init__(self, api_key: str, base_url: str | None = None):
    """Initialize the OpenAI client."""

def __init__(self) -> None:
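
The lite client POSTs an OpenAI-compatible `/chat/completions` request; the JSON body it sends can be sketched as follows (field names per the public Chat Completions format; `build_chat_payload` is an illustrative helper, not the module's API):

```python
import json

def build_chat_payload(
    model: str, messages: list[dict[str, str]], temperature: float = 0.7
) -> str:
    """Serialize the request body for an OpenAI-compatible chat completion."""
    return json.dumps({"model": model, "messages": messages, "temperature": temperature})
```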


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/pipeline.py
# Language: python

import json
from collections.abc import Iterable
from dataclasses import dataclass, field
from pathlib import Path
from .chunking import TextFormat, chunk_text, detect_format
from .config import AbersetzConfig, load_config
from .engine_catalog import normalize_selector
from .engines import Engine, EngineRequest, EngineResult, create_engine
from loguru import logger

class TranslatorOptions:
    """Runtime options controlling translation behaviour."""

class TranslationResult:
    """Information about a translated artefact."""

class PipelineError(RuntimeError):
    """Raised when translation cannot proceed."""

def translate_path(
    path: Path | str,
    options: TranslatorOptions | None = None,
    *,
    config: AbersetzConfig | None = None,
    client: object | None = None,
) -> list[TranslationResult]:
    """Translate a file or directory tree."""
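
The locate → chunk → translate → merge flow, reduced to its essence for a single document, looks like this conceptual sketch (not the module's actual code; `translate_chunk` stands in for an engine call):

```python
from collections.abc import Callable

def run_pipeline(text: str, translate_chunk: Callable[[str], str], max_size: int) -> str:
    """Chunk the text, translate each piece, and merge the results in order."""
    chunks = [text[i : i + max_size] for i in range(0, len(text), max_size)]
    return "".join(translate_chunk(chunk) for chunk in chunks)
```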

def _merge_defaults((options: TranslatorOptions | None, config: AbersetzConfig)) -> TranslatorOptions:

def _discover_files((root: Path, opts: TranslatorOptions)) -> Iterable[Path]:

def _is_xcluded((path: Path, patterns: tuple[str, ...])) -> bool:

def _translate_file((
    source: Path,
    engine: Engine,
    opts: TranslatorOptions,
    config: AbersetzConfig,
)) -> TranslationResult:

def _apply_engine((
    engine: Engine,
    chunks: Iterable[str],
    fmt: TextFormat,
    opts: TranslatorOptions,
    config: AbersetzConfig,
)) -> tuple[list[EngineResult], dict[str, str]]:

def _build_request((
    chunk: str,
    index: int,
    total: int,
    fmt: TextFormat,
    opts: TranslatorOptions,
    config: AbersetzConfig,
    voc: dict[str, str],
    prolog: dict[str, str],
)) -> EngineRequest:

def _select_chunk_size((
    fmt: TextFormat,
    engine: Engine,
    opts: TranslatorOptions,
    config: AbersetzConfig,
)) -> int:

def _persist_output((
    source: Path,
    content: str,
    voc: dict[str, str],
    fmt: TextFormat,
    opts: TranslatorOptions,
    target_lang: str,
)) -> Path:


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/setup.py
# Language: python

import os
from collections.abc import Sequence
from dataclasses import dataclass, field
import httpx
from loguru import logger
from rich.console import Console
from rich.progress import Progress
from rich.table import Table
from .config import AbersetzConfig, Credential, EngineConfig, save_config
from .engine_catalog import (
    DEEP_TRANSLATOR_FREE_PROVIDERS,
    FREE_TRANSLATOR_PROVIDERS,
    HYSF_DEFAULT_MODEL,
    HYSF_DEFAULT_TEMPERATURE,
    PAID_TRANSLATOR_PROVIDERS,
    collect_deep_translator_providers,
    collect_translator_providers,
    normalize_selector,
)
from .validation import ValidationResult, validate_engines
import sys

class DiscoveredProvider:
    """Information about a discovered API provider."""

class SetupWizard:
    """Interactive setup wizard for abersetz configuration."""
    def __init__(self, non_interactive: bool = False, verbose: bool = False):
    def run(self) -> bool:
        """Run the setup wizard."""
    def _validate_config(self, config: AbersetzConfig) -> None:
        """Run validation after configuration is saved."""
    def _discover_providers(self) -> None:
        """Scan environment for API keys."""
    def _test_endpoints(self) -> None:
        """Test discovered endpoints with lightweight API calls."""
    def _test_single_endpoint(self, provider: DiscoveredProvider) -> None:
        """Test a single API endpoint."""
    def _display_results(self) -> None:
        """Display discovered providers in a table."""
    def _generate_config(self) -> AbersetzConfig | None:
        """Generate configuration from discovered providers."""


def _select_default_engine((
    engines: dict[str, EngineConfig],
    providers: Sequence[DiscoveredProvider],
)) -> str | None:
    """Choose the default engine based on configured priorities."""

def setup_command((
    non_interactive: bool = False,
    verbose: bool = False,
)) -> None:
    """Run the abersetz setup wizard."""
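`_discover_providers` scans environment variables for API keys before making any network calls. A minimal stdlib sketch of that scan pattern (the provider list and variable names here are illustrative assumptions, not abersetz's actual catalog):

```python
import os

# Illustrative provider -> env var map; the real wizard derives its own catalog.
CANDIDATE_KEYS = {
    "openai": "OPENAI_API_KEY",
    "deepl": "DEEPL_API_KEY",
    "siliconflow": "SILICONFLOW_API_KEY",
}

def discover_providers(env: dict[str, str]) -> list[str]:
    """Return provider names whose API key is present and non-empty."""
    return [name for name, var in CANDIDATE_KEYS.items() if env.get(var)]

print(discover_providers(dict(os.environ)))
```

Providers found this way would then be probed by `_test_endpoints` before they reach the generated config.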


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/src/abersetz/validation.py
# Language: python

from collections.abc import Callable, Iterable, Sequence
from dataclasses import dataclass
from time import perf_counter
from loguru import logger
from .config import AbersetzConfig, load_config
from .engine_catalog import normalize_selector
from .engines import EngineError, EngineRequest, create_engine

class ValidationResult:
    """Outcome of validating a single engine selector."""

def _append_selector(collection: list[str], seen: set[str], selector: str | None) -> None:

def _extract_providers(options: dict[str, object], key: str) -> list[str]:

def _selector_sort_key(selector: str) -> tuple[int, str]:

def _selectors_from_config(config: AbersetzConfig, include_defaults: bool) -> list[str]:

def _ensure_engine_request(sample_text: str, source_lang: str, target_lang: str) -> EngineRequest:

def validate_engines(
    config: AbersetzConfig | None = None,
    *,
    selectors: Iterable[str] | None = None,
    sample_text: str = "Hello, world!",
    source_lang: str = "auto",
    target_lang: str = "es",
    client: object | None = None,
    create_engine_fn: Callable[..., object] | None = None,
    include_defaults: bool = True,
) -> list[ValidationResult]:
    """Validate configured engines by performing a tiny translation."""
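`validate_engines` probes each selector with a tiny translation and records the outcome per engine. A hedged stdlib sketch of that sweep pattern (the result fields and engine callable are simplifications, not the real `ValidationResult` or engine API), using `perf_counter` as the module above does:

```python
from dataclasses import dataclass
from time import perf_counter

@dataclass
class ProbeResult:
    selector: str
    success: bool
    latency_ms: float
    error: str = ""

def validate(selectors, translate):
    """Probe each selector with a tiny translation, recording latency and failures."""
    results = []
    for selector in selectors:
        start = perf_counter()
        try:
            translate(selector, "Hello, world!")
            results.append(ProbeResult(selector, True, (perf_counter() - start) * 1000))
        except Exception as exc:  # a failing engine must not abort the sweep
            results.append(ProbeResult(selector, False, (perf_counter() - start) * 1000, str(exc)))
    return results

def fake_engine(selector, text):
    if selector == "bad":
        raise RuntimeError("no credential")
    return text.upper()

for result in validate(["tr/google", "bad"], fake_engine):
    print(result.selector, "ok" if result.success else result.error)
```

Catching per-selector failures keeps one broken credential from hiding results for the engines that do work.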


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/conftest.py
# Language: python

import sys
from pathlib import Path
import pytest

def _temp_config_dir(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> Path:
    """Isolate persisted config for each test run."""
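The fixture's job is to point config lookups at a throwaway directory so tests never touch the real config. A stdlib sketch of that isolation pattern outside pytest (the environment variable name here is hypothetical, chosen only to illustrate the save-override-restore shape):

```python
import os
from contextlib import contextmanager
from pathlib import Path
from tempfile import TemporaryDirectory

@contextmanager
def temp_config_dir(env_var: str = "ABERSETZ_CONFIG_DIR"):  # hypothetical variable name
    """Redirect config lookups to a fresh directory, restoring the old value on exit."""
    previous = os.environ.get(env_var)
    with TemporaryDirectory() as tmp:
        os.environ[env_var] = tmp
        try:
            yield Path(tmp)
        finally:
            if previous is None:
                os.environ.pop(env_var, None)
            else:
                os.environ[env_var] = previous

with temp_config_dir() as cfg_dir:
    print(cfg_dir.exists())
```

pytest's `monkeypatch` fixture automates exactly this restore step, which is why the real fixture takes it as a parameter.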


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_chunking.py
# Language: python

import builtins
from abersetz.chunking import TextFormat, chunk_text, detect_format

def test_detect_format_identifies_html() -> None:

def test_chunk_text_preserves_round_trip() -> None:

def test_html_chunking_returns_single_chunk() -> None:

def test_chunk_text_returns_empty_for_blank_input() -> None:

def test_chunk_text_fallback_runs_without_semantic_splitter(monkeypatch) -> None:

def fake_import(name, *args, **kwargs):
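These tests pin down two chunking invariants: blank input yields no chunks, and joining the chunks reproduces the input. A minimal sketch of a fallback splitter with those properties (fixed-size slicing is an assumption for illustration, not abersetz's actual algorithm):

```python
def naive_chunks(text: str, size: int) -> list[str]:
    """Split text into at-most-`size` slices; joining them reproduces the input."""
    if not text:
        return []
    return [text[i : i + size] for i in range(0, len(text), size)]

sample = "abersetz walks a locate-chunk-translate-merge pipeline."
pieces = naive_chunks(sample, 16)
print("".join(pieces) == sample)  # lossless round trip
```

The round-trip property is what lets the pipeline merge translated chunks back into a single output file.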


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_cli.py
# Language: python

from pathlib import Path
import pytest
import tomllib
from abersetz.chunking import TextFormat
from abersetz.cli import (
    AbersetzCLI,
    _collect_engine_entries,
    _load_json_data,
    _parse_patterns,
    _render_engine_entries,
    _render_results,
    _render_validation_entries,
    abtr_main,
    main,
)
from abersetz.config import AbersetzConfig, Credential, Defaults, EngineConfig, save_config
from abersetz.pipeline import PipelineError, TranslationResult, TranslatorOptions
from abersetz.validation import ValidationResult
import io
from rich.console import Console

class DummyLogger:
    def __init__(self) -> None:
    def remove(self) -> None:
    def add(self, *args, **kwargs):
    def debug(self, message: str, *args, **kwargs) -> None:

def test_cli_translate_wires_arguments(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:

def fake_translate_path(path: str, options: TranslatorOptions):

def test_cli_translate_accepts_path_output(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:

def fake_translate_path(path: str | Path, options: TranslatorOptions):

def test_cli_accepts_legacy_engine_selector(
    monkeypatch: pytest.MonkeyPatch, tmp_path: Path
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions):

def test_cli_translate_reports_pipeline_error(
    monkeypatch: pytest.MonkeyPatch, tmp_path: Path
) -> None:

def fake_print(message: str) -> None:

def fake_translate_path(path: str, options: TranslatorOptions):

def test_parse_patterns_handles_none_and_iterables() -> None:

def test_load_json_data_prefers_files(tmp_path: Path) -> None:

def test_render_engine_entries_handles_empty(monkeypatch: pytest.MonkeyPatch) -> None:

def test_render_validation_entries_handles_empty(monkeypatch: pytest.MonkeyPatch) -> None:

def test_render_results_lists_destinations(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:

def _stub_engine_entries(monkeypatch: pytest.MonkeyPatch) -> AbersetzConfig:

def test_collect_engine_entries_handles_provider_strings(
    _stub_engine_entries: AbersetzConfig,
) -> None:

def test_collect_engine_entries_family_accepts_long_name(
    _stub_engine_entries: AbersetzConfig,
) -> None:

def test_collect_engine_entries_configured_only_with_family(
    _stub_engine_entries: AbersetzConfig,
) -> None:

def test_collect_engine_entries_accepts_single_provider_string(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_collect_engine_entries_string_branches(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_collect_engine_entries_handles_deep_translator_string_providers(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_cli_config_commands_show_and_path(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:

def test_cli_lang_lists_languages(monkeypatch: pytest.MonkeyPatch) -> None:

def test_cli_verbose_logs_translation_details(
    monkeypatch: pytest.MonkeyPatch, tmp_path: Path
) -> None:

def __init__(self) -> None:

def remove(self) -> None:

def add(self, *args, **kwargs):

def debug(self, message: str, *args, **kwargs) -> None:

def fake_translate_path(path: str, options: TranslatorOptions):

def test_cli_engines_lists_configured_providers(monkeypatch: pytest.MonkeyPatch) -> None:

def test_cli_engines_supports_filters(monkeypatch: pytest.MonkeyPatch) -> None:

def render(
        family: str | None = None, *, configured_only: bool = False, include_paid: bool = False
    ) -> str:

def test_cli_validate_renders_results(monkeypatch: pytest.MonkeyPatch) -> None:

def test_cli_validate_accepts_selector_string(monkeypatch: pytest.MonkeyPatch) -> None:

def fake_validate(config: AbersetzConfig, **kwargs: object):

def test_cli_setup_forwards_flags(monkeypatch: pytest.MonkeyPatch) -> None:

def fake_setup_command(*, non_interactive: bool, verbose: bool) -> None:

def test_cli_main_invokes_fire(monkeypatch: pytest.MonkeyPatch) -> None:

def fake_fire(target: object, *args: object, **kwargs: object) -> None:

def test_cli_abtr_main_invokes_fire_with_tr(monkeypatch: pytest.MonkeyPatch) -> None:

def fake_fire(target: object, *args: object, **kwargs: object) -> None:


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_config.py
# Language: python

from pathlib import Path
import pytest
import abersetz.config as config_module
try:
    import tomllib
except ModuleNotFoundError:  # Python < 3.11 fallback
    import tomli as tomllib
import platform
from loguru import logger

def test_load_config_yields_defaults(tmp_path: Path) -> None:

def test_save_config_persists_changes(tmp_path: Path) -> None:

def test_resolve_credential_prefers_environment(monkeypatch: pytest.MonkeyPatch) -> None:

def test_load_config_handles_malformed_toml(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    """Test that malformed TOML config files are handled gracefully."""

def test_load_config_handles_permission_error(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    """Test that permission errors are handled gracefully."""

def test_defaults_normalize_legacy_selector() -> None:

def test_defaults_from_dict_normalizes_selector() -> None:

def test_defaults_from_dict_when_none_returns_defaults() -> None:

def test_engine_config_from_dict_when_none_returns_empty_block() -> None:

def test_engine_config_to_dict_includes_optional_fields() -> None:

def test_credential_to_dict_includes_optional_fields() -> None:

def test_credential_from_any_rejects_unsupported_payload() -> None:

def test_credential_from_any_handles_mapping_payload() -> None:

def test_config_dir_when_env_missing_then_uses_platformdirs(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def test_resolve_credential_when_env_missing_then_logs_hint(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_resolve_credential_recurses_into_stored_secret(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_resolve_credential_with_recursive_name_logs_once(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_resolve_credential_returns_none_for_null_reference() -> None:

def test_resolve_credential_reuses_stored_alias_object() -> None:


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_engine_catalog.py
# Language: python

import builtins
import sys
from types import SimpleNamespace
from typing import Any
import pytest
from abersetz.engine_catalog import (
    _filter_available,
    collect_deep_translator_providers,
    collect_translator_providers,
    normalize_selector,
    resolve_engine_reference,
)

def test_normalize_selector_converts_long_to_short() -> None:

def test_normalize_selector_is_idempotent() -> None:

def test_normalize_selector_preserves_unknowns() -> None:

def test_normalize_selector_returns_none_for_none() -> None:

def test_normalize_selector_handles_blank_input() -> None:

def test_normalize_selector_handles_missing_base() -> None:

def test_resolve_engine_reference_handles_short_alias() -> None:

def test_resolve_engine_reference_handles_long_selector() -> None:

def test_resolve_engine_reference_handles_base_only_alias() -> None:

def test_filter_available_when_allowed_duplicates_then_dedupes() -> None:

def test_collect_translator_providers_when_import_fails_then_returns_empty(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def fake_import(name: str, *args: Any, **kwargs: Any):

def test_collect_translator_providers_when_paid_requested_then_keeps_order(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_collect_deep_translator_providers_include_paid_appends_once() -> None:


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_engines.py
# Language: python

import sys
from types import SimpleNamespace
import pytest
from langcodes import get as get_language
import abersetz.config as config_module
import abersetz.engines as engines_module
from abersetz.chunking import TextFormat
from abersetz.engines import EngineBase, EngineError, EngineRequest, create_engine
from abersetz.engines import DeepTranslatorEngine

class DummyClient:
    """Simple stub mimicking OpenAI chat completions."""
    def __init__(self, payload: str):
    def _create(self, **kwargs: object) -> SimpleNamespace:

class MockTranslator:
    def __init__(self, source: str, target: str):
    def translate(self, text: str) -> str:

def __init__(self, payload: str):

def _create(self, **kwargs: object) -> SimpleNamespace:

def test_translators_engine_invokes_library(monkeypatch: pytest.MonkeyPatch) -> None:

def fake_translate_text(
        text: str, translator: str, from_language: str, to_language: str, **_: object
    ) -> str:

def test_translators_engine_handles_html_requests(monkeypatch: pytest.MonkeyPatch) -> None:

def fake_translate_html(
        text: str, translator: str, from_language: str, to_language: str, **_: object
    ) -> str:

def test_hysf_engine_uses_fixed_prompt(monkeypatch: pytest.MonkeyPatch) -> None:

def test_engine_base_chunk_size_prefers_html_then_plain() -> None:

def test_ullm_engine_uses_profile(monkeypatch: pytest.MonkeyPatch) -> None:

def test_translators_engine_retry_on_failure(monkeypatch: pytest.MonkeyPatch) -> None:
    """Test that TranslatorsEngine retries on network failures."""

def fake_translate_with_retry(
        text: str, translator: str, from_language: str, to_language: str, **_: object
    ) -> str:

def test_create_engine_accepts_legacy_selector() -> None:

def test_deep_translator_engine_retry_on_failure(monkeypatch: pytest.MonkeyPatch) -> None:
    """Test that DeepTranslatorEngine retries on network failures."""

def __init__(self, source: str, target: str):

def translate(self, text: str) -> str:

def test_deep_translator_engine_rejects_unknown_provider(monkeypatch: pytest.MonkeyPatch) -> None:

def test_build_llm_engine_without_model_raises_engine_error() -> None:

def test_build_llm_engine_without_credential_raises_engine_error(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_build_hysf_engine_without_credential_raises(monkeypatch: pytest.MonkeyPatch) -> None:

def test_select_profile_defaults_to_default_variant() -> None:

def test_select_profile_without_profiles_returns_none() -> None:

def test_select_profile_unknown_variant_raises_engine_error() -> None:

def test_make_openai_client_respects_base_url() -> None:

def test_make_openai_client_defaults_to_openai_url() -> None:

def test_create_engine_with_unknown_configured_base_raises_engine_error() -> None:

def _make_llm_engine() -> engines_module.LlmEngine:

def test_llm_engine_parse_payload_without_vocab() -> None:

def test_llm_engine_parse_payload_with_malformed_vocab() -> None:

def test_llm_engine_parse_payload_with_non_mapping_vocab() -> None:

def test_create_engine_raises_when_config_missing_selector() -> None:


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_examples.py
# Language: python

import asyncio
import importlib.util
import json
import runpy
import sys
from collections.abc import Callable
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Protocol, cast
import pytest
from abersetz.chunking import TextFormat
from abersetz.pipeline import TranslationResult, TranslatorOptions

class _StubResult:
    def __init__(
        self, source: str, destination: str, *, fmt: TextFormat = TextFormat.PLAIN
    ) -> None:

class _BasicApiModule(Protocol):
    def format_example_doc(self, func: Callable[..., object]) -> str:
    def example_simple(self) -> None:
    def example_batch(self) -> None:
    def example_dry_run(self) -> None:
    def example_html(self) -> None:
    def example_with_config(self) -> None:
    def example_llm_with_voc(self) -> None:
    def cli(self, example: str | None = None) -> None:

class _Defaults:

class _Config:

class _StubEngine:
    def __init__(self, name: str) -> None:
    def translate(self, request: Any):

def __init__(
        self, source: str, destination: str, *, fmt: TextFormat = TextFormat.PLAIN
    ) -> None:

def format_example_doc(self, func: Callable[..., object]) -> str:

def example_simple(self) -> None:

def example_batch(self) -> None:

def example_dry_run(self) -> None:

def example_html(self) -> None:

def example_with_config(self) -> None:

def example_llm_with_voc(self) -> None:

def cli(self, example: str | None = None) -> None:

def _load_basic_api() -> _BasicApiModule:

def _load_advanced_api():

def test_format_example_doc_handles_none() -> None:

def _no_doc() -> None:

def test_format_example_doc_strips_whitespace() -> None:

def _with_doc() -> None:
    """Example description with padding"""

def test_example_simple_outputs_summary(
    monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions) -> list[_StubResult]:

def test_example_batch_uses_include_filters(
    monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions) -> list[_StubResult]:

def test_example_dry_run_lists_files(
    monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions) -> list[_StubResult]:

def test_example_html_preserves_markup_intent(
    monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions) -> list[_StubResult]:

def test_example_with_config_uses_modified_defaults(monkeypatch: pytest.MonkeyPatch) -> None:

def fake_load_config() -> _Config:

def fake_save_config(value: _Config) -> None:

def fake_translate_path(
        path: str, options: TranslatorOptions | None = None, *, config: _Config
    ) -> list[_StubResult]:

def test_example_llm_with_voc_reports_final_vocab(
    monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions) -> list[_StubResult]:

def test_translation_workflow_translate_project_collects_results_and_errors(
    monkeypatch: pytest.MonkeyPatch,
    tmp_path: Path,
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions, *, config: object | None = None):

def test_translation_workflow_generate_report_creates_parent_dirs(tmp_path: Path) -> None:

def test_translation_workflow_lazy_loads_config(monkeypatch: pytest.MonkeyPatch) -> None:

def fake_load_config() -> object:

def test_voc_manager_translate_with_consistency_preserves_base_voc(
    monkeypatch: pytest.MonkeyPatch,
    capsys: pytest.CaptureFixture[str],
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions) -> list[TranslationResult]:

def test_voc_manager_load_and_merge(tmp_path: Path) -> None:

def test_parallel_translator_compare_translations_handles_failures(
    monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
) -> None:

def __init__(self, name: str) -> None:

def translate(self, request: Any):

def fake_create_engine(name: str, config: object):

def test_example_voc_consistency_writes_vocab(
    monkeypatch: pytest.MonkeyPatch,
    tmp_path: Path,
    capsys: pytest.CaptureFixture[str],
) -> None:

def fake_translate_with_consistency(
        *, files: list[str], to_lang: str, base_voc: dict[str, str]
    ):

def test_example_parallel_comparison_invokes_async_run(
    monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
) -> None:

def fake_compare(text: str, engines: list[str], to_lang: str):

def fake_run(coro):

def test_example_incremental_translation_processes_files(
    monkeypatch: pytest.MonkeyPatch,
    tmp_path: Path,
    capsys: pytest.CaptureFixture[str],
) -> None:

def fake_translate_path(
        path: str,
        options: TranslatorOptions,
        *,
        config: object | None = None,
    ) -> list[TranslationResult]:

def test_example_incremental_translation_reuses_checkpoint(
    monkeypatch: pytest.MonkeyPatch,
    tmp_path: Path,
    capsys: pytest.CaptureFixture[str],
) -> None:

def fake_translate_path(
        path: str,
        options: TranslatorOptions,
        *,
        config: object | None = None,
    ) -> list[TranslationResult]:

def test_basic_api_cli_dispatch_runs_requested_example(
    monkeypatch: pytest.MonkeyPatch,
    capsys: pytest.CaptureFixture[str],
) -> None:

def fake_translate_path(path: str, options: TranslatorOptions) -> list[_StubResult]:

def test_basic_api_cli_usage_banner(
    monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str]
) -> None:

def test_advanced_api_cli_dispatch_runs_requested_example(
    monkeypatch: pytest.MonkeyPatch,
    capsys: pytest.CaptureFixture[str],
) -> None:

def fake_translate_path(
        path: str,
        options: TranslatorOptions,
        *,
        config: object | None = None,
    ) -> list[TranslationResult]:

def test_advanced_api_cli_usage_banner(
    monkeypatch: pytest.MonkeyPatch,
    capsys: pytest.CaptureFixture[str],
) -> None:


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_integration.py
# Language: python

import os
import pytest
from abersetz import TranslatorOptions, translate_path
from abersetz.config import load_config
from abersetz.engines import EngineRequest, create_engine
from unittest.mock import patch
import requests

def test_translators_google_real() -> None:
    """Test Google Translate via translators library (requires network)."""

def test_deep_translator_google_real() -> None:
    """Test Google Translate via deep-translator library (requires network)."""

def test_hysf_engine_real() -> None:
    """Test Siliconflow translation engine (requires API key)."""

def test_translate_file_api(tmp_path) -> None:
    """Test the high-level translate_path API."""

def test_html_translation() -> None:
    """Test HTML content translation preserves markup."""

def test_translators_bing_real() -> None:
    """Test Bing Translate via translators library (requires network)."""

def test_batch_translation_with_voc() -> None:
    """Test translating multiple chunks with voc propagation."""

def test_retry_on_network_failure() -> None:
    """Test that retry mechanism works for real network issues."""

def flaky_get(*args, **kwargs):


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_offline.py
# Language: python

import tempfile
from pathlib import Path
import pytest
import abersetz
from abersetz import TranslatorOptions, translate_path
from abersetz.cli import AbersetzCLI
from abersetz.config import AbersetzConfig
from abersetz.pipeline import PipelineError, TranslationResult

def test_cli_help_works_offline() -> None:
    """Verify CLI help can be accessed without network."""

def test_config_commands_work_offline() -> None:
    """Verify config commands work without network."""

def test_dry_run_works_offline() -> None:
    """Verify dry run mode works without network access."""

def test_input_validation_works_offline() -> None:
    """Verify input validation works without network."""

def test_empty_file_handling_works_offline() -> None:
    """Verify empty file handling works without network."""

def test_import_works_offline() -> None:
    """Verify basic imports work without network."""

def test_edge_case_files_offline(file_content: str) -> None:
    """Verify edge case files are handled offline."""


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_openai_lite.py
# Language: python

from typing import Any, get_type_hints
import httpx
import pytest
from abersetz.openai_lite import Chat, ChatCompletions, OpenAI

class _DummyResponse:
    def __init__(self, status_code: int, payload: dict[str, Any]) -> None:
    def raise_for_status(self) -> None:
    def json(self) -> dict[str, Any]:

class _DummyClient:
    def __init__(self, response: _DummyResponse, calls: list[dict[str, Any]]) -> None:
    def __enter__(self) -> _DummyClient:
    def __exit__(self, exc_type, exc, tb) -> None:
    def post(self, url: str, *, json: dict[str, Any], headers: dict[str, str]) -> _DummyResponse:

def __init__(self, status_code: int, payload: dict[str, Any]) -> None:

def raise_for_status(self) -> None:

def json(self) -> dict[str, Any]:

def __init__(self, response: _DummyResponse, calls: list[dict[str, Any]]) -> None:

def __enter__(self) -> _DummyClient:

def __exit__(self, exc_type, exc, tb) -> None:

def post(self, url: str, *, json: dict[str, Any], headers: dict[str, str]) -> _DummyResponse:

def test_chat_completions_create_parses_response(monkeypatch: pytest.MonkeyPatch) -> None:

def test_chat_completions_create_raises_for_http_errors(monkeypatch: pytest.MonkeyPatch) -> None:

def test_openai_base_url_trims_trailing_slash() -> None:

def test_chat_declares_completions_attribute() -> None:

def test_openai_initializes_chat_completions() -> None:
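`test_openai_base_url_trims_trailing_slash` pins down one detail of the lightweight client: the base URL is normalized before request paths are composed, so joins never double a slash. A sketch of that normalization (this `LiteClient` is a stand-in for illustration, not the real `OpenAI` wrapper from `openai_lite`):

```python
class LiteClient:
    """Stores a normalized base URL so path joins never produce '//'."""

    def __init__(self, base_url: str, api_key: str) -> None:
        self.base_url = base_url.rstrip("/")  # trim trailing slash once, up front
        self.api_key = api_key

    def completions_url(self) -> str:
        return f"{self.base_url}/chat/completions"

client = LiteClient("https://api.example.com/v1/", "sk-test")
print(client.completions_url())
```

Normalizing once in the constructor keeps every request-building method free of slash-handling logic.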


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_package.py
# Language: python

import pytest
import abersetz

def test_version() -> None:
    """Verify package exposes version."""

def test_getattr_rejects_unknown_symbol() -> None:
    """Ensure lazy exports fail loudly for typos while caching successes."""


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_pipeline.py
# Language: python

import os
import sys
from pathlib import Path
import pytest
from abersetz.chunking import TextFormat
from abersetz.config import AbersetzConfig
from abersetz.engines import EngineResult
from abersetz.pipeline import PipelineError, TranslatorOptions, translate_path
from loguru import logger

class DummyEngine:
    """Minimal engine used for pipeline tests."""
    def __init__(self) -> None:
    def chunk_size_for(self, _fmt) -> int:
    def translate(self, request) -> EngineResult:

class ChunkyEngine(DummyEngine):
    def __init__(self) -> None:
    def chunk_size_for(self, fmt) -> int:

class TrackingDummy(DummyEngine):
    def __init__(self) -> None:
    def chunk_size_for(self, fmt: TextFormat) -> int:

class HtmlEngine(DummyEngine):
    def __init__(self) -> None:
    def chunk_size_for(self, fmt: TextFormat) -> int:

class TrackingHtmlEngine(DummyEngine):
    def __init__(self) -> None:
    def chunk_size_for(self, fmt: TextFormat) -> int:

def __init__(self) -> None:

def chunk_size_for(self, _fmt) -> int:

def translate(self, request) -> EngineResult:

def test_translate_path_processes_files(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:

def test_translate_path_accepts_string_source_paths(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def test_translate_path_normalizes_engine_selector(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def fake_create_engine(selector: str, config, client=None):

def test_translate_path_requires_matches(tmp_path: Path) -> None:

def test_translate_path_uses_engine_chunk_size_when_defaults_falsy(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def __init__(self) -> None:

def chunk_size_for(self, fmt) -> int:

def fake_create_engine(selector, config, client=None):

def test_translate_path_uses_dummy_chunk_size_when_defaults_zero(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def __init__(self) -> None:

def chunk_size_for(self, fmt: TextFormat) -> int:

def test_translate_path_html_uses_engine_chunk_hint(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def __init__(self) -> None:

def chunk_size_for(self, fmt: TextFormat) -> int:

def test_translate_path_with_html_engine_handles_mixed_formats(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def __init__(self) -> None:

def chunk_size_for(self, fmt: TextFormat) -> int:

def test_translate_path_handles_mixed_formats(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def test_translate_path_errors_on_unreadable_file(tmp_path: Path) -> None:

def test_translate_path_write_over_updates_source(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def test_translate_path_dry_run_skips_io(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:

def test_translate_path_warns_on_large_file(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:

def fake_stat(self: Path, *, follow_symlinks: bool = True) -> os.stat_result:


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_setup.py
# Language: python

from collections.abc import Sequence
from typing import Any
import httpx
import pytest
from abersetz.config import AbersetzConfig
from abersetz.engine_catalog import normalize_selector
from abersetz.setup import (
    DiscoveredProvider,
    EngineConfig,
    SetupWizard,
    _select_default_engine,
    setup_command,
)
import io
import re
from rich.console import Console
from loguru import logger
from abersetz.validation import ValidationResult

class _StubProgress:
    def __enter__((self)) -> _StubProgress:
    def __exit__((self, exc_type, exc, tb)) -> None:
    def add_task((self, description: str, total: int)) -> str:
    def update((self, task: str, advance: int)) -> None:

class _DummyResponse:

class _DummyClient:
    def __init__((self, *args: Any, **kwargs: Any)) -> None:
    def __enter__((self)) -> _DummyClient:
    def __exit__((self, exc_type, exc, tb)) -> None:
    def get((self, url: str, headers: dict[str, str])):

class _ListResponse:

class _Client:
    def __enter__((self)) -> _Client:
    def __exit__((self, exc_type, exc, tb)) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _Response:

class _Client:
    def __enter__(self) -> _Client:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _TimeoutClient:
    def __init__(self, *args: Any, **kwargs: Any) -> None:
    def __enter__(self) -> _TimeoutClient:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _HttpErrorResponse:

class _HttpErrorClient:
    def __enter__(self) -> _HttpErrorClient:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _Client:
    def __enter__(self) -> _Client:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _Resp:

class _OddResponse:

class _Client:
    def __enter__(self) -> _Client:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _FailureResponse:

class _Client:
    def __enter__(self) -> _Client:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _Client:
    def __enter__(self) -> _Client:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _Progress:
    def __init__(self, *args: Any, **kwargs: Any) -> None:
    def __enter__(self) -> _Progress:
    def __exit__(self, exc_type, exc, tb) -> None:
    def add_task(self, description: str, total: int) -> str:
    def update(self, task: str, advance: int) -> None:

class _Client:
    def __enter__(self) -> _Client:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _Response:

class _Client:
    def __enter__(self) -> _Client:
    def __exit__(self, exc_type, exc, tb) -> None:
    def get(self, url: str, headers: dict[str, str]):

class _Wizard:
    def __init__(self, *args: Any, **kwargs: Any) -> None:
    def run(self) -> bool:

def _stub_phase(*args: Any, **kwargs: Any) -> None:

def __enter__(self) -> _StubProgress:

def __exit__(self, exc_type, exc, tb) -> None:

def add_task(self, description: str, total: int) -> str:

def update(self, task: str, advance: int) -> None:

def test_setup_wizard_triggers_validation(monkeypatch) -> None:

def test_setup_wizard_skips_validation_when_no_config(monkeypatch) -> None:

def test_discover_providers_adds_pricing_hint(monkeypatch) -> None:

def test_discover_providers_includes_deepl_engine_mapping(monkeypatch) -> None:

def test_display_results_shows_pricing_column(monkeypatch) -> None:

def test_generate_config_builds_engines(monkeypatch) -> None:

def test_generate_config_prefers_hysf_when_translators_unavailable(monkeypatch) -> None:

def test_generate_config_defaults_to_ullm_when_only_openai(monkeypatch) -> None:

def test_test_single_endpoint_success(monkeypatch) -> None:

def json() -> dict[str, list[int]]:

def __init__(self, *args: Any, **kwargs: Any) -> None:

def __enter__(self) -> _DummyClient:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_test_single_endpoint_parses_list_payload(monkeypatch) -> None:

def json() -> list[int]:

def __enter__(self) -> _Client:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_test_single_endpoint_logs_verbose_status(monkeypatch) -> None:

def json() -> dict[str, list[int]]:

def __enter__(self) -> _Client:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_test_single_endpoint_timeout(monkeypatch) -> None:

def __init__(self, *args: Any, **kwargs: Any) -> None:

def __enter__(self) -> _TimeoutClient:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_test_single_endpoint_http_error(monkeypatch) -> None:

def json() -> dict[str, list[int]]:

def __enter__(self) -> _HttpErrorClient:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_validate_config_logs_failures(monkeypatch) -> None:

def fake_warning(message: str, selector: str, error: str) -> None:

def test_validate_config_returns_immediately_when_no_results(monkeypatch) -> None:

def fail_print(*args: Any, **kwargs: Any) -> None:

def test_test_endpoints_handles_non_api_providers(monkeypatch) -> None:

def test_test_endpoints_invokes_single_endpoint_for_api_providers(monkeypatch) -> None:

def _capture_single_endpoint(self: SetupWizard, provider: DiscoveredProvider) -> None:

def test_test_single_endpoint_uses_anthropic_headers(monkeypatch) -> None:

def __enter__(self) -> _Client:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def json() -> dict[str, list[int]]:

def test_test_single_endpoint_defaults_model_count_for_unknown_payload(monkeypatch) -> None:

def json() -> str:

def __enter__(self) -> _Client:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_test_single_endpoint_logs_failure_when_verbose(monkeypatch) -> None:

def json() -> dict[str, str]:

def __enter__(self) -> _Client:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_test_single_endpoint_general_exception(monkeypatch) -> None:

def __enter__(self) -> _Client:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_generate_config_returns_none_when_empty() -> None:

def test_setup_wizard_run_interactive_success(monkeypatch) -> None:

def __init__(self, *args: Any, **kwargs: Any) -> None:

def __enter__(self) -> _Progress:

def __exit__(self, exc_type, exc, tb) -> None:

def add_task(self, description: str, total: int) -> str:

def update(self, task: str, advance: int) -> None:

def fake_discover(self: SetupWizard) -> None:

def fake_generate(self: SetupWizard) -> AbersetzConfig:

def test_setup_wizard_run_interactive_no_config(monkeypatch) -> None:

def fake_discover(self: SetupWizard) -> None:

def test_validate_config_renders_table(monkeypatch) -> None:

def test_discover_providers_verbose_logs(monkeypatch) -> None:

def test_test_single_endpoint_connect_error(monkeypatch) -> None:

def __enter__(self) -> _Client:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_test_single_endpoint_handles_json_errors(monkeypatch) -> None:

def json() -> dict[str, Any]:

def __enter__(self) -> _Client:

def __exit__(self, exc_type, exc, tb) -> None:

def get(self, url: str, headers: dict[str, str]):

def test_test_single_endpoint_no_base_url() -> None:

def test_display_results_no_providers(monkeypatch) -> None:

def test_select_default_engine_prefers_deepl(monkeypatch: pytest.MonkeyPatch) -> None:

def test_select_default_engine_prefers_translators_then_hysf(
    monkeypatch: pytest.MonkeyPatch,
) -> None:

def test_select_default_engine_prefers_ullm_when_present(monkeypatch: pytest.MonkeyPatch) -> None:

def test_select_default_engine_falls_back_to_first_engine() -> None:

def test_select_default_engine_returns_none_when_empty() -> None:

def test_generate_config_uses_fallbacks(monkeypatch) -> None:

def test_setup_command_exits_on_failure(monkeypatch) -> None:

def test_setup_command_succeeds(monkeypatch) -> None:

def __init__(self, *args: Any, **kwargs: Any) -> None:

def run(self) -> bool:


# File: /Users/adam/Developer/vcs/github.twardoch/pub/abersetz/tests/test_validation.py
# Language: python

from dataclasses import dataclass
from typing import Any
from abersetz import validation
from abersetz.config import AbersetzConfig, Defaults, EngineConfig
from abersetz.engines import EngineError, EngineRequest, EngineResult
from abersetz.validation import validate_engines

class _StubEngine:
    def translate(self, request: EngineRequest) -> EngineResult:
    def chunk_size_for(self, fmt: Any) -> int | None:

def translate(self, request: EngineRequest) -> EngineResult:

def chunk_size_for(self, fmt: Any) -> int | None:

def _build_config() -> AbersetzConfig:

def test_validate_engines_collects_results() -> None:

def fake_create_engine(
        selector: str, cfg: AbersetzConfig, *, client: Any | None = None
    ) -> _StubEngine:

def test_validate_engines_handles_failures() -> None:

def fake_create_engine(
        selector: str, cfg: AbersetzConfig, *, client: Any | None = None
    ) -> _StubEngine:

def test_validate_engines_limits_selectors() -> None:

def fake_create_engine(
        selector: str, cfg: AbersetzConfig, *, client: Any | None = None
    ) -> _StubEngine:

def test_validate_engines_flags_empty_translations() -> None:

def fake_create_engine(
        selector: str, cfg: AbersetzConfig, *, client: Any | None = None
    ) -> _StubEngine:

def test_append_selector_handles_empty_and_duplicates() -> None:

def test_extract_providers_merges_lists_and_fallback() -> None:

def test_selectors_from_config_collects_all_engines() -> None:
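
Each `test_validate_engines_*` case above patches the engine factory with a `fake_create_engine` that returns a `_StubEngine`, so `validate_engines` can be exercised without network calls. A minimal self-contained sketch of that pattern — the dataclasses here only mimic the shape of abersetz's `EngineRequest`/`EngineResult`, and the selector string is illustrative:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


# Stand-ins mimicking the shape of abersetz's EngineRequest/EngineResult.
@dataclass
class EngineRequest:
    text: str
    source_lang: str
    target_lang: str


@dataclass
class EngineResult:
    text: str
    voc: dict[str, str] = field(default_factory=dict)


class _StubEngine:
    """Returns a canned translation so validation logic runs offline."""

    def translate(self, request: EngineRequest) -> EngineResult:
        return EngineResult(text=f"[{request.target_lang}] {request.text}")

    def chunk_size_for(self, fmt: Any) -> int | None:
        return None  # no format-specific chunk limit


def fake_create_engine(
    selector: str, cfg: Any, *, client: Any | None = None
) -> _StubEngine:
    # Tests monkeypatch the real factory with one like this,
    # e.g. monkeypatch.setattr(validation, "create_engine", fake_create_engine).
    return _StubEngine()


engine = fake_create_engine("translators/google", cfg=None)
result = engine.translate(EngineRequest("Hello", "en", "pl"))
print(result.text)  # [pl] Hello
```

Because the stub echoes its input with a language tag, assertions can check both that every selector was exercised and that no translation came back empty — which is what `test_validate_engines_flags_empty_translations` guards against.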


<document index="46">
<source>translation_report.json</source>
<document_content>
{
  "total_files": 5,
  "total_chunks": 5,
  "languages": {
    "build": {
... (file content truncated to first 5 lines)
</document_content>
</document>

</documents>