Metadata-Version: 2.4
Name: ankiazvox
Version: 0.6.0
Summary: A CLI tool to synchronize Anki notes with high-quality Azure TTS audio.
License: MIT
License-File: LICENSE
Author: Eric Xu
Author-email: xulihua2006@gmail.com
Requires-Python: >=3.11,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: azure-cognitiveservices-speech (>=1.47.0,<2.0.0)
Requires-Dist: beautifulsoup4 (>=4.14.3,<5.0.0)
Requires-Dist: click (>=8.3.1,<9.0.0)
Requires-Dist: python-dotenv (>=1.2.1,<2.0.0)
Requires-Dist: pyyaml (>=6.0.3,<7.0.0)
Requires-Dist: requests (>=2.32.5,<3.0.0)
Requires-Dist: tqdm (>=4.67.1,<5.0.0)
Description-Content-Type: text/markdown

# **ankiazvox**

**ankiazvox** is a professional-grade CLI tool that synchronizes Anki notes with high-quality Azure Neural TTS audio. By leveraging cloud-based Neural voices, it automates text extraction, sanitization, and card updates via AnkiConnect, transforming text-only decks into immersive audio-visual learning tools.

## **✨ New in v0.6.0**

* **Concurrency / Performance**: Parallel synthesis with `--workers/-w` to speed up large sync jobs.
* **Overwrite & Debug**: `--overwrite` replaces existing audio; `--debug` prints extra diagnostics for troubleshooting.

## **📌 v0.5.0 Release Highlights**

* **azv init**: An interactive onboarding setup that walks you through connecting your Azure account and setting your preferred default voice.
* **Field Mapping**: Efficiency-focused syncing that processes multiple fields simultaneously via the `--fields` flag (e.g., `Word:Audio;Sent:SentAudio`).
* **Prosody Control**: Fine-tune the listening experience with `--rate` and `--pitch` flags, allowing you to slow down complex phrases or adjust tone for clarity.
* **SSML Support**: Enhanced processing that preserves natural phrasing by converting `<br>` to pauses and providing full support for raw SSML input fields.

## **🚀 Installation**

### **1\. Prerequisites**

* **Anki Desktop**: Must be running with the [AnkiConnect](https://ankiweb.net/shared/info/2055492159) add-on installed and configured.  
* **Azure Speech Service**: An active subscription key and region from the Azure portal (Azure offers a generous free tier for speech services).

### **2\. Setup & Configuration**

Install the package and run the initializer to create your azv\_config.yaml file:

```
pip install ankiazvox  
azv init
```

## **🛠 Usage**

### **1\. Synchronize Audio (sync)**

The sync command generates audio for notes that match a specific Anki search query. It handles the batch processing of voice synthesis and media management automatically.

**Basic Single-Field Sync:**

Sync text from "Front" and save audio tag to "Audio"  
```
azv sync -q "deck:English::Vocabulary" -s "Front" -t "Audio"
```

**Advanced Multi-Field Sync with Prosody:**

Process Word and Sentence fields at 85% speed with a slight pitch increase
```  
azv sync -q "deck:JP::Grammar" -f "Word:WordAud;Sent:SentAud" --rate 0.85 --pitch +5%
```

| Option | Short | Description |
| :---- | :---- | :---- |
| `--config` |  | Path to a config file (yaml or .env). The tool also auto-detects `azv_config.yml` or `.env` if present |
| `--query` | `-q` | Anki search query (standard Anki search syntax) |
| `--fields` | `-f` | Key-value mapping: `source1:target1;source2:target2` |
| `--source` | `-s` | Name of the field containing source text |
| `--target` | `-t` | Name of the field to store the `[sound:...]` tag |
| `--rate` |  | Synthesis speed (1.0 is normal; 0.8 is 80% speed) |
| `--pitch` |  | Pitch adjustment (e.g., `+10%` or `-5%`) |
| `--voice` | `-v` | Override the default neural voice for this session |
| `--overwrite` |  | Replace existing audio in the target field if present |
| `--ssml-source` |  | Treat the source field as raw SSML when it begins with `<speak>` |
| `--workers` | `-w` | Number of concurrent synthesis workers (default: 1) |
| `--debug` |  | Enable debug logging for troubleshooting |
| `--yes` | `-y` | Skip the confirmation prompt and proceed immediately |

### **2\. Sample & List Voices**

Before running a large sync, it is recommended to sample voices to find the best fit for your language material.

Preview a voice at a slower speed to check clarity  
```
azv sample --voice en-US-AndrewNeural --text "The quick brown fox" --rate 0.8 --play
```

List all Japanese neural voices to find a specific dialect or tone  
```
azv list-voices --locale ja-JP
```

## **📝 Formatting & SSML**

* **HTML Sanitization**: The tool cleans up Anki's internal HTML (like `<div>` and `<span>`) to ensure the TTS engine only reads the text.  
* **Smart Pauses**: It preserves line breaks by converting `<br>` tags into 400ms SSML pauses, which helps in separating sentences or definitions.  
* **Raw SSML**: For advanced users, if a field's content starts with the `<speak>` tag, **ankiazvox** treats it as raw SSML. This allows you to manually insert custom breaks, emphasis, or phoneme corrections directly into your Anki notes.

Additional notes:

* **Language detection from voice names**: When wrapping text into SSML the tool extracts the language code from typical voice names (e.g., `en-US-AndrewNeural`) so the TTS engine receives the correct `xml:lang` attribute.
* **Cross-platform playback**: `azv sample --play` uses the system player (`afplay` on macOS, `ffplay` elsewhere) when available.
* **Temporary files cleaned**: Temporary synthesis files are removed after sync to avoid cluttering your project folder.

## **🤝 Contributing**

Contributions are welcome\! Whether it's a bug fix, a new feature, or an improvement to the documentation, feel free to open an issue or submit a Pull Request on GitHub.

## **📄 License**

This project is open-source and released under the **MIT License**.
