Metadata-Version: 2.4
Name: cjm-transcription-plugin-system
Version: 0.0.25
Summary: A flexible plugin system for audio transcription intended to make it easy to add support for multiple backends.
Author-email: "Christian J. Mills" <9126128+cj-mills@users.noreply.github.com>
License: Apache-2.0
Project-URL: Repository, https://github.com/cj-mills/cjm-transcription-plugin-system
Project-URL: Documentation, https://cj-mills.github.io/cjm-transcription-plugin-system
Keywords: nbdev,jupyter,notebook,python
Classifier: Natural Language :: English
Classifier: Intended Audience :: Developers
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: cjm_plugin_system>=0.0.37
Dynamic: license-file

# cjm-transcription-plugin-system


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Install

``` bash
pip install cjm_transcription_plugin_system
```

## Project Structure

    nbs/
    ├── core.ipynb                       # Standardized result DTO for audio transcription plugins
    ├── forced_alignment_core.ipynb      # Data structures for word-level forced alignment results
    ├── forced_alignment_interface.ipynb # Domain-specific plugin interface for word-level audio-text alignment
    ├── forced_alignment_storage.ipynb   # Standardized SQLite storage for forced alignment results with content hashing
    ├── plugin_interface.ipynb           # Domain-specific plugin interface for audio transcription
    └── storage.ipynb                    # Standardized SQLite storage for transcription results with content hashing

Total: 6 notebooks

## Module Dependencies

``` mermaid
graph LR
    core["core<br/>Core Data Structures"]
    forced_alignment_core["forced_alignment_core<br/>Forced Alignment Core"]
    forced_alignment_interface["forced_alignment_interface<br/>Forced Alignment Plugin Interface"]
    forced_alignment_storage["forced_alignment_storage<br/>Forced Alignment Storage"]
    plugin_interface["plugin_interface<br/>Transcription Plugin Interface"]
    storage["storage<br/>Transcription Storage"]

    forced_alignment_interface --> forced_alignment_core
    plugin_interface --> core
```

*2 cross-module dependencies detected*

## CLI Reference

No CLI commands found in this project.

## Module Overview

Detailed documentation for each module in the project:

### Core Data Structures (`core.ipynb`)

> Standardized result DTO for audio transcription plugins

#### Import

``` python
from cjm_transcription_plugin_system.core import (
    TranscriptionResult
)
```

#### Classes

``` python
@dataclass
class TranscriptionResult:
    "Standardized output for all transcription plugins."
    
    text: str  # The transcribed text
    confidence: Optional[float]  # Overall confidence (0.0 to 1.0)
    segments: Optional[List[Dict[str, Any]]]  # Timestamped segments
    metadata: Dict[str, Any] = field(...)  # Additional metadata
```
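
As a rough illustration of how a plugin might populate this DTO, the sketch below uses a local stand-in dataclass that mirrors the fields listed above, so it runs without the package installed. The segment keys (`start`, `end`, `text`), the `model` metadata key, and the empty-dict default for `metadata` are assumptions made for illustration; the listing above does not specify them.

``` python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TranscriptionResult:
    """Local stand-in mirroring the fields in the listing above."""
    text: str
    confidence: Optional[float]
    segments: Optional[List[Dict[str, Any]]]
    # Assumption: the real default is elided as field(...); an empty dict is used here.
    metadata: Dict[str, Any] = field(default_factory=dict)


# Hypothetical segment layout; the key names are illustrative only.
segments = [
    {"start": 0.0, "end": 1.2, "text": "Hello there."},
    {"start": 1.2, "end": 2.5, "text": "General Kenobi."},
]

result = TranscriptionResult(
    text=" ".join(seg["text"] for seg in segments),
    confidence=None,  # not every backend reports a confidence score
    segments=segments,
    metadata={"model": "example-model"},  # hypothetical metadata key
)
```

In real code, import `TranscriptionResult` from `cjm_transcription_plugin_system.core` instead of redefining it.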

### Forced Alignment Core (`forced_alignment_core.ipynb`)

> Data structures for word-level forced alignment results

#### Import

``` python
from cjm_transcription_plugin_system.forced_alignment_core import (
    ForcedAlignItem,
    ForcedAlignResult
)
```

#### Classes

``` python
@dataclass
class ForcedAlignItem:
    "A single word-level alignment result."
    
    text: str  # The aligned word (punctuation typically stripped by model)
    start_time: float  # Start time in seconds
    end_time: float  # End time in seconds
```

``` python
@dataclass
class ForcedAlignResult:
    "Standardized output for all forced alignment plugins."
    
    items: List[ForcedAlignItem]  # Word-level alignments
    metadata: Dict[str, Any] = field(...)  # Plugin-specific metadata
```
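
The storage layer described later takes alignment items as a list of plain dicts, so a caller will probably need to serialize `ForcedAlignItem` objects at some point. The sketch below shows one way to do that with `dataclasses.asdict`, using local stand-ins that mirror the listings above. The empty-dict default for `metadata` is an assumption, and the package may serialize items differently.

``` python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class ForcedAlignItem:
    """Local stand-in mirroring the listing above."""
    text: str
    start_time: float
    end_time: float


@dataclass
class ForcedAlignResult:
    """Local stand-in mirroring the listing above."""
    items: List[ForcedAlignItem]
    # Assumption: the real default is elided as field(...); an empty dict is used here.
    metadata: Dict[str, Any] = field(default_factory=dict)


result = ForcedAlignResult(
    items=[
        ForcedAlignItem(text="hello", start_time=0.0, end_time=0.5),
        ForcedAlignItem(text="world", start_time=0.5, end_time=1.25),
    ]
)

# Convert each item to a plain dict, e.g. before handing it to a JSON-backed store.
serialized = [asdict(item) for item in result.items]

# Span of the alignment: first start time to last end time.
total_span = result.items[-1].end_time - result.items[0].start_time
```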

### Forced Alignment Plugin Interface (`forced_alignment_interface.ipynb`)

> Domain-specific plugin interface for word-level audio-text alignment

#### Import

``` python
from cjm_transcription_plugin_system.forced_alignment_interface import (
    ForcedAlignmentPlugin
)
```

#### Classes

``` python
class ForcedAlignmentPlugin(PluginInterface):
    """
    Abstract base class for all forced alignment plugins.
    
    Extends PluginInterface with forced-alignment-specific requirements:
    - `supported_formats`: List of audio file extensions this plugin can handle
    - `execute`: Accepts an audio file path and transcript text, returns ForcedAlignResult
    
    Input contract: plugins receive a path to a decodable audio file. Producing a
    model-ready file (format / sample-rate / channel normalization) is the caller's
    responsibility — e.g. an upstream ffmpeg step in the orchestration pipeline —
    not the plugin's. This keeps the interface library dependency-light.
    """
    
    def supported_formats(self) -> List[str]:  # e.g., ['wav', 'mp3', 'flac']
        """List of supported audio file extensions (without the dot)."""
        ...
    
    @abstractmethod
    def execute(
        self,
        audio: Union[str, Path],  # Path to a decodable audio file
        text: str,                # Transcript text to align against
        **kwargs
    ) -> ForcedAlignResult:  # Word-level alignment result
        """
        Perform forced alignment of text against audio.

        `audio` is a path to a decodable audio file; the caller guarantees it is in
        a form the plugin/model can consume.
        """
        ...
```
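
Because the interface advertises extensions without the leading dot, a caller might want to check a file against that list before dispatching it. The helper below is a hypothetical sketch, not part of the package. It takes the list of formats as an argument, so it makes no assumption about how a concrete plugin exposes `supported_formats`.

``` python
from pathlib import Path
from typing import List, Union


def is_supported(audio: Union[str, Path], supported_formats: List[str]) -> bool:
    """Return True if the file's extension (case-insensitive, no dot) is in the list."""
    extension = Path(audio).suffix.lstrip(".").lower()
    return extension in {fmt.lower() for fmt in supported_formats}


formats = ["wav", "mp3", "flac"]  # example values taken from the interface comment
```

With these formats, `is_supported("talk.WAV", formats)` is true, while a file named `talk.ogg` or one with no extension is rejected.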

### Forced Alignment Storage (`forced_alignment_storage.ipynb`)

> Standardized SQLite storage for forced alignment results with content
> hashing

#### Import

``` python
from cjm_transcription_plugin_system.forced_alignment_storage import (
    ForcedAlignmentRow,
    ForcedAlignmentStorage
)
```

#### Classes

``` python
@dataclass
class ForcedAlignmentRow:
    "A single row from the forced_alignments table."
    
    job_id: str  # Unique job identifier
    audio_path: str  # Path to the source audio file
    audio_hash: str  # Hash of source audio in "algo:hexdigest" format
    text: str  # Input transcript text that was aligned
    text_hash: str  # Hash of input text in "algo:hexdigest" format
    config_hash: str  # Hash of the effective alignment config used
    items: Optional[List[Dict[str, Any]]]  # Serialized ForcedAlignItems
    metadata: Optional[Dict[str, Any]]  # Plugin metadata
    created_at: Optional[float]  # Unix timestamp
```

``` python
class ForcedAlignmentStorage:
    "Standardized SQLite storage for forced alignment results."
    
    def __init__(
            self,
            db_path: str  # Absolute path to the SQLite database file
        )
        "Initialize storage, create table, run migrations, and build indexes."
    
    def save(
            self,
            job_id: str,        # Unique job identifier
            audio_path: str,    # Path to the source audio file
            audio_hash: str,    # Hash of source audio in "algo:hexdigest" format
            text: str,          # Input transcript text
            text_hash: str,     # Hash of input text in "algo:hexdigest" format
            config_hash: str,   # Hash of the effective alignment config
            items: Optional[List[Dict[str, Any]]] = None,  # Serialized ForcedAlignItems
            metadata: Optional[Dict[str, Any]] = None       # Plugin metadata
        ) -> None
        "Save or replace a forced alignment result (upsert by audio_path + text_hash + config_hash)."
    
    def save_with_logging(
            self,
            *,
            job_id: str,        # Unique job identifier
            audio_path: str,    # Path to the source audio file
            audio_hash: str,    # Hash of source audio in "algo:hexdigest" format
            text: str,          # Input transcript text
            text_hash: str,     # Hash of input text in "algo:hexdigest" format
            config_hash: str,   # Hash of the effective alignment config
            items: Optional[List[Dict[str, Any]]] = None,  # Serialized ForcedAlignItems
            metadata: Optional[Dict[str, Any]] = None,      # Plugin metadata
            logger: Optional[logging.Logger] = None         # Optional logger for success/failure messages
        ) -> bool:  # True if saved; False if the save failed (error logged, not raised)
        """
        Save a result, logging success/failure. Failures are logged and swallowed (returns False).

        Centralizes the try/save/log/except block that every forced-alignment plugin
        would otherwise reimplement. Returns True on success so callers can gate
        post-save side effects on the result.
        """
    
    def get_cached(
            self,
            audio_path: str,   # Path to the source audio file
            audio_hash: str,   # Content hash of the audio (cache miss if the file changed)
            text_hash: str,    # Hash of the input transcript text (part of the cache key)
            config_hash: str   # Hash of the effective alignment config
        ) -> Optional[ForcedAlignmentRow]:  # Cached row or None
        """
        Retrieve a content-correct cached alignment for an (audio, transcript) pair.

        Matches on audio_path + audio_hash + text_hash + config_hash. A changed audio
        file (new audio_hash) misses even if a stale row exists at the same
        (audio_path, text_hash, config_hash); the next save() replaces it.
        """
    
    def get_by_job_id(
            self,
            job_id: str  # Job identifier to look up
        ) -> Optional[ForcedAlignmentRow]:  # Row or None if not found
        "Retrieve a forced alignment result by job ID."
    
    def list_jobs(
            self,
            limit: int = 100  # Maximum number of rows to return
        ) -> List[ForcedAlignmentRow]:  # List of forced alignment rows
        "List forced alignment jobs ordered by creation time (newest first)."
    
    def verify_audio(
            self,
            job_id: str  # Job identifier to verify
        ) -> Optional[bool]:  # True if audio matches, False if tampered, None if job not found
        "Verify the source audio file still matches its stored hash."
    
    def verify_text(
            self,
            job_id: str  # Job identifier to verify
        ) -> Optional[bool]:  # True if text matches, False if tampered, None if job not found
        "Verify the input text still matches its stored hash."
```
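
The storage API expects hashes in `"algo:hexdigest"` format, but the listing above does not show how they are produced. The sketch below is one plausible way to build and re-check such a hash with the standard library's `hashlib`. The helper names `hash_bytes` and `matches` and the choice of SHA-256 are assumptions; the package or its `cjm_plugin_system` dependency may ship their own helpers.

``` python
import hashlib


def hash_bytes(data: bytes, algo: str = "sha256") -> str:
    """Return a content hash formatted as 'algo:hexdigest'."""
    return f"{algo}:{hashlib.new(algo, data).hexdigest()}"


def matches(data: bytes, stored_hash: str) -> bool:
    """Recompute the hash with the algorithm named in stored_hash and compare."""
    algo, _, _ = stored_hash.partition(":")
    return hash_bytes(data, algo) == stored_hash


transcript = "hello world"
text_hash = hash_bytes(transcript.encode("utf-8"))
```

Storing the algorithm name alongside the digest lets a later check, in the spirit of `verify_audio` and `verify_text`, recompute the hash without guessing which algorithm was used.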

### Transcription Plugin Interface (`plugin_interface.ipynb`)

> Domain-specific plugin interface for audio transcription

#### Import

``` python
from cjm_transcription_plugin_system.plugin_interface import (
    TranscriptionPlugin
)
```

#### Classes

``` python
class TranscriptionPlugin(PluginInterface):
    """
    Abstract base class for all transcription plugins.
    
    Extends PluginInterface with transcription-specific requirements:
    - `supported_formats`: List of audio file extensions this plugin can handle
    - `execute`: Accepts an audio file path (str or Path), returns TranscriptionResult
    
    Input contract: plugins receive a path to a decodable audio file. Producing a
    model-ready file (format / sample-rate / channel normalization) is the caller's
    responsibility — e.g. an upstream ffmpeg step in the orchestration pipeline —
    not the plugin's. This keeps the interface library dependency-light (no audio
    I/O deps such as numpy/soundfile in the shared consumer environment).
    """
    
    def supported_formats(self) -> List[str]:  # e.g., ['wav', 'mp3', 'flac']
        """List of supported audio file extensions (without the dot)."""
        ...
    
    @abstractmethod
    def execute(
        self,
        audio: Union[str, Path],  # Path to a decodable audio file
        **kwargs
    ) -> TranscriptionResult:  # Transcription result with text, confidence, segments
        """
        Transcribe audio to text.

        `audio` is a path to a decodable audio file; the caller guarantees it is in
        a form the plugin/model can consume.
        """
        ...
```
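
The input contract leaves audio normalization to the caller and mentions an upstream ffmpeg step as one option. The sketch below shows what such a step might look like. The helper name `build_ffmpeg_command` is hypothetical, and the 16 kHz mono target is an assumption, since the right values depend on the model behind a given plugin. The function only builds the argument list and does not invoke ffmpeg.

``` python
from pathlib import Path
from typing import List, Union


def build_ffmpeg_command(
    source: Union[str, Path],
    target: Union[str, Path],
    sample_rate: int = 16000,  # assumption: many speech models expect 16 kHz
    channels: int = 1,         # assumption: mono input
) -> List[str]:
    """Build (but do not run) an ffmpeg command that resamples and downmixes audio."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the target if it already exists
        "-i", str(source),        # input file
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", str(channels),     # number of output channels
        str(target),
    ]


command = build_ffmpeg_command("interview.mp3", "interview_16k.wav")
# To execute it: subprocess.run(command, check=True)
```

The normalized file path can then be passed as the `audio` argument to a plugin's `execute` method.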

### Transcription Storage (`storage.ipynb`)

> Standardized SQLite storage for transcription results with content
> hashing

#### Import

``` python
from cjm_transcription_plugin_system.storage import (
    TranscriptionRow,
    TranscriptionStorage
)
```

#### Classes

``` python
@dataclass
class TranscriptionRow:
    "A single row from the transcriptions table."
    
    job_id: str  # Unique job identifier
    audio_path: str  # Path to the source audio file
    audio_hash: str  # Hash of source audio in "algo:hexdigest" format
    config_hash: str  # Hash of the effective transcription config used
    text: str  # Transcribed text output
    text_hash: str  # Hash of transcribed text in "algo:hexdigest" format
    segments: Optional[List[Dict[str, Any]]]  # Timestamped segments
    metadata: Optional[Dict[str, Any]]  # Plugin metadata
    created_at: Optional[float]  # Unix timestamp
```

``` python
class TranscriptionStorage:
    "Standardized SQLite storage for transcription results."
    
    def __init__(
            self,
            db_path: str  # Absolute path to the SQLite database file
        )
        "Initialize storage, create table, run migrations, and build indexes."
    
    def save(
            self,
            job_id: str,        # Unique job identifier
            audio_path: str,    # Path to the source audio file
            audio_hash: str,    # Hash of source audio in "algo:hexdigest" format
            config_hash: str,   # Hash of the effective transcription config
            text: str,          # Transcribed text output
            text_hash: str,     # Hash of transcribed text in "algo:hexdigest" format
            segments: Optional[List[Dict[str, Any]]] = None,  # Timestamped segments
            metadata: Optional[Dict[str, Any]] = None         # Plugin metadata
        ) -> None
        "Save or replace a transcription result (upsert by audio_path + config_hash)."
    
    def save_with_logging(
            self,
            *,
            job_id: str,        # Unique job identifier
            audio_path: str,    # Path to the source audio file
            audio_hash: str,    # Hash of source audio in "algo:hexdigest" format
            config_hash: str,   # Hash of the effective transcription config
            text: str,          # Transcribed text output
            text_hash: str,     # Hash of transcribed text in "algo:hexdigest" format
            segments: Optional[List[Dict[str, Any]]] = None,  # Timestamped segments
            metadata: Optional[Dict[str, Any]] = None,        # Plugin metadata
            logger: Optional[logging.Logger] = None           # Optional logger for success/failure messages
        ) -> bool:  # True if saved; False if the save failed (error logged, not raised)
        """
        Save a result, logging success/failure. Failures are logged and swallowed (returns False).

        Centralizes the try/save/log/except block that every transcription plugin
        would otherwise reimplement. Returns True on success so callers can gate
        post-save side effects on the result.
        """
    
    def get_cached(
            self,
            audio_path: str,   # Path to the source audio file
            audio_hash: str,   # Content hash of the audio (cache miss if the file changed)
            config_hash: str   # Hash of the effective transcription config
        ) -> Optional[TranscriptionRow]:  # Cached row or None
        """
        Retrieve a content-correct cached transcription result.

        Matches on audio_path + audio_hash + config_hash. A changed audio file
        (new audio_hash) misses even if a stale row exists at the same
        (audio_path, config_hash); the next save() replaces it.
        """
    
    def get_by_job_id(
            self,
            job_id: str  # Job identifier to look up
        ) -> Optional[TranscriptionRow]:  # Row or None if not found
        "Retrieve a transcription result by job ID."
    
    def list_jobs(
            self,
            limit: int = 100  # Maximum number of rows to return
        ) -> List[TranscriptionRow]:  # List of transcription rows
        "List transcription jobs ordered by creation time (newest first)."
    
    def verify_audio(
            self,
            job_id: str  # Job identifier to verify
        ) -> Optional[bool]:  # True if audio matches, False if tampered, None if job not found
        "Verify the source audio file still matches its stored hash."
    
    def verify_text(
            self,
            job_id: str  # Job identifier to verify
        ) -> Optional[bool]:  # True if text matches, False if tampered, None if job not found
        "Verify the transcription text still matches its stored hash."
```
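
The cache semantics described for `save()` and `get_cached()` (upsert keyed on `audio_path` + `config_hash`, lookup that also requires a matching `audio_hash`) can be sketched with the standard library's `sqlite3`. The table name, reduced column set, and SQL statements below are assumptions made for illustration and are not taken from the package's actual schema.

``` python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE demo_transcriptions (
        audio_path  TEXT NOT NULL,
        audio_hash  TEXT NOT NULL,
        config_hash TEXT NOT NULL,
        text        TEXT NOT NULL,
        UNIQUE (audio_path, config_hash)
    )
    """
)


def save(audio_path, audio_hash, config_hash, text):
    """Insert a row, replacing any existing row with the same (audio_path, config_hash)."""
    conn.execute(
        "INSERT OR REPLACE INTO demo_transcriptions VALUES (?, ?, ?, ?)",
        (audio_path, audio_hash, config_hash, text),
    )


def get_cached(audio_path, audio_hash, config_hash):
    """Return the cached text only if path, audio hash, and config hash all match."""
    row = conn.execute(
        "SELECT text FROM demo_transcriptions "
        "WHERE audio_path = ? AND audio_hash = ? AND config_hash = ?",
        (audio_path, audio_hash, config_hash),
    ).fetchone()
    return row[0] if row else None


save("/data/a.wav", "sha256:aaa", "sha256:cfg", "first version")
hit = get_cached("/data/a.wav", "sha256:aaa", "sha256:cfg")
miss = get_cached("/data/a.wav", "sha256:bbb", "sha256:cfg")  # audio content changed
save("/data/a.wav", "sha256:bbb", "sha256:cfg", "second version")  # replaces the stale row
row_count = conn.execute("SELECT COUNT(*) FROM demo_transcriptions").fetchone()[0]
```

Here `hit` holds the first text, `miss` is `None` because the audio hash no longer matches, and the second `save` leaves a single row in the table rather than two.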
