settings

Kokoro TTS settings implementation.

Classes:

KokoroDeviceTypes

Bases: StrEnum

Kokoro TTS device types supported by this backend.

These correspond to PyTorch device types.

Attributes:

  • cpu

    Use the computer’s CPU and not any GPU.

  • cuda

    Use the computer’s Nvidia GPU.

cpu class-attribute instance-attribute

cpu = auto()

Use the computer’s CPU and not any GPU.

cuda class-attribute instance-attribute

cuda = auto()

Use the computer’s Nvidia GPU.

KokoroLocales

Bases: StrEnum

Voice locales supported by this backend.

Each locale listed here must also be supported by Kokoro itself.

Attributes:

  • en_GB

    British English (works with voices prefixed with bf_ or bm_)

  • en_US

    American English (works with voices prefixed with af_ or am_)

  • fr_FR

    French (works with voices prefixed with ff_ or fm_)

en_GB class-attribute instance-attribute

en_GB = 'en_GB'

British English (works with voices prefixed with bf_ or bm_)

en_US class-attribute instance-attribute

en_US = 'en_US'

American English (works with voices prefixed with af_ or am_)

fr_FR class-attribute instance-attribute

fr_FR = 'fr_FR'

French (works with voices prefixed with ff_ or fm_)

KokoroSettings

Kokoro TTS backend settings.

Note

To work in an offline or air-gapped environment, you must provide local paths for model_path, config_path and voice_path.

Methods:

  • to_dict

    Export all settings as a dictionary of only JSON-serializable types.

Attributes:

  • config_path (FilePath | None) –

    Offline mode local file path to the Kokoro TTS config file.

  • device (KokoroDeviceTypes | None) –

    The compute device to use to generate the speech.

  • lang_code (str) –

    The Kokoro TTS language code for the current locale.

  • locale (str) –

    Used to help specify which language to speak.

  • model_path (FilePath | None) –

    Offline mode local file path to the Kokoro TTS model file.

  • repo_id (str) –

    The HuggingFace repository ID to use to download the Kokoro Model.

  • speed (float) –

    The speed at which to speak.

  • voice (KokoroVoices) –

    The voice in which to speak.

  • voice_path (FilePath | None) –

    Offline mode local file path to the Kokoro TTS voice file.

config_path class-attribute instance-attribute

config_path: FilePath | None = None

Offline mode local file path to the Kokoro TTS config file.

This is only required for offline or air-gapped use; otherwise, files are downloaded and cached automatically.

Example

~/my_kokoro_tts_downloads/config.json

Note

If specified, then the file must exist at startup time.

device class-attribute instance-attribute

device: KokoroDeviceTypes | None = None

The compute device to use to generate the speech.

I.e. to use the GPU or only the CPU.

device must be selected from KokoroDeviceTypes or be None. If it is set to None, then a GPU will be used if present, with the CPU as the fallback option.

Note

Kokoro TTS does not currently support integer GPU numbers, so if you have multiple GPUs, you will have to specify which one to use in some other way (e.g. with environment variables).

lang_code property

lang_code: str

The Kokoro TTS language code for the current locale.

E.g. a for American English, b for British English, f for French, etc.

Note

This is not a setting, it is a derived property used by the Kokoro backend.
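One way such a derived value could be computed is a simple lookup from locale to language code. The table and the function name lang_code_for below are assumptions made for illustration; only the three pairings (a, b, f) come from the description above, and the backend's actual implementation may differ.

```python
# Locale-to-language-code pairs taken from the description above.
LANG_CODES = {
    "en_US": "a",  # American English
    "en_GB": "b",  # British English
    "fr_FR": "f",  # French
}


def lang_code_for(locale: str) -> str:
    """Hypothetical helper: return the Kokoro language code for a locale."""
    try:
        return LANG_CODES[locale]
    except KeyError:
        raise ValueError(f"Unsupported locale: {locale!r}") from None


print(lang_code_for("en_GB"))  # b
```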

locale class-attribute instance-attribute

locale: str = 'en_US'

Used to help specify which language to speak.

locale influences pronunciation, inflections, etc. of the specified voice and must be one of the locales supported by this backend.

While locale must be a string to conform with the ITTSSettings interface, the valid / supported options for it are defined in KokoroLocales.

model_path class-attribute instance-attribute

model_path: FilePath | None = None

Offline mode local file path to the Kokoro TTS model file.

This is only required for offline or air-gapped use; otherwise, files are downloaded and cached automatically.

Example

~/my_kokoro_tts_downloads/kokoro-v1_0.pth

Note

If specified, then the file must exist at startup time.

repo_id class-attribute instance-attribute

repo_id: str = 'hexgrad/Kokoro-82M'

The HuggingFace repository ID to use to download the Kokoro Model.

This normally does not need to be changed, unless you have an alternative download location that works with the HuggingFace API.

Example

hexgrad/Kokoro-82M points to https://huggingface.co/hexgrad/Kokoro-82M.

speed class-attribute instance-attribute

speed: float = Field(default=1.0, ge=0.1, le=2.0)

The speed at which to speak.

Speech can be sped up or slowed down with this setting.

speed must be between 0.1 and 2.0, inclusive.
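To make the inclusive bounds concrete, here is a plain-Python sketch of the same range check. The settings class itself declares the limits through Field(ge=0.1, le=2.0); the validate_speed function below is a hypothetical stand-in that mirrors those limits without depending on any validation library.

```python
def validate_speed(speed: float) -> float:
    """Hypothetical check mirroring ge=0.1 and le=2.0 (both ends inclusive)."""
    if not 0.1 <= speed <= 2.0:
        raise ValueError(f"speed must be between 0.1 and 2.0, got {speed}")
    return speed


print(validate_speed(0.1))  # 0.1  (lower bound is allowed)
print(validate_speed(2.0))  # 2.0  (upper bound is allowed)

try:
    validate_speed(2.5)
except ValueError as err:
    print(err)  # speed must be between 0.1 and 2.0, got 2.5
```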

voice class-attribute instance-attribute

The voice in which to speak.

Voices are either male or female and are optimized for specific languages / dialects. voice must be selected from KokoroVoices.

For best results, use a voice that is optimized for the specified locale.
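A quick way to check whether a voice suits a locale is to compare the voice name's prefix against the prefixes listed under KokoroLocales. The helper below, voice_matches_locale, is a hypothetical convenience function for this example and is not something the backend is known to provide.

```python
# Voice-name prefixes per locale, as listed under KokoroLocales.
VOICE_PREFIXES = {
    "en_GB": ("bf_", "bm_"),
    "en_US": ("af_", "am_"),
    "fr_FR": ("ff_", "fm_"),
}


def voice_matches_locale(voice: str, locale: str) -> bool:
    """Hypothetical helper: True if the voice prefix belongs to the locale."""
    # str.startswith accepts a tuple and matches any of its entries.
    return voice.startswith(VOICE_PREFIXES[locale])


print(voice_matches_locale("af_heart", "en_US"))  # True
print(voice_matches_locale("bm_fable", "en_US"))  # False
```

A mismatch is not necessarily an error, since the text only recommends matching voices "for best results", so a caller might log a warning rather than reject the combination.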

voice_path class-attribute instance-attribute

voice_path: FilePath | None = None

Offline mode local file path to the Kokoro TTS voice file.

This is only required for offline or air-gapped use; otherwise, files are downloaded and cached automatically.

If voice_path is provided, then the voice attribute is ignored.

Example

~/my_kokoro_tts_downloads/voices/af_heart.pt

Note

If specified, then the file must exist at startup time.
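Since all three offline paths must point to existing files at startup, it can be useful to verify them before constructing the settings. The sketch below is one possible pre-flight check; missing_offline_files is a hypothetical helper, and the paths passed to it are just the example locations used in this section.

```python
from pathlib import Path


def missing_offline_files(
    model_path: str, config_path: str, voice_path: str
) -> list[str]:
    """Hypothetical helper: names of the settings whose file does not exist."""
    paths = {
        "model_path": model_path,
        "config_path": config_path,
        "voice_path": voice_path,
    }
    # expanduser() turns a leading "~" into the user's home directory.
    return [
        name
        for name, path in paths.items()
        if not Path(path).expanduser().is_file()
    ]


missing = missing_offline_files(
    "~/my_kokoro_tts_downloads/kokoro-v1_0.pth",
    "~/my_kokoro_tts_downloads/config.json",
    "~/my_kokoro_tts_downloads/voices/af_heart.pt",
)
if missing:
    print("Missing files for:", ", ".join(missing))
else:
    print("All offline files are present.")
```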

to_dict

Export all settings as a dictionary of only JSON-serializable types.

Returns:

  • dict[str, JSONSerializableTypes]

    A dictionary where the keys are the setting names and the values are the setting values converted as necessary to simple base JSON-compatible types.

Example

{
    "locale": "en_US",
    "voice": "af_heart",
    "speed": 1.0,
    "device": "cuda",
    "repo_id": "hexgrad/Kokoro-82M",
    "model_path": "kokoro-v1_0.pth",
    "config_path": "config.json",
    "voice_path": "af_heart.pt"
}
 

KokoroVoices

Bases: StrEnum

Kokoro TTS voices supported by this backend.

Voice grades and details can be found in VOICES.md.

Attributes:

  • af_bella

    American female voice, grade A- quality.

  • af_heart

    American female voice, grade A quality.

  • af_nicole

    American female voice, grade B- quality.

  • am_fenrir

    American male voice, grade C+ quality.

  • am_michael

    American male voice, grade C+ quality.

  • am_puck

    American male voice, grade C+ quality.

  • bf_emma

    British female voice, grade B- quality.

  • bm_fable

    British male voice, grade C quality.

  • bm_george

    British male voice, grade C quality.

  • ff_siwis

    French female voice, grade B- quality.

af_bella class-attribute instance-attribute

af_bella = auto()

American female voice, grade A- quality.

af_heart class-attribute instance-attribute

af_heart = auto()

American female voice, grade A quality.

af_nicole class-attribute instance-attribute

af_nicole = auto()

American female voice, grade B- quality.

am_fenrir class-attribute instance-attribute

am_fenrir = auto()

American male voice, grade C+ quality.

am_michael class-attribute instance-attribute

am_michael = auto()

American male voice, grade C+ quality.

am_puck class-attribute instance-attribute

am_puck = auto()

American male voice, grade C+ quality.

bf_emma class-attribute instance-attribute

bf_emma = auto()

British female voice, grade B- quality.

bm_fable class-attribute instance-attribute

bm_fable = auto()

British male voice, grade C quality.

bm_george class-attribute instance-attribute

bm_george = auto()

British male voice, grade C quality.

ff_siwis class-attribute instance-attribute

ff_siwis = auto()

French female voice, grade B- quality.