Metadata-Version: 2.4
Name: EveryVoice
Version: 0.4.0
Summary: Text-to-Speech Synthesis for the Speech Generation for Indigenous Language Education Small Teams Project
Project-URL: Homepage, https://github.com/EveryVoiceTTS/EveryVoice
Project-URL: Documentation, https://docs.everyvoice.ca
Project-URL: Repository, https://github.com/EveryVoiceTTS/EveryVoice
Project-URL: Issues, https://github.com/EveryVoiceTTS/EveryVoice/issues
Project-URL: Changelog, https://github.com/EveryVoiceTTS/EveryVoice/releases
Author-email: Aidan Pine <Aidan.Pine@nrc-cnrc.gc.ca>, Eric Joanis <Eric.Joanis@nrc-cnrc.gc.ca>, Marc Tessier <Marc.Tessier@nrc-cnrc.gc.ca>, Mengzhe Geng <Mengzhe.Geng@nrc-cnrc.gc.ca>, Samuel Larkin <Samuel.Larkin@nrc-cnrc.gc.ca>
Maintainer-email: Aidan Pine <Aidan.Pine@nrc-cnrc.gc.ca>, Eric Joanis <Eric.Joanis@nrc-cnrc.gc.ca>, Samuel Larkin <Samuel.Larkin@nrc-cnrc.gc.ca>
License: For everything but the exceptions listed below:
        
        MIT License
        
        Copyright (c) 2022-2025 National Research Council Canada
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
        The above work is derived from the following sources:
        
        MIT License
        
        Copyright (c) 2020 Jungil Kong. https://github.com/jik876/hifi-gan/blob/master/LICENSE
        Copyright (c) 2020, Chung-Ming Chien.  https://github.com/ming024/FastSpeech2
        Copyright (c) 2021, NVIDIA CORPORATION.  All rights reserved. https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/FastPitch
        Copyright (c) 2021 Keon Lee. https://github.com/keonlee9420/Comprehensive-Transformer-TTS
        Copyright (c) 2022 Christoph Minixhofer. https://github.com/MiniXC/LightningFastSpeech2/blob/main/LICENSE
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
        Copyright (c) 2022 Rishikesh (ऋषिकेश). https://github.com/rishikksh20/iSTFTNet-pytorch
        
        Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.
        You may obtain a copy of the License at
        
            http://www.apache.org/licenses/LICENSE-2.0
        
        Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        See the License for the specific language governing permissions and limitations under the License.
        
        
        For everyvoice/preprocessor/attention_prior.py:
        
        Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions are met:
            * Redistributions of source code must retain the above copyright
              notice, this list of conditions and the following disclaimer.
            * Redistributions in binary form must reproduce the above copyright
              notice, this list of conditions and the following disclaimer in the
              documentation and/or other materials provided with the distribution.
            * Neither the name of the NVIDIA CORPORATION nor the
              names of its contributors may be used to endorse or promote products
              derived from this software without specific prior written permission.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
        ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
        WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
        DISCLAIMED. IN NO EVENT SHALL NVIDIA CORPORATION BE LIABLE FOR ANY
        DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
        (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
        LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
        ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
        (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
        SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
        
        
        For files in everyvoice/model/feature_prediction/FastSpeech2_lightning/fs2/attn/:
        
        Copyright (c) 2021, NVIDIA CORPORATION.  All rights reserved.
        
        Licensed under the Apache License, Version 2.0 (the "License");
        you may not use this file except in compliance with the License.
        You may obtain a copy of the License at
        
            http://www.apache.org/licenses/LICENSE-2.0
        
        Unless required by applicable law or agreed to in writing, software
        distributed under the License is distributed on an "AS IS" BASIS,
        WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        See the License for the specific language governing permissions and
        limitations under the License.
        
        
        For everyvoice/model/feature_prediction/FastSpeech2_lightning/fs2/gst/attn.py:
        
        Copyright (c) 2019, Shigeki Karita.  All rights reserved.
        
        Licensed under the Apache License, Version 2.0 (the "License");
        you may not use this file except in compliance with the License.
        You may obtain a copy of the License at
        
            http://www.apache.org/licenses/LICENSE-2.0
        
        Unless required by applicable law or agreed to in writing, software
        distributed under the License is distributed on an "AS IS" BASIS,
        WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        See the License for the specific language governing permissions and
        limitations under the License.
        
        
        For everyvoice/model/feature_prediction/FastSpeech2_lightning/fs2/gst/model.py (sourced from ESPNet2):
        
        Copyright (c) 2020, Nagoya University (Tomoki Hayashi).  All rights reserved.
        
        Licensed under the Apache License, Version 2.0 (the "License");
        you may not use this file except in compliance with the License.
        You may obtain a copy of the License at
        
            http://www.apache.org/licenses/LICENSE-2.0
        
        Unless required by applicable law or agreed to in writing, software
        distributed under the License is distributed on an "AS IS" BASIS,
        WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        See the License for the specific language governing permissions and
        limitations under the License.
        
        
        For the module in everyvoice/text/arpabet.py:
        
        MIT License
        
        Copyright (c) 2015 Waldeilson Eder dos Santos
        Copyright (c) 2024 National Research Council Canada
        
        
        For everyvoice/text/textsplit.py:
        
        Implicit Copyright James Betker (https://github.com/neonbjb/tortoise-tts)
        Implicit Copyright @fakerybakery (https://github.com/fakerybakery/txtsplit)
        Copyright (c) 2025 National Research Council Canada for changes made by the NRC team
        
        Licensed under the Apache License, Version 2.0 (the "License");
        you may not use this file except in compliance with the License.
        You may obtain a copy of the License at
        
            http://www.apache.org/licenses/LICENSE-2.0
        
        Unless required by applicable law or agreed to in writing, software
        distributed under the License is distributed on an "AS IS" BASIS,
        WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        See the License for the specific language governing permissions and
        limitations under the License.
        
        
        For everyvoice/dataloader/imbalanced_sampler.py:
        Adapted from https://github.com/ufoym/imbalanced-dataset-sampler
        
        MIT License
        
        Copyright (c) 2018 Ming
        Copyright (c) 2019-2025 National Research Council Canada for changes to adapt
        the code to EveryVoice
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: CLI,TTS
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: End Users/Desktop
Classifier: Intended Audience :: Other Audience
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Unix Shell
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Typing :: Typed
Requires-Python: <3.13,>=3.10
Requires-Dist: anytree>=2.12.1
Requires-Dist: clipdetect>=0.1.4
Requires-Dist: deepdiff>=6.5.0
Requires-Dist: einops==0.5.0
Requires-Dist: g2p<3,>=2.3.1
Requires-Dist: gradio>=5.9.1
Requires-Dist: grapheme>=0.6.0
Requires-Dist: ilt-panphon<0.21,>=0.20.1
Requires-Dist: ipatok>=0.4.1
Requires-Dist: librosa==0.11.0
Requires-Dist: lightning>=2.1.0
Requires-Dist: loguru==0.6.0
Requires-Dist: matplotlib>=3.9.0
Requires-Dist: merge-args
Requires-Dist: nltk==3.9.3
Requires-Dist: packaging>=22.0
Requires-Dist: pandas~=2.0
Requires-Dist: protobuf~=4.25
Requires-Dist: pydantic[email]<2.8.0,>=2.4.2
Requires-Dist: pympi-ling
Requires-Dist: pyworld-prebuilt==0.3.4.4
Requires-Dist: pyyaml>=6.0
Requires-Dist: questionary>=2.0.0
Requires-Dist: readalongs>=1.2.0
Requires-Dist: setuptools<80
Requires-Dist: simple-term-menu==1.5.2
Requires-Dist: tabulate==0.9.0
Requires-Dist: tensorboard>=2.14.1
Requires-Dist: torch==2.7.1
Requires-Dist: torchaudio==2.7.1
Requires-Dist: torchinfo==1.8.0
Requires-Dist: tqdm>=4.66.0
Requires-Dist: typer>=0.15.3
Requires-Dist: yaspin>=3.1.0
Provides-Extra: dev
Requires-Dist: black~=24.3; extra == 'dev'
Requires-Dist: chardet<6; extra == 'dev'
Requires-Dist: coverage; extra == 'dev'
Requires-Dist: diff-cover; extra == 'dev'
Requires-Dist: flake8>=4.0.1; extra == 'dev'
Requires-Dist: gitlint-core>=0.19.0; extra == 'dev'
Requires-Dist: isort>=5.10.1; extra == 'dev'
Requires-Dist: jsonschema>=4.17.3; extra == 'dev'
Requires-Dist: mypy>=1.8.0; extra == 'dev'
Requires-Dist: pep440>=0.1.2; extra == 'dev'
Requires-Dist: playwright>=1.52.0; extra == 'dev'
Requires-Dist: pre-commit>=3.2.0; extra == 'dev'
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0.5; extra == 'dev'
Requires-Dist: types-requests>=2.27.11; extra == 'dev'
Requires-Dist: types-setuptools>=57.4.9; extra == 'dev'
Requires-Dist: types-tabulate==0.9.0; extra == 'dev'
Requires-Dist: types-tqdm>=4.66; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mike>=1.1.2; extra == 'docs'
Requires-Dist: mkdocs-click>=0.8.0; extra == 'docs'
Requires-Dist: mkdocs-macros-plugin>=1.0.4; extra == 'docs'
Requires-Dist: mkdocs-material>=9.2.5; extra == 'docs'
Requires-Dist: mkdocs-typer>=0.0.3; extra == 'docs'
Requires-Dist: mkdocs>=1.5.2; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.22.0; extra == 'docs'
Provides-Extra: test
Requires-Dist: jsonschema>=4.17.3; extra == 'test'
Requires-Dist: pep440>=0.1.2; extra == 'test'
Requires-Dist: playwright>=1.52.0; extra == 'test'
Requires-Dist: pytest; extra == 'test'
Provides-Extra: torch
Requires-Dist: torch==2.7.1; (sys_platform == 'darwin') and extra == 'torch'
Requires-Dist: torchaudio==2.7.1; (sys_platform == 'darwin') and extra == 'torch'
Description-Content-Type: text/markdown

# EveryVoice TTS Toolkit 💬

[![codecov](https://codecov.io/gh/EveryVoiceTTS/EveryVoice/branch/main/graph/badge.svg?token=yErCxf64IU)](https://codecov.io/gh/EveryVoiceTTS/EveryVoice)
[![Documentation](https://github.com/EveryVoiceTTS/EveryVoice/actions/workflows/docs.yml/badge.svg)](https://docs.everyvoice.ca)
[![Build Status](https://github.com/EveryVoiceTTS/EveryVoice/actions/workflows/test.yml/badge.svg)](https://github.com/EveryVoiceTTS/EveryVoice/actions)
[![license](https://img.shields.io/badge/Licence-MIT-green)](LICENSE)
![beta](https://img.shields.io/badge/beta-grey)

This is the Text-to-Speech (TTS) toolkit used by the Small Teams "Speech Generation for Indigenous Language Education" project.

## Quickstart from PyPI

- Install Python 3.10, 3.11, or 3.12 and create a venv or a conda env for EveryVoice.

- Install `sox`:
  - On Ubuntu, `sudo apt-get install sox` should work.
  - Other Linux distros should have equivalent packages.
  - With Conda, `conda install sox -c conda-forge` is reliable.

- Install `ffmpeg`:
  - On Ubuntu, `sudo apt-get install ffmpeg` should work.
  - Other Linux distros should have an equivalent package.
  - With Conda, `conda install ffmpeg` is reliable.
  - Or, use the official bundles from https://www.ffmpeg.org/download.html

- Install `torch` and `torchaudio` version 2.7.1 for your platform and CUDA version: follow the instructions at https://pytorch.org/get-started/locally/ but specify `torch==2.7.1 torchaudio==2.7.1` in the install command and remove `torchvision`.

- Run `pip install everyvoice`
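
Before running `pip install everyvoice`, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch and not part of EveryVoice: the function name `check_prerequisites` is hypothetical, and it only checks that the interpreter falls in the supported 3.10 to 3.12 range and that `sox` and `ffmpeg` can be found on `PATH`. It does not check the `torch` installation.

```python
import shutil
import sys

# Supported interpreter range, taken from the quickstart above (3.10, 3.11, or 3.12).
MIN_PYTHON = (3, 10)
MAX_PYTHON_EXCLUSIVE = (3, 13)
# Command-line tools the quickstart asks you to install.
REQUIRED_TOOLS = ("sox", "ffmpeg")


def check_prerequisites(version=None, which=shutil.which):
    """Return a list of problems found; an empty list means every check passed.

    `version` and `which` are parameters only so the checks can be exercised
    without depending on the current machine (hypothetical helper).
    """
    if version is None:
        version = sys.version_info[:2]
    version = tuple(version)

    problems = []
    if not (MIN_PYTHON <= version < MAX_PYTHON_EXCLUSIVE):
        problems.append(
            f"Python {version[0]}.{version[1]} is outside the supported 3.10-3.12 range"
        )
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"`{tool}` was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Prerequisites look fine.")
```

Run the script inside the venv or conda env you created for EveryVoice, so that the interpreter and `PATH` it inspects are the ones `pip install everyvoice` will use.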

## Quickstart from source

### Install conda

First, you'll need to install `conda`. [Miniforge3](https://github.com/conda-forge/miniforge) is a fully open-source option which is free for all users and works well. You can also use Anaconda3 or Miniconda3 if you have or can get a license.

### Clone the repo

```sh
git clone https://github.com/EveryVoiceTTS/EveryVoice.git
cd EveryVoice
git submodule update --init
```

### Environment and installation – automated

To run EveryVoice, you need to create a new Conda environment with Python 3.12, then install all our dependencies and EveryVoice itself.

We have automated the procedure required to do all this in the script `make-everyvoice-env`, which you can run like this:

```sh
./make-everyvoice-env --path <env-path-of-your-choice>
conda activate <env-path-of-your-choice>
```

Add the option `--cuda CUDA_VERSION` if you need to override the default CUDA version, or `--cpu` to use Torch compiled for CPU use only.

### Environment and installation – manual

If the automated installation process does not work for you, or if you prefer to do the full installation manually, please refer to [EveryVoice / Installation](https://docs.everyvoice.ca/latest/install/#manual-installation).

### Documentation

Read the full [EveryVoice documentation](https://docs.everyvoice.ca/).

In particular, read the [Guides](https://docs.everyvoice.ca/latest/guides/) to get familiar with the whole process.

To build and view the documentation locally, run:

```sh
pip install -e '.[docs]'
mkdocs serve
```

Then browse to http://127.0.0.1:8000/.

## Contributing

Feel free to dive in! [Open an issue](https://github.com/EveryVoiceTTS/EveryVoice/issues/new) or submit PRs.

This repo follows the [Contributor Covenant](http://contributor-covenant.org/version/1/3/0/) Code of Conduct.

Please make sure our standard Git hooks are activated, by running these commands in your sandbox (if you used our `make-everyvoice-env` script then this step is already done for you):

```sh
pip install -e '.[dev]'
pre-commit install
gitlint install-hook
git submodule foreach 'pre-commit install'
git submodule foreach 'gitlint install-hook'
```

Have a look at [Contributing.md](Contributing.md) for the full details on the
Conventional Commit messages we prefer, our code formatting conventions, our Git
hooks, and recommendations on how to make effective pull requests.

## Publishing Instructions

To publish a new version of the project, follow these steps:

1. **Determine the Version Bump**
   Decide whether your changes constitute a:
   - **Major** version bump (breaking changes),
   - **Minor** version bump (new features, backward-compatible, any change to the schema), or
   - **Patch** version bump (bug fixes, small changes).

2. **Update Version Files**
   - Update the `everyvoice._version` file to reflect the new version.
   - Keep all `submodule._version` files in sync with this version, **except** for the `wav2vec2aligner` submodule (which can be installed separately).
   - Keep the `everyvoice` dependency in all `submodule/pyproject.toml` files in sync with the everyvoice Major.minor version, except in `wav2vec2aligner`.
   - Commit the resulting changes, including all submodules.

3. **Update the Documentation**
   - Make sure the documentation reflects the current state of the code.
   - Look for references to the current or most recent version and update them if necessary.

4. **Update Schema (for Major/Minor bumps)**
   If you bumped a **major** or **minor** version:
   - Run `everyvoice update-schema`. If you get an error message, you may need to delete existing schema files first, but only do so if you are sure those files have not already been published. For example, schema files from an alpha release can be overwritten, but published schema files must never be changed.
   - Commit the resulting changes.

5. **Open a Pull Request**
   - Create a PR with your changes.
   - Wait for tests to pass and for the PR to be merged into `main`.

6. **Tag the Release**
   After merging:
   ```bash
   git tag -a -m vX.Y.Z vX.Y.Z
   git push origin vX.Y.Z  # assumes the remote is named "origin"
   ```

7. **Update SchemaStore (for Major/Minor bumps)**
   Once the CI has built and released your version, and if you bumped a major or minor version, submit a PR to [SchemaStore](https://github.com/SchemaStore/schemastore) to update the schema reference.

   The only file you need to change is `src/api/json/catalog.json`.
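
The version-bump rules in steps 1, 4, and 7 can be sketched as a small helper. This is an illustration only, not a script shipped with EveryVoice: `bump_version` and `needs_schema_update` are hypothetical names, and the sketch assumes plain `X.Y.Z` version strings with no pre-release suffix.

```python
def bump_version(version: str, part: str) -> str:
    """Return `version` bumped at `part` ("major", "minor", or "patch").

    Assumes a plain "X.Y.Z" string; anything else raises ValueError.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Breaking changes: reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # New backward-compatible features or any schema change: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Bug fixes and small changes.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


def needs_schema_update(part: str) -> bool:
    """Major and minor bumps require the schema steps; patch bumps do not."""
    return part in ("major", "minor")


if __name__ == "__main__":
    new_version = bump_version("0.4.0", "minor")
    print(f"New version: {new_version}, tag: v{new_version}")
    print("Schema update needed:", needs_schema_update("minor"))
```

Keeping the "does this bump touch the schema" decision in one place makes it harder to forget the `everyvoice update-schema` and SchemaStore steps on a major or minor release.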

## Acknowledgements

This repository draws on many other wonderful code bases.
Many thanks to:

- https://github.com/nocotan/pytorch-lightning-gans
- https://github.com/rishikksh20/iSTFTNet-pytorch
- https://github.com/jik876/hifi-gan
- https://github.com/ming024/FastSpeech2
- https://github.com/MiniXC/LightningFastSpeech2
- https://github.com/DigitalPhonetics/IMS-Toucan

## Tests

If you installed EveryVoice from source, there are several ways to run the unit tests:
 - Run all the tests with the most concise output: `pytest`
 - Run all the dev tests: `everyvoice/run_tests.py dev` or `everyvoice test dev`
 - Run the tests with verbose logs: `everyvoice/run_tests.py --verbose dev`
 - Show the names of the other suites you can run: `everyvoice/run_tests.py -h`
 - Run all the tests in one test file: `python -m unittest everyvoice/tests/test_<somefilename>.py`
 - Run one specific test case: `python -m unittest everyvoice.tests.<filename>.<class_name>.<function_name>`
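
The last two commands take different forms: the first uses a file path, while the second uses a dotted name in which `<filename>` is the module name without the `.py` extension. As a minimal sketch of that conversion, the hypothetical helper below builds the dotted name from a test file path. The file, class, and function names in the example are placeholders, not real EveryVoice tests.

```python
from pathlib import PurePosixPath


def unittest_name(test_file: str, class_name: str, function_name: str) -> str:
    """Build the dotted name accepted by `python -m unittest` for one test case.

    `test_file` is a POSIX-style path relative to the repository root.
    """
    # Drop the ".py" suffix and join the path components with dots.
    module = ".".join(PurePosixPath(test_file).with_suffix("").parts)
    return f"{module}.{class_name}.{function_name}"


if __name__ == "__main__":
    # Placeholder names for illustration only.
    name = unittest_name(
        "everyvoice/tests/test_example.py", "ExampleTest", "test_something"
    )
    print(f"python -m unittest {name}")
```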
