ANE VoxCPM TTS Playground
Text to Generate
Jittery Jack's jam jars jiggled jauntily, jolting Jack's jumbled jelly-filled jars joyously. Cindy's circular cymbals clanged cheerfully, clashing crazily near Carla's crashing crockery. You think you can just waltz in here and cause chaos? Well, I've got news for you.
Voice Selection (Optional)
(See Samples)
Use Prompt Audio
Prompt WAV Path (Optional, for Voice Cloning)
If unspecified, a random-sounding voice is used.
Prompt Text (Required when using prompt WAV)
This text is transcribed from the 'Prompt WAV' and is used to condition the model on the voice's acoustic properties. It must be an accurate transcription of the prompt audio.
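The dependency between the two prompt fields can be sketched as a small check. This is only an illustration of the rule stated above (Prompt Text is required whenever a Prompt WAV is given); the function name `check_prompt_inputs` and its return convention are hypothetical and not taken from the playground's code.

```python
from typing import Optional


def check_prompt_inputs(prompt_wav_path: Optional[str],
                        prompt_text: Optional[str]) -> bool:
    """Return True if voice cloning is requested, False for a random voice.

    Hypothetical helper: raises ValueError when a prompt WAV is supplied
    without its transcription.
    """
    has_wav = bool(prompt_wav_path and prompt_wav_path.strip())
    has_text = bool(prompt_text and prompt_text.strip())

    if has_wav and not has_text:
        raise ValueError("Prompt Text is required when using a Prompt WAV")

    # Assumption: prompt text without a WAV is simply ignored,
    # since the form only states the requirement in one direction.
    return has_wav
```

A caller would run this before generation and fall back to the random-sounding voice when it returns False.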
Max Length (0.04s per unit)
Maximum generated sequence length (1-2048)
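As a rough guide, the Max Length setting converts to an upper bound on audio duration using the 0.04 s per unit figure from the label. The helper names below are hypothetical, and the sketch assumes each unit always corresponds to exactly 0.04 s of audio.

```python
import math

UNITS_PER_SECOND = 25        # 1 / 0.04 s per unit, from the field label
MIN_UNITS, MAX_UNITS = 1, 2048  # allowed range, from the field help text


def units_to_seconds(units: int) -> float:
    """Upper bound on generated audio duration for a Max Length value."""
    return units / UNITS_PER_SECOND


def seconds_to_units(seconds: float) -> int:
    """Smallest Max Length covering `seconds`, clamped to the allowed range."""
    units = math.ceil(seconds * UNITS_PER_SECOND)
    return max(MIN_UNITS, min(MAX_UNITS, units))
```

Under that assumption the largest setting, 2048 units, caps generation at about 82 seconds, and a 10-second clip needs a Max Length of at least 250.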
CFG Value
Classifier-free guidance value (0.0-10.0)
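For readers unfamiliar with the setting, classifier-free guidance is commonly implemented by extrapolating from an unconditional prediction toward a conditional one. The sketch below shows that common formulation on plain Python lists; it is an assumption that this model combines its predictions this way, and `apply_cfg` is a hypothetical name.

```python
from typing import List


def apply_cfg(cond: List[float], uncond: List[float],
              cfg_value: float) -> List[float]:
    """Common CFG formulation: uncond + cfg_value * (cond - uncond).

    Assumption: illustrative only; the model's actual guidance
    formula is not stated on this page.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    return [u + cfg_value * (c - u) for c, u in zip(cond, uncond)]
```

In this formulation a value of 0.0 returns the unconditional prediction, 1.0 returns the conditional prediction unchanged, and larger values push the output further toward the conditioning.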
Inference Timesteps
Controls the number of diffusion steps. A higher number (e.g., 20) is slower but may increase quality. A lower number (e.g., 5-10) is faster. This model works well with 10.
Number of inference steps (1-100)
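The three numeric settings and their documented ranges can be gathered into one validated object, as in the sketch below. The class `GenerationSettings` and its field names are hypothetical; only the ranges and the suggested step count of 10 come from the form text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container mirroring the form's numeric fields."""
    max_length: int                 # 1-2048 units, 0.04 s per unit
    cfg_value: float                # 0.0-10.0
    inference_timesteps: int = 10   # 1-100; 10 is the suggested value

    def __post_init__(self) -> None:
        # Reject values outside the ranges shown next to each field.
        if not 1 <= self.max_length <= 2048:
            raise ValueError("max_length must be in 1-2048")
        if not 0.0 <= self.cfg_value <= 10.0:
            raise ValueError("cfg_value must be in 0.0-10.0")
        if not 1 <= self.inference_timesteps <= 100:
            raise ValueError("inference_timesteps must be in 1-100")
```

Constructing the object with an out-of-range value fails immediately, before any generation work starts.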
Generate & Play
Generate Full Audio
Stop Generation
Full Audio Playback