VoiceCraft
3.00
VoiceCraft is an advanced tool designed for zero-shot speech editing and text-to-speech (TTS) tasks, particularly adept at handling diverse and uncontrolled data sources like audiobooks, internet videos, and podcasts.
Built on a token-infilling neural codec language model, VoiceCraft achieves state-of-the-art performance in both speech editing and zero-shot TTS. It needs only a few seconds of reference audio to clone or edit an unseen voice.
Key features include model weights available on HuggingFace, training guidance, and inference demos for speech editing and TTS. The tool offers multiple ways to run TTS inference, both with and without Docker.
It provides environment setup instructions and supports training and fine-tuning of models. Users can train VoiceCraft models using the provided datasets and manifest files, after preparing utterances, transcripts, and phoneme sequences.
The codebase is licensed under CC BY-NC-SA 4.0, while the model weights are under the Coqui Public Model License 1.0.0. The project acknowledges related projects and individuals and provides a citation for the VoiceCraft paper.
A disclaimer emphasizes ethical use of the technology and prohibits unauthorized speech generation or editing. Overall, VoiceCraft offers a sophisticated solution for speech editing and TTS tasks.
- This Tool is verified
- Added on August 25, 2024
Free Trial