Vidvoi is described as 'Create professional video voiceovers instantly with Vidvoi AI voiceover generator. No prompting needed. Perfect for content creators and marketers' and is an app in the AI tools & services category. There are more than 10 alternatives to Vidvoi for a variety of platforms, including web-based, Android, Mac, Windows, and Linux apps. The best Vidvoi alternative is VoiceCraft, which is both free and open source. Other great apps like Vidvoi are ElevenLabs, X to Voice, RHVoice, and Voice Engine.
VoiceCraft is a token infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data including audiobooks, internet videos, and podcasts.




ElevenLabs uses AI to deliver natural, expressive speech for diverse applications such as podcasts and videos. It offers a user-friendly interface, customizable intonation, and seamless API integration. Privacy, scalability, and multilingual capabilities enhance its adaptability.




X to Voice is an open-source tool that analyzes your X/Twitter profile data to generate a custom voice with the ElevenLabs Voice Design API, and integrates with Hedra's video API for a combined audio-visual experience.




Voice Engine is a text-to-voice generation platform from OpenAI, which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.


Transform text into speech with natural synthesis, offering smooth and fine-tuned audio export. Create high-quality voiceovers, download outputs for diverse applications, and experience excellent synthesis. Supports various languages and operates on multiple platforms.




Audiomatic is a web app that seamlessly translates videos into other languages. Our state-of-the-art pipeline delivers contextually accurate dubbed translations that preserve the tone, style, and emotion of the original speakers.



We're excited to introduce Chatterbox, Resemble AI's first production-grade open-source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs, and is consistently preferred in side-by-side evaluations.
TTSMaker is a free text-to-speech tool that provides speech synthesis services and supports multiple languages (English, French, German, Spanish, Arabic, Chinese, Japanese, Korean, Vietnamese...) and a variety of voice styles. You can use it to read text and e-books aloud, and can...

AIVocal is your all-in-one AI assistant for voice tasks, suited to AI podcasting, speech generation, vocal editing, and voice control. From transcribing meetings to creating high-quality audio content, AIVocal makes voice work smarter and faster.

Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. It was pushed to the Hub using the PyTorchModelHubMixin integration.

With SpeakMyVoice, you're always part of the conversation. Your voice, your story, your freedom.

