Wondercraft AI is described as 'Tool that allows users to easily create studio-quality podcasts using generative AI technology. It eliminates the need for extensive recording and scripting by allowing users to record just a 60-second sample of their voice, which the AI uses to clone their' and is a Text to Speech service in the audio & music category. There are more than 25 alternatives to Wondercraft AI, not only websites but also apps for a variety of platforms, including SaaS, iPhone, Mac and iPad apps. The best Wondercraft AI alternative is VoiceCraft, which is both free and Open Source. Other great sites and apps similar to Wondercraft AI are X to Voice, Voice Engine, Jellypod and Wondera.
VoiceCraft is a token-infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data, including audiobooks, internet videos, and podcasts.




Open-source tool that analyzes your X/Twitter profile data to generate a custom voice with the ElevenLabs Voice Design API, and integrates with Hedra's video API for a combined audio-visual experience.


Voice Engine is a text-to-voice generation platform from OpenAI, which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.


Create AI podcasts by uploading websites, PDFs, or documents, selecting customizable hosts and scripts, generating episodes with outline planning, editing outputs, and publishing audio content—streamlining production for creators without manual recording.




Sing karaoke and transform any song with your AI voice. No singing skill required: your AI voice can handle any song, even in other languages!




AI-powered tool for editing audio and video by editing transcribed text, featuring transcription, filler removal, captions, background replacement, rapid translation, collaboration, mastering, recording, and viral content identification features.






Converts printed or digital text, PDFs, and web articles into spoken audio with natural-sounding AI voices, adjustable listening speeds, cross-device syncing, offline access, support for scanning photos, and tools for organizing and managing content.




Choose from 60+ human-like, emotional voices in various accents, languages, and characters to turn any text into commercial-grade audio, or clone your own voice.


Syllaby is an AI-driven solution designed to simplify the process of creating social media videos, offering tools for topic discovery, script creation, video editing, publishing, and storytelling.




Amazon Polly uses deep learning technologies to synthesize natural-sounding human speech, so you can convert articles to speech. With dozens of lifelike voices across a broad set of languages, use Amazon Polly to build speech-activated applications.



Voicebox is a state-of-the-art speech generative model built upon Meta’s non-autoregressive flow matching model. By learning to solve a text-guided speech infilling task with a large scale of data, Voicebox outperforms single purpose AI models across speech tasks through...
