Until now, creating a believable back-and-forth dialogue with AI meant compromise. Either you stitched together individual voice clips and hoped for the best, or you accepted the robotic, turn-taking rhythm that never quite felt natural.
Wondercraft’s new Convo Mode changes that completely.
This is more than just a feature. It’s a shift in how audio is made: multiple voices, naturally flowing dialogue, and the kind of intonation and pacing that sounds like two humans talking. For satire, interviews, brand storytelling, fiction, and even training simulations, Convo Mode unlocks a new kind of creative power.
And the best part? You’re in total control.
From Static Scripts to Lifelike Conversations
We’ve spent the past year obsessing over how to make AI dialogues sound better. We started with narration—one voice, reading your script, polished and clean. Then we introduced tools to customize voices, change tones, and clone your own.
But even with the best voices, something was missing: real conversation.
With Convo Mode, that changes. You can now script entire multi-voice exchanges that flow like a natural conversation. Voices overlap. Characters interrupt each other. The pacing feels dynamic. You can add pauses, breaths, filler words like “um” or “you know,” and even emotional beats like laughter or hesitation.
The experience is powered by an evolution of the AI synthesis pipeline behind the NotebookLM podcast, known for its shockingly human tone. But unlike NotebookLM, where the audio is generated automatically from notes and documents, Convo Mode lets you direct the whole show.
Write the script. Choose the voices. Set the tone. And create conversations that actually sound like people talking.
So, What Exactly Is Convo Mode?
Convo Mode is a new tool inside Wondercraft Studio that lets creators generate realistic, multi-voice conversations from a written script. You can write freely—just like a screenplay or podcast episode—and assign different parts to different voices.
Once you hit generate, Wondercraft processes the full conversation as a unified scene. That means each voice reacts in context, not just to the line before it, but to the flow of the conversation overall. No stitching clips together. No “line A, then line B” rhythm.
Convo Mode understands pacing, tone, and even subtle cues like sarcasm or confusion. It’s not just AI reading your words—it’s AI performing them.
Why don’t you test it out on our free AI Podcast Generator tool? You can create realistic-sounding podcasts between fictional characters!
NotebookLM, but More
Audio Overviews by NotebookLM amazed everyone with their natural conversational rhythm—but they had one drawback: you couldn’t take them much further. Wondercraft, built on the same Gemini-based speech synthesis pipeline, fixes that with Convo Mode, giving you three game-changing benefits:
- Total length control – Decide exactly how long your podcast or audio piece runs.
- Full script editing – Tweak, rewrite, or add new content on the fly.
- Voice customization – Swap between languages, accents, or voices, or even upload your own.
Perfectly expressive, podcast-style conversations—now with every detail in your hands.
Why We Made This
Podcasting is evolving. Audiences expect more than monologues. They want dynamic formats. Interviews. Debates. Characters with distinct personalities. Dialogue that feels alive.
And until now, producing that meant real people, real mics, and a whole lot of editing.
Convo Mode removes the biggest barriers to entry. You don’t need a co-host. You don’t need a recording setup. You don’t need post-production. All you need is an idea and a script. Wondercraft handles the rest.
This isn’t just useful for podcast creators. It’s great for:
- Marketing and brand storytelling – Create realistic ad scripts, mock interviews, or product explainers with personality
- Internal communications – Build engaging, human-sounding training or onboarding content
- Fiction and drama – Craft entire scenes with multiple characters and voices, all in one place
- Education – Simulate dialogues for language learning or subject-based conversations
And since everything’s editable, you can refine tone, pacing, and delivery just like you would in a studio—without ever leaving your browser.
How to Make the Most of It
Here’s what we’ve learned so far from internal tests and early beta users:
1. Direct the delivery like a pro
Don’t be afraid to write things like [laughs], [sighs], or [awkward pause] into your script. Convo Mode recognizes natural dialogue cues, so the more you treat it like a screenplay, the better it performs.
2. Find the right chemistry
Mix and match voices until it feels real. Try pairing a warm, slow-paced narrator with a punchy, energetic character. Or experiment with accents and gender pairings for global reach.
3. Keep it conversational
Write like people talk, not like they write. That means contractions, filler words, hesitations, and even slight stumbles. Perfect grammar doesn’t always make for perfect audio.
4. Iterate fast
One of the joys of Convo Mode is how quickly you can tweak and regenerate. Got a line that’s too slow? Change the pacing. Want to try a different joke? Edit the script and hit regenerate.
5. Think bigger
You’re not limited to two voices or short snippets. Convo Mode works for full episodes, long-form interviews, multi-character scenes, or serialized stories. It scales with your ambition.
Ready to Create Your First Convo?
Convo Mode is now available to all Wondercraft Studio users. Just head to wondercraft.ai/studio/, open a new project, and toggle Convo Mode on.
You’ll be surprised how quickly a script becomes a show.
And the next time someone asks, “Who did your voice acting?”, you’ll get to say: “It’s all AI. I just wrote the script.”