Load pre-recorded voice samples with text as an alternative to uploading your own
For the dubbing workflow: the TRANSLATED text (target language) to synthesize in the speaker's voice.
For voice cloning: the EXACT transcript of the reference audio. For dubbing: the original-language transcript of what was said in the reference audio. Leave empty to auto-transcribe with Whisper.
Automatically transcribe the reference audio if the original utterance transcript is not provided (useful for dubbing)
Upload reference audio for voice cloning and/or emotion capture. For dubbing: This is the ORIGINAL audio (source language) used for voice cloning.
Emotion blending factor (0.0 to 1.0)
Select the enhancement engine to apply. 'none': disables enhancement. 'dsp': traditional DSP (fast, CPU). 'resemble': AI-powered denoising (GPU, 44.1 kHz). 'audiosr': super-resolution (GPU, 48 kHz).
Calculate Energy Similarity and Pitch Similarity scores and prepare data for comparison charts
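
The transcript fields described above imply a simple fallback: use the supplied reference transcript if there is one, otherwise auto-transcribe the reference audio when that option is enabled. The sketch below is only an illustration of that rule. The function name `resolve_reference_transcript`, its parameters, and the injected `transcribe` callable are hypothetical and not taken from the tool itself.

```python
from typing import Callable, Optional


def resolve_reference_transcript(
    reference_text: Optional[str],
    audio_path: str,
    auto_transcribe: bool,
    transcribe: Callable[[str], str],
) -> str:
    """Return the transcript to pair with the reference audio.

    All names here are illustrative assumptions, not the tool's real API.
    """
    # A non-empty, user-supplied transcript always takes priority.
    if reference_text and reference_text.strip():
        return reference_text.strip()

    # Nothing supplied and no fallback allowed: fail early with a clear message.
    if not auto_transcribe:
        raise ValueError(
            "No reference transcript given and auto-transcription is disabled."
        )

    # Fall back to speech recognition on the reference audio
    # (for example, a thin wrapper around a Whisper model).
    return transcribe(audio_path).strip()
```

With the `openai-whisper` package installed, one possible `transcribe` callable would be `lambda path: whisper.load_model("base").transcribe(path)["text"]`. Passing the transcriber in as an argument keeps the fallback logic testable without loading a model.
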
| ID | Status | Created | Duration |
|---|---|---|---|