🎤 DubaiTTS Voice Cloning

Test TTS services, prosody analysis, and voice conversion

📝 Input

Load Sample (optional)

Load pre-recorded voice samples with text as an alternative to uploading your own

Text to Synthesize

For the dubbing workflow: this is the TRANSLATED text (target language) that will be synthesized in the speaker's voice.

Original Utterance Transcript

For voice cloning: this is the EXACT transcript of the reference audio. For dubbing: this is the original-language transcript (what was said in the reference audio). Leave empty to auto-transcribe using Whisper.

Auto-Transcribe Reference Audio

Automatically transcribe the reference audio if the Original Utterance Transcript is not provided (useful for dubbing)
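The fallback described above (use the supplied transcript if present, otherwise auto-transcribe) can be sketched as follows. This is a minimal illustration, not the app's actual code: the function name `resolve_reference_transcript` and the injected `transcribe` callable are hypothetical, and any speech-to-text backend such as Whisper would be plugged in by the caller.

```python
from typing import Callable, Optional


def resolve_reference_transcript(
    transcript: Optional[str],
    audio_path: str,
    auto_transcribe: bool,
    transcribe: Callable[[str], str],
) -> Optional[str]:
    """Return the transcript to use for the reference audio.

    `transcribe` is a hypothetical callable (path -> text) supplied by the
    caller; it stands in for whatever speech-to-text backend is available.
    """
    text = (transcript or "").strip()
    if text:
        # A user-provided transcript always wins; no transcription call is made.
        return text
    if auto_transcribe:
        # Transcript field left empty: fall back to automatic transcription.
        return transcribe(audio_path).strip()
    # Nothing provided and auto-transcription disabled.
    return None
```

Passing the backend in as a callable keeps the sketch independent of any particular speech-to-text library.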

Original Utterance Sample

Upload reference audio for voice cloning and/or emotion capture. For dubbing: This is the ORIGINAL audio (source language) used for voice cloning.

Use for Voice Cloning

Use for Emotion Capture

🤖 TTS Model Selection

🎭 Voice Cloning

⚙️ Model Configuration

Emotion Alpha: 0.6

Emotion blending factor (0.0 to 1.0)
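The page does not say how the blending factor is applied. One common reading is a linear interpolation between a neutral representation and an emotion representation, sketched below. The function name `blend_emotion` and the use of plain float vectors are assumptions made for illustration only.

```python
from typing import List


def blend_emotion(neutral: List[float], emotion: List[float], alpha: float = 0.6) -> List[float]:
    """Linearly interpolate between two equal-length vectors.

    Assumed semantics (not confirmed by the source): alpha = 0.0 returns the
    neutral vector unchanged, alpha = 1.0 returns the emotion vector unchanged.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    if len(neutral) != len(emotion):
        raise ValueError("vectors must have the same length")
    return [(1.0 - alpha) * n + alpha * e for n, e in zip(neutral, emotion)]
```

Under this reading, the default of 0.6 weights the captured emotion slightly more than the neutral voice.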

Match Original Audio Length

Output Sample Rate (Hz)

✨ Post-Processing

Audio Enhancement Engine

Select the enhancement engine to apply. none: disables enhancement. dsp: traditional DSP (fast, CPU). resemble: AI-powered denoising (GPU, 44.1 kHz). audiosr: super-resolution (GPU, 48 kHz).
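The engine options above can be captured in a small lookup table. The sketch below is illustrative only: the `EngineInfo` type and `resolve_engine` function are hypothetical names, and treating `dsp` as keeping the input sample rate is an assumption, since the help text gives no rate for it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EngineInfo:
    description: str
    needs_gpu: bool
    output_rate_hz: Optional[int]  # None: assumed to keep the input rate


# Values taken from the help text; the dsp output rate is an assumption.
ENGINES: Dict[str, Optional[EngineInfo]] = {
    "none": None,  # enhancement disabled
    "dsp": EngineInfo("Traditional DSP (fast)", needs_gpu=False, output_rate_hz=None),
    "resemble": EngineInfo("AI-powered denoising", needs_gpu=True, output_rate_hz=44100),
    "audiosr": EngineInfo("Super-resolution", needs_gpu=True, output_rate_hz=48000),
}


def resolve_engine(name: str, gpu_available: bool) -> Optional[EngineInfo]:
    """Validate the selection and return its properties (None means 'no enhancement')."""
    if name not in ENGINES:
        raise ValueError(f"unknown enhancement engine: {name!r}")
    info = ENGINES[name]
    if info is not None and info.needs_gpu and not gpu_available:
        raise RuntimeError(f"engine {name!r} requires a GPU")
    return info
```

Checking GPU availability before dispatch lets a caller fall back to `dsp` or `none` instead of failing mid-request.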

Enable Prosody Analysis

Calculate Energy Similarity and Pitch Similarity scores and prepare data for comparison charts
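The page does not define how the Energy Similarity and Pitch Similarity scores are computed. As one plausible illustration, the sketch below resamples two contours (for example per-frame RMS energy, or per-frame pitch from any pitch tracker) to a common length and takes their Pearson correlation. All function names and the default frame sizes are assumptions.

```python
import math
from typing import List, Sequence


def frame_rms(samples: Sequence[float], frame_len: int = 400, hop: int = 200) -> List[float]:
    """Per-frame RMS energy contour of a mono signal."""
    if not samples:
        return []
    contour = []
    for start in range(0, max(len(samples) - frame_len + 1, 1), hop):
        frame = samples[start:start + frame_len]
        contour.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return contour


def resample(contour: Sequence[float], n: int) -> List[float]:
    """Linearly interpolate a contour to exactly n points."""
    if len(contour) == 1:
        return [contour[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(contour) - 1) / (n - 1) if n > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1.0 - frac) + contour[hi] * frac)
    return out


def contour_similarity(a: Sequence[float], b: Sequence[float], n: int = 100) -> float:
    """Pearson correlation of two contours after resampling to n points.

    Returns 0.0 when either contour is constant (correlation is undefined).
    """
    if not a or not b:
        raise ValueError("contours must be non-empty")
    x, y = resample(a, n), resample(b, n)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)
```

Resampling to a common length lets clips of different durations be compared; an energy score would use `contour_similarity(frame_rms(original), frame_rms(synthesized))`, and a pitch score would pass two pitch contours instead.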

📋 Async Jobs

Project ID

Utterance ID

Transcribed Text ID

Language Code

Translated Text ID

Description

ID	Status	Created	Duration

🔊 Output

Status

🔧 API Request Details

cURL Command

Request/Response Info