TTS Benchmark

Arabic & English Speech Synthesis

6 Arabic Models, 3 English Models

Playground

Generate speech from text using any model

Benchmark

Run all models on the same text and compare

Voice Cloning

Upload a reference voice (5–30 s) and clone it

Drop a WAV or MP3 file here, or click to browse

Tips for best results

  • Use 5–20 seconds of clean audio
  • Single speaker, minimal background noise
  • Providing a transcript of the reference audio improves cloning accuracy
  • F5-TTS, SILMA, and Coqui XTTS-v2 are the strongest options for cloning
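
The length guidance above can be checked before uploading. The following is a minimal sketch, assuming the reference clip is an uncompressed WAV file and using only Python's standard `wave` module. The function names and the 5 s and 30 s constants are illustrative: the constants mirror the accepted range stated on this page, and nothing here is taken from the app's own code.

```python
import wave

# Assumed limits, copied from the upload hint on this page (5–30 s).
MIN_SECONDS = 5.0
MAX_SECONDS = 30.0


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(duration_s, min_s=MIN_SECONDS, max_s=MAX_SECONDS):
    """Hypothetical helper: True if the clip length is within the accepted range."""
    return min_s <= duration_s <= max_s
```

A call such as `is_usable_reference(wav_duration_seconds("my_voice.wav"))` would then tell you whether a clip is worth uploading. MP3 files are not covered by this sketch, since the standard library has no MP3 decoder.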

Leaderboard

User ratings from this session

🇸🇦 Arabic Models

Model | Ratings | Avg Rating | Avg Gen (s) | Avg RTF

🇬🇧 English Models

Model | Ratings | Avg Rating | Avg Gen (s) | Avg RTF
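
For readers unfamiliar with the last two leaderboard columns: RTF (real-time factor) is conventionally the generation time divided by the duration of the audio produced, so values below 1 mean faster than real time. The sketch below shows one way such averages could be computed. Whether this page averages per-run RTFs, as done here, or divides total generation time by total audio time is an assumption, and the function names are hypothetical.

```python
from statistics import mean


def real_time_factor(gen_seconds, audio_seconds):
    """RTF = time spent generating / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return gen_seconds / audio_seconds


def summarize(runs):
    """Average generation time and RTF over (gen_seconds, audio_seconds) pairs.

    Assumption: the average RTF is the mean of per-run RTFs.
    """
    gen_times = [gen for gen, _ in runs]
    rtfs = [real_time_factor(gen, audio) for gen, audio in runs]
    return {"avg_gen_s": mean(gen_times), "avg_rtf": mean(rtfs)}
```

For example, two runs that take 1 s and 3 s to produce 4 s of audio each give an average generation time of 2 s and an average RTF of 0.5.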