British English TTS Voice — Ultra Low Latency
Description
A fine-tuned text-to-speech voice model with a natural, clear British English accent.
**Specifications:**
- Base model: Coqui XTTS v2
- Training data: 40 hours of professional British voice acting
- Latency: <80 ms to first token on an RTX 3080
- Sample rate: 24 kHz
- Languages: English (British) primary, handles mixed UK/US text gracefully
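To check the latency figure on your own hardware, one option is to time how long a streaming generator takes to yield its first audio chunk. The sketch below is illustrative only: `fake_stream` is a hypothetical stand-in for whatever generator the bundled inference script exposes, and the chunk size and delay are made-up values. Only the 24 kHz sample rate comes from the specifications above.

```python
import time
from typing import Iterable, Iterator, List, Tuple

SAMPLE_RATE_HZ = 24_000  # sample rate stated in the listing


def time_to_first_chunk(chunks: Iterable[List[float]]) -> Tuple[float, float, int]:
    """Return (seconds until first chunk, total seconds, total samples)."""
    start = time.perf_counter()
    first = None
    total_samples = 0
    for chunk in chunks:
        if first is None:
            # Latency as the listener experiences it: request to first audio
            first = time.perf_counter() - start
        total_samples += len(chunk)
    total = time.perf_counter() - start
    if first is None:
        raise ValueError("stream produced no audio chunks")
    return first, total, total_samples


def fake_stream(n_chunks: int = 3, chunk_samples: int = 2400,
                delay_s: float = 0.01) -> Iterator[List[float]]:
    """Hypothetical stand-in for the model's streaming generator."""
    for _ in range(n_chunks):
        time.sleep(delay_s)  # simulated synthesis time per chunk
        yield [0.0] * chunk_samples


first_s, total_s, n_samples = time_to_first_chunk(fake_stream())
audio_s = n_samples / SAMPLE_RATE_HZ  # duration of the generated audio
print(f"first chunk: {first_s * 1000:.1f} ms, "
      f"audio: {audio_s:.2f} s, "
      f"real-time factor: {total_s / audio_s:.2f}")
```

Swap `fake_stream()` for the real generator to get a measurement. A real-time factor below 1.0 means audio is produced faster than it plays back.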
**Quality:**
- MOS: 4.3/5.0 (human evaluation, n=100)
- Naturalness comparable to ElevenLabs standard voices
- Handles acronyms, numbers, and technical terms correctly
**What you get:**
- Model weights (.pth file, ~1.8 GB)
- Python inference script
- FastAPI server wrapper for real-time streaming
- Commercial use licence (no attribution required)
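For a sense of what the streaming side of a server wrapper involves, here is a minimal sketch of turning model output into 16-bit PCM bytes at 24 kHz. It is not the bundled wrapper. It assumes the model yields chunks of float samples in the range [-1, 1], and every function name here is hypothetical. A byte generator like `pcm_stream` could, for example, be passed to FastAPI's `StreamingResponse`.

```python
import io
import struct
import wave
from typing import Iterable, Iterator, List

SAMPLE_RATE_HZ = 24_000  # sample rate stated in the listing


def float_to_pcm16(samples: Iterable[float]) -> bytes:
    """Clip floats to [-1, 1] and pack as little-endian signed 16-bit PCM."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def pcm_stream(chunks: Iterable[List[float]]) -> Iterator[bytes]:
    """Yield PCM bytes chunk by chunk so playback can start early."""
    for chunk in chunks:
        yield float_to_pcm16(chunk)


def to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw mono PCM in a WAV container (for saving, not streaming)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono is an assumption
        wav.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buf.getvalue()


# Made-up sample values standing in for model output
chunks = [[0.0, 0.5], [-1.0, 2.0]]
pcm = b"".join(pcm_stream(chunks))
wav_bytes = to_wav_bytes(pcm)
```

Raw PCM suits streaming because no total length has to be known up front. The WAV helper is only useful once the full clip exists.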
**Requirements:** CUDA GPU recommended, CPU inference supported (slower)