British English TTS Voice — Ultra Low Latency


Description

A fine-tuned text-to-speech voice model with a natural, clear British English accent.

**Specifications:**

- Base model: Coqui XTTS v2

- Training data: 40 hours of professional British voice acting

- Latency: <80 ms to first token on an RTX 3080

- Sample rate: 24 kHz

- Languages: English (British) primary, handles mixed UK/US text gracefully
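
One way to check the latency figure on your own hardware is to time how long a streaming synthesizer takes to yield its first audio chunk. The following is a minimal sketch, not the bundled inference script: `fake_stream` is a hypothetical stand-in for the model's streaming call, and 16-bit mono PCM is an assumption that the listing does not state.

```python
import time
from typing import Iterable, Iterator, Tuple

SAMPLE_RATE_HZ = 24_000   # from the listing
BYTES_PER_SAMPLE = 2      # assumption: 16-bit mono PCM


def chunk_bytes(duration_ms: int) -> int:
    """Size in bytes of a PCM chunk covering duration_ms of audio."""
    return SAMPLE_RATE_HZ * duration_ms // 1000 * BYTES_PER_SAMPLE


def fake_stream(n_chunks: int = 3, delay_s: float = 0.01) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming synthesizer; yields silence."""
    for _ in range(n_chunks):
        time.sleep(delay_s)  # simulated compute time per chunk
        yield bytes(chunk_bytes(80))


def time_to_first_chunk(stream: Iterable[bytes]) -> Tuple[float, bytes]:
    """Return (seconds until the first chunk arrived, that chunk)."""
    start = time.perf_counter()
    first = next(iter(stream))
    return time.perf_counter() - start, first


latency_s, first_chunk = time_to_first_chunk(fake_stream())
print(f"first chunk: {len(first_chunk)} bytes after {latency_s * 1000:.1f} ms")
```

To measure the real model, pass its streaming generator to `time_to_first_chunk` in place of `fake_stream()`, and discard the first run so that model loading and GPU warm-up are not counted.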

**Quality:**

- Mean opinion score (MOS): 4.3/5.0 (human evaluation, n=100)

- Naturalness comparable to ElevenLabs standard voices

- Handles acronyms, numbers, and technical terms correctly
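
For buyers who want to run their own listening test, a mean opinion score is the average of 1-to-5 listener ratings, usually reported with a confidence interval. The sketch below shows that arithmetic under a normal approximation. The function name `mos_with_ci` is hypothetical, and the ratings are made up for illustration; they are not the evaluation data behind the figure above.

```python
import math
import statistics
from typing import Sequence, Tuple


def mos_with_ci(ratings: Sequence[int], z: float = 1.96) -> Tuple[float, float]:
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mean = statistics.mean(ratings)
    # Normal approximation: z * standard error of the mean.
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, half_width


# Made-up ratings for illustration only; not the seller's evaluation data.
example = [5, 4, 4, 5, 3, 4, 5, 4, 3, 5]
mos, ci = mos_with_ci(example)
print(f"MOS = {mos:.2f} +/- {ci:.2f} (n={len(example)})")
```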

**What you get:**

- Model weights (.pth file, ~1.8 GB)

- Python inference script

- FastAPI server wrapper for real-time streaming

- Commercial use licence (no attribution required)
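
If you want to save streamed audio rather than play it live, the chunks can be collected into a WAV file at the model's 24 kHz sample rate. This is a rough sketch using only the Python standard library. It assumes the stream delivers raw 16-bit mono PCM, which the listing does not specify, and `write_wav` is a hypothetical helper rather than part of the bundled script.

```python
import os
import tempfile
import wave
from typing import Iterable

SAMPLE_RATE_HZ = 24_000  # from the listing


def write_wav(path: str, chunks: Iterable[bytes]) -> int:
    """Write 16-bit mono PCM chunks to a WAV file; return frames written."""
    frames = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)             # assumption: mono
        wav.setsampwidth(2)             # assumption: 16-bit samples
        wav.setframerate(SAMPLE_RATE_HZ)
        for chunk in chunks:
            wav.writeframes(chunk)
            frames += len(chunk) // 2
    return frames


# Stand-in for streamed audio: five 100 ms chunks of silence.
silence = [bytes(2400 * 2) for _ in range(5)]
wav_path = os.path.join(tempfile.gettempdir(), "tts_demo.wav")
n_frames = write_wav(wav_path, silence)
print(f"wrote {n_frames / SAMPLE_RATE_HZ:.2f} s of audio to {wav_path}")
```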

**Requirements:** CUDA GPU recommended, CPU inference supported (slower)
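
A script that should run on either kind of machine can choose the device at start-up. The snippet below is a small sketch of that fallback, assuming PyTorch is the runtime (XTTS v2 is a PyTorch model). `pick_device` is a hypothetical helper, not a function from the bundled script.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # PyTorch
    except ImportError:
        # Sketch only: real inference would still need PyTorch installed.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"running inference on: {device}")
```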
