Commit 49d9bf0

fullstack-admin committed: readme updated as per new fork
1 parent cdab333, commit 49d9bf0

1 file changed: +95 -2 lines changed

README.md (95 additions & 2 deletions)
@@ -14,6 +14,25 @@ A live demo is hosted on Hugging Face Spaces. If you'd like to avoid a queue, pl

 https://huggingface.co/spaces/Manmay/tortoise-tts

+## Quick Start (Recommended)
+
+**New! Web UI for easy TTS generation with visual interface, real-time progress tracking, and audio playlist.**
+
+```shell
+conda create --name tortoise python=3.11 numba inflect -y
+conda activate tortoise
+pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
+git clone https://github.com/neonbjb/tortoise-tts.git
+cd tortoise-tts
+pip install -e .
+pip install flask soundfile
+
+# Start the web interface
+python web_ui.py
+```
+
+Then open http://localhost:5000 in your browser!
+
 ## Install via pip
 ```bash
 pip install tortoise-tts
@@ -68,15 +87,17 @@ pip install transformers==4.29.2
 git clone https://github.com/neonbjb/tortoise-tts.git
 cd tortoise-tts
 pip install -e .
+pip install flask soundfile==0.12.1
 ```

+> [!IMPORTANT]
+> **Python 3.11 is required** - PyTorch does not support Python 3.13+. Use `soundfile==0.12.1` to avoid compatibility issues.
+
 Optionally, pytorch can be installed in the base environment, so that other conda environments can use it too. To do this, simply send the `conda install pytorch...` line before activating the tortoise environment.

 > [!NOTE]
 > When you want to use tortoise-tts, you will always have to ensure the `tortoise` conda environment is activated.

-If you are on windows, you may also need to install pysoundfile: `conda install -c conda-forge pysoundfile`
-
 ### Docker

 An easy way to hit the ground running and a good jumping off point depending on your use case.
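
The `[!IMPORTANT]` note added in the hunk above pins Python 3.11 and rules out 3.13+. As a rough illustration, a launcher script could check the interpreter before importing PyTorch. This is only a sketch: the function name `check_python` is hypothetical, and rejecting 3.12 is an assumption drawn from the note's "3.11 is required" wording, not from PyTorch's own support matrix.

```python
import sys

# Taken from the README note. Assumption: only this major.minor pair is accepted.
REQUIRED = (3, 11)


def check_python(version=None):
    """Return True when the given (or running) interpreter is Python 3.11."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) == REQUIRED


if __name__ == "__main__":
    if not check_python():
        found = ".".join(map(str, sys.version_info[:2]))
        print(f"Python 3.11 is required, found {found}")
```

Passing an explicit tuple, as in `check_python((3, 13, 0))`, makes the check easy to exercise without switching interpreters.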
@@ -134,6 +155,49 @@ Be aware that DeepSpeed is disabled on Apple Silicon since it does not work. The
 You may need to prepend `PYTORCH_ENABLE_MPS_FALLBACK=1` to the commands below to make them work since MPS does not support all the operations in Pytorch.


+### Web UI (Recommended)
+
+**The easiest way to use Tortoise TTS!** A full-featured web interface with:
+
+- 🎨 **4-column layout**: Settings, content, audio playlist, and debug console
+- 🎯 **Stage-based progress tracking**: Real-time progress with accurate stage detection
+- 🎵 **Audio playlist**: Persistent playlist with playback controls (localStorage)
+- 📁 **Smart file management**: Auto-save to Music folder with intelligent naming
+- 🎤 **Voice management**: Upload custom voices, delete voices, batch generate .pth files
+- 🔧 **Service controls**: Restart, stop, open output folder
+- 🐞 **Debug console**: Real-time color-coded logs
+- ⏹️ **Cancel generation**: Stop long-running generations
+- 📊 **System monitoring**: CPU, RAM, and GPU usage
+
+```shell
+# Start the web UI
+python web_ui.py
+
+# Or use the provided scripts
+start_webui.bat # Windows batch file
+start_webui.ps1 # PowerShell script
+```
+
+Then open http://localhost:5000 in your browser.
+
+**Output Location**: Files are automatically saved to `%USERPROFILE%\Music\Tortoise Output`
+**File Naming**: `{voice}-{preset}-{candidates}x-{number}.wav` (e.g., `tom-fast-1x-001.wav`)
+
+### Voice Preparation
+
+For faster voice loading, pre-compute conditioning latents (.pth files):
+
+```shell
+# Generate .pth for a single voice
+python tortoise\get_conditioning_latents.py --voice VOICE_NAME --output_path tortoise\voices\VOICE_NAME
+
+# Batch generate .pth for all voices (Windows)
+generate_all_pth.bat # Batch script
+generate_all_pth.ps1 # PowerShell script
+```
+
+Pre-computed .pth files reduce voice loading time from 10-30 seconds to instant.
+
 ### do_tts.py

 This script allows you to speak a single phrase with one or more voices.
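
The Web UI hunk above documents an output folder and a file naming pattern but not the code behind them. A minimal sketch of how such a name and path could be built is shown below. The helper names `output_name` and `output_dir` are hypothetical, the three-digit zero padding is inferred only from the `tom-fast-1x-001.wav` example, and using the home directory in place of `%USERPROFILE%` is an assumption.

```python
from pathlib import Path


def output_name(voice, preset, candidates, number):
    """Build a name following {voice}-{preset}-{candidates}x-{number}.wav.

    Assumption: the counter is zero-padded to three digits, as in the
    README example tom-fast-1x-001.wav.
    """
    return f"{voice}-{preset}-{candidates}x-{number:03d}.wav"


def output_dir(home=None):
    """Return <home>/Music/Tortoise Output.

    Assumption: the user's home directory stands in for %USERPROFILE%.
    """
    base = Path(home) if home is not None else Path.home()
    return base / "Music" / "Tortoise Output"


if __name__ == "__main__":
    print(output_dir() / output_name("tom", "fast", 1, 1))
```

Keeping the name and the directory in separate helpers lets the naming rule be checked without touching the file system.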
@@ -211,6 +275,35 @@ tts = api.TextToSpeech(use_deepspeed=True, kv_cache=True, half=True)
 pcm_audio = tts.tts_with_preset("your text here", voice_samples=reference_clips, preset='fast')
 ```

+## Performance Notes
+
+> [!WARNING]
+> **Tortoise TTS is slow by design.** The autoregressive architecture processes sequentially, making it 10-100x slower than diffusion-only models like Stable Diffusion.
+
+**Expected generation times** (RTX 3060, batch_size=4):
+- `ultra_fast`: ~30 seconds (16 autoregressive samples)
+- `fast`: ~8 minutes (96 samples, 24 batches)
+- `standard`: ~20 minutes (256 samples, 64 batches)
+- `high_quality`: ~40+ minutes (256 samples, 100 diffusion steps)
+
+**Optimization tips**:
+1. Use `ultra_fast` for quick testing
+2. Pre-compute voice .pth files (instant loading vs 10-30s)
+3. Reduce `autoregressive_batch_size` if experiencing system instability (default: 4)
+4. Monitor GPU memory usage - reduce candidates or preset if OOM errors occur
+
+See `SPEED_GUIDE.md` for detailed performance information.
+
+## Files and Documentation
+
+- `WEB_UI_README.md` - Comprehensive web UI documentation
+- `QUICK_START.md` - Quick reference guide
+- `SPEED_GUIDE.md` - Performance characteristics explained
+- `TROUBLESHOOTING_RESULTS.md` - Common issues and solutions
+- `SOUNDFILE_FIX.md` - soundfile compatibility fix
+- `voice_customization_guide.md` - Creating custom voices
+- `Advanced_Usage.md` - Advanced features and techniques
+
 ## Acknowledgements

 This project has garnered more praise than I expected. I am standing on the shoulders of giants, though, and I want to
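
The batch counts quoted in the Performance Notes hunk above (24 batches for 96 samples, 64 for 256) follow from dividing the autoregressive sample count by the batch size of 4. The sketch below reproduces that arithmetic. The helper name `num_batches` and the `PRESET_SAMPLES` table are illustrative only, with sample counts copied from the list in the hunk; rounding up for counts that do not divide evenly is an assumption.

```python
import math

# Autoregressive sample counts as listed in the Performance Notes hunk.
PRESET_SAMPLES = {
    "ultra_fast": 16,
    "fast": 96,
    "standard": 256,
    "high_quality": 256,
}


def num_batches(samples, batch_size=4):
    """Number of autoregressive batches needed for the given sample count.

    Assumption: a final partial batch still counts as one batch.
    """
    return math.ceil(samples / batch_size)


if __name__ == "__main__":
    for preset, samples in PRESET_SAMPLES.items():
        print(f"{preset}: {samples} samples -> {num_batches(samples)} batches")
```

Halving the batch size, as the optimization tips suggest for unstable systems, doubles the batch count: `num_batches(96, batch_size=2)` gives 48.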
