Style TTS Server

We have developed a Text-to-Speech (TTS) server built on StyleTTS2 that synthesizes 100 characters of text in 410 milliseconds on an AWS G5 large instance. It runs as a simple HTTP server, so it is easy to integrate with other applications. Through the API, users can access StyleTTS2 features such as voice cloning and text-to-audio conversion to produce natural-sounding audio from text input.

Installation

python3 -m venv env
. ./env/bin/activate
pip install -r requirements.txt

Start Server

python3 main.py

By default, the server uses port 8700.
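
A quick way to confirm the server has started is to check that something is accepting TCP connections on that port. The sketch below is a hypothetical helper, not part of this repository; it assumes the server runs on the local machine and only tests reachability, not that the /tts endpoint is healthy.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 8700, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("server reachable:", is_port_open())
```

If this prints False right after starting main.py, the model may still be loading, so it can be worth retrying after a few seconds.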

Making an API Call

import time
from base64 import b64decode

import requests

headers = {
    'accept': 'application/json',
    'Content-Type': 'application/json',
}

json_data = {
    'text': 'hello world i am R Ansh Joseph whats your name',
    'rate': 8000,            # output audio sample rate
    'voice_id': 'default',   # matches the audio file name in the voices folder
    'alpha': 0.3,
    'beta': 0.7,
    'diffusion_steps': 5,
    'embedding_scale': 1,
}

# Time the full round trip to the server
start = time.time()
response = requests.post('http://127.0.0.1:8700/tts', headers=headers, json=json_data)
response.raise_for_status()  # fail early on an HTTP error instead of on a missing key
result = response.json()
print(time.time() - start)

# The audio is returned as a base64-encoded string under the 'audio' key
with open("audio.wav", 'wb') as file:
    file.write(b64decode(result['audio']))

Note: to change the audio sample rate, change rate in json_data; to change the voice, change voice_id.
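
To confirm that the saved file really has the requested sample rate, its header can be read back with Python's standard wave module. This is a minimal sketch with a hypothetical helper name (wav_info); it assumes the server returns a complete PCM WAV file, which this README does not state explicitly. If the file is encoded differently (for example as floating-point samples), wave.open may raise wave.Error.

```python
import wave


def wav_info(path: str) -> dict:
    """Read the sample rate, channel count, and duration from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            # Duration is the frame count divided by frames per second
            "duration_seconds": frames / rate if rate else 0.0,
        }
```

For the request above, calling wav_info("audio.wav") should report a sample_rate equal to the rate value sent in json_data, provided the assumption about the file format holds.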

Adding More Voices

To add a voice, put an audio file in the voices directory. The file name is the voice_id for that voice.

For Example

By default, the voices folder contains default.wav. To add a new voice, put a new audio file in this folder.

To use the new voice, set voice_id in the payload, as in this example:

json_data = {
    'text': 'hello world i am R Ansh Joseph whats your name',
    'rate': 8000,
    'voice_id': 'ansh',
    'alpha': 0.3,
    'beta': 0.7,
    'diffusion_steps': 5,
    'embedding_scale': 1,
}

Note: if you add a new voice while the server is running, restart the server.
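
Since the file name doubles as the voice_id, the IDs that should be usable can be listed by scanning the voices folder. The sketch below is a hypothetical helper, not the server's own loading code; it assumes that voices are .wav files (as default.wav is) and that the voice_id is the file name without its extension.

```python
from pathlib import Path


def list_voice_ids(voices_dir: str = "voices") -> list[str]:
    """Return sorted voice IDs, taken as the stems of .wav files in voices_dir."""
    folder = Path(voices_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p.stem
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    )


if __name__ == "__main__":
    print(list_voice_ids())
```

Run from the repository root, this lists what is on disk; a voice added after the server started still needs the restart mentioned above before the server will accept it.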

Audio Sample

Note: for single-word input, the model produces weird-sounding audio.

Repository: bolna-ai/StyleTTS_Server
