forked from jaywalnut310/vits

VITS-based zero-shot TTS system varying with diverse style/speaker conditioning methods.

hcy71o/SC-VITS

(Ongoing) Zero-shot TTS based on VITS

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Note

  1. This repository aims to implement a VITS-based zero-shot TTS system and to compare several style/speaker conditioning methods within it.
  2. To isolate the conditioning method from secondary factors, we extract the style representation with the reference encoder from StyleSpeech, trained jointly with the rest of the model. Specifically, (a) we do not use pretrained models (e.g., Link1, Link2) as the reference encoder, and (b) we do not apply meta-learning or a speaker verification loss during training.
  3. The LibriTTS dataset (train-clean-100 and train-clean-360) is used for training.

| Model | Text Encoder | Flow | Posterior Encoder | Vocoder |
| --- | --- | --- | --- | --- |
| master (YourTTS) | Output addition | Global conditioning | Global conditioning | Input addition |
| transfer (TransferTTS) | None | Global conditioning | None | None |
| s1 (Proposed) | SC-CNN | Global conditioning | Global conditioning | Input addition |
| s2 (Proposed) | SC-CNN | SC-CNN | SC-CNN | TBD |

  • master
  • transfer
  • s1
  • s2
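
The table above names conditioning methods without defining them. As a rough illustration only, the sketch below contrasts two of them under their common usage: "input addition" adds a projected style vector to every frame once, at a module's input, while "global conditioning" (in the WaveNet sense) adds per-layer projections of the style vector inside a gated activation. This is not the repository's code: the function names, the plain-list tensors, and the identity stand-in for the convolution are all assumptions made to keep the example short. SC-CNN and output addition are not covered.

```python
import math


def project(weights, style):
    """Linear projection of the style vector: one output value per channel."""
    return [sum(w * s for w, s in zip(row, style)) for row in weights]


def input_addition(frames, style, weights):
    """Add the projected style vector to every frame, once, at the module input."""
    bias = project(weights, style)
    return [[h + b for h, b in zip(frame, bias)] for frame in frames]


def gated_unit_global_conditioning(frames, style, w_filter, w_gate):
    """WaveNet-style gated activation: tanh(h + V_f g) * sigmoid(h + V_g g).

    Assumption: the convolution over the frames is replaced by the identity,
    so only the conditioning path is shown.
    """
    f_bias = project(w_filter, style)
    g_bias = project(w_gate, style)
    out = []
    for frame in frames:
        out.append([
            math.tanh(h + fb) / (1.0 + math.exp(-(h + gb)))
            for h, fb, gb in zip(frame, f_bias, g_bias)
        ])
    return out


# Two frames with two channels each, and a one-dimensional style vector.
frames = [[0.0, 0.0], [1.0, 1.0]]
style = [1.0]
shifted = input_addition(frames, style, [[2.0], [3.0]])
gated = gated_unit_global_conditioning(frames, style, [[0.0], [0.0]], [[0.0], [0.0]])
```

In a stacked network, the first function would run once before the first layer, whereas the second would run in every layer with its own projection weights.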

Pre-requisites

  1. Python >= 3.6
  2. Clone this repository
  3. Install the Python requirements; please refer to requirements.txt.
    1. You may need to install espeak first: apt-get install espeak
  4. Download datasets
  5. Build Monotonic Alignment Search and run preprocessing if you use your own datasets.
# Cython version of Monotonic Alignment Search
cd monotonic_align
python setup.py build_ext --inplace
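
The extension built above implements Monotonic Alignment Search in Cython for speed. To show the idea behind it, here is a minimal pure-Python sketch of the dynamic program. It is not the repository's code: the function name `maximum_path`, the list-of-lists input, and the `-1e9` sentinel are assumptions for illustration. Given a score matrix `value[y][x]` (the log-likelihood of aligning frame `y` to text token `x`), it returns the best alignment in which each frame maps to one token, the path starts at the first token, ends at the last, and the token index never decreases or skips.

```python
NEG_INF = -1e9  # sentinel score for unreachable cells (assumed value)


def maximum_path(value):
    """Return a 0/1 alignment matrix with the same shape as `value`."""
    t_y = len(value)        # number of frames
    t_x = len(value[0])     # number of text tokens
    if t_x > t_y:
        raise ValueError("need at least as many frames as tokens")

    # q[y][x]: best cumulative score of a monotonic path ending at (y, x).
    q = [[NEG_INF] * t_x for _ in range(t_y)]
    q[0][0] = value[0][0]   # every path starts at the first token
    for y in range(1, t_y):
        for x in range(min(t_x, y + 1)):   # token x is unreachable before frame x
            stay = q[y - 1][x]
            advance = q[y - 1][x - 1] if x > 0 else NEG_INF
            q[y][x] = value[y][x] + max(stay, advance)

    # Backtrack from the last token at the last frame.
    path = [[0] * t_x for _ in range(t_y)]
    x = t_x - 1
    for y in range(t_y - 1, -1, -1):
        path[y][x] = 1
        if y > 0 and x > 0 and (x == y or q[y - 1][x] < q[y - 1][x - 1]):
            x -= 1
    return path


# Four frames, two tokens: the first two frames favor token 0, the last two token 1.
scores = [[0.0, -5.0], [-1.0, -4.0], [-6.0, -1.0], [-7.0, 0.0]]
alignment = maximum_path(scores)
# alignment == [[1, 0], [1, 0], [0, 1], [0, 1]]
```

The nested loops run once per (frame, token) pair for each utterance, which is why a compiled version is worth building.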

Training Example

python train_zs.py -c configs/libritts_base.json -m libritts_base

Inference Example

See inference.ipynb
