
Request info on training data used for pre-trained models #69

Open
rppravin opened this issue Dec 13, 2020 · 1 comment

Comments

@rppravin

Thanks for the code.

Can you please share details on the data used to train the pre-trained models, both for AutoVC and the speaker embedding? If you trained on a subset of a larger database, please include that information as well.

Best,
Pravin

@ruclion

ruclion commented Dec 23, 2020

In the author's paper:

  • Speaker embedding: trained on "the combination of VoxCeleb1 (Nagrani et al., 2017) and Librispeech (Panayotov et al., 2015) corpora, where there are a total of 3549 speakers"
  • Vocoder: a WaveNet vocoder pre-trained on the VCTK corpus using the method described in Shen et al. (2018)
  • AutoVC (content encoder & decoder): the VCTK corpus, which has 109 speakers; in the paper, one task uses 20 speakers and another uses 40, but we don't know which speakers the author used

We are trying to re-implement the same loss on VCTK; let's pool what we learn about the author's work together~
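Since the paper does not list which 20 or 40 VCTK speakers were used, one common fallback when re-implementing is to draw a fixed-seed random subset so the split is at least reproducible across runs. A minimal sketch (the speaker-ID range below is a placeholder, not the actual VCTK numbering, which skips a few IDs):

```python
import random

def pick_vctk_subset(speaker_ids, n, seed=0):
    """Deterministically sample n speaker IDs with a seeded RNG.

    The AutoVC paper does not say which VCTK speakers were used,
    so a fixed-seed subset is only a reproducibility stand-in.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(speaker_ids, n))

# 109 placeholder IDs; real VCTK IDs run p225..p376 with gaps.
all_speakers = [f"p{i}" for i in range(225, 334)]
train_20 = pick_vctk_subset(all_speakers, 20)
train_40 = pick_vctk_subset(all_speakers, 40)
```

Any results should still note that the subset may differ from the author's, since speaker choice can affect conversion quality.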
