Datasets#
TTS Dataset#
- class TTS.tts.datasets.TTSDataset(outputs_per_step=1, compute_linear_spec=False, ap=None, samples=None, tokenizer=None, compute_f0=False, compute_energy=False, f0_cache_path=None, energy_cache_path=None, return_wav=False, batch_group_size=0, min_text_len=0, max_text_len=inf, min_audio_len=0, max_audio_len=inf, phoneme_cache_path=None, precompute_num_workers=0, speaker_id_mapping=None, d_vector_mapping=None, language_id_mapping=None, use_noise_augment=False, start_by_longest=False, verbose=False)[source]#
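The `min_text_len`/`max_text_len` and `min_audio_len`/`max_audio_len` arguments bound which samples are kept for training. A minimal sketch of that length-based filtering step, using a hypothetical sample-dict format (not `TTSDataset`'s internal representation):

```python
# Illustrative sketch of length-based sample filtering, as configured by
# min_text_len / max_text_len and min_audio_len / max_audio_len.
# The sample format here is hypothetical, not the library's internal one.
def filter_samples(samples, min_text_len=0, max_text_len=float("inf"),
                   min_audio_len=0, max_audio_len=float("inf")):
    kept = []
    for s in samples:
        text_len = len(s["text"])
        audio_len = s["audio_len"]  # e.g. number of samples in the wav
        if (min_text_len <= text_len <= max_text_len
                and min_audio_len <= audio_len <= max_audio_len):
            kept.append(s)
    return kept

samples = [
    {"text": "hi", "audio_len": 8000},
    {"text": "a much longer utterance", "audio_len": 160000},
]
kept = filter_samples(samples, min_text_len=3, max_audio_len=200000)
print(len(kept))  # only the second sample passes the text-length bound
```

Samples outside the bounds are silently dropped, so overly tight limits can shrink the training set unexpectedly.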
Vocoder Dataset#
- class TTS.vocoder.datasets.gan_dataset.GANDataset(ap, items, seq_len, hop_len, pad_short, conv_pad=2, return_pairs=False, is_training=True, return_segments=True, use_noise_augment=False, use_cache=False, verbose=False)[source]#
GANDataset searches for all wav files under the root path, converts them to acoustic features on the fly, and returns random segments of (audio, feature) pairs.
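Drawing an aligned (audio, feature) segment requires converting between audio samples and feature frames via `hop_len`. A sketch of that alignment, assuming one feature frame per `hop_len` audio samples (names and shapes here are illustrative, not the library's internals):

```python
import random

# Sketch of aligned random segment extraction for (audio, feature) pairs.
# seq_len is the audio segment length in samples; features are assumed to
# have one frame per hop_len audio samples. Illustrative only.
def random_segment(audio, feats, seq_len, hop_len, seed=None):
    rng = random.Random(seed)
    frames_per_seg = seq_len // hop_len
    max_frame_start = len(feats) - frames_per_seg
    # Pick the start on the frame grid so audio and features stay aligned.
    frame_start = rng.randint(0, max_frame_start)
    sample_start = frame_start * hop_len
    audio_seg = audio[sample_start:sample_start + seq_len]
    feat_seg = feats[frame_start:frame_start + frames_per_seg]
    return audio_seg, feat_seg

hop_len = 4
audio = list(range(32))                     # 32 audio samples
feats = list(range(32 // hop_len))          # 8 feature frames, one per hop
a_seg, f_seg = random_segment(audio, feats, seq_len=8, hop_len=hop_len, seed=0)
print(len(a_seg), len(f_seg))  # 8 audio samples, 2 feature frames
```

Choosing the start index on the frame grid (rather than at an arbitrary sample) is what keeps the audio slice and the feature slice describing the same span of time.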
- class TTS.vocoder.datasets.wavegrad_dataset.WaveGradDataset(ap, items, seq_len, hop_len, pad_short, conv_pad=2, is_training=True, return_segments=True, use_noise_augment=False, use_cache=False, verbose=False)[source]#
WaveGradDataset searches for all wav files under the root path, converts them to acoustic features on the fly, and returns random segments of (audio, feature) pairs.
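Clips shorter than the training segment length cannot yield a full random segment, which is what the `pad_short` argument addresses. A hedged sketch of zero-padding short clips in that spirit (purely illustrative, not the library's code):

```python
# Sketch of handling short clips, in the spirit of the pad_short argument:
# audio shorter than the segment length is zero-padded, with pad_short
# extra samples of headroom, so a full segment can still be drawn.
# Illustrative only; the library's exact padding policy may differ.
def pad_short_audio(audio, seq_len, pad_short):
    if len(audio) < seq_len:
        deficit = (seq_len - len(audio)) + pad_short
        audio = audio + [0.0] * deficit
    return audio

clip = [0.1] * 10
padded = pad_short_audio(clip, seq_len=16, pad_short=4)
print(len(padded))  # 10 + (16 - 10) + 4 = 20
```

Clips already at or above `seq_len` are returned unchanged.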