Abstract: Vocoder-based speech synthesis has become a promising technique for meeting the demand for high-quality speech analysis, manipulation, and synthesis. However, most existing work focuses on ...
Although deep neural networks have driven significant progress in neural vocoders in recent years, these vocoders usually suffer from intrinsic challenges such as opaque modeling, inflexible retraining under ...
DisCoder is a neural vocoder that leverages a generative adversarial encoder-decoder architecture informed by a neural audio codec to reconstruct high-fidelity 44.1 kHz audio from mel spectrograms.