Data manipulation and transformation for audio signal processing
Robust Speech Recognition via Large-Scale Weak Supervision
Industrial-level controllable zero-shot text-to-speech system
TorchMultimodal is a PyTorch library
A Conversational Speech Generation Model
Implementation of NÜWA, an attention network for text-to-video synthesis
State-of-the-art deep-learning-based audio codec
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Beamforming and Speech Recognition Toolkit
seqtonedecoder
toneDetect