Data manipulation and transformation for audio signal processing
Robust Speech Recognition via Large-Scale Weak Supervision
Industrial-level controllable zero-shot text-to-speech system
TorchMultimodal is a PyTorch library
A Conversational Speech Generation Model
Implementation of NÜWA, attention network for text to video synthesis
State-of-the-art deep-learning-based audio codec
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Free video editor to convert, cut, trim, select streams, and encode
Beamforming and Speech Recognition Toolkit
seqtonedecoder
toneDetect