Search Results for "mpeg audio decoder"

Showing 16 open source projects for "mpeg audio decoder"

1

TorchAudio

Data manipulation and transformation for audio signal processing

The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...

Downloads: 3 This Week

Last Update: 2025-11-06
See Project
2

Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

...These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.

Downloads: 63 This Week

Last Update: 2025-06-26
See Project
3

IndexTTS2

Industrial-level controllable zero-shot text-to-speech system

...It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.

Downloads: 11 This Week

Last Update: 2025-11-27
See Project
4

Multimodal

TorchMultimodal is a PyTorch library

...The library provides modular building blocks such as encoders, fusion modules, loss functions, and transformations that support combining modalities (vision, text, audio, etc.) in unified architectures. It includes a collection of ready model classes—like ALBEF, CLIP, BLIP-2, COCA, FLAVA, MDETR, and Omnivore—that serve as reference implementations you can adopt or adapt. The design emphasizes composability: you can mix and match encoder, fusion, and decoder components rather than starting from monolithic models. ...

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
5

CSM (Conversational Speech Model)

A Conversational Speech Generation Model

The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.

Downloads: 2 This Week

Last Update: 2025-03-19
See Project
6

NÜWA - Pytorch

Implementation of NÜWA, attention network for text to video synthesis

Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch. It also contains an extension into video and audio generation, using a dual-decoder approach. It seems as though a diffusion-based method has taken the new throne for SOTA. However, I will continue on with NÜWA, extending it to use multi-headed codes + hierarchical causal transformer. I think that direction is untapped for improving on this line of work. In the paper, they also present a way to condition the video generation based on segmentation mask(s). ...

Downloads: 0 This Week

Last Update: 2023-03-22
See Project
7

EnCodec

State-of-the-art deep learning based audio codec

Encodec is a neural audio codec developed by Meta for high-fidelity, low-bitrate audio compression using end-to-end deep learning. Unlike traditional codecs (like MP3 or Opus), Encodec uses a learned quantizer and decoder to reconstruct complex waveforms with remarkable accuracy at bitrates as low as 1.5 kbps. It employs a convolutional encoder–decoder architecture trained with perceptual loss functions that optimize for human auditory quality rather than raw waveform distance. ...

Downloads: 0 This Week

Last Update: 2025-10-12
See Project
8

Denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

Denoiser is a real-time speech enhancement model operating directly on raw waveforms, designed to clean noisy audio while running efficiently on CPU. It uses a causal encoder-decoder architecture with skip connections, optimized with losses defined both in the time domain and frequency domain to better suppress noise while preserving speech. Unlike models that operate on spectrograms alone, this design enables lower latency and coherent waveform output.

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
9

Fast Forward

Free video editor to convert, cut, trim, stream select and encode

Fast Forward is free video editing software that allows you to convert, cut, trim, remove streams, encode and customise a variety of parameters such as frame rate, bitrate, frame size and output file size. Fast Forward can encode H264, MPEG2 or Xvid video, as well as Dolby Digital AC3, Dolby Digital Plus eAC3+, AAC and Vorbis audio. It is very useful for removing ads from recorded TV programs, or combining the .VOB files from a DVD file system. Thanks to FFmpeg, these processes are extremely streamlined and fast. To speed up your conversions, use the "Straight Copy" codec options (only useable under specific circumstances). Accepted formats include: *.3g2 *.3gp *.asf *.avi *.drc *.flv *.gif *.gifv *.m2ts *.m2v *.m4p *.m4v *.mkv *.mng *.mov *.mp4 *.mpeg *.mpg *.mxf *.nsv *.ogg *.ogv *.rm *.rmvb *.roq *.svi *.vob *.webm *.wmv *.yuv

Downloads: 3 This Week

Last Update: 2019-06-23
See Project
10

Distant Speech Recognition

Beamforming and Speech Recognition Toolkit

BTK contains C++ and Python libraries that implement speech processing and microphone array techniques such as speech feature extraction, speech enhancement, speaker tracking, beamforming, dereverberation and echo cancellation algorithms. The Millennium ASR provides C++ and Python libraries for automatic speech recognition. It implements a weighted finite state transducer (WFST) decoder, along with training and adaptation methods. These toolkits are meant for facilitating research and...

Downloads: 0 This Week

Last Update: 2019-08-21
See Project
11

EnKoDeur-Mixeur

EnKoDeur-Mixeur (EKD) is open source software for video, picture, and audio post-production. It can also be used to convert videos between many formats. It is written in Python and uses the PyQt4 bindings.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-30
See Project
12

seqtonedecoder

A sequential tone decoder geared towards the type of paging used by Fire/EMS. It is written in Python for portability and has been tested on Windows, Linux and Mac. It sends email notifications of pages, and each notification includes the audio.

Downloads: 0 This Week

Last Update: 2016-07-22
See Project
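
The seqtonedecoder entry above describes decoding a sequence of paging tones. The listing does not say how the project detects tones, so the following is only a minimal sketch of one common approach, the Goertzel algorithm, applied frame by frame. The function names, the candidate frequencies, and the frame length are all hypothetical and chosen for illustration; a real decoder would also need a silence threshold and tolerance for off-bin frequencies.

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the DFT bin nearest target_hz (Goertzel recurrence)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)      # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def detect_tone(samples, sample_rate, candidates):
    """Return the candidate frequency with the most energy in this frame."""
    return max(candidates, key=lambda hz: goertzel_power(samples, sample_rate, hz))

def decode_sequence(samples, sample_rate, candidates, frame_len):
    """Detect the strongest tone per frame and collapse consecutive repeats."""
    sequence = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        hz = detect_tone(samples[start:start + frame_len], sample_rate, candidates)
        if not sequence or sequence[-1] != hz:
            sequence.append(hz)
    return sequence

# Hypothetical two-tone page: 1000 Hz followed by 1500 Hz, sampled at 8 kHz.
RATE = 8000
def sine(hz, n):
    return [math.sin(2.0 * math.pi * hz * t / RATE) for t in range(n)]

signal = sine(1000, 800) + sine(1500, 800)
print(decode_sequence(signal, RATE, [600, 1000, 1500], frame_len=400))
# prints [1000, 1500]
```

Goertzel evaluates a single frequency bin at a time, which tends to be cheaper than a full FFT when only a handful of known tones matter.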
13

toneDetect

A sequential tone decoder geared towards the type of paging used by Fire/EMS. It is written in Python for portability and has been tested on Windows, Linux and Mac. It sends email notifications of pages, and each notification includes the audio.

Downloads: 1 This Week

Last Update: 2015-07-30
See Project
14

PyKaraoke

PyKaraoke is a cross-platform karaoke player. It currently supports CDG (MP3+G, OGG+G, WAV+G), MIDI (.KAR, .MID) and MPEG formats.

5 Reviews

Downloads: 43 This Week

Last Update: 2013-04-25
See Project
15

Tag2Utf cyrillic mp3-tags decoder

Tool for converting MP3 file tags from Cyrillic charsets (cp1251, koi8-r) to Unicode. It solves the problem of tags in different charsets displaying incorrectly in a playlist. If you find a bug or a mistake in this text, email me at hlamer@tut.by

Downloads: 5 This Week

Last Update: 2015-02-23
See Project
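
The Tag2Utf entry above describes converting tags stored in Cyrillic charsets to Unicode. As a rough illustration of the underlying idea (not the project's actual code), the sketch below repairs a tag string that was decoded with the wrong charset: it re-encodes the garbled text back to the original bytes, then decodes those bytes with the intended Cyrillic charset. The function name and the assumption that the tag was misread as Latin-1 are hypothetical.

```python
def fix_cyrillic_tag(garbled: str, actual: str = "cp1251", wrong: str = "latin-1") -> str:
    """Recover the raw bytes from a mis-decoded tag, then decode them correctly."""
    raw = garbled.encode(wrong)   # undo the wrong decode to get the original bytes
    return raw.decode(actual)     # decode with the charset the tag was written in

# Simulate a cp1251 tag that a player displayed as Latin-1.
original = "Привет"
garbled = original.encode("cp1251").decode("latin-1")
print(garbled)                    # mojibake such as "Ïðèâåò"
print(fix_cyrillic_tag(garbled))  # prints "Привет"
```

The same call with `actual="koi8_r"` covers the other charset the description names. Picking between the two automatically would require a detection heuristic, which is outside this sketch.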
16

Media-Z

This is a program for in-car use. It's designed in Python using pygame. The main goal is a program that is easy to navigate and looks good. Planned features include support for mp3/divx/mpeg/dvd and other media files.

Downloads: 0 This Week

Last Update: 2013-03-13
See Project


© 2025 Slashdot Media. All Rights Reserved.
