Stars
Stable Diffusion web UI
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Share interesting, entry-level open source projects on GitHub.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Generate high-definition short videos with one click using large AI models.
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: …
A generative speech model for daily dialogue.
Easily train a good VC model with voice data <= 10 mins!
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
Real-time face swap for PC streaming or video calls
🔥 MaxKB is a powerful, easy-to-use open-source platform for building enterprise-grade agents.
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
Translate the video from one language to another and embed dubbing & subtitles.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
A sound cloning tool with a web interface, using your voice or any sound to record audio.
vits2 backbone with multilingual-bert
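A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating SRT files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.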
GPT 3.5/4 with a Chat Web UI. No API key required.
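Batch-generate all kinds of short videos with one click using AI, automatically batch-remix short videos, and automatically publish them to Douyin, Kuaishou, Xiaohongshu, and WeChat Channels. Making money has never been this easy! Supports the local speech models chatTTS, fasterwhisper, and GPTSoVITS, and cloud speech services: Azure, Alibaba Cloud, and Tencent Cloud. Supports direct AI image generation with Stable diffusion and comfyUI. Generate short videos with one click using A…
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.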
Generative models for conditional audio generation
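AI novel writing: generates xuanhuan (fantasy), romance, and other web fiction. A Chinese pretrained generative model that uses my RWKV model, similar to GPT-2. AI writing. RWKV for Chinese novel generation.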