Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
🦜🔗 The platform for reliable agents.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
A high-throughput and memory-efficient inference and serving engine for LLMs
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
LlamaIndex is the leading framework for building LLM-powered agents over your data.
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
Making large AI models cheaper, faster, and more accessible
aider is AI pair programming in your terminal
Fully open reproduction of DeepSeek-R1
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Fast and memory-efficient exact attention
π€ PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
verl: Volcano Engine Reinforcement Learning for LLMs
State-of-the-Art Text Embeddings
DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)
Simple, unified interface to multiple Generative AI providers
A framework for few-shot evaluation of language models.
Retrieval and Retrieval-augmented LLMs
Large Language Model Text Generation Inference
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
A collection of libraries to optimise AI model performance
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Tools for merging pretrained large language models.
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters