Stars
Analyzing Hacker News discussions from a decade ago in hindsight with LLMs
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
A Multi-Generator Ensemble Framework for Natural Language to SQL
Agentar-Scale-SQL is a novel framework that leverages scalable computation to significantly improve Text-to-SQL performance.
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Official implementation of "Continuous Autoregressive Language Models"
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.
A high-throughput and memory-efficient inference and serving engine for LLMs
Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team, Alibaba Cloud.
A unified, comprehensive and efficient recommendation library
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Minimalistic large language model 3D-parallelism training
Anthropic's educational courses
Integrate the DeepSeek API into popular software
SWIFT codes (BIC codes) for all the banks in the world, cached as JSON.
⚡ TabPFN: Foundation Model for Tabular Data ⚡
A standard framework for building deep learning models for tabular data
An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
SGLang is a high-performance serving framework for large language models and multimodal models.