-
LEANN Public
RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
-
faiss Public
Forked from facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
-
colpali Public
Forked from illuin-tech/colpali
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
Python MIT License Updated Nov 6, 2025 -
mteb Public
Forked from embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
Python Apache License 2.0 Updated Nov 6, 2025 -
flash-kmeans Public
Forked from svg-project/flash-kmeans
Fast and memory-efficient exact kmeans
Python Apache License 2.0 Updated Nov 5, 2025 -
pylate Public
Forked from lightonai/pylate
Late Interaction Models Training & Retrieval
Python MIT License Updated Oct 10, 2025 -
LLM-on-VDB-Design Public
Evaluating the performance of LLMs and agents on core data structure design for databases (i.e., vector databases)
-
fast-plaid Public
Forked from lightonai/fast-plaid
High-Performance Engine for Multi-Vector Search
-
astchunk-leann Public
Forked from yilinjz/astchunk
ASTChunk is a Python toolkit for code chunking using Abstract Syntax Trees (ASTs), designed to create structurally sound and meaningful code segments.
-
gonzalezgroup.github.io Public
Forked from jegonzal/gonzalezgroup.github.io
Website for Joseph E. Gonzalez's group at Berkeley (created by his procrastinating student)
CSS Updated Sep 8, 2025 -
DiskANN Public
Forked from microsoft/DiskANN
Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search
-
Omniscient Public
Forked from enricd/geoguessr_ai_bot
Geo-Agent-Eval
-
yichuan-w.github.io Public
Forked from xichenpan/xichenpan.github.io
Yichuan Wang Homepage
-
gritlm Public
Forked from ContextualAI/gritlm
Generative Representational Instruction Tuning
Jupyter Notebook MIT License Updated Jul 31, 2025 -
sentence-transformers Public
Forked from huggingface/sentence-transformers
State-of-the-Art Text Embeddings
Python Apache License 2.0 Updated Jul 30, 2025 -
sglang Public
Forked from sgl-project/sglang
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
-
ReasonIR Public
Forked from facebookresearch/ReasonIR
Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks".
Python Other Updated Jun 24, 2025 -
G-agent Public
Forked from ccmdi/geoguessr-agent
Autonomous framework for LLMs to play GeoGuessr
-
MLsys_reading_list Public
A reading list on popular MLsys topics
16 Updated Mar 20, 2025 -
Awesome-LLM-System-Papers Public
Forked from AmadeusChan/Awesome-LLM-System-Papers
2 Updated Mar 7, 2025 -
sgl-learning-materials Public
Forked from sgl-project/sgl-learning-materials
Materials for learning SGLang
-
SPANN Public
Forked from microsoft/SPTAG
A distributed approximate nearest neighborhood search (ANN) library which provides a high quality vector index build, search and distributed online serving toolkits for large scale vector search sc…
-
retrieval-scaling Public
Forked from RulinShao/retrieval-scaling
Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".
-
hnswlib Public
Forked from nmslib/hnswlib
Header-only C++/python library for fast approximate nearest neighbors
-
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs



