cooleel
🐌 Focusing

Starred repositories

CommonForms — open models to auto-detect PDF form fields

Python 957 132 Updated Nov 26, 2025

UniTable: Towards a Unified Table Foundation Model

Jupyter Notebook 521 40 Updated Jun 4, 2024

This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…

Jupyter Notebook 24,320 2,822 Updated Jan 19, 2026

📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

Python 6,317 501 Updated Jan 8, 2026

Tensorlake is a Document Ingestion API and a serverless platform for building data processing and orchestration APIs

Python 874 131 Updated Jan 22, 2026

Building blocks for rapid development of GenAI applications

Python 1,612 132 Updated Jan 21, 2026

A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.

1,536 141 Updated Jan 15, 2026

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

TypeScript 170,705 53,961 Updated Jan 22, 2026

A realtime serving engine for Data-Intensive Generative AI Applications

Rust 1,092 143 Updated Jan 23, 2026

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,305 104 Updated Oct 29, 2025

Mississippi model running on CPU

Python 4 1 Updated Dec 27, 2024

This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.

508 40 Updated Mar 18, 2025

This repository contains the Hugging Face Agents Course.

MDX 24,923 1,745 Updated Jan 7, 2026

Python scraper based on AI

Python 22,353 1,943 Updated Jan 20, 2026

Mainly records knowledge and interview questions for large language model (LLM) algorithm (application) engineers

HTML 12,016 1,215 Updated Apr 30, 2025

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, and more.

Python 28,815 3,533 Updated Dec 5, 2025

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 2,341 146 Updated May 30, 2025

links and status of cool gradio demos

426 61 Updated Mar 24, 2025

Vision agent

Python 5,204 585 Updated Jan 22, 2026

[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.

Python 121 6 Updated Nov 25, 2024

Parsing-free RAG supported by VLMs

Python 901 74 Updated Dec 7, 2025

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Python 2,467 226 Updated Jan 8, 2026

Collection of training data management explorations for large language models

336 31 Updated Aug 2, 2024

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 11,126 1,093 Updated Nov 18, 2024

[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 410 8 Updated May 5, 2025

Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)

Python 1,944 139 Updated Oct 23, 2025

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 16,134 1,279 Updated Jan 18, 2025

A quick guide (especially) for trending instruction finetuning datasets

3,343 228 Updated Nov 28, 2023

The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Python 85 7 Updated Jan 27, 2025
Python 7 Updated Aug 16, 2024