Open Source Python Large Language Models (LLM)

Browse free open source Python large language model (LLM) projects below.

  • 1
    GLM-4.6

    Agentic, Reasoning, and Coding (ARC) foundation models

    GLM-4.6 is the latest iteration of Zhipu AI’s foundation model, delivering significant advancements over GLM-4.5. It introduces an extended 200K token context window, enabling more sophisticated long-context reasoning and agentic workflows. The model achieves superior coding performance, excelling in benchmarks and practical coding assistants such as Claude Code, Cline, Roo Code, and Kilo Code. Its reasoning capabilities have been strengthened, including improved tool usage during inference and more effective integration within agent frameworks. GLM-4.6 also enhances writing quality, producing outputs that better align with human preferences and role-playing scenarios. Benchmark evaluations demonstrate that it not only outperforms GLM-4.5 but also rivals leading global models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.
    Downloads: 192 This Week
  • 2
    GLM-4.7

    Advanced language and coding AI model

    GLM-4.7 is an advanced agent-oriented large language model designed as a high-performance coding and reasoning partner. It delivers significant gains over GLM-4.6 in multilingual agentic coding, terminal-based workflows, and real-world developer benchmarks such as SWE-bench and Terminal Bench 2.0. The model introduces stronger “thinking before acting” behavior, improving stability and accuracy in complex agent frameworks like Claude Code, Cline, and Roo Code. GLM-4.7 also advances “vibe coding,” producing cleaner, more modern UIs, better-structured webpages, and visually improved slide layouts. Its tool-use capabilities are substantially enhanced, with notable improvements in browsing, search, and tool-integrated reasoning tasks. Overall, GLM-4.7 shows broad performance upgrades across coding, reasoning, chat, creative writing, and role-play scenarios.
    Downloads: 172 This Week
  • 3
    GPT4All

    Run Local LLMs on Any Device. Open-source

    GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This project also supports Python integrations for easy automation and customization. GPT4All is ideal for individuals and businesses seeking private, offline access to powerful LLMs.
    Downloads: 130 This Week
  • 4
    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for immediate responses. They are released under the MIT license, allowing commercial use and secondary development. GLM-4.5 achieves strong performance on 12 industry-standard benchmarks, ranking 3rd overall, while GLM-4.5-Air balances competitive results with greater efficiency. The models support FP8 and BF16 precision, and can handle very large context windows of up to 128K tokens. Flexible inference is supported through frameworks like vLLM and SGLang with tool-call and reasoning parsers included.
    Downloads: 113 This Week
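The total and active parameter counts quoted for GLM-4.5 and GLM-4.5-Air can be compared with simple arithmetic. The sketch below is illustrative only: the function name `active_fraction` is hypothetical, and the figures are the ones stated in the description above.

```python
def active_fraction(total_billion: float, active_billion: float) -> float:
    """Share of a model's parameters that are active per token."""
    if total_billion <= 0 or not 0 <= active_billion <= total_billion:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active_billion / total_billion


# Parameter counts (in billions) as quoted in the GLM-4.5 description.
MODELS = {
    "GLM-4.5": (355, 32),
    "GLM-4.5-Air": (106, 12),
}

for name, (total, active) in MODELS.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active")
```

Under these figures, roughly 9% of GLM-4.5's parameters and roughly 11% of GLM-4.5-Air's are active for any given token.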
  • 5
    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3 underwent supervised fine-tuning and reinforcement learning to fully realize its capabilities. Evaluations indicate that it outperforms other open-source models and rivals leading closed-source models, achieving this with a training duration of 55 days on 2,048 Nvidia H800 GPUs, costing approximately $5.58 million.
    Downloads: 58 This Week
  • 6
    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely integrates large-scale reinforcement learning (RL) without relying on supervised fine-tuning, enabling the model to develop advanced reasoning capabilities. This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.
    Downloads: 53 This Week
  • 7
    Qwen3

    Qwen3 is the large language model series developed by the Qwen team

    Qwen3 is a cutting-edge large language model (LLM) series developed by the Qwen team at Alibaba Cloud. The latest updated version, Qwen3-235B-A22B-Instruct-2507, features significant improvements in instruction-following, reasoning, knowledge coverage, and long-context understanding up to 256K tokens. It delivers higher quality and more helpful text generation across multiple languages and domains, including mathematics, coding, science, and tool usage. Quantized versions are available, along with tools and pipelines for inference in quantized formats such as GGUF. The series covers many languages in training and usage and is aligned with human preferences in open-ended tasks.
    Downloads: 39 This Week
  • 8
    PrivateGPT

    Interact with your documents using the power of GPT

    PrivateGPT is a production-ready, privacy-first AI system that allows querying of uploaded documents using LLMs, operating completely offline in your own environment. It provides contextual generative AI capabilities without sending data externally. Now maintained under Zylon.ai with enterprise deployment options (air gapped, cloud, or on-prem).
    Downloads: 19 This Week
  • 9
    llama.cpp Python Bindings

    Python bindings for llama.cpp

    llama-cpp-python provides Python bindings for llama.cpp, enabling the integration of LLaMA (Large Language Model Meta AI) language models into Python applications. This facilitates the use of LLaMA's capabilities in natural language processing tasks within Python environments.
    Downloads: 19 This Week
  • 10
    Qwen-Image

    Qwen-Image is a powerful image generation foundation model

    Qwen-Image is a powerful 20-billion parameter foundation model designed for advanced image generation and precise editing, with a particular strength in complex text rendering across diverse languages, especially Chinese. Built on the MMDiT architecture, it achieves remarkable fidelity in integrating text seamlessly into images while preserving typographic details and layout coherence. The model excels not only in text rendering but also in a wide range of artistic styles, including photorealistic, impressionist, anime, and minimalist aesthetics. Qwen-Image supports sophisticated editing tasks such as style transfer, object insertion and removal, detail enhancement, and even human pose manipulation, making it suitable for both professional and casual users. It also includes advanced image understanding capabilities like object detection, semantic segmentation, depth and edge estimation, and novel view synthesis.
    Downloads: 16 This Week
  • 11
    Qwen

    The official repo of Qwen chat & pretrained large language model

    Qwen is a series of large language models developed by Alibaba Cloud, consisting of various pretrained versions like Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B. These models, which range from smaller to larger configurations, are designed for a wide range of natural language processing tasks. They are openly available for research and commercial use, with Qwen's code and model weights shared on GitHub. Qwen's capabilities include text generation, comprehension, and conversation, making it a versatile tool for developers looking to integrate advanced AI functionalities into their applications.
    Downloads: 15 This Week
  • 12
    ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    ChatGLM-6B is an open bilingual (Chinese + English) conversational language model based on the GLM architecture, with approximately 6.2 billion parameters. The project provides inference code, demos (command line, web, API), quantization support for lower-memory deployment, and tools for fine-tuning (e.g., via P-Tuning v2). It is optimized for dialogue and question answering, balancing performance against deployability on consumer hardware. Quantized inference (INT4, INT8) reduces GPU memory requirements, and the project supports switching between full-precision and quantized modes to trade precision against memory.
    Downloads: 12 This Week
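The memory effect of the INT8 and INT4 options mentioned for ChatGLM-6B can be estimated from the parameter count alone. The sketch below is a back-of-the-envelope calculation, not the project's own tooling: `weight_memory_gb` is a hypothetical helper, FP16 is assumed as the unquantized baseline, and only the weights are counted (activations, caches, and framework overhead are ignored).

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    if params_billion <= 0 or bits_per_weight <= 0:
        raise ValueError("both arguments must be positive")
    # (params_billion * 1e9 weights) * (bits / 8 bytes each) / (1e9 bytes per GB)
    return params_billion * bits_per_weight / 8


PARAMS_BILLION = 6.2  # parameter count quoted in the ChatGLM-6B description

# FP16 is an assumed baseline; INT8 and INT4 are the modes named in the description.
for label, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_BILLION, bits):.1f} GB for weights")
```

Actual GPU usage will be higher than these weight-only figures, so they are best read as lower bounds when choosing a precision for a given card.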
  • 13
    vLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. It offers high-throughput serving with various decoding algorithms, including parallel sampling and beam search.
    Downloads: 9 This Week
  • 14
    Khoj

    An AI personal assistant for your digital brain

    Get more done with your open-source AI personal assistant. Khoj is a desktop application to search and chat with your notes, documents, and images. It is an offline-first, open-source AI personal assistant that is accessible from Emacs, Obsidian, or your web browser. Khoj is a thinking tool that is transparent, fun, and easy to engage with. You can build faster and better by using Khoj to search and reason across all your data sources. Khoj learns from your notes and documents to function as an extension of your brain, so that you can stay focused on doing what matters. Khoj started with the founding principle that a personal assistant should be understandable, accessible, and hackable. This means you can always customize and self-host Khoj on your own machines.
    Downloads: 8 This Week
  • 15
    CodeGeeX

    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

    CodeGeeX is a large-scale multilingual code generation model with 13 billion parameters, trained on 850B tokens across more than 20 programming languages. Developed with MindSpore and later made PyTorch-compatible, it is capable of multilingual code generation, cross-lingual code translation, code completion, summarization, and explanation. It has been benchmarked on HumanEval-X, a multilingual program synthesis benchmark introduced alongside the model, and achieves state-of-the-art performance compared to other open models like InCoder and CodeGen. CodeGeeX also powers IDE plugins for VS Code and JetBrains, offering features like code completion, translation, debugging, and annotation. The model supports Ascend 910 and NVIDIA GPUs, with optimizations like quantization and FasterTransformer acceleration for faster inference.
    Downloads: 7 This Week
  • 16
    GPT Neo

    An implementation of model parallel GPT-2 and GPT-3-style models

    GPT Neo is an implementation of model- and data-parallel GPT-3-like models using the mesh-tensorflow library. If you're just here to play with our pre-trained models, we strongly recommend you try out the HuggingFace Transformer integration. Training and inference are officially supported on TPU and should work on GPU as well. This repository will be (mostly) archived as we move focus to our GPU-specific repo, GPT-NeoX. NB: while GPT Neo can technically run a training step at 200B+ parameters, it is very inefficient at those scales. This, together with the fact that many GPUs became available to us, prompted us to move development over to GPT-NeoX. All evaluations were done using our evaluation harness. Some results for GPT-2 and GPT-3 are inconsistent with the values reported in the respective papers. We are currently looking into why, and would greatly appreciate feedback and further testing of our eval harness.
    Downloads: 7 This Week
  • 17
    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 7 This Week
  • 18
    Qwen3-Coder

    Qwen3-Coder is the code version of Qwen3

    Qwen3-Coder is the latest and most powerful agentic code model developed by the Qwen team at Alibaba Cloud. Its flagship version, Qwen3-Coder-480B-A35B-Instruct, features a massive 480 billion-parameter Mixture-of-Experts architecture with 35 billion active parameters, delivering top-tier performance on coding and agentic tasks. This model sets new state-of-the-art results among open models for agentic coding, browser use, and tool use, with performance comparable to leading models like Claude Sonnet. Qwen3-Coder supports an exceptionally long context window of 256,000 tokens, extendable to 1 million tokens using YaRN, enabling repository-scale code understanding and generation. It is capable of handling 358 programming languages, from common to niche, making it versatile for a wide range of development environments. The model integrates a specially designed function call format and supports popular platforms such as Qwen Code and CLINE for agentic coding workflows.
    Downloads: 7 This Week
  • 19
    LiteLLM

    lightweight package to simplify LLM API calls

    Call all LLM APIs using the OpenAI format (Anthropic, Huggingface, Cohere, Azure OpenAI, etc.). LiteLLM supports streaming the model response back: pass stream=True to get a streaming iterator in the response. Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.
    Downloads: 6 This Week
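The streaming iterator described for LiteLLM can be illustrated without any network call. The sketch below does not import LiteLLM: `fake_stream` is a hypothetical stand-in for the iterator returned when a completion is requested with stream=True, and the chunk layout (`choices[0].delta.content`) is assumed to follow the OpenAI streaming format that the description refers to.

```python
from types import SimpleNamespace
from typing import Iterator


def fake_stream(text: str, chunk_size: int = 4) -> Iterator[SimpleNamespace]:
    """Hypothetical stand-in that yields OpenAI-style streaming chunks."""
    for start in range(0, len(text), chunk_size):
        delta = SimpleNamespace(content=text[start:start + chunk_size])
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
    # Assumed for this sketch: a final chunk whose content is None.
    end = SimpleNamespace(content=None)
    yield SimpleNamespace(choices=[SimpleNamespace(delta=end)])


def collect_stream(chunks: Iterator[SimpleNamespace]) -> str:
    """Join the text pieces of a stream, skipping empty deltas."""
    parts = []
    for chunk in chunks:
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return "".join(parts)


print(collect_stream(fake_stream("Hello from a streamed reply")))
```

In an interactive application each piece would typically be printed or forwarded as it arrives rather than collected first; collecting is used here only to keep the example easy to check.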
  • 20
    Tongyi DeepResearch

    Tongyi Deep Research, the Leading Open-source Deep Research Agent

    DeepResearch (Tongyi DeepResearch) is an open-source “deep research agent” developed by Alibaba’s Tongyi Lab and designed for long-horizon, information-seeking tasks. It is built to act like a research agent: synthesizing, reasoning, retrieving information via the web and documents, and backing its outputs with evidence. The model has about 30.5 billion parameters, of which only about 3.3 billion are active per token. Training uses a mix of synthetic data generation, fine-tuning, and reinforcement learning. The project targets benchmarks covering web search, document understanding, question answering, and “agentic” tasks, and provides inference tools, evaluation scripts, and “web agent” style interfaces. The aim is to enable more autonomous, agentic models that can perform sustained knowledge gathering, reasoning, and synthesis across multiple modalities (web, files, etc.).
    Downloads: 6 This Week
  • 21
    CodeGeeX2

    CodeGeeX2: A More Powerful Multilingual Code Generation Model

    CodeGeeX2 is the second-generation multilingual code generation model from ZhipuAI, built upon the ChatGLM2-6B architecture and trained on 600B code tokens. Compared to the first generation, it delivers a significant boost in programming ability across multiple languages, outperforming even larger models like StarCoder-15B in some benchmarks despite having only 6B parameters. The model excels at code generation, translation, summarization, debugging, and comment generation, and it supports over 100 programming languages. With improved inference efficiency, quantization options, and multi-query/flash attention, CodeGeeX2 achieves faster generation speeds and lightweight deployment, requiring as little as 6GB GPU memory at INT4 precision. Its backend powers the CodeGeeX IDE plugins for VS Code, JetBrains, and other editors, offering developers interactive AI assistance with features like infilling and cross-file completion.
    Downloads: 5 This Week
  • 22
    CogVLM2

    GPT4V-level open-source multi-modal model based on Llama3-8B

    CogVLM2 is the second generation of the CogVLM vision-language model series, developed by ZhipuAI and released in 2024. Built on Meta-Llama-3-8B-Instruct, CogVLM2 significantly improves over its predecessor by providing stronger performance across multimodal benchmarks such as TextVQA, DocVQA, and ChartQA, while introducing extended context length support of up to 8K tokens and high-resolution image input up to 1344×1344. The series includes models for both image understanding and video understanding, with CogVLM2-Video supporting up to 1-minute videos by analyzing keyframes. It supports bilingual interaction (Chinese and English) and has open-source versions optimized for dialogue and video comprehension. Notably, the Int4 quantized version allows efficient inference on GPUs with only 16GB of memory. The repository offers demos, API servers, fine-tuning examples, and integration with OpenAI API-compatible endpoints, making it accessible for both researchers and developers.
    Downloads: 5 This Week
  • 23
    LLaMA-Factory

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

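    LLaMA-Factory is a fine-tuning and training framework for large language models and vision-language models. Despite its name, it is not limited to Meta's LLaMA family: it supports more than 100 LLMs and VLMs, and enables researchers and developers to train and customize them efficiently using advanced optimization techniques.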
    Downloads: 5 This Week
  • 24
    ChatGLM3

    ChatGLM3 series: Open Bilingual Chat LLMs

    ChatGLM3 is ZhipuAI & Tsinghua KEG’s third-gen conversational model suite centered on the 6B-parameter ChatGLM3-6B. It keeps the series’ smooth dialog and low deployment cost while adding native tool use (function calling), a built-in code interpreter, and agent-style workflows. The family includes base and long-context variants (8K/32K/128K). The repo ships Python APIs, CLI and web demos (Gradio/Streamlit), an OpenAI-format API server, and a compact fine-tuning kit. Quantization (4/8-bit), CPU/MPS support, and accelerator backends (TensorRT-LLM, OpenVINO, chatglm.cpp) enable lightweight local or edge deployment.
    Downloads: 4 This Week
  • 25
    CogView4

    CogView4, CogView3-Plus and CogView3 (ECCV 2024)

    CogView4 is the latest generation in the CogView series of vision-language foundation models, developed as a bilingual (Chinese and English) open-source system for high-quality image understanding and generation. Built on top of the GLM framework, it supports multimodal tasks including text-to-image synthesis, image captioning, and visual reasoning. Compared to previous CogView versions, CogView4 introduces architectural upgrades, improved training pipelines, and larger-scale datasets, enabling stronger alignment between textual prompts and generated visual content. It emphasizes bilingual usability, making it well-suited for cross-lingual multimodal applications. The model also supports fine-tuning and downstream customization, extending its applicability to creative content generation, human–computer interaction, and research on vision-language alignment.
    Downloads: 4 This Week