Artificial Analysis AI Trends
Analysis of the current state of AI and the key trends driving progress, covering model intelligence, efficiency, architecture, inference speed and cost, and trends in training.
AI Progress
Tracking the continued advancement of AI and the position of each of the leading AI companies.
Frontier Language Model Intelligence, Over Time
Artificial Analysis Intelligence Index v4.0 includes: GDPval-AA, 𝜏²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity's Last Exam, GPQA Diamond, CritPt. See Intelligence Index methodology for further details, including a breakdown of each evaluation and how we run them.
Capital Expenditure by Major Tech Companies, Over Time
Represents major investments by tech companies in infrastructure, including AI hardware such as GPUs and the data centers that house them. Capex is a strong indicator of a company's commitment to AI development, as training and running frontier models require significant computing resources.
Note: Capex data is sourced primarily from SEC filings, supplemented by publicly available financial reports and news articles.
Intelligence vs. Release Date
Leading Models by AI Lab
Artificial Analysis Intelligence Index by Model Type
Efficiency
Analysis of how the efficiency of AI is developing, including trends in the cost of reaching a given level of intelligence and the speed at which that intelligence can be accessed.
Language Model Inference Price
Price per token, expressed in USD per million tokens. Price is a blend of input and output token prices (3:1 input-to-output ratio).
Figures represent performance of the model's first-party API (e.g. OpenAI for o1) or the median across providers where a first-party API is not available (e.g. Meta's Llama models).
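The blended price above can be sketched as a small helper — a minimal illustration of the stated 3:1 weighting, with the function name and example prices being assumptions, not actual Artificial Analysis figures:

```python
def blended_price(input_price: float, output_price: float) -> float:
    """Blend input and output prices (USD per million tokens) at a
    3:1 input-to-output ratio: 3 parts input for every 1 part output."""
    return (3 * input_price + 1 * output_price) / 4

# Hypothetical model: $2.50/M input tokens, $10.00/M output tokens.
print(blended_price(2.50, 10.00))  # 4.375
```

The 3:1 weighting reflects that typical workloads send more input (prompt and context) tokens than they receive back as output.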
Language Model Output Speed
Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models which support streaming).
Figures represent performance of the model's first-party API (e.g. OpenAI for o1) or the median across providers where a first-party API is not available (e.g. Meta's Llama models).
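The output-speed metric can be illustrated with a small sketch: given chunk arrival timestamps and per-chunk token counts from a streaming response, speed counts only tokens received after the first chunk, over the time elapsed since that chunk. The function and the numbers are illustrative assumptions, not the exact Artificial Analysis implementation:

```python
def output_speed(chunk_times: list[float], chunk_tokens: list[int]) -> float:
    """Tokens per second while generating: tokens in chunks after the
    first, divided by time elapsed since the first chunk arrived.
    This excludes time-to-first-token from the measurement."""
    elapsed = chunk_times[-1] - chunk_times[0]
    tokens_after_first = sum(chunk_tokens[1:])
    return tokens_after_first / elapsed

# First chunk at t=0 (after the time-to-first-token wait), then
# three more chunks of 10 tokens each, 0.25 s apart.
print(output_speed([0.0, 0.25, 0.5, 0.75], [4, 10, 10, 10]))  # 40.0
```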
Country Analysis
AI is a global phenomenon. We share perspectives on where AI progress is happening and how the leading models from each country compare. We take a deep dive into the US and China as the two leading hubs for AI development.
Frontier Language Model Intelligence By Country, Over Time
Open Weights: Frontier Language Model Intelligence By Country, Over Time
Leading Models by Country
Open Source Models
Open weights models offer flexibility in deployment and the ability to fine-tune for specific use cases. We analyze the leading open weights models and how their intelligence compares to that of proprietary models.
Progress in Open Weights vs. Proprietary Intelligence
Indicates whether the model weights are available. Models are labelled as 'Commercial Use Restricted' if the weights are available but commercial use is limited (typically requires obtaining a paid license).
Artificial Analysis Intelligence Index by Open Weights vs Proprietary
Model Architecture
The architecture of AI models influences their performance and efficiency. This section examines architectural trends in models, such as the rise in popularity of the Mixture of Experts (MoE) architecture, and how these relate to model capabilities.
Intelligence Index vs Release Date by Model Architecture
Mixture of Experts (MoE): A model where only a subset of parameters ("experts") are active per input. Routing mechanisms select a few experts per forward pass, reducing computation while allowing the model to scale to many more parameters overall.
Dense: A model where all parameters are active for every input. Every forward pass involves the full network, making it computationally intensive but straightforward to train and deploy.
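The routing idea behind MoE can be shown with a toy sketch. This is an illustrative top-k router over random expert matrices, not any production architecture; all names and sizes are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, experts, k=2):
    """Toy Mixture-of-Experts layer: a gate scores every expert, but
    only the top-k actually run, so active params << total params."""
    scores = x @ gate_w                      # router logits, one per expert
    top_k = np.argsort(scores)[-k:]          # indices of the k best experts
    weights = np.exp(scores[top_k])
    weights /= weights.sum()                 # softmax over the chosen few
    # Only k of the expert matrices are touched this forward pass.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top_k))

d, n_experts = 8, 4
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
print(y.shape)  # (8,)
```

A dense layer, by contrast, would multiply `x` through every expert on every pass; here two of the four expert matrices are skipped entirely.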
Model Size: Total and Active Parameters
The total number of trainable weights and biases in the model, expressed in billions. These parameters are learned during training and determine the model's ability to process and generate responses.
The number of parameters actually executed during each inference forward pass, expressed in billions. For Mixture of Experts (MoE) models, a routing mechanism selects a subset of experts per token, resulting in fewer active than total parameters. Dense models use all parameters, so active equals total.
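The total-versus-active distinction reduces to simple arithmetic: shared (always-on) parameters plus all experts give the total, while shared parameters plus only the routed experts give the active count. The function and the example numbers below are illustrative assumptions, not figures for any specific model:

```python
def moe_param_counts(shared: float, per_expert: float,
                     n_experts: int, k: int) -> tuple[float, float]:
    """Total vs active parameter counts (in billions) for a hypothetical
    MoE model: every expert counts toward total, only k toward active."""
    total = shared + n_experts * per_expert
    active = shared + k * per_expert
    return total, active

# Illustrative only: 21B shared params, 64 experts of 9.5B each, 8 routed.
total, active = moe_param_counts(21, 9.5, 64, 8)
print(total, active)  # 629.0 97.0
```

For a dense model, `n_experts == k`, so active equals total, matching the definition above.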
Intelligence vs. Active Parameters
Intelligence vs. Total Parameters
Context Length (Tokens), Median By Quarter
Larger context windows are relevant to RAG (Retrieval-Augmented Generation) LLM workflows, which typically involve reasoning over and retrieving information from large amounts of data.
Maximum number of combined input and output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Training Analysis
Our analysis of trends in the training of AI models covers both the size of training runs and the relationship between training-run size and model intelligence.
Training Tokens By Model
The number of tokens used to train the model, represented in trillions.
Intelligence vs. Training Tokens
Models compared: frontier and open weights language models from OpenAI, Meta, Google, Anthropic, Mistral, DeepSeek, xAI, Alibaba, Amazon, Microsoft Azure, NVIDIA, and other leading AI labs.