TAAFT

AI Models Directory

Browse and discover AI models from leading companies in the industry.

  • By OpenAI
    GPT-5.2-Codex is OpenAI’s frontier agentic coding model based on GPT-5.2, optimized for long-horizon software work and large refactors and migrations, with better Windows support and state-of-the-art cybersecurity performance in Codex.
    New · Coding
    Released 1d ago
  • By OpenAI
    GPT-5.3-Codex is OpenAI’s most capable agentic coding model, combining GPT-5.2-Codex’s frontier coding with GPT-5.2’s reasoning in one model that is ~25% faster, tops SWE-Bench Pro and Terminal-Bench, and runs long, tool-using workflows on a computer.
    New · Coding
    Released 1d ago
  • Claude Opus 4.6 is Anthropic’s newest frontier model, built for complex reasoning, large-scale coding and multi-agent orchestration, with major gains in cybersecurity bug-finding and long-horizon enterprise workflows over Opus 4.5.
    New · Multimodal
    Released 1d ago
  • Open-weights translation model built on Gemma 3 4B that translates across all 22 official Indian languages, handling long-form and structured documents with cultural nuance.
    Text
    Released 7mo ago
  • By NVIDIA
    Music Flamingo is NVIDIA’s large audio-language model for deep music understanding, fine-tuned on the MF-Skills and MF-Think datasets to analyze full songs with theory-aware chain-of-thought and state-of-the-art benchmark scores.
    New · Audio
    Released 2mo ago
  • By OpenBMB
    MiniCPM-o 4.5 is an on-device multimodal LLM (~9B params) that matches Gemini 2.5 Flash on vision and speech, supporting full-duplex live streaming so it can see, listen and speak in real time.
    Multimodal
    Released 5mo ago
  • PaperBanana is an agentic framework that turns raw scientific content into publication-ready methodology diagrams and plots, orchestrating multiple AI agents plus a dedicated benchmark, PaperBananaBench.
    New · Multimodal
    Released 1d ago
  • Intern-S1 is a scientific multimodal foundation model built on a 235B-parameter Qwen3 MoE LLM plus a 6B vision encoder, trained on 5T multimodal tokens with over half from scientific domains.
    Multimodal
    Released 5mo ago
  • Universal-3 Pro is a promptable speech language model that turns raw audio into highly accurate transcripts, letting you control disfluencies, tags, speaker roles and code-switching through natural-language prompts.
    New · Audio
    Released 3d ago
  • By Baidu
    Production-ready OCR and document AI toolkit that turns images and PDFs into structured data, with multilingual OCR, layout analysis and VLM-based document parsing.
    New · Text
    Released 8d ago
  • Open-source foundation model that jointly generates video and audio in one pass, achieving tightly synchronized lip movements and environment-aware sound effects.
    New · Video
    Released 5d ago
  • By Meituan
    Chat-oriented LongCat-Flash variant, a 560B MoE language model with around 27B parameters active per token, tuned as a fast, non-thinking foundation for general and agentic tasks.
    Text
    Released 5mo ago
  • By Meituan
    6B-parameter bilingual (Chinese-English) text-to-image foundation model focused on photorealism, strong Chinese text rendering and high-quality image editing.
    New · Image
    Released 10mo ago
  • By Meituan
    Large reasoning model built on a 560B MoE backbone, activating about 18.6B to 31.3B parameters for advanced chain-of-thought, formal and agentic reasoning tasks.
    Text
    Released 4mo ago
  • By Meituan
    560B-parameter omni-modal MoE model (about 27B active) for real-time audio-visual interaction, built on LongCat-Flash with multimodal perception and speech modules.
    New · Multimodal
    Released 1d ago
  • ZipVoice-based voice-cloning TTS that generates 48 kHz speech at up to 150x real time, fitting in about 1 GB of VRAM for local, high-quality synthesis.
    New · Audio
    Released 4d ago
  • By Hexgrad
    Open-weight 82M-parameter TTS model that delivers high-quality speech at low cost, designed for fast, production-ready deployment across several languages.
    Audio
    Released 1y ago
  • Long-form video extension engine that analyzes scene semantics and motion to extend clips with coherent shots, maintaining strong temporal consistency and cinematic storytelling.
    New · Multimodal
    Released 8d ago
  • Virtual try-on model that composes a person and garment image into a photorealistic result in pixel space, without segmentation masks, supporting model shots and flat-lay product photos.
    New · Image
    Released 1mo ago
  • By Alibaba
    Qwen-Coder-Qoder is a reinforced code model based on Qwen-Coder, custom-trained for the Qoder agentic coding platform to improve end-to-end programming performance inside its IDE and CLI workflows.
    New · Coding
    Released 2d ago
  • By StepFun
    ACE-STEP v1.5 is an open-source, fast music foundation model that uses a hybrid language-model-plus-diffusion-transformer pipeline to turn short prompts into multi-minute songs, running on consumer GPUs with under 4 GB of VRAM.
    New · Audio
    Released 5d ago
