All Models

MiniMax-M2.5

A large agentic model built for coding, tool use, search, and office work, reporting strong results on SWE-Bench Verified and other agentic benchmarks.

Other

Text Generation

Transformers

Safetensors

vLLM

English

Chinese

by @AIOZAI

Olmo-3.1-32B-Instruct

A 32B instruction-tuned model from Ai2, built for reasoning, coding, and instruction-following, and released as a fully open model with public code, checkpoints, and training data.

Apache-2.0

Text Generation

Transformers

Safetensors

vLLM

English

by @AIOZAI

SmolLM3

A compact 3B model that punches above its size: dual-mode reasoning, six languages, and up to 128k-token context, released fully open with weights, data mixture, and training details.

Apache-2.0

Text Generation

Transformers

Safetensors

ONNX

vLLM

English

Chinese

French

Spanish

Portuguese

Italian

Russian

Arabic

by @AIOZAI

Smoker Classification Challenge

Baseline source code for the Smoker Classification Challenge.

MIT

Image Classification

PyTorch

English

by @AIOZAI

Devstral Small 2 24B Instruct 2512

An agentic coding model that explores codebases, edits across multiple files, and drives software-engineering agents — light enough to run on a single GPU, with a 256k context window and vision support.

Apache-2.0

Text Generation

Transformers

Safetensors

vLLM

English

Multilingual

by @AIOZAI

Qwen2.5-Omni-7B

Qwen2.5-Omni-7B is an end-to-end multimodal foundation model that perceives text, images, audio, and video, and streams both text and natural speech responses in real time. Built on the Thinker–Talker architecture with TMRoPE time-aligned multimodal position embeddings, it delivers state-of-the-art results on OmniBench and matches or surpasses similarly sized single-modality models across speech, vision, and audio reasoning benchmarks. Its end-to-end speech-instruction-following ability rivals its text-input performance on benchmarks such as MMLU and GSM8K.

Apache-2.0

Text Generation

Transformers

Safetensors

English

by @AIOZAI

License Plate Recognition Challenge

Model for the License Plate Recognition Challenge.

MIT

Object Detection

PyTorch

English

by @AIOZAI

Z-Image Turbo

Z-Image-Turbo is a 6B-parameter text-to-image diffusion model distilled from the Z-Image foundation model, producing high-fidelity images in only 8 NFEs (Number of Function Evaluations). It delivers sub-second inference latency on H800-class GPUs and fits within 16 GB of VRAM on consumer hardware, while preserving strong photorealism, bilingual (English/Chinese) text rendering, and reliable instruction following.

Apache-2.0

Text-to-Image

Safetensors

Diffusers

English

by @AIOZAI

QwQ-32B

QwQ-32B is a 32.5B-parameter causal language model from the Qwen series, post-trained with supervised fine-tuning and reinforcement learning to reason explicitly before answering. Despite its mid-range size, it delivers performance competitive with leading reasoning systems such as DeepSeek-R1 and o1-mini, particularly on hard math, coding, and multi-step problems. It supports a native 131,072-token context (with YaRN scaling for inputs beyond 8,192 tokens) and is best driven with non-greedy sampling (Temperature 0.6, TopP 0.95, TopK 20–40).

Apache-2.0

Text Generation

Transformers

Safetensors

English

by @AIOZAI
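
The sampling settings quoted in the QwQ-32B card above (Temperature 0.6, TopP 0.95, TopK 20–40) can be illustrated with a small standalone sketch. This is not the model's or any library's actual decoding code: the function name `sample_token` and the toy logits are hypothetical, and here the top-p cutoff is measured on the full softmax distribution, which is one of several conventions in use.

```python
import math
import random


def sample_token(logits, temperature=0.6, top_p=0.95, top_k=20, rng=random):
    """Pick one token id from raw logits using temperature, top-k, then top-p.

    Hypothetical helper for illustration; defaults mirror the card's
    recommended settings (top_k=20 is the low end of the 20-40 range).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive for non-greedy sampling")

    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax over the scaled logits.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most likely token ids, most likely first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]

    # Top-p: keep the shortest prefix whose cumulative probability
    # (on the full distribution) reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the survivors; random.choices renormalises the weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


# Toy example: five candidate tokens with made-up logits.
logits = [2.0, 1.0, 0.5, -1.0, -3.0]
token = sample_token(logits, rng=random.Random(0))
```

With these toy logits, the two least likely tokens fall outside the top-p cutoff, so only the first three ids can ever be drawn. Setting `top_k=1` reduces the procedure to greedy decoding, which the card advises against.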

Phi-4

Phi-4 is Microsoft's 14B-parameter dense decoder-only language model with a 16K-token context window, trained on ~9.8T tokens of synthetic, textbook-quality, and curated web data. It is engineered for reasoning and logic in memory- or latency-constrained deployments, matching or surpassing far larger models on math (MATH: 80.4) and science (GPQA: 56.1) benchmarks. It is released under the permissive MIT license, with SFT and DPO alignment for instruction following and safety.

MIT

Text Generation

Transformers

Safetensors

English

by @AIOZAI

Qwen3-Coder-Next

Qwen3-Coder-Next is an open-weight coding model from the Qwen team that activates just 3B of its 80B parameters per token, matching the performance of models 10–20× larger at a fraction of the inference cost. It pairs a hybrid Gated DeltaNet + Gated Attention MoE architecture with a native 262,144-token context, and is purpose-built for agentic coding tasks such as long-horizon reasoning, tool use, and failure recovery, with out-of-the-box support for Claude Code, Qwen Code, Qoder, Kilo, Trae, and Cline.

Apache-2.0

Text Generation

Transformers

Safetensors

by @AIOZAI

Spaceship Titanic Prediction Challenge

Model for the Spaceship Titanic Prediction Challenge.

MIT

Tabular Classification

PyTorch

English

by @AIOZAI

HeartMuLa

Most open-source music models give you one capability. HeartMuLa gives you four: a lyrics-conditioned song generator, a high-fidelity music codec, a lyrics transcription model, and an audio-text alignment model — all open-sourced together as a coherent foundation. The 3B generator handles multilingual lyrics across English, Chinese, Japanese, Korean, and Spanish, with style controlled through simple comma-separated tags. An internal 7B version already reaches Suno-level quality, with the open 7B release planned.

Apache-2.0

Text-to-Audio

PyTorch

Safetensors

Diffusers

English

by @AIOZAI

UI-Venus 1.5

Give UI-Venus 1.5 a natural language instruction and a screenshot — it will find the right button, navigate the interface, and complete the task, just like a human would. No accessibility APIs, no DOM parsing, no special permissions needed. The unified 2B/8B/30B-A3B model family achieves state-of-the-art results on major GUI benchmarks including AndroidWorld (77.6%) and ScreenSpot-Pro (69.6%), with a full Android automation framework supporting 40+ mainstream apps out of the box.

Apache-2.0

Image-Text-to-Text

PyTorch

Transformers

Safetensors

English

by @AIOZAI

Text Generation with SmolLM-135M

SmolLM-135M is a compact language model with 135 million parameters, used here for text generation. Despite its small size, it produces coherent, well-structured text.

Apache-2.0

Text Generation

Transformers

ONNX

English

by @AIOZAI

Melanoma Skin Cancer Classification Challenge

Model for the Melanoma Skin Cancer Classification Challenge.

MIT

Image Classification

PyTorch

English

by @AIOZAI

DeepSeek-OCR

DeepSeek-OCR reimagines optical character recognition as a context compression problem — treating visual documents not as images to scan, but as information to compress and decode through an LLM-centric vision encoder. It converts documents, PDFs, and images to clean markdown, extracts text with layout awareness, parses figures, and localizes specific elements by reference — all at ~2500 tokens per second on a single A100 with vLLM. Multiple resolution modes from 64 to 400+ vision tokens let you tune the quality-speed tradeoff for your use case.

MIT

Image-Text-to-Text

PyTorch

Transformers

Safetensors

Multilingual

by @AIOZAI

LightRAG

LightRAG is a simple, fast, and powerful RAG system that goes beyond chunk retrieval by automatically building a knowledge graph from your documents — then querying both the graph and vector store simultaneously for richer, more contextually aware answers. Published at EMNLP 2025 and trusted by 29k+ developers, it works with any LLM, supports production-grade storage backends, and ships with a Web UI featuring live knowledge graph visualization.

MIT

Text2Text Generation

PyTorch

Transformers

Safetensors

English

by @AIOZAI

OmniLottie

OmniLottie is the first end-to-end model capable of generating Lottie animations directly from text descriptions, images, or video clips — producing structured, editable JSON output rather than raster video. Built on a 4B vision-language model and trained on MMLottie-2M, a dataset of 2 million annotated animations, it introduces a custom Lottie tokenizer that makes complex vector animation learnable by a language model. Accepted to CVPR 2026.

Apache-2.0

PyTorch

Transformers

Safetensors

English

by @AIOZAI
