QwQ-32B

QwQ-32B is a 32.5B-parameter causal language model from the Qwen series, built for reasoning: it is post-trained with supervised fine-tuning and reinforcement learning to think explicitly before answering. Despite its mid-range size, it is competitive with leading reasoning systems such as DeepSeek-R1 and o1-mini, particularly on hard math, coding, and multi-step problems. It supports a context length of 131,072 tokens (YaRN scaling must be enabled for prompts longer than 8,192 tokens) and is best run with sampling rather than greedy decoding (Temperature 0.6, TopP 0.95, TopK 20–40).
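The recommended settings in the description can be captured in a small helper. The following is a minimal sketch, not an official configuration: the function names `sampling_kwargs` and `needs_yarn` are hypothetical, the default `top_k=30` is simply a midpoint of the stated 20–40 range, and the keyword names assume the conventions of the Hugging Face Transformers `generate()` method.

```python
# Sampling settings quoted in the description (Temperature 0.6, TopP 0.95, TopK 20-40).
TEMPERATURE = 0.6
TOP_P = 0.95
TOP_K_RANGE = (20, 40)

# Context limits quoted in the description.
MAX_CONTEXT_TOKENS = 131_072
YARN_THRESHOLD_TOKENS = 8_192


def sampling_kwargs(top_k: int = 30) -> dict:
    """Return generation keyword arguments for non-greedy sampling.

    Hypothetical helper: the key names follow Hugging Face `generate()`
    conventions, and the default top_k is an assumed midpoint of the range.
    """
    low, high = TOP_K_RANGE
    if not low <= top_k <= high:
        raise ValueError(f"top_k should be in [{low}, {high}], got {top_k}")
    return {
        "do_sample": True,  # sample instead of greedy decoding
        "temperature": TEMPERATURE,
        "top_p": TOP_P,
        "top_k": top_k,
    }


def needs_yarn(prompt_tokens: int) -> bool:
    """Report whether a prompt is long enough to call for YaRN scaling."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError(
            f"prompt of {prompt_tokens} tokens exceeds the "
            f"{MAX_CONTEXT_TOKENS}-token context"
        )
    return prompt_tokens > YARN_THRESHOLD_TOKENS
```

With a loaded Transformers model, the dictionary could be unpacked into a call such as `model.generate(**inputs, **sampling_kwargs())`. How YaRN is actually enabled depends on the serving stack, so `needs_yarn` only flags when it applies.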

Apache-2.0

Text Generation

Transformers

Safetensors

English

by @AIOZAI


Last updated: 2 months ago

