Mistral Small 4 combines reasoning, vision, and coding in one open-source model
Mistral's new Small 4 model unifies reasoning, image understanding, and coding into one 119B-parameter open-source model — 40% faster than its predecessor.
Mistral just released Small 4 — an open-source AI model that combines three capabilities previously split across separate models: deep reasoning, image understanding, and code generation. It's 40% faster than the previous version and handles 3x more requests per second.
The model is fully open-source under the Apache 2.0 license, meaning anyone can download, modify, and even sell products built on it — no restrictions.
One model instead of three
Previously, Mistral offered separate models for different tasks: Magistral for reasoning, Pixtral for image understanding, and Devstral for coding. Small 4 merges all three into a single model whose behavior you can switch on the fly.
The key feature is a reasoning effort dial: set it to "none" for instant answers (a simple factual question) or "high" for step-by-step thinking (a complex math problem). One model adapts to both casual chat and deep analysis.
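A minimal sketch of how a request might toggle that dial. The field name `reasoning_effort` and the set of effort levels are assumptions for illustration, not Mistral's documented API; the model name is the one the article lists for the Mistral API.

```python
# Sketch: building a chat-completion payload with a reasoning-effort setting.
# "reasoning_effort" and its allowed values are hypothetical placeholders —
# check Mistral's API docs for the real parameter name.

def build_request(prompt: str, effort: str = "none") -> dict:
    """Return a request payload with the given reasoning-effort level."""
    if effort not in {"none", "low", "medium", "high"}:
        raise ValueError(f"unknown effort level: {effort}")
    return {
        "model": "mistral-small-2603",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }

quick = build_request("What's the capital of France?")            # instant answer
deep = build_request("Prove this inequality...", effort="high")   # step-by-step
```

The point of a single dial is that the same deployed model serves both ends of the spectrum; only the request changes.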
Specs at a glance
Total parameters: 119 billion (but only 6 billion active at once — like having 128 specialists where only 4 work on each question)
Context window: 256,000 tokens (~190,000 words — enough for a full novel)
License: Apache 2.0 (fully open, no restrictions)
Speed: 40% faster end-to-end, 3x more throughput than Small 3
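The word estimate in the specs follows from a common rule of thumb: one token is roughly 0.75 English words (the exact ratio varies by tokenizer and text). A quick back-of-the-envelope check:

```python
# Rough conversion from the 256,000-token context window to words.
WORDS_PER_TOKEN = 0.75  # common approximation; varies by tokenizer and text

def tokens_to_words(tokens: int) -> int:
    return round(tokens * WORDS_PER_TOKEN)

print(tokens_to_words(256_000))  # 192000 — in line with the ~190,000 figure
```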
Benchmark results: less output, same accuracy
The standout metric is efficiency. On the LCR benchmark (a test of logical code reasoning), Small 4 scores 0.72 while generating only 1,600 characters of output. Competing models like Qwen need 3.5–4x more output (5,800–6,100 characters) to achieve similar scores.
This matters for cost: fewer output tokens means lower API bills. On the LiveCodeBench (real-world coding tasks), Small 4 outperforms GPT-OSS 120B while using 20% fewer tokens.
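Since API providers bill per output token, the cost gap scales with output length. A sketch of the arithmetic using the article's LCR output figures; the per-token price and the ~4 characters-per-token ratio are illustrative assumptions, not real pricing:

```python
# Why shorter output lowers API bills: cost is proportional to output tokens.
CHARS_PER_TOKEN = 4        # rough average for English text (assumption)
PRICE_PER_1M_TOKENS = 1.0  # placeholder $/1M output tokens, not a real price

def output_cost(chars_per_answer: int, answers: int = 1_000_000) -> float:
    """Estimated output-token spend for a batch of answers."""
    tokens = chars_per_answer / CHARS_PER_TOKEN
    return tokens * answers * PRICE_PER_1M_TOKENS / 1_000_000

small4 = output_cost(1_600)  # ~1,600 chars per answer (article's LCR figure)
rival = output_cost(5_800)   # ~5,800 chars for a comparable score
print(rival / small4)        # the rival spends ~3.6x more on output
```

At the same accuracy, that ratio flows straight through to the bill, whatever the actual per-token price is.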
Who should care about this
If you're a developer building AI-powered apps, Small 4 means you can use one model for chatbots, image analysis, and code generation instead of juggling three. It runs on Hugging Face, vLLM, llama.cpp, and other popular tools.
If you're a business looking to self-host AI, the Apache 2.0 license means no licensing fees and full control over your data. You'll need serious hardware though — minimum 4x NVIDIA H100 GPUs for self-hosting.
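For the self-hosting path, a minimal serving sketch with vLLM across the four GPUs mentioned above. The Hugging Face repo id here is a guess for illustration; substitute the actual model id from Mistral's Hugging Face page.

```shell
# Serve the model with vLLM, sharded across 4 GPUs.
# "mistralai/Mistral-Small-4" is a hypothetical repo id — replace it
# with the real one from Hugging Face.
vllm serve mistralai/Mistral-Small-4 \
    --tensor-parallel-size 4 \
    --max-model-len 256000
```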
If you just want to try it, the easiest path is Le Chat, Mistral's chat interface, or the Mistral API (model name: mistral-small-2603). NVIDIA also offers free prototyping on their Build platform.
The open-source AI race heats up
Small 4 continues the trend of open-source models closing the gap with proprietary ones. At 119B parameters with only 6B active, it's remarkably efficient — competitive with models that use 10–20x more compute per token.
The "reasoning effort" feature mirrors what Anthropic and OpenAI have added to their closed models, but here it's available for anyone to deploy on their own servers. For companies worried about sending sensitive data to third-party APIs, this is a significant option.
Full details are on the Mistral blog, and the model weights are available on Hugging Face.