MiniMax-M2.5: Built for Real-World Productivity
The frontier coding and agentic model trained across 200,000+ real-world environments. SOTA on SWE-Bench Verified, BrowseComp, and Terminal-Bench with a 200K-token context window.
Run MiniMax-M2.5 continuously for one hour at 100 tokens/s for under $1.

Benchmarks
Industry-leading performance across coding, search, and agentic tasks
- State-of-the-art on the industry-standard code-repair benchmark (SWE-Bench Verified), averaged over 4 runs
- Leading performance on multi-repository, cross-project software engineering tasks
- Best-in-class web research and information retrieval with context management
- Process extensive codebases, long documents, and multi-turn agent sessions in a single 200K-token context
Coding
Architect-level planning meets SOTA execution
MiniMax-M2.5 approaches complex projects the way a senior software architect would: decomposing requirements, planning structure, and designing interfaces before writing a single line of code. This spec-writing behavior emerged naturally during reinforcement learning across 200,000+ real-world environments.
- Trained on 13+ languages including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby
- Covers the full development lifecycle: system design, environment setup, feature development, iteration, code review, and testing
- Full-stack across Web, Android, iOS, and Windows: server APIs, business logic, and databases, not just frontend demos
- On Droid: 79.7 (M2.5) vs 78.9 (Opus 4.6). On OpenCode: 76.1 (M2.5) vs 75.9 (Opus 4.6)
- Upgraded the VIBE benchmark to the more complex Pro version; M2.5 performs on par with Opus 4.5


Search & Tool Calling
Smarter decisions, fewer rounds, better results
MiniMax-M2.5 achieves industry-leading performance on BrowseComp and Wide Search while using approximately 20% fewer reasoning rounds than M2.1. The model has learned more precise search strategies and better token efficiency: it not only reaches the right answer, it finds it along a shorter path.
- 76.3% on BrowseComp with context management, best-in-class web research capability
- Built and evaluated on RISE (Realistic Interactive Search Evaluation) for expert-level search tasks in real-world professional settings
- Stronger generalization across unfamiliar scaffolding environments compared to previous generations
- ~20% fewer rounds across BrowseComp, Wide Search, and RISE compared to M2.1

Office & Finance
Enterprise-grade document intelligence and financial modeling
MiniMax-M2.5 brings frontier-model capabilities to real enterprise workflows. From Excel competitions to financial modeling, the model handles composite instruction constraints and multi-step business processes that demand both precision and domain expertise.
- Evaluated on MEWC (Microsoft Excel World Championship): 179 problems from the 2021–2026 competition divisions
- Financial modeling benchmark with end-to-end research and analysis tasks scored by expert-designed rubrics
- Enhanced handling of composite instruction constraints for complex multi-step office scenarios
- GDPval-MM evaluation shows strong performance with lower average token cost per task

Reasoning & General Intelligence
Efficient reasoning that translates to real-world performance
Trained with reinforcement learning to reason efficiently and decompose tasks optimally, MiniMax-M2.5 delivers strong performance across mathematics, science, and general knowledge benchmarks while maintaining the practical focus that defines the M2 family.
- Competitive performance on AIME 2025, GPQA Diamond, and LiveCodeBench
- Efficient reasoning chains that reduce token consumption without sacrificing accuracy
- Strong results on the Artificial Analysis Intelligence Index leaderboard

Speed & Cost
Intelligence too cheap to meter
MiniMax-M2.5 delivers frontier performance at a fraction of the cost. Trained to reason efficiently, the model completes complex agentic tasks significantly faster while consuming fewer tokens per task.
- Completes SWE-Bench Verified tasks 37% faster, matching Claude Opus 4.6 speed
- Run MiniMax-M2.5 continuously at 100 tokens/second for under $1 per hour
- Achieves better results with ~20% fewer rounds across agentic tasks vs. M2.1
Pricing
- Input: $0.50 / 1M tokens
- Output: $1.50 / 1M tokens
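The "$1/hour" claim follows directly from the listed output price. A quick sketch of the arithmetic, assuming every generated token is billed at the output rate and ignoring input-token cost:

```python
# Cost of running continuously at 100 output tokens/s for one hour,
# using the listed output price of $1.50 per 1M tokens.
TOKENS_PER_SECOND = 100
SECONDS_PER_HOUR = 3600
OUTPUT_PRICE_PER_MILLION = 1.50  # USD

tokens_per_hour = TOKENS_PER_SECOND * SECONDS_PER_HOUR  # 360,000 tokens
cost_per_hour = tokens_per_hour / 1_000_000 * OUTPUT_PRICE_PER_MILLION

print(f"{tokens_per_hour:,} tokens/hour -> ${cost_per_hour:.2f}/hour")
```

At 360,000 tokens per hour, the output-side cost works out to roughly $0.54, comfortably under the $1/hour figure; real workloads would add input-token cost on top.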
Appendix
Comprehensive benchmark results
Detailed evaluation data across coding, search, office, reasoning, and general intelligence benchmarks.


Start building with MiniMax-M2.5
Experience SOTA coding, 200K context, and architect-level planning at $0.50 / $1.50 per million tokens.