Dolphin Research
2026.03.02 13:58

MiniMax (Trans): Doubling down on coding, productivity, and multimodal use cases

Below is Dolphin Research's $MINIMAX-WP(00100.HK) FY25 earnings call transcript; for our earnings analysis, please see 'Model Losses at 360%, Is MiniMax Still a Hot Pick?'.

I. Detailed content from the earnings call

1.1 Key takeaways from management

1. Language model

a. Model iteration: M2/M2.1 updates completed in 2025, with M2.5 launched in Feb-2026. Focused breakthroughs in coding, tool use, and workplace scenarios.

b. Key metrics: M2.5 set a new SWE-bench record in coding, with token efficiency up 37% vs. M2.1.

c. Dev ecosystem: First Chinese model on OpenRouter to exceed 50B tokens in a single day. By Feb-2026, daily token usage was more than 6x the Dec-2025 level, with coding tokens up over 10x.

d. Platform integration: Deployed on Azure, AWS, and Google Vertex AI, and became Notion's first and only open-source model partner.

2. Multimodal generation

a. Video (Hailuo 2.3): Over 600 mn videos generated by end-2025. Rolled out a Fast model, cutting batch ops cost by 50%.

b. Speech (Speech 2.6): Supports 40+ languages, with 200 mn+ hours of audio generated, optimized for ultra-low latency interactions.

c. Music (Music 2.0/2.5): Delivered generational upgrades, supporting complex emotions and diverse vocal styles.

3. AI-native products and tools

a. Product upgrade: Released MiniMax Expert 2.0, enabling Agents to work inside users' local workspaces.

b. User side: As of Feb-2026, pro users created over 50,000 expert Agents.

c. Globalization: Served 236 mn+ users across 200+ countries.

4. Latest progress

By Feb-2026, unit token inference cost for the M2 series fell over 50% vs. end-2025, and Hailuo video generation cost dropped 30%. Management emphasized that capability gains make complex Agents economically viable.

ARR reached $150 mn in Feb-2026, with strong momentum.

5. 2026 strategy and guidance

a. Three tech/application goals:

- Coding: Target L4–L5 intelligence, evolving from 'tool' to 'coworker'.

- Workplace: Replicate last year's coding growth curve and drive Agent penetration.

- Multimodal: Advance video creation to directly deliverable mid/long-form output and real-time streaming.

b. Strategic repositioning: Shift from a 'large-model company' to a 'platform company for the AI era'. Management frames firm value as intelligent density × token throughput.

c. Model pipeline: Developing the Hailuo 3 series to meet anticipated token demand growth of 10–100x (one to two orders of magnitude).

1.2 Q&A

Q: Against Google and OpenAI's first-mover advantages, where does MiniMax, as a startup, see the logic and opportunity to become an AI platform company?

A: On platformization in the AI era, we see the market as overwhelmingly incremental rather than zero-sum, and not a winner-takes-all arena. Startups with innovation and distinctiveness still have major opportunities. Over the next 2–3 years, with sustained advances in model R&D and infrastructure, coding, workplace, and interactive entertainment will open up large innovation spaces.

At the model level, moats come from long-term accumulation and rapid iteration. In the past 108 days, we released M2, M2.1, and M2.5 consecutively, each driving sharp gains in users and conversion.

We have pursued multimodal R&D since inception, and this first-mover stance in fusion will be a core edge going forward.

On product and capability integration, MiniMax was the first in China to let model technology drive products. We believe 'model + product' together form a stronger moat, as model capability sets the product ceiling.

This integrated capability is hard to copy and is a distinctive label for us.

At the ecosystem level, we are building niche ecosystems around model traits. In the OpenClaw ecosystem, MiniMax models offer high value-for-money, materially lowering dev barriers.

Our Agent products further integrate model capabilities, cutting user friction. We are also contributing to open-source, underscoring our ability to accelerate ecosystem growth.

Looking ahead, building a global ecosystem has just begun. In 2H, we plan to push boundaries with M3 and follow-on versions.

We aim to build unique products and ecosystems around our models. In Asia, MiniMax may be among the very few, if not the only, startups able to stand alongside big tech while deeply investing across model, product, and ecosystem layers.

Q: Some argue that 'single-point breakthroughs, then fusion' is a more efficient path to multimodal. Does MiniMax's all-modality parallel strategy risk overburdening R&D or falling behind?

A: We have always believed modal fusion is foundational to sustained intelligence gains. Over the past six months, multiple cases showed step-changes via fusion; e.g., Google's Nano Banana Pro expanded image generation capabilities by combining visual understanding with generation.

MiniMax's multimodal strategy has two phases. Phase one over the past four years delivered influential models across language, vision, speech, and music, building technical credibility.

We are now in phase two: bottom-layer integration to combine proven modalities into a cohesive force for new breakthroughs. The upcoming M3 and L3 in 1H are milestone outputs of this phase.

This approach is not a burden but a deep moat. Each modality requires long cycles of data, algorithm paths, and talent pipelines.

Only three companies domestically are leading across all modalities, and MiniMax is the only startup among them. This full-stack base gives us uniqueness in the fusion trend ahead.

On market opportunity, we see video generation as the largest AGI market beyond coding and assistants. This year it should advance to mid/long-form and near real-time, expanding the addressable market.

Our fusion capabilities are particularly well positioned for this shift.

As for R&D challenges and financial pressure, from day one we defined AGI as requiring multimodal I/O and built a reusable base across modalities. Financially, our R&D spend has not notably exceeded that of single-modality startups and is well below the giants'.

Our technical judgment keeps being validated, and our models outperform some single-modality peers even with parallel multimodal development. Over time, the advantages of this path will become more evident.

Q: How do you view the industry shifts from L4/L5 coding intelligence (e.g., AI replacing software firms), and where does MiniMax fit?

A: We see L3 as the industry's common level today, while L4 implies innovation, e.g., a single researcher executing experiments from papers or solving tough engineering problems.

L5 is organizational intelligence: coordinating multiple people or Agents to deliver tasks like 'developing a leading model', which requires algorithm innovation, training optimization, and operations.

In scenario selection, coding is where Agent capability breaks out first, serving professionals while lowering barriers to entry. But workplace scenarios will iterate very fast and represent a larger market than coding.

Most white-collar work involves data analysis, financial reporting, and PPT prep, with far broader audiences and density than pure coding. We have carved out an edge in coding and Agents with modest resources, and this is only the start.

MiniMax's advantages rest on two pillars. First, extreme iteration speed. We moved from M2 to M2.5 across three generations in just 108 days, proving our R&D efficiency under constraints.

We now have materially more resources, and with increased investment, we expect faster progress. The M3 series should raise the growth ceiling further.

Second, model distinctiveness. In a vast AGI market, it's not winner-takes-all; the key is defining unique technical traits. We do not follow the herd.

We positioned M2 for high value-for-money and speed, Hailuo 2 for complex task handling, and Speech 2 for multilingual support and low latency. These differentiators helped us unlock the market precisely.

As resources scale, this distinctiveness should translate into greater value. We are confident that stronger models and faster iteration will secure better positions in coding-driven Agents and broad workplace scenarios, lifting share over time.

Q: With giants and open-source models intensifying competition, where does MiniMax see the core battleground, and which fights must be won?

A: Our goal is to be a platform company for the AI era, driven by 'sustained increases in intelligent density' and 'token throughput'. Compared with peers, our strategy differs in two ways:

First, in positioning, we practice selective focus, concentrating resources where we can create unique value.

For example, in 2023 we decided not to build a mobile general-purpose personal assistant (e.g., a Doubao- or ChatGPT-like app). We judged these areas lack MiniMax-specific differentiation, so we channeled resources into Agents and multimodal innovation.

This choice builds long-term industrial advantage and raises decision success rates. Our full-modality layout from day one positions us well at the current fusion inflection.

Second, R&D efficiency determines outcomes, not sheer resource burn. In AI, the winners are those whose models improve fastest and reach scaled revenue first.

We enforce extreme efficiency in algorithm optimization, experiment design, iteration cadence, and decision mechanisms. Leveraging a nimble startup structure, we blend top-down and bottom-up decisions and reuse technical infra across modalities.

This efficiency lead keeps our models in the core cohort despite limited resources.

Longer term, only a few platform AI companies will remain globally. We believe MiniMax has unique competitive advantages and is one of the few with independent growth potential to stay in the industry's core cohort.

Q: M2 calls in the first two months of 2026 were already 6x Dec-2025, likely helped by hit apps and stronger M2 control. Is this a one-off early windfall that will normalize, or the start of a sustainable long-term trend?

A: We see it as the start of a long-term trend rather than a one-off. Industry growth tends to be stepwise rather than linear.

Our ability to keep shipping new models and capture a larger share stems from a deep understanding of how intelligence iterates, pre-allocated R&D investment, and a clear definition of each model generation's goals.

On upcoming growth sources, we have been preparing since 2H25. We expect emergent intelligence in 2026 to unlock several 'super PMF' (product-market fit) points, with faster penetration and acceleration than the market expects, and more diversified growth drivers.

The first super PMF remains coding, with a very high ceiling. While current tools already perform well, we believe this year will see a step change to 'coworker-level' collaboration, possibly achieving innovative discoveries and complex organizational coordination.

From tech evolution, market need, and our R&D progress, this leap is very likely this year.

The second super PMF is workplace scenarios across professions. This space is broader and larger than coding. It is also more complex, given varied professions, tool use, and outcomes that are hard to objectively verify, which challenges iteration pace.

We have prepared extensively and believe workplace penetration this year could match last year's pace in coding.

The third super PMF comes from advances in multimodal dynamic generation. Direct interaction and long-form generation lower application barriers further.

Over the past 2–3 years, model competition has seen back-and-forth wins; all companies face challenges, and none can guarantee permanent SOTA. We are confident in winning more key battles.

Our strategy has two cores: continuous boundary-pushing in technology, and using breakthroughs to make our products and businesses increasingly ecosystem-centric, capturing more upside. We aim to grow with the industry, elevating distinctiveness, R&D efficiency, innovation, and global commercialization, evolving into more scalable, long-term organizational competitiveness.

Q: Management said 'AI interns' now cover 90% of employees, effectively making the company a frontier testbed. What internal learnings has this produced, and how do they feed back into products and technology?

A: Beyond aiming to become a platform AI company, we are also striving to build an 'AI-native' organization starting now, during the R&D stage. Two changes are key:

The first is the speed of organizational progress. As a startup with limited resources, we must maximize efficiency to expand possibilities.

As more colleagues bring AI into daily work, we see a clear trend: initially humans taught Agents how to work; increasingly, humans observe Agent workflows, and sometimes Agents surprise us.

This shortens organizational chains, letting each business link enjoy intelligent dividends. From model iteration to product innovation and user service, our loop keeps accelerating.

Staff can also free time for higher-value work, further speeding collective thinking and innovation.

The second is feedback from the internal testbed to model R&D, clarifying frontier intelligence targets. With Agents running widely inside the company, we observe that even the best current models still fail in many areas.

Those gaps often carry high economic and usage value, and these frontline pain points directly inform the next-gen models and Agents, sharpening R&D goals.

As our models approach world-class levels, internal validation's value multiplies. In recent months, our model iteration speed, revenue growth, user service capability, and token throughput have all continued to improve, aided by faster target definition and full internal AI leverage.

In sum, this Agent-based AI-native organizational model is now running and has formed a positive flywheel internally. We see it as a core competitive advantage for sustained development.


Risk disclosure and statement: Dolphin Research Disclaimer and General Disclosure