Xiaomi's AI just bought something online — it even haggled
Xiaomi's MiMo V2-Omni can browse the web, compare prices across sites, negotiate with sellers via chat, and complete purchases — all without any human help.
Xiaomi just launched three new AI models — and the wildest one doesn't just answer questions. MiMo V2-Omni can see your screen, hear what's happening around it, browse the internet, and take action on your behalf. In a live demo, it opened a browser, searched for a product on one platform, compared prices on another, negotiated a discount with a seller via chat, and completed the purchase. No human touched the keyboard.
An AI with eyes, ears, and hands
Most AI chatbots just read and write text. MiMo V2-Omni is different — it processes images, video, and audio simultaneously through a single system, and it can interact with software the way a human does: clicking buttons, scrolling pages, and typing into forms.
In Xiaomi's demos, Omni performed tasks that normally eat up your afternoon:
🛒 Online shopping: Browsed products on Xiaohongshu (China's Instagram), jumped to JD.com to compare prices, opened a chat with the seller, negotiated a discount, and placed the order — all autonomously.
🚗 Dashcam analysis: Watched live dashcam footage, identified pedestrians, cyclists, and vehicles, and flagged potential hazards in real time.
📱 Content creation: Created a multimedia post, debugged the code behind it, and published the result to TikTok, all without human intervention.
The audio processing is equally impressive: Omni can listen continuously for over 10 hours, making it potentially useful for meeting recording, real-time translation, or monitoring.
How it stacks up against the competition
On web navigation benchmarks (tests that measure how well AI can browse and interact with websites), Omni outperformed both Google's Gemini 3 Pro and OpenAI's GPT-5.2. On image understanding tests, it scored 76.8 — beating Claude Opus 4.6's 73.9.
That said, it's still behind the top coding-focused models. On ClawEval (a coding benchmark), Omni scored 54.8 — well behind Claude Opus 4.6's 66.3. This isn't a replacement for a coding assistant; it's built for real-world tasks that involve seeing, hearing, and interacting with software.
The AI voice that actually sings
The third model in Xiaomi's launch, MiMo V2-TTS, is billed by Xiaomi as the only commercial voice AI that can both speak and sing natively. Trained on over 100 million hours of speech data, it doesn't just read text aloud; it interprets emotional cues.
Tell it to sound "sleepy, just woken up, slightly hoarse" and it will. Tell it to sound "angry, but trying to stay calm" and it adjusts. It even generates natural sounds like coughs, hesitations, sighs, and laughter — without being explicitly told to. Write a word in ALL CAPS? It emphasizes it. Repeat a letterrrrr? It stretches the sound.
Why this matters for creators: If you make podcasts, audiobooks, voiceovers, or any audio content, a TTS (text-to-speech) model that understands emotion and can sing could dramatically cut production time. No more robotic AI voices reading your script in a monotone.
Pricing that undercuts everyone
Xiaomi also launched MiMo V2-Pro, a trillion-parameter language model (that figure is the model's total size; only about 42 billion parameters are active for any given request). Its pricing tells the real story:
MiMo V2-Pro: $1 per million input tokens / $3 per million output tokens
Claude Opus 4.6: $5 / $25
Claude Sonnet 4.6: $3 / $15
At these list prices, MiMo V2-Pro is 5x cheaper than Claude Opus 4.6 on input and roughly 8x cheaper on output (3x and 5x against Claude Sonnet 4.6), while scoring competitively on coding and reasoning benchmarks. Before its official launch, it ran anonymously on OpenRouter under the codename "Hunter Alpha" and topped the daily rankings for several days. Many users assumed it was DeepSeek V4.
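To see what those list prices could mean for an actual bill, here is a minimal Python sketch. The per-million-token prices are the ones quoted above; the workload (50 million input tokens and 10 million output tokens) is a made-up example, and the function names are just illustrative.

```python
# Per-million-token list prices in USD, as quoted in this article: (input, output).
PRICES = {
    "MiMo V2-Pro": (1.0, 3.0),
    "Claude Opus 4.6": (5.0, 25.0),
    "Claude Sonnet 4.6": (3.0, 15.0),
}


def cost_usd(model, input_tokens, output_tokens):
    """Cost of a workload at the quoted list prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def price_ratio(model, baseline="MiMo V2-Pro"):
    """How many times pricier `model` is than `baseline`, as (input, output)."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return model_in / base_in, model_out / base_out


if __name__ == "__main__":
    # Hypothetical workload: 50M input tokens, 10M output tokens.
    for name in PRICES:
        total = cost_usd(name, 50_000_000, 10_000_000)
        ratio_in, ratio_out = price_ratio(name)
        print(f"{name}: ${total:,.2f} "
              f"(input {ratio_in:.1f}x, output {ratio_out:.1f}x vs MiMo V2-Pro)")
```

On that hypothetical workload the totals come to $80 for MiMo V2-Pro, $300 for Claude Sonnet 4.6, and $500 for Claude Opus 4.6. Your own ratio depends on how your usage splits between input and output, since the gap is wider on output tokens.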
Who should pay attention
If you run an e-commerce business or do a lot of online research: Omni's autonomous browsing could eventually handle product comparison, competitive analysis, and purchasing workflows.
If you create audio content: V2-TTS's emotional range and singing capability could replace expensive voice actors for certain projects.
If you're a developer using AI APIs: V2-Pro's pricing makes it worth testing as a cheaper alternative to Claude or GPT for coding tasks. It's available now through partners like OpenRouter, OpenClaw, and OpenCode — with free access for one week at launch.
Xiaomi's team summed up their vision: "A model that only reads text lives in a library. A model that sees, hears, reasons, and acts lives in the world."