Chinese delivery giant Meituan has joined the crowded field of open-source artificial intelligence (AI) models, releasing its own system, LongCat-Flash-Chat, on Monday on GitHub, Hugging Face and its official website, Caixin reports. The model, known in Chinese as “龙猫” (LongCat), uses a Mixture-of-Experts (MoE) architecture with 560 billion total parameters, of which only 18.6 billion to 31.3 billion are activated for any given token. Although it is not a reasoning model, Meituan said LongCat-Flash-Chat delivers performance comparable to leading models while activating fewer parameters, and that it excels particularly in agentic tasks.
On Nvidia H800 chips, LongCat-Flash achieved generation speeds of 100 tokens per second, with output costs as low as RMB 5 ($0.69) per million tokens. Meituan also published benchmark comparisons against rivals including DeepSeek, Alibaba’s Qwen, Moonshot’s Kimi, Google’s Gemini, OpenAI’s GPT and Anthropic’s Claude.
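For a sense of scale, the figures reported above can be turned into a quick back-of-the-envelope calculation. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the 100 tokens-per-second figure describes a single generation stream, which the report does not specify.

```python
# Figures as reported in the article.
TOTAL_PARAMS_B = 560.0        # total parameters, in billions
ACTIVE_MIN_B = 18.6           # fewest parameters activated, in billions
ACTIVE_MAX_B = 31.3           # most parameters activated, in billions
TOKENS_PER_SECOND = 100.0     # reported generation speed on H800
COST_RMB_PER_MILLION = 5.0    # reported output cost per million tokens


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters that are activated."""
    return active_b / total_b


def seconds_for_tokens(n_tokens: int, tokens_per_second: float) -> float:
    """Time to generate n_tokens, assuming one stream at a constant speed."""
    return n_tokens / tokens_per_second


def output_cost_rmb(n_tokens: int, rmb_per_million: float) -> float:
    """Output cost in RMB for n_tokens at a flat per-million-token price."""
    return n_tokens / 1_000_000 * rmb_per_million


if __name__ == "__main__":
    low = active_fraction(ACTIVE_MIN_B, TOTAL_PARAMS_B)
    high = active_fraction(ACTIVE_MAX_B, TOTAL_PARAMS_B)
    print(f"Active share of parameters: {low:.1%} to {high:.1%}")

    secs = seconds_for_tokens(1_000_000, TOKENS_PER_SECOND)
    print(f"Time for one million tokens: {secs / 3600:.1f} hours")

    cost = output_cost_rmb(1_000_000, COST_RMB_PER_MILLION)
    print(f"Cost for one million tokens: RMB {cost:.2f}")
```

On these numbers, roughly 3.3% to 5.6% of the model's parameters are active at once, and a single stream at 100 tokens per second would need about 10,000 seconds, or close to 2.8 hours, to produce the million tokens priced at RMB 5.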
The results showed LongCat-Flash outperforming several mainstream models on agentic benchmarks such as τ2-Bench and VitaBench, ranking above DeepSeek V3.1, Kimi-K2, GPT-4.1 and Google’s Gemini 2.5 Flash. In coding tasks, however, it lagged behind DeepSeek, Kimi-K2 and Anthropic’s Claude Sonnet, though it still surpassed GPT-4.1 and Google’s lighter Gemini model.