---
title: "Alibaba AI beats Google and OpenAI in global coding rankings"
type: "News"
locale: "en"
url: "https://longbridge.com/en/news/287775800.md"
description: "Alibaba's AI model Qwen3.7-Max has secured the fourth position on the global Code Arena coding leaderboard, outperforming models from OpenAI and Google. This achievement highlights a shift among Chinese AI developers towards specialized coding agents. The ranking reflects real-world developer preferences, as users test models' abilities to create interactive web applications. The industry is increasingly focusing on coding capabilities, with predictions that success will depend on integrating AI models into developers' daily routines."
datetime: "2026-05-27T13:01:30.000Z"
locales:
  - [zh-CN](https://longbridge.com/zh-CN/news/287775800.md)
  - [en](https://longbridge.com/en/news/287775800.md)
  - [zh-HK](https://longbridge.com/zh-HK/news/287775800.md)
---

# Alibaba AI beats Google and OpenAI in global coding rankings

Alibaba Group Holding’s latest artificial intelligence model has clinched a top-tier spot on a major global coding leaderboard, making the Chinese technology giant the only developer other than Anthropic to break into the ranking’s top five.

The model, Qwen3.7-Max, scored 1,541 on the Code Arena ranking to claim the fourth spot globally, placing it ahead of rival models from OpenAI and Google. The other four spots in the top five were held by various iterations of Claude, the model family from AI powerhouse Anthropic.

Alibaba owns the South China Morning Post.

The ranking comes as Chinese AI developers are increasingly pivoting from general-purpose chatbots towards specialised coding agents and other autonomous systems, which investors view as the most commercially viable applications for generative AI.

Unlike traditional coding benchmarks such as HumanEval or SWE-bench, which rely on standardised tests, Code Arena has users test how well models can independently build complete, interactive web applications from scratch, based on user prompts. Users then vote on anonymised outputs in blind comparisons, meaning the leaderboard closely reflects the preferences of real-world developers.

The benchmark is run by Arena, an organisation founded by researchers from the University of California, Berkeley, in collaboration with the University of California San Diego and Carnegie Mellon University.

The industry’s growing focus on coding follows the success of US companies like Anthropic, whose Claude models and coding features have emerged as some of the first AI products to demonstrate sustained user engagement and meaningful revenue potential.

A survey conducted last year by Stack Overflow, a popular question and answer website for programmers, showed that 84 per cent of developers had used or planned to use AI tools, while 51 per cent of professional developers used AI tools daily.

Qwen3.7-Max was designed for autonomous tasks, allowing it to manage long-running workflows, use software tools and write code on its own. In a WeChat post, Alibaba said the model could handle complex tasks for up to 35 hours straight and use software tools more than 1,000 times in a row – all without human intervention.

The move reflects a broader industry shift away from conversational chatbots towards independent AI systems that can complete multi-step projects with minimal human supervision.

Several of Alibaba’s domestic rivals are also ramping up efforts in their coding units. Beijing-based DeepSeek recently announced two new positions related to coding agents – a product manager and a software engineer.

DeepSeek senior researcher Chen Deli said on social media that the new hires would work on a project “essentially benchmarking against Claude Code”.
The goal, he said, would be to develop a coding “harness” – the crucial software infrastructure required to transform a standard AI model into an autonomous AI agent.

Because software development relies on globally standardised programming languages, the segment presents lower barriers for Chinese models seeking international adoption than consumer-facing internet services do.

For now, however, US products such as Cursor, GitHub Copilot and Claude Code continue to dominate the global software development workflow.

Still, industry leaders, including Microsoft CEO Satya Nadella and Anthropic CEO Dario Amodei, have predicted that the long-term AI race will ultimately depend less on leaderboard scores and more on which companies can successfully embed their models into developers’ daily routines and become the default infrastructure for software creation.

### Related Stocks

- [KBAB.US](https://longbridge.com/en/quote/KBAB.US.md)
- [BABX.US](https://longbridge.com/en/quote/BABX.US.md)
- [BABA.US](https://longbridge.com/en/quote/BABA.US.md)
- [09988.HK](https://longbridge.com/en/quote/09988.HK.md)
- [GOOGL.US](https://longbridge.com/en/quote/GOOGL.US.md)
- [GOOG.US](https://longbridge.com/en/quote/GOOG.US.md)
- [OpenAI.NA](https://longbridge.com/en/quote/OpenAI.NA.md)
- [00583.HK](https://longbridge.com/en/quote/00583.HK.md)
- [MSFT.US](https://longbridge.com/en/quote/MSFT.US.md)
- [89988.HK](https://longbridge.com/en/quote/89988.HK.md)
- [HBBD.SG](https://longbridge.com/en/quote/HBBD.SG.md)

## Related News & Research

- [Alibaba AI voice model beats OpenAI, xAI to bridge Chinese dialect gap](https://longbridge.com/en/news/288128873.md)
- [How to fix AI's branding problem, according to top marketers](https://longbridge.com/en/news/287767399.md)
- [OpenAI's biggest problem isn't AI safety. It's Sam Altman.](https://longbridge.com/en/news/287766259.md)
- [Google Cloud Security Uses Instruqt Platform to Train 150+ Practitioners on Agentic AI at Google Next 2026](https://longbridge.com/en/news/287465754.md)
- [Anthropic surges past OpenAI with $900B valuation in new funding](https://longbridge.com/en/news/287959648.md)