---
title: "Giants are swarming in, and SenseTime is fighting with its back against the wall."
type: "Topics"
locale: "zh-CN"
url: "https://longbridge.com/zh-CN/topics/32238292.md"
description: "As global tech giants race to bet on the new AI trend of embodied intelligence, China's iconic AI company SenseTime has also sounded the horn for a full-scale advance. This tech firm, once famous for its computer vision technology and ranked as the leader among the 'AI Four Dragons,' is now attempting a comeback through the strategic combination of 'large models + robots' after experiencing the growing pains of transformation in the era of large models. Image source from pixabay From intensive capital operations to the gathering of top talent, from restructuring technical routes to building ecosystem alliances, SenseTime's layout for embodied intelligence goes far beyond simple business expansion..."
datetime: "2025-07-25T12:03:18.000Z"
locales:
  - [en](https://longbridge.com/en/topics/32238292.md)
  - [zh-CN](https://longbridge.com/zh-CN/topics/32238292.md)
  - [zh-HK](https://longbridge.com/zh-HK/topics/32238292.md)
author: "[港股研究社](https://longbridge.com/zh-CN/profiles/3199113.md)"
---

> Supported languages: [English](https://longbridge.com/en/topics/32238292.md) | [繁體中文](https://longbridge.com/zh-HK/topics/32238292.md)

# Giants are swarming in, and SenseTime is fighting with its back against the wall.

As global tech giants race to bet on embodied intelligence, the newest trend in AI, China's iconic AI company SenseTime is also sounding the horn for a full-scale entry. Once renowned for its computer vision technology and ranked first among the 'AI Four Little Dragons,' the company went through a painful transformation in the large model era and is now attempting a comeback through the strategic combination of 'large models + robots.'

Image source: Pixabay

From intensive capital operations to recruiting top talent, from rebuilding its technology route to forming ecosystem alliances, SenseTime's push into embodied intelligence is far more than a simple business expansion; it is a life-and-death battle for transformation.

**A strategic turn toward the hottest track: why is SenseTime going all in on embodied intelligence?**

The embodied intelligence track is increasingly crowded. Ant Group has directly established a subsidiary, 'Ant Lingbo Technology'; Meituan has led a series of investments in Itstone Zhihang and Xinghaitu; and JD has repeatedly invested in Qianxun Intelligence and Zhongqing Robotics. The overseas battlefield is just as heated, with Google's RT-2 model, Figure AI's Helix system, and NVIDIA's world model all vying for the high ground of physical-world interaction.

SenseTime is a former benchmark enterprise of China's AI industry. Together with Megvii, CloudWalk, and Yitu, it is known as one of the 'AI Four Little Dragons,' and it shone in fields such as security and smart cities thanks to its leading computer vision technology. After it listed on the Hong Kong stock market in 2021, its market value exceeded 150 billion Hong Kong dollars on the first day.

However, in the large model era, these vision-focused AI companies collectively ran into development bottlenecks. SenseTime's 2024 financial report shows annual revenue of 3.772 billion yuan against a net loss of 4.307 billion yuan, so the loss exceeded total revenue. CloudWalk Technology fared similarly poorly: its 2024 revenue fell 36.7% year-on-year and its net loss widened to 696 million yuan. Megvii and Yitu both saw their businesses contract, with Yitu even closing offices in multiple cities and its medical business nearly coming to a standstill.
In the large model wave in particular, companies such as OpenAI, Moonshot AI, and DeepSeek have risen rapidly on the strength of large language models, while the 'Four Little Dragons' have remained focused on computer vision, with government projects in security and transportation accounting for more than 70% of their core revenue. SenseTime's strategic transformation is therefore, in effect, a desperate battle fought under survival pressure.

From another perspective, SenseTime's entry is also a long-planned 'gene extension.' Its core team has reportedly taken initial shape, with some members coming from its original intelligent driving business and others being computer vision experts and senior practitioners from the robotics field.

This flow of talent also reveals what the two industries have in common. Autonomous driving and embodied intelligence are highly similar in underlying technologies such as environmental perception and real-time modeling. After all, 'a car is a robot with four wheels,' and the algorithms and simulation platforms built for intelligent driving can, to a certain extent, be transferred directly to robot development.

Moreover, embodied intelligence (Embodied AI) is seen as a key breakthrough for bringing AI technology into real-world deployment. Its core is a closed loop of 'perception - understanding - decision - execution' carried out through physical entities such as robots. The concept was named as a future industry for the first time in the 2025 government work report, which immediately triggered a capital frenzy. In the first half of the year alone, domestic financing in this field exceeded 20 billion yuan across 130 financing events, far surpassing the total for all of 2024.

Citing Musk's vision, the industry widely predicts that humanoid robots will eventually become a mainstay of industry and outnumber humans, reaching 10 billion to 20 billion units and forming a 'new terminal market not inferior to mobile phones.'

SenseTime is entering at this moment precisely because it hopes to use the combined path of 'large models + robots' to turn its accumulated strengths in visual recognition, multimodal perception, and large model training into a new growth engine.

**Now in the game, SenseTime has its own 'embodied intelligence' formula**

From visual recognition that 'understands the world,' to multimodal large models that 'think about the world,' to an embodied intelligence system that is about to 'transform the world,' SenseTime's entry into embodied intelligence is not a sudden move but a gradual leap built on its technical accumulation.

The 'Jueying Kaiwu' system, developed for intelligent driving by the team led by SenseTime co-founder Wang Xiaogang, can already understand physical laws and learn traffic rules. Because cars and robots are both, in essence, embodied intelligent agents, this makes technology transfer possible.

In addition, SenseTime has adopted a pragmatic, phased strategy for its technology route.

In August 2022, SenseTime launched the home chess robot 'Yuan Luobo,' its first household consumer-grade AI product. The product tightly combined visual algorithms with a robotic hand to grasp chess pieces precisely even when they are partly occluded, building an initial closed-loop framework of 'vision - perception - decision.' Although its functionality is narrow, it marks SenseTime's attempt to break through the 'open-loop' limitations of traditional AI, moving from 'thinking' about the world in the cloud to truly interacting with the physical world.

In April 2025, SenseTime released the 'SenseNova V6' multimodal large model, which adopts a mixture of experts (MoE) architecture with 600 billion parameters. It delivers across-the-board improvement in 'long thinking chain × mathematical ability × reasoning ability × global memory,' with particular gains in multimodal deep reasoning.
This model has also been connected to the humanoid robot 'Feiyan,' giving it panoramic visual perception, emotional interaction, and mental health screening functions, and allowing it to think and express itself more naturally.

Beyond that, SenseTime's soon-to-be-released embodied intelligence 'brain' platform represents a new level of technology integration for the company. According to the information disclosed so far, the platform aims to integrate advanced perception, visual navigation, and multimodal interaction capabilities to empower robots and a range of intelligent terminals.

It is worth noting that SenseTime's transformation is advancing on three fronts at once. At the capital level, it is raising funds through new share placements and business spin-offs; at the technical level, it is building foundational capabilities on its large-scale computing power platform and the SenseNova large models; at the ecosystem level, it is rapidly forming industry alliances through strategic cooperation, investments, and mergers and acquisitions.

This across-the-board push reflects SenseTime's determination to transform, and it also hints at the time pressure and competition the company faces. The embodied intelligence track has now entered its second stage of development, with giants entering one after another. SenseTime must capture the dividends of this robotics wave; otherwise it may miss its chance to turn the tide.

**With giants gathered in embodied intelligence, what are SenseTime's odds?**

Although the embodied intelligence track has broad prospects, it has already become a brutal red ocean in which tech giants and startups compete on the same stage.

SenseTime faces challenges from domestic and international competitors on multiple fronts, and each of these opponents has its own strengths in technology route, capital, and ecosystem building.

Globally, OpenAI is collaborating with the robotics company Figure AI to develop general-purpose robots, Google has launched the RT-2 embodied intelligence model, and NVIDIA is focusing on world models and simulation technology.

In the domestic market, Huawei released its CloudRobo embodied intelligence platform, which serves as a robot 'brain,' in June 2025; ByteDance's Seed team launched the general-purpose robot model GR-3 on July 22; and the Beijing Academy of Artificial Intelligence (BAAI, also known as the Zhiyuan Research Institute) had earlier released RoboOS, a cross-embodiment collaboration framework for embodied brains, and the open-source embodied brain RoboBrain.

Unitree R1 by Unitree Robotics (image source: Caixin)

Meanwhile, internet giants are also stepping up their investments. JD has led investments in three robot companies, and Meituan has led a succession of financing rounds for robot-related projects.

By comparison, SenseTime's core advantages lie in its years of accumulation in computer vision, its early bet on multimodal large models, and its strong computing power infrastructure.

Visual information accounts for more than 80% of human perception, and SenseTime has long been at the forefront of machine vision, with deep technical reserves in image recognition, video analysis, and environmental understanding.

In addition, SenseTime's 'SenseNova' large model series is among the domestic leaders in multimodal integration. The V6 version offers a thinking chain of up to 64K, understanding of videos up to 10 minutes long, and deep reasoning, providing a solid foundation for the cognitive decision-making that embodied intelligence requires.

Moreover, a computing power scale of 23,000 PetaFlops allows SenseTime to support large-scale simulation training and complex model iteration, an infrastructure advantage that is hard for others to match in the short term.

Its disadvantages are a lack of hardware experience, cash flow pressure, and persistent losses.
Compared with companies such as Tesla and Huawei, which have mature hardware supply chains, SenseTime is starting almost from scratch in robot body design, motion control, and hardware integration. Cooperation with companies such as Fourier and Songying can partly make up for this shortcoming, but building core hardware capabilities still requires long-term investment.

Embodied intelligence is a field that demands sustained investment, so balancing R&D spending against profit expectations will be a major test for SenseTime.

The uncertainty of the technical route is another pressure SenseTime has to face. The field has not yet settled on a unified technical standard: VLA models, 'big and small brain' architectures, and world models are developing in parallel, each with its own advantages and disadvantages.

In addition, the scaling law of embodied intelligence differs from that of language models. As parameters and data volume grow, the marginal cost of improving system performance may be higher. SenseTime needs to judge the direction of technological evolution accurately to avoid misallocating resources.

**Conclusion**

SenseTime's move into embodied intelligence is, in essence, a computer vision leader's leap from 'understanding the world' to 'transforming the world.'

Facing the collective dilemma of the AI Four Little Dragons, namely falling out of step technically in the large model era and relying on government projects, SenseTime has chosen to attempt a life-and-death breakout with the combination of 'large models + robots.' The outcome of this battle not only concerns the company's survival but will also reshape China's position in the global embodied intelligence competition.

Author: Turkish Hot Air Balloon

Source: Hong Kong Stock Research Society

### Related stocks

- [SENSETIME-WR (80020.HK)](https://longbridge.com/zh-CN/quote/80020.HK.md)
- [SENSETIME-W (00020.HK)](https://longbridge.com/zh-CN/quote/00020.HK.md)