---
title: "Mobilizing a large workforce to enter the market, JD.com aims to \"refine\" embodied data"
type: "News"
locale: "zh-CN"
url: "https://longbridge.com/zh-CN/news/279553310.md"
description: "Targeting the data desert of embodied intelligence"
datetime: "2026-03-18T07:49:08.000Z"
locales:
  - [zh-CN](https://longbridge.com/zh-CN/news/279553310.md)
  - [en](https://longbridge.com/en/news/279553310.md)
  - [zh-HK](https://longbridge.com/zh-HK/news/279553310.md)
---

> Supported languages: [English](https://longbridge.com/en/news/279553310.md) | [繁體中文](https://longbridge.com/zh-HK/news/279553310.md)

# Mobilizing a large workforce to enter the market, JD.com aims to "refine" embodied data

On March 16, JD.com announced the establishment of the world's largest and most comprehensive embodied intelligence data collection center, making a splash in a robotics sector that had been quiet for a while, overshadowed by the lobster phenomenon.

In a sense, this is a mass-production campaign for data with a strong industrial-internet flavor. The mobilization involves more than 100,000 internal employees and up to 500,000 external personnel from various industries, with more than 100,000 citizens mobilized in Suqian alone. **This unprecedented human-wave tactic attempts to break through the most critical weakness of embodied intelligence today, data scarcity, through sheer brute-force scale.**

As model architectures gradually converge and the threshold for computing power becomes relatively transparent, high-quality physical interaction data has become the single decisive factor in whether robots can truly enter various industries.
This operation, billed as "the largest data collection operation in human history," reveals an industry consensus: as the "cerebellum" responsible for motion control in embodied intelligence matures, how to feed higher-quality data to a "brain" that truly understands the physical world is becoming the core battle that will determine the industry's future landscape.

Moving from JD.com's grand narrative to the industry's day-to-day reality, it is still hard to tell whether the data generated by these hundreds of thousands of people will turn out to be a gold mine or gravel.

## Involved Workers

JD.com dares to launch this human-wave data campaign, and in fact must launch it, because of its vast and highly complex self-operated physical supply chain. Unlike pure software internet companies, JD.com is itself a huge interactive physical world, and the maturity of embodied intelligence bears directly on its fulfillment costs and operational efficiency over the next decade.

This move is deeply coupled with the robotics industry ecosystem in Beijing's Yizhuang. The Yizhuang Economic and Technological Development Zone has gathered more than 300 robotics-related enterprises, with an industrial chain worth more than 10 billion yuan and more than 40 real application scenarios opened up, making it the core cluster for the domestic humanoid robotics industry.

As a "chain master" enterprise rooted in Yizhuang, JD.com had previously released a plan to accelerate the robotics industry. Its heavy investment in soft infrastructure, represented by the data collection center, essentially fills the weakest link in the industrial chain. Yizhuang provides the "body" and the testing ground, while JD.com attempts to use massive real-world scenarios to give robots the common sense they need to understand the real world.
**This resonance between the software and hardware sides of the industry aims to create a commercial closed loop from data flywheel to hardware iteration.**

Coordinating hundreds of thousands of people is no easy task. According to the plan, the collection scenarios cover logistics, industry, retail, and more. In practice, this will likely rely on JD.com's existing digital management network. For example, frontline couriers and warehouse sorting staff may wear devices equipped with visual and even tactile sensors during daily operations.

For frontline employees and the mobilized citizens of Suqian, the campaign is a complicated matter. Employees become data teachers for robots whose eventual goal is to replace high-intensity manual labor. How to design reasonable compensation incentives and benefit-distribution mechanisms that avoid employee resistance has become something JD.com must consider.

However, the specific implementation has not yet been communicated to employees. A JD.com employee in the Beijing area told Wall Street Insight that he had not yet heard about the matter. In his view, **if there is corresponding compensation, it should be considered a market behavior, and whether employees are willing to participate depends on personal choice.** A JD.com employee in Suqian also told Wall Street Insight that he had not received any relevant notification.

Although the official announcement states that "JD.com will strictly comply with laws and regulations for all data collection," the reality is often more complex. Warehouse assembly lines are standardized, but express delivery reaches into countless households, and retail scenarios capture large numbers of consumers' facial features and private habits. In today's increasingly strict data compliance environment, the cost of desensitizing and cleaning unstructured data collected by hundreds of thousands of people could be astronomical.
## Breaking the Moravec Paradox

In 1988, roboticist Hans Moravec concluded: **"It is easy to get computers to perform at adult levels on intelligence tests or chess, but it is extremely difficult, if not impossible, to give them the perceptual and motor skills of a one-year-old."**

Today, the Moravec paradox shows up in embodied intelligence mainly as the industry's data vacuum. The success of large models is built on directly consuming the trillions of high-quality text samples accumulated on the internet over thirty years. The physical world, however, has no ready-made internet. To scale in the real world, embodied intelligence has to get over a huge data wall.

JD.com's major initiative targets precisely this point and the dilemmas behind data collection.

The first dilemma is the limits of simulation. The industry's mainstream paths for obtaining data have diverged sharply, and each is struggling with its own bottleneck.

Currently, the vast majority of startups rely heavily on simulation environments, such as NVIDIA's Isaac Sim or MuJoCo, that let robots run through millions of reinforcement learning iterations in a virtual world. This method is extremely cheap and fast, and there is no need to worry about hardware damage caused by trial and error.

However, **experienced practitioners are increasingly aware of the limitations of "Sim-to-Real."** The complexity of the physical world lies not only in changes of light and shadow but also in extremely subtle physical contact feedback, such as the flexible deformation of cables, the non-rigid pulling of clothing, the slight changes in friction as a screw is tightened, and even the electromagnetic noise of the sensors themselves.
**The computing power of current physics engines cannot perfectly simulate these high-dimensional, nonlinear microscopic physical laws.** As a result, many models that perform perfectly in simulation suffer severe "brain block" or distorted movements once deployed on real machines.

Since simulation leaves a gap, the industry turns back to the real world. From Mobile ALOHA, which became popular at Stanford, to today's leading companies such as Figure AI, Yushu, and Zhiyuan, teleoperation is used extensively: humans wear motion capture suits or use VR devices to control robots like avatars as they perform tasks, recording first-person visual, joint angle, and torque data.

This is currently recognized as the highest-quality method of data acquisition, but it runs into the second major dilemma of data collection: commercially, the input-output ratio is extremely unfavorable.

According to industry estimates, the hardware cost of a single full-sized humanoid robot can easily reach hundreds of thousands or even millions. Collecting effective data through teleoperation not only incurs high hardware depreciation costs but also requires paying high labor costs for specialized operators.

Wall Street Insight has learned that collecting and cleaning a single high-quality data sample for a complex interactive task can cost hundreds of dollars, with a very high failure rate. This workshop-style, hand-crafted data model cannot support the scale of hundreds of billions or trillions of parameters that embodied intelligence needs in order to generalize.

To lower the threshold, giants like Google have initiated open-source dataset programs such as Open X-Embodiment, attempting to pool data from laboratories around the world for use across the industry. Domestic companies have also chosen to open-source million-scale real-robot datasets.
However, this runs into the third major dilemma of data collection, a huge engineering challenge: the extreme fragmentation of robot hardware. Robot dogs, wheeled robots, bipedal humanoids, and even humanoid robots from different manufacturers have completely different joint degrees of freedom, motor torques, sensor layouts, and centers of gravity.

High-quality grasping data collected on a UR5 robotic arm cannot be directly transferred to a Tesla Optimus or a JD.com logistics robot. It is precisely the difficulty of "cross-body mapping" that has turned most existing open-source data into scattered islands that struggle to form economies of scale.

Perhaps it is these three major dilemmas that have fundamentally changed the logic of commercial competition in embodied intelligence: whoever possesses real deployment scenarios has a moat for continuously acquiring cheap, high-quality closed-loop data.

This explains why Tesla and JD.com have chosen a route completely different from that of pure hardware startups. Tesla relies on its massive Gigafactory, letting Optimus experiment day and night on real battery-sorting assembly lines, while JD.com attempts to build a semi-automated data assembly line out of its nationwide logistics network, hundreds of thousands of industrial workers, and a vast physical retail system. This approach directly converts the company's supply chain barriers into data barriers for the AI era.

In stark contrast, many robot startups without scenarios of their own are forced to adapt. They either sell hardware at a loss to universities and research institutions in exchange for researchers sharing data, or spend heavily to rent factory space or hire emerging embodied intelligence data service providers like JianZhi to produce custom data.
It can be said that JD.com's entry has completely torn away the algorithmic veil of the embodied intelligence industry, pulling it into an asset-heavy commercial battle fought over capital, scenarios, and workforce scheduling. In the face of data scarcity, the algorithmic moat is becoming shallower, while the giants that control the gateways to real-world physical interaction are quietly tightening the net that leads to AGI.

## Scarcer High-Quality Data

JD.com's plan to "accumulate over 10 million hours of real scene data within two years" has been met by industry insiders not with unbridled enthusiasm but with cool scrutiny. In embodied intelligence, the quality and modality of data matter far more than sheer duration.

Algorithm practitioners point to the current core pain point: what is lacking is not first-person video from a human perspective, but "state-action pairs" that include precise physical feedback.

For example, a citizen in Suqian walking through a supermarket with a camera, or a courier recording the delivery process, generates a massive amount of internet-scale, general-purpose visual data. This data is highly valuable for training a robot's world model, helping it understand what a door is and what an apple is. But for training a robot's "control strategy," teaching it how many newtons of force to apply to hold an apple without crushing it, such purely visual data is almost useless.

A person in the robotics industry told Wall Street Insight that **what robots lack is valuable data, especially real machine data.** In their view, JD.com's operation still falls under business process outsourcing (BPO): it provides personnel and venues.

When humans physically grasp an object, the motion is accompanied by extremely complex tactile, force, and proprioceptive adjustments. This high-dimensional tacit knowledge cannot be captured by ordinary wearable devices.
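To make the distinction between video-only recordings and "state-action pairs" concrete, here is a minimal Python sketch. The `Frame` record and all of its field names are hypothetical illustrations, not JD.com's schema or any published dataset format; the assumption is simply that a frame can supervise a control strategy only when it carries both a measured physical state and the action that was taken.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """One recorded timestep. All field names are hypothetical."""
    timestamp_s: float
    rgb_frame_id: str                               # reference to a first-person video frame
    joint_angles_rad: Optional[List[float]] = None  # proprioceptive state, if recorded
    grip_force_n: Optional[float] = None            # contact force in newtons, if recorded
    action: Optional[List[float]] = None            # commanded joint targets, if recorded


def is_state_action_pair(frame: Frame) -> bool:
    """True only when the frame has both physical state and the action taken."""
    has_state = frame.joint_angles_rad is not None and frame.grip_force_n is not None
    return has_state and frame.action is not None


def usable_fraction(frames: List[Frame]) -> float:
    """Share of frames that could supervise a control strategy, not just a world model."""
    if not frames:
        return 0.0
    return sum(is_state_action_pair(f) for f in frames) / len(frames)


# A camera-only recording versus a teleoperated grasp with force feedback.
video_only = Frame(timestamp_s=0.0, rgb_frame_id="walk_0001")
teleop = Frame(
    timestamp_s=0.1,
    rgb_frame_id="grasp_0001",
    joint_angles_rad=[0.1, 0.5, -0.3],
    grip_force_n=2.5,
    action=[0.12, 0.52, -0.28],
)
print(usable_fraction([video_only, teleop]))  # 0.5
```

In this toy example, the camera-only frame still has value for visual pre-training, but only the teleoperated frame records how much force was applied and which command produced it.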
If JD.com's hundreds of thousands of personnel contribute only videos, the loss incurred when converting that footage into executable robot actions will be astonishingly high.

The head of another leading domestic robotics company said candidly that the industry's primary challenge is "the lack of a unified data set definition standard." Each robotics company has different joint degrees of freedom, sensor positions, and actuator types. How can the massive human motion data collected by JD.com be retargeted and mapped onto robot bodies with different configurations?

Without a unified underlying standard, these 10 million hours of data may ultimately become nothing more than proprietary nourishment for JD.com's self-developed robots, rather than infrastructure that drives progress across the entire industry.

This may be why JD.com emphasized "1 million hours of robotic body data collection" in its first-year plan. The real direction of industry development is to use general-purpose human videos for pre-training world cognition, high-quality robot-body data for fine-tuning skills, and reinforcement learning self-exploration for evolution and iteration.

JD.com's announcement that it will build an embodied intelligence data collection center marks the start of domestic companies attempting to address the robotics industry's data shortage through scaled, engineered methods. Combining physical scenarios with large-scale manpower can indeed open a new path for data accumulation. However, to truly achieve "intelligent emergence" in robots, merely piling up data at scale is not enough.
How to ensure high-dimensional, high-quality data in massive collections, **how to establish unified data standards, and how to properly handle privacy and compliance issues in large-scale collection will be the challenges that enterprises and the entire industry must address as they move toward commercialization.**

### Related Stocks

- [JD LOGISTICS (02618.HK)](https://longbridge.com/zh-CN/quote/02618.HK.md)
- [JD HEALTH (06618.HK)](https://longbridge.com/zh-CN/quote/06618.HK.md)
- [JD.com (JD.US)](https://longbridge.com/zh-CN/quote/JD.US.md)
- [JD-SW (09618.HK)](https://longbridge.com/zh-CN/quote/09618.HK.md)
- [Amplify Online Retail ETF (IBUY.US)](https://longbridge.com/zh-CN/quote/IBUY.US.md)
- [Proshares Big Data Refiners ETF (DAT.US)](https://longbridge.com/zh-CN/quote/DAT.US.md)
- [KraneShares 2x Long JD Daily ETF (KJD.US)](https://longbridge.com/zh-CN/quote/KJD.US.md)
- [ProShares Online Retail (ONLN.US)](https://longbridge.com/zh-CN/quote/ONLN.US.md)
- [Global X E-commerce ETF (EBIZ.US)](https://longbridge.com/zh-CN/quote/EBIZ.US.md)
- [Franklin Exponential Data ETF (XDAT.US)](https://longbridge.com/zh-CN/quote/XDAT.US.md)
- [VanEck Retail ETF (RTH.US)](https://longbridge.com/zh-CN/quote/RTH.US.md)
- [SPDR S&P Retail ETF (XRT.US)](https://longbridge.com/zh-CN/quote/XRT.US.md)
- [Global X Data Center & Dgtl Infrs ETF (DTCR.US)](https://longbridge.com/zh-CN/quote/DTCR.US.md)

## Related News and Research

- [JD.com files Form 3 disclosing CEO and director Xu Ran's beneficial ownership](https://longbridge.com/zh-CN/news/279665191.md)
- [JD.com launches Joybuy in Europe, emphasizing same-day delivery](https://longbridge.com/zh-CN/news/279368753.md)
- [Tcfg Wealth Management LLC Makes New Investment in The TJX Companies, Inc. $TJX](https://longbridge.com/zh-CN/news/279597584.md)
- [JD.com launches Joybuy in Europe, targeting Amazon](https://longbridge.com/zh-CN/news/279175472.md)
- [Huatai Securities Keeps Their Buy Rating on JD.com, Inc. Class A (9618)](https://longbridge.com/zh-CN/news/278664490.md)