---
title: "\"Language mixing\" caused? OpenAI o3-mini exposed for extensively using Chinese reasoning"
description: "On the 1st, OpenAI launched the lightweight AI model o3-mini, but netizens discovered that it used Chinese reasoning extensively without user intervention, and even when asked in Russian, it would thi"
type: "news"
locale: "en"
url: "https://longbridge.com/en/news/227139940.md"
published_at: "2025-02-05T01:52:07.000Z"
---

# "Language mixing" caused? OpenAI o3-mini exposed for extensively using Chinese reasoning

> On the 1st, OpenAI launched the lightweight AI model o3-mini, but netizens discovered that it reasoned extensively in Chinese without any user prompting, and that it would think in Chinese even when asked in Russian. This raised questions about whether OpenAI had borrowed from China's DeepSeek model. Experts pointed out that AI models do not understand the differences between languages; they only process text and tokens, which leads to the phenomenon of "language mixing." Similar issues have also been found in other AI models.

OpenAI launched its latest lightweight artificial intelligence model, o3-mini, on the 1st, but netizens abroad discovered that it reasoned extensively in Chinese without any user prompting. Interestingly, even when asked in Russian, o3-mini-high would also think in Chinese. This has led netizens abroad to suspect that OpenAI is "borrowing" from China's DeepSeek model.

Chinese financial media outlet "Wall Street Insight" reported that netizens asked OpenAI and its CEO, Sam Altman, why o3-mini reasons in Chinese. Netizen Annalisa Fernandez suggested that perhaps Chinese is the "soul language" of LLMs (large language models).

The report stated that this is not the first time such a phenomenon has occurred with OpenAI's models.
As early as February 2024, developers raised similar questions in the OpenAI developer community, though in those cases other languages were mixed in; OpenAI's reasoning model o1 also exhibited similar behavior. In fact, this "language mixing" phenomenon has also been observed in other AI models, such as Google's Gemini, which mixes in German.

Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, pointed out that "the model does not know what a language is or what the differences between languages are, because to it, these are just texts."

In reality, the way models perceive language is completely different from how most people understand it. Models do not process words directly; they process tokens. For example, "fantastic" can be a single token; it can be split into three tokens, "fan," "tas," and "tic"; or it can be broken down completely, with each letter as its own token.

However, this kind of splitting can cause misunderstandings. Many tokenizers assume that a space marks the start of a new word, but not all languages separate words with spaces; Chinese, for example, does not.

DeepSeek analyzed this phenomenon in its paper. The research team found that when reinforcement learning prompts involve multiple languages, the reasoning chain often exhibits language mixing. For now, "language mixing" remains a pressing, unresolved issue. After all, DeepSeek-R1 is optimized only for Chinese and English, and it may also mix languages when handling queries in other languages.

!\[\](https://imageproxy.pbkrs.com/https://pgw.udn.com.tw/gw/photo.php/query-dT1odHRwczovL3VjLnVkbi5jb20udHcvcGhvdG8vMjAyNS8wMi8wNS9yZWFsdGltZS8zMTQzOTkxNC5qcGcmeD0wJnk9MCZzdz0wJnNoPTAmc2w9VyZmdz0xMDUwJmV4cD0zNjAwJmV4cD0zNjAw?

OpenAI launched its latest lightweight artificial intelligence model, o3-mini, on the 1st. However, some netizens abroad discovered that it reasons extensively in Chinese without any user prompting.
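Netizens abroad suspect that OpenAI is "learning from" China's DeepSeek model. (AFP)

### Related Stocks

- [OpenAI.NA - OpenAI](https://longbridge.com/en/quote/OpenAI.NA.md)
- [DXYZ.US - Destiny Tech100](https://longbridge.com/en/quote/DXYZ.US.md)
- [GOOGL.US - Alphabet](https://longbridge.com/en/quote/GOOGL.US.md)
- [BPF.SG - YHI](https://longbridge.com/en/quote/BPF.SG.md)
- [002230.CN - IFLYTEK](https://longbridge.com/en/quote/002230.CN.md)
- [GOOG.US - Alphabet - C](https://longbridge.com/en/quote/GOOG.US.md)
- [DPSK.NA - DeepSeek](https://longbridge.com/en/quote/DPSK.NA.md)
- [00020.HK - SENSETIME-W](https://longbridge.com/en/quote/00020.HK.md)

## Related News & Research

| Title | Description | URL |
|-------|-------------|-----|
| ChatGPT begins testing ads | OpenAI has begun testing ads in the free and lowest-priced paid tiers of ChatGPT, aiming to increase revenue to offset rising costs. The test targets adult users in the United States and covers the free and Go subscription plans (US$8 per month). Although most users do not pay, OpenAI promises that ads will not affect the content of answers, and users' conversation content | [Link](https://longbridge.com/en/news/275484431.md) |
| OpenAI's first hardware reportedly launching this year; AirPods-like device to be "downgraded" because of memory shortage | OpenAI plans to launch its first hardware device, "Dime," which resembles AirPods and is expected to be released this year. Because of a memory shortage, the original high-spec design has been simplified, and the final product will be a simple earphone. The product was originally slated to carry a high-performance Exynos chip with standalone computing capability, but this was adjusted for cost reasons. It is expected to be produced by Foxconn in Vietnam, | [Link](https://longbridge.com/en/news/275219739.md) |
| Anthropic valuation surges to US$350 billion as Abu Dhabi's MGX and Blackstone race to add investment | Abu Dhabi's MGX and Blackstone Group are racing to invest in Anthropic, whose valuation has soared to US$350 billion, with the fundraising expected to exceed US$20 billion. MGX plans to invest several hundred million dollars; if the deal is finalized, it will simultaneously hold OpenAI, xAI, and Anthrop | [Link](https://longbridge.com/en/news/275546114.md) |
| OpenAI executive: Engineers are becoming "wizards," and AI will kick off a new startup boom | Inside OpenAI: 95% of engineers already use AI for programming, and code review has been fully taken over by Codex! Lead Sherwin Wu predicts that within the next two years models will be able to handle tasks lasting several hours, and engineers are becoming "wizards" who direct agents. As models swallow the middle layer, serving "super individuals" | [Link](https://longbridge.com/en/news/275998627.md) |
| Forced to disappear for being "too human"? Why OpenAI is permanently shutting down GPT-4o | OpenAI announced that it will permanently shut down the GPT-4o model on February 13. The model's highly anthropomorphic and overly accommodating traits led users to develop serious emotional dependence, and even triggered multiple lawsuits involving suicide and mental health crises. Despite strong protests from some users, the company still decided to force it offline on safety grounds, shifting to a more protective alternative | [Link](https://longbridge.com/en/news/275419737.md) |

---

> **Disclaimer**: This article is for reference only and does not constitute any investment advice.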