---
title: "DeepSeek, Qwen, Hunyuan, Wenxin, Kimi, and Zhipu, the six major domestic large models, who is the strongest 'financial analyst'?"
type: "Topics"
locale: "en"
url: "https://longbridge.com/en/topics/32016044.md"
description: "Whenever we browse financial reports, we may only want to extract key financial information, but we are always distracted by the complex business descriptions and lengthy management speeches in the reports, requiring a lot of effort to identify useful financial information. Especially for Hong Kong and U.S. stocks, most domestic financial software is built based on domestic financial standards, and when faced with non-standard financial statements, errors in extracting certain items often occur. In the era of large AI models, such financial research obstacles may be overcome; after all, models excel at summarizing language and calculating data. In this article..."
datetime: "2025-07-21T00:41:39.000Z"
locales:
  - [en](https://longbridge.com/en/topics/32016044.md)
  - [zh-CN](https://longbridge.com/zh-CN/topics/32016044.md)
  - [zh-HK](https://longbridge.com/zh-HK/topics/32016044.md)
author: "[锦缎研究院](https://longbridge.com/en/profiles/2576456.md)"
---

# DeepSeek, Qwen, Hunyuan, Wenxin, Kimi, and Zhipu: Which of the Six Major Domestic Large Models Is the Strongest 'Financial Analyst'?

When we read a financial report, we often want only the key financial information, yet the complex business descriptions and lengthy management statements get in the way, and picking out the useful figures takes real effort. This is especially true for Hong Kong and US stocks: most domestic financial software displays information according to domestic accounting standards, so it often makes errors when extracting line items from non-standard financial statements.
In the era of large AI models, such obstacles to financial research may be overcome; after all, models are best at summarizing language and calculating data. In this article, we evaluate six mainstream domestic models to explore how far their financial report analysis capabilities have developed and what problems remain.

Reading tip: Because the evaluation is long and detailed, you can scroll directly to the "Conclusion" section at the bottom of the article for the final results.

## **01 Evaluation Objects, Logic, and Standards**

**We selected six major domestic models for evaluation:**

**DeepSeek-R1**

**Qwen3-235B-A22B**

**Hunyuan-T1**

**Kimi-K1.5**

**ERNIE-X1-Turbo**

**GLM-4-Plus**

**In terms of evaluation logic, we built the questions as a "layered progression." To be an excellent "AI financial analyst," a model must have capabilities at multiple levels.**

**We therefore designed four levels of tests covering six dimensions, progressing from basic to advanced:**

**First Level: Basic Information Extraction**

This is the most basic ability: the model must read financial reports accurately. If data extraction is wrong, all later analysis becomes meaningless.

**Second Level: Analysis, Calculation, and Verification**

Calculation is what models do best, but a model must also put the data to use, growing from a "reader" into an "analyst."

**Third Level: Induction, Reasoning, and Insight**

Models need to see deeper: they must go beyond the literal information and discover the logic hidden behind the text. We therefore designed two assessment dimensions for the third level: "efficient induction and refinement ability" and "sensitive risk and emotion recognition ability."

**Fourth Level: Strategic Summary and External Knowledge Integration**

Top-notch analysis requires an industry-wide view, so the model must understand a company's strategic statements.
The limited content in its knowledge base is not enough either; the model also needs to connect with the outside world for horizontal comparison. For this level we likewise designed two assessment dimensions: "recognition of corporate strategy and positioning" and "external information search and integration."

**In terms of standards, we gave every model the same prompts (listed in full below) to keep the rules uniform.**

## **02 Horizontal Evaluation of Six Financial Analysis Capabilities**

**1) Accurate Data Extraction Ability: the Model's Basic Skill, Where Precision Is Key**

Can the model, like a meticulous accountant, extract key financial data, specific expense items, and the business achievements mentioned by management from a PDF financial report without error? This ability directly determines the reliability of all subsequent analysis, so we focus on accuracy and stability.

**Prompt:**

Test1.1: Please extract the following key financial data from the "Meituan-2025 Q1" financial report and return the results in table form: 1. Total operating income; 2. Operating costs; 3. Net profit.

Test1.2: Please find and list the specific amounts of the following expense items and return the results in table form: 1. R&D expenses; 2. Sales and marketing expenses.

Test1.3: Please carefully read the "Business Review and Outlook" section of the "Meituan-2025 Q1" financial report and summarize the three most important business highlights or achievements mentioned by management this quarter.

**Evaluation Conclusion:**

**All models evaluated in this article successfully extracted the specified core financial data and specific expense items.** Among them, ERNIE-X1-Turbo, Hunyuan-T1, Kimi-K1.5, and Qwen3-235B-A22B also thoughtfully converted the units in the financial report from thousands to billions, which better matches user habits.
For non-financial key information, the models' focus varies slightly, but most concentrate on the strong growth in income and profit of the core local business, the rapid development of the flash sales and instant retail business, the continued optimization of the food delivery business, and the upgrade of the rider rights protection system.

**2) Rigorous Calculation and Verification Ability: Not Only Calculating, but Also Explaining**

After extracting the data, can the model play the role of an "auditor"? This has two aspects. First, can it use the correct formulas to calculate core financial indicators such as gross profit margin and current ratio from the extracted data, and explain what they mean? Second, when faced with management's statements about performance, can it independently verify the data and judge whether the statements are accurate? This directly tests the model's logical reasoning and "critical thinking."

**Prompt:**

Test2.1: Based on the data in the "Meituan-2025 Q1" financial report, calculate the company's gross profit margin. Please list the calculation formula, specific data used, and explain what this gross profit margin reflects about the company's profitability.

Test2.2: Please use the balance sheet data in the "Meituan-2025 Q1" financial report to calculate the company's current ratio. Please explain which data you used for the calculation and explain what this ratio reveals about the company's short-term debt risk.

Test2.3: Management claims in the report that "the operating profit margin of core local business increased by 3.2 percentage points year-on-year to 21.0%." Please verify the accuracy of this statement based on financial report data and explain your judgment basis.

**Evaluation Conclusion:**

**Among the six models, only Kimi-K1.5 failed this test.** Kimi-K1.5 obtained the correct operating income and operating costs but made a mistake in the calculation: the correct gross profit margin is 37.4477%, while the model's answer was 37.49%.
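The calculations requested in Test2.1 to Test2.3 come down to three short formulas. The sketch below is illustrative only: the function names are our own, the input figures are hypothetical placeholders rather than numbers from the Meituan report, and the 0.05 tolerance is an assumed allowance for one-decimal rounding. The only values taken from this article are the two gross profit margin answers, 37.4477% and 37.49%.

```python
def gross_margin_pct(revenue: float, cost_of_revenue: float) -> float:
    """Gross profit margin in percent: (revenue - cost) / revenue * 100."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_revenue) / revenue * 100


def current_ratio(total_current_assets: float,
                  total_current_liabilities: float) -> float:
    """Current ratio: total current assets / total current liabilities.

    The numerator is TOTAL current assets, not only cash and cash equivalents.
    """
    if total_current_liabilities <= 0:
        raise ValueError("total current liabilities must be positive")
    return total_current_assets / total_current_liabilities


def verify_margin_claim(margin_now_pct: float, margin_prev_pct: float,
                        claimed_level_pct: float, claimed_change_pp: float,
                        tol: float = 0.05) -> bool:
    """Check a claim of the form 'margin rose by X percentage points to Y%'."""
    level_ok = abs(margin_now_pct - claimed_level_pct) <= tol
    change_ok = abs((margin_now_pct - margin_prev_pct) - claimed_change_pp) <= tol
    return level_ok and change_ok


# Hypothetical placeholder figures (NOT taken from the Meituan report).
print(f"gross margin: {gross_margin_pct(1_000.0, 625.0):.4f}%")
print(f"current ratio: {current_ratio(300.0, 150.0):.2f}")

# Hypothetical segment margins for two periods, checked against a claim
# worded like the one in Test2.3.
print(verify_margin_claim(21.0, 17.8, claimed_level_pct=21.0, claimed_change_pp=3.2))

# Values from this article: a gap this large cannot be explained by
# rounding to two decimal places (which allows at most 0.005).
correct_answer, model_answer = 37.4477, 37.49
exceeds_rounding = abs(model_answer - correct_answer) > 0.005
print(exceeds_rounding)
```

Keeping the numerator of the current ratio as an explicitly named argument makes it harder to substitute a single balance sheet item for total current assets by mistake.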
Figure: Kimi-K1.5 calculates gross profit margin

Kimi-K1.5 also mistook "cash and cash equivalents" in the condensed consolidated statement of financial position for "total current assets," which led to a second calculation error.

Figure: Kimi-K1.5 calculates current ratio

As for the explanation of the financial ratios, **all models gave the definitions of the ratios above and concluded that the company's short-term debt-paying ability is stable.** Beyond that, the models supplied different additional information:

- DeepSeek-R1: advantages of Meituan's asset structure, risk disclosure, and hidden dangers worth watching;
- ERNIE-X1-Turbo and GLM-4-Plus: no additional information;
- Hunyuan-T1: sufficient safety margin, advantages of the asset liquidity structure, controllable current liabilities, and potential risk points;
- Kimi-K1.5: strong profitability, effective cost control, business structure optimization, and other reflections of profitability;
- Qwen3-235B-A22B: explanation of profitability, cost control ability, and industry comparison.

In terms of data verification, **all models correctly calculated the operating profit margin for Q1 2024 and Q1 2025**, **verifying the statement given in the prompt.** **Notably, DeepSeek-R1 also explained the business significance, while Hunyuan-T1 added potential risk warnings.**

**3) Efficient Induction and Refinement Ability: From "Copy and Paste" to "Extracting the Essence"**

Financial reports are dense, and the ability to distill key points for different audiences is crucial to measuring an AI's efficiency. This dimension assesses whether the model can act like a senior editor, writing a concise performance summary of no more than 200 words for ordinary investors and accurately summarizing the main challenges mentioned by management in the "Discussion and Analysis" section. We evaluate the accuracy, completeness, and information value of its summaries.
**Prompt:**

Test3.1: Please summarize the three most important conclusions of this financial report in no more than 200 words for an ordinary domestic investor.

Test3.2: Please summarize the main challenges faced by the company mentioned in the "Management Discussion and Analysis" section.

**Evaluation Conclusion:**

In the overall performance summary, all models **gave correct conclusions supported by data.** DeepSeek-R1, Hunyuan-T1, Kimi-K1.5, and Qwen3-235B-A22B listed their conclusions in a structured way, which is clearer than the single-paragraph answers of the other two models. DeepSeek-R1 had a further highlight: an **easy-to-understand language style**, with phrases such as "profitability soaring" and "strong foundation to resist risks."

In the section-specific summary, all models located the relevant passage accurately and organized it well: they found the original text, grouped the company's challenges logically, and presented them point by point, which makes the answers highly readable. DeepSeek-R1, ERNIE-X1-Turbo, and Qwen3-235B-A22B cited relevant data in their answers, making their conclusions more convincing, and DeepSeek-R1 also marked its information sources. In terms of comprehensiveness, GLM-4-Plus listed many points, but without specific supporting evidence the content seemed hollow; ERNIE-X1-Turbo kept to its concise answer style.

**4) Sensitive Risk and Emotion Recognition Ability: Reading Between the Lines**

Top analysts can "read between the lines." This dimension tests whether the model has that advanced cognitive ability.
Can it identify business risks that the financial report implies but does not state explicitly? Can it combine performance data and management wording to judge the overall emotional tone (optimistic, cautious, or pessimistic) of the entire report?

**Prompt:**

Test4.1: Does the financial report imply any other potential business risks? Please give examples.

Test4.2: Based on the performance data and management wording in the entire financial report, do you think the overall tone conveyed to investors is optimistic, cautious, or pessimistic? Please give your judgment and provide at least 2 reasons.

**Evaluation Conclusion:**

In analyzing potential business risks, every model except Kimi-K1.5 **listed potential risks based on statements in the financial report.** Kimi-K1.5 analyzed Meituan's main businesses from a macro perspective and did not focus on the information hidden in the report.

Figure: Kimi-K1.5 analyzes potential business risks

In addition, Kimi-K1.5 initially listed 50 risks, which is puzzling.

DeepSeek-R1, Hunyuan-T1, and Qwen3-235B-A22B gave the clearest answers, using a fixed structure and clearly citing information sources so that users can identify risks quickly. DeepSeek-R1 first explained each risk following the structure "risk type" - "driving event" - "financial report original text" - "risk point," then listed risks that can be deduced from the report but are not stated explicitly, and finally offered conclusions and suggestions for investors.

Figure: DeepSeek-R1 analyzes potential business risks

Hunyuan-T1 and Qwen3-235B-A22B adopted similar answer structures, accurately grasping the core contradictions while showing strong reasoning ability.
ERNIE-X1-Turbo and GLM-4-Plus used a segmented narrative, explaining in each segment the cause of a risk and the supporting passage in the financial report. Their content is complete but not expanded far enough, and the structure is less clear than that of the three models above.

In the overall tone judgment, **all six models judged the overall tone to be optimistic.** DeepSeek-R1, Hunyuan-T1, and Qwen3-235B-A22B, however, directly or indirectly used the term "cautious optimism." GLM-4-Plus and Kimi-K1.5 acknowledged the risks and challenges mentioned in the report but believed the flaws do not outweigh the strengths. ERNIE-X1-Turbo's answer mentioned no pessimistic factors at all.

This suggests that DeepSeek-R1, Hunyuan-T1, and Qwen3-235B-A22B grasp the overall tone of the full text while understanding the details better and taking a broader view. Their ability to balance "facts" and "emotions" makes their conclusions more rounded and credible.

**5) Enterprise Strategy and Positioning Inference Ability: a Comprehensive Question Requiring "Knowledge Reserve"**

This is the leap from data to insight. Can the model combine financial report data with its own knowledge to play the role of a "strategic analyst" and identify the competitive landscape? We ask the model to infer the company's competitive strategy (cost leadership or technology-driven) from gross profit margin and R&D investment data, and to assess its market position in the industry (leader or challenger) by synthesizing various information.

**Prompt:**

Test5.1: Please list the main competitors in the industry based on the description of Meituan's business in the "Meituan-2025 Q1" financial report and your general knowledge (at least two).

Test5.2: Please analyze the "Gross Margin" and "R&D expenses as a percentage of income" in the report.
Based on these two data and comparing with the typical level of the industry you know, infer which competitive strategy the company is more likely to adopt: "cost leadership" strategy (pursuing high efficiency and low cost) or "differentiation/technology-driven" strategy (pursuing product uniqueness and high added value)? Please explain your reasoning process.

Test5.3: Based on the entire financial report (including its revenue growth rate, profit margin level, and management discussion), please give a comprehensive assessment of the company's market position in the industry. Do you think it is closer to an "industry leader," a "strong challenger," or a "specific niche market participant"? Please provide at least two pieces of evidence to support your conclusion: 1\. One from financial data (e.g., profit margin or growth rate higher/lower than industry average). 2\. One from the qualitative description in the "Management Discussion and Analysis" section.

**Evaluation Conclusion:**

In identifying the competitive landscape, all six models **accurately listed the main competitors in the current market** (Ele.me, Douyin local life services, and JD Daojia) and matched them to specific business lines. This shows that AI can **accurately match business descriptions in financial reports with real-world commercial entities in its knowledge base.**

The models' approaches differ, however. DeepSeek-R1, GLM-4-Plus, Hunyuan-T1, and Qwen3-235B-A22B first list the competitors, then give the fields in which they compete and the basis. ERNIE-X1-Turbo and Kimi-K1.5 first list the competitive fields, then give the main competitors and the competitive relationships. DeepSeek-R1 and Hunyuan-T1 quoted the original text of the financial report as their basis, making their answers more convincing; the other models relied more on general knowledge.
In addition, Qwen3-235B-A22B and Kimi-K1.5 noticed international competitors and in-house delivery systems respectively, which were unexpected highlights.

Inferring competitive strategy is the most difficult task in this evaluation. It requires the model to complete the full loop of "data extraction" - "external knowledge comparison" - "business theory application" - "logical reasoning."

In data extraction, GLM-4-Plus used hypothetical data, so the gross profit margin in its subsequent analysis was wrong and its results cannot be used as a reference; **the other models extracted the correct data.**

Figure: GLM-4-Plus infers competitive strategy

In the reasoning process, although industry average data is not authoritative, every model except ERNIE-X1-Turbo **used industry average data as a reference** for **external knowledge comparison**, which effectively improved the quality of the analysis.

Figure: ERNIE-X1-Turbo infers competitive strategy

Because the models focus on different points, ERNIE-X1-Turbo, Hunyuan-T1, and Kimi-K1.5 produced a **"nuanced" conclusion** from the comparison above rather than making a binary choice between the options in the prompt.

As for market position, all six models judged Meituan an "industry leader," drawing on quotations from the management discussion, quantitative analysis, and qualitative analysis. The argumentation was rigorous and credible, with essentially no difference in ability between models.

**6) Networked Comparison Ability Integrating External Knowledge: Expanding the Boundary of Ability**

Finally, we go beyond the single document and examine the model's ability to connect with the real world. Can it use its network search function to obtain competitors' financial data for the same period (such as gross profit margin and current ratio) and make accurate horizontal comparisons?
**Prompt:**

Test6.1: In Q1 2025, compared to JD, Alibaba, Baidu, and Kuaishou, how does Meituan rank in terms of sales gross profit margin? You can obtain the required data through network search, but must ensure data accuracy, prohibit fabricating or assuming data, and prohibit using false data.

Test6.2: In Q1 2025, compared to JD, Alibaba, Baidu, and Kuaishou, how does Meituan rank in terms of current ratio? You can obtain the required data through network search, but must ensure data accuracy, prohibit fabricating or assuming data, and prohibit using false data.

Test6.3: In Q1 2025, compared to JD, Alibaba, Baidu, and Kuaishou, how does Meituan rank in terms of asset-liability ratio? You can obtain the required data through network search, but must ensure data accuracy, prohibit fabricating or assuming data, and prohibit using false data.

This ability directly relates to the practical value of AI as an intelligent assistant.

**Evaluation Conclusion:**

**The six models' ability to collect information from the network is unsatisfactory.** For sales gross profit margin, only DeepSeek-R1, ERNIE-X1-Turbo, and Hunyuan-T1 obtained correct data for all five companies. For current ratio and asset-liability ratio, no model obtained all the correct data.

DeepSeek-R1 and ERNIE-X1-Turbo have relatively strong search capabilities: both obtained more than 10 correct data points, the former without fabricating any data and the latter with one incorrect figure. Kimi-K1.5 and Qwen3-235B-A22B were moderately accurate, in some cases failing to obtain data or fabricating it when calculating current ratio and asset-liability ratio. GLM-4-Plus and Hunyuan-T1 performed poorly, frequently fabricating data, especially for the asset-liability ratio. GLM-4-Plus even retrieved only a webpage unrelated to the question and fabricated 5 false figures, causing great trouble for users.
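The prompts above forbid fabricated or assumed data, which implies that a figure that cannot be found should stay missing instead of being filled in. A minimal sketch of a ranking that respects this, assuming a hypothetical `rank_by_metric` helper and placeholder values for anonymous companies (none of these numbers are real company data):

```python
from typing import Optional


def rank_by_metric(values: dict[str, Optional[float]],
                   higher_is_better: bool = True):
    """Rank companies on one metric; companies with no verified value stay unranked."""
    known = {name: v for name, v in values.items() if v is not None}
    missing = sorted(name for name, v in values.items() if v is None)
    ordered = sorted(known, key=lambda name: known[name], reverse=higher_is_better)
    ranking = {name: position for position, name in enumerate(ordered, start=1)}
    return ranking, missing


# Hypothetical placeholder values in percent; None means "no verified figure found".
margins_pct = {
    "Company A": 37.4,
    "Company B": 15.9,
    "Company C": None,
    "Company D": 46.1,
}
ranking, missing = rank_by_metric(margins_pct)
print(ranking)  # {'Company D': 1, 'Company A': 2, 'Company B': 3}
print(missing)  # ['Company C']
```

Returning the missing companies separately makes the gap visible to the reader, so an incomplete ranking is never presented as a complete one.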
In summary, the models almost never **query authoritative data channels when searching for network information**, and the internet is full of false and incorrect information, which leads to serious errors when analyzing financial reports. AI still has a lot of room for improvement in this field, so **it is not recommended to use network search functions to obtain important financial data.**

## **03 Conclusion**

To present the evaluation results more intuitively, we made the following table:

Without considering network information search:

For professional investors or financial analysts, DeepSeek-R1, Hunyuan-T1, and Qwen3-235B-A22B are trustworthy "assistants" that improve work efficiency while providing **valuable insights**;

For ordinary users or students, ERNIE-X1-Turbo is also a good choice, fully capable of **quickly obtaining core data and basic information**.

However, the accuracy of network information search remains a hurdle for all models at present. We can accept an AI not finding information, but we cannot accept an AI treating false information as true.

Finally, based on our somewhat subjective evaluation standards, we compiled a radar chart of the six models' financial analysis capabilities for your reference:

### Related Stocks

- [DPSK.NA](https://longbridge.com/en/quote/DPSK.NA.md)