---
title: "A professor from Hong Kong University of Science and Technology tested AI glasses for \"cheating\": crushing 95% of students in 30 minutes, completely breaking the traditional teaching assessment system"
type: "News"
locale: "en"
url: "https://longbridge.com/en/news/271649456.md"
description: "Professors at the Hong Kong University of Science and Technology conducted an experiment to test the performance of AI glasses equipped with the ChatGPT-5.2 model in final exams. The glasses completed the exam paper for \"Computer Network Principles\" under real exam conditions, scoring 92.5 points within 30 minutes, ranking in the top five and surpassing 95% of the candidates. This experiment has raised questions about the traditional educational assessment system, highlighting the potential impact of AI in the field of education"
datetime: "2026-01-06T12:20:47.000Z"
locales:
  - [zh-CN](https://longbridge.com/zh-CN/news/271649456.md)
  - [en](https://longbridge.com/en/news/271649456.md)
  - [zh-HK](https://longbridge.com/zh-HK/news/271649456.md)
---

# A professor from Hong Kong University of Science and Technology tested AI glasses for "cheating": crushing 95% of students in 30 minutes, completely breaking the traditional teaching assessment system

It's quite outrageous; AI has truly entered the university final exam hall, and in the role of a cheater. (Just imagine how shocking that is!)

No joke, this incident occurred during the undergraduate final exam for "Computer Network Principles" at the Hong Kong University of Science and Technology.

A pair of AI glasses running the ChatGPT-5.2 model was worn like ordinary glasses and completed the entire final exam paper under conditions that replicated a real exam.

The result was striking: after 30 minutes, it submitted the paper, scored 92.5 points, and ranked among the top five out of over a hundred participants, surpassing more than 95% of human examinees.

Indeed, each generation has its own learning tools; previously, it was cheat sheets and review materials, and now it has directly upgraded to—"the whole machine."

However, when this whole machine can complete the entire exam process, the focus of attention may no longer just be whether AI can answer the questions.

This time, the AI "cheater" simply answered the questions like a human student, but it made the traditional teaching evaluation system seem a bit untenable.

## A Pair of AI Glasses Completed an Entire University Final Exam

This seemingly outrageous "human-machine joint exam" was not a last-minute effort by students but an experiment led by Professor Zhang Jun and Professor Meng Zili's team at the Hong Kong University of Science and Technology.

The goal was clear: to let a pair of AI glasses equipped with a large model "cheat" openly in the exam hall and see how high a score it could achieve.

The selected test scenario was also very straightforward, directly targeting the professional course that countless university students dread—Computer Network Principles. (Shivering...)

This course not only tests a vast array of professional concepts but also involves rigorous logical deductions and algorithm applications, posing a significant challenge for human students and an even greater difficulty for AI.

To ensure that this AI examinee could perform at its best, the project team did extensive homework on the "hardware and software" selection!

In the hardware selection phase, the project team systematically evaluated 12 mainstream commercial smart glasses available on the market, including products from familiar brands like Meta, Xiaomi, and Rokid.

After the first round of screening, the team found that there were actually not many products that simultaneously had built-in cameras and integrated displays; the main candidates were only Meta Ray-Ban, Frame, and Rokid.

However, the experiment required secondary development. Although Meta provided an access toolkit for the device, it did not open direct control interfaces for display content, making it difficult to meet the experimental requirements for information presentation. In contrast, Rokid's SDK was richer and its ecosystem more complete, offering significantly more development freedom.

Considering the camera quality limitations of Frame in scenarios like paper recognition, the research team ultimately chose Rokid AI glasses as the hardware testing candidate for this human-machine joint examination.

When it came to selecting the large model that determines the brain's upper limit, the team compared several mainstream models and ultimately locked in OpenAI's latest model—ChatGPT-5.2, which excels in both response speed and general knowledge capability.

Both the software and hardware "candidates" are now in place, and the next step is the main event—the exam.

The examination process can be described as smooth: students look down at the test paper, and the AI glasses quickly capture the questions through the camera, transmitting the images via the "glasses—phone—cloud" link to the remote large model for inference. The generated answers are then returned along the reverse path and displayed on the glasses' screen for students to transcribe.
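
The round trip described above can be sketched as a toy simulation. This is only an illustration of the "glasses, phone, cloud" relay and its reverse path; the function names, the hop labels, and the stand-in model are hypothetical and do not reflect the team's code or any Rokid or OpenAI API.

```python
from typing import Callable, List, Tuple


def fake_cloud_model(image_bytes: bytes) -> str:
    """Hypothetical stand-in for the remote large model; returns a canned answer."""
    return f"answer for {len(image_bytes)}-byte image"


def relay(image_bytes: bytes, model: Callable[[bytes], str]) -> Tuple[str, List[str]]:
    """Pass a captured page up the link, run inference, and return the answer back down."""
    uplink = ["glasses", "phone", "cloud"]
    trace: List[str] = []

    # Uplink: the photographed question travels from the glasses to the cloud.
    for hop in uplink:
        trace.append(f"up:{hop}")

    # Inference happens once the image reaches the cloud.
    answer = model(image_bytes)

    # Downlink: the answer returns along the reverse path to the glasses' display.
    for hop in reversed(uplink[:-1]):
        trace.append(f"down:{hop}")

    return answer, trace


answer, trace = relay(b"\x00" * 1024, fake_cloud_model)
print(trace)
print(answer)
```

Every hop in the trace adds latency and power draw on the glasses, which is why the link itself becomes part of the engineering problem discussed later.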

And guess what? This pair of AI glasses, built on Rokid Glasses and running the GPT-5.2 model, scored 92.5 points in this final exam, outperforming 95% of the students.

Moreover, in multiple-choice questions and single-page short answer questions, Rokid achieved full marks, and even in the more challenging cross-page short answer questions (SAQ), it scored most of the points.

Additionally, when faced with core questions split across different pages and highly dependent on contextual logic, Rokid still demonstrated strong reasoning coherence.

Even though there were occasional deviations in calculating the most complex parts, the intermediate steps provided by the AI were quite complete, making it adept at handling high-pressure knowledge tasks.

Of course, this test not only validated the software logic but also ruthlessly highlighted the "shortcomings" of current commercial AI glasses.

The first issue exposed was the power consumption problem.

In high-pressure continuous scenarios like exams, connectivity itself has already become a major power drain. In the experiment, as long as Wi-Fi was turned on and high-resolution image transmission was ongoing, the glasses' battery would drop from 100% to 58% within 30 minutes.

In other words, if AI glasses are to truly achieve all-day, long-term use, power consumption control and connection stability remain unavoidable engineering bottlenecks...
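
To put the reported figures in perspective, here is a back-of-the-envelope calculation. It assumes the battery drains linearly at the rate observed in the experiment (100% to 58% in 30 minutes), which is a simplification on our part; real batteries rarely discharge linearly, and the helper names are purely illustrative.

```python
def drain_rate_per_min(start_pct: float, end_pct: float, minutes: float) -> float:
    """Average battery drain in percentage points per minute."""
    return (start_pct - end_pct) / minutes


def minutes_until_empty(start_pct: float, rate_per_min: float) -> float:
    """Runtime from start_pct to 0%, assuming the drain rate stays constant."""
    return start_pct / rate_per_min


# Figures reported in the experiment: 100% -> 58% over 30 minutes.
rate = drain_rate_per_min(100, 58, 30)
runtime = minutes_until_empty(100, rate)

print(f"drain rate: {rate:.2f} points/min")
print(f"estimated full-charge runtime: {runtime:.1f} min")
```

Under that linear assumption, a full charge would last a little over an hour of continuous capture and transmission, far short of all-day use.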

Furthermore, the project team found that the "clarity" of the glasses' camera directly determines the AI's vision. If the questions appear blurry, reflective, or at an incorrect shooting angle, even the strongest model can only make inferences based on incomplete information, ultimately reflected in a noticeable decline in answering performance.

However, the impact and reflection brought by this test do not only stay at the technical level.

Without any special accommodations, AI glasses can still run the entire process of reading questions—understanding—answering quickly and steadily, which in turn highlights a more noteworthy issue—

When teaching assessments primarily focus only on whether a "standard answer" has been submitted, it happens to fall within the capability range where AI excels and is most stable.

Because of this, the teaching assessment method centered on the mastery of knowledge points and standard problem-solving paths begins to seem somewhat strained in an era already surrounded by various "learning machines."

## With smart AI, can traditional teaching assessment standards still hold up?

I wonder if anyone has noticed something quite interesting:

From elementary school all the way to university, the exams we are most familiar with have actually been repeatedly confirming one thing, which is whether we have memorized what the teacher taught and whether we can solve problems step by step using standard methods.

To be honest, for a long time, this assessment method has indeed been quite effective.

Because in terms of memory, calculation, and step-by-step deduction abilities, there are indeed significant differences between individuals; some remember well and calculate quickly, while others tend to miss steps and make calculation errors.

The numbers on the report card can indeed cover a large proportion of a person's learning performance.

But the problem is that when AI begins to perform quickly, steadily, and almost flawlessly on these assessment dimensions, things start to become subtle...

Previously, an entrepreneur named Eddy Xu modified Meta smart glasses to create a "cheating" device that can display optimal solutions in real time during chess competitions, allowing one to win games steadily without needing to think for themselves.

In this process, AI glasses do not get nervous, do not tire, and do not experience fluctuations in performance; one word to describe it—stable.

This follows the same logic as the performance of Rokid glasses in final exams: as long as the rules of the questions are clear and the evaluation goals are singular, AI can stably complete the process of reading questions—understanding—reasoning—answering.

Even when detached from paper and pencil, it can still achieve high scores in highly structured exams.

Similar cases are not limited to individuals.

Previously, a study from the University of Reading in the UK found that when researchers mixed AI-generated answers in with real exam submissions, as many as 94% of them slipped through undetected, and the average scores of these AI answers were even significantly higher than those of real students... (The sky is falling!)

This is indeed a bit awkward: unable to compete with humans, and unable to compete with AI either.

While this surprises people and opens their eyes, a question that was once less pressing has been pushed to the forefront:

When AI or machines are better than humans at answering according to standards, what exactly is an assessment system centered on written exams and the mastery of knowledge points actually measuring?

Looking back at the original purpose of educational training, we find that many important abilities that have been repeatedly emphasized do not naturally fit the form of "a test paper."

- For example, the ability to ask good questions.
- The ability to make judgments when information is incomplete.
- The ability to weigh options among multiple solutions.
- And the ability to understand real situations and comprehend others' perspectives.

...

These abilities truly point to the learning process, thinking paths, and decision-making quality; whether the answer is standard is just a small part of it.

Moreover, these are the aspects that traditional written exams have long found hardest to capture and easiest to systematically overlook. They are also precisely where AI is least able to stand in for students, and where students' true qualities are best distinguished.

Shifting from a results-oriented approach to an overall assessment of reasoning paths, inquiry processes, interdisciplinary integration, and creative problem-solving abilities may be the real challenge posed to the existing educational assessment system after AI enters the examination room.

## Shifting the Assessment Focus from "Providing Answers" to "Providing Thought Processes"

Educational psychologist Howard Gardner, who set out his theory of multiple intelligences in "Frames of Mind", holds that humans possess at least eight different types of intelligence: linguistic, logical-mathematical, spatial, musical, interpersonal, intrapersonal, bodily-kinesthetic, and naturalistic.

From this perspective, human abilities themselves are a highly multidimensional structure, while the educational assessment systems we are familiar with have long focused only on capturing a very narrow segment of this.

This also explains why some students who do not perform outstandingly in standardized tests can demonstrate stronger creativity, collaboration skills, and complex problem-solving abilities in the real world.

After all, a single exam score reflects more about a student's performance stability in a "standardized environment," while personal comprehensive qualities in real situations are not easily revealed...

It is precisely for this reason that how to assess innovation ability, critical thinking, and complex problem-solving skills is becoming an unavoidable practical challenge for educational assessment systems.

Currently, some assessment attempts pointing in different directions have emerged—

Recently, Panos Ipeirotis, a professor at NYU Stern School of Business, introduced an AI-supported oral assessment method, where students not only need to submit assignments but also explain their decision-making basis and thought processes on the spot, unfolding their understanding and reasoning in dialogue.

In this mechanism, AI first acts as the examiner to ask follow-up questions and then participates in the subsequent assessment phase.

Claude, Gemini, and ChatGPT will independently score the oral exam transcripts, then cross-check and revise the results to determine whether students truly understand the questions while exposing common blind spots in teaching.

In a sense, this approach cannot be said to specifically "target" AI, but it does shift the focus of teaching assessment a step towards understanding itself.
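
One way to picture the independent scoring and cross-checking step is the minimal sketch below. The aggregation rule (take the median, flag large disagreement for another look), the threshold, the grader labels, and the example scores are all assumptions made for illustration; the article does not describe how Professor Ipeirotis's setup actually reconciles the three models.

```python
from statistics import median
from typing import Dict


def combine_scores(scores: Dict[str, float], max_spread: float = 10.0) -> Dict[str, object]:
    """Combine independent grader scores and flag cases where the graders disagree.

    The median keeps a single outlier grader from dominating the result;
    a spread above max_spread marks the transcript for a second pass.
    """
    values = list(scores.values())
    spread = max(values) - min(values)
    return {
        "score": median(values),
        "spread": spread,
        "needs_review": spread > max_spread,
    }


# Hypothetical scores from three independent graders for one transcript.
example = {"grader_a": 82.0, "grader_b": 85.0, "grader_c": 70.0}
result = combine_scores(example)
print(result)
```

In this made-up case the graders differ by 15 points, so the transcript would be flagged for cross-checking rather than scored automatically.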

Similar changes are not isolated cases; previously, The Washington Post also mentioned that some foreign universities have begun to introduce oral exams and presentation assignments, which essentially aim to make students' thinking processes more visible.

So looking back, when the Rokid AI glasses equipped with GPT-5.2 enter the exam room and deliver high scores, whether AI has "outperformed" students seems less important.

It feels more like a special yet clear development experiment, bringing to light a long-standing issue that has rarely been addressed:

Traditional teaching assessments heavily rely on final answers but can hardly depict the entire learning process.

Scores are certainly meaningful, but the range they can explain is narrowing. Whether understanding truly occurs, whether thoughts are coherent, and whether judgments are made through trade-offs—these key links are still compressed into a single result, making them difficult to distinguish and see.

It is precisely on this point that simply keeping technology out no longer answers the problem itself. (And it may not even be possible to keep it out...)

The more realistic challenge has become how to enable students to use AI for information organization, scenario simulation, and hypothesis validation, focusing human energy on judgment, understanding, and choices—these aspects that cannot be "outsourced."

When tools can reliably complete information extraction and standard responses, whether classrooms and exams can still differentiate between different levels of thinking is being brought to the forefront.

