Tests and exams no longer make sense. Here's the culprit

ChatGPT scored higher than the top candidates in the University of Tokyo and Kyoto University entrance examinations. This is the moment to ask what point there is in tests based mainly on knowledge, calculation and patterns.

LifePrompt Inc. checked how ChatGPT 5.2 Thinking would cope with some of the most prestigious entrance exams in Japan. It turned out that the model scored higher than the top admitted applicants at the University of Tokyo and Kyoto University. That is a big deal, because in 2024 the AI used in a similar test fell short of the score needed for admission to the University of Tokyo. Look at the pace: after only two years of market development, artificial intelligence handles some of the most difficult tests in the world without any trouble. And the Japanese education system is famous for the difficulty of its entrance exams.

ChatGPT succeeded where it failed before

According to LifePrompt, the model scored 452 out of a possible 550 points on the University of Tokyo’s Humanities and Social Sciences exam. On the science exam, it scored 503 points out of 550, surpassing the highest scores of admitted candidates announced by the university. For comparison, the highest score in the Humanities and Social Sciences III track was 434 points, and in the most competitive track, Natural Sciences III, which leads to medicine, it was 453 points. LifePrompt also says that ChatGPT scored 50 points more than the best test taker, which matches the gap between 503 and 453 on the science exam, and earned full marks in mathematics.

The way the test was conducted matters: it was not just about simple closed questions. LifePrompt converted the exam papers into images and fed them to the model, and the expected answers also included descriptive, open-ended responses. Those parts were graded by teachers from Kawai Juku, a large Japanese preparatory school, which was meant to bring the procedure closer to real exam conditions. The company also took into account ChatGPT’s results on the standardized university entrance exams and then added up the points for each track.

Kyoto University also caved

The results from Kyoto University only added to the amazement at AI’s capabilities. On the exam for the Faculty of Law, ChatGPT scored 771 points, while the highest score among admitted candidates was 734 points. On the exam for the Faculty of Medicine, the model scored 1,176 points, surpassing the already impressive 1,098 points earned by the best accepted candidate.

This success, however, was not without some “negative” surprises. ChatGPT did earn 90 percent of the points in English, but on descriptive questions in subjects such as world history it scored only… 25 percent. Why? Because there the exam requires not just recognizing a pattern, but building a well-reasoned case, choosing the right arguments and working with the geopolitical context and historical realities. AI will “eat” tasks with a fixed structure for breakfast, but where some flair and intellectual effort are needed, it will let you down. For now.

What is the problem with current testing?

In 2024, LifePrompt used the ChatGPT 4 model to solve the University of Tokyo exam, but the result did not reach the minimum required for admission to the university. A year later, the test with the o1 model was the first to clear the passing threshold. The result now achieved by ChatGPT 5.2 Thinking is a breakthrough. So let’s consider: how can we fairly test the human advantage in such exams? Do such exams still make sense at all? And won’t they lose their point completely within, say, the next two years?

This affects not only schools, tests and exams, but also jobs. Companies should implement AI with an eye on what work will look like in 10 or 20 years. Humans and AI should not compete on the same field, and adoption should be planned with that in mind. Yet exams focused on memory and computational skills increasingly test abilities in which machines are already naturally strong. Maybe it’s time to change the range of competences we expect from people, use AI to the fullest in schools and companies, and leave to people what is… typically human?