HSE Researchers Develop Novel Approach to Evaluating AI Applications in Education

Researchers at HSE University have proposed a novel approach to assessing AI's competency in educational settings. The approach is grounded in psychometric principles and has been empirically tested using the GPT-4 model. This marks the first step in evaluating the true readiness of generative models to serve as assistants for teachers or students. The results have been published on arXiv.

Each year, artificial intelligence plays a larger role in education, prompting developers to address crucial questions about how to assess AI's capabilities, particularly in teaching and learning. Researchers at HSE University have introduced a novel psychometrics-based approach to creating effective benchmarks for evaluating the professional competencies of large language models (LLMs), such as GPT. The approach is based on Bloom's taxonomy, which, despite the availability of numerous benchmarks (tests for language models), is rarely used to verify results.

A distinctive feature of the proposed methodology is that it compares tasks across different levels of complexity, ranging from basic (knowledge) to advanced (application of knowledge), and takes these levels into account when evaluating responses. This is essential for assessing the quality of the model's recommendations across diverse situations and determining the extent to which it can be trusted in the educational context. As part of the study, the researchers developed and tested over 3,900 unique assignments, categorised into 16 content areas, including teaching methods, educational psychology, and classroom management. The experiment was conducted using the Russian-language version of the GPT-4 model.

Elena Kardanova

'We have developed a new approach that goes beyond conventional testing,' explains Elena Kardanova, lead author of the project and Academic Supervisor at the Centre for Psychometrics and Measurement in Education of the HSE Institute of Education. 'Our approach is demonstrated through a comprehensive new benchmark (the term for a language model test) designed for AI in pedagogy. This benchmark is grounded in psychometric principles and emphasises key competencies essential for teaching.'

Today's AI models, such as ChatGPT, possess an impressive ability to process and generate text quickly, making them potential assistants in educational settings. However, our results indicate that the model struggles with more complex tasks that require a deeper understanding and the ability to think adaptively. For example, AI excels at retrieving known facts but demonstrates lower proficiency in applying this information to address real-world pedagogical challenges. In particular, ChatGPT is not always successful in solving theoretical problems, which can sometimes appear basic even to average students. 

Yaroslav Kuzminov

'The approach we have developed clearly highlights a key issue with AI today: you never know where to expect an error to occur. A model can make mistakes even in the simplest tasks, which are considered the core of an academic discipline. Our test highlights key issues both in the area of knowledge and in the application of that knowledge, thereby paving the way to address these core challenges. Addressing these issues is crucial if we want to rely on such models as assistants for teachers, and even more so for students. An assistant that requires everything to be rechecked, which is currently the case, is unlikely to inspire a desire to use it,' says Yaroslav Kuzminov, Academic Supervisor of HSE University.

Among the potential scenarios for AI use in education, scientists worldwide cite assisting teachers in creating educational materials, automating the assessment of student responses, developing adaptive curricula, and quickly generating analytics on student academic performance. According to the authors, AI can be a powerful tool for teachers, especially in the face of increasing workloads. However, there is still a need to improve the models and approaches used for their training and evaluation.

Taras Pashchenko

'The test we conducted helped us understand not only, and not so much, how to train large generative models, but also why concerns about teachers being replaced with artificial intelligence are, at the very least, premature. Indeed, it is impossible to overlook the breakthrough of generative models serving as teacher assistants: they can already attempt to develop curricula, compile reading lists for lessons, and, in some cases, grade assignments. Nevertheless, we still encounter the model's hallucinations, where it invents answers to questions when it lacks information about a phenomenon, or misunderstands the context. In general, if we want tools based on generative models to be used in pedagogical practice and earn epistemic trust, there is still much work to be done,' says Taras Pashchenko, Head of the HSE Laboratory for Curriculum Design, commenting on the test results.

In the future, the research team plans to continue refining the benchmark by incorporating more complex tasks that can assess AI abilities such as information analysis and evaluation.

Ekaterina Kruchinskaya

'Our upcoming papers will focus on both introducing new types of benchmarks and discussing academic techniques. Such techniques will be developed to further train models and mitigate the risks of hallucinations, loss of context, and errors in core knowledge. The main goal we aim to achieve is to ensure models are stable in their knowledge and to develop methods for testing this stability with even greater accuracy. Otherwise, they will remain merely tools that facilitate copying and imitation of knowledge,' notes Ekaterina Kruchinskaya, Senior Lecturer at the HSE Department of Higher Mathematics.
