Faster and More Precise: Researcher Improves Performance of Image Recognition Neural Network

A scientist from HSE University has developed an image recognition algorithm that works up to 40% faster than comparable approaches. It can speed up real-time processing in video-based image recognition systems. The results of the study have been published in the journal Information Sciences.

Convolutional neural networks (CNNs), which consist of a sequence of convolutional layers, are widely used in computer vision. Each layer in a network has an input and an output. A numerical representation of the image is fed to the input of the first layer and converted into a different set of numbers at its output. That result is passed to the input of the next layer, and so on, until the last layer predicts the class label of the object in the image, for example a person, a cat, or a chair. To make this possible, a CNN is trained on a set of images with known class labels. The larger and more varied the set of images for each class, the more accurate the trained network will be.

If the training set contains only a few examples, additional training (fine-tuning) of the neural network is used. The CNN is first trained on a larger dataset for a similar task and then adapted to the original problem. For example, when a neural network has to learn to recognize faces or their attributes (emotions, gender, age), it is first pre-trained to identify celebrities from their photos. The resulting network is then fine-tuned on the small dataset available, for instance to identify the faces of family members or relatives in home video surveillance systems. The deeper a CNN is (that is, the more layers it has), the more accurately it predicts the type of object in the image. However, the more layers there are, the more time is required to recognize objects.

The study’s author, Professor Andrey Savchenko of the HSE Campus in Nizhny Novgorod, was able to speed up a pre-trained convolutional neural network of arbitrary architecture; the networks in his experiments had 90-780 layers. Recognition speed increased by up to 40%, while the loss in accuracy was kept to no more than 0.5-1%. The scientist relied on statistical methods such as sequential analysis and multiple comparisons (multiple hypothesis testing).

The decision in the image recognition problem is made by a classifier — a special mathematical algorithm that receives an array of numbers (features/embeddings of an image) as inputs, and outputs a prediction about which class the image belongs to. The classifier can be applied by feeding it the outputs of any layer of the neural network. To recognize "simple" images, the classifier only needs to analyse the data (outputs) from the first layers of the neural network.
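As a small illustration of what "a classifier over features" means, here is a minimal sketch of a nearest-centroid classifier. It is only one simple kind of classifier and is not claimed to be the one used in the study; the function name `nearest_centroid`, the three-dimensional embeddings, and the centroid values are hypothetical.

```python
from typing import Dict, Sequence


def nearest_centroid(embedding: Sequence[float],
                     centroids: Dict[str, Sequence[float]]) -> str:
    """Return the label whose centroid is closest to the embedding.

    Hypothetical toy classifier: the features could come from any layer
    of a network, as long as the centroids were computed from the same layer.
    """
    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids,
               key=lambda label: squared_distance(embedding, centroids[label]))


# Made-up class centroids in a 3-dimensional feature space.
centroids = {
    "person": [1.0, 0.0, 0.0],
    "cat": [0.0, 1.0, 0.0],
    "chair": [0.0, 0.0, 1.0],
}
label = nearest_centroid([0.1, 0.8, 0.2], centroids)
print(label)  # cat
```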

Andrey Savchenko
Professor, Department of Information Systems and Technologies

There is no need to waste further time if we are already confident in the reliability of the decision made. For "complex" pictures, the first layers are clearly not enough, and you need to move on to the next ones. Therefore, classifiers were attached to several intermediate layers of the neural network. Depending on the complexity of the input image, the proposed algorithm decided whether to continue recognition or to stop. Since it is important to control errors in such a procedure, I applied the theory of multiple comparisons: I introduced multiple hypotheses about which intermediate layer to stop at and tested them sequentially.

If the first classifier already produced a decision that the multiple hypothesis testing procedure considered reliable, the algorithm stopped. If the decision was declared unreliable, the calculations in the neural network continued to the next intermediate layer, and the reliability check was repeated.
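The stop-or-continue loop described above can be sketched roughly as follows. This is a simplified illustration, not the published algorithm: the reliability check here is a plain confidence threshold standing in for the multiple hypothesis testing procedure, and all names (`early_exit_predict`, `stages`, `heads`, `thresholds`) and the toy network are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Features = List[float]
Stage = Callable[[Features], Features]    # a block of network layers
Head = Callable[[Features], List[float]]  # classifier: features -> class probabilities


def early_exit_predict(x: Features,
                       stages: Sequence[Stage],
                       heads: Sequence[Head],
                       thresholds: Sequence[float]) -> Tuple[int, int]:
    """Return (predicted class index, index of the exit that was used)."""
    if not stages or not (len(stages) == len(heads) == len(thresholds)):
        raise ValueError("need one head and one threshold per stage")
    features = x
    last = len(stages) - 1
    for i, (stage, head) in enumerate(zip(stages, heads)):
        features = stage(features)  # run the next block of layers
        probs = head(features)      # classify the intermediate output
        best = max(range(len(probs)), key=probs.__getitem__)
        # Stand-in reliability check (an assumption): stop when the top
        # probability clears this exit's threshold. The final exit always answers.
        if probs[best] >= thresholds[i] or i == last:
            return best, i
    raise AssertionError("unreachable: the last exit always returns")


def softmax(scores: Features) -> List[float]:
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def amplify(features: Features) -> Features:
    # Toy "layers": each stage makes the class scores more separable.
    return [2.0 * v for v in features]


stages = [amplify, amplify, amplify]
heads = [softmax, softmax, softmax]
thresholds = [0.9, 0.9, 0.0]

easy = early_exit_predict([2.0, 0.0], stages, heads, thresholds)
hard = early_exit_predict([0.0, 0.3], stages, heads, thresholds)
print(easy, hard)  # (0, 0) (1, 2)
```

In this toy setup the "easy" input is answered by the first exit, while the "hard" input has to pass through all three stages before a decision is returned.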

The most accurate decisions are obtained from the outputs of the last layers of the neural network, while the outputs of early layers are classified much faster. All the classifiers therefore need to be trained simultaneously in order to accelerate recognition while controlling the loss in accuracy, for example so that the error introduced by stopping early is no more than 1%.
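One simple way to picture "controlling the loss in accuracy" is to calibrate an early-exit confidence threshold on held-out data. The sketch below is an assumption-laden illustration, not the procedure from the paper: the function `pick_threshold`, the record layout, and the validation data are all made up.

```python
from typing import Optional, Sequence, Tuple

# One validation image: (confidence of the early classifier,
#                        early classifier correct?, final layer correct?)
Record = Tuple[float, bool, bool]


def pick_threshold(records: Sequence[Record],
                   candidates: Sequence[float],
                   max_loss: float = 0.01) -> Optional[float]:
    """Return the lowest candidate threshold whose accuracy loss, relative
    to always using the final layer, does not exceed max_loss.

    Returns None if no candidate satisfies the budget.
    """
    if not records:
        raise ValueError("records must not be empty")
    n = len(records)
    final_correct = sum(final_ok for _, _, final_ok in records)
    for t in sorted(candidates):
        # Images with early confidence >= t stop early; the rest go on.
        correct = sum(early_ok if conf >= t else final_ok
                      for conf, early_ok, final_ok in records)
        if final_correct - correct <= max_loss * n:
            return t
    return None


# Made-up validation results for five images.
records = [
    (0.99, True, True),
    (0.95, True, True),
    (0.80, False, True),
    (0.60, False, True),
    (0.40, False, False),
]
candidates = [0.5, 0.7, 0.9]
print(pick_threshold(records, candidates))                 # 0.9
print(pick_threshold(records, candidates, max_loss=0.25))  # 0.7
```

A lower threshold lets more images stop early (faster on average) but admits more early mistakes, so a looser accuracy budget yields a lower threshold.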

High accuracy is always important in image recognition. For example, if a face recognition system makes the wrong decision, either an outsider can gain access to confidential information or, conversely, a legitimate user will be repeatedly denied access because the neural network cannot identify them correctly. Speed can sometimes be sacrificed, but it matters in, for example, video surveillance systems, where it is highly desirable to make decisions in real time, that is, in no more than 20-30 milliseconds per frame. To recognize an object in a video frame here and now, it is very important to act quickly without losing accuracy.

Date

12 May 2021
