Sergey Troshin
- Research Assistant: Faculty of Computer Science / Big Data and Information Retrieval School / Centre of Deep Learning and Bayesian Methods
- Sergey Troshin has been at HSE University since 2017.

Young Faculty Support Program (Group of Young Academic Professionals)
Category "New Researchers" (2022)
Courses (2021/2022)
- Research Seminar "Machine Learning and Applications" (Bachelor’s programme; Faculty of Computer Science; 3rd year, modules 1-4), in Russian
Past Courses
Courses (2020/2021)
- Machine Learning 1 (Bachelor’s programme; Faculty of Computer Science; 3rd year, modules 1, 2), in Russian
- Machine Learning 2 (Bachelor’s programme; Faculty of Computer Science; field of study "01.03.02. Applied Mathematics and Informatics"; 3rd year, modules 3, 4), in Russian
Publications
- Chapter: Chirkova N., Troshin S. A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code, in: 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2021). Association for Computational Linguistics, 2021. P. 278-288.
- Chapter: Chirkova N., Troshin S. Empirical Study of Transformers for Source Code, in: ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. Association for Computing Machinery (ACM), 2021. P. 703-715.
Employment history
Development of a system for active learning of neural networks to search for relationships between entities in a text.
‘One Year of Combined-Track Studies Expands Students’ Research Horizons’
HSE University continues to develop its new study format for students embarking on a research career: the Combined Master's-PhD track. This year, there will be 75 places for Master’s students on the track—two thirds more than last year. HSE Vice Rector Sergey Roshchin talks about the appeal of the combined-track option, how to enrol, and the achievements of last year’s applicants.
Two papers were accepted to NAACL 2021
Two papers were accepted to the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021): “On the Embeddings of Variables in Recurrent Neural Networks for Source Code” by Nadezhda Chirkova, and “A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code” by Nadezhda Chirkova and Sergey Troshin. The final versions of the papers and the source code will be released soon. The research is conducted with the use of the computational resources of the HSE Supercomputer Modeling Unit.
Both papers address the problem of improving the quality of deep learning models for source code by exploiting the specifics of variables and identifiers. The first paper proposes a recurrent architecture that explicitly models the semantic meaning of each variable in the program. The second paper proposes a simple method for preprocessing rarely used identifiers in the program so that a neural network (in particular, the Transformer architecture) can better recognize the patterns in the program. The proposed methods were shown to significantly improve the quality of code completion and variable misuse detection.
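As a rough illustration of what preprocessing rarely used identifiers can look like, the sketch below replaces every out-of-vocabulary token in a program with a per-program placeholder. It is not taken from the paper: the function names `build_vocab` and `anonymize_rare`, the `VAR` prefix, and the `min_count` threshold are all assumptions made for this example, and for simplicity it treats every token as if it were an identifier.

```python
from collections import Counter


def build_vocab(token_sequences, min_count=2):
    """Collect tokens that occur at least `min_count` times in the corpus.

    Hypothetical helper: the threshold is an assumption for this sketch.
    """
    counts = Counter(tok for seq in token_sequences for tok in seq)
    return {tok for tok, c in counts.items() if c >= min_count}


def anonymize_rare(tokens, vocab, prefix="VAR"):
    """Replace each out-of-vocabulary token with a numbered placeholder.

    Repeated occurrences of the same rare token within one program map to
    the same placeholder, so reuse of a name remains visible to the model.
    """
    mapping = {}  # rare token -> placeholder, local to this program
    result = []
    for tok in tokens:
        if tok in vocab:
            result.append(tok)
        else:
            if tok not in mapping:
                mapping[tok] = f"{prefix}{len(mapping) + 1}"
            result.append(mapping[tok])
    return result


if __name__ == "__main__":
    vocab = {"=", "(", ")", "foo"}
    program = ["tmp_idx", "=", "foo", "(", "tmp_idx", ")"]
    print(anonymize_rare(program, vocab))
```

The placeholder mapping is rebuilt for each program, so the same placeholder can stand for different names in different programs; what is preserved is only which positions within one program share a name.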