
Suggesting of Identifier Names

Student: Davidenko Igor

Supervisor: Boris Novikov

Faculty: St. Petersburg School of Physics, Mathematics, and Computer Science

Educational Programme: Enterprise Software Development (Master)

Year of Graduation: 2021

Developers read code as much as they write it, so understandability is a very important property of source code. Identifiers account for most source code tokens (70%), which makes them largely responsible for readability and understandability. A good identifier lets the programmer quickly determine the purpose of a variable, method, or class without looking at its implementation. Variables make up a large share of all identifiers, yet at the moment there is no full-fledged tool for automatically suggesting their names. The goal of this work is to create a machine learning-based tool that suggests variable names in real time. As part of this work, I implemented a plugin for IntelliJ IDEA that uses several machine learning models to suggest variable names. These can include both N-gram models, such as IdNGram and Naturalize, and neural models, IdTransformer and GGNN. The N-gram models are trained quickly on the user's side and can make predictions immediately, while the neural models are trained in advance on a large corpus of projects and are accessed via a server. I also propose using mixtures of these models. Because N-gram models quickly learn the variable naming conventions of a particular project, while neural models carry general knowledge of variable names, a mixture offsets the shortcomings of each and improves their metrics.
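The summary does not say how the mixture of N-gram and neural models is computed. As a rough illustration only, the sketch below assumes the simplest option, linear interpolation of the two models' candidate probabilities. The function name `mix_suggestions`, the `ngram_weight` parameter, and all scores are hypothetical and are not taken from the thesis or the plugin.

```python
from collections import defaultdict


def mix_suggestions(ngram_scores, neural_scores, ngram_weight=0.5, top_k=3):
    """Linearly interpolate two {name: probability} maps.

    ASSUMPTION: linear interpolation is used here for illustration;
    the thesis summary does not specify the mixing scheme.
    Returns the top_k (name, score) pairs, best first.
    """
    if not 0.0 <= ngram_weight <= 1.0:
        raise ValueError("ngram_weight must be in [0, 1]")

    mixed = defaultdict(float)
    # A candidate missing from one model contributes 0 from that model.
    for name, prob in ngram_scores.items():
        mixed[name] += ngram_weight * prob
    for name, prob in neural_scores.items():
        mixed[name] += (1.0 - ngram_weight) * prob

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(mixed.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical scores: the project-local N-gram model has picked up an
# abbreviation used in this project, the pretrained neural model has not.
project_local = {"usrCnt": 0.6, "count": 0.3, "i": 0.1}
pretrained = {"count": 0.5, "userCount": 0.4, "n": 0.1}

suggestions = mix_suggestions(project_local, pretrained, ngram_weight=0.5)
names = [name for name, _ in suggestions]
```

With equal weights, `names` ranks `count` first because both models support it, followed by the project-specific `usrCnt` and the more general `userCount`. Raising `ngram_weight` would favor the project's own conventions, and lowering it would favor the pretrained model.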

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
