
Development of a Software Tool for the Semantic Ambiguity Resolution of Natural Language Texts

Student: Litvinova Natalia

Supervisor: Eduard Klyshinskiy

Faculty: HSE Tikhonov Moscow Institute of Electronics and Mathematics (MIEM HSE)

Educational Programme: Information Science and Computation Technology (Bachelor)

Final Grade: 9

Year of Graduation: 2019

Improving and broadening the use of machine learning for natural language processing requires new methods for generating highly accurate numerical representations of words. The most popular current methods for constructing such representations ignore the fact that the same word can have different meanings depending on its context, which lowers the quality of models built on them. The purpose of this paper is to develop software that transforms Russian-language text into a vector space while resolving its semantic ambiguity. To achieve this goal, existing practices for creating numerical representations of text and methods for resolving ambiguity in text were studied; an adaptation of the AdaGram method for use in Python was proposed; and the adaptation was tested on three different tasks. The results of this work can be used for further analysis of methods for resolving semantic ambiguity in Russian, as well as to improve the accuracy of text analysis algorithms. The paper consists of 49 pages and contains 4 figures, 4 tables, 5 listings and 11 appendices.

Keywords: machine learning; data mining; natural language processing; semantic ambiguity resolution.
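The idea of mapping a word to a sense-specific vector chosen by its context can be illustrated with a minimal sketch. This is not the thesis software and not the AdaGram implementation: the vectors, priors, vocabulary and function names below are hypothetical toy values, and the scoring rule (log prior plus a sum of log-sigmoid similarities to context words) is a simplified stand-in for the probabilistic model that AdaGram actually trains.

```python
import math

# Hypothetical toy data: two senses of the Russian word "лук" (onion / bow).
# In a trained multi-sense model these vectors and priors would be learned.
SENSE_VECTORS = {
    "лук": [
        {"prior": 0.6, "vec": [1.0, 0.0]},  # sense 0: the vegetable
        {"prior": 0.4, "vec": [0.0, 1.0]},  # sense 1: the weapon
    ],
}

# Hypothetical context-word vectors in the same 2-dimensional toy space.
CONTEXT_VECTORS = {
    "суп": [0.9, 0.1],       # soup
    "резать": [0.8, 0.2],    # to cut
    "стрела": [0.1, 0.9],    # arrow
    "стрелять": [0.0, 1.0],  # to shoot
}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def log_sigmoid(x):
    """log(1 / (1 + exp(-x))), written with log1p for small exp(-x)."""
    return -math.log1p(math.exp(-x))


def sense_posterior(word, context):
    """Return a normalized probability for each sense of `word` given context words."""
    scores = []
    for sense in SENSE_VECTORS[word]:
        score = math.log(sense["prior"])
        for ctx in context:
            if ctx in CONTEXT_VECTORS:  # unknown context words are skipped
                score += log_sigmoid(dot(sense["vec"], CONTEXT_VECTORS[ctx]))
        scores.append(score)
    # Softmax over the log scores, shifted by the maximum for stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def disambiguate(word, context):
    """Return (index of the most probable sense, that sense's vector)."""
    posterior = sense_posterior(word, context)
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return best, SENSE_VECTORS[word][best]["vec"]
```

With these toy values, a cooking context such as `["суп", "резать"]` selects sense 0, while `["стрела", "стрелять"]` selects sense 1. With an empty context the choice falls back to the sense with the larger prior.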

Full text (added May 23, 2019)

