
Supervised Approaches for Detection of Non-Compositional Nominal Compounds

Student: Puzyrev Dmitriy

Supervisor: Ekaterina Artemova

Faculty: Faculty of Computer Science

Educational Programme: Applied Mathematics and Information Science (Bachelor)

Year of Graduation: 2019

The compositionality of a noun compound is the extent to which the meaning of the phrase can be derived from the meanings of its parts and the grammatical relations between them. If the semantics of a compound differs from that of its components, the phrase is called non-compositional; examples of non-compositional collocations include "hot dog" and "rat race". Detecting compositionality is useful in various areas, notably machine translation. The task has frequently been addressed with Distributional Semantic Models (DSMs) and mathematical operations on the corresponding word vectors. Unlike traditional approaches, which use an unsupervised setting, we treat the task as a classification problem and apply various supervised learning algorithms, obtaining a substantial improvement over state-of-the-art models on a gold standard dataset. We also introduce a Russian-language dataset for the compound compositionality task, covering the ADJ+NOUN and NOUN+NOUN part-of-speech patterns, which is to our knowledge the first such dataset for a Slavic language, and apply the same classification methods to obtain baseline results comparable with those for English corpora.
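To make the contrast between the two settings concrete, the following is a minimal sketch, not the method used in the thesis. It assumes a common unsupervised baseline (cosine similarity between the compound's vector and the sum of its components' vectors) and then shows how similar similarities could be packed into a feature vector for a supervised classifier. The function names, the additive composition, the choice of features, and the toy three-dimensional vectors are all illustrative assumptions; the summary does not specify them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality_score(compound_vec, first_vec, second_vec):
    """Unsupervised-style score: similarity of the compound to the sum of its parts.

    Additive composition is an assumption made for this sketch.
    """
    composed = [a + b for a, b in zip(first_vec, second_vec)]
    return cosine(compound_vec, composed)


def pair_features(compound_vec, first_vec, second_vec):
    """Hypothetical feature vector that a supervised classifier could be trained on."""
    return [
        cosine(compound_vec, first_vec),
        cosine(compound_vec, second_vec),
        compositionality_score(compound_vec, first_vec, second_vec),
    ]


# Toy, hand-made vectors (not real embeddings), chosen only to show the two extremes.
toy = {
    "hot": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "hot_dog": [0.0, 0.0, 1.0],  # points away from both parts
    "red": [1.0, 0.0, 0.0],
    "car": [0.0, 1.0, 0.0],
    "red_car": [1.0, 1.0, 0.0],  # lies along the sum of its parts
}

hot_dog_score = compositionality_score(toy["hot_dog"], toy["hot"], toy["dog"])
red_car_score = compositionality_score(toy["red_car"], toy["red"], toy["car"])
red_car_features = pair_features(toy["red_car"], toy["red"], toy["car"])
```

In the unsupervised setting, a score like `compositionality_score` would be compared directly against human judgements. In a supervised setting, rows such as `red_car_features`, paired with compositional or non-compositional labels, would instead serve as training input for a classifier of one's choice.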

Student theses at HSE must be completed in accordance with the University rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
