Comparison of Dictionary Based and Corpora Based Phonological Inventories: Andic Data

Student: Davidenko Anastasiya

Supervisor: George Moroz

Faculty: Faculty of Humanities

Educational Programme: Fundamental and Computational Linguistics (Bachelor)

Year of Graduation: 2021

Linguists distinguish between two types of frequency: type frequency and token frequency. Each serves its own purpose and captures a different aspect of the language. We use these frequencies for the quantitative phonological analysis of low-resource languages. This work focuses on the difference between automatically calculated sound frequencies in two types of data: corpora and dictionaries. Our hypothesis is that the sound frequencies in the two sources are correlated, so that frequencies calculated from dictionary data can be used to describe phonology in the same way as frequencies calculated from corpus data. To test this, we use corpora and dictionaries of Andic languages. Some of these languages already have corpora (Botlikh, Andi); for others (Bagvalal), corpora can be compiled from the texts in their grammatical descriptions. For each language, we calculate sound frequencies from both sources and compare them to find a dependency between the two. If the hypothesis is confirmed, this work may contribute to future quantitative studies of the phonology of low-resource languages, since it would become possible to predict one value from the other.
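The comparison described above can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the summary does not name a correlation measure, so the Pearson coefficient is an assumption, as are the function names and the toy word lists. The sketch also assumes that each character stands for one sound, which real Andic transcriptions with multi-character segments would not satisfy without a proper segmenter.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(words):
    """Return the relative frequency of each sound in a list of words.

    Assumption: one character equals one sound.
    """
    counts = Counter(sound for word in words for sound in word)
    total = sum(counts.values())
    return {sound: count / total for sound, count in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def compare_sources(dictionary_entries, corpus_tokens):
    """Correlate dictionary-based and corpus-based sound frequencies.

    Dictionary entries are deduplicated (each word counted once);
    corpus tokens are counted as often as they occur.
    """
    dict_freq = relative_frequencies(set(dictionary_entries))
    corpus_freq = relative_frequencies(corpus_tokens)
    sounds = sorted(set(dict_freq) | set(corpus_freq))
    xs = [dict_freq.get(s, 0.0) for s in sounds]
    ys = [corpus_freq.get(s, 0.0) for s in sounds]
    return pearson(xs, ys)


# Hypothetical toy data, not taken from any Andic language.
toy_dictionary = ["ana", "bat", "tab"]
toy_corpus = ["ana", "ana", "bat", "ana", "tab"]
r = compare_sources(toy_dictionary, toy_corpus)
```

A coefficient close to 1 on real data would support the hypothesis that dictionary-based frequencies can stand in for corpus-based ones. A rank correlation could be substituted if the relationship is expected to be monotonic rather than linear.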

