Дереза Оксана Владимировна
Morphological Analysis of Old Irish Data with Neural Networks
Theory of Language and Computational Linguistics
Interest in the automatic morphological analysis of historical languages arose at the very start of computational linguistics, yet the field remains underrepresented in comparison with other NLP tasks. Moreover, most modern NLP tools for historical language data are confined to Latin and Ancient Greek. There are, however, many other well-documented languages, such as Old Irish, for which a statistical approach to linguistic analysis may prove useful. Such languages are usually morphologically rich and orthographically inconsistent, which complicates automatic processing and requires NLP tools to be language-specific; the lack of annotated corpora is an even bigger problem. In this work, I describe a neural network approach to lemmatisation and compare it with my previous work on a rule-based Old Irish lemmatiser.