Predicting Grammatical Properties of Words Helps Us Read Faster
Psycholinguists from the HSE Centre for Language and Brain found that when reading, people can predict not only specific words but also their grammatical properties, which helps them read faster. The researchers also found that the predictability of words and grammatical features can be successfully modelled with neural networks. The study was published in the journal PLOS ONE.
The ability to predict the next word in another person’s speech or in reading has been described in many psycho- and neurolinguistic studies over the last 40 years. It is assumed that this ability allows us to process information faster. Some recent studies of English have shown that while reading, people can predict not only specific words but also their properties (e.g., the part of speech or the semantic group). Such partial prediction also helps us read faster.
To assess the predictability of a word in context, researchers usually use cloze tasks, such as 'The cause of the accident was a mobile phone, which distracted the ______'. Different nouns are possible in this phrase, but 'driver' is the most probable, and it is also the real ending of the sentence. The probability of the word 'driver' in this context is calculated as the number of people who correctly guessed the word divided by the total number of people who completed the task.
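The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `cloze_probabilities` and the list of responses are hypothetical and invented for the example.

```python
from collections import Counter


def cloze_probabilities(responses):
    """Return each response word's share of all non-empty responses."""
    # Normalise case and whitespace so 'Driver' and 'driver' count as one word.
    normalized = [r.strip().lower() for r in responses if r.strip()]
    counts = Counter(normalized)
    total = len(normalized)
    return {word: n / total for word, n in counts.items()}


# Hypothetical answers from ten participants completing the sentence
# '... which distracted the ______' (invented data, not from the study).
responses = [
    "driver", "driver", "driver", "pilot", "Driver",
    "cyclist", "driver", "pilot", "driver", "passenger",
]

probs = cloze_probabilities(responses)
# Cloze probability of the real ending: correct guesses / all participants.
target_probability = probs.get("driver", 0.0)
print(target_probability)
```

With these invented responses, six of the ten participants wrote 'driver', so its cloze probability is 0.6.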
Another approach to estimating word probability in context is to use language models, which derive word probabilities from a large corpus of texts. However, there are virtually no studies comparing the probabilities obtained from cloze tasks with those obtained from language models. Additionally, no one has tried to model the understudied grammatical predictability of words. The authors of the paper set out to determine whether native Russian speakers predict the grammatical properties of words and whether language model probabilities could serve as a reliable substitute for probabilities from cloze tasks.
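To show what a corpus-based probability looks like in principle, here is a deliberately simple sketch. It uses a count-based bigram model (the probability of a word given only the previous word), which is a far simpler stand-in for the neural network used in the study. The toy corpus and the function names `train_bigram_model` and `next_word_probability` are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for every word, which words follow it in the corpus."""
    follow = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            follow[prev][nxt] += 1
    return follow


def next_word_probability(model, prev_word, word):
    """Estimate P(word | prev_word) as a relative frequency in the corpus."""
    counts = model.get(prev_word.lower())
    if not counts:
        return 0.0  # the context word never occurred in the corpus
    return counts[word.lower()] / sum(counts.values())


# Toy corpus, invented for illustration; a real model needs a large corpus.
corpus = [
    "the phone distracted the driver",
    "the noise distracted the driver",
    "the call distracted the pilot",
    "the driver stopped the car",
]

model = train_bigram_model(corpus)
p_driver = next_word_probability(model, "the", "driver")
print(p_driver)
```

In this toy corpus, 'the' is followed by another word eight times, three of them by 'driver', so the estimate is 3/8. Unlike a cloze task, no participants are needed: the probability comes entirely from text counts.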
The researchers analysed the responses of 605 native Russian speakers to a cloze task with 144 sentences and found that people predict the exact word in about 18% of cases. The accuracy of predicting parts of speech and morphological features (gender, number and case of nouns; tense, number, person and gender of verbs) ranged from 63% to 78%. They also found that a neural network model trained on the Russian National Corpus predicts specific words and grammatical properties with an accuracy comparable to that of the participants' answers in the experiment. An important observation was that the neural network predicts low-probability words better than humans and high-probability words worse than humans.
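The gap between exact-word prediction and grammatical prediction can be illustrated with a short sketch. It assumes each response has already been annotated with its grammatical features; the annotation scheme, the data and the function name `feature_match_rate` are hypothetical and are not taken from the paper.

```python
def feature_match_rate(responses, target, feature):
    """Share of responses whose value for `feature` matches the target word's."""
    if not responses:
        return 0.0
    matches = sum(1 for r in responses if r.get(feature) == target[feature])
    return matches / len(responses)


# The real sentence ending, with hypothetical grammatical annotations.
target = {"word": "driver", "pos": "NOUN", "number": "Sing"}

# Five invented participant responses with the same annotation fields.
responses = [
    {"word": "driver", "pos": "NOUN", "number": "Sing"},
    {"word": "pilot", "pos": "NOUN", "number": "Sing"},
    {"word": "passengers", "pos": "NOUN", "number": "Plur"},
    {"word": "children", "pos": "NOUN", "number": "Plur"},
    {"word": "them", "pos": "PRON", "number": "Plur"},
]

word_rate = feature_match_rate(responses, target, "word")
pos_rate = feature_match_rate(responses, target, "pos")
number_rate = feature_match_rate(responses, target, "number")
print(word_rate, pos_rate, number_rate)
```

In this invented sample only one response in five matches the exact word, while four in five match its part of speech. This is the same kind of contrast the study reports between word-level and grammatical predictability.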
The second step in the study was to determine how experimental and corpus-based probabilities affect reading speed. To do this, the researchers analysed eye-movement data from 96 people who read the same 144 sentences. First, the results showed that the higher the probability of guessing the part of speech, the gender and number of nouns, and the tense of verbs, the faster people read words with these features.
The researchers say that this proves that for languages with rich morphology, such as Russian, prediction is largely related to guessing words’ grammatical properties.
Second, the probabilities of grammatical features obtained from the neural network model explained reading speed as accurately as the experimental probabilities. ‘This means that for further studies, we will be able to use corpus-based probabilities from the language model without conducting new cloze task-based experiments,’ commented Anastasiya Lopukhina, author of the paper and Research Fellow at the HSE Centre for Language and Brain.
Third, the probabilities of specific words obtained from the language model explained reading speed differently from the experiment-based probabilities. The authors assume that this result may stem from the different sources of corpus-based and experimental probabilities: corpus-based methods work better for low-probability words, and experimental methods work better for high-probability ones.
Anastasiya Lopukhina, Research Fellow at the HSE Centre for Language and Brain.
Two things have been important for us in this work. First, we found that when reading, native speakers of languages with rich morphology actively engage in grammatical prediction. Second, our colleagues, the linguists and psychologists who study prediction, now have an opportunity to assess word probability using a language model. This will allow them to simplify the research process considerably.