Predicting Grammatical Properties of Words Helps Us Read Faster

Psycholinguists from the HSE Centre for Language and Brain found that when reading, people can predict not only specific words but also words’ grammatical properties, which helps them read faster. The researchers also discovered that the predictability of words and grammatical features can be successfully modelled using neural networks. The study was published in the journal PLOS ONE.

The ability to predict the next word in another person’s speech or in reading has been described by many psycho- and neurolinguistic studies over the last 40 years. It is assumed that this ability allows us to process the information faster. Some recent publications on the English language have demonstrated evidence that while reading, people can not only predict specific words, but also their properties (e.g., the part of speech or the semantic group). Such partial prediction also helps us to read faster.

To assess the predictability of a certain word in context, researchers usually use cloze tasks, such as 'The cause of the accident was a mobile phone, which distracted the ______'. In this sentence, different nouns are possible, but 'driver' is the most probable and is also the actual ending of the sentence. The probability of the word 'driver' in this context is calculated as the number of people who guessed this word correctly divided by the total number of people who completed the task.
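As a minimal sketch of that calculation, the Python snippet below computes a cloze probability from a list of responses. The function name `cloze_probability` and the sample responses are hypothetical and invented for illustration; they are not data from the study.

```python
from collections import Counter


def cloze_probability(responses, target):
    """Share of participants whose completion matches the target word."""
    if not responses:
        raise ValueError("at least one response is required")
    # Normalise case and whitespace so 'Driver ' and 'driver' count as one answer.
    counts = Counter(r.strip().lower() for r in responses)
    return counts[target.strip().lower()] / len(responses)


# Hypothetical completions for '... which distracted the ______'.
responses = ["driver", "driver", "Driver", "man", "woman",
             "driver", "cyclist", "driver", "pilot", "driver"]

print(cloze_probability(responses, "driver"))  # 6 of the 10 responses match
```

A word that nobody produced gets a probability of zero under this measure, which is one reason low-probability words are hard to estimate from cloze data alone.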

Another approach to estimating word probability in context is to use language models, which derive word probabilities from a large corpus of texts. However, virtually no studies have compared the probabilities obtained from cloze tasks with those from language models. Additionally, no one has tried to model the understudied grammatical predictability of words. The authors of the paper set out to learn whether native Russian speakers predict grammatical properties of words and whether language model probabilities could serve as a reliable substitute for probabilities from cloze tasks.
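To illustrate what a corpus-based probability is, the sketch below estimates the probability of a word given the previous word from bigram counts. This is a deliberately simple stand-in, not the neural network used in the study; the three-sentence corpus and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows each preceding word."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, word in zip(tokens, tokens[1:]):
            following[prev][word] += 1
    return following


def next_word_probability(model, prev, word):
    """Relative frequency of `word` after `prev`; 0.0 if `prev` was never seen."""
    counts = model.get(prev.lower())
    if not counts:
        return 0.0
    return counts[word.lower()] / sum(counts.values())


# Toy corpus (hypothetical); a real model would be trained on a large corpus.
corpus = [
    "the phone distracted the driver",
    "the noise distracted the driver",
    "the call distracted the passenger",
]
model = train_bigram_model(corpus)

# 'the' is followed by six words in total, two of which are 'driver'.
print(next_word_probability(model, "the", "driver"))
```

Unlike a cloze task, such a model assigns a probability to every continuation it has seen in the corpus, including rare ones that few participants would think to produce.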

The researchers analysed the responses of 605 native Russian speakers to a cloze task with 144 sentences and found that people predict the exact word in about 18% of cases. The accuracy of predicting parts of speech and morphological features of words (gender, number and case of nouns; tense, number, person and gender of verbs) varied from 63% to 78%. They discovered that a neural network model trained on the Russian National Corpus predicts specific words and grammatical properties with an accuracy comparable to that of people’s answers in the experiment. An important observation was that the neural network predicts low-probability words better than humans and high-probability words worse than humans.
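The gap between exact-word and grammatical prediction arises because a response can miss the word while still matching its part of speech or number. The sketch below shows one plausible way to score such partial matches. It is an assumption for illustration: the annotation scheme, the function name `feature_match_rate`, and the sample responses are hypothetical and do not reproduce the study's procedure or data.

```python
def feature_match_rate(responses, target, feature):
    """Share of responses whose value for `feature` equals the target's."""
    if not responses:
        raise ValueError("at least one response is required")
    matches = sum(1 for r in responses if r.get(feature) == target[feature])
    return matches / len(responses)


# Hypothetical annotated responses for a sentence ending in 'driver'.
target = {"word": "driver", "pos": "noun", "number": "singular"}
responses = [
    {"word": "driver", "pos": "noun", "number": "singular"},
    {"word": "passengers", "pos": "noun", "number": "plural"},
    {"word": "man", "pos": "noun", "number": "singular"},
    {"word": "quickly", "pos": "adverb", "number": None},
]

# The exact word matches least often; broader features match more often.
for feature in ("word", "pos", "number"):
    print(feature, feature_match_rate(responses, target, feature))
```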

The second step in the study was to determine how experimental and corpus-based probabilities affect reading speed. To investigate this, the researchers analysed eye movement data from 96 people who read the same 144 sentences. First, the results showed that the higher the probability of guessing the part of speech, the gender and number of nouns, and the tense of verbs, the faster a person read words with these features.

The researchers say that this proves that for languages with rich morphology, such as Russian, prediction is largely related to guessing words’ grammatical properties.

Second, the probabilities of grammatical features obtained from the neural network model explained reading speed as well as the experimental probabilities did. ‘This means that for further studies, we will be able to use corpus-based probabilities from the language model without conducting new cloze task-based experiments,’ commented Anastasiya Lopukhina, author of the paper and Research Fellow at the HSE Centre for Language and Brain.

Third, the probabilities of specific words obtained from the language model explained reading speed differently from experiment-based probabilities. The authors assume that this result may be related to the different sources of corpus-based and experimental probabilities: corpus-based methods work better for low-probability words, and experimental ones work better for high-probability words.

Anastasiya Lopukhina, Research Fellow at the HSE Centre for Language and Brain:

‘Two things have been important for us in this work. First, we found that native speakers of languages with rich morphology actively rely on grammatical prediction when reading. Second, our colleagues, linguists and psychologists who study prediction, now have an opportunity to assess word probability using a language model. This will allow them to simplify the research process considerably.’

Date

11 February 2021
