Investigating Loss Functions for Effective Training of Models that Estimate Semantic Textual Proximity

Student: Chernyavskiy Anton

Supervisor: Dmitry Ilvovsky

Faculty: Faculty of Computer Science

Educational Programme: Data Science (Master)

Year of Graduation: 2021

The use of contrastive loss for representation learning has become prominent in computer vision, and it is now gaining attention in natural language processing (NLP). Here, we explore the idea of using a batch-softmax contrastive (BSC) loss when fine-tuning large-scale pre-trained transformer models to learn better task-specific sentence embeddings for pairwise sentence scoring tasks. We introduce and study a number of variations in the calculation of the loss (normalization schemes, symmetrization, aligning scores on the diagonal of the similarity matrix) as well as in the overall training procedure (data shuffling, trainable temperature, and sequential pre-training). Our experimental results show sizable improvements on a number of datasets and pairwise sentence scoring tasks, including classification, ranking, and regression. In particular, for the Snopes ranking task, we implement a full pipeline using the BSC loss, which achieves state-of-the-art results. The importance of each proposed modification to the loss formulation and the training procedure is demonstrated both theoretically and experimentally through an ablation study. Finally, we offer a detailed analysis and discussion, which should be useful for researchers aiming to explore the utility of the contrastive loss in NLP.
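To make the idea of a batch-softmax contrastive loss concrete, the following is a minimal sketch of one common formulation, not the thesis's exact implementation. It assumes that row i of `left` and row i of `right` form a positive pair, that embeddings are L2-normalized (cosine similarity), and that the temperature is a fixed constant; the abstract notes that the thesis studies several normalization schemes and a trainable temperature. The function name `bsc_loss` and its parameters are hypothetical.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm (assumed normalization scheme)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _neg_log_softmax_at(scores, index):
    """Return -log softmax(scores)[index], computed stably."""
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[index]


def bsc_loss(left, right, temperature=0.1, symmetric=True):
    """Batch-softmax contrastive loss over a batch of paired embeddings.

    Assumption: left[i] and right[i] are a positive pair, and every other
    right[j] in the batch serves as a negative for left[i].
    """
    left = [_normalize(v) for v in left]
    right = [_normalize(v) for v in right]
    n = len(left)

    # Similarity matrix: scores[i][j] = cos(left[i], right[j]) / temperature.
    scores = [
        [sum(a * b for a, b in zip(u, v)) / temperature for v in right]
        for u in left
    ]

    # Row-wise softmax: the diagonal entry is the target for each row.
    row_loss = sum(_neg_log_softmax_at(scores[i], i) for i in range(n)) / n
    if not symmetric:
        return row_loss

    # Symmetrization: apply the same loss column-wise and average.
    col_loss = sum(
        _neg_log_softmax_at([scores[j][i] for j in range(n)], i)
        for i in range(n)
    ) / n
    return 0.5 * (row_loss + col_loss)
```

In this sketch, a batch in which every embedding is identical gives a loss of log N (the softmax is uniform), while well-separated positive pairs drive the loss toward zero. In practice the embeddings would come from the fine-tuned transformer encoder and the loss would be computed with an automatic-differentiation framework.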

