Machine Learning for News Summarization Task

Student: Kechin Sergei

Supervisor: Alexey Masyutin

Faculty: Faculty of Computer Science

Educational Programme: Financial Technology and Data Analysis (Master)

Year of Graduation: 2020

This Master's dissertation investigates a corpus of Russian news articles from the RIA News agency. The goal of the work is to apply several neural summarization models to news headline generation and compare their quality using the ROUGE metric. Three models are used: an encoder-decoder architecture with long short-term memory (LSTM) networks, the Transformer, and the recently published Reformer. We briefly introduce the structure of these models and the key memory-saving mechanisms used in the Reformer: reversible connections and approximate attention computation with locality-sensitive hashing (LSH). The models are trained in the neural network frameworks OpenNMT and Trax. All models show approximately the same quality, and the generated headlines read relatively naturally to a human. In addition, the quality of the original Transformer is compared with that of a Transformer that computes attention only via the LSH scheme.
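As an illustration of how a ROUGE score between a reference headline and a generated one might be computed, here is a minimal Python sketch of ROUGE-N. It assumes plain whitespace tokenization and lowercasing. The abstract does not say which ROUGE variant or tokenizer the thesis uses, so the function name `rouge_n` and the example headlines are hypothetical and not taken from the work.

```python
from collections import Counter


def rouge_n(reference: str, candidate: str, n: int = 1) -> dict:
    """Return ROUGE-N precision, recall and F1 for two strings.

    Assumption: tokens are lowercased and split on whitespace.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Clipped overlap: an n-gram counts only as often as it occurs in both texts.
    overlap = sum((ref & cand).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical headline pair, not taken from the thesis data.
scores = rouge_n("central bank raises key rate", "bank raises rate again")
bigram_scores = rouge_n("central bank raises key rate", "bank raises rate again", n=2)
```

Here three of the five reference unigrams appear in the candidate, so unigram recall is 0.6 and precision is 0.75. Averaging such scores over a test set is one common way to compare models.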

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
