Application of Extractive Summarization Methods for Summarization of Russian Court Sentences Texts

Student: Poliachek Nina

Supervisor: Alexander I. Khrabrov

Faculty: St. Petersburg School of Physics, Mathematics, and Computer Science

Educational Programme: Big Data Analysis for Business, Economy, and Society (Master)

Year of Graduation: 2021

The Russian legal sphere remains largely closed to outsiders. Legal texts are hard to read, complicated, and almost inaccessible to ordinary citizens, and legal professionals spend a lot of time processing large numbers of documents, which delays proceedings. There is therefore a need to simplify and shorten legal texts, which automatic summarization can do. This paper examines an extractive approach to summarizing legal texts, using the texts of Russian court sentences as an example. Russian court sentences share a similar structure, but they consist of long grammatical sentences, often stretching over several paragraphs, with an abundance of legal terms and complex constructions. For this reason, classical extractive methods, which mostly operate on whole sentences, are inapplicable to these texts. As part of the study, prior work on LegalTech development and extractive summarization was reviewed, texts of court sentences were collected, summaries were written for them, and the texts were labelled in the format required by the selected methods. The paper develops a new method for automatic extractive summary construction that selects parts of texts rather than whole sentences and is suitable for Russian court sentences. It also discusses the results obtained, the shortcomings of the method, and ways to improve and extend the study. Among the deep learning architectures chosen for the study, the best results were demonstrated by a recurrent neural network with Bidirectional LSTM, LSTM, and TimeDistributed Dense layers.
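To illustrate the idea of building a summary from parts of a text rather than from whole sentences, here is a minimal Python sketch. It rests on assumptions that are not stated in the abstract: that the text is split into tokens, that a labelling step (manual or by a sequence model) assigns each token 1 if it belongs to the summary and 0 otherwise, and that contiguous runs of 1-labels form the extracted parts. The function names and the toy English example are hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple


def spans_from_labels(labels: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each maximal run of 1-labels."""
    spans = []
    start = None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i                      # a new selected part begins here
        elif label != 1 and start is not None:
            spans.append((start, i))       # the current part ends before token i
            start = None
    if start is not None:                  # a part that runs to the end of the text
        spans.append((start, len(labels)))
    return spans


def extract_summary(tokens: List[str], labels: List[int]) -> List[str]:
    """Join the tokens of every selected part into one summary fragment."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [" ".join(tokens[s:e]) for s, e in spans_from_labels(labels)]


# Toy, made-up input: one long sentence with a per-token label (assumed format).
tokens = ("the court , having examined the case , "
          "found the defendant guilty under article 158").split()
labels = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

print(extract_summary(tokens, labels))
```

Because selection happens at the token level, a single multi-paragraph sentence can contribute only its relevant fragments, which a sentence-level extractor could not do.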
