Authorship Attribution task

Student: Kopysov Mark

Supervisor: Oleg Durandin

Faculty: Faculty of Informatics, Mathematics, and Computer Science (HSE Nizhny Novgorod)

Educational Programme: Applied Mathematics and Information Science (Bachelor)

Year of Graduation: 2020

Interest in authorship analysis among natural language processing experts has grown sharply, driven by the rapid expansion of the Internet and the need to structure the large volume of texts that have become publicly available. This has led to the creation of competitions (such as PAN) with curated datasets, skilled participants, and overviews of traditional and non-traditional approaches. An automated, computationally supported, and statistically accurate system for authorship attribution and identification may be of interest to a wide variety of experts. This thesis presents a practical implementation of various text analysis and classification methods that make it possible to solve the multi-domain, open-set authorship attribution problem with sufficient accuracy.
