
Automated Fact-Based Report Generation

Student: Polushin Gleb

Supervisor: Sergey Lisitsyn

Faculty: Graduate School of Business

Educational Programme: Big Data Systems (Master)

Year of Graduation: 2019

One significant application area of NLP is controllable text generation. A common task is writing short reports based on a set of facts. Usually, an analyst looks at the raw facts, processes them in a particular way, and produces a report. This is a fairly routine task, and given today's trend of automating manual processes and the growing number of NLP solutions, it can possibly be partly or fully automated. In this research, we investigated so-called data-to-text generation, where the input is a set of facts, rows in a table, or records in a database, and the goal is to produce short descriptions of that data. We implemented a data-driven solution that consists of two steps: generating phrases from raw input data, and then correcting and concatenating the phrases to make them read more naturally. Both steps use neural networks. It is a new method that allows a data-to-text generation system to be implemented with minimal effort when no dataset is available. The architecture is shaped by the fact that the solution was implemented with no training data available. We applied the algorithm to real data about a site's audience and showed that it generates satisfactory texts. In addition, we implemented an end-to-end system that takes the results of the two-step solution as training data.
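The data flow described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: both steps in the thesis use neural networks, whereas the functions below are simple rule-based stand-ins, and the record fields (`metric`, `segment`, `value`) and all function names are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical record format: one fact per dictionary (an assumption, not from the thesis).
Fact = Dict[str, str]


def generate_phrase(fact: Fact) -> str:
    """Step 1 stand-in: turn one raw fact into a rough phrase.

    The thesis uses a neural network for this step; a fixed template
    is used here only to show the shape of the input and output.
    """
    return f"{fact['metric']} for {fact['segment']} is {fact['value']}"


def correct_and_join(phrases: List[str]) -> str:
    """Step 2 stand-in: tidy each phrase and concatenate them into a report.

    The thesis uses a neural network for correction; this placeholder
    only capitalizes the first letter and adds a full stop.
    """
    sentences = []
    for phrase in phrases:
        phrase = phrase.strip()
        if not phrase:
            continue  # skip empty phrases
        sentences.append(phrase[0].upper() + phrase[1:] + ".")
    return " ".join(sentences)


def build_report(facts: List[Fact]) -> str:
    """Run both steps: facts -> phrases -> report text."""
    return correct_and_join([generate_phrase(f) for f in facts])


def make_training_pairs(fact_sets: List[List[Fact]]) -> List[Tuple[List[Fact], str]]:
    """Pair each set of facts with its generated report.

    Pairs like these could serve as training data for an end-to-end model,
    as the abstract describes for the two-step output.
    """
    return [(facts, build_report(facts)) for facts in fact_sets]


if __name__ == "__main__":
    facts = [
        {"metric": "share of mobile visitors", "segment": "the news section", "value": "62%"},
        {"metric": "average session length", "segment": "returning users", "value": "4 minutes"},
    ]
    print(build_report(facts))
```

The point of the sketch is the separation of concerns: phrase generation works on one fact at a time, while correction and concatenation work on the full list of phrases, so the two stages can be swapped out or trained independently.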

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
