Development of a Python Package for Data Analysis in Social Sciences

Student: Zhuchkova Svetlana

Supervisor: Alexey Rotmistrov

Faculty: Faculty of Computer Science

Educational Programme: Financial Technology and Data Analysis (Master)

Year of Graduation: 2020

This thesis is an applied project devoted to the development of a Python package for data analysis in the social sciences. The main goal of the project is to create a tool that replicates the basic functionality of SPSS, the most popular application for data analysis with a graphical user interface. The target audience is users with little or no programming background, including students and teachers of the relevant educational programmes. We assume that a tool resembling such popular software can contribute to the data culture and programming skills of this audience and ease their later transition to more advanced packages and methods. This transition is becoming relevant in both science and industry because of the changing sources and volumes of available data. Currently, the package (randan) comprises nine classes grouped into modules. These classes correspond to the methods of analysis taught in the Sociology bachelor's programme at HSE. A comparison with the most popular Python implementations of the same methods showed that randan offers more functionality than most of them, which means it gives the user more complete results for the purposes of social data analysis. In terms of performance, however, randan falls behind the existing solutions in most cases because it requires more time and memory. Although no performance requirements were set for the package at the first development stage, the analysis of the differences from other solutions gives reasons to improve the classes that have already been created.

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
