Including Multimodal Data into Channel Suggestions

Student: Ekaterina Koshchenko

Supervisor: Denis Moskvin

Faculty: St. Petersburg School of Physics, Mathematics, and Computer Science

Educational Programme: Software Development and Data Analysis (Master)

Year of Graduation: 2021

Collaborative platforms such as GitHub and Slack are now a vital part of every IT employee's daily routine. As a result, the data these platforms aggregate is highly valuable for socio-technical assistance algorithms that improve employees' working conditions. However, because this data is spread across different platforms, combining it is very time-consuming, so existing socio-technical assistance algorithms (such as channel recommendation systems in messengers) rely only on data directly related to their purpose. In this work, I collected a dataset of interactions between JetBrains employees in channels and repositories on the Space platform and in repositories on GitHub, together with the users' positions in the company, obtained from Space. Using this dataset, I built several types of Space channel recommendation systems based on different combinations of data modalities. The best of these models were combined into a final multimodal channel recommendation system. Although evaluation on historical data showed worse metrics for the multimodal system than for the best unimodal algorithm (matrix factorization on channels), Space users rated the channels recommended by the multimodal system significantly higher than the ones suggested by matrix factorization. Based on these results, it was concluded that using data on the organizational structure and employees' technical repositories helps mitigate overfitting and solves the user cold-start problem.

Keywords: socio-technical assistance algorithms, channel recommendation system, multimodal data.
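For readers unfamiliar with the unimodal baseline named in the abstract, the following is a minimal sketch of matrix factorization applied to user-channel membership. It is not the thesis's implementation: the toy membership matrix, the held-out cell, the hyperparameters, and the function names `factorize` and `recommend` are all assumptions made for illustration.

```python
import random

def factorize(ratings, n_users, n_channels, rank=2, epochs=1000,
              lr=0.05, reg=0.01, seed=0):
    """Fit user and channel vectors by SGD on observed (user, channel, value) triples."""
    rng = random.Random(seed)
    user_vecs = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    chan_vecs = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_channels)]
    for _ in range(epochs):
        for u, c, value in ratings:
            p, q = user_vecs[u], chan_vecs[c]
            err = value - sum(a * b for a, b in zip(p, q))
            for k in range(rank):
                pk, qk = p[k], q[k]
                # Gradient step on squared error with L2 regularization.
                p[k] += lr * (err * qk - reg * pk)
                q[k] += lr * (err * pk - reg * qk)
    return user_vecs, chan_vecs

def recommend(user, user_vecs, chan_vecs, joined, top_n=2):
    """Rank channels the user has not joined by predicted score."""
    scored = [
        (sum(a * b for a, b in zip(user_vecs[user], chan_vecs[c])), c)
        for c in range(len(chan_vecs)) if c not in joined
    ]
    scored.sort(reverse=True)
    return [c for _, c in scored[:top_n]]

# Toy data (invented): rows are users, columns are channels, 1 = member.
membership = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
hidden = (3, 3)  # treated as unknown rather than as a negative example
ratings = [
    (u, c, membership[u][c])
    for u in range(4) for c in range(4) if (u, c) != hidden
]

user_vecs, chan_vecs = factorize(ratings, n_users=4, n_channels=4)
suggestions = recommend(3, user_vecs, chan_vecs, joined={2})
print(suggestions)
```

User 3 shares channel 2 with user 2, so the factorization is expected to score the hidden channel 3 above the channels that user 3 is known not to use. A model of this kind sees only membership data, so it has nothing to go on for a user with no channels yet. That is the cold-start gap the abstract says the additional modalities address.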
