Multimedia Data Analysis of Social Networks

Student: Kuporosov Gennadiy

Supervisor: Nikolay Karpov

Faculty: Faculty of Informatics, Mathematics, and Computer Science (HSE Nizhny Novgorod)

Educational Programme: Computational Linguistics (Master)

Year of Graduation: 2016

Social networks are very popular today and host a wide variety of content, which increases their impact on society. Unfortunately, some of this content includes material that violates Russian law. Manual detection of such content is ineffective. Existing technical means, such as spam filters, do not fully solve the problem. The analysis of multimedia data in social networks for the detection of legal violations is therefore a relevant problem that requires new approaches. This dissertation is devoted to the analysis of multimedia data from social networks in order to detect violations of the laws of the Russian Federation. The objectives of the work are to define requirements, to analyze existing methods and tools for building automated software for this task, to implement such a system, and to evaluate in practice the effectiveness of the methods applied. The methods considered for detecting violations in social networks were keyword search, n-gram search, and machine learning algorithms. VK.com and YouTube were chosen among the existing social networks as targets for the software tools. The programs were written in Python 3.5 on the Ubuntu 16.04 LTS platform. The libraries pymorphy2, sklearn, and pymongo were used in development. The audio tracks of videos were analyzed with the Yandex SpeechKit Cloud technology, which converts audio to text and thus makes it possible to apply the same analysis methods as for text. A practical study of the effectiveness of the methods used for this task was also carried out. As a result, the objectives of the work were fully met. All of the methods used, and the software tools developed, were found to be effective for solving the problem.
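The keyword and n-gram search mentioned in the summary can be illustrated with a minimal sketch. This is not the thesis code: the function names, the word lists, and the sample text are hypothetical, and the sketch skips morphological normalization, for which a library such as pymorphy2 could be used on Russian text. It uses only the Python standard library.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def find_matches(text, keywords, phrases):
    """Return (keyword_hits, phrase_hits) found in the text.

    keywords: single words to look for (hypothetical watch list).
    phrases: multi-word expressions, matched as exact token n-grams.
    """
    tokens = tokenize(text)
    token_set = set(tokens)

    # Keyword search: a hit is any watch-list word present in the text.
    keyword_hits = sorted(k for k in keywords if k in token_set)

    # N-gram search: a hit is a phrase whose tokens appear consecutively.
    phrase_hits = []
    for phrase in phrases:
        target = tuple(tokenize(phrase))
        if target and target in ngrams(tokens, len(target)):
            phrase_hits.append(phrase)

    return keyword_hits, phrase_hits


# Illustrative input only; the lists below are placeholders.
sample = "Buy cheap pills now, limited offer!"
hits = find_matches(sample, {"pills", "casino"}, ["limited offer", "free money"])
print(hits)
```

Because the summary states that audio tracks are first converted to text, the same kind of function could in principle be applied to a transcript as well as to a post or comment.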

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
