Campus in Moscow
Student: Konstantin Slavnov
Title: Learning Node Embeddings in Graphs
Final Grade: 10
Year of Graduation: 2017
In this work we consider the problem of learning node representations in graphs for machine learning tasks. We treat it as a universal way of processing graph data that does not require supervision. We begin by describing the problem and its relevance, then briefly describe existing approaches based on matrix factorization and random walks. Next, we motivate the creation of new algorithms and briefly outline their concepts: one method is based on a structural loss function, the other on matrix factorization.

After the introduction and the formal problem statement, we describe modern methods for further analysis and comparison with the new ones. These are the matrix factorization methods SVD, NMF, and BigClam, and the random walk methods DeepWalk and Node2vec. The new methods are the Sparse Gamma Model and the structural density loss function.

We then list the properties desirable in algorithms that learn vector representations. The method must be universal for all problems on graphs; be scalable; have no parameters; make no complex assumptions about the nature of the data; use the graph structure directly; and select the effective embedding dimension automatically. All methods are assessed against these criteria, and a comparative analysis is carried out. The structural density loss function satisfies all of the above properties.

The work concludes with experiments and conclusions. In the experiments, a model example and a real example are considered for the classification problem on graphs. The structural density loss function performed on par with the best methods. Special attention is paid to the task of detecting overlapping communities, in which the Sparse Gamma Model proved to be the best model.
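To make the matrix factorization family mentioned in the abstract more concrete, here is a minimal sketch of SVD-based node embeddings on a toy graph. It is an illustration only, not the thesis's implementation: the function names `svd_node_embeddings` and `two_clique_graph`, the toy graph, and the choice to factorize the raw adjacency matrix are all assumptions made for this example.

```python
import numpy as np


def svd_node_embeddings(adjacency: np.ndarray, dim: int) -> np.ndarray:
    """Return a (num_nodes, dim) embedding from a truncated SVD.

    Assumption: we factorize the raw adjacency matrix. Other choices
    (normalized adjacency, proximity matrices) are equally plausible.
    """
    u, s, _ = np.linalg.svd(adjacency, full_matrices=False)
    # Keep the `dim` largest singular directions, scaled by sqrt(sigma).
    return u[:, :dim] * np.sqrt(s[:dim])


def two_clique_graph(clique_size: int) -> np.ndarray:
    """Hypothetical toy graph: two cliques joined by a single bridge edge."""
    n = 2 * clique_size
    a = np.zeros((n, n))
    a[:clique_size, :clique_size] = 1.0
    a[clique_size:, clique_size:] = 1.0
    np.fill_diagonal(a, 0.0)  # no self-loops
    # Bridge between the last node of clique 1 and the first node of clique 2.
    a[clique_size - 1, clique_size] = 1.0
    a[clique_size, clique_size - 1] = 1.0
    return a


adjacency = two_clique_graph(4)
embeddings = svd_node_embeddings(adjacency, dim=2)

# Nodes 0 and 1 sit in the same clique; node 7 sits in the other one.
same_clique = np.linalg.norm(embeddings[0] - embeddings[1])
other_clique = np.linalg.norm(embeddings[0] - embeddings[7])
print(f"same clique: {same_clique:.3f}, other clique: {other_clique:.3f}")
```

In this toy setting, structurally equivalent nodes of one clique land on nearly the same point, while nodes of the other clique are placed apart. Note that `dim` is chosen by hand here, which is exactly the kind of parameter the desirable properties listed in the abstract aim to avoid.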
