Sentiment Analysis Using Semantic Features Based on Machine Learning Methods

Student: Smetanin Sergey

Supervisor: Mikhail M. Komarov

Faculty: Graduate School of Business

Educational Programme: Electronic Business (Master)

Year of Graduation: 2018

Customer-generated reviews on e-commerce sites are a valuable resource for evaluating customers’ behavior, preferences, and needs. This paper presents an approach to sentiment analysis of product reviews in Russian that uses machine learning methods with semantic features. The training dataset was collected from reviews of orders for top-ranked goods on Pandao and AliExpress, with the user-assigned score serving as the class label. A Multinomial Naive Bayes classifier was used as the first baseline, and labeling by Yandex.Toloka assessors was used as the second baseline. Novel architectures for a one-layer convolutional neural network and a long short-term memory recurrent neural network are introduced in this work. Since these models require specifying an explicit model architecture and a range of accompanying hyperparameters (e.g., filter region size, regularization parameters, dropout probabilities, number of training epochs), training experiments were conducted to optimize the architecture and hyperparameters. The Word2Vec algorithm was used to produce pre-trained word embeddings for both networks, which were built using Keras with the Theano backend. In classification experiments with 3 classes, the ensemble of the networks reached an F-measure of up to 71.27%, exceeding the scores of all baseline models.
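To make the first baseline and the reported metric more concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier with Laplace smoothing and a macro-averaged F-measure over three classes. It is not the thesis implementation: the function names (`train_nb`, `predict_nb`, `macro_f1`), the toy English reviews, the smoothing constant, and the choice of macro averaging are all assumptions made for illustration, since the summary does not state these details.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenized documents.

    alpha is the Laplace smoothing constant (an assumed value).
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {}
    for label, n_docs in class_counts.items():
        # Smoothed denominator: tokens seen in this class plus alpha per vocabulary word.
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(labels)),
            "log_like": {
                w: math.log((word_counts[label][w] + alpha) / total) for w in vocab
            },
        }
    return model


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior; out-of-vocabulary words are skipped."""
    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            params["log_like"][w] for w in tokens if w in params["log_like"]
        )
    return max(model, key=score)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption here)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data standing in for tokenized reviews with three sentiment classes.
toy_docs = [
    ["great", "quality"],
    ["fast", "delivery", "great"],
    ["bad", "quality"],
    ["broken", "bad"],
    ["ok", "average"],
    ["average", "nothing", "special"],
]
toy_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

toy_model = train_nb(toy_docs, toy_labels)
toy_predictions = [predict_nb(toy_model, doc) for doc in toy_docs]
toy_score = macro_f1(toy_labels, toy_predictions)
```

In practice a library implementation would normally replace the hand-written classifier; the sketch only shows the kind of word-count model and per-class scoring that such a baseline comparison relies on.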

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.