Sentiment Analysis Using Semantic Features Based on Machine Learning Methods
Customer-generated reviews on e-commerce sites are a valuable resource for evaluating customers’ behavior, preferences, and needs. This paper presents an approach to sentiment analysis of Russian-language product reviews that uses machine learning methods with semantic features. The training dataset was collected from order reviews of top-ranked goods on Pandao and Aliexpress, with the user-assigned rating used as the class label. A Multinomial Naive Bayes classifier served as the first baseline, and labeling by Yandex.Toloka assessors served as the second. Novel architectures for a one-layer convolutional neural network and a long short-term memory (LSTM) recurrent neural network are introduced in this work. Since these models require specifying an explicit architecture and a range of accompanying hyperparameters (e.g., filter region size, regularization parameters, dropout probabilities, number of training epochs), training experiments were conducted to optimize both the architecture and the hyperparameters. The Word2Vec algorithm was used to produce pre-trained word embeddings for both networks, which were built using Keras with the Theano backend. Classification experiments with three classes yielded an F-measure of up to 71.27% for the ensemble of the networks, exceeding the scores of all baseline models.
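To make the first baseline concrete, the following is a minimal pure-Python sketch of a bag-of-words Multinomial Naive Bayes classifier with Laplace smoothing, trained on labels derived from star ratings. It is an illustration under stated assumptions, not the implementation used in this work: the rating thresholds in `rating_to_label`, the whitespace tokenization, the smoothing value, and the toy English reviews (standing in for Russian review text) are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def rating_to_label(stars):
    """Map a 1-5 star rating to one of 3 classes.

    The thresholds are an assumption made for this sketch.
    """
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


class MultinomialNB:
    """Bag-of-words multinomial Naive Bayes with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()         # naive whitespace tokenizer
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / self.total_docs)  # log prior
            total = sum(self.word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            for t in tokens:
                # Smoothed log likelihood of token t given the class.
                score += math.log((self.word_counts[label][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy reviews invented for illustration; they are not from the paper's dataset.
reviews = [
    ("great product fast delivery", 5),
    ("good quality recommend", 4),
    ("okay but delivery slow", 3),
    ("ordinary product nothing special", 3),
    ("terrible quality not recommend", 1),
    ("bad product broke", 2),
]
texts = [text for text, _ in reviews]
labels = [rating_to_label(stars) for _, stars in reviews]
model = MultinomialNB().fit(texts, labels)
prediction = model.predict("great product recommend")
```

Using the rating as a label avoids manual annotation, but a star rating and the sentiment of the accompanying text may disagree for some reviews, which is one reason a human-labeled second baseline is a useful point of comparison.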