
Student: Evgenij Smirnov
Title: Applying recursive tensor neural network model to sentiment analysis of internet shop reviews in Russian language
Faculty: School of Applied Mathematics and Information Science
Educational Programme: Bachelor’s programme
Year of Graduation: 2014
As the number of users who post their opinions on the internet grows, manual processing of those opinions has become impossible, and the field of automatic sentiment analysis is developing rapidly. One of the recent methods of sentiment analysis, the Recursive Neural Tensor Model, has shown much better results on English text than earlier approaches. The main goals of this study are to develop a program that implements this technique for texts in Russian, to reveal its advantages and disadvantages, and to assess whether it can be applied in practice to sentiment analysis of internet shop reviews.

To test the model, a sample of 247 reviews of a particular internet shop was extracted from the Yandex.Market website. Each review contains a rating of the shop’s quality on a five-point scale. The rating of a review is taken as the sentiment estimate for every sentence in that review. For each sentence in the sample, a binary dependency tree was constructed using the “АОТ” morphology analyzer and a software module for interactive syntax analysis developed by the authors of this study. The reviewers’ original spelling was retained in the sample.

The studied model was implemented in C++. It was compared with classic sentiment analysis methods based on the bag-of-words model. The comparison criterion was the accuracy of classifying reviews into two classes (ratings from “1” to “2” and ratings from “3” to “5”) using only one sentence from each review. The experiments showed that, although the studied model takes syntactic information into account, its accuracy is lower than that of one of the classic methods, which does not require this information.

Moreover, the accuracy of the classic techniques tends to rise with the number of words in a sentence, whereas the Recursive Neural Tensor Model shows good accuracy for short and medium-sized sentences (up to 15 words) and significantly lower accuracy for long sentences. This behavior may be caused by the accumulation of a large number of grammatical mistakes in the sample and by the fact that, unlike in the original study, the sentiment of each phrase within each sentence is unknown for the sample used.
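For readers unfamiliar with the model, the sketch below illustrates the composition step commonly associated with recursive neural tensor models: the vectors of two child nodes of a binary tree are concatenated into x = [a; b], and each component of the parent vector is computed as tanh(xᵀ V[k] x + W[k] x). This is a minimal illustration in Python, not the C++ program described in the abstract; the function name `rntn_compose`, the omission of a bias term, and the toy parameter values are assumptions made for the example.

```python
import math


def rntn_compose(a, b, V, W):
    """Combine two child vectors into one parent vector (illustrative sketch).

    a, b : child vectors of length d (lists of floats)
    V    : tensor given as d slices, each a 2d x 2d matrix (hypothetical values)
    W    : d x 2d weight matrix (hypothetical values)
    """
    x = a + b          # list concatenation gives [a; b] of length 2d
    n = len(x)
    parent = []
    for k in range(len(a)):
        # Bilinear tensor term: x^T V[k] x
        bilinear = sum(x[i] * V[k][i][j] * x[j]
                       for i in range(n) for j in range(n))
        # Standard recursive-network term: W[k] . x
        linear = sum(W[k][j] * x[j] for j in range(n))
        parent.append(math.tanh(bilinear + linear))
    return parent


# Toy usage with d = 2 and small hand-picked parameters (assumed values).
d = 2
a = [0.5, -0.1]
b = [0.2, 0.3]
V = [[[0.01 if i == j else 0.0 for j in range(2 * d)] for i in range(2 * d)]
     for _ in range(d)]
W = [[0.1] * (2 * d) for _ in range(d)]
print(rntn_compose(a, b, V, W))
```

Applying this function bottom-up over a binary tree yields a vector for every phrase; the vector at the root can then be passed to a classifier for the two rating classes described above.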
