
Dependency Parser Using Neural Networks for NLTK

Student: Ghukasyan Tsolak

Supervisor: Denis Y. Turdakov

Faculty: Faculty of Computer Science

Educational Programme: Software Engineering (Bachelor)

Final Grade: 7

Year of Graduation: 2017

The Natural Language Toolkit (NLTK) is a leading suite of libraries for natural language processing, and with the rise of artificial intelligence its popularity has grown further in recent years. Since the existing parsers in NLTK are either outdated or not implemented in Python, the toolkit's primary programming language, this project proposes an implementation of a fast and accurate dependency parser for inclusion in the toolkit. In this work, the transition-based parsing algorithm created by Danqi Chen and Christopher D. Manning is implemented and incorporated into NLTK. The algorithm uses an artificial neural network to predict transitions and introduces several new ideas. The parser employs an efficient new feature representation, particularly for part-of-speech tags and arc labels: instead of sparse one-hot encodings, words, tags, and labels are all represented as dense vectors. Another innovation is the use of a cube activation function in the neural network classifier instead of the commonly used tanh and sigmoid functions.

Index Terms: natural language processing, dependency parsing, artificial neural networks, NLTK.
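
The two ideas named in the abstract, dense feature representations and a cube activation, can be illustrated with a minimal sketch of a single forward pass through such a classifier. This is not the thesis code or the NLTK API: the vocabulary, layer sizes, random weights, three-way transition set, and the use of one shared embedding table are all assumptions made for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

EMBED_DIM = 4    # hypothetical embedding size
HIDDEN_DIM = 5   # hypothetical hidden-layer size
TRANSITIONS = ["SHIFT", "LEFT-ARC", "RIGHT-ARC"]  # assumed transition set


def random_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Simplification: one dense embedding table shared by words, POS tags, and arc labels.
vocab = {"He": 0, "eats": 1, "PRP": 2, "VBZ": 3, "nsubj": 4}
embeddings = random_matrix(len(vocab), EMBED_DIM)


def embed(features):
    """Concatenate the dense vectors of the given feature symbols."""
    vector = []
    for symbol in features:
        vector.extend(embeddings[vocab[symbol]])
    return vector


def cube(x):
    """Cube activation, used in place of tanh or sigmoid."""
    return x ** 3


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(features, w1, b1, w2):
    """One forward pass: embed, cube-activated hidden layer, softmax over transitions."""
    x = embed(features)
    hidden = [cube(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return softmax(scores)


# Hypothetical features drawn from a parser state (words, tags, and one arc label).
features = ["He", "eats", "PRP", "VBZ", "nsubj"]
w1 = random_matrix(HIDDEN_DIM, EMBED_DIM * len(features))
b1 = [0.0] * HIDDEN_DIM
w2 = random_matrix(len(TRANSITIONS), HIDDEN_DIM)

probs = predict(features, w1, b1, w2)
best = TRANSITIONS[probs.index(max(probs))]
print(best, probs)
```

Because the weights here are random and untrained, the predicted transition is meaningless; the point is only the data flow. Each symbol contributes a short dense vector rather than a long one-hot vector, and cubing the hidden pre-activation produces products of up to three input features, which is one way to see why this activation can capture feature interactions.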

Full text (added May 26, 2017)

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
