Master 2020/2021

Programming (Python)

Type: Compulsory course (Applied Linguistics and Text Analytics)
Area of studies: Fundamental and Applied Linguistics
Delivered by: School of Literature and Intercultural Communication
When: Year 1, Module 2
Mode of studies: distance learning
Instructors: Alexander Porshnev
Master’s programme: Applied Linguistics and Text Analytics
Language: English
ECTS credits: 4

Course Syllabus

Abstract

This course introduces the main Python data structures used in data analysis and text analytics. It then covers the basics of procedural programming and shows how Python's built-in data structures, such as lists, dictionaries, and tuples, can be used for simple natural language processing and data analysis.
Learning Objectives

  • The course is primarily aimed at helping students start using Python for text analysis and at enabling them to plan, design, and carry out natural language processing tasks. It focuses on two competencies: programming and natural language processing. The course covers three main topics: Getting Started with Python, NLTK, and Categorizing and Tagging.
Expected Learning Outcomes

  • Ability to write simple Python programs that use lists, dictionaries, and strings.
  • Ability to use the NLTK library in Python programs.
  • Ability to write a Python program that performs frequency analysis (words, n-grams, stems, normalized words, parts of speech, sentiment markers) for a set of English and Russian texts.
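
As a rough illustration of the frequency-analysis outcome above, the sketch below counts word and bigram frequencies using only lists, dictionaries, strings, and the standard library. The sample text and the helper names `tokenize` and `ngrams` are assumptions made for this example and are not taken from the course materials. NLTK offers comparable tools (for example `nltk.FreqDist` and `nltk.bigrams`).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return a list of English or Russian word tokens."""
    return re.findall(r"[a-zа-яё]+", text.lower())


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) found in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text, used only for illustration.
sample = "The cat sat on the mat. The cat slept."

tokens = tokenize(sample)
word_freq = Counter(tokens)               # dictionary-like: word -> count
bigram_freq = Counter(ngrams(tokens, 2))  # (word, word) -> count

print(word_freq.most_common(2))    # [('the', 3), ('cat', 2)]
print(bigram_freq.most_common(1))  # [(('the', 'cat'), 2)]
```

`Counter` is a dictionary subclass, so the same counts could also be built by hand with a plain `dict` and a `for` loop, which may be closer to a first exercise.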
Course Contents

  • Categorizing and Tagging
    Using existing part-of-speech taggers: the taggers built into NLTK and MyStem. Mapping Words to Properties Using Python Dictionaries. N-Gram Tagging.
  • NLTK Library for text analytics
    Getting Started with NLTK. Searching Text. Counting Vocabulary. Frequency Distributions. Collocations and Bigrams (N-Grams). The NLP Pipeline: Normalizing Text. Sentence Segmentation.
  • Computing with Language: Texts and Words
    Getting Started with Python (IDE, PyCharm, Jupyter Notebook). Variables and Data Structures: lists, dictionaries, strings. Conditionals. Operating on Every Element. Nested Code Blocks. Looping with Conditions. Regular Expressions for Detecting Word Patterns. Accessing Text from the Web and from Disk. Writing Results to a File.
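
Two of the topics listed above, mapping words to properties with dictionaries and detecting word patterns with regular expressions, can be sketched in a few lines of plain Python. The tiny lexicon, the tag names, and the function names `lookup_tag` and `words_matching` are hypothetical and serve only as an illustration. A real tagger such as the ones in NLTK or MyStem is far more capable.

```python
import re

# Hypothetical hand-made lexicon: word -> part-of-speech tag.
LEXICON = {
    "the": "DET",
    "cat": "NOUN",
    "sat": "VERB",
    "on": "ADP",
    "mat": "NOUN",
}


def lookup_tag(tokens, lexicon, default="UNK"):
    """Tag each token by dictionary lookup; unknown words get the default tag."""
    return [(tok, lexicon.get(tok.lower(), default)) for tok in tokens]


def words_matching(tokens, pattern):
    """Return the tokens in which the regular expression matches somewhere."""
    regex = re.compile(pattern)
    return [tok for tok in tokens if regex.search(tok)]


tokens = ["The", "cat", "sat", "on", "the", "mat", "quietly"]

tagged = lookup_tag(tokens, LEXICON)
ly_words = words_matching(tokens, r"ly$")  # words ending in "ly"

print(tagged[:2])  # [('The', 'DET'), ('cat', 'NOUN')]
print(ly_words)    # ['quietly']
```

The `default` tag makes the gaps in the lexicon visible, which is one motivation for combining a lookup table with a fallback tagger.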
Assessment Elements

  • non-blocking Programming task
  • non-blocking Programming task 0
Interim Assessment

  • Interim assessment (Module 2)
    0.7 * Programming task + 0.3 * Programming task 0
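
The weighting above can be checked with a short calculation. This is only a sketch: the function name `interim_grade` and the example scores are hypothetical, and the syllabus does not state the grading scale or any rounding rule, so none is applied here.

```python
def interim_grade(programming_task, programming_task_0):
    """Weighted interim grade: 0.7 * Programming task + 0.3 * Programming task 0."""
    return 0.7 * programming_task + 0.3 * programming_task_0


# Hypothetical scores, used only to illustrate the weighting.
grade = interim_grade(8, 6)
print(round(grade, 2))  # 7.4
```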
Bibliography

Recommended Core Bibliography

  • Wagner, W. (2010). Steven Bird, Ewan Klein and Edward Loper: Natural Language Processing with Python, Analyzing Text with the Natural Language Toolkit. Language Resources & Evaluation, 44(4), 421–424. https://doi.org/10.1007/s10579-010-9124-x

Recommended Additional Bibliography

  • Beysolow, T. (2018). Applied Natural Language Processing with Python : Implementing Machine Learning and Deep Learning Algorithms for Natural Language Processing. [Berkeley, CA]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1892182
  • Bhavsar, K., Kumar, N., & Dangeti, P. (2017). Natural Language Processing with Python Cookbook : Over 60 Recipes to Implement Text Analytics Solutions Using Deep Learning Principles. Packt Publishing.
  • Sarkar, D. (2019). Text Analytics with Python : A Practitioner’s Guide to Natural Language Processing (2nd ed.). Apress.
  • Kasliwal, N. (2018). Natural Language Processing with Python Quick Start Guide : Going From a Python Developer to an Effective Natural Language Processing Engineer. Packt Publishing.
  • Perkins, J. (2010). Python Text Processing with NLTK 2.0 Cookbook : Over 80 Practical Recipes for Using Python’s NLTK Suite of Libraries to Maximize Your Natural Language Processing Capabilities. Packt Publishing.