Master's programme 2019/2020

Introduction to Methods of Collecting and Analyzing Big Data

Field of study: 39.04.01. Sociology
When taught: 1st year, module 2
Mode of study: Full time
Programme: Comparative Social Research
Language: English
Credits: 5

Course Syllabus

Abstract

The growth of Internet penetration and the ability to collect and analyze big data have created new challenges and opened new opportunities for researchers and official statistics. Within a few years, nonreactive and big data have become the main trend in the social sciences. Nonreactive methods include nonparticipant observation and the analysis of digital footprints such as likes or shares, of private documents such as blogs, social media profiles, and comments, and of public online documents such as mass media materials. People post information, tweet, retweet, and share information that other people post. Social scientists can apply their experience of designing social research, including experimental and quasi-experimental studies, to draw valid inferences from big data. This course introduces key quantitative approaches to collecting nonreactive data in the social sciences. It is taught through lectures, seminars, and individual work. The goal of the course is to introduce the opportunities that nonreactive and big data offer social scientists and to teach basic methods and tools for collecting nonreactive data. Some R packages will be used for data analysis (R is freely available at https://www.r-project.org).
Learning Objectives

  • to learn basic concepts of nonreactive data in social sciences
  • to be able to collect nonreactive data in social sciences
  • to learn the opportunities and limitations of applying big data in social sciences
  • to be able to apply big data in social sciences
Learning Outcomes

  • to learn basic concepts of nonreactive data in social sciences
  • to learn the opportunities and limitations of applying big data in social sciences
  • to be able to apply big data in social sciences
  • to learn text mining and network analysis in R
  • to be able to collect nonreactive data in social sciences
  • to be able to collect nonreactive data in social media
Course Contents

  • Introduction to the course. Reactive and nonreactive methods
    Reactive and nonreactive methods. Nonreactive online methods. Nonparticipant observation and analysis of “digital footprints”. Big data. The typology of nonreactive data. Social media, clickstream data, tracking data. The opportunities and limitations of big data in social sciences. Ethical concerns.
  • Introduction to text mining and network analysis in R
    R Markdown. Regular expressions and essential string functions. Basic data visualization. Introduction to text mining. Introduction to network analysis: basic definitions, centrality measures, different approaches.
  • Introduction to web scraping in R
    Collecting online data. Collecting unstructured and structured data via R. Scraping web data from APIs.
  • Collecting data from VKontakte, Twitter, and Facebook
    Opportunities and limitations.
Assessment Elements

  • Essay (non-blocking)
  • class attendance (non-blocking)
  • class participation (non-blocking)
Interim Assessment

  • Interim assessment (module 2)
    0.1 * class attendance + 0.2 * class participation + 0.7 * Essay
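
As a worked illustration of the weighted formula above, here is a minimal Python sketch (Python is used only for the arithmetic; the course itself works in R). The weights are taken from the formula. The function name `interim_grade` and the example scores are hypothetical, and the grading scale and any rounding rule are not stated in this syllabus, so none is assumed.

```python
# Weights copied from the formula above; everything else is illustrative.
WEIGHTS = {
    "class attendance": 0.1,
    "class participation": 0.2,
    "Essay": 0.7,
}


def interim_grade(scores):
    """Return the weighted sum of component scores (all on the same scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 8 for attendance, 7 for participation, 9 for the essay.
example = {"class attendance": 8, "class participation": 7, "Essay": 9}
print(round(interim_grade(example), 2))  # 0.8 + 1.4 + 6.3 = 8.5
```

Because the essay carries 70% of the weight, a one-point change in the essay score moves the result by 0.7, while a one-point change in attendance moves it by only 0.1.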
References

Recommended Core Reading

  • Torgo, L. (2017). Data Mining with R: Learning with Case Studies.

Recommended Additional Reading

  • Gillespie, C., & Lovelace, R. (2016). Efficient R Programming: A Practical Guide to Smarter Programming. Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1435808