Bachelor's programme
2020/2021
Data Analysis in Sociology
Status:
Compulsory course (Sociology and Social Informatics)
Field of study:
39.03.01. Sociology
Delivered by:
Department of Sociology
Where:
Saint Petersburg School of Social Sciences
When:
4th year, module 3
Mode of study:
without an online course
Instructors:
Shirokanova Anna Aleksandrovna
Language:
English
Credits:
4
Contact hours:
32
Course Syllabus
Abstract
Data Analysis in Sociology, taught in the 4th year, focuses on developing a conscious, systematic approach to data analysis and on resolving long-standing problems and anxieties about doing data analysis. The course concludes with a discussion of data culture in day-to-day research.
Learning Objectives
- The course covers the foundations and popular techniques of quantitative data analysis with the goal of training students to be informed producers and consumers of quantitative research.
- Two specific goals of this course are to systematize the principles of data analysis for all standard problems and to alleviate long-standing anxieties related to any part of the data analysis cycle.
Expected Learning Outcomes
- Students can apply a theoretical framework to define hypotheses and explain the results of a study; they can apply appropriate visualization to communicate the results.
- Students can generalize and analyze the data they have, assess it critically, express their own opinions, and give their interpretation of the best possible decision.
- Students can set research goals, propose a research plan based on the results of previous research and social theory, carry out data analysis and report the results.
Course Contents
- Best Practices in Data Wrangling
  Building data acumen: making meaningful, correct and useful judgments about data. Privacy and ethical concerns in data analysis and research. Data culture areas: data life-cycle, data curation, understanding causality, understanding conditional and joint probabilities, false negatives and false positives, critical assessment of popular practices and further use of R functionalities to make sense of the data. The data life-cycle: generation, collection, processing, management, analysis, visualization, interpretation, and delivery.
- Data Management
  Getting and cleaning data in R. Importing data from various formats. Applying standard operations in an industrial setting. Good practices in recoding, rescaling, reordering, discretizing, and renaming. Packages for quick calculations and interactive visualizations. Data curation. Providing reproducible results with documented code and simulations. Delivering results in applications. Data simulation for hypothesis testing. Transforming data values to simplify the structure. Research questions and data types.
- Communicating Data Analysis Results
  Learning from data. Effective communication to decision-makers. Common best practices in science communication. Understanding the message and audience. Choosing the best visualization. Visualizing complex data: distributions, change over time, correlations. Common pitfalls in data visualization. Customizing graphs and reports in R.
Assessment Elements
- Written Exam
  The exam consists of four problems involving the methods covered in this course.
- Test 1
  If a student has a valid reason to miss the test, the student should inform the instructor before the test. Documents confirming the reason for absence must be presented no later than two weeks after the test; otherwise, they will not be considered.
- Test 2
  If a student has a valid reason to miss the test, the student should inform the instructor before the test. Documents confirming the reason for absence must be presented no later than two weeks after the test; otherwise, they will not be considered.
- Practice engagement
- Homework tasks
  Individual practice files should be submitted as knitted R Markdown files (HTML) via the MS Teams Assignments section (don't forget to click the 'Turn In' button). If a student fails to knit their own script, the mark for the submission is halved.
- Online projects
Interim Assessment
- Interim assessment (module 3)
  0.2 * Homework tasks + 0.1 * Online projects + 0.3 * Practice engagement + 0.1 * Test 1 + 0.1 * Test 2 + 0.2 * Written Exam
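As an illustration only, the weighted sum in the interim assessment formula can be sketched in Python. The weights are taken from the formula; the function name `interim_grade`, the example marks, and the 10-point scale they imply are assumptions made for this sketch and are not stated in the syllabus. Any rounding rule applied to the final grade is likewise not specified here, so none is applied.

```python
# Weights copied from the interim assessment formula; they sum to 1.0.
WEIGHTS = {
    "Homework tasks": 0.2,
    "Online projects": 0.1,
    "Practice engagement": 0.3,
    "Test 1": 0.1,
    "Test 2": 0.1,
    "Written Exam": 0.2,
}


def interim_grade(marks):
    """Return the weighted sum of the component marks.

    `marks` maps each assessment element to its mark. All marks are
    assumed to be on the same scale (hypothetical: 0 to 10).
    """
    missing = set(WEIGHTS) - set(marks)
    if missing:
        raise ValueError(f"Missing marks for: {sorted(missing)}")
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)


# Hypothetical marks for one student, for illustration only.
example_marks = {
    "Homework tasks": 8,
    "Online projects": 6,
    "Practice engagement": 9,
    "Test 1": 7,
    "Test 2": 5,
    "Written Exam": 8,
}

if __name__ == "__main__":
    print(round(interim_grade(example_marks), 2))
```

With these hypothetical marks the components contribute 1.6 + 0.6 + 2.7 + 0.7 + 0.5 + 1.6, so the unrounded result is 7.7. Practice engagement carries the largest single weight (0.3), more than the written exam (0.2).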
Bibliography
Recommended Core Bibliography
- Inter-university Consortium for Political and Social Research. (2012). Guide to Social Science Data Preparation and Archiving: Best Practice Throughout the Data Life Cycle. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.AA22F59E
- Knaflic, C. N. (2015). Storytelling with Data: A Data Visualization Guide for Business Professionals. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1079665
- Wickham, H., & Grolemund, G. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1440131
Recommended Additional Bibliography
- Yau, N. (2013). Data Points: Visualization That Means Something. New York: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=566405