
Combining and Analyzing Complex Data

Academic Year: 2019/2020
Language of instruction: English
ECTS credits: 3
Course type: Elective course
When: Year 1, Module 4

Instructor

Course Syllabus

Abstract

The course teaches how to use survey weights to estimate descriptive statistics, such as means and totals, and more complex quantities, such as the parameters of linear and logistic regression models. It also covers the basics of record linkage and statistical matching, both of which are becoming more important as ways of combining data from different sources. Combining datasets raises ethical issues, which the course reviews: informed consent may have to be obtained from individuals before their data can be linked, and students learn how the legal requirements differ between countries. The course is provided by the University of Maryland. The full course description is available here: https://www.coursera.org/learn/data-collection-analytics-projec

After completing the course, students prepare a short two-page essay summarizing the main lessons from the course. The formal requirements for the Master Thesis and Term Paper apply to the essay; no other format is accepted. The essay is graded and counts for 30% of the final grade. After submitting the essay, students take a 30-minute oral group examination with up to 5 students. The oral group examination counts for 70% of the final grade. Each student is assessed individually during the oral group examination.
Learning Objectives

  • Ability to design market research tools
Expected Learning Outcomes

  • Skills in survey design and tool development
  • Skills in data collection from different sources
Course Contents

  • Basic Estimation
    After completing Modules 1 and 2 of this course, you will understand how to estimate descriptive statistics, overall and for subgroups, when working with survey data. We will review software for estimation (R, Stata, SAS), with examples of how to estimate quantities such as means, proportions, and totals. You will also learn how to estimate parameters in linear, logistic, and other models, and learn the software options, with an emphasis on R. Modules 3 and 4 discuss how you can add further data to your analysis. This requires knowing about record linkage techniques and what it takes to get permission to link data.
  • Models
    Module 2 covers how to estimate linear and logistic model parameters using survey data. After completing this module, you will understand how the methods used differ from the ones for non-survey data. We also cover the features of survey data sets that need to be accounted for when estimating standard errors of estimated model parameters.
  • Record Linkage
    This module starts with the current debate on using more (linked) administrative records in the U.S. Federal Statistical System and a general motivation for linking records. Several examples will show why it is useful to link data. The challenges of record linkage will be discussed, and a brief overview of key linkage techniques is included as well.
  • Ethics
    This module will discuss key issues in obtaining consent to record linkage. Failure to consent can lead to biased estimates. Current research examples will be given, as well as practical suggestions on how to obtain linkage consent.
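As a small illustration of the Basic Estimation topic above, the sketch below estimates a population total and mean from survey weights in plain Python. It is not taken from the course materials: the function names `weighted_total` and `weighted_mean` and the sample data are hypothetical, and the code assumes each weight is the number of population units a respondent represents.

```python
def weighted_total(values, weights):
    """Estimate a population total as the sum of weight * value."""
    return sum(w * y for y, w in zip(values, weights))


def weighted_mean(values, weights):
    """Estimate a population mean as the weighted total over the sum of weights."""
    return weighted_total(values, weights) / sum(weights)


# Hypothetical sample: three respondents' incomes and their survey weights.
incomes = [10.0, 20.0, 30.0]
weights = [100.0, 50.0, 50.0]

total_hat = weighted_total(incomes, weights)
mean_hat = weighted_mean(incomes, weights)

print(f"Estimated total: {total_hat}")
print(f"Estimated mean:  {mean_hat}")
```

With these made-up numbers the weighted mean (17.5) differs from the unweighted mean (20.0) because the first respondent represents more of the population. The sketch covers point estimates only; standard errors additionally depend on design features such as those mentioned under Models.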
Assessment Elements

  • non-blocking Essay
  • non-blocking Final oral group examination
    The exam is planned as an oral group examination held online on the Zoom platform. Students should log in 20 minutes before the exam session. A temporary internet outage of up to 10 minutes is tolerated. If the outage lasts longer, the student should send a written request to the course director, copying the study office manager, for a decision on rescheduling the exam to another date with different exam questions.
Interim Assessment

  • Interim assessment (Module 4)
    0.3 * Essay + 0.7 * Final oral group examination
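The weighting formula above can be worked through with a short sketch. The function name `final_grade` and the example marks are hypothetical, and the syllabus does not state the grading scale or any rounding rule, so none is applied here.

```python
def final_grade(essay, oral_exam):
    """Combine the two assessment elements: 30% essay, 70% oral group exam."""
    return 0.3 * essay + 0.7 * oral_exam


# Hypothetical marks: 8 for the essay, 6 for the oral group examination.
example = final_grade(8, 6)
print(f"Final grade: {example:.1f}")
```

Because the oral examination carries 70% of the weight, the result (about 6.6 in this example) sits closer to the oral mark than to the essay mark.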
Bibliography

Recommended Core Bibliography

  • Archilla, A. R., & Madanat, S. (2001). Estimation of Rutting Models by Combining Data from Different Sources. Journal of Transportation Engineering, 127(5), 379. https://doi.org/10.1061/(ASCE)0733-947X(2001)127:5(379)
  • Hensher, D., Louviere, J., & Swait, J. (1998). Combining sources of preference data. Journal of Econometrics, 89(1–2), 197–221. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsrep&AN=edsrep.a.eee.econom.v89y1998i1.2p197.221
  • Camfield, L. (2018). Rigor and Ethics in the World of Big-team Qualitative Data: Experiences From Research in International Development. https://doi.org/10.1177/0002764218784636

Recommended Additional Bibliography

  • Conti, G., & Badin, G. (2019). Statistical Measures and Selective Decay Principle for Generalized Euler Dynamics. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1907.05069