Programming for Urban Analytics
- Familiarise students with different types of urban data sources and with the file and database types used to store such data.
- Discuss the origins and associated limitations of various urban data sources.
- Showcase the practices of explanatory data visualisation in urban planning and research.
- Explain the importance of time and space dimensions of urban data.
- Explain how the data is stored and structured.
- Develop basic skills of applying statistical analysis to large and small data sets.
- Teach basic principles of exploratory data analysis.
- Show how to communicate urban data analysis results through explanatory data visualisation.
- Acquire spatial urban data from files, remote servers and databases using R packages, APIs and web scraping.
- Write readable and error-free data analysis code in R that allows a third party to reproduce and interpret the analysis.
- Apply exploratory data analysis (EDA) to reveal time and space variations and patterns in urban data.
- Clean and transform spatial urban data to prepare it for exploratory and statistical analysis.
- Apply linear and spatial regression models and clustering to interpret space-time variations and patterns of urban processes.
- Introduction to Smart Cities and Urban Data
  - Smart city as a concept, as hype, as a marketing phenomenon, and as one of the key causes of the emergence of urban data. City as a corporation vs. city as a living organism. Adaptation of the city to new technologies.
  - Automated data generation and collection. Urban data ubiquity. The origins of urban data. Urban data sources. Traditional urban data (state urban statistics) vs. new data sources.
  - Urban data analysis as part of the daily routines of urban dwellers, geo-marketing specialists and tech companies. Outcomes of data ubiquity for urban researchers, planners and managers.
  - Required skill set for an urban data analyst.
- Introduction to Scripted Data Analysis and Reproducible Research
  - Introduction to scripted data analysis. Point-and-click analysis vs. scripted analysis: head-to-head comparison. Using GUI (graphical user interface) dialog windows vs. calling functions. Importance of reproducible research with motivating examples.
  - R language as a statistical command line analysis tool. R language as a programming language. Why R. Comparison of R, Python, Julia, and a few other tools.
  - Basics of RStudio IDE (Integrated Development Environment). Working with RStudio projects.
  - Reproducible research using R, R Markdown, R Markdown Notebooks, flexdashboard.
  - Basic plotting in R. Basic functions and routines applied to classic datasets (mtcars, cars, iris, etc.). Basic data import.
- Data Visualisation and Exploratory Data Analysis
  - Storytelling with data. Exploratory vs. explanatory analysis. Choosing effective visuals for explanatory analysis. Gestalt principles of visual perception. Spotting bad graphs and maps.
  - Exploratory data analysis (EDA) process and tools. Plots vs. summary statistics.
  - Rorschach protocol. Line-up protocol.
  - R tools for exploratory data analysis. Advanced plotting using ggplot2 and associated tools.
  - Interactive plots in R, the simple way.
  - Plot design layer by layer. Plot customisation according to Gestalt principles of visual perception. Plot optimisation for colour-blind accessibility.
- Urban Data Types and Sources. Getting Access to Data
  - Data sources. Open data. Code books. Means of accessing the data. Working with multiple data sources. Data storage file formats. Databases. Getting data from databases. Intro to getting data from web sources using APIs.
  - Basic types of data, operators, commands, functions. Approaches to working with data using R. Basic data structures. Objects.
  - R object types: vectors, matrices, “data.frames”, “tibbles” and “data.tables”. Lists. Differences between object types and use cases.
  - Exporting data to various formats. Choice of storage file format depending on storage goals.
  - Basic data manipulation using “data.table” and “dplyr”.
- Tidy Data. Data Cleaning and Transformation
  - Wide vs. long data. Data reshaping and manipulation. Shaping data into analysis-appropriate form.
  - Tidy data concept. Data cleaning. String and date manipulation.
  - Regular expressions and their applications for data cleaning. Common pitfalls of regular expressions.
  - Feature creation. Data type conversion.
  - Building algorithms for data processing.
  - Creating custom functions, conditional statements and loops for data processing and visualisation.
- Statistical Modelling
  - Correlation. Simple linear regression. Model fit and interpretation.
  - Multiple regression. Simple feature selection. Parallel slopes models.
  - Simple cluster analysis techniques.
  - A unified framework for application of statistical models in R. Visualisation of model results and performance.
- Spatial Data Analysis and Statistics
  - Basics of working with spatial data in R. Spatial data storage formats and object types. Importing spatial data from various sources.
  - Visualising spatial data in R. Static plotting of spatial data. Interactive maps.
  - Merging and joining spatial data. Spatial data analysis. Geometric operations.
  - Introduction to spatial statistics. Spatial autocorrelation. Spatial segregation. Spatial generalised linear models.
- Working with APIs and Web Scraping
  - Advanced work with APIs. Reading API documentation.
  - Constructing API requests. Processing API responses. Data manipulation for converting API responses into analysis-appropriate form.
  - Building algorithms for automated data retrieval using APIs.
  - Web scraping and related copyright and ethical issues.
  - Simple web scraping techniques. Reshaping of scraped data into analysis-appropriate form.
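
As a rough illustration of the request-and-response workflow named in the last topic, the sketch below builds a query URL and flattens a nested JSON response into one row per record. It is written in Python purely for illustration, while the course itself works in R. The endpoint `api.example.org`, the parameter names and the sample response are all hypothetical, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration (not a real service).
BASE_URL = "https://api.example.org/v1/stations"


def build_request_url(base_url, params):
    """Return the full request URL with URL-encoded query parameters."""
    return f"{base_url}?{urlencode(params)}"


def flatten_response(body):
    """Convert a nested JSON response into a list of flat rows, one per record."""
    payload = json.loads(body)
    rows = []
    for item in payload["results"]:
        rows.append({
            "id": item["id"],
            "name": item["name"],
            "lon": item["location"]["lon"],
            "lat": item["location"]["lat"],
        })
    return rows


# A made-up response body standing in for what such an API might return.
SAMPLE_BODY = """
{"results": [
  {"id": 1, "name": "Central", "location": {"lon": 10.5, "lat": 50.25}},
  {"id": 2, "name": "Riverside", "location": {"lon": 10.75, "lat": 50.5}}
]}
"""

url = build_request_url(BASE_URL, {"city": "Example City", "limit": 2})
rows = flatten_response(SAMPLE_BODY)

print(url)
for row in rows:
    print(row)
```

The flat rows have one key per column, which is the shape a table-oriented analysis tool expects. A real API would also need error handling, authentication and paging, all of which are omitted here.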
- Interim assessment (module 2): 0.3 * Exam + 0.02 * Lab 01 + 0.05 * Lab 02 + 0.05 * Lab 03 + 0.05 * Lab 04 + 0.05 * Lab 05 + 0.08 * Lab 06 + 0.1 * Lab 07 + 0.1 * Lab 08 + 0.1 * Lab 09 + 0.1 * Lab 10
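
The assessment formula can be checked with a short calculation. The sketch below, in Python for illustration only, copies the weights from the formula, confirms that they sum to 1, and computes a grade for one set of scores. The scores and the 10-point scale are assumptions made for the example, not part of the syllabus.

```python
# Weights copied from the interim assessment formula.
WEIGHTS = {
    "Exam": 0.3,
    "Lab 01": 0.02,
    "Lab 02": 0.05,
    "Lab 03": 0.05,
    "Lab 04": 0.05,
    "Lab 05": 0.05,
    "Lab 06": 0.08,
    "Lab 07": 0.1,
    "Lab 08": 0.1,
    "Lab 09": 0.1,
    "Lab 10": 0.1,
}


def interim_grade(scores):
    """Return the weighted sum of component scores; all components are required."""
    missing = sorted(set(WEIGHTS) - set(scores))
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 10-point scale: 8 for every lab, 6 for the exam.
scores = {name: 8 for name in WEIGHTS}
scores["Exam"] = 6

total_weight = sum(WEIGHTS.values())
grade = interim_grade(scores)

print(f"total weight: {total_weight:.2f}")
print(f"interim grade: {grade:.2f}")
```

Because the weights sum to 1, the grade stays on the same scale as the component scores. The exam carries 30% of the grade and the ten labs together carry the remaining 70%.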
- Arbia G. A Primer for Spatial Econometrics: With Applications in R. Basingstoke: Palgrave Macmillan, 2014.
- Munzert S. Automated data collection with R: a practical guide to Web scraping and text mining. Chichester, West Sussex, United Kingdom: Wiley, 2014.
- Pace L., Hlynka M. Beginning R: an introduction to statistical programming. New York: Apress, 2012.
- Peng R.D., Dominici F. Statistical methods for environmental epidemiology with R: a case study in air pollution and health. New York; London: Springer, 2008. 144 p.
- Wickham H. ggplot2: elegant graphics for data analysis. Second edition. Cham: Springer, 2016. 260 p.
- Arbia G. Spatial Econometrics: Statistical Foundations and Applications to Regional Convergence. Springer Science & Business Media, 2006. 220 p.
- Knaflic C.N. Storytelling with data: a data visualization guide for business professionals. New Jersey: Wiley, 2015.
- Offenhuber D., Ratti C. Decoding the city: Urbanism in the age of big data. Birkhäuser, 2014.