• A
  • A
  • A
  • АБB
  • АБB
  • АБB
  • А
  • А
  • А
  • А
  • А
Обычная версия сайта
2020/2021

SQL и автоматизация текущей отчетности

Статус: Дисциплина общефакультетского пула
Когда читается: 1, 2 модуль
Преподаватели: Шершуков Данила Вадимович
Язык: русский
Кредиты: 3
Контактные часы: 32

Программа дисциплины

Аннотация

The course develops practical and theoretical skills for using SQL in analytics and automating reports. The course covers using SQL for extracting and analysing data and using SQL, Python and Excel for report automation. The course consists of 7 classes and an exam. Each class consits of a lecture and a seminar. Each seminar starts with some simple tasks followed by a case study.SQL is an important skill for analysts who work with large volumes of data and thus a competetive advantage for students during job search. This course is designed to provide a lot of practice for students in writing queries, so that they could use SQL fluently at work.In this course, students can choose between Excel and Python for report automation. This is done toaccount for varying levels of students' programming skills and the difference in technological stacksthat analysts use, depending on their specialization. As Python requires significant time to master, related material is distributed between other topics in order not to overwhelm students.As half of the course is seminars with intence usage of computers and a lot of interaction with the class teacher, the class size is limited to 40 people. Students are expected to be familiar with:•basic set operations (union, intersection, subtraction, cartesian product),•descriptive statistics,•Python (recommended) or ExcelDuring enrollment, a test may be held to check that students are familiar with basic set operations and basic descriptive statistics.
Цель освоения дисциплины

Цель освоения дисциплины

  • To teach students how to use SQL for doing analytical tasks.
  • To improve students' skills in report automation
Планируемые результаты обучения

Планируемые результаты обучения

  • use a database client with a GUI
  • become fluent in SELECT SQL queries
  • use basic data definition and data manipulation SQL queries
  • apply SQL to solve real-life analytical problems
  • improve report automation skills by learning to automate reports in Excel or in Python, depending on their starting programming experience
Содержание учебной дисциплины

Содержание учебной дисциплины

  • Introduction to databases and SQL
    Databases and servers. Overview of contemporary SQL servers. Connecting to a database using DBeaver. Relational model. SQL overview. Basic data types and functions that work with them. Type casting. Using the query editor in DBeaver. Structure of SELECT queries with more details ontable expressions and select lists. Connecting to databases from Excel (ODBC driver) and Python (psycopg2).
  • Table expressions in detail
    Joins, set operations and subqueries. Using subqueries in table references, conditions and common table expressions. Set-returning functions. Query planner, EXPLAIN ANALYZE, overview of algorythms*. Visualizing query results (Excel or Python).
  • Select list and domain-specific functions
    CASEs. Regular expressions: complex matches and data extraction. Dates and time: working with timezones, common mistakes when working with dates. Advanced logical functions. Logging in Python.
  • Window functions
    Using aggregate functions as window functions. Lag, rank. Making window functions cumulative. Jinja2 (Python).
  • Complex computing and ETLs
    Common mistakes when computing statistics in SQL and how to avoid them. Handling multistage computations while keeping them comprehensible. Working with event chains: restoring snapshots, computing statistics. DDL and DML. Making ETL processes in SQL and Python.
  • Working with geospatial data
    Postgres extensions. Data types and functions for working with geospatial data.
  • SQL in interviews
    Common techniques used in test assignment. Revision.
Элементы контроля

Элементы контроля

  • неблокирующий home assignment 1
  • неблокирующий exam
  • неблокирующий home assignment 2
  • неблокирующий home assignment 3
  • неблокирующий home assignment 4
  • неблокирующий home assignment 5
  • неблокирующий home assignment 6
Промежуточная аттестация

Промежуточная аттестация

  • Промежуточная аттестация (2 модуль)
    0.6 * exam + 0.066 * home assignment 1 + 0.066 * home assignment 2 + 0.067 * home assignment 3 + 0.067 * home assignment 4 + 0.067 * home assignment 5 + 0.067 * home assignment 6
Список литературы

Список литературы

Рекомендуемая основная литература

  • Beaulieu, A. (2009). Learning SQL : Master SQL Fundamentals: Vol. 2nd ed. O’Reilly Media.
  • Beighley, L. (2007). Head First SQL : Your Brain on SQL —— A Learner’s Guide. Reilly - O’Reilly Media.
  • Malik, U., Goldwasser, M., & Johnston, B. (2019). SQL for Data Analytics : Perform Fast and Efficient Data Analysis with the Power of SQL. Packt Publishing.

Рекомендуемая дополнительная литература

  • Linoff, G. (2016). Data Analysis Using SQL and Excel: Vol. Second edition. Wiley.