Master
2019/2020
Data Mining
Category 'Best Course for Career Development'
Category 'Best Course for Broadening Horizons and Diversity of Knowledge and Skills'
Category 'Best Course for New Knowledge and Skills'
Type:
Elective course (Applied Statistics with Network Analysis)
Area of studies:
Applied Mathematics and Informatics
Delivered by:
International laboratory for Applied Network Research
When:
1st year, modules 1 and 2
Mode of studies:
offline
Master’s programme:
Applied Statistics with Network Analysis
Language:
English
ECTS credits:
4
Contact hours:
48
Course Syllabus
Abstract
Covers topics in data mining, including visualization techniques, elements of machine learning theory, classification and regression trees, Generalized Linear Models, Spline approach, and other related topics.
Learning Objectives
- The course gives students an important foundation to develop and conduct their own research as well as to evaluate research of others.
Expected Learning Outcomes
- Know well-known sequential pattern mining methods, such as GSP, SPADE, PrefixSpan, and CloSpan.
- Know various pattern mining applications, such as mining spatiotemporal and trajectory patterns and mining quality phrases.
- Know efficient pattern mining methods, such as Apriori, ECLAT, and FP-growth.
- Know constraint-based pattern mining, including methods for pushing different kinds of constraints, such as data and pattern-based constraints, anti-monotone, monotone, succinct, convertible, and multiple constraints.
- Be able to recall important pattern discovery concepts, methods, and applications, in particular, the basic concepts of pattern discovery, such as frequent pattern, closed pattern, max-pattern, and association rules.
- Be able to compare pattern evaluation measures, especially several popular ones, such as lift, chi-square, cosine, Jaccard, and Kulczynski, and their comparative strengths.
- Be able to compare methods for mining diverse patterns, including multi-level and multi-dimensional patterns, qualitative patterns, negative correlations, compressed and redundancy-aware top-k patterns, and long (colossal) patterns.
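As an illustration of the pattern evaluation measures named above, the following minimal sketch computes lift, cosine, Jaccard, and Kulczynski for two itemsets A and B from transaction counts. The function name `pattern_measures` and all counts are hypothetical and chosen only for the example; chi-square is omitted to keep the sketch short. It is not taken from the course materials.

```python
from math import sqrt


def pattern_measures(n_a, n_b, n_ab, n_total):
    """Interestingness measures for itemsets A and B.

    n_a, n_b  : number of transactions containing A, containing B
    n_ab      : number of transactions containing both A and B
    n_total   : total number of transactions
    """
    p_a, p_b, p_ab = n_a / n_total, n_b / n_total, n_ab / n_total
    return {
        # Ratio of observed co-occurrence to that expected under independence
        "lift": p_ab / (p_a * p_b),
        # Geometric mean of the conditional probabilities P(A|B) and P(B|A)
        "cosine": p_ab / sqrt(p_a * p_b),
        # Co-occurrence relative to transactions containing A or B
        "jaccard": p_ab / (p_a + p_b - p_ab),
        # Arithmetic mean of the conditional probabilities P(A|B) and P(B|A)
        "kulczynski": 0.5 * (p_ab / p_a + p_ab / p_b),
    }


# Hypothetical counts: same A/B overlap, different numbers of
# transactions that contain neither A nor B.
small = pattern_measures(n_a=1000, n_b=1000, n_ab=500, n_total=10_000)
large = pattern_measures(n_a=1000, n_b=1000, n_ab=500, n_total=1_000_000)

for name in small:
    print(f"{name:>10}: {small[name]:.3f} vs {large[name]:.3f}")
```

With these made-up counts, lift changes when only the number of transactions containing neither A nor B grows, while cosine, Jaccard, and Kulczynski stay the same. This is one way to compare the strengths of the measures.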
Course Contents
- Introduction: Course Orientation; Course Pattern Discovery Overview; Pattern Discovery Basic Concepts; Efficient Pattern Mining Methods; Pattern Discovery
- Pattern evaluation: The session sets up the framework for pattern evaluation and mining diverse frequent patterns. It also addresses Sequential Pattern Mining; Pattern Mining Applications; Mining Spatiotemporal and Trajectory Patterns.
- Pattern mining I: The session gives an overview of pattern-based mining, graph pattern mining, and pattern-based classification.
- Pattern mining II: This session builds the understanding of Pattern Mining Applications: Mining Quality Phrases from Text Data; Advanced Topics on Pattern Discovery.
- Cluster analysis: Cluster Analysis Overview; Cluster Analysis Introduction; Similarity Measures for Cluster Analysis
- Clustering Methods I: This session will continue the topic of clustering with Partitioning-Based Clustering Methods; Hierarchical Clustering Methods.
- Clustering Methods II: Hierarchical Clustering Methods (continued); Density-Based and Grid-Based Clustering Methods
- Clustering Methods III: This session will conclude clustering with methods for clustering validation.
Assessment Elements
- Final take-home project
- Homework Assignments (5 x Varied points)
- In-Class Labs (9-10 x Varied points)
- Quizzes (Best 9 of 10, Varied points)
Interim Assessment
- Interim assessment (module 2): 0.5 * Final take-home project + 0.2 * Homework Assignments (5 x Varied points) + 0.2 * In-Class Labs (9-10 x Varied points) + 0.1 * Quizzes (Best 9 of 10, Varied points)
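The weighting above can be sketched as a short calculation. This is only an illustration under stated assumptions: every component score is assumed to be already normalized to one common scale, the quiz component is assumed to be the mean of the best nine normalized quiz scores, and the names `WEIGHTS`, `interim_grade`, and `quiz_component` as well as all scores are hypothetical. The syllabus does not specify how the "varied points" are normalized.

```python
# Weights taken from the interim assessment formula in the syllabus.
WEIGHTS = {"project": 0.5, "homework": 0.2, "labs": 0.2, "quizzes": 0.1}


def quiz_component(quiz_scores, keep=9):
    """Mean of the best `keep` quiz scores (assumes normalized scores)."""
    best = sorted(quiz_scores, reverse=True)[:keep]
    return sum(best) / len(best)


def interim_grade(scores):
    """Weighted sum of component scores, all on the same (assumed) scale."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Hypothetical scores on an assumed common scale.
quizzes = quiz_component([10, 9, 8, 7, 6, 5, 4, 3, 2, 0])  # lowest quiz dropped
example = {"project": 8, "homework": 7, "labs": 9, "quizzes": quizzes}
print(round(interim_grade(example), 2))
```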
Bibliography
Recommended Core Bibliography
- ElAtia, S., Ipperciel, D., & Zaiane, O. R. (2017). Data Mining and Learning Analytics: Applications in Educational Research. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1351385
- Han, J., & Kamber, M. (2011). Data Mining: Concepts and Techniques (3rd ed.). Burlington, MA: Morgan Kaufmann. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=377411
- Larose, D. T., & Larose, C. D. (2015). Data Mining and Predictive Analytics. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=958471
- Mourya, S. K., & Gupta, S. (2013). Data Mining and Data Warehousing. [N.p.]: Alpha Science International Limited. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1688519
Recommended Additional Bibliography
- Brown, M. S. (2014). Data Mining For Dummies. Hoboken: For Dummies. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=842663
- Knobbe, A. J. (2006). Multi-relational Data Mining. Amsterdam: IOS Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=176061
- Motoda, H. (2002). Active Mining: New Directions of Data Mining. Amsterdam: IOS Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=87558