Master 2019/2020

Data Visualization

Type: Elective course (Big Data Systems)
Area of studies: Business Informatics
Delivered by: Department of Innovation and Business in Information Technologies
When: Year 1, modules 3 and 4
Mode of studies: offline
Instructors: Petr Panfilov
Master’s programme: Big Data Systems
Language: English
ECTS credits: 4
Contact hours: 48

Course Syllabus

Abstract

Data and information visualization is the graphical communication of data and information for the purposes of presentation, confirmation, exploration, and analysis. Images can be used to convey numbers, concepts, and relationships using techniques such as maps, icons, graphs, and other visual forms. In the past decade, visualization has evolved into a discipline, drawing from such fields as computer graphics, human-computer interaction, perceptual psychology, and art. The emphasis of the course will be on exposing students to the current research issues and on identifying potential research topics in data visualization as it applies to large-scale big data systems.

Learning Objectives

  • To introduce students to the fundamental problems, concepts, and approaches in the design and analysis of data visualization systems.
  • To familiarize students with the stages of the visualization pipeline, including data modeling, mapping data attributes to graphical attributes, perceptual issues, existing visualization paradigms, techniques, and tools, and evaluating the effectiveness of visualizations for specific data, task, and user types.

Expected Learning Outcomes

  • know the history of data visualization and its connection with computer graphics
  • understand the visualization pipeline with its relationship to other data analysis pipelines
  • know the definition(s) of visualization and interpretations of the notion
  • know categories of visualization and application areas
  • understand the foundations and characteristics of data, which form the beginning of the visualization pipeline
  • understand the types of transformation that data can undergo to improve the effectiveness of the visualization
  • know various types of data
  • understand the human component of the visualization pipeline, the characteristics of the perceptual system, and the roles it plays in understanding and interpreting visualizations
  • know approaches to understanding visual perception, cognitive issues, and the recognition of visuals
  • know the basics of the physiology of the human visual system, perceptual processing, and the human cognition framework for data visualization
  • understand the foundations of the visualization processes, from basic building blocks to taxonomies and frameworks
  • understand the visualization pipeline
  • know the techniques and systems developed to date
  • know the visualization techniques, loosely grouped by data characteristics
  • know the methods and algorithms used to map data to graphical depictions
  • know which spatial attributes of data map to spatial attributes (locations) on the screen
  • understand the techniques that have been applied to spatial data
  • understand the characteristics and methods that are needed for the visualization of geospatial data
  • know state-of-the-art techniques, geographic information systems (GIS) and cartography tools used for geospatial data visualization
  • describe the methods and algorithms used to map data to graphical depictions
  • know different temporal data visualization techniques represented in TimeViz Browser
  • use TimeBench, a data model and software library for visualization and visual analytics for time-oriented data
  • understand graphical primitives used in the rendering and techniques that combine two or more of these types of primitives
  • know the methods and algorithms used to map data to graphical depictions of relational information and to display hierarchical structures
  • understand the graphical primitives used in the rendering and the techniques that visualize data to convey relational information
  • understand fundamental computational approaches to transforming unstructured text into structured data suitable for visualization and analysis
  • demonstrate knowledge of basic visualizations for document collections data such as node graphs, ThemeRiver, Calendar View
  • understand the role of user interaction within visualizations
  • know the classes of interactive operations and define them in terms of operators and operands
  • know the interactive visualization architecture that combines the interaction spaces into a single pipeline
  • know a wide range of interaction techniques and styles
  • know algorithms and implementation details for interaction concepts
  • understand the visualization design process
  • understand the design considerations for the components of a good visualization
  • know the steps in designing a visualization
  • understand problems found in visualizations and techniques for avoiding these problems
  • know principles and guidelines to improve the effectiveness of specific visualizations
  • know techniques for evaluating the resulting visualizations
  • understand components and procedures necessary to assess and compare the effectiveness of visualization techniques
  • know the variety of available visualization systems and toolkits
  • identify their key features and observed limitations
  • know some commercial data visualization packages and their functionality
  • understand the trends in the data visualization field of research and application development
  • know the web sources of up-to-date information on hot topics in data visualization research and development
  • identify directions for future work to advance the knowledge of the field

Course Contents

  • Introduction
    A high-level introduction to data visualization: what visualizations are and why imagery is so important. The reasons for using visualization are discussed. Applications of visualizations to problem solving are shown. The process of visualization is discussed. The visualization pipeline is presented with its relationship to other data analysis pipelines.
  • Data foundation
    Every visualization starts with the data that is to be displayed, so a first step in addressing the design of visualizations is to examine the characteristics of the data.
  • Human Perception and Information Processing
    Focus on human perception and the different ways in which graphics and images are seen and interpreted. The early approach to the study of perception focused on the vision system and its capabilities. Later approaches looked at cognitive issues and recognition.
  • Visualization foundations
    The review of the visualization pipeline and the discussion on various ways to view the multitudes of techniques and systems that have been developed to date.
  • Visualization Techniques for Spatial Data
    Spatial data visualization assumes that the data has an implicit or explicit spatial or spatiotemporal attribute. This constraint facilitates both the creation and interpretation of the visualization, as there is an intuitive, and often straightforward, mapping of the data attributes to graphical attributes of the entities conveying the information in the visualization.
  • Visualization Techniques for Geospatial Data
    An overview of the special characteristics and methods needed for the visualization of geospatial data, sometimes called geovisualization. The topic introduces the most important basics of geospatial visualization, such as map projections, and discusses visualization techniques for point, line, area, and surface data.
  • Visualization Techniques for Time-Oriented Data
    The importance of handling the temporal dimension is discussed. The needed concepts and aspects of time and time-oriented data are defined. When considering data we focus particularly on the characteristics of time. Then an overview of different temporal data visualization techniques is given, together with an explanation of TimeBench, a data model and software library for visualization and visual analytics of time-oriented data.
  • Visualization Techniques for Multivariate Data
    Discussion on techniques for the visualization of data that does not generally have an explicit spatial attribute. We organize the presentations based on the graphical primitive used in the rendering, namely, points, lines, or regions, followed by techniques that combine two or more of these types of primitives.
  • Visualization Techniques for Trees, Graphs, and Networks
    Another important application of visualization is the conveying of relational information, e.g., how data items or records are related to each other. These interrelationships can take many forms: part/subpart, parent/child, or other hierarchical relation. Relationships can be simple or complex: unidirectional or bi-directional, nonweighted or weighted, certain or uncertain. Indeed, the relationships may provide more and richer information than that contained in the data records.
  • Text and Document Visualization
    Visualization is a great aid in analyzing data from libraries, e-mail archives, and all facets of applications running on the World Wide Web. We can visualize in many different ways things such as a blog, a wiki, a Twitter feed, billions of words, a collection of papers, or a digital library. Since visualizations are task dependent, we can look at what tasks are necessary for dealing with text, documents, or web-based objects.
  • Interaction Concepts
    Interaction within the data and information visualization context is a mechanism for modifying what the users see and how they see it. Many classes of interaction techniques exist. We describe a framework for interaction techniques, identifying distinct classes and shared concepts that will help facilitate discussions and focus future research.
  • Interaction Techniques
    Examples from each of the interaction spaces, and guidelines for implementing some of the user interaction dialogs needed to specify some of the types of interaction. Note that many of the algorithms discussed here could actually be applied to interactions in multiple spaces.
  • Designing Effective Visualizations
    Some guidelines for designing successful visualizations. A successful visualization is one that efficiently and accurately conveys the desired information to the targeted audience, while bearing in mind the task or purpose of the visualization (exploration, confirmation, presentation). For any particular set of data there is a myriad of possible methods for mapping data components to graphical entities and attributes. Similarly, there exists a wide range of interactive tools that the user may be provided. Selecting the most effective combinations of techniques is by no means a straightforward process.
  • Comparing and Evaluating Visualization Techniques
    Some of the components and procedures necessary to assess and compare the effectiveness of visualization techniques. We outline a procedure for designing a benchmark process that would enable a person charged with comparing two or more visualization techniques to approach the problem in a methodical way.
  • Visualization Systems
    An overview of a number of data and information visualization systems and toolkits. We have concentrated primarily on software that is freely available, to enable students interested in exploring further in the field of visualization to try out existing technology.
  • Research Directions in Visualization
    We identify and elaborate upon some of the common themes within agenda reports. Interested students are directed to the original reports for in-depth descriptions, as well as identification of research areas specific to particular branches of the ever-enlarging field of visualization.

Assessment Elements

  • In Class Activity, non-blocking
  • Referate (Individual Study), partially blocking the final grade
    The study material is based on an analysis of one or two recent research papers on the topic.
  • Homework Assignment (Group Project), partially blocking the final grade
    The project involves programming or the use of existing tools to visualize selected datasets.
  • Final Examination, partially blocking the final grade
    Examination format: The examination is a written essay taken with asynchronous proctoring. Asynchronous proctoring means that all of the student's actions during the exam are “watched” by the computer. The exam process is recorded and analyzed by artificial intelligence and by a human proctor. Please be careful and follow the instructions exactly.
    The platform: The exam is conducted on the StartExam platform, an online platform for conducting test tasks of various levels of complexity. The link to the exam task will be available to the student in RUZ. The computer must meet the following technical requirements: https://eduhseru-my.sharepoint.com/:b:/g/personal/vsukhomlinov_hse_ru/EUhZkYaRxQRLh9bSkXKptkUBjy7gGBj39W_pwqgqqNo_aA?e=fn0t9N
    Student requirements: The student is expected to do the following:
      • prepare an identification document (a passport opened to the page with name and photo) for identification before the examination task begins;
      • check the microphone, speakers or headphones, webcam, and Internet connection (we recommend connecting the computer to the network with a cable, if possible);
      • prepare the necessary writing equipment, such as pens, pencils, and sheets of paper;
      • close all applications on the computer other than the browser that will be used to log in to StartExam.
    If one of the requirements for participation in the exam cannot be met, the student must inform the program manager 7 days before the exam date so that a decision can be made on the student's participation in the exam.
    Important rules: All rules are set out in the regulations for exams that use asynchronous proctoring technology in intermediate certification.
    Connection failures: A short-term connection failure is the loss of the student's network connection with the StartExam platform for no longer than 5 minutes per exam. A long-term connection failure is the loss of the student's network connection with the StartExam platform for longer than 5 minutes per exam, and it is grounds for a decision to terminate the exam. In case of a long-term connection failure during the examination task, the student must record the fact of the failure (a screenshot or a response from the Internet provider) and then contact the program manager with an explanatory note about the incident so that a decision can be made on retaking the exam.
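The 5-minute threshold for connection failures can be illustrated with a small Python sketch. It assumes that "per exam" means the total time offline across all outages in one exam sitting; the regulations quoted here do not spell this out, so treat that reading, and the function and constant names, as hypothetical.

```python
from datetime import timedelta

# Threshold stated in the syllabus: up to 5 minutes is short-term.
SHORT_TERM_LIMIT = timedelta(minutes=5)


def classify_connection_failure(outages):
    """Classify the connection loss during one exam.

    `outages` is a list of timedelta values, one per outage.
    Assumption: the 5-minute limit applies to the summed duration.
    """
    total_lost = sum(outages, timedelta(0))
    if total_lost == timedelta(0):
        return "none"
    if total_lost <= SHORT_TERM_LIMIT:
        return "short-term"
    return "long-term"


if __name__ == "__main__":
    # Two outages of 2 and 4 minutes add up to 6 minutes offline.
    print(classify_connection_failure([timedelta(minutes=2), timedelta(minutes=4)]))
```

Under this reading, a "long-term" result is the case in which the student should keep evidence of the failure and contact the program manager.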

Interim Assessment

  • Interim assessment (module 4)
    0.2 * Final Examination + 0.4 * Homework Assignment (Group Project) + 0.1 * In Class Activity + 0.3 * Referate (Individual Study)
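
As a worked illustration of the weighted formula above, the following Python sketch computes the interim grade from the four assessment elements. The weights come from the syllabus. The sample scores and the 10-point scale are assumptions made for the example, and the sketch does not model rounding or the blocking rules.

```python
# Weights taken from the interim assessment formula.
WEIGHTS = {
    "Final Examination": 0.2,
    "Homework Assignment (Group Project)": 0.4,
    "In Class Activity": 0.1,
    "Referate (Individual Study)": 0.3,
}


def interim_grade(scores):
    """Return the weighted sum of the scores for all four elements."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical scores on an assumed 10-point scale.
    sample = {
        "Final Examination": 8,
        "Homework Assignment (Group Project)": 9,
        "In Class Activity": 10,
        "Referate (Individual Study)": 7,
    }
    # 0.2*8 + 0.4*9 + 0.1*10 + 0.3*7 = 1.6 + 3.6 + 1.0 + 2.1
    print(round(interim_grade(sample), 2))
```

Because the group project carries the largest weight (0.4), a one-point change there moves the interim grade twice as much as a one-point change in the final examination.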

Bibliography

Recommended Core Bibliography

  • Ward, M., Grinstein, G. G., & Keim, D. (2015). Interactive Data Visualization: Foundations, Techniques, and Applications (2nd ed.). Boca Raton: A K Peters/CRC Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1763678

Recommended Additional Bibliography

  • Bertamini, M., & Kubovy, M. (2018). Human Perception. Routledge.
  • Brath, R., & Jonker, D. (2015). Graph Analysis and Visualization: Discovering Business Opportunity in Linked Data. Wiley.
  • Cao, N., & Cui, W. (2016). Introduction to Text Visualization. Atlantis Press.
  • Dimara, E., & Perin, C. (2020). What is Interaction for Data Visualization? https://doi.org/10.1109/TVCG.2019.2934283
  • Federico, P., & Miksch, S. (2016). Evaluation of two interaction techniques for visualization of dynamic graphs.
  • Geospatial data and knowledge on the Web: Knowledge-based geospatial data integration and visualisation with Semantic Web technologies. (2020). [Lund University, Faculty of Science, Department of Physical Geography and Ecosystem Science].
  • Ghinea, M., Frunza, D., Chardonnet, J.-R., Merienne, F., & Kemeny, A. (2018). Perception of Absolute Distances Within Different Visualization Systems: HMD and CAVE.
  • Grinstein, G., Sieg, J. C., Jr., Smith, S., & Williams, M. G. (1992). Visualization for Knowledge Discovery. International Journal of Intelligent Systems, 7(7), 637–648. https://doi.org/10.1002/int.4550070706
  • Hauser, H., Rheingans, P., & Scheuermann, G. (2018). Foundations of Data Visualization (Dagstuhl Seminar 18041). https://doi.org/10.4230/DAGREP.8.1.100
  • He, X., Tao, Y., Wang, Q., & Lin, H. (2019). Multivariate Spatial Data Visualization: A Survey. https://doi.org/10.1007/s12650-019-00584-3
  • Kuntal, B. K., & Mande, S. S. (2017). Web-igloo: a web based platform for multivariate data visualization. Bioinformatics (Oxford, England), 33(4), 615–617. https://doi.org/10.1093/bioinformatics/btw669
  • Logre, I., & Dery-Pinna, A.-M. (2018). MDE in Support of Visualization Systems Design: a Multi-Staged Approach Tailored for Multiple Roles. https://doi.org/10.1145/3229096
  • Alharbi, M., & Laramee, R. S. (2019). SoS TextVis: An Extended Survey of Surveys on Text Visualization. Computers, 8(1), 17. https://doi.org/10.3390/computers8010017
  • Patterson, D., Hicks, T., Dufilie, A., Grinstein, G., & Plante, E. (2015). Dynamic Data Visualization with Weave and Brain Choropleths. PLOS ONE, 10(9), e0139453. https://doi.org/10.1371/journal.pone.0139453
  • Plaisant, C., Fekete, J.-D., & Grinstein, G. (2008). Promoting Insight-Based Evaluation of Visualizations: From Contest to Benchmark Repository. https://doi.org/10.1109/TVCG.2007.70412
  • Rind, A. (2017). A software framework for visual analytics of time-oriented data.
  • Tominski, C. (2015). Interaction for Visualization. Morgan & Claypool Publishers.
  • Uchida, Y., Xinyun, M., Matsuno, S., Iha, Y., & Sakamoto, M. (2017). Preliminary Study on a System for Visualization of Big Data in SMEs.
  • Walny, J., Frisson, C., West, M., Kosminsky, D., Knudsen, S., Carpendale, S., & Willett, W. (2019). Data Changes Everything: Challenges and Opportunities in Data Visualization Design Handoff.
  • Wang, C., & Tao, J. (2017). Graphs in Scientific Visualization: A Survey. Computer Graphics Forum, 36(1), 263–287. https://doi.org/10.1111/cgf.12800
  • Purgathofer, W., & Löffelmann, H. (1997). Selected New Trends in Scientific Visualization.
  • Aigner, W., Miksch, S., Müller, W., Schumann, H., & Tominski, C. (2008). Visual Methods for Analyzing Time-Oriented Data.