Master's programme 2017/2018

Research Seminar "Intelligent Systems and Structural Analysis"

Status: Elective course (Data Science)
Field of study: 01.04.02 Applied Mathematics and Informatics
Delivered in: 2nd year, modules 1-3
Mode of study: Full time
Programme: Data Science
Language: English
Credits: 6

Course Syllabus


The goal of the course is to develop the professional skills students need for independent analytical work in applied areas of computer science. The course also aims to improve students' skills in developing research projects related to dialogue systems and chat bots. It focuses on the analysis of scientific and industrial approaches to developing linguistic systems and encourages students to attend scientific colloquia at the university, especially at the Faculty of Computer Science.
Learning Objectives

  • The research seminar should help students form the basic skills needed to conduct and present their own research, and motivate them to engage in scientific activity.
Expected Learning Outcomes

  • Know the basic principles of developing task-oriented linguistic dialogue systems.
  • Formulate the task and goals for independent research and/or the development of a scientific programming system.
  • Prepare a presentation based on their research and/or scientific programming system.
  • Know the main principles of social bots.
  • Know the main principles of task-oriented bots.
  • Know the fundamental approaches to natural language understanding and dialogue management in task-oriented dialogue systems.
  • Know the basic principles of assuring chat bot relevance at the syntactic level.
  • Know the basic principles of Q/A for bots.
  • Know the basic principles of discourse-level structures.
  • Know the basic principles of building a taxonomy and thesaurus for chat bots.
  • Know the basic principles of the chat bot content processing pipeline.
  • Know the basic principles of managing rhetorical agreement in dialogue utterances.
  • Know the basic principles of discourse-level dialogue management.
  • Know the basic principles of argumentation for chat bots.
Course Contents

  • A basic chat bot
    • Building transactional chatbots with Api.ai
    • Building an FAQ chatbot with Microsoft QnA Maker
    • A chatbot with rule-based dialogue management
  • Social Bots
    • Main principles
  • Task-oriented Bots
    • Main principles
  • NL Understanding
    • Introduction to NLP and NLU
  • Assuring chat bot relevance at syntactic level
    • Syntactic generalization in search and relevance assessment
    • Generalizing portions of text
    • Generalizing at various levels: from words to paragraphs
    • Equivalence transformation on phrases
    • Simplified example of generalization of sentences
    • From syntax to inductive semantics
    • Nearest-neighbor learning of generalizations
    • Syntactic generalization-based search engine and its evaluation
    • User interface of search engine
    • Qualitative evaluation of search
    • Evaluation of web search relevance improvement
    • Evaluation of product search
    • Comparison with other means of search relevance improvement
    • Evaluation of text classification problems
    • Comparative performance analysis in text classification domains
    • Example of recognizing meaningless sentences
    • Commercial evaluation of text similarity improvement
  • Q/A for Bots: Semantic headers and semantic skeletons
  • Learning Discourse-level structures
    • Answering paragraph-size questions
    • From sentence-level to paragraph-level generalization
    • Rhetoric structures and speech acts as inter-sentence links
    • Adapting RST for multi-sentence search
    • Adapting Speech Act Theory for multi-sentence search
    • Parse thickets and their graph representation
    • Equivalence transformation of phrases
    • Finding similarity between two paragraphs of text
    • How coreferences help search recall
    • How rhetoric relations improve search accuracy
    • Thicket phrases and their generalization
    • Example of a parse thicket
    • Generalization of parse thickets
    • Generalization for RST arcs
    • Generalization for CA arcs
    • Computing maximal common sub-PTs
    • Architecture of PT processing system
    • Evaluation of PT-supported search relevance
    • Evaluation settings
    • Pair-wise sentence generalization for question-answer similarity
    • Single-sentence query and answer distributed through multiple sentences
    • Query is a paragraph and answer is a paragraph
    • Phrase-based and graph-based implementation of generalization
    • Comparison of search performance with other studies
  • Building taxonomy and thesaurus for chat bots
    • Improving search relevance by taxonomies
    • Must-occur keywords
    • Must-occur keywords in a taxonomy
    • Constructing relevance score function
    • Examples of filtering answers based on taxonomy
    • Taxonomy-based algorithm for filtering search results
    • Building taxonomies by web mining
    • Building taxonomy by generalizing search results
    • Practical considerations
    • Evaluation of search relevance improvement by taxonomies
    • Evaluation settings of search relevance improvement
    • Vertical search
    • Web search relevance improvement
    • Taxonomy-supported search engine in news domain
    • Taxonomies for query expansion
    • Using search in Similarity component
    • Running taxonomy learner
  • Chat bot content processing pipeline
    • From search to personalized recommendations
    • A content pipeline and its relevance-related problems
    • Content pipeline architecture
    • Content processing engines
    • Content processing units
    • Harvesting unit
    • Content mining unit
    • Taxonomy unit
    • Opinion mining unit
    • De-duplication unit
    • Search Engine Marketing unit
    • Speech recognition semantics unit
    • Search unit
    • Personalization unit
    • Generalization of texts
    • Simplified example of generalization of sentences
    • Sample generalization between phrases
    • Tree Kernel approach for text similarity
    • Phrase-level generalization
    • Generalization of expressions of interest
    • Personalization algorithm as intersection of likes
    • Mapping categories of interest / taxonomies
    • Defeasible logic programming-based rule engine
    • Content pipeline algorithms
    • Taxonomy construction algorithm
    • De-duplication algorithms
    • Sentiment analysis algorithm
    • Search engine marketing ad construction algorithm
  • Managing Rhetorical Agreement in Dialogue Utterances
    • Communicative Discourse Trees
    • Representing rhetorical relations and communicative actions
    • Greedy representations for a Q/A pair
    • Communicative actions and their generalization
    • Generalization for RST relations
    • Representing a Request-Response chain
    • Classification settings for Request-Response pairs
    • Nearest Neighbor graph-based classification
    • Thicket Kernel learning for CDT
    • Implementation of Rhetorical Agreement classifier
    • Discourse Structure-Driven Dialogue Management
    • Maintaining cohesive session flow in a chat bot
    • Personalized Domain Exploration Scenarios
    • Navigation with the Extended Discourse Tree
    • Recognizing valid and invalid R-R pairs
    • CDT Construction Task
    • Managing dialogues and question answering
    • Analytical approaches to RR Agreement
    • Rhetorical relations and argumentation
  • Discourse-level Dialogue management
    • Finding Answers with Optimal Rhetoric Representation
    • Adjusting rhetoric representation of answer to that of a question
    • Maintaining a sequence of discourse trees
    • Identifying rhetoric correlation
    • Building Dialogue Structure from Discourse Tree of a Query
    • Maintaining communicative discourse for Q and A
    • Learning complement relation
  • Data for chat bot training
  • Argumentation for chat bot
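
As an illustration of the topic "A chatbot with rule-based dialogue management" listed under "A basic chat bot", the following is a minimal sketch and is not taken from the course materials. It assumes a hypothetical pizza-ordering task; the slot names, regular expressions, prompts, and function names are all invented for the example. The dialogue state is a dictionary of filled slots, and a single rule decides the next reply: ask for the first missing slot, otherwise confirm.

```python
import re

# Hypothetical slots for an illustrative pizza-ordering task (assumption).
SLOT_PATTERNS = {
    "size": re.compile(r"\b(small|medium|large)\b"),
    "topping": re.compile(r"\b(cheese|mushroom|pepperoni)\b"),
}

# Hypothetical prompts the bot uses to request a missing slot (assumption).
PROMPTS = {
    "size": "What size would you like?",
    "topping": "Which topping would you like?",
}


def update_state(state, utterance):
    """Return a copy of the state with any empty slot filled from the utterance."""
    new_state = dict(state)
    text = utterance.lower()
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match and slot not in new_state:
            new_state[slot] = match.group(1)
    return new_state


def next_reply(state):
    """Rule: ask for the first missing slot; if none is missing, confirm the order."""
    for slot in SLOT_PATTERNS:
        if slot not in state:
            return PROMPTS[slot]
    return "Ordering a {size} {topping} pizza.".format(**state)


def respond(state, utterance):
    """Process one user turn and return the updated state and the bot reply."""
    state = update_state(state, utterance)
    return state, next_reply(state)
```

Calling `respond` once per user turn and passing the returned state back in keeps the dialogue manager free of global state, which makes the rules easy to test turn by turn.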
Assessment Elements

  • Presentation (non-blocking)
    Progress report on the programming project. Speaking time is no more than 15 minutes.
  • Programming project (non-blocking)
    Report on the programming project: individual paper report and group presentation.
Interim Assessment

  • Interim assessment (module 2)
    The final mark is calculated as O_final = 1 · O_project. It also requires a final report on the project and a public defense of the project in the form of a presentation.
Bibliography

Recommended Core Reading

  • Manning C. D., Schütze H. Foundations of statistical natural language processing. – 1999. – 719 pp.

Recommended Additional Reading

  • Perkins J. Python text processing with NLTK 2.0 cookbook. – Packt Publishing Ltd, 2010. – 336 pp.