
Decision Tree Based Ensemble Methods and Lattices of Closed Descriptions

Student: Egor Dudyrev

Supervisor: Sergei Kuznetsov

Faculty: Faculty of Computer Science

Educational Programme: Data Science (Master)

Final Grade: 9

Year of Graduation: 2021

Formal Concept Analysis (FCA) is a mathematically founded theory well suited to developing models for knowledge discovery and data mining. However, it is hard to apply in practice, because classic FCA algorithms run in time exponential in the size of the data. Ensembles of decision trees (random forest and gradient boosting over decision trees), on the other hand, combine many trees built by greedy algorithms into fast supervised machine learning models with state-of-the-art prediction accuracy. The ensemble approach, however, makes them hard to interpret: it is difficult to explain why the model output a specific prediction. In this thesis we introduce a new notation that describes ensembles of decision trees within the FCA framework, with the aim of creating a supervised machine learning model that is both fast and highly interpretable. We define a differential decision semilattice model which, as we show, can represent an ensemble of decision trees as a single partially ordered set of decision rules. We also show that decision-based models can be viewed as a decision quiver: a multidigraph with closed descriptions as nodes and decision rules as edges. We thus establish a connection between decision-based models and lattices of closed descriptions and, consequently, concept lattices. Finally, we present ideas for further development of the decision semilattice model that could potentially increase both its interpretability and its prediction accuracy.
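To make the notion of a closed description concrete, here is a minimal Python sketch of the standard FCA closure operator on a toy binary context. The context data and the names `CONTEXT`, `extent`, `closure`, and `refine` are illustrative assumptions, not taken from the thesis. Reading a refinement step as a "decision rule" edge between two closed descriptions is likewise only one possible interpretation of the abstract, not the thesis's formal definition.

```python
from typing import Dict, FrozenSet, Tuple

# Toy formal context (hypothetical data): object -> set of binary attributes.
CONTEXT: Dict[str, FrozenSet[str]] = {
    "g1": frozenset({"a", "b"}),
    "g2": frozenset({"a", "b", "c"}),
    "g3": frozenset({"b", "c"}),
    "g4": frozenset({"c"}),
}
ALL_ATTRIBUTES: FrozenSet[str] = frozenset().union(*CONTEXT.values())


def extent(description: FrozenSet[str]) -> FrozenSet[str]:
    """Objects that have every attribute in the description."""
    return frozenset(g for g, attrs in CONTEXT.items() if description <= attrs)


def closure(description: FrozenSet[str]) -> FrozenSet[str]:
    """Attributes shared by all objects in the description's extent.

    A description is closed when it equals its own closure.
    """
    objects = extent(description)
    if not objects:
        # By convention, a description matching no object closes to all attributes.
        return ALL_ATTRIBUTES
    return frozenset.intersection(*(CONTEXT[g] for g in objects))


def refine(description: FrozenSet[str], attribute: str) -> Tuple[FrozenSet[str], FrozenSet[str]]:
    """Illustrative edge: from a closed description to the closed description
    reached by additionally requiring `attribute`."""
    source = closure(description)
    target = closure(source | {attribute})
    return source, target


if __name__ == "__main__":
    print(sorted(closure(frozenset({"a"}))))       # closure of {a}
    print([sorted(d) for d in refine(frozenset(), "a")])
```

In this toy context every object with attribute `a` also has `b`, so the closure of `{a}` picks up `b` as well. Enumerating all closed descriptions this way is what becomes exponential on real data, which is the practical obstacle the abstract refers to.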

Full text (added May 24, 2021)

