
Recognition of Patterns of Association of DNA Secondary Structures and Epigenetic Code by Machine-Learning Methods

Student: Kubaeva Assol

Supervisor: Maria Poptsova

Faculty: Faculty of Computer Science

Educational Programme: Applied Mathematics and Information Science (Bachelor)

Year of Graduation: 2020

This thesis addresses the recognition of patterns of association between G-quadruplexes and the histone mark H3K27ac using machine learning methods, including deep learning. G-quadruplexes are guanine-rich, four-stranded DNA secondary structures. According to many studies, quadruplexes play a significant role in the regulation of gene expression, and their abnormal formation or failure to form in the genome can be associated with various diseases. The histone mark H3K27ac is an epigenetic factor traditionally associated with enhancers, short regions of DNA that enhance the transcription of genes. The aim of this work is to build a model that identifies regions of the genome containing patterns of association between quadruplexes and the histone mark H3K27ac. The thesis considers classical machine learning models (random forest and gradient boosting) and deep learning models (a convolutional neural network and neural networks of mixed architecture). Deep learning models are shown to perform better than classical machine learning models. The work also demonstrates the applicability of deep learning approaches to recognizing patterns of association between DNA secondary structures and epigenetic marks.
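As a rough illustration of the kind of input such models work with, the sketch below scans a DNA sequence for a widely used putative G-quadruplex motif (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) and one-hot encodes the sequence, a common input representation for convolutional networks. The motif definition, the function names `find_g4_candidates` and `one_hot_encode`, and the example sequence are assumptions made for illustration; the summary does not state how quadruplexes were annotated or how sequences were encoded in the thesis.

```python
import re

# Commonly used putative G-quadruplex motif: four G-runs (>= 3 guanines)
# separated by loops of 1-7 nucleotides. Illustrative assumption only;
# the thesis may define or annotate quadruplexes differently.
G4_MOTIF = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

# One row per base, in A, C, G, T column order.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def find_g4_candidates(sequence):
    """Return (start, end) spans of non-overlapping motif matches.

    Only the given strand is scanned; the reverse complement is ignored.
    """
    return [match.span() for match in G4_MOTIF.finditer(sequence.upper())]


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows.

    Bases outside A/C/G/T (for example N) are encoded as all zeros.
    """
    return [ONE_HOT.get(base, [0, 0, 0, 0]) for base in sequence.upper()]


# Hypothetical example: four telomere-like TTAGGG repeats.
example = "TTAGGGTTAGGGTTAGGGTTAGGGAA"
spans = find_g4_candidates(example)
encoded = one_hot_encode(example)
```

In a pipeline of this kind, the encoded matrix would typically be the input to a convolutional network, while the motif spans or experimentally derived quadruplex annotations would be intersected with H3K27ac peaks to form labels; both steps are outside the scope of this sketch.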

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
