
Prediction of Functionality of DNA Secondary Structures with Deep Learning Methods

Student: Tepliakova Natalia

Supervisor: Maria Poptsova

Faculty: Faculty of Computer Science

Educational Programme: Data Analysis for Biology and Medicine (Master)

Year of Graduation: 2019

A G-quadruplex is a DNA secondary structure that can be formed by guanine-rich sequences. The human genome contains more than 700,000 sites that can potentially form a G-quadruplex structure. G-quadruplexes have been located in the promoter regions of many genes, where they take part in transcriptional regulation. However, there is no method to predict quadruplex formation in different cell types. In this master's thesis, a deep convolutional neural network was trained to predict histone modifications in different tissue types (median ROC AUC 0.83) from DNA sequence information alone. Analysis of the filters of the first convolutional layer identified motifs, some of which appear to be known transcription factor binding sites. A method for predicting the functional significance of quadruplexes in different tissues is proposed; it evaluates the difference between the neural network's predictions for the original and mutated sequences near potential quadruplex sites. The efficiency of this method was demonstrated on the known G-to-A mutation in the promoter region of c-MYC. The proposed method was then applied to a quadruplex dataset for the myelogenous leukemia cell line K562. Significant differences on the mutation maps were shown to be associated with sites that form quadruplex structures in this cell line. The thesis demonstrates the effectiveness of deep learning methods in recognizing relationships between patterns of DNA secondary structures and the epigenetic code.
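
The idea of comparing predictions for original and mutated sequences can be illustrated with a minimal sketch. It is not the thesis implementation: `toy_model` is a hypothetical stand-in for the trained network (it simply scores the density of `GGG` triplets), and the names `mutation_map` and `BASES` are assumptions introduced here for illustration only.

```python
from typing import Callable, Dict, List

BASES = "ACGT"


def toy_model(seq: str) -> float:
    """Hypothetical stand-in for a trained network (assumption, not the
    thesis model): returns the fraction of 3-mers equal to 'GGG'."""
    windows = max(len(seq) - 2, 1)
    hits = sum(1 for i in range(len(seq) - 2) if seq[i:i + 3] == "GGG")
    return hits / windows


def mutation_map(seq: str,
                 predict: Callable[[str], float]) -> List[Dict[str, float]]:
    """For every position and every alternative base, return
    predict(mutated sequence) - predict(original sequence)."""
    reference = predict(seq)
    rows: List[Dict[str, float]] = []
    for pos, ref_base in enumerate(seq):
        row: Dict[str, float] = {}
        for alt in BASES:
            if alt == ref_base:
                # No substitution, so the prediction cannot change.
                row[alt] = 0.0
            else:
                mutated = seq[:pos] + alt + seq[pos + 1:]
                row[alt] = predict(mutated) - reference
        rows.append(row)
    return rows
```

Calling `mutation_map("AAGGGAA", toy_model)` returns one row per position. With this toy scorer, replacing the central G with A removes the only `GGG` triplet, so that entry is negative, while substitutions that leave the triplet intact score zero. With a real model, `predict` would wrap the network's forward pass for a single histone mark and tissue.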

Student theses at HSE must be completed in accordance with the University's rules and the regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
