
Recognition of Quadruplexes by Methods of Deep Learning in the Mouse Genome

Student: Burdanova Sofya

Supervisor: Maria Poptsova

Faculty: Faculty of Computer Science

Educational Programme: Applied Mathematics and Information Science (Bachelor)

Final Grade: 9

Year of Graduation: 2020

G-quadruplexes (G4) are guanine-rich nucleic acid sequences capable of forming four-stranded structures. Quadruplexes are actively studied, but the precise rules governing their formation are still unknown, so identifying G-quadruplexes with deep learning methods plays a significant role in research on this phenomenon. Solving this problem with machine learning is complicated by class imbalance in the sample. The aim of this work is to generate quadruplexes with a Generative Adversarial Network (GAN) and add the generated data to the training set in order to improve the quality of various machine learning models for quadruplex recognition. The study reports results for four GAN variants (simple GAN, WGAN, WGAN-GP and LSGAN) on six generator-discriminator combinations of convolutional networks. A CNN reached a maximum accuracy of 0.96. Of all the implemented architectures, GEN2 + DISC1 with WGAN-GP gave the best and most stable result for this problem. This research is one of the pioneering studies applying Generative Adversarial Networks to genomics problems, and in particular to quadruplex recognition.
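
The augmentation idea described in the summary can be illustrated with a minimal sketch. It is not the thesis code: the function names `one_hot` and `augment_positives`, the A/C/G/T channel order, and the toy sequences are all assumptions made for illustration, and the `generated` list stands in for the output of a trained GAN generator.

```python
from typing import List, Tuple

BASES = "ACGT"  # assumed channel order for the one-hot encoding


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA string as one 4-element row per base (A, C, G, T).

    Characters outside ACGT (for example N) become an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def augment_positives(
    positives: List[str],
    negatives: List[str],
    generated: List[str],
) -> List[Tuple[str, int]]:
    """Add generated positives until classes balance or the pool runs out.

    Returns (sequence, label) pairs, with label 1 for quadruplex
    and 0 for non-quadruplex.
    """
    deficit = max(0, len(negatives) - len(positives))
    extra = generated[:deficit]  # take only as many as are needed
    labelled = [(s, 1) for s in positives + extra]
    labelled += [(s, 0) for s in negatives]
    return labelled


# Toy placeholder sequences, not data from the thesis.
real_pos = ["GGGTGGG"]
real_neg = ["ACGT", "TTTT", "CACA"]
gan_out = ["GGGAGGG", "GGGCGGG", "GGGTTGG"]  # hypothetical generator output

train = augment_positives(real_pos, real_neg, gan_out)
encoded = [(one_hot(seq), label) for seq, label in train]
```

In this sketch the minority class is topped up only to parity with the majority class; whether the thesis used that ratio is not stated in the summary.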

Full text (added May 20, 2020)

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
