Year of Graduation
Making Sense of Genomic Data With Machine Learning Methods
Applied Mathematics and Information Science
Z-DNA is an alternative form of the DNA molecule that plays a vital role in the regulation of gene expression. However, its precise biological properties remain unknown. Reliable recognition of Z-DNA is needed both to annotate the human genome and to study how Z-DNA associates with other functional elements at the whole-genome scale. Existing thermodynamic algorithms for Z-DNA recognition yield too many false positives to be usable for genome annotation. In this work, we propose more efficient machine learning algorithms for Z-DNA recognition. We consider various ways of extracting features from a DNA sequence and assess the prediction accuracy of the algorithms.