Year of Graduation
CRISPR cassettes: reconstruction of symbol sequences from observed subsequences with bioinformatics applications
School of Applied Mathematics and Information Science
This study examines CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) cassettes, which are specialized regions of the genome. CRISPR systems serve a defensive function and share similar structural features. The work was performed as part of a joint project with A. Lipnyagov, supported by RTCB IITP RAS, and analyzes existing information about CRISPR systems using mathematical approaches. The paper examines in detail the construction of a special object for every group of CRISPR cassettes with similar structural characteristics. The problem of building such an object belongs to the class NP, and the study describes heuristic approaches for solving it on real data, in particular a 4-approximation algorithm. It was also necessary to find all exact solutions of this problem, and the study describes approaches and methods for doing so using several sequence-based graph algorithms.
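As an illustration of the kind of task named in the title, the following minimal sketch reconstructs one sequence of symbols from several observed subsequences. It is not the method of the study: it assumes that every symbol (for example, a spacer identifier) occurs at most once in each observed sequence and that the observed orders do not contradict each other, in which case a topological sort of the precedence graph is enough. The function names `reconstruct` and `is_subsequence` and the sample identifiers are hypothetical.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+


def reconstruct(observed):
    """Return one sequence that contains every observed list as a subsequence.

    Assumption (not from the study): symbols are unique within each observed
    list and the observed orders are mutually consistent. Returns None when
    the orders conflict, i.e. the precedence graph has a cycle.
    """
    sorter = TopologicalSorter()
    for seq in observed:
        for sym in seq:
            sorter.add(sym)        # register the symbol even if it stands alone
        for a, b in zip(seq, seq[1:]):
            sorter.add(b, a)       # a must appear before b
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


def is_subsequence(sub, full):
    """Check that the items of `sub` appear in `full` in the same order."""
    it = iter(full)
    return all(sym in it for sym in sub)


# Hypothetical observed fragments of one cassette.
observed = [["s1", "s2", "s4"], ["s2", "s3", "s4"], ["s1", "s3"]]
result = reconstruct(observed)
```

In this simplified setting each symbol appears exactly once in the result, so the result is also as short as possible. With repeated symbols or conflicting orders the sketch returns `None`, and heuristic or exact search of the kind the study describes would be needed instead.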