Goal of research
The aim of the work is to develop and study new methods for the analysis and verification of automata-based models of information systems as well as new approaches to simulation, visualization, synthesis, repair and conformance checking of complex information system models according to examples of their observed behavior registered in event logs.
Methodology
Theory of finite transducers and computation tree logic (CTL) fragments and extensions are used to develop new methods for the analysis and verification of automata-based information system models. Theory of formal languages, Petri nets and their extensions are used to develop new approaches to analyze the observed behavior of distributed information systems.
Empirical base of research
The specific examples of real distributed information systems and event logs containing history of their execution constitute the empirical base within this research project.
Results of research
The following results have been achieved in the area of the analysis and verification of automata-based information system models:
The transducer equivalence problem has been studied. Within this problem, a class of prefix transducers has been identified. It has been proven that there exists an algorithm of checking the equivalence of real-time transducers. This algorithm has the quadratic complexity with respect to the size of automata. It has also been proven that there exists an algorithm of checking the equivalence of transducers with epsilon-transitions. This algorithm has the cubic complexity with respect to the size of automata. In addition, a linear translation of deterministic two-tape automata into prefix transducers has been shown. Thus, there exists an algorithm of checking the equivalence of deterministic two-tape automata. The proposed algorithms are based on the new method for solving systems of linear language equations.
The problem of checking the equivalence of non-deterministic transducers operating over a unary input alphabet has been studied. It has been proven that if projections of transducers on the input alphabet are deterministic automata, then the problem of checking the equivalence of transducers is decidable.
The satisfiability problem for the regular extension of the computation tree logic Reg‑CTL* has been examined within sequential reactive systems. It has been proven that this problem is PSPACE-complete.
The following results have been achieved in the area of simulation, visualization, synthesis, repair and conformance checking of complex information system models according to examples of their observed behavior registered in event logs:
The problem of comparing BPMN (Business Process Model and Notation) process models has been studied. Within this task, the existing heuristic algorithms, including tabu search algorithm, colony optimization algorithm and simulated annealing algorithm, have been adapted. It has been shown that these algorithms allow to find the minimal editing distance between two BPMN-models. Experimental evaluation results show that these algorithms work significantly faster in comparison with the exact algorithms of finding the minimal editing distance.
A new algorithm for simulating Petri nets with reset and inhibitor arcs has been proposed. This algorithm has been implemented within an open-source prototype software that records the observed behavior of a Petri net with reset and inhibitor arcs in a corresponding event log.
Transition firing priorities are one of the techniques to control Petri net behavior to provide their liveness and boundedness. These properties are necessary since most of real information systems are characterized with a limited number of available resources. The existing algorithm for computing transition priorities has been implemented. This algorithm has been tested with models of complex distributed information systems.
Typical asynchronous agent interaction patterns have been identified to model multi-agent systems. These patterns have been modeled using Petri net composition. Semantic properties of a corresponding composition have also been investigated. It has been shown how to apply these asynchronous interaction patterns to discover structured models of multi-agent information systems from their event logs.
A new technique to check conformance between information system models and event logs has been proposed. This technique evaluates not only complete, but also partial correspondence between sequences of observed and model behavior. Within this technique a special conformance checking measure has also been developed. It has been proven that this measure meets the monotonicity property, i.e. the more the level of partial correspondence between observed and model behavior is, the bigger the value of the conformance checking measure is.
Exploitation of information systems often results in changes to the organization of its processes. There has been studied a problem of identifying fragments of BPMN-models that still conform with the observed behavior from event logs. The proposed methods is based on the identification and analysis of dependencies between process events.
There has been made an improvement to the existing method of Petri net decomposition for repairing its fragments with respect to event logs. The proposed improvement is based on constructing decomposition graph, showing the maximal Petri net decomposition, and on the minimal spanning tree including all fragments to be repaired.
There has been studied a problem of constructing a formal model of a real stock trading system according to its description. This is a high-level Petri net model describing behavior stock trading system components regarding the way how stock exchange orders to buy and sell securities are processed. To simulate this model, a prototype software has been implemented which registers system behavior in terms of order book dynamics.
To preprocess raw data extracted form stock trading systems, a specific technique has been proposed. This technique transforms event logs making their automated processing possible in order synthesize formal models of stock trading systems.
To monitor process of managing criminal cases, an original technique has been developed. It allows to identify anomalies which are forbidden by the corresponding regulations. This method is based on event log analysis. There has been constructed a formal model of criminal case management process considering requirements imposed by the regulations. It has been shown how to identify anomalies caused by time non-compliance and looping with the help of event logs.
A new method for storing event logs in autonomous relational databases has been proposed. It is based on using SQLite RDBMS software. These event logs are called SQLite-EventLogs. This method is implemented as a component library in LDOPA and evaluated on the set of artificial and real event logs. Experiment results in comparing methods for storing event logs have shown that SQLite-EventLogs are more efficient than a conventional approach to store event logs.
The B-tree data structure has been examined, and there has been proposed a modification called B*+-trees. They are used to index tables in relational databases. There has been developed an extension to SQLite software which implements indexing with the help of B*+-trees. The main advantage to B*+-trees is the optimization of processing event logs represented by SQLite-EventLog.
There has been proposed a new method for the automated synthesis of UML activity diagrams from event logs. This methods is based on the parameterized synthesis scheme including three sequential stages of model transformation – from an event log to a UML activity diagram.
Level of implementation, recommendations on implementation or outcomes of the implementation of the results
The proposed theoretical methods of the analysis and verification of automata-based information system models are mainly focused on their practical implementation. The proposed methods of simulation, visualization, synthesis, repair and conformance checking of process models have been evaluated using event logs of real information systems.
Additional results
In 2019 the PhD thesis of the PAIS Lab researcher A.A. Mitsyuk “Structure-Preserving Process Model Repair Based on Event Logs” has been successfully defended at the National Research University Higher School of Economics, Dissertation Council in Computer Science (scientific advisor: the head of PAIS Lab I.A. Lomazova).
The paper “Monotone Conformance Checking for Partially Matching Designed and Observed Processes” by the leading PAIS Lab researcher A.A. Kalenkova and A.A. Polyvyanyy (University of Melbourne, Australia) has been awarded a best paper award at the international conference “International Conference on Process Mining (ICPM 2019)” which took place on June 24-26, 2019 in Aachen (Germany).
The paper “A Method to Improve Workflow Net Decomposition for Process Model Repair” by the PAIS Lab researcher A.A. Mitsyuk and the research assistant S.E. Tikhonov has been awarded a best paper award at the international conference “Analysis of Images, Social Networks and Texts (AIST 2019)” which took place on July 17-19, 2019 in Kazan (Russia).