Year of Graduation
Comparison of methods for divisive clustering
School of Applied Mathematics and Information Science
The graduation paper "Comparison of methods for divisive clustering" belongs to the field of data analysis, specifically clustering. The work comprises 43 pages, including 8 tables and 11 figures, and cites 18 sources.

The main aim of this work is to compare different methods for divisive clustering: the concept method, methods based on the distance between objects, and methods based on projections. To achieve this, the author had to:
· study and program the algorithms,
· program a generator of cluster structure,
· conduct experiments and analyze their results.

The paper consists of an introduction, four main chapters, concluding remarks and a bibliography. The introduction states the problem, explains its topicality, identifies the subject and the object of the research, and states its practical novelty and scientific importance. The first chapter defines the basic concepts: clustering, hierarchical clustering and Ward distance. The second chapter gives an extensive theoretical description of the methods under study. The third chapter describes the experiments in full: it presents the generator of cluster structure, gives a short review of the methods, and explains the criteria used to evaluate the results. The last chapter presents and analyzes the results of the experiments for different data parameters and different methods. The conclusion summarizes the paper and outlines future work.

To achieve the main goal, a generator of cluster structure was programmed; it allows different data sets to be modeled. Experiments on these synthetic data showed that in some cases the accuracy of the concept method is comparable with that of the other methods. In such cases the concept method has a clear advantage, because it yields easily interpretable categories. However, applying this method to an arbitrary data set is not yet advisable.
Future work will include more complex data sets in order to give better-grounded recommendations.