
Reduction of dimensionality for the indicators that have a mixed structure

Student: Yuliya Shalimova

Supervisor: Elena R. Goryainova

Faculty: School of Applied Mathematics and Information Science

Educational Programme: Bachelor

Final Grade: 9

Year of Graduation: 2014

With rapid technological change, solving practical problems requires processing ever larger amounts of data, so dimensionality reduction techniques are becoming increasingly important. This thesis focuses on factor analysis, a family of dimensionality reduction techniques which assume that the observed feature values of the objects under study are governed by a smaller number of hidden causes called common factors. The dimensionality of the system under study can therefore be reduced from the number of features to the number of these factors.

Factor analysis assumes that the features depend linearly on the common factors and, consequently, on each other. Factor analysis methods are therefore theoretically unjustified, or even inapplicable, when the dependence between features is nonlinear.

The aim of this thesis is to adapt one factor analysis method, the maximum likelihood method (MLM), to nonlinearly dependent features. Achieving this aim requires the following tasks:

1. Implementing the MLM for the factor analysis model;
2. Modifying this method to increase its efficiency for nonlinearly dependent features;
3. Comparing the efficiency of the traditional and modified methods on simulated and real data.

An iterative algorithm implementing the MLM of factor analysis has been developed. Two methods for determining the number of common factors have also been implemented.

To adapt the method to nonlinear dependence, it is proposed to replace the sample correlation coefficient with dependence measures that are more informative in such cases: Spearman's rank correlation coefficient and Cramér's coefficient.

The concept of efficiency has been formalized for these methods so that their performance can be compared. Using this definition, the efficiency of the methods has been compared on simulated data with different types of dependence and on real data.

The experiment with simulated data shows that the traditional MLM and its modification based on Spearman's rank correlation work well for monotonic dependence and are equally efficient on such data, but they do not work for nonmonotonic dependence. The third method, based on Cramér's coefficient, works for both monotonic and nonmonotonic dependence, but for monotonic dependence it is less efficient than the other two.

Weekly price increase rates for certain foodstuffs were chosen as the real data. Each of the three methods reduces the dimensionality of these data from eleven to three. The modification based on Spearman's rank correlation turns out to be the most efficient for these data, followed by the traditional method; the method based on Cramér's coefficient is the least efficient.
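The behaviour of the three dependence measures named in the abstract can be illustrated with a small sketch. It is not the thesis implementation. It assumes that "Cramér's coefficient" refers to Cramér's V computed on a contingency table of binned values. The equal-frequency binning, the bin count of 5, the simulated data, and all function names are assumptions made for illustration only.

```python
import math
import random


def pearson(x, y):
    """Sample (Pearson) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def quantile_bins(values, bins):
    """Label each value with an equal-frequency bin index in 0..bins-1."""
    order = sorted(range(len(values)), key=values.__getitem__)
    labels = [0] * len(values)
    for position, index in enumerate(order):
        labels[index] = position * bins // len(values)
    return labels


def cramers_v(x, y, bins=5):
    """Cramér's V from a bins-by-bins contingency table (assumed binning)."""
    n = len(x)
    table = [[0] * bins for _ in range(bins)]
    for i, j in zip(quantile_bins(x, bins), quantile_bins(y, bins)):
        table[i][j] += 1
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    chi2 = 0.0
    for i in range(bins):
        for j in range(bins):
            expected = row[i] * col[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (bins - 1)))


# Simulated data (assumed): one monotonic and one nonmonotonic dependence.
rng = random.Random(0)
n = 2000
x = [rng.uniform(-1, 1) for _ in range(n)]
y_mono = [math.exp(2 * a) + rng.gauss(0, 0.01) for a in x]
y_nonmono = [a * a + rng.gauss(0, 0.01) for a in x]

for name, y in (("monotonic", y_mono), ("nonmonotonic", y_nonmono)):
    print(f"{name:>12}: pearson={pearson(x, y):+.2f} "
          f"spearman={spearman(x, y):+.2f} cramers_v={cramers_v(x, y):.2f}")
```

On data like these, the Pearson and Spearman coefficients are close to zero for the nonmonotonic case, while Cramér's V stays clearly above zero for both cases, which matches the qualitative findings reported above. Cramér's V is unsigned and depends on the chosen binning, so a matrix of such values is only a stand-in for a correlation matrix.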

Full text (added June 5, 2014) (1.10 Kb)

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
