Revealing Diversity and Specialization in Scientific Publications by Natural Language Processing Methods
Governance of Science, Technology and Innovation
This work addresses the division of cognitive labour and two important aspects of science as a phenomenon: diversity and specialization. It demonstrates that models of the division of cognitive labour can be used to model scientific activity, which in turn can serve as a basis for developing new methods for the governance of science, technology, and innovation. This requires new tools for recovering the configuration of science and for extracting patterns in the research strategies of participants in the scientific process, because existing bibliometric methods, such as citation and co-citation analysis, cannot be applied in real time. I propose using natural language processing techniques to recover the structure of scientific disciplines. I demonstrate, and support with qualitative assessments, that the TF-IDF algorithm combined with k-means++ automatic classification (clustering) makes it possible to construct quantitative methods for recovering the "specialization" and "diversity" of a scientific field from the raw texts of its publications.
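The pipeline named in the abstract (TF-IDF weighting followed by k-means++ clustering of raw publication texts) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the regex tokenizer, the smoothed IDF variant, the number of clusters, and the function names `tfidf`, `sq_dist`, `kmeanspp_init`, and `kmeans`. The sketch stops at cluster labels and does not attempt to compute "specialization" or "diversity", since those measures are not defined in this text.

```python
import math
import random
import re
from collections import Counter


def tfidf(docs):
    """Return one L2-normalised sparse TF-IDF vector (term -> weight) per text."""
    tokenised = [re.findall(r"[a-z]+", doc.lower()) for doc in docs]
    n = len(tokenised)
    # Document frequency: number of texts in which each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Assumed smoothed IDF variant: log((1 + n) / (1 + df)) + 1.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def kmeanspp_init(vectors, k, rng):
    """k-means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(vectors)]
    while len(centres) < k:
        d2 = [min(sq_dist(v, c) for c in centres) for v in vectors]
        if sum(d2) == 0:  # all points coincide with a centre already
            centres.append(rng.choice(vectors))
        else:
            centres.append(rng.choices(vectors, weights=d2, k=1)[0])
    return centres


def kmeans(vectors, k, iters=20, seed=0):
    """Lloyd iterations started from k-means++ seeds; returns cluster labels."""
    rng = random.Random(seed)
    centres = kmeanspp_init(vectors, k, rng)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centre for every vector.
        labels = [min(range(k), key=lambda j: sq_dist(v, centres[j])) for v in vectors]
        # Update step: each centre becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if not members:
                continue  # keep the old centre for an empty cluster
            total = {}
            for m in members:
                for t, w in m.items():
                    total[t] = total.get(t, 0.0) + w
            centres[j] = {t: w / len(members) for t, w in total.items()}
    return labels


if __name__ == "__main__":
    # Hypothetical toy texts standing in for publication abstracts.
    texts = [
        "Citation analysis of journal networks",
        "Co-citation networks and journal analysis",
        "Protein folding simulation with molecular dynamics",
        "Molecular dynamics of protein folding",
    ]
    print(kmeans(tfidf(texts), k=2))
```

In practice a library implementation (for example scikit-learn's `TfidfVectorizer` and `KMeans` with `init="k-means++"`) would normally replace this hand-written version; the pure-Python form is used here only to keep each step visible.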