Using Synthetic Data for Domain Adaptation and Improving Supervised Models in Open Set and Closed Set Tasks

Student: Goriachko Viktor

Supervisor: Stanislav N. Fedotov

Faculty: Faculty of Computer Science

Educational Programme: Data Science (Master)

Year of Graduation: 2021

This thesis presents a method for expanding a dataset with generated images to improve supervised models on various computer vision tasks. Modern generative models produce high-quality images, but their estimate of the data distribution is inaccurate, so simply adding generated data does not improve the model. We turn this weakness to our advantage by generating the hardest samples for a supervised neural network. In the domain adaptation setup, accuracy on the new domain improves by 3%, and on the other domains it increases slightly, by 1-2%. On the whole dataset, the proposed method is comparable with the state of the art. Hard synthetic faces look quite similar, yet their identities are easily distinguished. The triplet loss function uses them as negatives for the anchor. This loss penalizes the model for a large distance between examples of the same class and for a small distance between examples of different classes. Hard negative examples shift the model's decision boundaries in the multidimensional feature space. In the closed set task, generative networks reproduce each class of a small dataset precisely, so we found the hard-negative method useless in that setup. There, hard samples can instead serve as positives for the anchors, and this approach improves accuracy only slightly, particularly when real data is limited. Researchers can easily apply the proposed method on top of any existing model or new network. Our work provides a detailed analysis of the metrics, the models, and their latent space. The results show that data from generative models can be used to improve supervised models in the context of imbalanced sampling, domain adaptation, and limited labeled data.
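The role of hard negatives in the triplet loss described above can be illustrated with a minimal sketch. It is not the thesis implementation: the Euclidean distance, the margin value of 0.2, and the function names `triplet_loss` and `hardest_negative` are assumptions made here for illustration, and the embeddings are plain lists of floats rather than network outputs.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed value, not taken from the thesis).
    """
    d_pos = euclidean(anchor, positive)  # same identity: should be small
    d_neg = euclidean(anchor, negative)  # different identity: should be large
    return max(0.0, d_pos - d_neg + margin)


def hardest_negative(anchor, candidates):
    """Pick the candidate negative closest to the anchor in embedding space."""
    return min(candidates, key=lambda c: euclidean(anchor, c))


# Toy 2-D embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
easy_negative = [3.0, 4.0]   # far from the anchor: contributes no loss
hard_negative = [0.0, 1.1]   # nearly as close as the positive: contributes loss

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
chosen = hardest_negative(anchor, [easy_negative, hard_negative])
```

In this toy setting the easy negative yields zero loss and therefore no gradient signal, while the hard negative yields a positive loss. This is the general reason hard negatives, synthetic or real, can move the boundaries in feature space.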

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
