Depth Inpainting via Vision Transformer

Student: Borisenko Gleb

Educational Programme: Financial Technology and Data Analysis (Master)

Year of Graduation: 2021

Depth inpainting is a crucial task for working with augmented reality. In previous works missing depth values are completed by convolutional encoder-decoder networks, which is a kind of bottleneck. But nowadays vision transformers showed very good quality in various tasks of computer vision and some of them became state of the art. In this study, we presented a supervised method for depth inpainting by RGB images and sparse depth maps via vision transformers. The proposed model was trained and evaluated on the NYUv2 dataset. Experiments showed that a vision transformer with a restrictive convolutional tokenization model can improve the quality of the inpainted depth map.

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.

Search all student theses