Alexander Sergeevich Shevchenko
End-to-End Training of Deep Structured Models
Applied Mathematics and Computer Science
Over the last few years, machine learning systems that combine the expressive power of neural networks with the output structure introduced by a graphical model have achieved impressive results on a variety of practical tasks, including image segmentation, pose estimation, and handwritten character recognition. A common way to learn the parameters of a structured prediction system is a multi-stage training procedure in which the system components are trained separately. This approach is time-consuming, since each stage typically requires training a component until convergence and tuning its hyperparameters. A joint training procedure, which learns all components in a single stage, is more convenient, but often leads to worse results due to the increased optimization complexity. In this work we address the problem of joint training in the context of handwritten character recognition, binary segmentation, and the task of splitting sentences into semantic blocks, and propose a set of approaches based on adversarial training, direct structured risk optimization, and calibration that make joint training feasible and beneficial. We compare the proposed approaches with well-established techniques on the Stanford OCR, CoNLL chunking, and Weizmann Horses datasets, and show that our approaches lend themselves to the joint procedure and outperform the established techniques that rely on stage-wise training.