Study of Bayesian Regularization of Neural Networks
Applied Mathematics and Information Science
Deep neural networks have shown state-of-the-art performance in many machine learning tasks. However, such heavily parameterized models are prone to overfitting, and this problem must be addressed to obtain good generalization. Commonly used regularization techniques for this purpose are binary dropout and its fast approximation, Gaussian dropout. In this work, we study a novel approach to neural network regularization that injects noise into the magnitude and direction of weight vectors rather than independently perturbing individual scalar weights. We consider several noise distributions on the direction and further propose a probabilistic model in which variational inference can be applied to tune the noise hyperparameters automatically. Incorporating appropriate prior distributions into such a model can also potentially lead to structured sparsity and model compression.
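To make the core idea concrete, below is a minimal PyTorch-style sketch, not taken from the thesis itself: the layer name, the choice of Gaussian noise on both components, and the hyperparameters `mag_std` and `dir_std` are illustrative assumptions. It shows a linear layer that, at training time, decomposes each weight row into a magnitude and a unit direction and perturbs those two factors instead of the individual scalar weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MagnitudeDirectionNoiseLinear(nn.Module):
    """Linear layer with training-time noise on the magnitude and direction
    of each weight row (hypothetical sketch, not the thesis implementation)."""

    def __init__(self, in_features, out_features, mag_std=0.1, dir_std=0.1):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.mag_std = mag_std  # std of multiplicative noise on the magnitude
        self.dir_std = dir_std  # std of additive noise on the direction

    def forward(self, x):
        w = self.weight
        if self.training:
            # Decompose each row w_i = r_i * d_i with r_i = ||w_i|| and ||d_i|| = 1.
            r = w.norm(dim=1, keepdim=True)   # magnitudes, shape (out, 1)
            d = w / (r + 1e-8)                # unit directions, shape (out, in)
            # Multiplicative Gaussian noise on the magnitude (akin to Gaussian dropout).
            r_noisy = r * (1.0 + self.mag_std * torch.randn_like(r))
            # Additive Gaussian noise on the direction, renormalized back to the sphere.
            d_noisy = F.normalize(d + self.dir_std * torch.randn_like(d), dim=1)
            w = r_noisy * d_noisy
        return F.linear(x, w, self.bias)
```

In a variational treatment along the lines described above, the fixed `mag_std` and `dir_std` would be replaced by learnable variational parameters whose values are inferred rather than hand-tuned.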