Year of Graduation
Empirical Study of Deep Neural Network Loss Surfaces
Applied Mathematics and Information Science
The loss functions of deep neural networks are complex, and their geometric properties are not well understood. Recent papers have shown empirically that neural network loss minima found with stochastic gradient descent form a connected manifold. In this work, we study the behavior of the loss function along a path connecting optima with low gradient norm, as well as the connectivity between overfitted minima and minima with good generalization. The results show that optima with lower gradient norm are not connected by a simple low-loss curve, but the increase in loss along the path is small. Likewise, optima of the same network architecture with different generalization ability are not connected.
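The idea of evaluating the loss along a path between two optima can be illustrated on a toy problem. The sketch below uses a hypothetical one-dimensional double-well loss (a stand-in for a real network's loss surface, not the thesis's experimental setup) and measures the loss barrier along the straight line connecting its two minima:

```python
import numpy as np

# Toy double-well loss with minima at w = -1 and w = +1 and a barrier at w = 0.
# This is an illustrative stand-in for a deep network's loss function.
def loss(w):
    return (w**2 - 1.0)**2

w_a, w_b = -1.0, 1.0                 # two "optima" of the toy loss

# Linear interpolation path theta(t) = (1 - t) * w_a + t * w_b, t in [0, 1].
ts = np.linspace(0.0, 1.0, 101)
path = (1 - ts) * w_a + ts * w_b

# Loss evaluated at every point of the path.
losses = loss(path)

# Barrier: excess of the worst loss on the path over the endpoints' loss.
barrier = losses.max() - max(loss(w_a), loss(w_b))
print(f"max loss along linear path: {losses.max():.3f}")
print(f"barrier height: {barrier:.3f}")
```

Here the straight line between the two minima passes through a region of high loss (the barrier at `w = 0` has height 1), which is exactly the kind of behavior the thesis probes with more flexible connecting curves on real networks.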