Seoul National University
154 Hurley Hall
Fast Learning With Deep Learning Architectures For Classification
We derive fast convergence rates for a deep neural network (DNN) classifier with the rectified linear unit (ReLU) activation function learned using the hinge loss. We consider three cases for the true model: (1) a smooth decision boundary, (2) a smooth conditional class probability, and (3) the margin condition (i.e., the probability of inputs near the decision boundary is small). We show that the DNN classifier learned using the hinge loss achieves fast convergence rates in all three cases, provided that the architecture (i.e., the number of layers, the number of nodes, and the sparsity) is carefully selected. An important implication is that DNN architectures are flexible enough to be used in various cases without much modification. In addition, we consider a DNN classifier learned by minimizing the cross-entropy and give conditions for fast convergence rates. If time allows, computational algorithms for finding the right size of deep architecture to achieve fast convergence rates will be discussed.
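For readers less familiar with the setup, the two losses mentioned above can be written out as follows. This is a sketch in standard textbook notation, not necessarily the notation used in the talk: the labels y in {-1, +1}, the real-valued network output f(x), the sample size n, and the candidate class F of sparse ReLU networks are all assumed symbols.

\[
  \ell_{\mathrm{hinge}}(y, f(x)) = \max\{0,\; 1 - y f(x)\},
  \qquad
  \ell_{\mathrm{CE}}(y, f(x)) = \log\bigl(1 + e^{-y f(x)}\bigr),
\]
\[
  \hat{f} \in \operatorname*{arg\,min}_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(y_i, f(x_i)\bigr),
  \qquad
  \hat{C}(x) = \operatorname{sign}\bigl(\hat{f}(x)\bigr).
\]

Under this reading, a "fast" rate usually means that the excess misclassification risk of the learned classifier decays faster than n^{-1/2}; the exact rates and conditions are those given in the talk.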
This is joint work with my Ph.D. students Ilsang Ohn and Dongha Kim.