Understanding convolutional neural networks with a mathematical model
Institution: 1. University of São Paulo, Institute of Mathematics and Computer Sciences, Av. Trabalhador Sãocarlense, 400, São Carlos, SP 13566-590, Brazil; 2. University of Western Australia, School of Mathematics and Statistics, 35 Stirling Highway, Crawley, Perth, Western Australia 6009
Abstract: This work attempts to address two fundamental questions about the structure of convolutional neural networks (CNNs): (1) Why is a nonlinear activation function essential at the filter output of all intermediate layers? (2) What is the advantage of a two-layer cascade system over a one-layer system? A mathematical model called “REctified-COrrelations on a Sphere” (RECOS) is proposed to answer these two questions. After CNN training, the converged filter weights define a set of anchor vectors in the RECOS model. Anchor vectors represent frequently occurring patterns (or spectral components). The necessity of rectification is explained using the RECOS model. Then, the behavior of a two-layer RECOS system is analyzed and compared with its one-layer counterpart. LeNet-5 and the MNIST dataset are used to illustrate the discussion points. Finally, the RECOS model is generalized to a multilayer system, with AlexNet as an example.
Keywords: Convolutional neural network (CNN); Nonlinear activation; RECOS model; Rectified linear unit (ReLU); MNIST dataset
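
The "rectified correlations on a sphere" idea named in the abstract can be illustrated with a minimal sketch. It assumes that inputs and anchor vectors are both scaled to unit length, that the correlation is a plain dot product, and that rectification is a ReLU. The function names (`normalize`, `recos_layer`) and the anchor values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def normalize(v, eps=1e-12):
    """Scale a vector (or each row of a matrix) to unit length."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / np.maximum(norm, eps)

def recos_layer(x, anchors):
    """Correlate a unit-length input with unit-length anchor vectors,
    then clip negative correlations to zero (rectification)."""
    corr = normalize(anchors) @ normalize(x)
    return np.maximum(corr, 0.0)

# Hypothetical anchor vectors (one per row); in the abstract's terms these
# would be defined by the converged filter weights after training.
anchors = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [-1.0, 0.0]])

x = np.array([2.0, 0.0])
out_pos = recos_layer(x, anchors)    # responds only at the first anchor
out_neg = recos_layer(-x, anchors)   # responds only at the third anchor
print(out_pos, out_neg)
```

In this toy setting, an input and its negation activate different anchors after rectification, whereas their unrectified correlations would differ only by sign.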
This article is indexed in ScienceDirect and other databases.