Information Flows of Diverse Autoencoders

Authors:	Sungyeop Lee, Junghyo Jo

Affiliation:	1. Department of Physics and Astronomy, Seoul National University, Seoul 08826, Korea; 2. Department of Physics Education and Center for Theoretical Physics and Artificial Intelligence Institute, Seoul National University, Seoul 08826, Korea; 3. School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Korea

Abstract:	Deep learning methods have achieved outstanding performance in various fields. A fundamental question is why they are so effective. Information theory provides a potential answer by interpreting the learning process as the transmission and compression of information about the data. The information flows can be visualized on the information plane, which plots the mutual information among the input, hidden, and output layers. In this study, we examine how the information flows are shaped by network parameters such as depth, sparsity, weight constraints, and hidden representations. We adopt autoencoders as models of deep learning because (i) they have clear guidelines for their information flows, and (ii) they come in many variants, such as vanilla, sparse, tied, variational, and label autoencoders. We measured their information flows using Rényi's matrix-based $\alpha$-order entropy functional. As learning progresses, the autoencoders show a typical fitting phase in which both the input-to-hidden and hidden-to-output mutual information increase. In the last stage of learning, however, some autoencoders show a simplifying phase, previously called the "compression phase", in which the input-to-hidden mutual information diminishes. In particular, sparsity regularization of the hidden activities amplifies the simplifying phase, whereas tied, variational, and label autoencoders do not have a simplifying phase. Nevertheless, all autoencoders have similar reconstruction errors for training and test data. Thus, the simplifying phase does not seem to be necessary for the generalization of learning.
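As a rough illustration of the estimator named in the abstract, the sketch below computes a matrix-based Rényi $\alpha$-order entropy and the corresponding mutual information from two sets of samples, for example a batch of inputs and the matching hidden activations. It follows the commonly cited form of the functional (eigenvalues of a trace-normalized Gram matrix, with the joint term built from the Hadamard product). The Gaussian kernel, the kernel width `sigma`, the default `alpha=1.01`, and all function names are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def gram_matrix(x, sigma):
    """Trace-normalized Gaussian Gram matrix of the rows of x (assumed kernel)."""
    x = np.asarray(x, dtype=float)
    sq = np.sum(x ** 2, axis=1)
    # Squared Euclidean distances between all pairs of samples.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return k / np.trace(k)


def renyi_entropy(a, alpha=1.01):
    """S_alpha(A) = log2(sum_i lambda_i^alpha) / (1 - alpha), in bits."""
    eig = np.linalg.eigvalsh(a)
    eig = np.clip(eig, 0.0, None)  # remove tiny negative round-off
    eig = eig / eig.sum()
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)


def joint_entropy(a, b, alpha=1.01):
    """Joint entropy from the trace-normalized Hadamard product of A and B."""
    ab = a * b
    return renyi_entropy(ab / np.trace(ab), alpha)


def mutual_information(a, b, alpha=1.01):
    """I_alpha(A; B) = S_alpha(A) + S_alpha(B) - S_alpha(A, B)."""
    return (renyi_entropy(a, alpha) + renyi_entropy(b, alpha)
            - joint_entropy(a, b, alpha))
```

To trace a curve on the information plane with this sketch, one would build Gram matrices for the input batch, a hidden layer's activations, and the output at each training epoch, then record `mutual_information` for the input-to-hidden and hidden-to-output pairs. The estimates depend on the choice of kernel width, so values are best compared only under a fixed `sigma` rule.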

Keywords:	information bottleneck theory; mutual information; matrix-based kernel estimation; autoencoders
