Discrete Infomax Codes for Supervised Representation Learning |
| |
Authors: | Yoonho Lee, Wonjae Kim, Wonpyo Park, Seungjin Choi |
| |
Affiliation: | 1. Stanford AI Lab, Stanford University, Stanford, CA 94305, USA; 2. NAVER AI Lab, Seongnam 13561, Korea; 3. Standigm, Seoul 06234, Korea; 4. Intellicode & BARO AI Academy, Seoul 06367, Korea |
| |
Abstract: | For high-dimensional data such as images, learning an encoder that outputs a compact yet informative representation is a key task in its own right, in addition to facilitating subsequent processing of the data. We present a model that produces discrete infomax codes (DIMCO): we train a probabilistic encoder that yields k-way d-dimensional codes for the input data. Our model maximizes the mutual information between codes and ground-truth class labels, with a regularizer that encourages the entries of a codeword to be statistically independent. In this context, we show that the infomax principle also justifies existing loss functions, such as cross-entropy, as special cases. Our analysis also shows that using shorter codes reduces overfitting in few-shot classification, and our experiments demonstrate this implicit task-level regularization effect of DIMCO. Furthermore, we show that the codes learned by DIMCO are efficient in both memory and retrieval time compared to prior methods. |
| |
Keywords: | infomax; discrete codes; representation learning; few-shot classification |
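To make the two quantities named in the abstract concrete, the sketch below computes (a) the storage cost of a k-way d-dimensional code and (b) a plug-in estimate of the mutual information between hard discrete codes and class labels. It is an illustration only, not the training procedure of the paper, which uses a probabilistic encoder. The function names `code_bits` and `empirical_mutual_information` are hypothetical, and the fixed-width encoding of each code entry is an assumption.

```python
import math
from collections import Counter


def code_bits(k: int, d: int) -> int:
    """Bits to store one k-way d-dimensional code.

    Assumes each of the d entries is stored with a fixed width of
    ceil(log2(k)) bits (an illustrative choice, not taken from the paper).
    """
    return d * math.ceil(math.log2(k))


def entropy(counts) -> float:
    """Shannon entropy in bits of the empirical distribution given by counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def empirical_mutual_information(codes, labels) -> float:
    """Plug-in estimate of I(code; label) in bits.

    Uses the identity I(X; Y) = H(X) + H(Y) - H(X, Y) on empirical counts.
    Each code is treated as a single discrete symbol (a tuple of d entries).
    """
    codes = [tuple(c) for c in codes]
    h_code = entropy(Counter(codes).values())
    h_label = entropy(Counter(labels).values())
    h_joint = entropy(Counter(zip(codes, labels)).values())
    return h_code + h_label - h_joint


if __name__ == "__main__":
    # Toy data: 4-way, 2-dimensional codes that determine the label exactly.
    codes = [(0, 1), (0, 1), (2, 3), (2, 3)]
    labels = [0, 0, 1, 1]
    print(code_bits(k=4, d=2))
    print(empirical_mutual_information(codes, labels))
```

In the toy data the code fully determines the label, so the estimate equals the label entropy. When codes and labels are independent, the estimate drops to zero. The plug-in estimator is biased on small samples, so it is best read as a diagnostic rather than an objective.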