Changing the Geometry of Representations: α-Embeddings for NLP Tasks
Authors: Riccardo Volpi, Uddhipan Thakur, Luigi Malagò
Affiliations: 1. Romanian Institute of Science and Technology (RIST), 400022 Cluj-Napoca, Romania; 2. Transylvanian Institute of Neuroscience, 400157 Cluj-Napoca, Romania
Abstract: Word embeddings based on a conditional model are commonly used in Natural Language Processing (NLP) tasks to embed the words of a dictionary in a low-dimensional linear space. Their computation is based on maximizing the likelihood of a conditional probability distribution for each word of the dictionary. These distributions form a Riemannian statistical manifold, where word embeddings can be interpreted as vectors in the tangent space at a specific reference measure on the manifold. A novel family of word embeddings, called α-embeddings, has recently been introduced; it derives from a geometrical deformation of the simplex of probabilities, controlled by a parameter α, using notions from Information Geometry. After introducing the α-embeddings, we show how the deformation of the simplex, controlled by α, provides an extra handle to improve performance on several intrinsic and extrinsic NLP tasks. We test the α-embeddings on different tasks with models of increasing complexity, showing that the advantages associated with the use of α-embeddings persist also for models with a large number of parameters. Finally, we show that tuning α yields higher performance than using larger models in which a transformation of the embeddings is additionally learned during training, as experimentally verified in attention models.
Keywords: word embeddings, α-embeddings, information geometry, attention mechanism
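To make the construction in the abstract concrete, the following is a minimal sketch of how an α-deformed chart can be applied to conditional word distributions. It uses Amari's α-representation from Information Geometry (p ↦ 2/(1-α)·p^((1-α)/2), with the limit α → 1 recovering log p); the function names, the centering at a reference measure, and the toy distributions are our own illustration, not the authors' code or exact definition.

```python
import numpy as np

def alpha_representation(p, alpha):
    """Amari's alpha-representation of a probability vector p.

    For alpha != 1 this maps p -> 2/(1-alpha) * p**((1-alpha)/2);
    the limit alpha -> 1 recovers log(p) (exponential chart), and
    alpha = -1 gives the identity (mixture chart).
    """
    p = np.asarray(p, dtype=float)
    if np.isclose(alpha, 1.0):
        return np.log(p)
    return 2.0 / (1.0 - alpha) * p ** ((1.0 - alpha) / 2.0)

def alpha_embedding(p_w, p_ref, alpha):
    # Illustrative only: represent the conditional distribution p(.|w)
    # in the alpha-chart, centered at a reference measure p_ref
    # (e.g., a unigram or uniform distribution over the dictionary).
    return alpha_representation(p_w, alpha) - alpha_representation(p_ref, alpha)

# Toy conditional distribution over 4 context words, uniform reference
p_w = np.array([0.5, 0.2, 0.2, 0.1])
p_ref = np.full(4, 0.25)

# alpha = 1 gives a log-ratio (PMI-style) embedding; varying alpha
# deforms the geometry of the simplex, which is the extra handle
# tuned in the paper's experiments.
emb = alpha_embedding(p_w, p_ref, alpha=1.0)
```

Sweeping α and evaluating the resulting embeddings on downstream tasks is how the paper's extra degree of freedom is exploited; at α = -1 the representation reduces to the raw probabilities themselves.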