A Method for Clustering and Screening of Long-dimensional Chemical Data Based on Fingerprints and Similarity Measurements

Authors:	Manuel Urbano Cuadrado, Gonzalo Cerruela García, Irene Luque Ruiz, Miguel Ángel Gómez-Nieto

Affiliation:	(1) Department of Computing and Numerical Analysis, University of Córdoba, Campus Universitario de Rabanales, Albert Einstein Building, E-14071 Córdoba, Spain

Abstract:	A method for treating long-dimensional chemical data arrays is presented, with the aim of improving the performance of classification models. The method is based on the construction of fingerprints and the subsequent generation of a similarity matrix. The similarity calculation is modified through a scaling process that takes into account the different significance of the variables. The method was applied to spectral measurements of wines, and several aspects were studied: the threshold used in the construction of fingerprints and patterns, the weighting factor used for scaling, and the normalisation method, among others. Applying both Principal Component Analysis and Soft Independent Modelling of Class Analogies (SIMCA) to the similarity matrices gave better classifications than those obtained using the original data.
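
The fingerprint and similarity steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state the normalisation, the threshold rule, the similarity coefficient, or the form of the weighting, so everything here is an assumption made for illustration: min-max normalisation per variable, a single cut-off `threshold`, a Tanimoto-style coefficient, and per-variable `weights` standing in for the scaling step. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def min_max_normalise(data: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each variable (column) to [0, 1]; constant columns map to 0."""
    cols = list(zip(*data))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(row, lows, highs)]
        for row in data
    ]


def fingerprints(data: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Assumed rule: a bit is set when the normalised value reaches the threshold."""
    return [[1 if x >= threshold else 0 for x in row]
            for row in min_max_normalise(data)]


def weighted_tanimoto(a: Sequence[int], b: Sequence[int],
                      weights: Sequence[float]) -> float:
    """Tanimoto-style similarity where each variable counts by its weight."""
    shared = sum(w for x, y, w in zip(a, b, weights) if x and y)
    union = sum(w for x, y, w in zip(a, b, weights) if x or y)
    # Two empty fingerprints are treated as identical (an arbitrary convention).
    return shared / union if union > 0 else 1.0


def similarity_matrix(fps: Sequence[Sequence[int]],
                      weights: Sequence[float]) -> List[List[float]]:
    """Pairwise similarities between all samples."""
    return [[weighted_tanimoto(a, b, weights) for b in fps] for a in fps]


# Made-up values: three samples measured on four variables.
spectra = [
    [0.10, 0.80, 0.30, 0.90],
    [0.20, 0.70, 0.35, 0.85],
    [0.90, 0.10, 0.80, 0.20],
]
weights = [1.0, 2.0, 1.0, 2.0]  # hypothetical per-variable significance

fps = fingerprints(spectra, threshold=0.5)
sim = similarity_matrix(fps, weights)
```

In this sketch the resulting `sim` matrix, rather than the raw measurements, would be passed to a classification or projection technique such as the PCA and SIMCA steps the abstract mentions.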

Keywords:	data preparation, similarity calculation, fingerprints, clustering, screening
This article is indexed in SpringerLink and other databases.