A benchmark spike‐in data set for biomarker identification in metabolomics

Authors:	Pietro Franceschi, Domenico Masuero, Urska Vrhovsek, Fulvio Mattivi, Ron Wehrens

Abstract:	The development and the validation of innovative approaches for biomarker selection are of paramount importance in many ‐omics technologies. Unfortunately, the actual testing of new methods on real data is difficult, because in real data sets, one can never be sure about the “true” biomarkers. In this paper, we present a publicly available metabolomic ultra performance liquid chromatography–mass spectrometry spike‐in data set for apples. The data set consists of 10 control samples and three spiked sets of the same size, where naturally occurring compounds are added in different concentrations. In this sense, the data set can serve as a test bed to assess the performance of new algorithms and compare them with previously published results. We illustrate some of the possibilities provided by this spike‐in data set by comparing the performance of two popular biomarker‐selection methods, the univariate t‐test and the multivariate variable importance in projection. To promote a widespread use of the data, raw data files as well as preprocessed peak lists are made available. Copyright © 2012 John Wiley & Sons, Ltd.

Keywords:	metabolomics; spike‐in data; biomarker selection; benchmark