Information theory applied to feature selection of binary-coded infrared spectra for automated interpretation by retrieval of reference data |
| |
Authors: | P.F. Dupuis, P. Cleij, H.A. van 't Klooster, A. Dijkstra |
| |
Institution: | Analytisch Chemisch Laboratorium, Rijksuniversiteit Utrecht, Croesestraat 77A, Utrecht, The Netherlands |
| |
Abstract: | A method is described for feature selection from infrared spectra, intended for identification of organic compounds by computer-aided retrieval of reference data contained in small files. Complete discrimination of the binary-coded spectra is achieved by selecting a minimum number of spectral features; the information content is used as the selection criterion. The selection procedure is applied to five data sets (saturated and unsaturated hydrocarbons, alcohols, ethers and aldehydes/ketones) involving some 400 spectra. Each spectrum is uniquely coded by using about 10% of the 140 spectral features (binary-coded peak positions) available originally. For the intensity, a threshold of 50% appears to be applicable in some cases. For coding the frequency or wavelength parameter, wavenumbers (cm⁻¹) are preferred to wavelengths (μm). The method takes into account the a priori probabilities of spectral features and their correlations. Results of a retrieval program for a few “unknown” spectra are given. |
| |
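The selection idea in the abstract (choose few binary features, judged by information content, until every spectrum has a unique code) can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a greedy forward selection, equal a priori probability for every reference spectrum, and joint Shannon entropy of the selected features as the information measure (which penalizes correlated features automatically). The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def joint_entropy(spectra, features):
    """Shannon entropy (bits) of the code words formed by `features`,
    assuming every reference spectrum is equally likely (an assumption)."""
    n = len(spectra)
    counts = Counter(tuple(s[f] for f in features) for s in spectra)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def select_features(spectra):
    """Greedy forward selection: repeatedly add the feature that yields the
    largest joint entropy, stopping once all spectra have distinct codes
    or no remaining feature adds information."""
    n_features = len(spectra[0])
    target = log2(len(spectra))  # entropy reached when every code is unique
    selected, current = [], 0.0
    while current < target - 1e-12:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        best = max(candidates,
                   key=lambda f: joint_entropy(spectra, selected + [f]))
        best_h = joint_entropy(spectra, selected + [best])
        if best_h - current <= 1e-12:
            break  # remaining features cannot separate the spectra further
        selected.append(best)
        current = best_h
    return selected


# Toy data (hypothetical): 4 binary-coded spectra, 5 peak-position features.
# Feature 1 is perfectly correlated with feature 0; feature 4 is constant.
spectra = [
    (1, 0, 1, 0, 1),
    (1, 0, 0, 1, 1),
    (0, 1, 1, 0, 1),
    (0, 1, 0, 1, 1),
]
selected = select_features(spectra)
codes = {tuple(s[f] for f in selected) for s in spectra}
```

In this toy case two of the five features suffice to give each spectrum a unique code; the correlated and constant features are never chosen because they add no joint information.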
Keywords: | |
This article is indexed in ScienceDirect and other databases. |
|