Found 20 similar documents; search took 15 ms
1.
Microcomputer software for chemists is becoming available at an ever-increasing rate. Following Professor Forina's general pattern recognition program, published in last month's Computer Corner, we present a specific technique, the linear learning machine. This is also available in Professor Forina's PARVUS but the software presented here has additional possibilities. It is proposed by Dr D.C. Leegwater from the Central Institute of Food and Nutrition Research of the Dutch organization of Applied Scientific Research (CIVO-TNO) at Zeist and by his son, who is studying physics and mathematics at the State University of Utrecht.
2.
Christer Albano William Dunn Ulf Edlund Erik Johansson Bo Nordén Michael Sjöström Svante Wold 《Analytica chimica acta》1978,103(4):429-443
Problems of pattern recognition in chemistry and other subjects can be divided conveniently into four different types depending on the level of scope of the problem. (1) Classification into one of a number of defined classes. As an example, blood samples taken from persons known to be either controls or welders are considered. The problem is whether trace element concentrations in these samples contain information on whether or not a person is a welder. (2) Level 1 plus the possibility that an object is an outlier, i.e. does not belong to any of the defined classes. As an example, the use of 13C-n.m.r. data to decide whether 2-substituted norbornanes have the exo or endo structure is discussed. (2A) Level 2, asymmetric. This situation occurs when one class does not have a systematic structure, but another class is homogeneous and can be described by a level 2 model. This occurs in the classification of materials or compounds as good or bad, active or inactive, and in binary classifications. As an example, the use of trace element data to classify steel samples as having good or poor properties of strength is discussed. (3) Level 2 plus the ability to relate the variables measured to external properties of continuous character. As an example, the classification of a series of chemical compounds as β-receptor blockers, β-receptor stimulants, or neither, on the basis of their structural variables is discussed. In addition, relations between these structural variables and the measured biological activity are sought within each of the two classes. (4) Level 3 with the difference that several external property variables in the objects are measured. It may be desirable to use variables of the objects both for classification and for relations to several property variables: such examples are numerous in analytical chemistry.
3.
The applicability of potential functions in unsupervised pattern recognition is demonstrated on the basis of a new clustering technique called CLUPOT. CLUPOT is a centrotype sorting technique, which means that for each detected cluster of objects a representative object can be selected. CLUPOT uses a reliability curve which permits the detection of significant clusters. Applications to four data sets (Kowalski's archeological artefact data, Ruspini's fuzzy set data, Fisher's Iris data and a part of Esbensen's meteorite data) show that CLUPOT yields significant clusterings.
4.
The main topics studied by chemical pattern-recognition techniques are computer-aided inorganic synthesis, investigation of macroscopic chemical kinetics and optimization of industrial chemical processes, geological exploration by geochemical methods, and early diagnosis of cancer.
5.
Spectral pattern recognition using self-organizing MAPS   Total citations: 2, self-citations: 0, citations by others: 2
Lavine BK Davidson CE Westover DJ 《Journal of chemical information and computer sciences》2004,44(3):1056-1064
A Kohonen neural network is an iterative technique used to map multivariate data. The network is able to learn and display the topology of the data. Self-organizing maps have advantages as well as drawbacks when compared to principal component plots. One advantage is that data preprocessing is usually minimal. Another is that an outlier will only affect one map unit and its neighborhood. However, outliers can have a drastic and disproportionate effect on principal component plots. Removing them does not always solve the problem, for as soon as the worst outliers are deleted, other data points may appear in this role. The advantage of using self-organizing maps for spectral pattern recognition is demonstrated by way of two studies recently completed in our laboratory. In the first study, Raman spectroscopy and self-organizing maps were used to differentiate six common household plastics by type for recycling purposes. The second study involves the development of a potential method to differentiate acceptable lots from unacceptable lots of avicel using diffuse reflectance near-infrared spectroscopy and self-organizing maps.
6.
Supervised pattern recognition in food analysis   Total citations: 8, self-citations: 0, citations by others: 8
Data analysis has become a fundamental task in analytical chemistry due to the great quantity of analytical information provided by modern analytical instruments. Supervised pattern recognition aims to establish a classification model based on experimental data in order to assign unknown samples to a previously defined sample class based on their pattern of measured features. The bases of the supervised pattern recognition techniques most used in food analysis are reviewed, with special emphasis on the practical requirements of the measured data and a discussion of common misconceptions and errors that might arise. Applications of supervised pattern recognition in the field of food chemistry reported in the literature over the last two years are also reviewed.
7.
The operational research model is based on a facility location problem and is solved by using heuristic and branch-and-bound methods. The method is particularly useful for clustering because it contains an algorithm that permits a conclusion about the significance of a cluster, without imposing a priori conditions. The method is also applied to supervised learning, for which it is not expected to be better than existing methods. However, it could be an interesting aid to those methods because it allows reduction of large data sets. The application of the method is illustrated with a few examples.
8.
PRIMA, a new supervised classification method, is based on the concept of class distance (Euclidean distance). For each class, a separate class distance is defined on the basis of the centre of gravity and inhomogeneity for the class; this class distance is then used to produce the classification. The PRIMA classifier based on class distance can be applied in different, complex cases. The conditions of applicability are less strict than those of other methods. The algorithm is simple; efficiency and stability are good. The simplicity of the method even for complex cases, e.g., with very many variables or multicategory classification, or with noisy or incomplete data processing, is noteworthy when compared with other effective pattern recognition methods.
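The idea of a per-class distance built from a centre of gravity and a measure of inhomogeneity can be sketched as follows. This is only an illustration under stated assumptions, not the published PRIMA formula: the abstract does not define "inhomogeneity", so the per-variable standard deviation within each class is used here as a stand-in, and the function names `fit_classes`, `class_distance` and `classify` are hypothetical.

```python
import math


def fit_classes(samples, labels):
    """Return {label: (centroid, spread)} computed per class and per variable."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    model = {}
    for y, rows in groups.items():
        n = len(rows)
        columns = list(zip(*rows))
        # Centre of gravity: the mean of each variable within the class.
        centroid = [sum(col) / n for col in columns]
        # Assumption: standard deviation stands in for class "inhomogeneity".
        # A zero spread is replaced by 1.0 to avoid division by zero.
        spread = [
            math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(columns, centroid)
        ]
        model[y] = (centroid, spread)
    return model


def class_distance(x, centroid, spread):
    """Euclidean distance to the centroid after scaling by the class spread."""
    return math.sqrt(
        sum(((xi - ci) / si) ** 2 for xi, ci, si in zip(x, centroid, spread))
    )


def classify(x, model):
    """Assign x to the class with the smallest class distance."""
    return min(model, key=lambda y: class_distance(x, *model[y]))
```

Because each class carries its own scaling, a compact class and a diffuse class are judged on their own terms rather than by one global metric.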
9.
Methods have been developed that allow important chemical effects to be quantified. Parameters calculated with these procedures can be used to investigate both quantitative and qualitative information on chemical reactivity. A variety of statistical and pattern recognition methods is used for that purpose. These studies lead to reactivity functions that allow the prediction of the course of complex organic reactions.
10.
11.
A new paradigm is suggested for pattern recognition of drugs. The approach is based on the combined application of the 4D/3D quantitative structure-activity relationship (QSAR) algorithms BiS and ConGO. The first algorithm, BiS/MC (multiconformational), is used for the search for the conformers interacting with a receptor. The second algorithm, ConGO, has been suggested for the detailed study of the selected conformers' electron density and for the search for the electron structure fragments that determine the pharmacophore and antipharmacophore parts of the compounds. In this work we suggest using a new AlteQ method for the evaluation of the molecular electron density. AlteQ describes the experimental electron density (determined by low-temperature highly accurate X-ray analysis) much better than a number of quantum approaches. Herein this is shown using a comparison of the computed electron density with the results of highly accurate X-ray analysis. In the present study the desirability function is used for the first time for the analysis of the effects of the electron structure in the process of pattern recognition of active and inactive compounds. The suggested method for pattern recognition has been used for the investigation of various sets of compounds such as DNA-antimetabolites, fXa inhibitors, and 5-HT(1A) and alpha(1)-AR receptor inhibitors. The pharmacophore and antipharmacophore fragments have been found in the electron structures of the compounds. It has been shown that the pattern recognition cross-validation quality for the datasets is unity.
12.
Stanimirova I Kubik A Walczak B Einax JW 《Analytical and bioanalytical chemistry》2008,390(5):1273-1282
Biofilms are complex aggregates formed by microorganisms such as bacteria, fungi and algae, which grow at the interfaces between water and natural or artificial materials. They are actively involved in processes of sorption and desorption of metal ions in water and reflect the environmental conditions in the recent past. Therefore, biofilms can be used as bioindicators of water quality. The goal of this study was to determine whether the biofilms, developed in different aquatic systems, could be successfully discriminated using data on their elemental compositions. Biofilms were grown on natural or polycarbonate materials in flowing water, standing water and seawater bodies. Using an unsupervised technique such as principal component analysis (PCA) and several supervised methods like classification and regression trees (CART), discriminant partial least squares regression (DPLS) and uninformative variable elimination–DPLS (UVE-DPLS), we could confirm the uniqueness of sea biofilms and make a distinction between flowing water and standing water biofilms. The CART, DPLS and UVE-DPLS discriminant models were validated with an independent test set selected either by the Kennard and Stone method or the duplex algorithm. The best model was obtained from CART with 100% correct classification rate for the test set designed by the Kennard and Stone algorithm. With CART, one variable describing the Mg content in the biofilm water phase was found to be important for the discrimination of flowing water and standing water biofilms.
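The Kennard and Stone selection mentioned in this abstract can be sketched as below. This is a minimal illustration of the usual max-min rule (start from the two most distant samples, then repeatedly add the sample farthest from everything already chosen); it is not taken from the paper, the function name `kennard_stone` is hypothetical, and whether the selected samples serve as the model set or the test set is left to the caller.

```python
import math


def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by the max-min distance rule."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of points")
    # Full Euclidean distance matrix; fine for a sketch, quadratic in memory.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Seed with the pair of samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - set(selected)
    while len(selected) < n_select:
        # Pick the sample whose nearest selected neighbour is farthest away.
        nxt = max(
            sorted(remaining),
            key=lambda i: min(dist[i][s] for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected
```

The rule spreads the chosen samples evenly over the measured space, which is why it is often used to design representative subsets.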
13.
The different criteria that should be considered in selecting a supervised pattern recognition technique for a particular application are discussed. An overview is given of the most important and most frequently used supervised techniques and the extent to which they meet the criteria. The possibilities of two rule-building expert systems are also discussed.
14.
Summary: Two different pattern recognition methods have been used to interpret low resolution mass spectra of steroids. Classifiers have been evaluated that detect type and position of some substituents in a steroid molecule with predictive abilities between 70% and 90%. Measurement of distances in the hyperspace from mass spectra points to centres of gravity yielded some insight into clustering. Adaptive, linear decision vectors gave somewhat better results in classification than did distance measurement.
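An adaptive linear decision vector of the kind referred to above can be sketched with a simple error-correction loop. This is an assumption-laden illustration, not the procedure used in the study: it uses the fixed-increment perceptron update rather than any specific correction rule from the paper, class labels are coded as +1 and -1, and the names `train_decision_vector` and `predict` are hypothetical.

```python
def train_decision_vector(samples, labels, max_passes=100):
    """Train a weight vector w so that sign(w . [x, 1]) matches each label (+1/-1)."""
    # One weight per variable plus one for the constant (bias) component.
    w = [0.0] * (len(samples[0]) + 1)
    for _ in range(max_passes):
        errors = 0
        for x, y in zip(samples, labels):
            xa = list(x) + [1.0]
            s = sum(wi * xi for wi, xi in zip(w, xa))
            if y * s <= 0:
                # Misclassified (or on the boundary): move w towards the
                # correct side by adding the signed pattern vector.
                w = [wi + y * xi for wi, xi in zip(w, xa)]
                errors += 1
        if errors == 0:
            break  # every training pattern is classified correctly
    return w


def predict(w, x):
    """Return +1 or -1 according to the side of the decision plane."""
    s = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if s > 0 else -1
```

The loop only terminates early when the training classes are linearly separable; otherwise it stops after `max_passes` and returns the last weight vector.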
15.
16.
The triglyceride profiles for corn, soybean, sunflower, grapestone, olive and olive-husk oils were obtained by high temperature capillary gas chromatography on HT-FS capillary columns coated with OV-17-OH. The data were treated by methods of computerized pattern analysis, involving principal component analysis, discriminant analysis, and hierarchical clustering. A graphical representation of results as “star symbol plots” allows prompt visual recognition of the characteristic profiles and straightforward qualitative identification of unknowns as well as recognition of adulterations.
17.
A crucial point in pattern recognition methods is the extraction of features to determine the pattern vectors. Orthogonal transformations, e.g., Fourier, Walsh and Haar, are investigated as preprocessing methods for feature extraction. The theoretical considerations and conclusions are compared with computational results obtained by applying different pattern recognition methods to two different but similar collections of low-resolution mass spectra.
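As an illustration of one of the orthogonal transformations named above, the sketch below computes an orthonormal Haar transform of a spectrum-like vector. It is a generic textbook construction, not the preprocessing used in the study; the function name `haar_transform` is hypothetical and the input length is assumed to be a power of two.

```python
import math


def haar_transform(signal):
    """Return the orthonormal Haar transform of a sequence of length 2**k."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(v) for v in signal]
    root2 = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        # Scaled pairwise sums carry the coarse shape to the next level.
        sums = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        # Scaled pairwise differences capture the detail at this level.
        diffs = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:length] = sums + diffs
        length = half
    return out
```

Because the transform is orthonormal, the sum of squared coefficients equals that of the input, so keeping only the largest coefficients gives a reduced pattern vector with a known loss of signal energy.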
18.
The purpose of this study is to compare the performance of UNEQ, a supervised pattern recognition technique of the class-modelling type, with other classification techniques and to demonstrate its use for a practical application, namely the classification of coals.
19.
Lavine BK Davidson CE Rayens WS 《Combinatorial chemistry & high throughput screening》2004,7(2):115-131
Motivation: Microarrays have allowed the expression level of thousands of genes or proteins to be measured simultaneously. Data sets generated by these arrays consist of a small number of observations (e.g., 20-100 samples) on a very large number of variables (e.g., 10,000 genes or proteins). The observations in these data sets often have other attributes associated with them such as a class label denoting the pathology of the subject. Finding the genes or proteins that are correlated to these attributes is often a difficult task since most of the variables do not contain information about the pathology and as such can mask the identity of the relevant features. We describe a genetic algorithm (GA) that employs both supervised and unsupervised learning to mine gene expression and proteomic data. The pattern recognition GA selects features that increase clustering, while simultaneously searching for features that optimize the separation of the classes in a plot of the two or three largest principal components of the data. Because the largest principal components capture the bulk of the variance in the data, the features chosen by the GA contain information primarily about differences between classes in the data set. The principal component analysis routine embedded in the fitness function of the GA acts as an information filter, significantly reducing the size of the search space since it restricts the search to feature sets whose principal component plots show clustering on the basis of class. The algorithm integrates aspects of artificial intelligence and evolutionary computations to yield a smart one-pass procedure for feature selection, clustering, classification, and prediction.