Similar literature
 Found 20 similar articles; search took 625 ms
1.
This paper describes a clustering method on three‐way arrays making use of an exploratory visualization approach. The aim of this study is to cluster samples in the object mode of a three‐way array, which is done using the scores (sample loadings) of a three‐way factor model, for example, a Tucker3 or a PARAFAC model. Further, tools are developed to explore and identify reasons for particular clusters by visually mining the data using the clustering results as guidance. We introduce a three‐way clustering tool and demonstrate our results on a metabolite profiling dataset. We explore how high performance liquid chromatography (HPLC) measurements of commercial extracts of St. John's wort (natural remedies for the treatment of mild to moderate depression) differ and which chemical compounds account for those differences. Using common distance measures, for example, Euclidean or Mahalanobis, on the scores of a three‐way model, we verify that we can capture the underlying clustering structure in the data. Besides this, by making use of the visualization approach, we are able to identify the variables playing a significant role in the extracted cluster structure. The suggested approach generalizes straightforwardly to higher‐order data and also to two‐way data. Copyright © 2007 John Wiley & Sons, Ltd.

2.
Feature selection is a valuable technique in data analysis for information-preserving data reduction. This paper describes a feature selection approach for hierarchical clustering based on genetic algorithms using a fitness function that tries to minimize the difference between the dissimilarity matrix of the original feature set and that of the reduced feature set. Clustering trees based on reduced feature sets are comparable with those based on the complete feature set. Special measures to favor small reduced feature sets are discussed.
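The fitness idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean dissimilarity, the normalization to [0, 1], the mean-squared mismatch, and the linear `size_penalty` term are all assumptions chosen for illustration, and the function names are hypothetical. A genetic algorithm would evolve the bit mask; only the fitness evaluation is shown.

```python
import math
from itertools import combinations


def dissimilarities(rows, features):
    """Pairwise Euclidean distances using only the given feature indices."""
    return [
        math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))
        for a, b in combinations(rows, 2)
    ]


def normalize(values):
    """Scale distances to [0, 1] so feature sets of different size are comparable."""
    top = max(values)
    return [v / top for v in values] if top > 0 else list(values)


def fitness(rows, mask, size_penalty=0.01):
    """Lower is better: dissimilarity mismatch plus an assumed penalty per kept feature."""
    selected = [i for i, keep in enumerate(mask) if keep]
    if not selected:
        return float("inf")  # an empty feature set cannot describe the data
    full = normalize(dissimilarities(rows, range(len(rows[0]))))
    reduced = normalize(dissimilarities(rows, selected))
    mismatch = sum((x - y) ** 2 for x, y in zip(full, reduced)) / len(full)
    return mismatch + size_penalty * len(selected)


# Toy data: feature 1 duplicates feature 0, feature 2 is constant.
rows = [(0, 0, 5), (1, 1, 5), (3, 3, 5)]
best = min([[1, 1, 1], [1, 0, 0], [0, 0, 1]], key=lambda m: fitness(rows, m))
print(best)
```

In this toy case the single informative feature preserves the normalized dissimilarity structure, so the penalty term makes the smaller mask preferable to the full one.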

3.
The generation of a hierarchical tree of 500 infrared spectra, using the recently proposed fractal or 3-distances-clustering method is described and discussed. The objects of clustering are infrared spectra of polymer compounds which are represented as sets of 80 complex Fourier coefficients, obtained by fast Fourier transformation of digitized absorbance spectra. The generated hierarchical tree, with a maximum height of 20 and an average height of 12 levels, yields a very satisfactory clustering scheme with respect to the structure of the compounds involved. In addition to very good clustering, a 100% retrieval (prediction) ability was obtained. This was achieved by the use of an iterative procedure after the initial tree had been generated. Additionally, the tree was tested with 240 infrared spectra of different compounds which were taken into account during the generation of the tree. The retrieval success of these test runs is discussed with respect to the structural similarity of the compounds to which the “unknown” spectra were linked.

4.
5.
A contour map of the minimum values for three-body rate constants and for equilibrium constants has been constructed for the initial three-body clustering of gas molecules to positive ions. The map was constructed from laboratory measurements of the clustering of gas molecules to alkali ions and consolidates into a readily useful form several recent measurements. Use of the map shows that labile clustering can be extremely important in laboratory experiments and in the Earth's upper atmosphere.

6.
An ant colony approach for clustering   Total citations: 2, self-citations: 0, citations by others: 2
This paper presents an ant colony optimization methodology for optimally clustering N objects into K clusters. The algorithm employs distributed agents which mimic the way real ants find a shortest path from their nest to food source and back. This algorithm has been implemented and tested on several simulated and real datasets. The performance of this algorithm is compared with other popular stochastic/heuristic methods viz. genetic algorithm, simulated annealing and tabu search. Our computational simulations reveal very encouraging results in terms of the quality of solution found, the average number of function evaluations and the processing time required.

7.
Two traditional clustering algorithms are applied to configurations from a long molecular dynamics trajectory and compared using two sets of test data. First, a subset of atoms was chosen to present conformations which naturally fall into a number of clusters. Second, a subset of atoms was selected to span a relatively continuous region of conformational space rather than form discrete conformational classes. Of the two algorithms used, the single linkage method is inappropriate for this kind of data. The divisive hierarchical method, based on minimizing the difference between cluster centroids and extrema, is successful but also prone to imposing clustering hierarchy where none can be justified. © 1994 by John Wiley & Sons, Inc.

8.
Xu J, Ahn B, Lee H, Xu L, Lee K, Panchapakesan R, Oh KW. Lab on a Chip, 2012, 12(4): 725-730
We present a multiple-droplet clustering device that can perform sequential droplet trapping and storing. Shape-dependent droplet manipulation in forward and backward flows has been incorporated to achieve high trapping and storing efficiency in a 10 × 12 array of clustering structures (e.g., storing well, storing chamber, trapping well, and guiding track). In the forward flow, flattened droplets are trapped in each trapping well. In the backward flow, the trapped droplets are released from the trapping well and follow the guiding tracks to their corresponding storing wells. The guided droplets float up out of the confining channel to the super stratum of the storing chamber due to interfacial energy and buoyancy effects. This forward/backward flow-based trapping/storing process can be repeated several times to cluster droplets with different contents and samples in the storing chambers. We expect that the proposed platform will be a valuable tool to study complex droplet-based reactions in clustered droplets.

9.
Water activity is an important macroscopic property of aerosol particles and droplets in the atmosphere as well as aqueous solutions in many other fields of physical chemistry. This study focuses on relating water activity, described using osmotic coefficients, to the microscopic water structure in systems of atmospheric relevance, namely, aqueous solutions of each of the four electrolytes: NaCl, (NH₄)₂SO₄, NH₄Cl, and Na₂SO₄. The osmotic coefficients of these compounds, as reported in literature based on thermodynamic measurements, decrease as a function of molality for dilute solutions and increase as a function of molality for concentrated solutions. At an intermediate molality, a minimum value of the osmotic coefficient is observed. We explain this behavior by describing osmotic coefficients as the product of two concentration-dependent effects: incomplete electrolyte dissociation and variations in the microphysical water structure. The degree of dissociation in electrolyte solutions can be obtained directly from literature or derived from reported pK values, and in this work the water structure is quantified using low-wavenumber Raman spectroscopy. We use the band at 180 cm⁻¹ in Raman spectra of aqueous electrolyte solutions, which has been assigned to the displacement of the central oxygen atom in a tetrahedral hydrogen bonding environment composed of five H₂O units. The abundance of such translationally restricted water molecules is essential in describing the local microphysical structure of water, and the height of the band is used to estimate the amount of such translationally restricted water molecules in solution. We were able to qualitatively reproduce and explain literature values of osmotic coefficients for the four studied electrolytes. Our results indicate that the effect of electrolyte dissociation, which decreases as a function of molality, dominates in dilute solutions, whereas changes in water structure are more significant at higher concentrations.
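For readers unfamiliar with how water activity and the osmotic coefficient are related, the sketch below uses the standard molality-based definition, φ = −ln(a_w) / (ν · m · M_w). It does not reproduce the paper's decomposition into dissociation and water-structure effects. The function names are hypothetical, and ν is assumed to be the number of ions per formula unit under complete dissociation (2 for NaCl, 3 for Na₂SO₄).

```python
import math

M_WATER = 0.0180153  # molar mass of water in kg/mol


def osmotic_coefficient(water_activity, molality, ions_per_formula_unit):
    """Molality-based osmotic coefficient: phi = -ln(a_w) / (nu * m * M_w)."""
    return -math.log(water_activity) / (ions_per_formula_unit * molality * M_WATER)


def water_activity(phi, molality, ions_per_formula_unit):
    """Inverse relation: a_w = exp(-phi * nu * m * M_w)."""
    return math.exp(-phi * ions_per_formula_unit * molality * M_WATER)


# Round trip for a hypothetical 1 mol/kg 1:1 electrolyte with phi = 1.
a_w = water_activity(1.0, 1.0, 2)
print(a_w, osmotic_coefficient(a_w, 1.0, 2))
```

With φ = 1 the solution behaves ideally in this convention; values below or above 1 correspond to the dip and subsequent rise with molality that the abstract describes.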

10.
Accelerated K-means clustering in metric spaces   Total citations: 1, self-citations: 0, citations by others: 1
The K-means method is a popular technique for clustering data into k-partitions. In the adaptive form of the algorithm, Lloyd's method, an iterative procedure alternately assigns cluster membership based on a set of centroids and then redefines the centroids based on the computed cluster membership. The most time-consuming part of this algorithm is the determination of which points being clustered belong to which cluster center. This paper discusses the use of the vantage-point tree as a method of more quickly assigning cluster membership when the points being clustered belong to intrinsically low- and medium-dimensional metric spaces. Results will be discussed from simulated data sets and real-world data in the clustering of molecular databases based upon physicochemical properties. Comparisons will be made to a highly optimized brute-force implementation of Lloyd's method and to other pruning strategies.
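The alternation described above can be sketched as a plain brute-force Lloyd iteration. This is only the baseline the abstract compares against, not the vantage-point-tree acceleration. The function names, the Euclidean metric, and the rule of leaving an empty cluster's centroid in place are assumptions made for this illustration.

```python
import math


def assign(points, centroids):
    """Brute-force assignment: index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its members; empty clusters stay put."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids


def lloyd(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the labels stop changing."""
    centroids = [tuple(c) for c in centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centers = lloyd(points, [(0, 0), (10, 10)])
print(labels, centers)
```

The `assign` step costs one distance evaluation per point per centroid on every iteration, which is the part a spatial index such as a vantage-point tree aims to prune.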

11.
Hierarchical clustering algorithms such as Ward's or complete-link are commonly used in compound selection and diversity analysis. Many such applications utilize binary representations of chemical structures, such as MACCS keys or Daylight fingerprints, and dissimilarity measures, such as the Euclidean or the Soergel measure. However, hierarchical clustering algorithms can generate ambiguous results owing to what is known in the cluster analysis literature as the ties in proximity problem, i.e., compounds or clusters of compounds that are equidistant from a compound or cluster in a given collection. Ambiguous ties can occur when clustering only a few hundred compounds, and the larger the number of compounds to be clustered, the greater the chance for significant ambiguity. Namely, as the number of "ties in proximity" increases relative to the total number of proximities, the possibility of ambiguity also increases. To ensure that there are no ambiguous ties, we show by a probabilistic argument that the number of compounds needs to be less than 2n^(1/4), where n is the total number of proximities, and the measure used to generate the proximities creates a uniform distribution without statistically preferred values. The common measures do not produce uniformly distributed proximities, but rather statistically preferred values that tend to increase the number of ties in proximity. Hence, the number of possible proximities and the distribution of statistically preferred values of a similarity measure, given a bit vector representation of a specific length, are directly related to the number of ties in proximities for a given data set. We explore the ties in proximity problem, using a number of chemical collections with varying degrees of diversity, given several common similarity measures and clustering algorithms. Our results are consistent with our probabilistic argument and show that this problem is significant for relatively small compound sets.
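A small sketch can make the ties in proximity problem concrete. It counts how many pairwise dissimilarities share a value with at least one other pair, using the Soergel measure, which for binary fingerprints equals 1 minus the Tanimoto coefficient. The fingerprints are toy sets of on-bit positions invented for this example, the function names are hypothetical, and exact fractions are used so that equal proximities compare as equal.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def soergel(a, b):
    """Soergel (1 - Tanimoto) dissimilarity between two sets of on-bit positions."""
    union = len(a | b)
    if union == 0:
        return Fraction(0)  # two empty fingerprints are treated as identical
    return 1 - Fraction(len(a & b), union)


def tie_summary(fingerprints):
    """Return (number of pairs, number of distinct values, number of pairs involved in ties)."""
    dists = [soergel(a, b) for a, b in combinations(fingerprints, 2)]
    counts = Counter(dists)
    tied = sum(c for c in counts.values() if c > 1)
    return len(dists), len(counts), tied


toy = [{1, 2}, {2, 3}, {1, 3}, {4}]
print(tie_summary(toy))
```

Here six pairs yield only two distinct dissimilarity values, so every pair is tied with another one, and an agglomerative algorithm would have to break those ties arbitrarily.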

12.
13.
14.
Hierarchical clustering is the most often used method for grouping similar patterns of gene expression data. A fundamental problem with existing implementations of this clustering method is the inability to handle large data sets within a reasonable time and memory resources. We propose a parallelized algorithm of hierarchical clustering to solve this problem. Our implementation on a multiple instruction multiple data (MIMD) architecture shows considerable reduction in computational time and inter-node communication overhead, especially for large data sets. We use the standard message-passing library, the Message Passing Interface (MPI), for any MIMD system.

15.
Integrins are transmembrane proteins that allow cells to bind to their external environment. They are the primary regulators of cell-matrix interactions, with direct roles in cell motility and signaling, which in turn regulate numerous physiological processes. Under common experimental conditions, integrins tend to cluster for sturdy and effective binding to extracellular matrix molecules. These clusters often evolve into focal adhesions, which regulate downstream signaling. However, integrin clusters are more pronounced and have longer lifetimes in two-dimensional assays than in more realistic three-dimensional environments. While a number of models and theoretical approaches have focused on integrin binding and diffusion, the reasons for the differences between two- and three-dimensional clustering have remained elusive. In this study, we model an individual cluster attached to a two-dimensional collagen film and attached to collagen fibers of various sizes in three-dimensional matrices. We then discuss how our results explain differences in size and lifetime, and how they hint at reasons for other differences between the two environments. Further, we make predictions regarding the stability of clusters based on different overall intracellular conditions. Our results show good agreement with experiments and provide a quantitative basis for understanding how matrix dimensionality and structure regulate integrin behavior in environments that mimic in vivo conditions.

16.
In this work we present a new method for investigating local energy minima on a protein energy landscape. The CABS (CAlpha, CBeta and the center of mass of the Side chain) method was employed for generating protein models, but any other method could be used instead. Cα traces from an ensemble of models are hierarchically clustered with the HCPM (Hierarchical Clustering of Protein Models) method. The efficiency of this method for sampling and analyzing energy landscapes is shown.

17.
The equilibrium absorption curve of benzene by EPR rubber (50 wt % C3) was measured at 23°C. The benzene activity a1 was plotted against its volume fraction φ1. The benzene clustering function G11/v1 was calculated according to Zimm, and the mechanism of formation of benzene clusters is discussed. It was found that clusters at very low benzene partial pressures are monomolecular and randomly distributed; at higher partial pressures, clusters increase in size. The Flory-Rehner theory, giving activity a1 against φ1, was checked; predictions of cluster formation were made, but there were some discrepancies with Zimm's conclusions.

18.
The far-infrared spectra of polystyrene-methacrylic acid (PSMA) ionomers have been investigated as a function of cation and ion-site concentration to obtain spectroscopic evidence of domain formation. A far-infrared band, due to the vibration of a higher-order cluster, is found at 170 cm⁻¹ in Na⁺-form PSMA. This band, which is observed in addition to the known cation-motion bands, is assigned to the vibrations of aggregates involving many cations and anionic sites close together, and the results are discussed in light of ion aggregation models.

19.
We have fabricated biocompatible nanofiber hydrogels with diverse sizes of ferritin clusters according to the mixing temperature of solutions employing electrospinning. Poly(vinyl alcohol) (PVA) was used as a polymeric matrix for fabricating nanocomposites. By thermal means we controlled the interaction between the host PVA hydrogel and the protein shell on ferritin bionanoparticles to vary the size and concentration of ferritin clusters. The clustering of ferritin was based on the partial unfolding of a protein shell of ferritin. By studying the magnetic properties of the PVA/ferritin nanofibers according to the mixing temperature of the PVA/ferritin solutions, we confirmed that the clustering process of the ferritin was related to changes in the superparamagnetic properties and magnetic resonance imaging (MRI) contrast of the PVA/ferritin nanofibers. PVA/ferritin nanofiber hydrogels with diverse spatial distributions of ferritin nanoparticles are applicable as MRI-based noninvasive detectable cell culture scaffolds and as artificial muscles because of their improved superparamagnetic properties.

20.