
1.

Comparison of similarity coefficients for clustering and compound selection

Haranczyk M, Holliday J 《Journal of chemical information and modeling》2008,48(3):498-508

Recent studies into the use of a selection of similarity coefficients, when applied to searches of chemical databases represented by binary fingerprints, have shown considerable variation in their retrieval performance and in the sets of compounds being retrieved. The main factor influencing performance is the density distribution of the bitstrings for the active class, a feature which is closely related to molecular size. If this is the case when these coefficients are applied to similarity searches, then we would expect considerable variation in performance when applied to dissimilarity methods, namely clustering and compound selection. Here we report on several studies which have been undertaken to investigate the relative performance of 13 association and correlation coefficients, which have been shown to exhibit complementary performance in similarity searches, when applied to hierarchical and nonhierarchical clustering methods and to a compound selection methodology. Results suggest that the correlation coefficients perform consistently well for clustering and compound selection, as does the Baroni-Urbani/Buser association coefficient. Surprisingly, these often outperform the Tanimoto coefficient, while the Simple Match (effectively the complement of the Squared Euclidean Distance) performs very poorly.
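As a rough illustration of three of the coefficients named in this abstract, the sketch below computes them from the usual 2x2 contingency counts of two binary fingerprints. The function names and the toy bitstrings are hypothetical, and the formulas are the standard textbook definitions rather than anything taken from the paper itself.

```python
from math import sqrt

def counts(x, y):
    """Return (a, b, c, d) for two equal-length 0/1 sequences.

    a: bits set in both, b: set only in x, c: set only in y, d: set in neither.
    """
    if len(x) != len(y) or not x:
        raise ValueError("fingerprints must be non-empty and of equal length")
    a = sum(1 for i, j in zip(x, y) if i and j)
    b = sum(1 for i, j in zip(x, y) if i and not j)
    c = sum(1 for i, j in zip(x, y) if not i and j)
    d = len(x) - a - b - c
    return a, b, c, d

def tanimoto(x, y):
    """Shared on-bits divided by all on-bits; ignores shared off-bits."""
    a, b, c, _ = counts(x, y)
    return a / (a + b + c) if (a + b + c) else 1.0

def simple_match(x, y):
    """Fraction of positions that agree; equals 1 - (b + c) / n."""
    a, b, c, d = counts(x, y)
    return (a + d) / (a + b + c + d)

def baroni_urbani_buser(x, y):
    """Like Tanimoto, but credits shared off-bits through sqrt(a * d)."""
    a, b, c, d = counts(x, y)
    root = sqrt(a * d)
    denom = root + a + b + c
    return (root + a) / denom if denom else 1.0

# Toy 8-bit fingerprints (hypothetical data).
x = [1, 1, 0, 0, 1, 0, 0, 0]
y = [1, 0, 0, 0, 1, 1, 0, 0]
print(tanimoto(x, y))                       # 0.5
print(simple_match(x, y))                   # 0.75
print(round(baroni_urbani_buser(x, y), 3))  # 0.707
```

For binary vectors the squared Euclidean distance is b + c, which is why Simple Match behaves as its complement: both treat shared off-bits and shared on-bits alike.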

2.

Protocols for bridging the peptide to nonpeptide gap in topological similarity searches

Sheridan RP, Singh SB, Fluder EM, Kearsley SK 《Journal of chemical information and computer sciences》2001,41(5):1395-1406

3.

GPU accelerated chemical similarity calculation for compound library comparison

Ma C, Wang L, Xie XQ 《Journal of chemical information and modeling》2011,51(7):1521-1527

4.

New diversity calculations algorithms used for compound selection

Trepalin SV, Gerasimenko VA, Kozyukov AV, Savchuk NP, Ivaschenko AA 《Journal of chemical information and computer sciences》2002,42(2):249-258

Some modifications were introduced into the previously described Centroid diversity sorting algorithm, which uses the cosine similarity metric. The modified algorithm is suitable for work with large databases on personal computers. For example, diversity sorting of a database of more than a million records requires less than 9 h (Pentium III, 800 MHz). The problem of selecting new compounds to add to an existing collection so as to maximize the diversity of the collection is examined. The article describes a new algorithm for the selection of heterocyclic compounds.

5.

Classification scheme for the design of serine protease targeted compound libraries

Lang SA, Kozyukov AV, Balakin KV, Skorenko AV, Ivashchenko AA, Savchuk NP 《Journal of computer-aided molecular design》2002,16(11):803-807

6.

An algorithm for ASTM infrared file searches based on intensity data

K. Tanabe, T. Tamura, J. Hiraishi, S. Saëki 《Analytica chimica acta》1979,112(3):211-218

The computer program described utilizes peak intensity data for ASTM infrared file searching. Peaks in unknown spectra are classified into five groups according to their relative intensities, and the scores for matches with the ASTM data are calculated by means of the intensity data. A test search with 135 compounds proves the excellent performance of the proposed method. The advantages of the system are that correct answers can be found easily and that no particular attention is necessary for selection of the peaks to be entered.

7.

Linear notation for benzenoid aromatic hydrocarbons. Molecular similarity based on notation similarity

W. C. Herndon, A. J. Bruce 《Journal of mathematical chemistry》1988,2(2):155-169

Two succinct linear notation systems to encode the structure of polybenzenoid aromatic hydrocarbons are exemplified. Both notation systems use a labeled dual inner graph to represent the hydrocarbon. A molecular similarity index ranging from unity (identical molecules) to zero (completely different molecules) is defined based on a comparison of the linear notations for a pair of compounds. The similarity index procedure is applied to a correlation of the carcinogenic properties of the benzenoid hydrocarbons.

8.

SVM-based feature selection for characterization of focused compound collections

Byvatov E, Schneider G 《Journal of chemical information and computer sciences》2004,44(3):993-999

9.

Rational principles of compound selection for combinatorial library design

Tropsha A, Zheng W 《Combinatorial chemistry & high throughput screening》2002,5(2):111-123

It is practically impossible in a short period of time to synthesize and test all compounds in any large exhaustive chemical library. We discuss rational approaches to selecting representative subsets of virtual libraries that help direct experimental synthetic efforts for both targeted and diverse library design. For targeted library design, we consider principles based on the similarity to lead molecules. In the case of diverse library design, we discuss algorithms aimed at the selection of both diverse and representative subsets of the entire chemical library space. We illustrate methodologies with several practical examples.

10.

Visualization of multi-property landscapes for compound selection and optimization

Antonio de la Vega de León, Shilva Kayastha, Dilyana Dimova, Thomas Schultz, Jürgen Bajorath 《Journal of computer-aided molecular design》2015,29(8):695-705

11.

Heikamp K, Bajorath J 《Journal of chemical information and modeling》2011,51(8):1831-1839

A large-scale similarity search investigation has been carried out on 266 well-defined compound activity classes extracted from the ChEMBL database. The analysis was performed using two widely applied two-dimensional (2D) fingerprints that mark opposite ends of the current performance spectrum of these types of fingerprints, i.e., MACCS structural keys and the extended connectivity fingerprint with bond diameter four (ECFP4). For each fingerprint, three nearest neighbor search strategies were applied. On the basis of these search calculations, a similarity search profile of the ChEMBL database was generated. Overall, the fingerprint search campaign was surprisingly successful. In 203 of 266 test cases (~76%), a compound recovery rate of at least 50% was observed with at least the better performing fingerprint and one search strategy. The similarity search profile also revealed several general trends. For example, fingerprint searching was often characterized by an early enrichment of active compounds in database selection sets. In addition, compound activity classes have been categorized according to different similarity search performance levels, which helps to put the results of benchmark calculations into perspective. Therefore, a compendium of activity classes falling into different search performance categories is provided. On the basis of our large-scale investigation, the performance range of state-of-the-art 2D fingerprinting has been delineated for compound data sets directed against a wide spectrum of pharmaceutical targets.
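To make the terms "nearest neighbor search" and "compound recovery rate" concrete, here is a minimal sketch under stated assumptions. The abstract does not say which three search strategies were used; the code shows only one common variant (1-NN, where each database compound is scored by its best Tanimoto similarity to any reference compound). Fingerprints are represented as sets of on-bit indices, and all names and data are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0

def one_nn_scores(references, database):
    """Score each database compound by its highest similarity to any reference."""
    return {name: max(tanimoto(fp, ref) for ref in references)
            for name, fp in database.items()}

def recovery_rate(scores, actives, selection_size):
    """Fraction of known actives found among the top-ranked compounds."""
    # Sort by descending score; ties are broken by name for reproducibility.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    selected = set(ranked[:selection_size])
    return len(selected & actives) / len(actives)

# Hypothetical toy data: two reference actives, a four-compound database.
references = [{1, 2, 3, 4}, {2, 3, 5}]
database = {
    "act1": {1, 2, 3},
    "act2": {2, 3, 5, 6},
    "dec1": {7, 8, 9},
    "dec2": {1, 9, 10},
}
actives = {"act1", "act2"}

scores = one_nn_scores(references, database)
print(recovery_rate(scores, actives, selection_size=2))  # 1.0
```

A recovery rate of at least 0.5 at a chosen selection-set size corresponds to the "at least 50%" threshold quoted in the abstract.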

12.

A chemometric approach based on a novel similarity/diversity measure for the characterisation and selection of electronic nose sensors

Ballabio D, Cosio MS, Mannino S, Todeschini R 《Analytica chimica acta》2006,578(2):170-177

Electronic nose sensor signals provide a digital fingerprint of the product in analysis, which can be subsequently investigated by means of chemometrics. In this paper, the fingerprint characterisation of electronic nose data has been studied by means of a novel chemometric approach based on the partial ordering technique and the Hasse matrix. This matrix can be associated with each data sequence and the similarity between two sequences can be evaluated with the definition of a distance between the corresponding Hasse matrices. Since all the signals acquired over time are intrinsically ordered, the data provided by the electronic nose can also be considered as sequential data and consequently characterized by means of the proposed approach. The similarity/diversity measure has been applied here in order to characterize the class discrimination capability of each electronic nose sensor: extra virgin olive oil samples of different geographical origin have been considered and Hasse distances have been used to select the sensors which appear more able to discriminate the olive oil origins. The distance based on the Hasse matrix has shown some useful properties and proved to be able to link each electronic nose time profile to a meaningful mathematical term (the Hasse matrix), which can be consequently studied by multivariate analysis.

13.

Kernel approach to molecular similarity based on iterative graph similarity

Rupp M, Proschak E, Schneider G 《Journal of chemical information and modeling》2007,47(6):2280-2286

Similarity measures for molecules are of basic importance in chemical, biological, and pharmaceutical applications. We introduce a molecular similarity measure defined directly on the annotated molecular graph, based on iterative graph similarity and optimal assignments. We give an iterative algorithm for the computation of the proposed molecular similarity measure, prove its convergence and the uniqueness of the solution, and provide an upper bound on the required number of iterations necessary to achieve a desired precision. Empirical evidence for the positive semidefiniteness of certain parametrizations of our function is presented. We evaluated our molecular similarity measure by using it as a kernel in support vector machine classification and regression applied to several pharmaceutical and toxicological data sets, with encouraging results.

14.

A new class of organometallic compound

Virendra P. Singh, Vidya B. Pandey, Sanjay K. Pandey 《Transition Metal Chemistry》1990,15(1):16-18

Summary: Organometallic compounds of general formula (SCN)₂M(NCSeHgR)₂ (M=Co^II, Ni^II, R=n-C₅H₁₁,i-C₅H₁₁) have been prepared. They behave as Lewis acids, forming complexes with pyridine and 2,2′-bipyridyl, characterized by elemental analysis, molecular weight, molar conductance, i.r. spectral (4000–200 cm^–1), electronic spectral and magnetic susceptibility measurements. The Lewis acids are monomeric with bridging thiocyanate, or selenocyanate between M²⁺ and Hg²⁺. Cobalt and nickel acquire tetrahedral and octahedral configurations respectively through axial bridging, whereas mercury retains its linearity. Pyridine links to the metal in the Lewis acid and forms L₂(SCN)₂M(NCSeHgR)₂ complexes. Bipyridyl ruptures the NCX bridge and forms cationic-anionic [M(bipy)₃][(NCS)(NCSe)HgR]₂ complexes.

15.

Identification of liquid crystalline phases in 7O.O9 compound based on structural similarity index measure

B.T.P. Madhav, M. Venu Gopala Rao 《Liquid crystals》2013,40(2):198-203

Textural analysis is done in the case of the thermotropic liquid crystal (LC), 4-heptyloxybenzylidene-4′-nonyloxyaniline, 7O.O9, using a polarising microscope attached with a hot stage with a high-resolution camera. Natural images are highly structured: their pixels exhibit strong dependencies and carry important information about the structure of the objects. In this article, we consider the structural similarity index measure parameter computed as a function of the temperature. The results exhibit abrupt changes with temperature showing different liquid crystalline phases. This statistical image analysis is compared with the differential scanning calorimeter data and good agreement was found. The proposed methodology is a very sensitive and reliable technique to identify the LC phases.
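The idea of tracking a structural similarity index against temperature can be sketched as follows. This is a simplified, single-window ("global") form of the widely used SSIM formula with the conventional constants K1 = 0.01 and K2 = 0.03; the standard index averages the same expression over local windows. The abstract does not state which images are compared, so comparing each texture image with the one at the previous temperature is purely an assumption here, as are the function names.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM of two images given as flat lists of pixel intensities."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("images must have the same size and at least 2 pixels")
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((p - mx) * (p - mx) for p in x) / (n - 1)
    vy = sum((q - my) * (q - my) for q in y) / (n - 1)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y)) / (n - 1)
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))

def ssim_vs_temperature(frames):
    """Compare each frame with the previous one (assumed comparison scheme).

    frames: list of (temperature, pixels) pairs ordered by temperature.
    Returns (temperature, ssim) pairs; a sharp drop would suggest a texture change.
    """
    return [(t2, global_ssim(p1, p2))
            for (_, p1), (t2, p2) in zip(frames, frames[1:])]

# Hypothetical 4-pixel "images" at three temperatures.
frames = [
    (70.0, [10, 20, 30, 40]),
    (71.0, [10, 20, 30, 40]),    # unchanged texture
    (72.0, [200, 50, 220, 60]),  # very different texture
]
for temperature, value in ssim_vs_temperature(frames):
    print(temperature, round(value, 3))
```

In practice a library implementation with local windows would be preferable; the sketch is only meant to show how one scalar per temperature step is obtained.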

16.

Column selection and optimization for sulfur compound analyses by gas chromatography

Richard S. Hutte, Neil G. Johansen, Marianne F. Legier 《Journal of separation science》1990,13(6):421-426

The analysis of sulfur-containing compounds using fused silica capillary columns and the Sulfur Chemiluminescence Detector has been investigated. This combination of an inert chromatographic system and a high sensitivity, selective detector provides significant advantages for the analysis of low levels of sulfur compounds in complex matrices over existing techniques. Capillary columns coated with thick films (1–4 μm) of methyl silicone stationary phase permit separation of most sulfur-containing compounds and, when used with sub-ambient column temperatures, these columns can be used for the separation of sulfur gases. The effects of stationary film thickness, column length, and internal diameter for the measurement of sulfur compounds in hydrocarbon matrices have been determined.

17.

Yao YH, Dai Q, Nan XY, He PA, Nie ZM, Zhou SP, Zhang YZ 《Journal of computational chemistry》2008,29(10):1632-1639

On the basis of a class of 2D graphical representations of DNA sequences, sensitivity analysis has been performed, showing the high capability of the proposed representations to take into account small modifications of the DNA sequences. The sensitivity analysis also indicates that the absolute differences of the leading eigenvalues of the L/L matrices associated with DNA increase as the number of base mutations increases. In addition, we conclude that the similarity analysis method based on the correlation angles eliminates the effects of the lengths of DNA sequences better than the method using the Euclidean distances. As an application, the examination of similarities/dissimilarities among the coding sequences of the first exon of the beta-globin gene of different species has been performed by our method, and the reasonable results verify the validity of our method.
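The contrast between correlation angles and Euclidean distances can be illustrated with a small sketch. It assumes, hypothetically, that each sequence is summarized by a numeric descriptor vector whose magnitude grows with sequence length; the paper's actual descriptors (derived from its graphical representations) are not reproduced here. The angle between two vectors is unchanged when one of them is rescaled, whereas the Euclidean distance is not.

```python
from math import acos, sqrt

def euclidean(u, v):
    """Euclidean distance between two equal-length descriptor vectors."""
    return sqrt(sum((a - b) * (a - b) for a, b in zip(u, v)))

def correlation_angle(u, v):
    """Angle in radians between two non-zero descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return acos(cosine)

# Hypothetical descriptors: v is u scaled by 2, mimicking a longer sequence
# with the same composition; w is a genuinely different sequence.
u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.0]
w = [3.0, 2.0, 1.0]

print(correlation_angle(u, w), correlation_angle(v, w))  # same angle for u and v
print(euclidean(u, w), euclidean(v, w))                  # distances differ
```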

18.

Wale N, Watson IA, Karypis G 《Journal of chemical information and modeling》2008,48(4):730-741

Methods that can screen large databases to retrieve a structurally diverse set of compounds with desirable bioactivity properties are critical in the drug discovery and development process. This paper presents a set of such methods that are designed to find compounds that are structurally different to a certain query compound while retaining its bioactivity properties (scaffold hops). These methods utilize various indirect ways of measuring the similarity between the query and a compound that take into account additional information beyond their structure-based similarities. The set of techniques that are presented capture these indirect similarities using approaches based on analyzing the similarity network formed by the query and the database compounds. Experimental evaluation shows that most of these methods substantially outperform previously developed approaches both in terms of their ability to identify structurally diverse active compounds as well as active compounds in general.

19.

Statistical analysis and compound selection of combinatorial libraries for soluble epoxide hydrolase

Xing L, Goulet R, Johnson K 《Journal of chemical information and modeling》2011,51(7):1582-1592

20.

Synthesis of a dysidiolide-inspired compound library and discovery of acetylcholinesterase inhibitors based on protein structure similarity clustering (PSSC)

Michael Scheck, Marcus A. Koch, Herbert Waldmann 《Tetrahedron》2008,64(21):4792-4802

Biologically relevant compound collections are a major prerequisite for efficient protein ligand development and ultimately for drug discovery. We herein describe the development of a compound collection inspired by the decalin core motif of two natural products, dysidiolide 1 and sulfiricin 2, both inhibitors of the Cdc25A phosphatase. Several keto-functionalized decalinols were synthesized in solution, immobilized on Merrifield resin equipped with a dihydropyranyl linker, and then subjected to aldol condensation reactions with different aldehydes leading to exocyclic E-configured olefins. Further diversity-increasing transformations on the solid support included Sonogashira, Suzuki, and Heck reactions, Cu-catalyzed conjugate addition and Grignard reactions, alkylation reactions in the α-position to a ketone, Wittig reactions, and reductive aminations. In total, 483 compounds were synthesized. Cdc25A and AChE exhibit structural similarity in their ligand-sensing cores and were thus grouped into a protein structure similarity cluster (PSSC). A screen for AChE inhibition of a subset of 162 compounds yielded three micromolar inhibitors of AChE with IC₅₀ values <20 μM.