1.

Representation of molecular structure using quantum topology with inductive logic programming in structure–activity relationships

Buttingsrud B Ryeng E King RD Alsberg BK 《Journal of computer-aided molecular design》2006,20(6):361-373

The requirement of aligning each individual molecule in a data set severely limits the type of molecules which can be analysed with traditional structure activity relationship (SAR) methods. A method which solves this problem by using relations between objects is inductive logic programming (ILP). Another advantage of this methodology is its ability to include background knowledge as 1st-order logic. However, previous molecular ILP representations have not been effective in describing the electronic structure of molecules. We present a more unified and comprehensive representation based on Richard Bader's quantum topological atoms in molecules (AIM) theory where critical points in the electron density are connected through a network. AIM theory provides a wealth of chemical information about individual atoms and their bond connections enabling a more flexible and chemically relevant representation. To obtain even more relevant rules with higher coverage, we apply manual postprocessing and interpretation of ILP rules. We have tested the usefulness of the new representation in SAR modelling on classifying compounds of low/high mutagenicity and on a set of factor Xa inhibitors of high and low affinity.

2.

Scaffold hopping in drug discovery using inductive logic programming

Tsunoyama K Amini A Sternberg MJ Muggleton SH 《Journal of chemical information and modeling》2008,48(5):949-957

In chemoinformatics, searching for compounds which are structurally diverse and share a biological activity is called scaffold hopping. Scaffold hopping is important since it can be used to obtain alternative structures when the compound under development has unexpected side-effects. Pharmaceutical companies use scaffold hopping when they wish to circumvent prior patents for targets of interest. We propose a new method for scaffold hopping using inductive logic programming (ILP). ILP uses the observed spatial relationships between pharmacophore types in pretested active and inactive compounds and learns human-readable rules describing the diverse structures of active compounds. The ILP-based scaffold hopping method is compared to two previous algorithms (chemically advanced template search, CATS, and CATS3D) on 10 data sets with diverse scaffolds. The comparison shows that the ILP-based method is significantly better than random selection while the other two algorithms are not. In addition, the ILP-based method retrieves new active scaffolds which were not found by CATS and CATS3D. The results show that the ILP-based method is at least as good as the other methods in this study. ILP produces human-readable rules, which makes it possible to identify the three-dimensional features that lead to scaffold hopping. A minor variant of a rule learnt by ILP for scaffold hopping was subsequently found to cover an inhibitor identified by an independent study. This provides a successful result in a blind trial of the effectiveness of ILP to generate rules for scaffold hopping. We conclude that ILP provides a valuable new approach for scaffold hopping.

3.

Quantitative structure-activity relationships by neural networks and inductive logic programming. I. The inhibition of dihydrofolate reductase by pyrimidines (total citations: 1; self-citations: 0; citations by others: 1)

Jonathan D. Hirst Ross D. King Michael J. E. Sternberg 《Journal of computer-aided molecular design》1994,8(4):405-420

Summary: Neural networks and inductive logic programming (ILP) have been compared to linear regression for modelling the QSAR of the inhibition of E. coli dihydrofolate reductase (DHFR) by 2,4-diamino-5-(substituted benzyl)pyrimidines, and, in the subsequent paper [Hirst, J.D., King, R.D. and Sternberg, M.J.E., J. Comput.-Aided Mol. Design, 8 (1994) 421], the inhibition of rodent DHFR by 2,4-diamino-6,6-dimethyl-5-phenyl-dihydrotriazines. Cross-validation trials provide a statistically rigorous assessment of the predictive capabilities of the methods, with training and testing data selected randomly and all the methods developed using identical training data. For the ILP analysis, molecules are represented by attributes other than Hansch parameters. Neural networks and ILP perform better than linear regression using the attribute representation, but the difference is not statistically significant. The major benefit from the ILP analysis is the formulation of understandable rules relating the activity of the inhibitors to their chemical structure.

4.

Quantitative structure-activity relationships by neural networks and inductive logic programming. II. The inhibition of dihydrofolate reductase by triazines (total citations: 1; self-citations: 0; citations by others: 1)

Jonathan D. Hirst Ross D. King Michael J. E. Sternberg 《Journal of computer-aided molecular design》1994,8(4):421-432

Summary: One of the largest available data sets for developing a quantitative structure-activity relationship (QSAR), the inhibition of dihydrofolate reductase (DHFR) by 2,4-diamino-6,6-dimethyl-5-phenyl-dihydrotriazine derivatives, has been used for a sixfold cross-validation trial of neural networks, inductive logic programming (ILP) and linear regression. No statistically significant difference was found between the predictive capabilities of the methods. However, the representation of molecules by attributes, which is integral to the ILP approach, provides understandable rules about drug-receptor interactions.
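The sixfold cross-validation trial mentioned in this abstract can be illustrated with a minimal Python sketch. It only shows how a data set might be split into six disjoint test folds with random assignment; the function names, the seed and the round-robin fold assignment are assumptions made for illustration and are not taken from the paper.

```python
import random


def k_fold_indices(n_samples, k=6, seed=0):
    """Split sample indices 0..n_samples-1 into k disjoint folds."""
    indices = list(range(n_samples))
    # Shuffle so that training and testing data are selected randomly.
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index; fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validation_splits(n_samples, k=6, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    # Hypothetical data set of 20 compounds.
    for train, test in cross_validation_splits(20, k=6):
        print(len(train), len(test))
```

Every method under comparison would be trained on the same `train` indices and scored on the same `test` indices, which is what makes the comparison between methods fair.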

5.

Warmr: a data mining tool for chemical data (total citations: 5; self-citations: 0; citations by others: 5)

King RD Srinivasan A Dehaspe L 《Journal of computer-aided molecular design》2001,15(2):173-181

6.

A method of microarray data storage using array data type

Tsoi LC Zheng WJ 《Computational Biology and Chemistry》2007,31(2):143-147

7.

A novel logic-based approach for quantitative toxicology prediction

Amini A Muggleton SH Lodhi H Sternberg MJ 《Journal of chemical information and modeling》2007,47(3):998-1006

8.

Finding rule groups to classify high dimensional gene expression datasets

An J Chen YP 《Computational Biology and Chemistry》2009,33(1):108-113

9.

The discovery of indicator variables for QSAR using inductive logic programming

Ross D. King Ashwin Srinivasan 《Journal of computer-aided molecular design》1997,11(6):571-580

10.

An alignment‐free methodology for modelling field‐based 3D‐structure activity relationships using inductive logic programming

Bård Buttingsrud Ross Donald King Bjørn Kåre Alsberg 《Journal of Chemometrics》2007,21(12):509-519

Traditional 3D‐quantitative structure–activity relationship (QSAR)/structure–activity relationship (SAR) methodologies are sensitive to the quality of an alignment step which is required to make molecular structures comparable. Even though many methods have been proposed to solve this problem, they often result in a loss of model interpretability. The requirement of alignment is a restriction imposed by traditional regression methods due to their failure to represent relations between data objects directly. Inductive logic programming (ILP) is a class of machine‐learning methods able to describe relational data directly. We propose a new methodology which is aimed at using the richness in molecular interaction fields (MIFs) without being restricted by any alignment procedure. A set of MIFs is computed and further compressed by finding their minima corresponding to the sites of strongest interaction between a molecule and the applied test probe. ILP uses these minima to build easily interpretable rules about activity expressed as pharmacophore rules in the powerful language of first‐order logic. We use a set of previously published inhibitors of factor Xa of the benzamidine family to discuss the problems, requirements and advantages of the new methodology. Copyright © 2007 John Wiley & Sons, Ltd.

11.

Effect of local background intensities in the normalization of cDNA microarray data with a skewed expression profiles

Kim JH Shin DM Lee YS 《Experimental & molecular medicine》2002,34(3):224-232

Normalization of the data of cDNA microarray is an obligatory step during microarray experiments due to the relatively frequent non-specific errors. Generally, normalization of microarray data is based on the null hypothesis and variance model. In the Yang's model (Yang et al., 2001), at least two types of noises are included. The one is additive noise and the other is multiplicative noise. Usually, background is considered as one of additive noise to the signal and the variation between the signal pixels is the representative multiplicative noise. In this study, the relation between the signal (spot intensity minus background intensity) and background was observed and the influence of background on normalization as a representative additive factor was investigated. Although the relation has not been considered as a factor affecting the normalization, it could improve the accuracy of microarray data when the normalization was carried out considering signal/background ratio. The background dependent normalization decreased the number of genes whose expression levels were changed significantly and it could make their distribution more consistent through the whole range of signal intensities. In this study, printing pin dependent normalization was also carried out regarding the printing pin as a representative multiplicative noise. It improved the distribution of spots in the Cy3-Cy5 scatter plot, but its effect was slight. These studies suggest that there are some influences of the signals on the local backgrounds and they must be considered for the normalization of cDNA microarray data.
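The quantities named in this abstract (signal as spot intensity minus background intensity, and the signal/background ratio) can be written down as a small Python sketch. The function names and the handling of non-positive signals are assumptions, and the median centering step is a generic global normalization included only for context; it is not the background-dependent scheme of the paper, whose details the abstract does not give.

```python
import math
from statistics import median


def corrected_signal(spot_intensity, background_intensity):
    """Signal as defined in the abstract: spot intensity minus background."""
    return spot_intensity - background_intensity


def signal_to_background(spot_intensity, background_intensity):
    """Signal/background ratio; assumes a strictly positive background."""
    signal = corrected_signal(spot_intensity, background_intensity)
    return signal / background_intensity


def log_ratio(cy5_spot, cy5_bg, cy3_spot, cy3_bg):
    """log2(Cy5 signal / Cy3 signal), or None if either signal is not positive."""
    cy5 = corrected_signal(cy5_spot, cy5_bg)
    cy3 = corrected_signal(cy3_spot, cy3_bg)
    if cy5 <= 0 or cy3 <= 0:
        return None  # assumption: such spots are excluded rather than clipped
    return math.log2(cy5 / cy3)


def median_center(log_ratios):
    """Generic global normalization (an assumption, not the paper's method)."""
    centre = median(log_ratios)
    return [value - centre for value in log_ratios]
```

A background-dependent variant would presumably apply the centering separately within bins of the signal/background ratio, but that binning is a guess and is left out here.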

12.

Identification and characterization of differentially expressed genes in Type 2 Diabetes using in silico approach

《Computational Biology and Chemistry》2019

13.

Variable selection using probability density function similarity for support vector machine classification of high-dimensional microarray data

Li-Juan Tang Hai-Long Wu 《Talanta》2009,79(2):260-1694

One problem with discriminant analysis of microarray data is representation of each sample by a large number of genes that are possibly irrelevant, insignificant or redundant. Methods of variable selection are, therefore, of great significance in microarray data analysis. To circumvent the problem, a new gene mining approach is proposed based on the similarity between probability density functions on each gene for the class of interest with respect to the others. This method allows the ascertainment of significant genes that are informative for discriminating each individual class rather than maximizing the separability of all classes. Then one can select genes containing important information about the particular subtypes of diseases. Based on the mined significant genes for individual classes, a support vector machine with local kernel transform is constructed for the classification of different diseases. The combination of the gene mining approach with support vector machine is demonstrated for cancer classification using two public data sets. The results reveal that significant genes are identified for each cancer, and the classification model shows satisfactory performance in training and prediction for both data sets.

14.

Improved silicon nitride surfaces for next-generation microarrays

Terry JG Campbell CJ Ross AJ Livingston AD Buck AH Dickinson P Mountford CP Evans SA Mount AR Beattie JS Crain J Ghazal P Walton AJ 《Langmuir : the ACS journal of surfaces and colloids》2006,22(26):11400-11404

This work reports how the use of a standard integrated circuit (IC) fabrication process can improve the potential of silicon nitride layers as substrates for microarray technology. It has been shown that chemical mechanical polishing (CMP) substantially improves the fluorescent intensity of positive control gene and test gene microarray spots on both low-pressure chemical vapor deposition (LPCVD) and plasma-enhanced chemical vapor deposition (PECVD) silicon nitride films, while maintaining a low fluorescent background. This results in the improved discrimination of low expressing genes. The results for the PECVD silicon nitride, which has been previously reported as unsuitable for microarray spotting, are particularly significant for future devices that hope to incorporate microelectronic control and analysis circuitry, due to the film's use as a final passivating layer.

15.

Gene selection from microarray data for cancer classification--a machine learning approach (total citations: 1; self-citations: 0; citations by others: 1)

Wang Y Tetko IV Hall MA Frank E Facius A Mayer KF Mewes HW 《Computational Biology and Chemistry》2005,29(1):1384-46

A DNA microarray can track the expression levels of thousands of genes simultaneously. Previous research has demonstrated that this technology can be useful in the classification of cancers. Cancer microarray data normally contains a small number of samples which have a large number of gene expression levels as features. To select relevant genes involved in different types of cancer remains a challenge. In order to extract useful gene information from cancer microarray data and reduce dimensionality, feature selection algorithms were systematically investigated in this study. Using a correlation-based feature selector combined with machine learning algorithms such as decision trees, naive Bayes and support vector machines, we show that classification performance at least as good as published results can be obtained on acute leukemia and diffuse large B-cell lymphoma microarray data sets. We also demonstrate that a combined use of different classification and feature selection approaches makes it possible to select relevant genes with high confidence. This is also the first paper which discusses both computational and biological evidence for the involvement of zyxin in leukaemogenesis.

16.

Unimodal transform of variables selected by interval segmentation purity for classification tree modeling of high-dimensional microarray data

Du W Gu T Tang LJ Jiang JH Wu HL Shen GL Yu RQ 《Talanta》2011,85(3):1689-1694

As a greedy search algorithm, classification and regression tree (CART) is easily relapsing into overfitting while modeling microarray gene expression data. A straightforward solution is to filter irrelevant genes via identifying significant ones. Considering some significant genes with multi-modal expression patterns exhibiting systematic difference in within-class samples are difficult to be identified by existing methods, a strategy that unimodal transform of variables selected by interval segmentation purity (UTISP) for CART modeling is proposed. First, significant genes exhibiting varied expression patterns can be properly identified by a variable selection method based on interval segmentation purity. Then, unimodal transform is implemented to offer unimodal featured variables for CART modeling via feature extraction. Because significant genes with complex expression patterns can be properly identified and unimodal feature extracted in advance, this developed strategy potentially improves the performance of CART in combating overfitting or underfitting while modeling microarray data. The developed strategy is demonstrated using two microarray data sets. The results reveal that UTISP-based CART provides superior performance to k-nearest neighbors or CARTs coupled with other gene identifying strategies, indicating UTISP-based CART holds great promise for microarray data analysis.

17.

Chemical data mining of the NCI human tumor cell line database

Wang H Klinginsmith J Dong X Lee AC Guha R Wu Y Crippen GM Wild DJ 《Journal of chemical information and modeling》2007,47(6):2063-2076

The NCI Developmental Therapeutics Program Human Tumor cell line data set is a publicly available database that contains cellular assay screening data for over 40 000 compounds tested in 60 human tumor cell lines. The database also contains microarray assay gene expression data for the cell lines, and so it provides an excellent information resource particularly for testing data mining methods that bridge chemical, biological, and genomic information. In this paper we describe a formal knowledge discovery approach to characterizing and data mining this set and report the results of some of our initial experiments in mining the set from a chemoinformatics perspective.

18.

High throughput and global approaches to gene expression

Ghosh D 《Combinatorial chemistry & high throughput screening》2000,3(5):411-420

In the past several years, a new set of technologies based on whole genome analysis have revolutionized the study of gene expression. These microarray or "gene chip" technologies, which arose out of the development of large-scale sequencing approaches, are now coming into increasing use, generating a far greater volume of data than the data representing the sequences themselves. This review focuses on the current state of development of these technologies, and the available approaches to manage and analyze the information they generate. The applicability of this technology to general problems in biomedicine is also discussed.

19.

Enhanced Pharmaceutically Active Compounds Productivity from Streptomyces SUK 25: Optimization, Characterization, Mechanism and Techno-Economic Analysis

Muhanna Mohammed Al-Shaibani Radin Maya Saphira Radin Mohamed Noraziah Mohamad Zin Adel Al-Gheethi Mohammed Al-Sahari Hesham Ali El Enshasy 《Molecules (Basel, Switzerland)》2021,26(9)

The present research aimed to enhance the pharmaceutically active compounds’ (PhACs’) productivity from Streptomyces SUK 25 in submerged fermentation using response surface methodology (RSM) as a tool for optimization. Besides, the characteristics and mechanism of PhACs against methicillin-resistant Staphylococcus aureus were determined. Further, the techno-economic analysis of PhACs production was estimated. The independent factors include the following: incubation time, pH, temperature, shaker rotation speed, the concentration of glucose, mannitol, and asparagine, although the responses were the dry weight of crude extracts, minimum inhibitory concentration, and inhibition zone and were determined by RSM. The PhACs were characterized using GC-MS and FTIR, while the mechanism of action was determined using gene ontology extracted from DNA microarray data. The results revealed that the best operating parameters for the dry mass crude extracts production were 8.20 mg/L, the minimum inhibitory concentrations (MIC) value was 8.00 µg/mL, and an inhibition zone of 17.60 mm was determined after 12 days, pH 7, temperature 28 °C, shaker rotation speed 120 rpm, 1 g glucose /L, 3 g mannitol/L, and 0.5 g asparagine/L with R² coefficient value of 0.70. The GC-MS and FTIR spectra confirmed the presence of 21 PhACs, and several functional groups were detected. The gene ontology revealed that 485 genes were upregulated and nine genes were downregulated. The specific and annual operation cost of the production of PhACs was U.S. Dollar (U.S.D) 48.61 per 100 mg compared to U.S.D 164.3/100 mg of the market price, indicating that it is economically cheaper than that at the market price.

20.

Detection of Mutations in RNA Polymerase Beta Subunit Gene Encoding Resistance to Rifampin in Mycobacterium tuberculosis by DNA Microarray

《Analytical letters》2012,45(13):2117-2134

Abstract

Rapid and efficient diagnosis is essential in the management of drug‐resistant tuberculosis. A DNA microarray technique based on differential hybridization method was described in the present study for detecting mutations in the RNA polymerase beta subunit (rpoB) gene of Mycobacterium tuberculosis (M. tuberculosis) cultures and in clinical specimens. The mutations in rpoB confer resistance to rifampin, an important first‐line antituberculosis drug. The differential hybridization approach was mainly based on the effect of a single base mismatch on the melting temperature of the hybridized DNA; therefore, any point mutation of rpoB gene resulting in the rifampin resistance can be detected efficiently. The development of the DNA microarray involves the design of dozens of oligonucleotide probes for identifying rifampin‐resistant and ‐sensitive strains. The method comprises isolating genomic DNA from the samples containing M. tuberculosis cells, amplifying rpoB gene coding sequence to produce fluorescently labelled product, and hybridization with the oligonucleotide arrays. The results demonstrated the capability of DNA microarray to provide important clinically relevant information about the rpoB gene of mycobacterial organisms. The DNA microarray offers a reliable diagnostic test for rapidly detecting multidrug resistance caused by gene mutations of mycobacteria.