Similar literature
 20 similar articles found (search time: 31 ms)
1.
2.
The NCI Developmental Therapeutics Program Human Tumor cell line data set is a publicly available database that contains cellular assay screening data for over 40 000 compounds tested in 60 human tumor cell lines. The database also contains microarray assay gene expression data for the cell lines, and so it provides an excellent information resource particularly for testing data mining methods that bridge chemical, biological, and genomic information. In this paper we describe a formal knowledge discovery approach to characterizing and data mining this set and report the results of some of our initial experiments in mining the set from a chemoinformatics perspective.

3.
The similarity of drug targets is typically measured using sequence or structural information. Here, we consider chemo-centric approaches that measure target similarity on the basis of their ligands, asking how chemoinformatics similarities differ from those derived bioinformatically, how stable the ligand networks are to changes in chemoinformatics metrics, and which network is the most reliable for prediction of pharmacology. We calculated the similarities between hundreds of drug targets and their ligands and mapped the relationship between them in a formal network. Bioinformatics networks were based on the BLAST similarity between sequences, while chemoinformatics networks were based on the ligand-set similarities calculated with either the Similarity Ensemble Approach (SEA) or a method derived from Bayesian statistics. By multiple criteria, bioinformatics and chemoinformatics networks differed substantially, and only occasionally did a high sequence similarity correspond to a high ligand-set similarity. In contrast, the chemoinformatics networks were stable to the method used to calculate the ligand-set similarities and to the chemical representation of the ligands. Also, the chemoinformatics networks were more natural and more organized, by network theory, than their bioinformatics counterparts: ligand-based networks were found to be small-world and broad-scale.

4.
The field of chemoinformatics has developed from different roots, starting in the 1960s. These branches have now merged into a scientific discipline of its own, exchanging ideas and methods across different areas of chemistry. In the last 40 years chemoinformatics has achieved a lot. Without access to the databases in chemistry developed with chemoinformatics methods, modern chemical research would not be able to work at its present high level of competence. However, there are quite a few challenges, such as drug design and understanding the effect of chemicals on human health and on the environment, as well as furthering our knowledge of chemistry and of biological systems, that can benefit from a more intensive use of chemoinformatics methods. Approaches to meet these challenges will be briefly outlined. All this emphasizes that chemoinformatics has matured into a scientific discipline of its own that reaches out to many other chemical fields and will increase in attractiveness to students and researchers.

5.
At a time when the demand for people with expertise in chemoinformatics is increasing, there is still only a very small number of academic institutions that offer chemoinformatics-related classes and degrees. The distance education (DE) approach allows both learning and research to be carried out at multiple geographic locations and institutions, thus leveraging the few educational offerings that are available. In this paper, distance education techniques and technologies (with emphasis on videoconferencing) are reviewed, and examples of how they are used to increase the accessibility of chemoinformatics education and research at the Indiana University School of Informatics are presented.

6.
There is no particular point in time that determines when chemoinformatics was founded or established. It slowly evolved from several, often quite humble beginnings. Scientists in various fields of chemistry struggled with the development of computer methods which allowed them to manage the enormous amount of chemical information and to find relationships between the structure and properties of a compound. During the 1960s some early developments appeared that led to a flurry of activities in the 1970s. This review provides a general overview of basic methods in the specific fields of chemoinformatics, from encoding chemical compounds, storing and searching data in databases, to generating and analyzing these data. In addition, the chief interconnecting points of chemoinformatics applications are highlighted including the contributions of Johann Gasteiger to this field.

7.
8.
The rapid development of computational methods and the increasing volume of chemical and biological data have contributed to an immense growth in chemical research. This field of study is known as “chemoinformatics,” a discipline that uses machine-learning techniques to extract, process, and extrapolate data from chemical structures. One of the significant lines of research in chemoinformatics is the study of blood–brain barrier (BBB) permeability, which aims to identify drug penetration into the central nervous system (CNS). In this research, we attempt to solve the problem of BBB permeability by predicting compounds' penetration into the CNS. To accomplish this goal: (i) first, an overview of the field of chemoinformatics, its definition, applications, and challenges is provided; (ii) second, a broad view is taken to investigate previous machine-learning and deep-learning computational models for BBB permeability. Based on the analysis of previous models, three main challenges that collectively affect classifier performance are identified, which we define as “the triple constraints”; subsequently, we map each constraint to a proposed solution; (iii) finally, we conclude this endeavor by proposing a deep-learning-based recurrent neural network model to predict BBB permeability (the RNN-BBB model). Our model outperformed other studies from the literature, scoring an overall accuracy of 96.53% and a specificity of 98.08%. The obtained results confirm that addressing the triple constraints substantially improves the capability of the classification model, specifically when predicting compounds with low penetration.
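
The two figures quoted in this abstract, overall accuracy and specificity, are standard confusion-matrix metrics. The sketch below shows how they are usually computed for a binary classifier. It assumes a label encoding of 1 for BBB-permeable and 0 for low penetration, which the abstract does not state, and all function names are illustrative rather than taken from the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn), assuming 1 = BBB-permeable and 0 = low penetration."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all compounds that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives (low-penetration compounds) that were recognised."""
    return tn / (tn + fp)
```

Under this encoding, specificity measures how well the low-penetration class is recognised, which is the case the abstract singles out.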

9.
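A Comparison of Openness in Chemoinformatics and Bioinformatics   Total citations: 1, self-citations: 0, citations by others: 1
乔园园, 鹿涛, 车云霞. 《化学进展》 (Progress in Chemistry), 2007, 19(4): 624-632
By comparing and analyzing chemoinformatics and bioinformatics with respect to public resources, data file formats, software programming languages, scientific literature, and textbooks, we show that the success of bioinformatics stems from its openness. By contrast, although universities, research institutes, and commercial companies (especially drug discovery companies) have released a considerable amount of open-source or free chemoinformatics software and some open data, chemoinformatics still lacks openness for a variety of reasons. In response, a navigation system for chemistry and chemical engineering resources has been built in China, and major projects of the National Natural Science Foundation of China, such as "Computational Chemistry e-Science Research and Demonstration Applications", have been launched. Education and teaching should also continue to be reformed so that chemoinformatics can truly meet the needs of, and advance, current drug discovery research.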

10.
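The spread of the Internet has given professionals a unified platform for obtaining data and information and for using computational tools. This has opened new room for the development of chemoinformatics and driven its rapid growth on a network basis, with the sharing of chemistry-related data, information, and computing resources as the goal. This paper reviews important advances in chemoinformatics over the past 10 years from several angles: (1) Web-based chemical information retrieval: the objects indexed have expanded from the chemical surface web to the chemical deep web; retrieval tools have evolved from navigation of Web chemical information resources to chemistry-specific search engines (covering text and compound identifier information) and chemical deep-web search engines (extraction of compound property data); indexing granularity has moved from Web sites to pages and even to specific content within pages, and extracting data from specific content in general pages (that is, unstructured data extraction) is the direction of future development. (2) Shareable chemical databases: these have developed from databases that can be accessed and used free of charge toward databases whose content integrates data from multiple sources (repositories, in which sharing is achieved through active collection by the database owner and active submission from multiple sources), can be downloaded and shared freely, and is seamlessly linked to related content in other databases (for example PubChem, the shared database of small drug molecules built by the NIH). (3) Open source chemistry software toolkits: from basic chemical structure processing modules such as CDK and JOELib toward integrated development environments such as Bioclipse, an integrated environment for chemoinformatics and bioinformatics. (4) Recommended standards related to compounds and the sharing of their data: these include the Chemical Markup Language (CML) for exchanging shared data, ThermoML, the IUPAC-recommended standard for submitting thermodynamic experimental data associated with academic papers, and InChI, a unique identifier for compound structures. (5) Sharing of computational chemistry resources and grid-based applications: from downloading executable programs toward online computation and grid-based applications. (6) eChemistry and virtual research environments: the network has also become an indispensable platform for everyday chemistry-related scientific work. Exploratory efforts in eChemistry, which builds network-based digital infrastructure and services that support research activities, have begun to appear; virtual research environments that integrate multi-source data and computing resources on demand and support collaborative work at different levels are the future direction for sharing data and computing resources.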

11.
The Blue Obelisk: interoperability in chemical informatics   Total citations: 1, self-citations: 0, citations by others: 1
The Blue Obelisk Movement (http://www.blueobelisk.org/) is the name used by a diverse Internet group promoting reusable chemistry via open source software development, consistent and complementary chemoinformatics research, open data, and open standards. We outline recent examples of cooperation in the Blue Obelisk group: a shared dictionary of chemoinformatics algorithms and implementations drawing from our various software projects; a shared repository of chemoinformatics data including elemental properties, atomic radii, isotopes, atom typing rules, and so forth; and Web services for the platform-independent use of chemoinformatics programs.

12.
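Research Progress in Chemoinformatics   Total citations: 2, self-citations: 0, citations by others: 2
Chemoinformatics, an interdisciplinary field between informatics and traditional chemistry, is a current research hotspot in chemistry. This paper discusses its background, theoretical foundations, and research tasks; proposes three levels (an interface structure) that the discipline should have overall; explains in detail how chemoinformatics works; and examines the relationships between each level and the information-acquisition interfaces. Finally, drawing on our own work, we predict future directions for chemoinformatics research.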

13.
Voltage-gated ion channels are a diverse family of pharmaceutically important membrane proteins for which limited 3D information is available. A number of virtual screening tools have been used to assist with the discovery of new leads and with the analysis of screening results. One such tool, and the subject of this paper, is binary kernel discrimination (BKD), a machine-learning approach that has recently been applied in chemoinformatics. It uses a training set of compounds, for which both structural and qualitative activity data are known, to produce a model that can then be used to rank another set of compounds in order of likely activity. Here, we report the use of BKD to build models for the prediction of five different ion channel targets using two types of activity data. The results obtained suggest that the approach provides an effective way of prioritizing compounds for acquisition and testing.

14.
15.
Chemists have to a large extent gained their knowledge by doing experiments and thereby gathering data. By putting various data together and then analyzing them, chemists have fostered their understanding of chemistry. Since the 1960s, computer methods have been developed to perform this process from data to information to knowledge. Simultaneously, methods were developed for assisting chemists in solving their fundamental questions such as the prediction of chemical, physical, or biological properties, the design of organic syntheses, and the elucidation of the structure of molecules. This eventually led to a discipline of its own: chemoinformatics. Chemoinformatics has found important applications in the fields of drug discovery, analytical chemistry, organic chemistry, agrichemical research, food science, regulatory science, material science, and process control. From its inception, chemoinformatics has utilized methods from artificial intelligence, an approach that has recently gained more momentum.

16.
Finding the most stable tautomer or a set of low-energy tautomers of molecules is critical in many aspects of molecular modelling or virtual screening experiments. Enumeration of low-energy tautomers of neutral molecules in the gas phase or typical solvents can be performed by applying available organic chemistry knowledge. This kind of enumeration is implemented in a number of software packages and it is relatively reliable. However, in esoteric cases such as charged molecules in uncommon, non-aqueous solvents there is simply not enough available knowledge to make reliable predictions of low-energy tautomers. Over the last few years we have been developing an approach to address the latter problem and we successfully applied it to discover the most stable anionic tautomers of nucleic acid bases that might be involved in the process of DNA damage by low-energy electrons and in charge transfer through DNA. The approach involves three steps: (1) combinatorial generation of a library of tautomers, (2) energy-based screening of the library using electronic structure methods, and (3) analysis of the information generated in step (2). In steps 1–3 we employ combinatorial, computational and chemoinformatics techniques, respectively. Therefore, this hybrid approach is named “Combinatorial*Computational*Chemoinformatics”, abbreviated as the C3 (or C-cube) approach. This article summarizes our developments and the most interesting methodological aspects of the C3 approach. It can serve as an example of how to identify the most stable tautomers of molecular systems for which common chemical knowledge had not been sufficient to make definite predictions.
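
The three steps named in this abstract (generate, screen, analyse) can be pictured with a toy pipeline. This is only a sketch under strong simplifying assumptions, not the authors' implementation: a tautomer is reduced to the set of sites carrying a mobile hydrogen, each site holds at most one hydrogen, and `energy_fn` is a hypothetical stand-in for the electronic structure calculation used in step (2).

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Tautomer = FrozenSet[str]  # the set of sites that carry a mobile hydrogen


def generate_library(sites: Iterable[str], n_hydrogens: int) -> List[Tautomer]:
    """Step 1: every way of placing n_hydrogens on distinct candidate sites."""
    return [frozenset(combo) for combo in combinations(sorted(sites), n_hydrogens)]


def screen(library: Iterable[Tautomer],
           energy_fn: Callable[[Tautomer], float]) -> Dict[Tautomer, float]:
    """Step 2: attach an energy to each tautomer (energy_fn is a placeholder)."""
    return {tautomer: energy_fn(tautomer) for tautomer in library}


def low_energy(energies: Dict[Tautomer, float],
               window: float) -> List[Tuple[Tautomer, float]]:
    """Step 3: rank by energy and keep tautomers within `window` of the minimum.

    Returns (tautomer, relative energy) pairs, lowest energy first.
    """
    e_min = min(energies.values())
    ranked = sorted(energies.items(), key=lambda item: item[1])
    return [(t, e - e_min) for t, e in ranked if e - e_min <= window]
```

In practice the library grows combinatorially with the number of sites, which is why a cheap screening step before any detailed analysis matters.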

17.
18.
Activity data for small molecules are invaluable in chemoinformatics. Various bioactivity databases exist containing detailed information of target proteins and quantitative binding data for small molecules extracted from journals and patents. In the current work, we have merged several public and commercial bioactivity databases into one bioactivity metabase. The molecular presentation, target information, and activity data of the vendor databases were standardized. The main motivation of the work was to create a single relational database which allows fast and simple data retrieval by in-house scientists. Second, we wanted to know the amount of overlap between databases by commercial and public vendors to see whether the former contain data complementing the latter. Third, we quantified the degree of inconsistency between data sources by comparing data points derived from the same scientific article cited by more than one vendor. We found that each data source contains unique data because the vendors cite different scientific articles. When comparing data derived from the same article we found that inconsistencies between the vendors are common. In conclusion, using databases of different vendors is still useful since the data overlap is not complete. It should be noted that this can be partially explained by the inconsistencies and errors in the source data.
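
The overlap and inconsistency checks described in this abstract amount to grouping data points by the measurement they report and comparing values across vendors. The sketch below illustrates that idea only. The record fields, the grouping key (article, compound, target), and the tolerance are assumptions made for illustration, not the schema or criteria used in the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Record(NamedTuple):
    source: str    # vendor or database name
    article: str   # identifier of the cited article
    compound: str  # standardized compound identifier
    target: str    # standardized target identifier
    value: float   # activity value on a common scale, e.g. pKi


Key = Tuple[str, str, str]


def group_by_measurement(records: Iterable[Record]) -> Dict[Key, Dict[str, float]]:
    """Map each (article, compound, target) key to the value reported per source."""
    grouped: Dict[Key, Dict[str, float]] = defaultdict(dict)
    for r in records:
        grouped[(r.article, r.compound, r.target)][r.source] = r.value
    return grouped


def shared_and_inconsistent(records: Iterable[Record],
                            tol: float = 0.01) -> Tuple[List[Key], List[Key]]:
    """Return keys reported by more than one source, and the subset whose values disagree."""
    grouped = group_by_measurement(records)
    shared = [key for key, by_source in grouped.items() if len(by_source) > 1]
    inconsistent = [
        key for key in shared
        if max(grouped[key].values()) - min(grouped[key].values()) > tol
    ]
    return shared, inconsistent
```

Keys that appear in only one source correspond to the unique data the abstract mentions; keys in `inconsistent` are the cases where vendors disagree about the same article.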

19.
Chemoinformatics: a new field with a long tradition   Total citations: 2, self-citations: 0, citations by others: 2
Chemoinformatics is the application of informatics methods to solve chemical problems. Although this term was introduced only a few years ago, this field has a long history with its roots going back more than 40 years. Work on chemical structure representation and searching, quantitative structure–activity relationships, chemometrics, molecular modeling as well as computer-assisted structure elucidation and synthesis design was initiated in the 1960s. These different origins have now merged into a discipline of its own that is in full bloom. All areas of chemistry from analytical chemistry to drug design can benefit from chemoinformatics methods. And there are still many challenging chemical problems waiting for solutions through the further development of chemoinformatics.

20.
In recent years classifiers generated with kernel-based methods, such as support vector machines (SVM), Gaussian processes (GP), regularization networks (RN), and binary kernel discrimination (BKD) have been very popular in chemoinformatics data analysis. Aizerman et al. were the first to introduce the notion of employing kernel-based classifiers in the area of pattern recognition. Their original scheme, which they termed the potential function method (PFM), can basically be viewed as a kernel-based perceptron procedure and arguably subsumes the modern kernel-based algorithms. PFM can be computationally much cheaper than modern kernel-based classifiers; furthermore, PFM is far simpler conceptually and easier to implement than the SVM, GP, and RN algorithms. Unfortunately, unlike, e.g., SVM, GP, and RN, PFM is not endowed with both theoretical guarantees and practical strategies to safeguard it against generating overfitting classifiers. This is, in our opinion, the reason why this simple and elegant method has not been taken up in chemoinformatics. In this paper we empirically address this drawback: while maintaining its simplicity, we demonstrate that PFM combined with a simple regularization scheme may yield binary classifiers that can be, in practice, as efficient as classifiers obtained by employing state-of-the-art kernel-based methods. Using a realistic classification example, the augmented PFM was used to generate binary classifiers. Using a large chemical data set, the generalization ability of PFM classifiers was then compared with the prediction power of Laplacian-modified naive Bayesian (LmNB), Winnow (WN), and SVM classifiers.
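
The abstract describes PFM as essentially a kernel-based perceptron. A minimal textbook kernel perceptron is sketched below to make that description concrete. It is not the authors' augmented PFM: the regularization scheme they add is not specified in the abstract and is not reproduced here, and the Gaussian kernel and its `gamma` parameter are illustrative choices.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def gaussian_kernel(x: Vector, z: Vector, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel; gamma is an illustrative width parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_value(x: Vector, X: Sequence[Vector], y: Sequence[int],
                   alpha: Sequence[float], kernel: Kernel) -> float:
    """Weighted sum of kernel values between x and the training points."""
    return sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alpha, X, y))


def train_kernel_perceptron(X: Sequence[Vector], y: Sequence[int],
                            kernel: Kernel, epochs: int = 10) -> List[float]:
    """Plain kernel perceptron: bump alpha[i] each time point i is misclassified.

    Labels in y must be +1 or -1. Training stops early after a mistake-free pass.
    """
    alpha = [0.0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision_value(xi, X, y, alpha, kernel) <= 0:
                alpha[i] += 1.0
                mistakes += 1
        if mistakes == 0:
            break
    return alpha


def predict(x: Vector, X: Sequence[Vector], y: Sequence[int],
            alpha: Sequence[float], kernel: Kernel) -> int:
    """Classify x as +1 or -1 from the sign of the decision value."""
    return 1 if decision_value(x, X, y, alpha, kernel) > 0 else -1
```

Without any regularization, training simply continues until the training set is separated or the epoch limit is reached, which is the overfitting risk the abstract points to.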
