Similar articles
1.
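A Comparison of the Openness of Chemoinformatics and Bioinformatics   Cited by: 1 (self-citations: 0; citations by others: 1)
Qiao Yuanyuan (乔园园), Lu Tao (鹿涛), Che Yunxia (车云霞). Progress in Chemistry (《化学进展》), 2007, 19(4): 624-632
By comparing and analyzing chemoinformatics and bioinformatics with respect to public resources, data file formats, software programming languages, scientific literature, and textbooks, this paper argues that the success of bioinformatics stems from its openness. By contrast, although universities, research institutes, and other educational and research bodies, as well as commercial companies (particularly drug discovery companies), have released a fair amount of open-source or free chemoinformatics software and some open data, chemoinformatics still lacks openness for a variety of reasons. In response, China has established a chemistry and chemical engineering resource navigation system and launched major National Natural Science Foundation projects such as "Computational Chemistry e-Science Research and Demonstration Applications" (《计算化学e-science研究以及示范应用》). Education and teaching should also continue to be reformed so that chemoinformatics can truly meet the needs of, and advance, current drug discovery research.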

2.
3.
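The spread of the Internet has given professionals a unified platform for obtaining data and information and for using computational tools. This has opened new room for the development of chemoinformatics and has driven its rapid growth on a network basis, with the sharing of chemistry-related data, information, and computing resources as its goal. This paper reviews important advances in chemoinformatics over the past 10 years from several angles: (1) Web-based chemical information retrieval: the objects indexed have moved from the chemical surface web toward the chemical deep web; retrieval tools have moved from navigation of Web chemical information resources toward chemistry-specific search engines (covering text information and compound identifier information) and chemical deep-web search engines (extraction of compound property data); indexing granularity has moved from Web sites to pages and on to specific content within pages, and extracting data from specific content in general pages (that is, unstructured data extraction) is the direction of future development. (2) Shareable chemical databases: development has moved from databases that can be accessed and used free of charge toward databases whose content integrates data from multiple sources (including data actively collected by the database owner and data actively submitted from multiple sources for sharing, i.e., a repository) and can be downloaded and shared free of charge, and toward seamless linking of related content across different databases (for example PubChem, the shared database of small drug molecules built by the NIH). (3) Open-source chemistry software toolkits: from basic chemical structure processing modules such as CDK and JOELib toward integrated development environments such as Bioclipse, an integrated environment for chemoinformatics and bioinformatics. (4) Recommended standards related to sharing compounds and their data: these include the Chemical Markup Language (CML) for exchanging shared data, ThermoML, the IUPAC-recommended standard for submitting thermodynamic experimental data associated with scholarly papers, and InChI, a unique descriptor of compound structure. (5) Sharing of computational chemistry resources and grid-based applications: from downloading executable programs toward online computation and grid-based applications. (6) eChemistry and virtual research environments: the network has also become an indispensable platform for everyday chemistry-related scientific activity. Exploratory eChemistry efforts have begun to appear that build network-based digital infrastructure and services supporting research activities; virtual research environments that integrate multi-source data and computing resources on demand and support collaborative work at different levels are the future direction for sharing data and computing resources.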

4.
The maximum common subgraph (MCS) problem has become increasingly important in those aspects of chemoinformatics that involve the matching of 2D or 3D chemical structures. This paper provides a classification and a review of the many MCS algorithms, both exact and approximate, that have been described in the literature, and makes recommendations regarding their applicability to typical chemoinformatics tasks.

5.
In recent years, classifiers generated with kernel-based methods, such as support vector machines (SVM), Gaussian processes (GP), regularization networks (RN), and binary kernel discrimination (BKD), have been very popular in chemoinformatics data analysis. Aizerman et al. were the first to introduce the notion of employing kernel-based classifiers in the area of pattern recognition. Their original scheme, which they termed the potential function method (PFM), can basically be viewed as a kernel-based perceptron procedure and arguably subsumes the modern kernel-based algorithms. PFM can be computationally much cheaper than modern kernel-based classifiers; furthermore, PFM is far simpler conceptually and easier to implement than the SVM, GP, and RN algorithms. Unfortunately, unlike, e.g., SVM, GP, and RN, PFM is not endowed with both theoretical guarantees and practical strategies to safeguard it against generating overfitting classifiers. This is, in our opinion, the reason why this simple and elegant method has not been taken up in chemoinformatics. In this paper we empirically address this drawback: while maintaining its simplicity, we demonstrate that PFM combined with a simple regularization scheme may yield binary classifiers that can be, in practice, as efficient as classifiers obtained by employing state-of-the-art kernel-based methods. Using a realistic classification example, the augmented PFM was used to generate binary classifiers. Using a large chemical data set, the generalization ability of PFM classifiers was then compared with the prediction power of Laplacian-modified naive Bayesian (LmNB), Winnow (WN), and SVM classifiers.
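
The abstract describes PFM as essentially a kernel-based perceptron. The sketch below illustrates a plain kernel perceptron on toy data; it is not the authors' implementation and does not include their regularization scheme. The Gaussian kernel, the `gamma` value, the function names, and the XOR-style data set are all assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def decision_score(X, y, alpha, x_new, kernel=rbf_kernel):
    """Weighted sum of kernel values between x_new and the training points."""
    return sum(a * y_j * kernel(x_j, x_new) for a, y_j, x_j in zip(alpha, y, X))


def train_kernel_perceptron(X, y, kernel=rbf_kernel, epochs=20):
    """Return per-example mistake counts (alpha) for labels y in {-1, +1}."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, x_i in enumerate(X):
            # A point is misclassified when label and score disagree in sign.
            if y[i] * decision_score(X, y, alpha, x_i, kernel) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alpha


def predict(X, y, alpha, x_new, kernel=rbf_kernel):
    """Classify x_new as +1 or -1."""
    return 1 if decision_score(X, y, alpha, x_new, kernel) > 0 else -1


# Hypothetical toy data: an XOR-like problem that no linear classifier separates.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [1, 1, -1, -1]
alpha = train_kernel_perceptron(X, y)
predictions = [predict(X, y, alpha, x) for x in X]
```

Because nothing here limits how closely the classifier fits the training set, a sketch like this is exposed to the overfitting problem the abstract raises; a regularization step would have to be added on top.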

6.
This article presents an open‐source object‐oriented C++ library of classes and routines to perform tensor algebra. The primary purpose of the library is to enable post‐Hartree–Fock electronic structure methods; however, the code is general enough to be applicable in other areas of physical and computational sciences. The library supports tensors of arbitrary order (dimensionality), size, and symmetry. Implemented data structures and algorithms operate on large tensors by splitting them into smaller blocks, storing them both in core memory and in files on disk, and applying divide‐and‐conquer‐type parallel algorithms to perform tensor algebra. The library offers a set of general tensor symmetry algorithms and a full implementation of tensor symmetries typically found in electronic structure theory: permutational, spin, and molecular point group symmetry. The Q‐Chem electronic structure software uses this library to drive coupled‐cluster, equation‐of‐motion, and algebraic‐diagrammatic construction methods. © 2013 Wiley Periodicals, Inc.
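
The idea of operating on large tensors block by block can be illustrated with the simplest case, an order-2 contraction (matrix multiplication). The sketch below is written in Python rather than the library's C++ and does not reproduce the library's API, its disk storage, its symmetry handling, or its parallelism; the function name and the `block` parameter are hypothetical.

```python
def blocked_matmul(A, B, block=2):
    """Multiply matrices A (n x k) and B (k x m), given as lists of rows,
    by accumulating the contributions of block-sized tiles."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):          # tile rows of the result
        for j0 in range(0, m, block):      # tile columns of the result
            for p0 in range(0, k, block):  # tiles along the contracted index
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, m)):
                        acc = 0.0
                        for p in range(p0, min(p0 + block, k)):
                            acc += A[i][p] * B[p][j]
                        C[i][j] += acc
    return C


# Hypothetical example: a 2 x 3 matrix times a 3 x 2 matrix.
product = blocked_matmul([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]])
```

Each tile-level update touches only a small part of `A`, `B`, and `C`, which is what makes it possible in principle to keep some blocks on disk or to hand independent tiles to different workers.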

7.
The encoding and searching of generic chemical structures, so-called Markush structures, have received little attention in the literature of late. The ability to encode and search these complex entities is of use in various branches of chemoinformatics. We describe a general language for encoding Markush structures and algorithms for searching them and give three examples of the utility of such a system: development of general Free-Wilson analyses of chemical series, detection of controlled substances within a large database of molecular structures, and searching of large databases of virtual compounds.

8.
There is no particular point in time that determines when chemoinformatics was founded or established. It slowly evolved from several, often quite humble beginnings. Scientists in various fields of chemistry struggled with the development of computer methods which allowed them to manage the enormous amount of chemical information and to find relationships between the structure and properties of a compound. During the 1960s some early developments appeared that led to a flurry of activities in the 1970s. This review provides a general overview of basic methods in the specific fields of chemoinformatics, from encoding chemical compounds, storing and searching data in databases, to generating and analyzing these data. In addition, the chief interconnecting points of chemoinformatics applications are highlighted including the contributions of Johann Gasteiger to this field.

9.
The similarity of drug targets is typically measured using sequence or structural information. Here, we consider chemo-centric approaches that measure target similarity on the basis of their ligands, asking how chemoinformatics similarities differ from those derived bioinformatically, how stable the ligand networks are to changes in chemoinformatics metrics, and which network is the most reliable for prediction of pharmacology. We calculated the similarities between hundreds of drug targets and their ligands and mapped the relationship between them in a formal network. Bioinformatics networks were based on the BLAST similarity between sequences, while chemoinformatics networks were based on the ligand-set similarities calculated with either the Similarity Ensemble Approach (SEA) or a method derived from Bayesian statistics. By multiple criteria, bioinformatics and chemoinformatics networks differed substantially, and only occasionally did a high sequence similarity correspond to a high ligand-set similarity. In contrast, the chemoinformatics networks were stable to the method used to calculate the ligand-set similarities and to the chemical representation of the ligands. Also, the chemoinformatics networks were more natural and more organized, by network theory, than their bioinformatics counterparts: ligand-based networks were found to be small-world and broad-scale.
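
To make the notion of a ligand-set similarity concrete, the sketch below scores two targets by summing the pairwise Tanimoto coefficients of their ligands that reach a threshold. It is a simplified illustration only: it is neither SEA nor the Bayesian method named in the abstract, and it omits any statistical normalization of the raw score. The fingerprints (sets of on-bit indices), the threshold, and the function names are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def ligand_set_similarity(ligands_a, ligands_b, threshold=0.5):
    """Raw set-versus-set score: sum of pairwise Tanimoto values >= threshold."""
    total = 0.0
    for fp_a in ligands_a:
        for fp_b in ligands_b:
            score = tanimoto(fp_a, fp_b)
            if score >= threshold:
                total += score
    return total


# Hypothetical ligand sets for two targets, each ligand a toy fingerprint.
target_a = [{1, 2, 3, 4}, {2, 3, 5}]
target_b = [{1, 2, 3, 4}, {7, 8}]
raw_score = ligand_set_similarity(target_a, target_b, threshold=0.3)
```

A raw sum like this grows with the size of the ligand sets, so any real comparison across targets would need some correction for set size.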

10.
11.
Chemists have to a large extent gained their knowledge by doing experiments and thus gathering data. By putting various data together and then analyzing them, chemists have fostered their understanding of chemistry. Since the 1960s, computer methods have been developed to perform this process from data to information to knowledge. Simultaneously, methods were developed for assisting chemists in solving their fundamental questions such as the prediction of chemical, physical, or biological properties, the design of organic syntheses, and the elucidation of the structure of molecules. This eventually led to a discipline of its own: chemoinformatics. Chemoinformatics has found important applications in the fields of drug discovery, analytical chemistry, organic chemistry, agrichemical research, food science, regulatory science, material science, and process control. From its inception, chemoinformatics has utilized methods from artificial intelligence, an approach that has recently gained more momentum.

12.
The field of chemoinformatics has developed from different roots, starting in the 1960s. These branches have now merged into a scientific discipline of its own, exchanging ideas and methods across different areas of chemistry. In the last 40 years chemoinformatics has achieved a great deal. Without access to the chemistry databases developed with chemoinformatics methods, modern chemical research would not be able to work at its present high level of competence. However, there are quite a few challenges, such as drug design and understanding the effect of chemicals on human health and on the environment, as well as furthering our knowledge of chemistry and of biological systems, that can benefit from a more intensive use of chemoinformatics methods. Approaches to meet these challenges will be briefly outlined. All this emphasizes that chemoinformatics has matured into a scientific discipline of its own that reaches out to many other chemical fields and will increase in attractiveness to students and researchers.

13.
The rapid development of computational methods and the increasing volume of chemical and biological data have contributed to an immense growth in chemical research. This field of study, known as “chemoinformatics,” uses machine-learning techniques to extract, process, and extrapolate data from chemical structures. One of the significant lines of research in chemoinformatics is the study of blood–brain barrier (BBB) permeability, which aims to identify drug penetration into the central nervous system (CNS). In this research, we attempt to solve the problem of BBB permeability by predicting compounds' penetration into the CNS. To accomplish this goal: (i) first, we provide an overview of the field of chemoinformatics, its definition, applications, and challenges; (ii) second, we take a broad view of previous machine-learning and deep-learning computational models for BBB permeability, identify three main challenges that collectively affect classifier performance, which we define as “the triple constraints,” and map each constraint to a proposed solution; (iii) finally, we propose a deep-learning-based recurrent neural network model to predict BBB permeability (the RNN-BBB model). Our model outperformed other studies from the literature, with an overall accuracy of 96.53% and a specificity of 98.08%. The obtained results confirm that addressing the triple constraints substantially improves the capability of the classification model, particularly when predicting compounds with low penetration.
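
The two figures reported above are standard confusion-matrix metrics. The sketch below shows how accuracy and specificity are computed from the four counts of a binary classifier. The counts used are hypothetical and are not taken from the study, and which class the authors treat as "negative" is not stated in the abstract.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that are predicted negative."""
    return tn / (tn + fp)


def sensitivity(tp, fn):
    """Fraction of actual positives that are predicted positive."""
    return tp / (tp + fn)


# Hypothetical counts for illustration only (not the study's data).
tp, tn, fp, fn = 40, 50, 2, 8
overall_accuracy = accuracy(tp, tn, fp, fn)   # (40 + 50) / 100
negative_recall = specificity(tn, fp)         # 50 / 52
positive_recall = sensitivity(tp, fn)         # 40 / 48
```

Reporting specificity alongside accuracy matters when the classes are imbalanced, since a high overall accuracy can hide poor performance on the minority class.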

14.
Mahon P, Dupree P. Electrophoresis, 2001, 22(10): 2075-2085
Quantitative two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) is used to determine changes in individual protein levels in complex protein mixtures. To provide reliable data, the software used for 2-D gel image analysis must provide a linear response over a wide dynamic range of data output. Here, we show that Phoretix 2D Full analysis of 2-D gels stained with colloidal Coomassie Brilliant Blue G-250 can provide a linear measure of changes in protein quantity. Using a complex mixture of Arabidopsis thaliana proteins, we show that this is true for essentially all focused proteins, in a data output range greater than three orders of magnitude. An analysis of the factors that affect errors in the results demonstrated that reproducibility of the data is significantly improved by user seeding, whereas it is reduced by use of the background subtraction algorithms.

15.
This paper is focused on modern approaches to machine learning, most of which are as yet used infrequently or not at all in chemoinformatics. Machine learning methods are characterized in terms of the "modes of statistical inference" and "modeling levels" nomenclature and by considering different facets of the modeling with respect to input/output matching, data types, models duality, and models inference. Particular attention is paid to new approaches and concepts that may provide efficient solutions of common problems in chemoinformatics: improvement of predictive performance of structure-property (activity) models, generation of structures possessing desirable properties, model applicability domain, modeling of properties with functional endpoints (e.g., phase diagrams and dose-response curves), and accounting for multiple molecular species (e.g., conformers or tautomers).

16.
17.
The open source paradigm is becoming widely accepted in scientific communities, and open source hardware is finding its steady place in chemistry research. In this review article, we provide the reader with the most up‐to‐date information on open source hardware and software resources enabling the construction and utilization of an “open source capillary electrophoresis instrument”. While CE is still underused as a separation technique, it offers unique flexibility, low cost, and high efficiency and is particularly suitable for open source instrumental development. We overview the major parts of CE instruments, such as high voltage power supplies, detectors, data acquisition systems, and CE software resources, with emphasis on the availability of open source information on the web and in the scientific literature. This review is the first of its kind, revealing accessible blueprints of most parts from which a fully functional open source CE system can be built. By collecting the extensive information on open source capillary electrophoresis in this review article, the authors aim at facilitating the dissemination of knowledge on CE within and outside the scientific community, fostering innovation, and inspiring other researchers to improve the shared CE blueprints.

18.
Laser-induced breakdown spectroscopy has been applied to layer-by-layer microanalysis of pigment materials from the different sections of Hubert Robert’s (1733–1808) painting “Landscape of a Pool with an Obelisk and Ruins of an Aqueduct”. This painting consists of two sections and therefore requires thorough examination of the pigments from both sections in order to establish their authenticity. The data obtained on the elemental composition of the paint layers, including the ground layer, together with art examination have formed the basis for the identification, attribution, and restoration of both investigated sections of the painting.

19.
20.