Similar Literature
20 similar records found (search time: 15 ms)
1.
Molecular databases obtained either by combinatorial chemistry tools or by more traditional methods are usually organized according to a set of molecular properties. A database may be regarded as a multidimensional collection of points within a space spanned by the various molecular properties of interest, the property space. Some properties are likely to be more important than others; those considered important form the essential dimensions of the molecular database. How many properties are essential depends on the molecular problem addressed; in practice, however, searches in property space are usually limited to a few dimensions. Two types of search strategy can be distinguished: search by property and search by lead compound. The first corresponds to a lattice model, where the search is based on sets of adjacent blocks, usually hypercubes in property space, whereas lead-based searches in databases can be regarded as searches around a center in property space, for which a hyperspherical model is natural. In this contribution, a theoretically optimum dimension is determined that enhances the effectiveness of lead-based searches in the property space of molecular databases.
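The dimensional trade-off described above can be illustrated with a short calculation: the fraction of a bounding hypercube occupied by an inscribed hypersphere shrinks rapidly as dimension grows, which is why a hyperspherical (lead-centered) search benefits from a fairly low working dimension. This is a generic sketch of that geometric effect, not the paper's own derivation.

```python
import math

def sphere_to_cube_volume_ratio(d: int) -> float:
    """Volume of a unit-radius d-ball divided by the volume of its
    bounding hypercube of side 2.  Illustrates why hyperspherical
    (lead-centered) searches lose effectiveness as dimension grows."""
    ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)  # volume of unit d-ball
    cube = 2.0 ** d                                    # hypercube of side 2
    return ball / cube

for d in (1, 2, 3, 6, 10):
    print(d, sphere_to_cube_volume_ratio(d))
```

For d = 10 the ratio is already below 0.3%, so almost all of a hypercube block lies outside the corresponding hypersphere.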

2.
Abstract

Due to the high rate of data production and the need of researchers to have rapid access to new data, public databases have become the major medium through which genome mapping and sequencing data as well as macromolecular structural data are published. There are now more than 250 databases of biomolecular, structural, genetic, or phenotypic data, many of which are doubling in size annually. These databases, many of which were created and are maintained by experimentalists for their own research use, provide valuable collections of organized, validated data. However, the very number and diversity of databases now make efficient data resource discovery as important as effective data resource use. Existing autonomous biological databases contain related data which are more valuable when interconnected than when isolated. Political and scientific realities dictate that these databases will be built by different teams, in different locations, for different purposes, and using different data models and supporting DBMSs. As a consequence, connecting the related data they contain is not straightforward. Experience with existing biological databases indicates that it is possible to form useful queries across these databases, but that doing so usually requires expertise in the semantic structure of each source database. Advancing to the next level of integration among biological information resources poses significant technical and sociological challenges.

3.
Two-dimensional (2-D) polyacrylamide gel electrophoresis can detect thousands of polypeptides, separating them by apparent molecular weight (Mr) and isoelectric point (pI). Thus it provides a more realistic and global view of cellular genetic expression than any other technique. This technique has been useful for finding sets of key proteins of biological significance. However, a typical experiment with more than a few gels often results in an unwieldy data management problem. In this paper, the GELLAB-II system is discussed with respect to how data reduction and exploratory data analysis can be aided by computer data management and statistical search techniques. By encoding the gel patterns in a "three-dimensional" (3-D) database, an exploratory data analysis can be carried out in an environment that might be called a "spread sheet for 2-D gel protein data". From such databases, complex parametric network models of protein expression during events such as differentiation might be constructed. For this, 2-D gel databases must be able to include data from other domains external to the gel itself. Because of the increasing complexity of such databases, new tools are required to help manage this complexity. Two such tools, object-oriented databases and expert-system rule-based analysis, are discussed in this context. Comparisons are made between GELLAB and other 2-D gel database analysis systems to illustrate some of the analysis paradigms common to these systems and where this technology may be heading.

4.
5.
This communication describes a facile but effective method to prepare graphene film electrodes with tunable dimensions, using Vaseline as the insulating binder. Cyclic voltammetry (CV) studies reveal that the as-prepared graphene film electrodes have tunable dimensions ranging from a conventional electrode to a nanoelectrode ensemble, depending on the amount of graphene dispersed into the insulating Vaseline matrix. A large amount of graphene (typically, 10.0 μg/mL) leads to the formation of film electrodes with a conventional dimension, while a small amount of graphene (typically, 1.0 μg/mL) essentially yields graphene film electrodes behaving like a nanoelectrode ensemble. As a new kind of carbon-based film electrode with tailor-made dimensions, good electrochemical activity, and high stability, the graphene film electrodes are believed to be potentially useful for fundamental electrochemical studies and for practical applications.

6.
Access to desk-top structure and reaction databases through applications such as Chemical Abstracts' SciFinder, MDL's Beilstein CrossFire, and ISIS Reaction Browser has led to changes in the information-seeking habits of research chemists, the impact of which has implications when database purchasing decisions are made. A semiquantitative assessment is proposed which takes into account key aspects of structure and reaction databases. Assessment criteria are identified which can be weighted according to an organization's information needs. Values are then assigned to criteria for each data source, after which a formula is applied which leads to an indication of the relative value of systems under consideration. The formula takes into account the cost of database products and also the incremental benefit of adding a new system to an existing collection. This work is presented as a generic approach to the evaluation of databases and is not limited in scope to structure and reaction databases.
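The weighted-assessment idea lends itself to a few lines of code. The sketch below assumes a simple form, benefit = Σ weight × score, divided by cost to give a relative value index; the criterion names, weights, scores, and costs are invented for illustration, and the paper's actual formula (e.g., its treatment of incremental benefit) may differ.

```python
def weighted_value(scores: dict, weights: dict, cost: float) -> float:
    """Hypothetical value-for-money index: weighted sum of criterion
    scores divided by cost.  All names and numbers are illustrative."""
    benefit = sum(weights[c] * scores[c] for c in weights)
    return benefit / cost

# Illustrative comparison of two hypothetical data sources.
weights = {"coverage": 0.5, "reaction_search": 0.3, "updates": 0.2}
source_a = {"coverage": 9, "reaction_search": 8, "updates": 9}
source_b = {"coverage": 7, "reaction_search": 9, "updates": 6}
print(weighted_value(source_a, weights, 3.0))
print(weighted_value(source_b, weights, 2.0))
```

The same function can be re-run with organization-specific weights, which is the point of the weighting step in the assessment.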

7.
Bioinformatics can play an important role in developing improved technology for the detection and characterization of food allergens. However, the full realization of this potential will depend on the development of allergen-specific databases as well as improved methods for data mining within these databases. Examples of existing allergen databases and analysis tools are described, as are the most important issues that need to be addressed in the next stage of database development.

8.
Relying on the National Materials Science Data Sharing Network, we have built a large polymer materials database that is now open to public access. The database mainly serves researchers and industrial enterprises, providing data-sharing services based on information about industrial polymer products. It covers the main material types in the polymer field, including plastics, rubbers, fibers, coatings, adhesives, and processing additives, and currently contains more than 50,000 records for over 7,000 commercial grades. To cope with the complexity of polymer material types, the database adopts a network-style classification with redundancy, so that materials of complex composition can be reached through multiple classification paths. Before entry into the database, data quality and reliability are ensured through evaluation at the stages of data production, collection, and integration. Stored records are also tagged with their data source, evaluation results, and modification history to support later verification and re-evaluation.

9.
With the accelerated accumulation of genomic sequence data, there is a pressing need to develop computational methods and advanced bioinformatics infrastructure for reliable and large-scale protein annotation and biological knowledge discovery. The Protein Information Resource (PIR) provides an integrated public resource of protein informatics to support genomic and proteomic research. PIR produces the Protein Sequence Database of functionally annotated protein sequences. The annotation problems are addressed by a classification-driven and rule-based method with evidence attribution, coupled with an integrated knowledge base system being developed. The approach allows sensitive identification, consistent and rich annotation, and systematic detection of annotation errors, as well as distinction of experimentally verified and computationally predicted features. The knowledge base consists of two new databases, sequence analysis tools, and graphical interfaces. PIR-NREF, a non-redundant reference database, provides a timely and comprehensive collection of all protein sequences, totaling more than 1,000,000 entries. iProClass, an integrated database of protein family, function, and structure information, provides extensive value-added features for about 830,000 proteins with rich links to over 50 molecular databases. This paper describes our approach to protein functional annotation with case studies and examines common identification errors. It also illustrates that data integration in PIR supports exploration of protein relationships and may reveal protein functional associations beyond sequence homology.

10.
Reversed databases are commonly used to estimate the reliability of tandem mass spectrometry database-search results in large-scale proteomics. However, this approach is not applicable to peptide mass fingerprinting (PMF) data, a classic data type that is still being produced. To solve this problem, we constructed an alternative randomized database, called the reversed-displaced database. It is derived from the reversed database by swapping each K or R with the amino acid that follows it (for tryptic digests). This treatment avoids the problem that some peptides retain the same mass after sequence reversal, because the residues at their flanking tryptic cleavage sites are identical, which makes them indistinguishable by PMF. Search results on tandem MS and PMF test data demonstrate that the method is applicable to both kinds of data. It thus extends the applicability of the classic reversed database and should be valuable for evaluating and integrating tandem MS and PMF data.
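A minimal sketch of the reversed-displaced construction described above, assuming the swap is applied once, left to right, over the reversed sequence (the authors' actual implementation details may differ):

```python
def reversed_displaced_decoy(seq: str) -> str:
    """Sketch: reverse the protein sequence, then swap each K or R with
    the residue that follows it, so tryptic peptide masses change even
    where plain reversal would leave them identical."""
    s = list(seq[::-1])          # step 1: classic reversed decoy
    i = 0
    while i < len(s) - 1:
        if s[i] in "KR":         # step 2: displace the cleavage residue
            s[i], s[i + 1] = s[i + 1], s[i]
            i += 2               # skip past the swapped pair
        else:
            i += 1
    return "".join(s)

print(reversed_displaced_decoy("PEPTIDEKSAMPLER"))
```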

11.
With its laboratory information management system (LIMS), nmrshiftdb2 supports the integration of electronic lab administration and management into academic NMR facilities. It also offers the setup of a local database, while full access to nmrshiftdb2's World Wide Web database is retained. For lab users, this freely available system allows the submission of orders for measurement, transfers recorded data automatically or manually, enables download of spectra via a web interface, and provides integrated access to the prediction, search, and assignment tools of the NMR database. For staff and lab administration, the flow of all orders can be supervised; administrative tools also include user and hardware management, statistics functionality for accounting purposes, and a 'QuickCheck' function for assignment control, to facilitate quality control of assignments submitted to the (local) database. Both the laboratory information management system and the database use a web interface as front end and are therefore independent of the operating system in use. Copyright © 2015 John Wiley & Sons, Ltd.

12.
May PM, Muray K. Talanta 1991, 38(12): 1419-1426
The thermodynamic database of the JESS (Joint Expert Speciation System) software package is described. It overcomes many existing problems associated with solution-chemistry databases. The system is fully interactive. Reactions can be expressed in any form. Any number of equilibrium constants, enthalpy, entropy and Gibbs-free energy values can be associated with a reaction. Supplementary data such as background electrolyte, temperature, ionic strength, method of determination and original literature reference are also stored. Data can be readily transferred between databases. Currently, the thermodynamic database that is being distributed with JESS contains over 12,000 reactions and over 20,000 equilibrium constants. These data span interactions in aqueous solution of some 100 metal ions with more than 650 ligands.
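The record structure described (a free-form reaction, any number of associated constants, plus supplementary conditions) can be sketched as a simple data class. All field names and the example values below are illustrative assumptions, not JESS's actual schema or data:

```python
from dataclasses import dataclass

@dataclass
class ReactionRecord:
    """Minimal sketch of the kind of record a JESS-style thermodynamic
    database stores per equilibrium-constant determination.  Field
    names and defaults are assumptions for illustration."""
    reaction: str                  # free-form reaction string
    log_k: float                   # equilibrium constant as log10 K
    temperature_c: float = 25.0    # measurement temperature, Celsius
    ionic_strength: float = 0.0    # mol/L
    electrolyte: str = ""          # background electrolyte
    method: str = ""               # method of determination
    reference: str = ""            # original literature reference

# Illustrative entry (not a value taken from JESS):
rec = ReactionRecord("M+2 + L- = ML+", log_k=4.2, ionic_strength=0.1,
                     electrolyte="KNO3", method="potentiometry")
print(rec)
```

Because several `ReactionRecord`s may share the same `reaction` string, "any number of constants per reaction" falls out naturally from storing one record per determination.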

13.
In the evaluation of large or complex data sets, the use of visualization methods can be of great benefit. In this paper, the use of parallel co-ordinate geometry (PCG) plots, principal component analysis (PCA) and N-way PCA as visualization procedures for large multi-response experimental designs was compared with the more traditional approach of calculating factor effects by multiple linear regression. PCG plots are a recent addition to the visualization tools and offer a way to visualize multi-dimensional data in two dimensions without requiring any calculations. It was found that PCA and PCG each have their own benefits and disadvantages. Both methods can be used to some extent to select optimal conditions. Moreover, it was possible to use the PCA score plot as a Pareto optimality plot that allowed selection of the experiments considered to be Pareto optimal. Therefore, the examined visualization methods can be useful and complementary aids in the interpretation of large multi-response experimental design data, and they add a multivariate dimension to the more classical univariate analysis of such data.
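The "no calculations required" appeal of PCG plots comes down to a simple rescaling: each response is mapped to [0, 1] so that every experiment becomes one polyline across parallel vertical axes. A minimal sketch of that data preparation (the drawing itself is left out):

```python
def scale_for_pcg(rows):
    """Rescale each column (response) of a multi-response design to
    [0, 1]; each returned row is one polyline for a PCG plot.
    Constant columns are mapped to 0.5 by convention (an assumption)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h != l else 0.5
         for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

# Three experiments, two responses; each inner list is one polyline.
print(scale_for_pcg([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]))
```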

14.
15.
The importance of databases of reliable and accurate data in chemistry has substantially increased in the past two decades. Their main usage is to parametrize electronic structure theory methods, and to assess their capabilities and accuracy for a broad set of chemical problems. The collection we present here—ACCDB—includes data from 16 different research groups, for a total of 44,931 unique reference data points, all at a level of theory significantly higher than density functional theory, and covering most of the periodic table. It is composed of five databases taken from literature (GMTKN, MGCDB84, Minnesota2015, DP284, and W4-17), two newly developed reaction energy databases (W4-17-RE and MN-RE), and a new collection of databases containing transition metals. A set of expandable software tools for its manipulation is also presented here for the first time, as well as a case study where ACCDB is used for benchmarking commercial CPUs for chemistry calculations. © 2018 Wiley Periodicals, Inc.

16.
Fermentation diagnosis by multivariate statistical analysis
During the course of fermentation, online measuring procedures able to estimate the performance of the current operation are highly desired. Unfortunately, the poor mechanistic understanding of most biologic systems hampers attempts at direct online evaluation of the bioprocess, which is further complicated by the lack of appropriate online sensors and the long lag time associated with offline assays. Quite often available data lack sufficient detail to be directly used, and after a cursory evaluation are stored away. However, these historic databases of process measurements may still retain some useful information. A multivariate statistical procedure has been applied for analyzing the measurement profiles acquired during the monitoring of several fed-batch fermentations for the production of erythromycin. Multivariate principal component analysis has been used to extract information from the multivariate historic database by projecting the process variables onto a low-dimensional space defined by the principal components. Thus, each fermentation is identified by a temporal profile in the principal component plane. The projections represent monitoring charts, consistent with the concept of statistical process control, which are useful for tracking the progress of each fermentation batch and identifying anomalous behaviors (process diagnosis and fault detection).
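In its simplest univariate form, the monitoring-chart idea reduces to control limits around reference behavior. The sketch below flags test batches whose score (e.g., a projection on one principal component) falls outside mean ± k standard deviations of the reference batches; it is a toy version of the statistical process control concept, not the paper's multivariate procedure.

```python
from statistics import mean, stdev

def outside_limits(reference, test, k=3.0):
    """Flag each test score lying more than k reference standard
    deviations from the reference mean (a univariate SPC sketch)."""
    m, s = mean(reference), stdev(reference)
    return [abs(x - m) > k * s for x in test]

# Five in-control reference batches; the second test batch is anomalous.
ref = [1.0, 1.1, 0.9, 1.05, 0.95]
print(outside_limits(ref, [1.02, 1.8]))
```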

17.
Reliable kinetic and thermodynamic data are required to model the evolution of electric discharge or electron-beam decomposition chemistry of gases in humid air streams. In this first segment of a continuing series, we provide a core database describing the initially dominant ion-neutral molecule reactions in humid air plasmas. Recommended reaction rate data and extrapolation tools are presented in a manner to facilitate prediction of reactivities and reaction channels as a function of temperature, pressure, and applied electric field.
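Rate-data compilations of this kind typically store parameters of a modified-Arrhenius form so that rate constants can be extrapolated in temperature. The sketch below assumes the common form k(T) = A (T/300)^n exp(-Ea/T) with the activation energy expressed in kelvin; the parameter values used in the example are illustrative, not data from this database.

```python
import math

def rate_constant(A: float, n: float, Ea_K: float, T: float) -> float:
    """Modified-Arrhenius extrapolation sketch:
    k(T) = A * (T/300)^n * exp(-Ea/T), Ea in kelvin."""
    return A * (T / 300.0) ** n * math.exp(-Ea_K / T)

# Hypothetical ion-molecule reaction with a weak negative T dependence.
print(rate_constant(1e-9, -0.5, 0.0, 600.0))
```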

18.
The predictive and correlative capabilities of two recent versions of the free-volume theory for self-diffusion in polymer–solvent systems are examined by comparisons with experimental data. Neither the Vrentas–Duda free-volume theory nor the Paul version generally provides satisfactory predictions for the temperature and concentration variations of solvent self-diffusion coefficients. However, the Vrentas–Duda theory does provide good correlations of solvent self-diffusion data, and, furthermore, this theory can provide good predictions if a small amount of solvent self-diffusion data is used to help estimate the parameters of the theory. New diffusivity and equilibrium data were collected for the toluene-PVAc system to provide a broader database for evaluation of the self-diffusion theories.

19.
Over the last two decades, the volumes of chemical and biological data have been constantly increasing. Converting these data sets into knowledge is both expensive and time-consuming; as a result, workflow technology, with platforms such as KNIME, was built to facilitate searching through multiple heterogeneous data sources, filtering by specific criteria, and extracting hidden information from these large data sets. Before any QSAR modeling, manual data curation is strongly recommended. This is feasible for small datasets, but for the extensive data recently accumulated in public databases a manual process is hardly feasible. In this work, we suggest using KNIME as an automated workflow solution for data curation and for the development and validation of predictive QSAR models from a huge dataset. Starting from 250250 structures from the NCI database, only 3520 compounds passed safely through our workflow with their corresponding experimental log P; this property was investigated as a case study to improve some existing log P calculation algorithms.
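Outside KNIME, typical curation steps of this kind can be sketched in a few lines: drop entries without a measured property value, reject out-of-range values, and remove duplicate structures. The field names and the validity window below are assumptions for illustration, not the authors' actual workflow nodes.

```python
def curate(records):
    """Toy curation sketch: keep only records with an in-range log P
    and a not-yet-seen structure key (first occurrence wins).
    The [-10, 10] validity window is an illustrative assumption."""
    seen, kept = set(), []
    for rec in records:
        logp = rec.get("logP")
        if logp is None or not -10.0 <= logp <= 10.0:
            continue                 # missing or implausible value
        key = rec["smiles"]          # stand-in for a canonical structure key
        if key in seen:
            continue                 # duplicate structure
        seen.add(key)
        kept.append(rec)
    return kept

raw = [
    {"smiles": "CCO", "logP": -0.31},
    {"smiles": "CCO", "logP": -0.31},   # duplicate
    {"smiles": "c1ccccc1"},             # no measured log P
    {"smiles": "CCCC", "logP": 99.0},   # implausible value
]
print(curate(raw))
```

In a real pipeline the duplicate check would use a canonicalized structure representation rather than the raw SMILES string.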

20.
MODULEWRITER is a Perl object-relational mapping (ORM) tool that automatically generates database-specific application programming interfaces (APIs) for SQL databases. The APIs consist of a package of modules providing access to each table row and column. Methods for retrieving, updating and saving entries are provided, as well as other generally useful methods (such as retrieval of the highest-numbered entry in a table). MODULEWRITER provides for the inclusion of user-written code, which can be preserved across multiple runs of the MODULEWRITER program.
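MODULEWRITER itself emits Perl, but the code-generation idea translates directly. The sketch below is a hypothetical Python analogue that generates an accessor class per table with a getter per column; it illustrates the ORM-generation pattern only and does not reproduce MODULEWRITER's actual output.

```python
def generate_module(table: str, columns: list) -> str:
    """Emit source for one accessor class per table, with one getter
    per column (a minimal code-generation sketch, not MODULEWRITER)."""
    lines = [f"class {table.capitalize()}:"]
    lines.append("    def __init__(self, row):")
    lines.append("        self._row = dict(row)")
    for col in columns:
        lines.append(f"    def get_{col}(self):")
        lines.append(f"        return self._row['{col}']")
    return "\n".join(lines)

# Generate and use the accessor class for a hypothetical 'gene' table.
src = generate_module("gene", ["id", "name"])
print(src)
```

A real generator would read the column list from the database schema and also emit update/save methods, plus markers delimiting user-written code so it survives regeneration.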


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号