Similar articles
Found 20 similar articles; search took 31 ms
1.
The Internet is a comprehensive resource of chemical information which is at the same time largely unstructured. It provides a wealth of scientific information such as experimental data and requires a suitable automated data mining and analysis tool for its meaningful exploration. The Java based software presented here, ChemXtreme, is developed for harvesting chemical information from the Internet employing the Google API in combination with a distributed client/server text analysis architecture based on JavaRMI. It represents the first and until now the only toolkit for automated structured data retrieval from the Internet which is itself open source. ChemXtreme employs the "search the search engine" strategy, where the URLs returned from the search engine are analyzed further via textual pattern analysis. This process resembles the manual analysis of the hit list, where relevant data are captured and, by means of human intervention, are mined into a format suitable for further analysis. ChemXtreme on the other hand transforms chemical information automatically into a structured format suitable for storage in databases and further analysis and also provides links to the original information source. The query data retrieved from the search engine by the server is encoded, encrypted, and compressed and then sent to all the participating active clients in the network for parsing. Relevant information identified by the clients on the retrieved Web sites is sent back to the server, verified, and added to the database for data mining and further analysis. The distributed further analysis of URLs in a client/server architecture scales very favorably, thus producing only minimal overhead.

2.
With the expansion of the Internet and World Wide Web (or the Web), research environments have changed dramatically. As a result, the need to be able to efficiently and securely access information and resources from remote computer systems is becoming even more critical. This paper describes the development of an extendable integrated Web-accessible simulation environment for computational science and engineering called Computational Science and Engineering Online (CSE-Online; http://cse-online.net). CSE-Online is based on a unique client-server software architecture that can distribute the workload between the client and server computers in such a way as to minimize the communication between the client and server, thus making the environment less sensitive to network instability. Furthermore, the new software architecture allows the user to access data and resources on one or more remote servers as well as on the computing grid while having the full capability of the Web-services collaborative environment. It can be accessed anytime and anywhere from a Web browser connected to the network by either a wired or wireless connection. It has different modes of operation to support different working environments and styles. CSE-Online is evolving into middleware that can provide a framework for accessing and managing remote data and resources including the computing grid for any domain, not necessarily just within computational science and engineering.

3.
The program package SENECA for Computer-Assisted Structure Elucidation (CASE) of organic molecules is described. SENECA is written completely in the programming language Java and divided into a server, a client, and a gatekeeper part. While the client allows for input of spectroscopic information, the server part performs the actual structure elucidation by stochastically walking through constitution space while optimizing the molecule toward agreement with given spectral properties. The convergence is guided by simulated annealing. The gatekeeper administers a list of server processes, which can be retrieved by the client. The package is completely platform-independent and its server part can be distributed over the Internet or an intranet using a heterogeneous network of almost any number and type of computers, thus allowing for parallel CASE computations on ordinary networks, present in almost any institution.
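
The stochastic walk guided by simulated annealing that this abstract mentions can be illustrated with a generic sketch. It is not SENECA's implementation: the function names, the cooling schedule, and the toy integer search space are assumptions chosen only to show the acceptance rule (always accept improvements, accept worse candidates with probability exp(-delta/T)).

```python
import math
import random


def simulated_annealing(initial, neighbor, cost, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_t=50, seed=0):
    """Generic simulated annealing; all parameter defaults are illustrative."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Metropolis criterion: downhill moves are always accepted,
            # uphill moves with a probability that shrinks as t falls.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost


# Toy stand-ins for "constitution space" and "spectral agreement":
# states are integers and the cost is the squared distance from 7.
def toy_cost(x):
    return (x - 7) ** 2


def toy_neighbor(x, rng):
    return x + rng.choice((-1, 1))
```

Calling `simulated_annealing(0, toy_neighbor, toy_cost)` returns the best state visited together with its cost; in a structure elucidation setting the state would be a candidate constitution and the cost a measure of disagreement with the measured spectra.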

4.
The aim of this work was to organize chemical data in a client-server environment using a database management system with a Web-style client interface. To solve this long-standing problem (for us) of merging text data, reaction schemes, three-dimensional structures, and NMR, CD, and UV spectra images, we have based our implementation on a few fundamental points: no cost for the user, availability of data via the Internet, standard and freeware software, and a Web browser for the database inquiry. These functions are delivered in a platform-independent manner via the Internet and are used by computational experts and nonexperts alike. C-Glycosylporphyrins are the class of compounds chosen to test our applications. These results can be exported to many other classes of chemical compounds.

5.
This work describes an Internet accessible three-dimensional particle-in-cell simulation code, which is capable of near first principles modeling of complete experimental sequences in Fourier transform ion cyclotron resonance mass spectrometers. The graphical user interface is a Java client that communicates via a socket stream connection over the Internet to the computational engine, a server that executes the simulation and sends real-time particle data back to the client for display. As a first demonstration, this code is applied to the problem of the cyclotron motion of two very close mass-to-charge ratios at high ion density. The ion populations in these simulations range from 50,000 to 350,000 coulombically interacting particles confined in a cubic trap, which are followed for 100,000 time-steps. Image charge, coherent cyclotron positions, and snapshots of the ion population are recorded at selected time-steps. At each time-step in the simulation the potential (coulomb + image + trap) is found by the direct solution of Poisson’s equation on a 64×64×64 computational grid. Cyclotron phase locking is demonstrated at high number density. Simulations at different magnetic fields confirm a B² dependence for the minimum number density required to lock cyclotron modes.
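
For orientation, the unperturbed cyclotron frequency of a single ion is f = qB/(2πm). The sketch below evaluates it for two nearby mass-to-charge ratios. It deliberately ignores the trap field, image charge, and ion-ion Coulomb interactions that the simulation described above includes, and the m/z values and field strength in the usage note are illustrative assumptions, not figures from the abstract.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # exact SI value of e
DALTON_KG = 1.66053906660e-27           # unified atomic mass unit in kg


def cyclotron_frequency_hz(mz, b_tesla):
    """Unperturbed cyclotron frequency for an ion of given m/z (Da per charge)."""
    return ELEMENTARY_CHARGE_C * b_tesla / (2.0 * math.pi * mz * DALTON_KG)


def frequency_gap_hz(mz_a, mz_b, b_tesla):
    """Spacing between the cyclotron frequencies of two m/z values."""
    return abs(cyclotron_frequency_hz(mz_a, b_tesla)
               - cyclotron_frequency_hz(mz_b, b_tesla))
```

For example, `frequency_gap_hz(1000.0, 1000.1, 7.0)` gives the frequency spacing of two hypothetical ions that differ by 0.1 in m/z at 7 T. The gap grows linearly with B, which is one reason closely spaced ions are easier to keep apart at higher field.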

6.
While many large publicly accessible databases provide excellent annotation for biological macromolecules, the same is not true for small chemical compounds. Commercial data sources also fail to encompass an annotation interface for large numbers of compounds and tend to be too costly to be widely available to biomedical researchers. Therefore, using annotation information for the selection of lead compounds from a modern-day high-throughput screening (HTS) campaign presently occurs only on a very limited scale. The recent rapid expansion of the NIH PubChem database provides an opportunity to link existing biological databases with compound catalogs and provides relevant information that potentially could improve the information garnered from large-scale screening efforts. Using the 2.5 million compound collection at the Genomics Institute of the Novartis Research Foundation (GNF) as a model, we determined that approximately 4% of the library contained compounds with potential annotation in such databases as PubChem and the World Drug Index (WDI) as well as related databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and ChemIDplus. Furthermore, the exact structure match analysis showed 32% of GNF compounds can be linked to third party databases via PubChem. We also showed annotations such as MeSH (medical subject headings) terms can be applied to in-house HTS databases in identifying signature biological inhibition profiles of interest as well as expediting the assay validation process. The automated annotation of thousands of screening hits in batch is becoming feasible and has the potential to play an essential role in the hit-to-lead decision making process.

7.
We herein present the graphical user interface (GUI) TmoleX for the quantum chemical program package TURBOMOLE. TmoleX allows users to execute the complete workflow of a quantum chemical investigation from the initial building of a structure to the visualization of the results in a user-friendly graphical front end. The purpose of TmoleX is to make TURBOMOLE easy to use and to provide a high degree of flexibility. Hence, it should be a valuable tool for most users from beginners to experts. The program is developed in Java and runs on Linux, Windows, and Mac platforms. It can be used to run calculations on local desktops as well as on remote computers.

8.
9.
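The spread of the Internet has given professionals a unified platform for obtaining data and using computational tools. This has opened new room for the development of cheminformatics and driven its rapid, network-based growth toward the goal of sharing chemistry-related data, information, and computing resources. This article reviews important advances in cheminformatics over the past 10 years from several perspectives: (1) Web chemical information retrieval: indexing targets have moved from the chemical surface Web to the chemical deep Web; retrieval tools have moved from navigation of Web chemical information resources to chemistry-specific search engines (covering text and compound identifier information) and chemical deep Web search engines (extraction of compound property data); indexing granularity has moved from Web sites to pages and even to specific content within pages, and extraction of specific content from general pages (that is, unstructured data extraction) is the direction of future development. (2) Shareable chemical databases: development has moved from databases that can be accessed and used free of charge toward databases whose content, integrated from multiple sources (collected actively by the database owner or submitted actively by multiple contributors to achieve sharing, i.e. a repository), can be downloaded and shared freely, and toward seamless linking of related content across different databases (for example, PubChem, the shared database of small drug molecules built by NIH). (3) Open source chemical software toolkits: from basic chemical structure processing modules such as CDK and JOELib to integrated development environments such as Bioclipse, an integrated environment for cheminformatics and bioinformatics. (4) Recommended standards related to compounds and the sharing of their data: including the Chemical Markup Language CML for exchanging shared data, ThermoML, the IUPAC-recommended standard for submitting thermodynamic experimental data associated with scholarly papers, and InChI, the unique identifier for compound structures. (5) Sharing of computational chemistry resources and grid-based applications: from downloading executable programs to online computation and grid-based applications. (6) eChemistry and virtual research environments: the network has also become an indispensable platform for everyday chemistry-related scientific activity. Explorations of eChemistry, which builds network-based digital infrastructure and services that support research activities, have begun to appear; integrating multi-source data and computing resources on demand to form virtual research environments that support collaborative work at different levels is the future direction for sharing data and computing resources.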

10.
Recent availability of large publicly accessible databases of chemical compounds and their biological activities (PubChem, ChEMBL) has inspired us to develop a web-based tool for structure activity relationship and quantitative structure activity relationship modeling to add to the services provided by CHARMMing ( www.charmming.org ). This new module implements some of the most recent advances in modern machine learning algorithms, including Random Forest, Support Vector Machine, Stochastic Gradient Descent, Gradient Tree Boosting, and so forth. A user can import training data from PubChem Bioassay data collections directly from our interface or upload his or her own SD files which contain structures and activity information to create new models (either categorical or numerical). A user can then track the model generation process and run models on new data to predict activity. © 2014 Wiley Periodicals, Inc.

11.
An efficient program, which runs on a personal computer, for the storage, retrieval, and processing of chemical information, is presented. The program can work either as a stand-alone application or in conjunction with a specifically written Web server application or with some standard SQL servers, e.g., Oracle, Interbase, and MS SQL. New types of data fields are introduced, e.g., arrays for spectral information storage, HTML and database links, and user-defined functions. CheD has an open architecture; thus, custom data types, controls, and services may be added. A WWW server application for chemical data retrieval features an easy and user-friendly installation on Windows NT or 95 platforms.

12.
A series of 172 molecular structures that block the hERG K+ channel were used to develop a classification model where, initially, eight types of PaDEL fingerprints were used for k-nearest neighbor model development. A consensus model constructed using Extended-CDK, PubChem and Substructure count fingerprint-based models was found to be a robust predictor of hERG activity. This consensus model demonstrated sensitivity and specificity values of 0.78 and 0.61 for the internal dataset compounds and 0.63 and 0.54 for the external (PubChem) dataset compounds, respectively. This model has identified the highest number of true positives (i.e. 140) from the PubChem dataset so far, as compared to other published models, and can potentially serve as a basis for the prediction of hERG active compounds. Validating this model against FDA-withdrawn substances indicated that it may even be useful for differentiating between mechanisms underlying QT prolongation.
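
The sensitivity and specificity figures quoted above follow from confusion-matrix counts: sensitivity = TP/(TP+FN) and specificity = TN/(TN+FP). The sketch below shows both, plus one plausible way to combine three classifiers. The abstract does not say how the consensus was formed, so the majority vote here is an assumption, and all counts and labels in the usage note are hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true actives recovered
    specificity = tn / (tn + fp)   # fraction of true inactives recovered
    return sensitivity, specificity


def majority_vote(*model_predictions):
    """Combine per-compound 0/1 predictions from an odd number of models.

    Majority voting is an assumed consensus rule, not the one reported
    for the hERG model described above.
    """
    n_models = len(model_predictions)
    return [1 if sum(votes) * 2 > n_models else 0
            for votes in zip(*model_predictions)]
```

With hypothetical counts of 78 true positives, 22 false negatives, 61 true negatives, and 39 false positives, `sensitivity_specificity(78, 22, 61, 39)` reproduces the 0.78 and 0.61 pattern; `majority_vote([1, 0, 1], [1, 1, 0], [0, 0, 1])` labels a compound active only when at least two of the three models agree.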

13.
APBS and PDB2PQR are widely utilized free software packages for biomolecular electrostatics calculations. Using the Opal toolkit, we have developed a Web services framework for these software packages that enables the use of APBS and PDB2PQR by users who do not have local access to the necessary amount of computational capabilities. This not only increases accessibility of the software to a wider range of scientists, educators, and students but also increases the availability of electrostatics calculations on portable computing platforms. Users can access this new functionality in two ways. First, an Opal-enabled version of APBS is provided in current distributions, available freely on the web. Second, we have extended the PDB2PQR web server to provide an interface for the setup, execution, and visualization of electrostatic potentials as calculated by APBS. This web interface also uses the Opal framework which ensures the scalability needed to support the large APBS user community. Both of these resources are available from the APBS/PDB2PQR website: http://www.poissonboltzmann.org/.

14.
Grid is an emerging infrastructure for distributed computing that provides secure and scalable mechanisms for discovering and accessing remote software and data resources. Applications built on this infrastructure have great potential for addressing and solving large scale chemical, pharmaceutical, and material science problems. The article describes the concept behind grid computing and presents the OpenMolGRID system, an open computing grid for molecular science and engineering. This system provides grid enabled components, such as a data warehouse for chemical data, software for building QSPR/QSAR models, and molecular engineering tools for generating compounds with predefined chemical properties or biological activities. The article also provides an overview of the availability of chemical applications in the grid.

15.
16.
Due to the versatility of present-day microcontroller boards and open source development environments, new analytical chemistry devices can now be built outside of large industry and instead within smaller individual groups. While there is a wide range of commercial devices available for detecting and identifying volatile organic compounds (VOCs), most of these devices use their own proprietary software and complex custom electronics, making modifications or reconfiguration of the systems challenging. The development of microprocessors for general use, such as the Arduino prototyping platform, now enables custom chemical analysis instrumentation. We have created an example system using commercially available parts, centered on a differential mobility spectrometer (DMS) device. The Modular Reconfigurable Gas Chromatography - Differential Mobility Spectrometry package (MR-GC-DMS) has swappable components allowing it to be quickly reconfigured for specific application purposes as well as broad, generic use. The MR-GC-DMS has a custom user-friendly graphical user interface (GUI) and precisely tuned proportional-integral-derivative controller (PID) feedback control system managing individual temperature-sensitive components. Accurate temperature control programmed into the microcontroller greatly increases repeatability and system performance. Together, this open-source platform enables researchers to quickly combine DMS devices in customized configurations for new chemical sensing applications.
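
A proportional-integral-derivative loop of the kind mentioned above can be sketched in a few lines. This is a generic textbook discrete PID written in Python for readability, not the MR-GC-DMS firmware: the class name, gains, and output limits are assumptions, and the sketch omits refinements such as integral anti-windup and derivative filtering.

```python
class PID:
    """Minimal discrete PID controller with a clamped output."""

    def __init__(self, kp, ki, kd, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self._integral = 0.0
        self._prev_error = None

    def update(self, setpoint, measured, dt):
        """Return the actuator command (e.g. heater duty cycle) for one step."""
        error = setpoint - measured
        self._integral += error * dt
        # No derivative term on the first call, when there is no history yet.
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        output = (self.kp * error
                  + self.ki * self._integral
                  + self.kd * derivative)
        # Clamp to what the actuator can physically deliver.
        return min(max(output, self.out_min), self.out_max)
```

In a temperature loop, `update` would be called at a fixed interval with the target temperature and the latest sensor reading, and the returned value would drive the heater. The gains must be tuned for each heated component, which is presumably what "precisely tuned" refers to in the abstract.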

17.
ADMET (absorption, distribution, metabolism, excretion, and toxicity)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of PD-PK-T properties using in silico tools has become very important in pharmaceutical research to reduce cost and enhance efficiency. PaDEL-DDPredictor is an in silico tool for rapid prediction of PD-PK-T properties of compounds from their chemical structures. It is free and open-source software that has both a graphical user interface and a command line interface, works on all major platforms (Windows, Linux, and MacOS), and supports more than 90 different molecular file formats. The software can be downloaded from http://padel.nus.edu.sg/software/padelddpredictor . © 2012 Wiley Periodicals, Inc.

18.
A hierarchical classification of chemical scaffolds (molecular frameworks, which are obtained by pruning all terminal side chains) has been introduced. The molecular frameworks form the leaf nodes in the hierarchy trees. By an iterative removal of rings, scaffolds forming the higher levels in the hierarchy tree are obtained. Prioritization rules ensure that less characteristic, peripheral rings are removed first. All scaffolds in the hierarchy tree are well-defined chemical entities making the classification chemically intuitive. The classification is deterministic, data-set-independent, and scales linearly with the number of compounds included in the data set. The application of the classification is demonstrated on two data sets extracted from the PubChem database, namely, pyruvate kinase binders and a collection of pesticides. The examples shown demonstrate that the classification procedure robustly handles synthetic structures and natural products.

19.
Flexibility and extensibility are important issues in the design of nuclear magnetic resonance (NMR) software, as these determine the ability to integrate a variety of continuously evolving data acquisition and processing methods. Here, SpinStudioJ is introduced. It is an NMR data acquisition and processing workbench with a plug-in-based architecture. The workbench is based on Eclipse Rich Client Platform, which provides a plug-and-play runtime mechanism and rich graphical user interface functionality. New data acquisition methods and processing algorithms can be easily integrated into the SpinStudioJ workbench by defining extension points, without the need to redistribute existing modules. The software is independent of operating systems, as it leverages the cross-platform feature of the Java virtual machine.

20.
Ligand.Info is a compilation of various publicly available databases of small molecules. The total size of the Meta-Database is over 1 million entries. The compound records contain calculated three-dimensional coordinates and sometimes information about biological activity. Some molecules have information about FDA drug approval status or about anti-HIV activity. The Meta-Database can be downloaded from the http://Ligand.Info web page. The database can also be screened using a Java-based tool. The tool can interactively cluster sets of molecules on the user side and automatically download similar molecules from the server. The application requires the Java Runtime Environment 1.4 or higher, which can be automatically downloaded from Sun Microsystems or Apple Computer and installed during the first use of Ligand.Info on desktop systems that support Java (MS Windows, Mac OS, Solaris, and Linux). The Ligand.Info Meta-Database can be used for virtual high-throughput screening of new potential drugs. The presented examples showed that, using a known antiviral drug as a query, the system was able to find other antiviral drugs and inhibitors.
