Similar Literature
20 similar documents found (search time: 375 ms)
1.
A major focus of current efforts in genomics is to elucidate the extent of genetic variation within the human population and to study the effects of these variations on the human system. The most common type of genetic variation is the single nucleotide polymorphism (SNP), which occurs every 500-1000 nt in the genome. Large-scale population association studies of the biological or medical significance of such variations may require the analysis of hundreds of thousands of SNPs in thousands of individuals. We are pursuing the development of an approach to large-scale SNP analysis that combines the specificity of invasive cleavage reactions with the parallelism of high-density DNA arrays. A surface-immobilized probe oligonucleotide is specifically cleaved in the presence of a complementary target sequence in unamplified human genomic DNA, yielding a 5' phosphate group. High-sensitivity detection of this reaction product on the surface is achieved by rolling circle amplification, with an approximate concentration detection limit of 10 fM target DNA. This combination of very specific surface cleavage and highly sensitive surface detection will make possible the rapid and parallel analysis of genetic variations across large populations.

2.
Interleukin 33 (IL-33) is the latest member of the IL-1 cytokine family and performs both pro- and anti-inflammatory functions. Numerous single nucleotide polymorphisms (SNPs) in the IL-33 gene have been recognized as associated with a wide variety of inflammatory disorders. SNP association studies have become a crucial approach to uncovering the genetic background of human diseases. However, distinguishing the functional SNPs in a disease-related gene from a pool of both functional and neutral SNPs is a major challenge and requires experimental testing of hundreds or thousands of SNPs in candidate genes. This study aimed to identify possible deleterious SNPs in the IL-33 gene using bioinformatics prediction tools. The nonsynonymous SNPs (nsSNPs) were analyzed with the SIFT, PolyPhen, PROVEAN, SNP&GO, MutPred, SNAP, PhD-SNP, and I-Mutant tools. The non-coding SNPs (ncSNPs) were analyzed with the SNPinfo and RegulomeDB tools. In conclusion, our in silico analysis predicted 5 nsSNPs and 22 ncSNPs as potential candidates in the IL-33 gene for future genetic association studies.

3.
4.
The analysis of mitochondrial DNA (mtDNA) single nucleotide polymorphisms (SNPs) using the SNaPshot technique (Applied Biosystems) is a fast and sensitive method for the reliable identification of disease-associated, genetic-ancestry, and forensically important mtDNA SNPs. Detecting many SNPs in one multiplex PCR and one subsequent multiplex minisequencing reaction is challenging for laboratories that want to establish this technique, because no allelic ladder is available for mtDNA SNP analysis via the SNaPshot technique. Normally, a laboratory must carry out long-term testing and validation studies, and the interpretation of false and correct alleles is left to specialists who know the expected and estimated size of each SNP allele. We here present a protocol to assemble up to 84 alleles of 42 different mtDNA SNPs into an allelic ladder based on reference alleles. We recommend using allelic ladders/reference alleles for SNP analysis to maintain high-quality analysis standards.

5.
The application of a new method to the multivariate analysis of incomplete data sets is described. The new method, called maximum likelihood principal component analysis (MLPCA), is analogous to conventional principal component analysis (PCA), but incorporates measurement error variance information in the decomposition of multivariate data. Missing measurements can be handled in a reliable and simple manner by assigning large measurement uncertainties to them. The problem of missing data is pervasive in chemistry, and MLPCA is applied to three sets of experimental data to illustrate its utility. For exploratory data analysis, a data set from the analysis of archeological artifacts is used to show that the principal components extracted by MLPCA retain much of the original information even when a significant number of measurements are missing. Maximum likelihood projections of censored data can often preserve original clusters among the samples and can, through the propagation of error, indicate which samples are likely to be projected erroneously. To demonstrate its utility in modeling applications, MLPCA is also applied in the development of a model for chromatographic retention based on a data set which is only 80% complete. MLPCA can predict missing values and assign error estimates to these points. Finally, the problem of calibration transfer between instruments can be regarded as a missing data problem in which entire spectra are missing on the 'slave' instrument. Using NIR spectra obtained from two instruments, it is shown that spectra on the slave instrument can be predicted from a small subset of calibration transfer samples even if a different wavelength range is employed. Concentration prediction errors obtained by this approach were comparable to cross-validation errors obtained for the slave instrument when all spectra were available.
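The abstract's treatment of missing data (assign a huge variance so a cell carries essentially zero weight) can be sketched with a generic weighted alternating-least-squares low-rank fit. This is an illustration written for this listing, not the authors' MLPCA code; the data, variances, and rank below are invented.

```python
import numpy as np

def weighted_low_rank(X, var, rank, n_iter=200):
    """Rank-constrained fit of X minimizing sum((X - T @ P.T)**2 / var).

    Missing entries are handled by giving them a huge variance,
    i.e. essentially zero weight, mirroring the MLPCA treatment
    of missing data described in the abstract above.
    """
    W = 1.0 / var                        # element-wise weights
    Xf = np.where(np.isfinite(X), X, 0.0)
    # start from an unweighted SVD of the zero-filled matrix
    U, s, Vt = np.linalg.svd(Xf, full_matrices=False)
    T = U[:, :rank] * s[:rank]
    P = Vt[:rank].T
    for _ in range(n_iter):
        for i in range(X.shape[0]):      # weighted regression for scores
            Wi = np.diag(W[i])
            T[i] = np.linalg.solve(P.T @ Wi @ P, P.T @ Wi @ Xf[i])
        for j in range(X.shape[1]):      # weighted regression for loadings
            Wj = np.diag(W[:, j])
            P[j] = np.linalg.solve(T.T @ Wj @ T, T.T @ Wj @ Xf[:, j])
    return T, P

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3)) @ rng.normal(size=(3, 8))   # rank-3 toy data
var = np.full(X.shape, 0.01)                             # measurement variances
X[rng.random(X.shape) < 0.2] = np.nan                    # 20% missing
var[np.isnan(X)] = 1e10                                  # huge variance = ignored
T, P = weighted_low_rank(X, var, rank=3)
X_imputed = T @ P.T                  # model predictions for the missing cells
```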

6.
Analytical Letters, 2012, 45(6): 1066-1074
Toll-like receptor 8 (TLR8) plays an important role in the innate immune defense against various pathogens. TLR8 genetic variation affects the course of infections with agents such as HIV, HCV, and mycobacteria. To assess the influence of single nucleotide polymorphisms (SNPs), large cohorts need to be studied; thus, rapid and cost-efficient protocols for high-throughput TLR8 genotyping are needed. We designed a single-tube assay for genotyping four SNPs located in the major coding exon of TLR8 using the LightCycler 480 system. The new method is accurate, fast, and low-priced, and can be applied to fully automated high-throughput genotyping of TLR8 SNPs.

7.
The Coulomb interaction is one of the major time-consuming components of a density functional theory (DFT) calculation. In the last decade, dramatic progress has been made in improving the efficiency of Coulomb calculations, including the continuous fast multipole method (CFMM) and the J-engine method, both first developed inside Q-Chem. The most recent development is the Fourier transform Coulomb method of Fusti-Molnar and Pulay, an improved version of which has recently been implemented in Q-Chem. It replaces the least efficient part of the previous Coulomb methods with an accurate numerical integration scheme that scales as O(N²) instead of O(N⁴) with the basis size. The result is a much smaller slope in the linear scaling with respect to molecular size, and we demonstrate through a series of benchmark calculations that it speeds up the calculation of the Coulomb energy severalfold over the efficient existing code, i.e., the combination of CFMM and the J-engine, without loss of accuracy. Furthermore, we show that it is complementary to the latter, and together the three methods offer the best performance for the Coulomb part of DFT calculations, making DFT calculations affordable for very large systems involving thousands of basis functions.
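To see why the Coulomb term is the bottleneck these methods attack, consider the textbook contraction J_mn = sum_ls (mn|ls) P_ls, which is O(N⁴) in the number of basis functions when done naively. The sketch below uses random stand-in arrays rather than real integrals; it reflects only the scaling of this step, not Q-Chem's implementation.

```python
import numpy as np

# Naive Coulomb (J) matrix build: J_mn = sum_ls (mn|ls) * P_ls.
# The four-index electron-repulsion tensor "eri" makes this O(N^4)
# in the basis size N: the cost that CFMM, the J-engine and the
# Fourier transform Coulomb method are designed to reduce.
N = 40                                         # basis-set size (toy value)
rng = np.random.default_rng(1)
eri = rng.normal(size=(N, N, N, N))            # stand-in for real integrals
eri = 0.5 * (eri + eri.transpose(1, 0, 3, 2))  # impose one ERI symmetry
P = rng.normal(size=(N, N))
P = 0.5 * (P + P.T)                            # symmetric density matrix

J = np.einsum('mnls,ls->mn', eri, P)           # the O(N^4) contraction
E_coulomb = 0.5 * np.einsum('mn,mn->', J, P)   # Coulomb energy, (1/2) Tr[J P]
```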

8.
The identification of three-dimensional pharmacophores from large, heterogeneous data sets is still an unsolved problem. We developed a novel program, SCAMPI (statistical classification of activities of molecules for pharmacophore identification), for this purpose by combining a fast conformation search with recursive partitioning, a data-mining technique that can easily handle large data sets. The pharmacophore identification process is designed to run recursively, and the conformation spaces are resampled under the constraints of the evolving pharmacophore model. The program is capable of deriving pharmacophores from a data set of 1000-2000 compounds, with thousands of conformations generated for each compound, in less than one day of computation time. For two test data sets, the identified pharmacophores are consistent with known results from the literature.
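As a toy illustration of the recursive-partitioning component, the sketch below fits a small decision tree to binary descriptors of the kind a conformation search could produce (can a conformer present a given feature pair within a given distance range?). The descriptor names and data are hypothetical; this is not SCAMPI itself, merely the data-mining idea it builds on.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented example: binary flags record whether any conformer of a
# molecule can present a 3-D feature pair at a given distance range;
# the tree recursively splits the library into active/inactive subsets.
rng = np.random.default_rng(0)
n_mols = 1000
X = rng.integers(0, 2, size=(n_mols, 4))   # 4 hypothetical feature-pair flags
# actives require donor-acceptor at 5-7 A AND aromatic-donor at 3-5 A
y = (X[:, 0] & X[:, 2]).astype(int)
y ^= rng.random(n_mols) < 0.05             # 5% label noise

tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=20).fit(X, y)
names = ['don-acc 5-7A', 'acc-acc 4-6A', 'aro-don 3-5A', 'don-don 6-8A']
print(export_text(tree, feature_names=names))
```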

9.
Drug discovery is a complex, high-risk endeavor that requires focused attention on experimental hypotheses and the application of diverse technologies and data to facilitate high-quality decision-making, all aimed at enhancing the quality of the chemical development candidate(s) through clinical evaluation and into the market. In support of the lead generation and optimization phases of this endeavor, high-throughput technologies such as combinatorial/high-throughput synthesis and high-throughput and ultra-high-throughput screening have allowed the rapid generation and analysis of large numbers of compounds and data. Today, for every analog synthesized, 100 or more data points can be collected and captured in various centralized databases, so the analysis of thousands of compounds can very quickly become a daunting task. In this article we present the process we have developed for analyzing and prioritizing large data sets, starting from diversity and focused uHTS in support of lead generation and from secondary screens supporting lead optimization. We describe how we use informatics and computational chemistry to focus our efforts on asking relevant questions about the desired attributes of a specific library, and subsequently on guiding the generation of more information-rich sets of analogs in support of both processes.

10.
The analysis of individual molecules is evolving into an important tool for biological research and presents conceptually new ways of approaching experimental design. However, more robust methods are required if these technologies are to be made broadly available to the biological research community. To help achieve this goal, we have combined nanofabrication techniques with single-molecule optical microscopy to assemble and visualize curtains composed of thousands of individual DNA molecules organized at engineered diffusion barriers on a lipid bilayer-coated surface. Here we present an important extension of this technology that implements geometric barrier patterns comprising thousands of nanoscale wells that can be loaded with single molecules of DNA. We show that these geometric nanowells can be used to precisely control the lateral distribution of the individual DNA molecules within curtains assembled along the edges of the engineered barrier patterns; the individual molecules making up a DNA curtain can be separated from one another by a user-defined distance dictated by the dimensions of the nanowells. We demonstrate the broader utility of these patterned DNA curtains in a novel real-time restriction assay, which we refer to as dynamic optical restriction mapping, that can be used to rapidly identify entire sets of cleavage sites within a large DNA molecule.

11.
Forensic analysis of mitochondrial displacement loop (D-loop) sequences using Sanger sequencing or SNP detection by minisequencing is well established. Pyrosequencing has become an important alternative because it enables high-throughput analysis and the quantification of individual mitochondrial DNAs (mtDNAs) in samples originating from more than one individual. DNA typing of the mitochondrial D-loop region is usually the method of choice when STR analysis fails because of trace amounts of DNA and/or extensive degradation. The main aim of the present work was to optimize the efficiency of pyrosequencing. To do this, 31 SNPs within hypervariable regions I and II of the D-loop of human mtDNA were analyzed simultaneously. As a novel approach, we applied two sets of amplification primers for the multiplexing assay, combined with four sequencing primers for pyrosequencing. The method was compared with conventional sequencing of mtDNA from blood and biological trace materials.

12.
Maximum likelihood principal component analysis (MLPCA) was originally proposed to incorporate measurement error variance information in principal component analysis (PCA) models. MLPCA can be used to fit PCA models in the presence of missing data simply by assigning very large variances to the non-measured values. An assessment of maximum likelihood missing data imputation is performed in this paper, analysing the MLPCA algorithm and adapting several methods for PCA model building with missing data to their maximum likelihood versions. In this way, known data regression (KDR), KDR with principal component regression (PCR), KDR with partial least squares regression (PLS) and trimmed scores regression (TSR) methods are implemented within the MLPCA method to work as different imputation steps. Six data sets are analysed using several percentages of missing data, comparing the performance of the original algorithm, and of its adapted regression-based methods, with other state-of-the-art methods.
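As background for the imputation methods compared here, the following is a generic iterative-PCA imputation loop: seed missing cells with column means, fit a PCA model, overwrite the missing cells with the reconstruction, and repeat until the imputed values stabilize. It is a simple baseline for illustration, not the KDR/PCR/PLS/TSR steps evaluated in the paper.

```python
import numpy as np

def iterative_pca_impute(X, rank, n_iter=100, tol=1e-8):
    """Fill NaNs by alternating a PCA fit and model reconstruction.

    Assumes every column has at least one observed value. This is a
    generic baseline, not the regression-based imputation variants
    compared in the study above.
    """
    miss = np.isnan(X)
    Xi = np.where(miss, np.nanmean(X, axis=0), X)  # seed with column means
    for _ in range(n_iter):
        mu = Xi.mean(axis=0)
        U, s, Vt = np.linalg.svd(Xi - mu, full_matrices=False)
        recon = mu + (U[:, :rank] * s[:rank]) @ Vt[:rank]
        delta = np.max(np.abs(Xi[miss] - recon[miss])) if miss.any() else 0.0
        Xi[miss] = recon[miss]                     # overwrite missing cells
        if delta < tol:
            break
    return Xi
```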

13.
A procedure is described for calculating the similarity between molecules characterised by lists of substructural fragments. The method is very much faster in operation than the conventional method and permits data sets containing some thousands of compounds to be processed.
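The classic measure for this kind of fragment-list comparison is the Tanimoto coefficient, |A ∩ B| / |A ∪ B|. A minimal sketch with hypothetical fragment identifiers (the paper does not specify its fragment dictionary):

```python
def tanimoto(frags_a: set, frags_b: set) -> float:
    """Tanimoto coefficient on substructural-fragment sets:
    |A intersect B| / |A union B|."""
    if not frags_a and not frags_b:
        return 0.0
    common = len(frags_a & frags_b)
    return common / (len(frags_a) + len(frags_b) - common)

# hypothetical fragment identifiers
aspirin_like = {"benzene", "ester", "carboxylic_acid"}
salicylate   = {"benzene", "phenol", "carboxylic_acid"}
print(f"{tanimoto(aspirin_like, salicylate):.2f}")   # prints 0.50
```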

14.
Detection of nucleic acids and single nucleotide polymorphisms (SNPs) is of pivotal importance in biology and medicine. Given that the biological effect of SNPs is often enhanced in combination with other SNPs, multiplexed SNP detection is desirable. We show proof of concept of the multiplexed detection of SNPs based on the template-directed native chemical ligation (NCL) of PNA probes carrying a metal tag that allows detection by ICP-MS. For the detection of ssDNA oligonucleotides (30 bases), two probes, one carrying the metal tag and a second carrying biotin for purification, are covalently ligated. The methodological limit of detection is 29 pM, with an RSD of 6.7% at 50 pM (n = 5). Detection of SNPs is performed with a combination of two sets of reporter probes: the first probe set targets the SNP, and its yield is compared with that of a second set targeting a neighboring sequence. The assay was used to simultaneously differentiate between alleles of three SNPs at 5 nM concentration.

15.
We describe a method for performing trilinear analysis on large data sets using a modification of the PARAFAC-ALS algorithm. Our method iteratively decomposes the data matrix into a core matrix and three loading matrices based on the Tucker1 model. The algorithm is particularly useful for data sets that are too large to load into a computer's main memory. While the performance advantage of our algorithm depends on the number of data elements and the dimensions of the data array, we have seen a significant performance improvement over running PARAFAC-ALS on the full data set. In one case, for data comprising hyperspectral images from a confocal microscope, our method of analysis was approximately 60 times faster than operating on the full data set while obtaining essentially equivalent results.
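For reference, a plain in-memory PARAFAC-ALS loop is sketched below: each factor is updated in turn, e.g. A ← X(1)(B ⊙ C)[(BᵀB) ∗ (CᵀC)]⁺, where ⊙ is the Khatri-Rao product and X(1) the mode-1 unfolding. This is the standard algorithm that the abstract's Tucker1-based modification makes memory-efficient, not the authors' variant.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product: row (j*K + k) equals B[j] * C[k]."""
    J, R = B.shape
    K = C.shape[0]
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def parafac_als(X, rank, n_iter=100):
    """Plain in-memory PARAFAC-ALS for a 3-way array X (I x J x K)."""
    I, J, K = X.shape
    rng = np.random.default_rng(0)
    A, B, C = (rng.normal(size=(n, rank)) for n in (I, J, K))
    X1 = X.reshape(I, J * K)                        # mode-1 unfolding
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)     # mode-2 unfolding
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)     # mode-3 unfolding
    for _ in range(n_iter):
        A = X1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C
```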

16.
Analogue series play a key role in drug discovery. They arise naturally in lead optimization efforts where analogues are explored based on one or a few core structures. However, it is much harder to accurately identify and extract pairs or series of analogue molecules in large compound databases with no predefined core structures. This methodological review outlines the most common and recent methodological developments to automatically identify analogue series in large libraries. Initial approaches focused on using predefined rules to extract scaffold structures, such as the popular Bemis–Murcko scaffold. Later on, the matched molecular pair concept led to efficient algorithms to identify similar compounds sharing a common core structure by exploring many putative scaffolds for each compound. Further developments of these ideas yielded, on the one hand, approaches for hierarchical scaffold decomposition and, on the other hand, algorithms for the extraction of analogue series based on single-site modifications (so-called matched molecular series) by exploring potential scaffold structures based on systematic molecule fragmentation. Eventually, further development of these approaches resulted in methods for extracting analogue series defined by a single core structure with several substitution sites that allow convenient representations, such as R-group tables. These methods enable the efficient analysis of large data sets with hundreds of thousands or even millions of compounds and have spawned many related methodological developments.
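A sketch of the scaffold-grouping step these methods share, assuming the open-source RDKit toolkit (whose MurckoScaffold module implements the Bemis–Murcko decomposition mentioned above); the three-compound library is invented:

```python
from collections import defaultdict
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

# Group a tiny library by Bemis-Murcko scaffold: molecules sharing a
# scaffold are candidate members of the same analogue series.
smiles_library = [
    "c1ccccc1CCN",        # phenethylamine
    "c1ccccc1CCNC",       # N-methyl analogue, same scaffold
    "c1ccc2ccccc2c1O",    # naphthol, different scaffold
]
series = defaultdict(list)
for smi in smiles_library:
    mol = Chem.MolFromSmiles(smi)
    scaffold = MurckoScaffold.MurckoScaffoldSmiles(mol=mol)
    series[scaffold].append(smi)

for scaffold, members in series.items():
    print(scaffold, "->", members)
```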

17.
Single nucleotide polymorphisms (SNPs) are the most abundant variations in the human genome and have become the primary markers in genetic studies for mapping and identifying genes conferring susceptibility to complex diseases. Methods that genotype SNPs quickly and economically are of high value for these studies because the studies require a large amount of genotyping. Fluorescence polarization (FP) is a robust technique that can detect products without separation and purification, and it has been applied to SNP genotyping. In this article the applications of FP in SNP genotyping are reviewed, and one of the methods, the FP-TDI assay, is discussed in detail. It is hoped that readers will find useful information on the applications of FP in SNP genotyping and gain some insight into the FP-TDI assay.
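The quantity behind all FP assays is the polarization P = (I∥ − G·I⊥) / (I∥ + G·I⊥), usually reported in millipolarization (mP) units, where G corrects for instrument bias between the two detection channels. A small helper with made-up intensities:

```python
def polarization_mP(i_parallel: float, i_perpendicular: float,
                    g_factor: float = 1.0) -> float:
    """Fluorescence polarization in millipolarization (mP) units:
    P = 1000 * (I_par - G*I_perp) / (I_par + G*I_perp).
    G corrects for the instrument's detection-channel bias."""
    corrected = g_factor * i_perpendicular
    return 1000.0 * (i_parallel - corrected) / (i_parallel + corrected)

# In an FP-TDI readout, incorporation of a dye-labeled terminator into
# the slow-tumbling primer raises P; free terminator gives low P.
print(polarization_mP(1200.0, 800.0))   # illustrative intensities -> 200.0 mP
```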

18.
SNPs are one of the main sources of DNA variation among humans. Their unique properties make them useful polymorphic markers for a wide range of fields, such as medicine, forensics, and population genetics. Although several high-throughput techniques have been (and are being) developed for large-scale SNP typing in the medical context, population genetic studies involve typing a few selected SNPs for targeted research. This means SNPs have to be typed in multiple reactions, consuming large amounts of time and DNA. To improve the current situation in the area of human Y-chromosome diversity studies, we employed a system based on a multiplex oligo ligation assay/PCR (OLA/PCR) followed by CE to create a Y multiplex capable of distinguishing, in a single reaction, all the major haplogroups and as many subhaplogroups of the Y-chromosome phylogeny as possible. Our efforts resulted in a robust and accurate 35plex (35 SNPs in a single reaction) that, when tested on 165 human DNA samples from different geographic areas, proved capable of assigning samples to their corresponding haplogroups.

19.
Large transition-metal complexes are used in numerous areas of chemistry. Computer-aided theoretical investigations of such complexes are limited by the sheer size of real systems, which often consist of hundreds to thousands of atoms. Accordingly, the development and thorough evaluation of fast semi-empirical quantum chemistry methods that are universally applicable to a large part of the periodic table is indispensable. Herein, we report on the capability of the recently developed GFNn-xTB method family for full quantum-mechanical geometry optimisation of medium to very large transition-metal complexes and organometallic supramolecular structures. Results are presented for a specially compiled benchmark set of 145 diverse closed-shell transition-metal complex structures covering all metals up to Hg. Furthermore, the GFNn-xTB methods are tested on three established benchmark sets of reaction energies and barrier heights of organometallic reactions.

20.
In this work, two maximum likelihood approaches to multivariate curve resolution, based on maximum likelihood principal component analysis (MLPCA) and on weighted alternating least squares (WALS), are compared with the standard multivariate curve resolution alternating least squares (MCR-ALS) method. To illustrate this comparison, three experimental data sets are used: the first is an environmental aerosol source apportionment; the second is a time-course DNA microarray; and the third is an ultrafast absorption spectroscopy experiment. The error structures of the first two data sets were heteroscedastic and uncorrelated, the difference between them being the existence of missing values in the second case. In the third data set, on ultrafast spectroscopy, error correlation between the values at different wavelengths is present. The results confirmed that the component profiles resolved by MLPCA-MCR-ALS are practically identical to those obtained by MCR-WALS and that they can differ from those resolved by ordinary MCR-ALS, especially in the case of high noise. Methods that incorporate uncertainty estimates (such as MLPCA-ALS and MCR-WALS) can provide more reliable results and better-estimated parameters than unweighted approaches (such as MCR-ALS) in the presence of large amounts of noise. A possible advantage of MLPCA-MCR-ALS over MCR-WALS is that the former does not require changing the traditional MCR-ALS algorithm, because MLPCA is used only as a preliminary data pretreatment before MCR analysis.
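To make the baseline of the comparison concrete, here is a bare-bones unweighted MCR-ALS loop with non-negativity imposed by simple clipping. It is a sketch of the standard method, not the authors' code; real applications add constraints such as closure or unimodality and check convergence.

```python
import numpy as np

def mcr_als(D, n_components, n_iter=200):
    """Bare-bones unweighted MCR-ALS: D (samples x channels) ~ C @ S.T,
    with non-negativity imposed on both factors by clipping. This is
    the baseline the weighted MLPCA/WALS variants improve on."""
    rng = np.random.default_rng(0)
    S = np.abs(rng.normal(size=(D.shape[1], n_components)))  # spectra guess
    for _ in range(n_iter):
        C = np.clip(D @ S @ np.linalg.pinv(S.T @ S), 0, None)   # concentrations
        S = np.clip(D.T @ C @ np.linalg.pinv(C.T @ C), 0, None) # spectra
    return C, S

# toy demonstration: two components measured over 20 samples, 50 channels
rng = np.random.default_rng(1)
C_true = np.abs(rng.normal(size=(20, 2)))
S_true = np.abs(rng.normal(size=(50, 2)))
D = C_true @ S_true.T + 0.01 * rng.normal(size=(20, 50))
C, S = mcr_als(D, n_components=2)
```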
