Similar literature
 20 similar documents found (search time: 15 ms)
1.
We have created an analysis pipeline called Sprockets, which can be used to classify proteins into hierarchical “families” and to build searchable models of these families. The families are constructed from Expressed Sequence Tags (ESTs) and Coding DNA Sequences (CDSs), making Sprockets clusters especially suitable for studying gene families in organisms for which a completely sequenced genome does not (yet) exist. The pipeline consists of two main parts: pair-wise analysis and grouping of sequences with Z-score statistics, followed by hierarchical splitting of clusters into alignable protein families. The computational and statistical techniques applied in Sprockets allow it to act as a massive and selective multiple sequence alignment engine for combining individual sequence collections with related public sequences. The end result is a database of gene Hidden Markov Models, each related to the others by three levels of similarity: secondary structure, function and evolutionary origin. For a sample set of 20,000 ESTs from Lactuca spp., Sprockets provided a 9% improvement in mapping of function to unknown sequences over traditional pair-wise search methods and InterPro mapping.

2.
Peptide mass fingerprints were obtained for three different proteins using three different digestion procedures, each in triplicate, with liquid chromatography coupled to electrospray ionization mass spectrometry. For each protein the results were compared using multivariate data analysis (cluster analysis, kernel principal component analysis) and pair-wise contrast evaluation. Clear systematic differences between the digestion procedures were established for all the proteins. The visual presentation of the pair-wise differences between procedures could to some extent be related to the protein fragments, although the main objective was to identify m/z and retention regions in the original peptide maps that should be subject to further exploration.

3.
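A local modeling method for near-infrared spectroscopy based on wavelet coefficients and its application   Total citations: 2, self-citations: 0, citations by others: 2
Local modeling builds a model from samples similar to the prediction sample, which addresses the nonlinearity between spectral response and concentration, widens the applicable range of the model, and improves prediction accuracy. Using the wavelet transform for data compression and the Euclidean distance between wavelet coefficients as the criterion of spectral similarity, a local modeling method for quantitative near-infrared spectral analysis was implemented that avoids dependence between samples. The method was applied to the determination of chlorine content in tobacco samples. Over 100 repeated calculations, the root mean square error of prediction (RMSEP) had a mean of 0.0665 and a standard deviation (σ) of 0.0045, outperforming both global modeling and local modeling based on principal components.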
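The sample selection step of such a local modeling scheme can be sketched as follows. This is only a minimal illustration: it assumes each spectrum has already been compressed to a short vector of wavelet coefficients, and the names `euclidean`, `select_local_set`, the neighbour count `k` and the toy vectors are all hypothetical rather than taken from the paper. The regression model that would then be fitted to the selected samples is not shown.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two coefficient vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def select_local_set(query, calibration, k):
    """Return the indices of the k calibration samples closest to the query.

    Assumption: every entry of `calibration` and `query` is a vector of
    wavelet coefficients computed the same way for each spectrum.
    """
    order = sorted(range(len(calibration)),
                   key=lambda i: euclidean(query, calibration[i]))
    return order[:k]


# Toy coefficient vectors (made up for illustration, not real NIR data).
calibration_coeffs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.2, 0.1]]
query_coeffs = [0.1, 0.1]
local_indices = select_local_set(query_coeffs, calibration_coeffs, k=2)
```

A separate local set is selected for each prediction sample, so the calibration model is rebuilt per query rather than once for the whole data set.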

4.
A novel characterization of proteins is presented, based on selected properties of the recently introduced 20 × 20 amino acid adjacency matrix of proteins, in which matrix elements count the occurrences of all 400 possible pair-wise adjacencies obtained by reading the protein primary sequence from left to right. In particular, we consider the characterization based on the sum and the difference of the rows and the corresponding columns, which characterize proteins by a pair of 20-component vectors. The approach is illustrated on a set of ND6 proteins of eight species.
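As a rough sketch of the construction described above, the code below builds the 20 × 20 adjacency matrix from a sequence and derives one pair of 20-component vectors. The abstract does not spell out how rows and columns are combined, so the reading used here (row total plus or minus column total for each amino acid) is an assumption, as are the function names, the residue ordering and the toy sequence.

```python
# The 20 standard amino acids in one-letter code; the ordering is an arbitrary choice.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def adjacency_matrix(sequence):
    """Count every ordered pair of neighbouring residues, reading left to right."""
    matrix = [[0] * 20 for _ in range(20)]
    for left, right in zip(sequence, sequence[1:]):
        matrix[INDEX[left]][INDEX[right]] += 1
    return matrix


def row_column_descriptors(matrix):
    """Assumed reading: per amino acid, row total + column total and row total - column total."""
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(20)) for j in range(20)]
    sums = [r + c for r, c in zip(row_sums, col_sums)]
    diffs = [r - c for r, c in zip(row_sums, col_sums)]
    return sums, diffs


toy_sequence = "MKAAKM"  # hypothetical fragment, not one of the ND6 sequences
toy_matrix = adjacency_matrix(toy_sequence)
toy_sums, toy_diffs = row_column_descriptors(toy_matrix)
```

Under this particular reading the difference vector depends only on the first and last residue of the sequence, so the published descriptors may well combine rows and columns differently (for example element by element); the sketch is meant to show the matrix construction rather than to reproduce the paper's vectors.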

5.
A novel characterization of proteins is presented, based on selected properties of the recently introduced 20 × 20 amino acid adjacency matrix of proteins, in which matrix elements count the occurrences of all 400 possible pair-wise adjacencies obtained by reading the protein primary sequence from left to right. In particular, we consider the characterization based on the sum and the difference of the rows and the corresponding columns, which characterize proteins by a pair of 20-component vectors. The approach is illustrated on a set of ND6 proteins of eight species.

6.
Discrete wavelet transform (DWT) provides a well-established means for spectral denoising and baseline elimination to enhance resolution and improve the performance of calibration and classification models. However, the limitation of a fixed filter bank can prevent the optimal application of conventional DWT for the multiresolution analysis of spectra of arbitrarily varying noise and background. This paper presents a novel methodology based on an improved, second-generation adaptive wavelet transform (AWT) algorithm. This AWT methodology uses a spectrally adapted lifting scheme to generate an infinite basis of wavelet filters from a single conventional wavelet, and then finds the optimal one. Such pretreatment combined with a multivariate calibration approach such as partial least squares can greatly enhance the utility of Raman spectroscopy for quantitative analysis. The present work demonstrates this methodology using two dispersive Raman spectral data sets, incorporating lactic acid and melamine in pure water and in milk solutions. The results indicate that AWT can separate spectral background and noise from signals of interest more efficiently than conventional DWT, thus improving the effectiveness of Raman spectroscopy for quantitative analysis and classification.

7.
Rapid diagnosis is important for efficient treatment in clinical medicine. This study aimed to develop a method for rapid and reliable diagnosis using near-infrared (NIR) spectra of human serum samples with the help of chemometric modelling. The NIR spectra of sera from 48 healthy individuals and 16 patients with suspected kidney disease were analyzed. Discrete wavelet transform (DWT) and variable selection were adopted to extract the useful information from the spectra. Principal component analysis (PCA), linear discriminant analysis (LDA) and partial least squares discriminant analysis (PLSDA) were used to discriminate the samples. The two classes of sera were classified using LDA and PLSDA combined with DWT and variable selection. DWT-LDA produced recognition rates of 93.8% and 83.3% for the validation samples of the two classes, and 100% recognition rates were obtained using DWT-PLSDA. The results demonstrated that the tiny differences between the spectra of the sera were effectively captured using DWT and variable selection, and that these differences can be used to discriminate the sera of healthy individuals from those of possible patients. NIR spectroscopy combined with chemometrics may be a potential technique for fast diagnosis of kidney disease.

8.
9.
Liu BF, Sera Y, Matsubara N, Otsuka K, Terabe S. Electrophoresis, 2003, 24(18): 3260-3265
Signal denoising and baseline correction using discrete wavelet transform (DWT) are described for microchip capillary electrophoresis (MCE). DWT was performed on an electropherogram describing a separation of nine tetramethylrhodamine-5-isothiocyanate labeled amino acids, following MCE with laser-induced fluorescence detection, using the Daubechies 5 wavelet at a decomposition level of 6. The denoising efficiency was compared with, and proved to be superior to, other commonly used denoising techniques such as Fourier transform, Savitzky-Golay smoothing and moving average, in terms of noise removal and peak preservation as judged by direct visual inspection. Novel strategies for baseline correction were proposed, with a special interest in the baseline drift that frequently occurs in chromatographic and electrophoretic separations.

10.
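Absorption spectra of mixed solutions of phenol, aniline and benzoic acid were measured over the wavelength range 200-400 nm. The spectral data were processed with the discrete wavelet transform (DWT) and then modeled by support vector regression (SVR), establishing a combined DWT-SVR method. The method was applied to the simultaneous determination of phenol, aniline and benzoic acid in simulated samples and polluted water samples, with satisfactory results.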

11.
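Potential functions in high-temperature molecular dynamics simulation of silica glass   Total citations: 1, self-citations: 0, citations by others: 1
Based on molecular dynamics studies of silica glass at high temperature, the limitations of many-body potentials in high-temperature applications are analyzed, and ionic pair potentials are found to be superior to many-body potentials for simulating the high-temperature structure of silica glass. Regarding atomic charge transfer, the effect of the magnitude of the Si and O atomic charges on the calculated atomic self-diffusion coefficients was computed and analyzed. The activation temperature for atomic self-diffusion calculated with the Morse potential, which involves less charge transfer, is lower than that calculated with the BKS potential, and at a given temperature the calculated self-diffusion coefficient increases as the atomic charge decreases. A smaller atomic charge transfer should therefore be favorable for studying the dynamic properties of silica glass at high temperature.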

12.
13.
Conventionally, protein structure prediction via “threading” relies on some nonoptimal method to align a protein sequence to each member of a library of known structures. We show how a score function (force field) can be modified so as to allow the direct application of a dynamic programming algorithm to the problem. This involves an approximation whose damage can be minimized by an optimization process during score function parameter determination. The method is compared to sequence-to-structure alignments using a more conventional pair-wise score function and the frozen approximation. The new method produces results comparable to the frozen approximation, but is faster and has fewer adjustable parameters. It is also free of memory of the template's original amino acid sequence, and does not suffer from the nonconvergence problem that can be shown to occur with the frozen approximation. Alignments generated by the simplified score function can then be ranked using a second score function with the approximations removed. ©1999 John Wiley & Sons, Inc. J Comput Chem 20: 1455–1467, 1999

14.
Cao W, Chen X, Yang X, Wang E. Electrophoresis, 2003, 24(18): 3124-3130
Discrete wavelet transform (DWT) was applied to noise removal on capillary electrophoresis-electrochemiluminescence (CE-ECL) electropherograms. Several typical wavelet families, including Haar, Daublets, Coiflets, and Symmlets, were evaluated. Four threshold determination methods (fixed form threshold, rigorous Stein's unbiased estimate of risk (rigorous SURE), heuristic SURE and minimax), combined with hard and soft thresholding, were compared. The denoising study on synthetic signals showed that the Symmlet 4 wavelet with a decomposition level of 5 and the heuristic SURE-hard thresholding method provide the optimum denoising strategy. Using this strategy, the noise on CE-ECL electropherograms could be removed adequately. Compared with the Savitzky-Golay and Fourier transform denoising methods, DWT is an efficient method for noise removal with better preservation of peak shape.
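The general decompose, threshold, reconstruct pattern behind wavelet denoising can be sketched in a few lines. The example below is deliberately simplified and is not the strategy the authors found optimal: it uses the Haar wavelet rather than Symmlet 4, and a threshold supplied by the caller (with the fixed form rule as one option) rather than heuristic SURE. All function names and the toy signal are hypothetical.

```python
import math


def haar_forward(signal, levels):
    """Multi-level orthonormal Haar DWT; len(signal) must be divisible by 2**levels."""
    approx = list(signal)
    details = []  # finest level first
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding from the coarsest level to the finest."""
    signal = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for s, d in zip(signal, detail):
            rebuilt.extend([(s + d) / math.sqrt(2), (s - d) / math.sqrt(2)])
        signal = rebuilt
    return signal


def hard_threshold(coeffs, threshold):
    """Keep coefficients above the threshold unchanged; zero the rest."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]


def soft_threshold(coeffs, threshold):
    """Zero small coefficients and shrink the remaining ones toward zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def fixed_form_threshold(n, sigma):
    """Fixed form (universal) threshold sigma * sqrt(2 ln n) for n samples."""
    return sigma * math.sqrt(2.0 * math.log(n))


def denoise(signal, levels, threshold, rule=hard_threshold):
    """Threshold the detail coefficients only, then reconstruct the signal."""
    approx, details = haar_forward(signal, levels)
    return haar_inverse(approx, [rule(d, threshold) for d in details])


# Toy step signal with a small alternating disturbance (not real CE-ECL data).
noisy = [1.1, 0.9, 1.1, 0.9, 5.1, 4.9, 5.1, 4.9]
smoothed = denoise(noisy, levels=1, threshold=0.5)
```

In this toy case every detail coefficient falls below the threshold, so the reconstruction is the clean step from 1 to 5. Hard thresholding leaves the surviving coefficients untouched, whereas soft thresholding also shrinks them, which tends to give smoother output at the cost of some peak height.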

15.
Kim J, Taylor D, Agrawal N, Wang H, Kim H, Han A, Rege K, Jayaraman A. Lab on a Chip, 2012, 12(10): 1813-1822
We describe the development of a fully automatic and programmable microfluidic cell culture array that integrates on-chip generation of drug concentrations and pair-wise combinations with parallel culture of cells for drug candidate screening applications. The device has 64 individually addressable cell culture chambers in which cells can be cultured and exposed either sequentially or simultaneously to 64 pair-wise concentration combinations of two drugs. For sequential exposure, a simple microfluidic diffusive mixer is used to generate different concentrations of drugs from two inputs. For generation of 64 pair-wise combinations from two drug inputs, a novel time dependent variable concentration scheme is used in conjunction with the simple diffusive mixer to generate the desired combinations without the need for complex multi-layer structures or continuous medium perfusion. The generation of drug combinations and exposure to specific cell culture chambers are controlled using a LabVIEW interface capable of automatically running a multi-day drug screening experiment. Our cell array does not require continuous perfusion for keeping cells exposed to concentration gradients, minimizing the amount of drug used per experiment, and cells cultured in the chamber are not exposed to significant shear stress continuously. The utility of this platform is demonstrated for inducing loss of viability of PC3 prostate cancer cells using combinations of either doxorubicin or mitoxantrone with TRAIL (TNF-alpha Related Apoptosis Inducing Ligand) either in a sequential or simultaneous format. Our results demonstrate that the device can capture the synergy between different sensitizer drugs and TRAIL and demonstrate the potential of the microfluidic cell array for screening and optimizing combinatorial drug treatments for cancer therapy.

16.
We have developed PLASS (Protein-Ligand Affinity Statistical Score), a pair-wise potential of mean force for rapid estimation of the binding affinity of a ligand molecule to a protein active site. This scoring function is derived from the frequency of occurrence of atom-type pairs in crystallographic complexes taken from the Protein Data Bank (PDB). Statistical distributions are converted into distance-dependent contributions to the Gibbs free interaction energy for 10 atomic types using the Boltzmann hypothesis, with only one adjustable parameter. For a representative set of 72 protein-ligand structures, PLASS scores correlate well with the experimentally measured dissociation constants, with a correlation coefficient R of 0.82 and an RMS error of 2.0 kcal/mol. Such high accuracy results from our novel treatment of the volume correction term, which takes into account the inhomogeneous properties of the protein-ligand complexes. PLASS is able to rank reliably the affinity of complexes as diverse as those in the PDB.
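The conversion of pair statistics into energies "using the Boltzmann hypothesis" is commonly written as E(r) = -kT ln(p_obs(r) / p_ref(r)). The sketch below illustrates only this generic inverse Boltzmann step for a single atom-type pair. It does not reproduce the PLASS reference state, its volume correction term or its adjustable parameter, and the function name, bin counts and temperature are assumptions made for illustration.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann (gas) constant in kcal/(mol*K)


def boltzmann_inversion(observed_counts, reference_counts, temperature=298.15):
    """Convert per-distance-bin pair counts into energies in kcal/mol.

    Each energy is -kT * ln(p_obs / p_ref), where both distributions are
    normalised to sum to one. Bins without statistics yield None.
    """
    total_obs = sum(observed_counts)
    total_ref = sum(reference_counts)
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        if obs == 0 or ref == 0:
            energies.append(None)  # no statistics available for this bin
            continue
        ratio = (obs / total_obs) / (ref / total_ref)
        energies.append(-K_B_KCAL * temperature * math.log(ratio))
    return energies


# Hypothetical counts for one atom-type pair in four distance bins.
observed = [10, 40, 25, 25]
reference = [25, 25, 25, 25]  # placeholder uniform reference state
pair_energies = boltzmann_inversion(observed, reference)
```

Distances seen more often than the reference predicts come out with negative (favourable) energies, and under-represented distances come out positive. The choice of reference state largely determines the quality of such a potential, which is where the abstract places the method's novelty.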

17.
A model of hydrophobic collapse, which is treated as the driving force for protein folding, is presented. This model is the superposition of three models commonly used in protein structure prediction: (1) the 'oil-drop' model introduced by Kauzmann, (2) a lattice model introduced to decrease the number of degrees of freedom for structural changes and (3) a model of the formation of the hydrophobic core as a key feature in driving the folding of proteins. These three models together helped to develop the idea of a fuzzy-oil-drop as a model for an external force field of hydrophobic character mimicking the hydrophobicity-differentiated environment for hydrophobic collapse. All amino acids in the polypeptide interact pair-wise during the folding process (energy minimization procedure) and interact with the external hydrophobic force field defined by a three-dimensional Gaussian function. The value of the Gaussian function, usually interpreted as a probability distribution, is treated as a normalized hydrophobicity distribution, with its maximum at the center of the ellipsoid and decreasing with distance from the center. The fuzzy-oil-drop is elastic and changes its shape and size during the simulated folding procedure.
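The idea of a three-dimensional Gaussian serving as a normalized hydrophobicity distribution can be illustrated with a short sketch. It assumes an axis-aligned ellipsoid with one width parameter per axis and normalizes over a given set of points so that the values sum to one; the function names, the coordinates and the widths are hypothetical and not taken from the paper.

```python
import math


def fuzzy_oil_drop(point, center, sigmas):
    """Unnormalised 3D Gaussian: 1.0 at the centre, decaying along each axis."""
    exponent = sum((p - c) ** 2 / (2.0 * s ** 2)
                   for p, c, s in zip(point, center, sigmas))
    return math.exp(-exponent)


def normalized_profile(points, center, sigmas):
    """Gaussian values at the given points, scaled so that they sum to one."""
    values = [fuzzy_oil_drop(p, center, sigmas) for p in points]
    total = sum(values)
    return [v / total for v in values]


# Hypothetical residue positions along one axis of the drop.
drop_center = (0.0, 0.0, 0.0)
drop_sigmas = (4.0, 3.0, 2.0)  # a different width per axis gives an ellipsoid
residue_points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
profile = normalized_profile(residue_points, drop_center, drop_sigmas)
```

Changing `drop_sigmas` between simulation steps is one simple way to mimic the elastic drop that changes shape and size during folding.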

18.
We present a chemometrics study in which we show the identity or degree of similarity of the 3D protein structures of various G-CSF (Granulocyte Colony-Stimulating Factor) isolates. The G-CSF isolates share the same amino acid sequence, but were prepared by somewhat different technologies. The 3D structures were compared on the basis of the 2D NMR NOESY (Nuclear Overhauser Enhancement Spectroscopy) spectra of the proteins. In searching for the most appropriate criteria to determine the identity or degree of similarity of selected spectral regions of different isolates, two methods for quantitative evaluation of identity/similarity were used. The first method compares all peaks in the two investigated protein spectral regions and determines the extent to which the peaks overlap. The second method uses spectral invariants originating from graph theory. The criteria of identity/similarity were calculated from graphs derived from a collection of up to 200 peaks of the investigated 2D NMR spectral region. The peaks were linked into a graph according to their sequential nearest neighborhoods. In the first method all peaks were treated as relevant, given that spectral noise had previously been removed; the largest similarity was found between the protein of a commercially available G-CSF drug and one of the three new isolates produced in the laboratory. The second method indicated that the pairwise similarity of the three new isolates is larger than the similarity of any of the new isolates to the commercially available drug. This is an expected result, taking into account that the new isolates are produced by the same technology, while the commercial product has additives for long-term storage that could not be completely compensated for. The proposed measure of similarity may help the developers of biosimilar products to optimize the controllable parameters of the production technology and eventually to argue the identity of the new isolate in comparison with the originator commercial product.

19.
Peptide and protein drug molecules fold into higher order structures (HOS) in formulation, and these folded structures are often critical for drug efficacy and safety. Generic or biosimilar drug products (DPs) need to show HOS similar to that of the reference product. Solution NMR spectroscopy is a non-invasive, chemically and structurally specific analytical method that is ideal for characterizing protein therapeutics in formulation. However, only limited NMR studies have been performed directly on marketed DPs, and questions remain on how to quantitatively define similarity. Here, NMR spectra were collected on marketed peptide and protein DPs, including calcitonin-salmon, liraglutide, teriparatide, exenatide, insulin glargine and rituximab. The 1D 1H spectral pattern readily revealed protein HOS heterogeneity, exchange and oligomerization in the different formulations. Principal component analysis (PCA) applied to two rituximab DPs showed results consistent with the previously demonstrated similarity metric of a Mahalanobis distance (DM) of 3.3. The 2D 1H-13C HSQC spectral comparison of insulin glargine DPs provided similarity metrics for chemical shift difference (Δδ) and methyl peak profile, i.e., 4 ppb for 1H, 15 ppb for 13C and 98% of peaks with equivalent peak height. Finally, 2D 1H-15N sofast HMQC was demonstrated as a sensitive method for comparison of small protein HOS. The application of NMR procedures and chemometric analysis to therapeutic proteins offers quantitative similarity assessments of DPs with practically achievable similarity metrics.

20.
Raul Cartas, Andrey Legin. Talanta, 2010, 80(3): 1428-1435
Simultaneous quantification of Cd2+ and Pb2+ in solution has been achieved using the kinetic information from a single non-specific potentiometric sensor. Dual quantification was accomplished from the complex information in the transient response of an electrode used in a Sequential Injection Analysis (SIA) system and recorded after step injection of the sample. Data were first preprocessed with the Discrete Wavelet Transform (DWT) to extract significant features and then fed into an Artificial Neural Network (ANN) to build the calibration model. The DWT stage was optimized with respect to the wavelet function and decomposition level, while the ANN stage was optimized with respect to its structure. To corroborate the effectiveness of the approach, two different potentiometric sensors were used as case studies, one using a glass selective to Cd2+ and the other a PVC membrane selective to Pb2+.
