Classifying proteins into their respective enzyme classes is of interest to researchers for a variety of reasons. The open source Protein Data Bank (PDB) contains more than 160,000 structures, with more being added every day. This paper proposes an attention-based bidirectional LSTM model (ABLE), trained on oversampled data generated by SMOTE, that classifies a protein into one of the six enzyme classes or a negative class using only its primary structure, given as a FASTA sequence string. Our proposed model achieves the highest F1-score, 0.834, on a dataset of proteins from the PDB. We benchmark our model against eighteen other machine learning and deep learning models, including CNN, LSTM, Bi-LSTM, GRU, and the state-of-the-art DeepEC model. We conduct experiments with two different oversampling techniques, SMOTE and ADASYN. To corroborate the obtained results, we perform extensive experimentation and statistical testing.
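The oversampling step mentioned above can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is an interpolation between a minority point and one of its nearest minority neighbours. This is not the paper's pipeline. The function name `smote_sketch`, the value `k=5`, and the assumption that sequences have already been encoded as fixed-length numeric vectors are all illustrative.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic points from minority-class rows X_min.

    Assumes X_min holds at least two numeric feature vectors (hypothetical
    encoding; the abstract does not say how sequences are vectorised).
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other points

    # Pairwise squared Euclidean distances within the minority class.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]

    # Pick a base point and one of its k neighbours for each new sample.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position on the segment base -> neighbour.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay inside the region already occupied by the minority class.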
Acidic catecholamine metabolites could serve as diagnostic markers for many diseases, which makes their accurate sensing important. However, they share highly similar chemical structures, which complicates the design of sensing strategies. A nanopore may be engineered to sense these metabolites at the single-molecule level. To achieve this, a recently developed programmable nano-reactor for stochastic sensing (PNRSS) technique, equipped with a phenylboronic acid (PBA) adaptor, was applied. Three acidic catecholamine metabolites, 3,4-dihydroxyphenylacetic acid (DOPAC), 3,4-dihydroxymandelic acid (DHMA) and 3-methoxy-4-hydroxymandelic acid (VMA), were investigated by PNRSS. Specifically, DHMA, which contains an α-hydroxycarboxylate moiety and adjacent cis-hydroxyl groups on its benzene ring, reports two binding modes that PNRSS resolves simultaneously. Owing to the high resolution of PNRSS, direct regulation of these two binding modes by pH can also be observed. A custom machine learning algorithm was also developed to achieve automatic event classification.
When buyer valuations are drawn i.i.d. from a known regular distribution, a second price auction with a symmetric reserve price is the revenue-optimal single-item auction. When this distribution is irregular, we provide the first separation result showing that a second price auction with reserves earns at most 0.778 times the revenue of Myerson’s optimal auction, even when the reserves can be asymmetric. Since the lower bound is 0.745 for i.i.d. buyers, our result is nearly tight.
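The regular case described above can be illustrated with a small simulation. This is only a sketch of a second price auction with a symmetric reserve, not the construction behind the 0.778 bound. It assumes two buyers with valuations uniform on [0, 1] (a regular distribution whose Myerson reserve is 0.5); the function names and trial counts are illustrative.

```python
import random

def second_price_revenue(bids, reserve=0.0):
    """Revenue of a single-item second price auction with a symmetric reserve."""
    # Only bids at or above the reserve take part.
    eligible = sorted((b for b in bids if b >= reserve), reverse=True)
    if not eligible:
        return 0.0  # item stays unsold
    # Winner pays the larger of the reserve and the runner-up bid.
    runner_up = eligible[1] if len(eligible) > 1 else reserve
    return max(runner_up, reserve)

def expected_revenue(n_buyers, reserve, trials=100_000, seed=0):
    """Monte Carlo estimate with i.i.d. Uniform[0, 1] valuations (assumed)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        bids = [rng.random() for _ in range(n_buyers)]
        total += second_price_revenue(bids, reserve)
    return total / trials

if __name__ == "__main__":
    print("no reserve :", expected_revenue(2, 0.0))
    print("reserve 0.5:", expected_revenue(2, 0.5))
```

For two uniform buyers the exact expected revenues are 1/3 without a reserve and 5/12 with a reserve of 0.5, so the estimates should land close to those values.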
A class of semilinear parabolic reaction diffusion equations with multiple time delays is considered. These time delays and corresponding weights are to be optimized such that the associated solution of the delay equation is the best approximation of a desired state function. The differentiability of the mapping is proved that associates the solution of the delay equation to the vector of weights and delays. Based on an adjoint calculus, first-order necessary optimality conditions are derived. Numerical test examples show the applicability of the concept of optimizing time delays.
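Many traditional filtering algorithms built on state-space models assume a Gaussian distribution in R^n, but they struggle to give satisfactory results when the state vector contains angular or directional variables. For the Gauss von Mises (GVM) multivariate probability density on the S × R^n manifold proposed by J. T. Horwood et al., this paper extends the Dirac mixture approximation method, gives a GVM approximation of the joint distribution, derives formulas for the GVM parameters of the posterior distribution, and designs a measurement-update state estimation algorithm. Combining the time-update algorithm of J. T. Horwood et al. with the proposed measurement-update algorithm yields a recursive Bayesian filter based on the GVM distribution (GVMF). Simulation results show that when the state vector follows the GVM probability model, the GVMF estimates angular variables markedly better than the traditional extended Kalman filter.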
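A small example may help show why treating an angle as an ordinary real number causes trouble. This is only an illustration of the wrap-around problem, not part of the GVM filter itself; the helper name `circular_mean` is hypothetical.

```python
import math

def circular_mean(angles):
    """Mean direction of angles (radians), computed on the unit circle."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)

# Two headings that are only 20 degrees apart, on either side of 0.
angles = [math.radians(350), math.radians(10)]

arithmetic = sum(angles) / len(angles)  # treats angles as points on a line
circular = circular_mean(angles)        # respects the wrap-around at 2*pi

print(math.degrees(arithmetic))  # points in the opposite direction
print(math.degrees(circular))    # close to 0 degrees
```

The arithmetic mean lands at 180 degrees, opposite to both headings, whereas the circular mean stays near 0 degrees. Distributions defined directly on the circle avoid this kind of error.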
This article proposes a Bayesian density estimation method based upon mixtures of gamma distributions. It considers both the case of known mixture size, using a Gibbs sampling scheme with a Metropolis step, and the case of unknown mixture size, using a reversible jump technique that allows us to move from one mixture size to another. We illustrate our methods using a number of simulated datasets, generated from distributions covering a wide range of cases: single distributions, mixtures of distributions with equal means and different variances, mixtures of distributions with different means and small variances and, finally, a distribution contaminated by low-weighted distributions with different means and equal, small variances. An application to the estimation of some quantities for an M/G/1 queue is given, using real e-mail data from CNR-IAMI.
This article proposes a class of conditionally specified models for the analysis of multivariate space-time processes. Such models are useful in situations where there is sparse spatial coverage of one of the processes and much denser coverage of the other process(es). The dependence structure across processes and over space and time is completely specified through a neighborhood structure. These models are applicable to both point and block sources; for example, multiple pollutant monitors (point sources) or several county-level exposures (block sources). We introduce several computational tricks that are integral to model fitting, give some simple sufficient and necessary conditions for the space-time covariance matrix to be positive definite, and implement a Gibbs sampler, using hybrid Monte Carlo steps, to sample from the posterior distribution of the parameters. Model fit is assessed via the deviance information criterion (DIC). Predictive accuracy, over both time and space, is assessed both relatively and absolutely via mean squared prediction error and coverage probabilities. As an illustration of these models, we fit them to particulate matter and ozone data collected in the Los Angeles, CA, area in 1995 over a three-month period. In these data, the spatial coverage of particulate matter was sparse relative to that of ozone.
In this paper, we present a mathematical construction of Beck–Cohen superstatistics from the Bayesian point of view with the help of the two representations of a gamma function. Furthermore, we show how some results for superstatistics are related to each other.
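As background for the role of the gamma function here, the standard superstatistics calculation for a gamma-distributed inverse temperature is sketched below. It is not taken from the paper, and the abstract does not say which two representations it uses. The symbols $k$ (shape), $\theta$ (scale), $\beta_0$ and $q$ are notation assumed for this sketch.

```latex
% Effective Boltzmann factor: average of e^{-\beta E} over fluctuating \beta
B(E) = \int_0^\infty f(\beta)\, e^{-\beta E}\, d\beta ,
\qquad
f(\beta) = \frac{1}{\Gamma(k)\,\theta^{k}}\, \beta^{k-1} e^{-\beta/\theta} .

% Integral representation of the gamma function, rescaled by a > 0:
\int_0^\infty \beta^{k-1} e^{-a\beta}\, d\beta = \Gamma(k)\, a^{-k} .

% Apply it with a = E + 1/\theta:
B(E) = \frac{1}{\Gamma(k)\,\theta^{k}}\, \Gamma(k) \left( E + \frac{1}{\theta} \right)^{-k}
     = \left( 1 + \theta E \right)^{-k} .

% With mean \beta_0 = k\theta and q = 1 + 1/k this is the q-exponential form:
B(E) = \left( 1 + (q-1)\,\beta_0 E \right)^{-1/(q-1)} .
```

As $k \to \infty$ with $\beta_0$ fixed, $q \to 1$ and $B(E) \to e^{-\beta_0 E}$, which recovers the ordinary Boltzmann factor.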
In light of recent developments in computer technology, a promising and efficient way to design a material with a desired property is to solve the inverse problem: use a physical property to predict structure. Here, we discuss the basic idea and mathematical foundation of the inverse approach, as well as proposed strategies for its use in the design of materials from the nano‐ to the macro‐scale. At the nano‐scale, the strategies analyzed include scanning a high‐dimensional space of chemical compounds for those with a targeted property, and identifying correlations in large databases of materials. At the nano‐scale, full structural information (atoms and their positions) is linked to targeted properties. At the meso‐ and macro‐scale, by contrast, only partial structural information is available, in the form of structural motifs or representative volume elements. We discuss the role of partial structural information in the inverse approach to the design of materials at those scales. We analyze the risks and limitations of the inverse approach and address its dependence on factors such as structure parametrization, approximations in theoretical models, and feedback from structural characterization.