Similar literature
 20 similar documents found
1.
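A Model for the DNA Sequence Classification Problem   Total citations: 4 (self-citations: 1, citations by others: 3)
This paper proposes a method for classifying DNA sequences with an artificial neural network. The authors first use probabilistic and statistical methods to extract features from 20 artificial DNA sequences of known class, forming a feature vector for each sequence, and feed these vectors to a BP neural network as training samples. The network is trained with the backpropagation (BP) algorithm from the Neural Network Toolbox of the MATLAB software package. Two three-layer BP networks are constructed, and the extracted set of DNA feature vectors is fed to each of them for training. After training, feature vectors are extracted from 20 unclassified artificial sequences and 182 natural sequences and fed to both networks for classification. The results show that the proposed method classifies DNA sequences with high accuracy and precision, and that using artificial neural networks for DNA sequence classification is entirely feasible.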

2.
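Mathematical Models for DNA Sequence Classification   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper discusses the DNA sequence classification problem from three different perspectives and builds a model for each. First, starting from the biological background and from geometric symmetry, a three-dimensional space-curve representation of DNA sequences is established. A preliminary mathematical model, the integral model, is built on it; the model function yields a classification of DNA sequences 1 to 20 that matches the classification given in the problem statement, and the next 20 sequences are then classified as well. Second, from the perspective of artificial neural networks, a second model is obtained. Three basic networks suited to pattern classification are selected: the perceptron, the multilayer perceptron (BP network), and the LVQ (learning vector quantization) learner. An improvement to the BP network (an improved multilayer perceptron) is proposed for this problem, and several training schemes all give fairly satisfactory classification results. The neural-network classification agrees with that of the integral model on the first forty sequences. Finally, the bases are given a geometric meaning: A, C, G, and T denote right, down, left, and up, respectively. A DNA sequence then controls the movement of a point in the plane, so each sequence yields a walk curve; the trend of the walk direction is extracted as a feature and used to build a model function. This model also classifies the last twenty sequences, and its results are almost identical to those of the other two models (one sequence differs; this model labels it as not separable). This model retains more information, and ...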
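
The walk described in the third model can be sketched as follows. The direction mapping (A right, C down, G left, T up) comes from the abstract; the function names and the particular trend feature (net displacement per base) are assumptions made for illustration, since the abstract does not say how the trend is computed.

```python
# Direction mapping taken from the abstract: A = right, C = down, G = left, T = up.
STEPS = {"A": (1, 0), "C": (0, -1), "G": (-1, 0), "T": (0, 1)}


def dna_walk(sequence):
    """Return the list of plane points visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in sequence.upper():
        if base not in STEPS:
            continue  # assumption: symbols other than A, C, G, T are skipped
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def drift(sequence):
    """Hypothetical trend feature: net displacement per step of the walk."""
    path = dna_walk(sequence)
    steps = len(path) - 1
    if steps == 0:
        return (0.0, 0.0)
    end_x, end_y = path[-1]
    return (end_x / steps, end_y / steps)
```

A sequence rich in A and T drifts up and to the right under this mapping, while one rich in C and G drifts down and to the left, which is the kind of directional trend a classifier could then use.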

3.
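A support vector machine (SVM) is used for feature selection over seven bus features, including length, width, height, and width-to-length ratio. The subset with the highest accuracy consists of length, width, height, width-to-length ratio, and width-to-height ratio, and it is used as the sample features for classification. Four classes of bus are classified, with 80 samples per class: 50 for training and 30 for prediction. The results show a classification accuracy of 100% for class 1, above 96% for classes 2 and 4, and above 93% for class 3, which is better than the result obtained using only length, width, and height as features. An SVM with parameter optimization is then applied to the same four classes, and the results are compared. Given the properties of the Gaussian function, the RBF kernel is chosen as the kernel function in both uses of the SVM.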
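
As a rough illustration of the two ingredients named above, the sketch below builds the five-feature vector and evaluates the standard RBF (Gaussian) kernel, K(x, y) = exp(-gamma * ||x - y||^2). The function names and the default value of `gamma` are assumptions; the abstract gives neither the kernel parameters nor any feature scaling.

```python
import math


def bus_features(length, width, height):
    """Hypothetical helper: the five-feature subset selected in the abstract."""
    return [length, width, height, width / length, width / height]


def rbf_kernel(x, y, gamma=1.0):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

The kernel equals 1 for identical vectors and decays toward 0 as they move apart, so in practice the raw dimensions and the ratios would likely need to be brought to a comparable scale first.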

4.
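The frequencies of the 64 possible three-base strings in a DNA sequence are counted and, on biological grounds, used as features for distinguishing different classes of DNA sequences. Two dimensionality-reduction methods, principal component analysis and Fisher discriminant analysis, are applied to the frequency data. A distance discriminant model is built on the reduced data, the training samples are re-classified to check the model's discriminating performance, and finally the sequences of unknown class are classified and the classification results compared.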
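
The feature-extraction step can be sketched as below. It assumes overlapping windows of three bases, which the abstract does not specify (non-overlapping codon-style windows would be an equally plausible reading), and the names `TRIPLETS` and `triplet_frequencies` are illustrative.

```python
from itertools import product

# All 4^3 = 64 three-base strings, in a fixed order so every sequence
# maps to a feature vector with the same layout.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(sequence):
    """Return a 64-element list of relative triplet frequencies."""
    seq = sequence.upper()
    counts = dict.fromkeys(TRIPLETS, 0)
    total = 0
    # Assumption: windows overlap, advancing one base at a time.
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet in counts:  # windows containing other symbols are ignored
            counts[triplet] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [counts[t] / total for t in TRIPLETS]
```

The resulting 64-dimensional vectors are what the dimensionality-reduction step (PCA or Fisher discriminant analysis) would then operate on.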

5.
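The frequencies of the 64 possible three-base strings in a DNA sequence are counted and, on biological grounds, used as features for distinguishing different classes of DNA sequences. Two dimensionality-reduction methods, principal component analysis and Fisher discriminant analysis, are applied to the frequency data. A distance discriminant model is built on the reduced data, the training samples are re-classified to check the model's discriminating performance, and finally the sequences of unknown class are classified and the classification results compared.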

6.
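For the problem of classifying imbalanced data sets, a clustering-based undersampling method is proposed. The majority-class samples in the training set are clustered several times, each time with a different number of clusters; the cluster centers then stand in for the majority-class samples and are combined with the minority-class samples to form several new training sets. Classifiers are trained on these sets, classifiers with a tendency toward misclassification are discarded, and the remaining classification results are combined by voting. Simulation experiments compare several undersampling methods on 16 data sets with different balance ratios. Theoretical analysis and the experimental results show that the proposed clustering-based undersampling method effectively reduces the imbalance of imbalanced data sets.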
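
The resampling step might look like the sketch below. The abstract does not name a clustering algorithm, so plain Lloyd's k-means with a deterministic initialization is assumed here; classifier training, the pruning of error-prone classifiers, and the final vote are left out. All function names are illustrative.

```python
def kmeans_centers(points, k, iterations=20):
    """Plain Lloyd's k-means. Assumption: the first k points seed the centers."""
    centers = [tuple(float(v) for v in p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest center (squared Euclidean distance).
            nearest = min(
                range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def undersampled_sets(majority, minority, cluster_counts):
    """Build one (samples, labels) training set per cluster count.

    Majority samples are replaced by their cluster centers (label 0);
    minority samples are kept as they are (label 1).
    """
    training_sets = []
    for k in cluster_counts:
        centers = kmeans_centers(majority, k)
        samples = centers + [tuple(float(v) for v in p) for p in minority]
        labels = [0] * len(centers) + [1] * len(minority)
        training_sets.append((samples, labels))
    return training_sets
```

Choosing cluster counts close to the number of minority samples gives roughly balanced training sets, one per count, which is what makes the later ensemble vote possible.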

7.
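Classification of Two-Population Gene Expression Data Based on Bayesian Statistical Methods   Total citations: 1 (self-citations: 0, citations by others: 1)
In disease diagnosis, accurate classification of the disease is a crucial step toward improving diagnostic accuracy and cure rates. DNA microarray technology makes it possible to obtain, at the microscopic level, gene function information closely related to disease classification and diagnosis. However, gene expression pattern data obtained from DNA microarrays have many variables and few samples, which makes classification highly unstable. We therefore first select the genes whose expression patterns change significantly as a set of feature genes, reducing the number of variables, and then build a classifier on this set to classify the samples. In this paper, feature genes are selected with a likelihood ratio test, a statistical classification model is built using Bayesian methods, and Markov chain Monte Carlo (MCMC) sampling is used to compute the posterior probability of each sample's class membership. Finally, the model is applied to two real DNA microarray data sets and classifies the samples successfully.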

8.
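Because of their good statistical properties, M-sequences are often used in information security, which makes finding M-sequence feedback functions over F2 a meaningful task. A new method is given for deriving, from a known M-sequence feedback polynomial, new M-sequence feedback polynomials of the same degree. The main contributions are as follows: 1) The operations performed by the inverse of the cycle-joining method are presented in a simple graphical way. 2) The inverse of the cycle-joining method and the cycle-joining method itself are applied in turn to two interleaved pairs of predecessor-conjugate vertices in the state graph of an existing M-sequence, giving an algorithm that generates new M-sequence feedback polynomials from a known one. 3) The correctness of the algorithm over the two-element finite field F2 is proved. 4) The algorithm is implemented in C. Experimental results show that the algorithm is effective when the order of the shift register is not very large.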
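
The paper's construction is not reproduced here, but a candidate feedback function can at least be checked by brute force. The sketch below assumes that "M-sequence" means a full-period sequence, that is, one whose n-stage shift register visits all 2^n states in a single cycle (a de Bruijn sequence); the function names and the 3-stage example are illustrative and not taken from the paper.

```python
def is_m_sequence_feedback(feedback, n):
    """Check whether an n-stage feedback shift register has a single cycle
    of length 2**n, starting from the all-zero state.

    `feedback` maps a state tuple (x0, ..., x_{n-1}) to the next input bit;
    the register then shifts to (x1, ..., x_{n-1}, feedback(state)).
    """
    start = (0,) * n
    state = start
    for step in range(1, 2 ** n + 1):
        state = state[1:] + (feedback(state) & 1,)
        if state == start:
            # Returning to the start closes the cycle; it is full-period
            # only if this happens after exactly 2**n steps.
            return step == 2 ** n
    return False


def example_feedback(state):
    """Illustrative 3-stage function: x0 XOR x1, plus a correction term
    that splices the all-zero state into the cycle."""
    x0, x1, x2 = state
    return x0 ^ x1 ^ (1 if (x1 == 0 and x2 == 0) else 0)


def linear_feedback(state):
    """The purely linear x0 XOR x1, which leaves the zero state on its own cycle."""
    x0, x1, _ = state
    return x0 ^ x1
```

Because the check enumerates every state, its cost grows as 2^n, which fits the abstract's remark that the approach is practical only for shift registers of moderate order.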

9.
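A New Algorithm for Ordered Discriminant Analysis and Its Applications   Total citations: 1 (self-citations: 1, citations by others: 0)
Discriminant analysis builds a model from data of known class and uses it to classify data of unknown class; neither the data nor the classes are ordered. Discriminant analysis of data that are both ordered and periodic calls for a new, ordered method, one whose classes are ordered and which can eliminate the interference caused by the periodicity of the process under study. This paper presents the principle, modeling workflow, application workflow, and application examples of a new ordered discriminant analysis method for multivariate data. The method separates class modeling from discriminant assignment. When modeling multivariate data, it builds several sliding sets of sub-models within the multi-class model; in application, domain knowledge is used to make a preliminary estimate of the class a sample belongs to, and the program then selects the relevant sub-model to classify it. The method resolves the reversal of sample classification caused by periodicity in multivariate time-series data, opens a new route to the classification and prediction of time-series data, and has performed well in practical applications, resolving a major difficulty.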

10.
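For English sentiment classification, different samples are given different weights by introducing a fuzzy membership function: a fuzzy support vector machine classification algorithm determines the degree to which each sample belongs to a class by computing its fuzzy membership. Results obtained with different kernel functions and different penalty coefficients are compared. Simulation results show that the fuzzy support vector machine achieves good classification and recognition performance on English sentiment classification.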

11.
The problem considered is that of the allocation of resources to activities according to a fractional measure given by the ratio of “return” to “cost”. The return is the sum of returns from the activities, each activity being described by a concave return function. There is a positive fixed cost and a variable cost that depends linearly on the allocations. Properties related to the uniqueness of optimal solutions and the number of non-zero allocations are derived. A method is given by which any set of feasible allocations can be used to derive an upper bound on the optimal value of the objective function, so that optimal and almost-optimal allocations can be recognized. Allocations can be generated by a fast incremental method that is described. The method utilizes data in sequential order and can be used to solve large problems.

12.
This article shows how to smoothly “monotonize” standard kernel estimators of hazard rate, using bootstrap weights. Our method takes a variety of forms, depending on choice of kernel estimator and on the distance function used to define a certain constrained optimization problem. We confine attention to a particularly simple kernel approach and explore a range of distance functions. It is straightforward to reduce “quadratic” inequality constraints to “linear” equality constraints, and so our method may be implemented using little more than conventional Newton–Raphson iteration. Thus, the necessary computational techniques are very familiar to statisticians. We show both numerically and theoretically that monotonicity, in either direction, can generally be imposed on a kernel hazard rate estimator regardless of the monotonicity or otherwise of the true hazard rate. The case of censored data is easily accommodated. Our methods have straightforward extension to the problem of testing for monotonicity of hazard rate, where the distance function plays the role of a test statistic.

13.
This paper presents an optimization concept for designing “road-friendly” vehicles that treats pavement loads as a primary objective function of vehicle suspension design. A walking-beam suspension system is used as an illustrative vehicle model to demonstrate the concept and process of optimization. The hypothesis of isotropy is applied to the measured one-dimensional road profile so that a two-dimensional random field model of pavement surface roughness can be obtained. The dynamic response of the walking-beam suspension system is obtained by means of stochastic process theory. Three commonly used objectives of optimum suspension design, namely ride quality, suspension stroke, and road adhesion, are briefly reviewed. Minimizing the probability that the peak tire load exceeds a given value is proposed as an objective function. Using the direct update method, optimization is carried out with tire loads taken as the objective function of suspension design. The results show that tires with high air pressure and suspension systems with small damping lead to large tire loads. The concept proposed in this paper is applicable to generic cases, where more complex vehicle models and pavement surface conditions apply.

14.
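Multi-stage manufacturing processes usually contain two kinds of structure, serial and parallel, and processes with a hybrid serial-parallel structure are the most common form in practice. In different parallel configurations, quality variation in upstream stages affects downstream stages and overall process capability differently. Focusing on the characteristics of parallel structures in multi-stage manufacturing, this paper analyzes process capability, from the standpoint of variation reduction, for three basic parallel configurations: parallel, divergent, and convergent. It studies how variation in each sub-process affects overall process capability, evaluates the effect of quality improvement using the “difficulty” and “utility ratio” of reducing each sub-process's quality variation, and provides a basis for choosing a capability improvement strategy for multi-stage parallel processes. A case study shows that the method identifies how reducing quality variation at each stage affects the capability of that stage and of the overall process, determines the priority order of quality improvements, and supports economical quality improvement and optimization of multi-stage manufacturing processes.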

15.
In this paper we present a new steepest-descent type algorithm for convex optimization problems. Our algorithm splits the unknowns into sub-blocks and considers a partial optimization over each sub-block. In quadratic optimization, our method uses Newton's technique to compute the step lengths for the descent directions resulting from the sub-blocks. Our optimization method is fully parallel and easily implementable. We first present it in a general linear algebra setting, then we highlight its applicability to a parabolic optimal control problem, where we consider the blocks of unknowns with respect to the time dependency of the control variable. The parallel tasks, in the latter problem, turn “on” the control during a specific time window and turn it “off” elsewhere. We show that our algorithm significantly reduces the computational time compared with recognized methods. Convergence analysis of the new optimal control algorithm is provided for an arbitrary choice of partition. Numerical experiments are presented to illustrate the efficiency and the rapid convergence of the method.

16.
In this paper, a warm standby repairable system consisting of two dissimilar units and one repairman is studied. In this system, it is assumed that the working time distributions and the repair time distributions of the two units are both exponential, and unit 1 is given priority in use. After repair, both unit 1 and unit 2 are “as good as new”. Moreover, the transfer switch in the system is unreliable, and the function of the switch is: “as long as the switch fails, the whole system fails immediately”. Under these assumptions, using Markov process theory and the Laplace transform, some important reliability indexes and some steady state system indexes are derived. Finally, a numerical example is given to illustrate the theoretical results of the model.

17.
The work is devoted to the application of global optimization to the data fitting problem under interval uncertainty. Parameters of the linear function that best fits interval-valued data are taken as the maximum point of a special (“recognizing”) functional, which is shown to characterize consistency between the data and the parameters. The new data fitting technique is therefore called the “maximum consistency method”. We investigate properties of the recognizing functional and present an interpretation of the parameter estimates produced by the maximum consistency method.

18.
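This paper studies a portfolio model with transaction costs, using value at risk (VaR) and the Sharpe ratio (SR) as the portfolio's risk and return measures, respectively. To solve the model efficiently, a hybrid optimization algorithm (IN-GSA-PSO) is proposed on the basis of the gravitational search algorithm and particle swarm optimization. It combines the global best and personal best positions of particle swarm optimization with the acceleration operator of the gravitational search algorithm, so that the hybrid algorithm makes full use of the exploitation and exploration capabilities of the individual algorithms. With suitable parameter settings, the algorithm balances global and local search and converges quickly to the optimal solution of the model. Using data for the SSE 50 stocks over the 126 trading days of the second half of 2014, simulation experiments are run in Matlab. The results show that the portfolio model with transaction costs gives investors a higher rate of return. The study also shows that the hybrid of PSO and GSA outperforms either algorithm alone in solving the portfolio model and yields satisfactory optimization results.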
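
For readers unfamiliar with the two measures, the sketch below computes a sample Sharpe ratio and a simple historical-simulation VaR from a return series. These are common textbook definitions and only an assumption about what the paper uses; the abstract gives no formulas, and the function names, the `risk_free` and `alpha` parameters, and the quantile convention are illustrative.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


def historical_var(returns, alpha=0.95):
    """Historical-simulation VaR: the loss found at sorted position
    int(alpha * n) among the observed losses, capped at the largest loss.
    Losses are negated returns, so a positive result means money lost.
    """
    losses = sorted(-r for r in returns)
    index = min(len(losses) - 1, int(alpha * len(losses)))
    return losses[index]
```

In a model with transaction costs, the returns passed to these functions would be net of costs, so that an optimizer such as the hybrid algorithm above trades return against risk on the net figures.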

19.
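For the aerodynamic optimization of airfoils under nonlinear large perturbations, an optimization method based on a convolutional neural network reduced-order aerodynamic model is proposed. Aerodynamic data for airfoils with different shape parameters serve as training signals for the convolutional neural network reduced-order model. Using this reduced-order model, the airfoil is optimized for maximum lift-to-drag ratio. The results show that the method can be used to predict and optimize airfoil aerodynamic forces under large perturbations. The paper also discusses the training of the pooling method and the radial basis method ...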

20.
Projection pursuit describes a procedure for searching high-dimensional data for “interesting” low-dimensional projections via the optimization of a criterion function called the projection pursuit index. By empirically examining the optimization process for several projection pursuit indexes, we observed differences in the types of structure that maximized each index. We were especially curious about differences between two indexes based on expansions in terms of orthogonal polynomials, the Legendre index, and the Hermite index. Being fast to compute, these indexes are ideally suited for dynamic graphics implementations.

Both Legendre and Hermite indexes are weighted L2 distances between the density of the projected data and a standard normal density. A general form for this type of index is introduced that encompasses both indexes. The form clarifies the effects of the weight function on the index's sensitivity to differences from normality, highlighting some conceptual problems with the Legendre and Hermite indexes. A new index, called the Natural Hermite index, which alleviates some of these problems, is introduced.

A polynomial expansion of the data density reduces the form of the index to a sum of squares of the coefficients used in the expansion. This drew our attention to examining these coefficients as indexes in their own right. We found that the first two coefficients, and the lowest-order indexes produced by them, are the most useful ones for practical data exploration because they respond to structure that can be analytically identified, and because they have “long-sighted” vision that enables them to “see” large structure from a distance. Complementing this low-order behavior, the higher-order indexes are “short-sighted.” They are able to see intricate structure, but only when they are close to it.

We also show some practical use of projection pursuit using the polynomial indexes, including a discovery of previously unseen structure in a set of telephone usage data, and two cautionary examples which illustrate that structure found is not always meaningful.
