Similar literature
 20 similar documents found
1.
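Ding Xia, Zhang Xiaofei, Yi Ming. 《数学杂志》, 2017, 37(5): 1093-1100
This paper studies the identification of tissue-specific protein complexes. A tissue-specific protein network is constructed from protein-protein interaction network data and tissue-specific gene expression data. Several representative clustering algorithms are applied to this network, and non-negative matrix factorization is used to merge the clustering results, yielding tissue-specific protein complexes. The results show that clustering quality improves markedly and that tissue-specific protein complexes can be identified.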

2.
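For early tumor diagnosis, a feature extraction method based on the lifting wavelet transform is proposed to analyze and discriminate tumor data samples. The method applies the lifting wavelet transform to gene expression profile microarray data from 190 liver cancer cases (including controls) and 107 lung cancer cases (including controls), extracts the low-frequency information of the signal, and trains a support vector machine to build a classifier model that distinguishes cancer from non-cancer samples. Experimental results show that the feature genes extracted by the lifting wavelet transform yield a high classification rate when fed to the classifier, and that either a linear kernel or a radial basis function kernel in the support vector machine gives good classification. The model was tested on 20 randomly selected gene expression profile microarray samples with very good results, so the proposed method has some practical value for tumor diagnosis.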

3.
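Wavelet-analysis forecasting is applied to financial data, namely stock closing prices, a typical non-stationary time series. The data are decomposed with the Mallat wavelet decomposition algorithm, the decomposed data are smoothed, and the series is then reconstructed. The reconstructed data form an approximately stationary time series, which gives an approximation signal of the original data. A traditional time series forecasting method is then applied to the reconstructed data. Comparing the forecasts with the actual values and with the forecasts of the traditional method alone shows that the wavelet-analysis approach forecasts better.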
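The decompose, smooth, reconstruct pipeline described in this abstract can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet and the crudest possible smoothing (discarding all detail coefficients); the abstract does not state which wavelet, how many levels, or which smoothing rule the authors used, and the prices below are made-up values.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal):
    """One Haar analysis step: return (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose: rebuild the signal from both coefficient sets."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def smooth(signal):
    """Assumed smoothing rule: zero every detail coefficient, then reconstruct."""
    approx, detail = haar_decompose(signal)
    return haar_reconstruct(approx, [0.0] * len(detail))


# Hypothetical closing prices, for illustration only.
prices = [10.0, 10.4, 10.1, 10.9, 11.2, 10.8, 11.5, 11.9]
smoothed = smooth(prices)
```

With all detail coefficients removed, each pair of neighbouring prices is replaced by its mean. A forecasting model would then be fitted to `smoothed` instead of `prices`.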

4.
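This paper describes the use of wavelet analysis theory to remove measurement error from time series statistical data. A worked example shows that discrete wavelet decomposition and reconstruction can effectively extract the original features of statistical data from an error-contaminated data series. Forecasting of the CPI economic series is carried out, introducing an effective method for removing errors from CPI statistics.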

5.
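Analyzing the relationship between the topological features of protein interaction networks and biological evolution, and using it to predict protein function, is an important research topic in the post-genomic era. This paper proposes a mathematical model, based on fuzzy mathematics, of the fuzzy relations among multiple topological attributes of protein networks, and applies a frequency analysis algorithm to serialize the unordered fuzzy relation data. Analyzing the fuzzy relation parameters of protein network topological attributes provides a data basis for studying the relationship between biological evolution and protein network structure, and also lays a foundation for predicting the functions of unknown proteins. The method opens a new direction for studying protein network evolution and annotating protein function.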

6.
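To improve the classification of synthetic aperture radar (SAR) images, a SAR image classification method based on multiple features and a convolutional neural network, Canny-WTD-CNN, is proposed. Edge features extracted by the Canny operator are adaptively fused with wavelet features extracted by wavelet threshold denoising and used as the input to the convolutional neural network. Softmax serves as the classifier for SAR image classification and recognition. Finally, experiments on three target classes from the public MSTAR dataset, together with a comparison against other methods, demonstrate the effectiveness of the method, which reaches a recognition rate of 99.14%.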

7.
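The Riemann problem for the modified isentropic Van der Waals gas dynamics Euler equations and the interactions of its elementary waves are studied. Using the equal-area rule proposed by Maxwell, the Van der Waals equation of state is modified to agree with physical reality, so that the system of conservation laws changes from mixed type to hyperbolic type. Using generalized characteristic analysis, the existence of solutions to the Riemann problem is established constructively. Furthermore, the interactions of the elementary waves are obtained.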

8.
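Using the closing price of Ping An Bank stock (No. 00001) as the empirical setting, the stock's trend is forecast with a sliding GA-BP-GARCH model based on wavelet analysis. Wavelet decomposition yields two kinds of series (low-frequency and high-frequency). A GA-BP neural network with a sliding window forecasts the low-frequency data, while a GARCH model is used for the high-frequency data in view of its volatility. The results show that both models forecast well. Finally, the final forecast of the stock price is obtained by wavelet reconstruction. Experiments show that the model outperforms traditional neural network models in stock forecasting and offers some reference value for characterizing how stock prices change.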

9.
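So that the aerodynamic model accurately captures an aircraft's aerodynamic characteristics, a wavelet-network aerodynamic modeling method based on feature extraction is proposed. Kernel principal component analysis is applied to the lateral-directional and longitudinal flight-data training samples to extract their basic features, and these features are used to build lateral-directional and longitudinal wavelet-network aerodynamic models respectively. Experimental results show that the proposed integrated modeling method has high prediction accuracy and is effective and feasible for aircraft aerodynamic modeling.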

10.
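Yang Jin, Chen Liang. 《经济数学》, 2018, (2): 62-67
To forecast short-term changes in stock prices, a combined forecasting model based on a wavelet neural network (WNN) and the autoregressive integrated moving average (ARIMA) model is proposed. The closing price series is split into a linear part and a nonlinear part (the error term). The ARIMA model and the wavelet neural network forecast the two parts respectively, and the two forecasts are added to give the forecast of the stock price. Experimental results show that the combined model improves forecasting accuracy and is a fairly effective forecasting model.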

11.
In a recent paper published in this Journal, Lovell and Rouse (LR) proposed a modification of the standard data envelopment analysis (DEA) model that overcomes the infeasibility problem often encountered in computing super-efficiency. In the LR procedure one appropriately scales up the observed input vector (scales down the output vector) of the relevant super-efficient firm, thereby usually creating its inefficient surrogate. By contrast, Chen suggested a different procedure that replaces input–output bundles that are found to be inefficient in standard DEA by their efficient projections. An alternative procedure proposed in this paper uses the directional distance function and the resulting Nerlove–Luenberger measure of super-efficiency. The fact that the directional distance function combines, by definition, features of both an input-oriented and an output-oriented model generally leads to a complete ranking of the observations and is easily interpreted. A dataset on international airlines is utilized in an illustrative empirical application.

12.
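Zhang Wen, Wang Qiang, Tang Zixu, Qin Guangjie, Li Jian. 《运筹与管理》, 2022, 31(11): 167-173
Advances in machine learning have improved the accuracy of online fake review detection, but current machine learning models lack enough labeled data for training. Based on generative adversarial networks (GANs), this paper proposes a review dataset expansion method, GAN-RDE (GAN-Review Dataset Expansion), to address the scarcity of training data in fake review detection. Specifically, the initial review data are first divided into a genuine review dataset and a fake review dataset, and a GAN is trained on each to generate vectors that follow the feature distributions of genuine and fake reviews. These GAN-generated vectors are then merged with the feature-word vector matrix of the initial review dataset to expand the training data. Finally, naive Bayes, a multilayer perceptron, and a support vector machine are used as base classifiers to compare fake review detection before and after data expansion. Experimental results show that expanding the review dataset with GAN-RDE significantly improves the accuracy of machine learning models in detecting fake reviews.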

13.
14.
Although support vector regression models are being used successfully in various applications, the size of business datasets, with millions of observations and thousands of variables, makes training them difficult, if not impossible. This paper introduces the Row and Column Selection Algorithm (ROCSA) to select a small but informative dataset for training support vector regression models with standard SVM tools. ROCSA uses ε-SVR models with L1-norm regularization of the dual and primal variables for the row and column selection steps, respectively. The first step involves parallel processing of data chunks and selects a fraction of the original observations that are either representative of the pattern identified in the chunk, or represent those observations that do not fit the identified pattern. The column selection step dramatically reduces the number of variables and the multicollinearity in the dataset, increasing the interpretability of the resulting models and their ease of maintenance. Evaluated on six retail datasets from two countries and a publicly available research dataset, the reduced ROCSA training data improves the predictive accuracy on average by 39% compared with the original dataset when trained with standard SVM tools. Comparison with the ε SSVR method using reduced kernel technique shows similar performance improvement. Training a standard SVM tool with the ROCSA selected observations improves the predictive accuracy on average by 21% compared to the practical approach of random sampling.

15.
Multi-dimensional classification aims at finding a function that assigns a vector of class values to a given vector of features. In this paper, this problem is tackled by a general family of models, called multi-dimensional Bayesian network classifiers (MBCs). This probabilistic graphical model organizes class and feature variables as three different subgraphs: class subgraph, feature subgraph, and bridge (from class to features) subgraph. Under the standard 0-1 loss function, the most probable explanation (MPE) must be computed, for which we provide theoretical results in both general MBCs and in MBCs decomposable into maximal connected components. Moreover, when computing the MPE, the vector of class values is covered by following a special ordering (Gray code). Under other loss functions defined in accordance with a decomposable structure, we derive theoretical results on how to minimize the expected loss. Besides these inference issues, the paper presents flexible algorithms for learning MBC structures from data based on filter, wrapper and hybrid approaches. The cardinality of the search space is also given. New performance evaluation metrics adapted from the single-class setting are introduced. Experimental results with three benchmark data sets are encouraging, and they outperform state-of-the-art algorithms for multi-label classification.
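The Gray-code ordering mentioned in this abstract can be sketched as follows. The sketch assumes binary class variables and uses the standard reflected binary Gray code; the abstract does not say which Gray code variant the authors use or how they handle class variables with more than two values, and `gray_code_vectors` is a hypothetical helper name.

```python
def gray_code_vectors(n_classes):
    """Yield every binary class vector of length n_classes in Gray-code order.

    Consecutive vectors differ in exactly one position.
    """
    for i in range(1 << n_classes):
        g = i ^ (i >> 1)  # reflected binary Gray code of i
        yield tuple((g >> bit) & 1 for bit in reversed(range(n_classes)))


def hamming(u, v):
    """Number of positions in which two vectors differ."""
    return sum(a != b for a, b in zip(u, v))


vectors = list(gray_code_vectors(3))
```

Because only one class value changes between consecutive vectors, an exhaustive search over class configurations can presumably update a score incrementally instead of recomputing it from scratch at every step.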

16.
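To diagnose heart disease with high accuracy when data are missing, the missing values are first handled. A support vector machine with a radial basis function kernel is then used, with cross-validation and grid search to find the best penalty parameter and kernel parameter, to classify the UCI Heart dataset. Multi-class accuracy is 81.89% and binary accuracy is 89.61%. Simulation results show that the support vector machine model performs stably, handles additional samples well, trains quickly, and classifies well, giving it considerable potential in medical diagnosis such as heart disease.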

17.
18.

This paper reviews real estate price estimation in France, a market that has received little attention. We compare seven popular machine learning techniques by proposing a different approach that quantifies the relevance of location features in real estate price estimation with high and fine levels of granularity. We take advantage of a newly available open dataset provided by the French government that contains 5 years of historical data of real estate transactions. At a high level of granularity, we obtain important differences regarding the models’ prediction powers between cities with medium and high standards of living (precision differences beyond 70% in some cases). At a low level of granularity, we use geocoding to add precise geographical location features to the machine learning algorithm inputs. We obtain important improvements regarding the models’ forecasting powers relative to models trained without these features (improvements beyond 50% for some forecasting error measures). Our results also reveal that neural networks and random forest techniques particularly outperform other methods when geocoding features are not accounted for, while random forest, adaboost and gradient boosting perform well when geocoding features are considered. For identifying opportunities in the real estate market through real estate price prediction, our results can be of particular interest. They can also serve as a basis for price assessment in revenue management for durable and non-replenishable products such as real estate.


19.
We introduce the ordered weighted averaging (OWA) operator and emphasize how the choice of the weights, the weighting vector, allows us to implement different types of aggregation. We describe two important characterizing features associated with OWA weights. The first of these is the attitudinal character and the second is the measure of dispersion. We discuss some methods for generating the weights and the role that these characterizing features can play in the determination of the OWA weights. We note that while in many cases these two features can help provide a clear distinction between different types of OWA operators, there are some important cases in which these two characterizing features do not distinguish between OWA aggregations. In an attempt to address this we introduce a third characterizing feature associated with an OWA aggregation called the focus. We look at the calculation of this feature in a number of different situations.
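As an illustration of the first two characterizing features, the sketch below implements the OWA aggregation together with the commonly used definitions of attitudinal character (orness) and entropy-based dispersion. The abstract itself gives no formulas, so these definitions are assumed to match the paper's; the third feature, the focus, is not sketched because the abstract does not define it.

```python
import math


def owa(values, weights):
    """OWA aggregation: weights are applied to the values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


def attitudinal_character(weights):
    """Orness: 1 for the max operator, 0 for the min, 0.5 for the plain mean."""
    n = len(weights)
    if n < 2:
        raise ValueError("need at least two weights")
    return sum((n - j) * w for j, w in enumerate(weights, start=1)) / (n - 1)


def dispersion(weights):
    """Entropy of the weights: 0 when one weight is 1, ln(n) when all are equal."""
    return -sum(w * math.log(w) for w in weights if w > 0)


scores = [3.0, 9.0, 5.0]          # made-up arguments to aggregate
max_like = [1.0, 0.0, 0.0]        # all weight on the largest value
uniform = [1.0 / 3.0] * 3         # ordinary arithmetic mean
```

For example, `owa(scores, max_like)` picks out the largest score, while `owa(scores, uniform)` returns the arithmetic mean.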

20.
In recent years, support vector machines (SVMs) were successfully applied to a wide range of applications. However, since the classifier is described as a complex mathematical function, it is rather incomprehensible for humans. This opacity property prevents them from being used in many real-life applications where both accuracy and comprehensibility are required, such as medical diagnosis and credit risk evaluation. To overcome this limitation, rules can be extracted from the trained SVM that are interpretable by humans and keep as much of the accuracy of the SVM as possible. In this paper, we will provide an overview of the recently proposed rule extraction techniques for SVMs and introduce two others taken from the artificial neural networks domain, being Trepan and G-REX. The described techniques are compared using publicly available datasets, such as Ripley’s synthetic dataset and the multi-class iris dataset. We will also look at medical diagnosis and credit scoring where comprehensibility is a key requirement and even a regulatory recommendation. Our experiments show that the SVM rule extraction techniques lose only a small percentage in performance compared to SVMs and therefore rank at the top of comprehensible classification techniques.
