Similar literature
 20 similar documents found
1.
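This study addresses spam text identification. An SVM classification model was built from feature vectors of Apple mobile phone review texts to identify spam text, and its results were compared with those of a BP neural network discriminant model. The classification accuracy was 71% on the first 400 samples (the training set) and 70.12% on the remaining 196 samples (the test set). The main factors affecting the identification of spam opinion text are therefore: 1) the extraction of feature terms from the review text and the computation of the text feature space vectors; and 2) the choice of classification method, among which SVM performed best.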

2.
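An intrusion detection model is built on Bayesian stepwise discriminant analysis, which recasts intrusion detection as a classification problem. Redundant feature variables are eliminated through stepwise variable entry, which effectively reduces the computational complexity of the original discriminant function while preserving discriminant performance. Using the 10% subset of the KDD CUP99 data as experimental data, a concrete model instance is built for the common denial-of-service (DoS) attack. The experimental results show that the model achieves high accuracy both in back-substitution on in-sample connection records and in detection on out-of-sample connection records.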

3.
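Based on measurements from a group of diabetic patients, stepwise Bayes discriminant analysis and stepwise Fisher discriminant analysis were used to build discriminant equations for determining diabetes type. The factors selected into the equations are all consistent with medical theory and clinical practice. When the equations were applied back to the original patients, the back-substitution accuracy was 93.75% and 90%, respectively. The diagnostic accuracy is high and the method is easy to apply, so it can be adopted in primary-level hospitals and similar institutions.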

4.
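Using sample data from 65 subjects with normal lung function, we computed the first and second principal components by principal component analysis. Treating these two principal components as approximately normally distributed, we had a computer plot the confidence ellipse in the plane. For each person being diagnosed, this confidence ellipse can be used to make a discriminant diagnosis. Testing the confidence ellipse on 208 subjects with abnormal lung function gave a discrimination accuracy of 97.1%.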
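
The confidence-ellipse rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes the two principal component scores are centered, uncorrelated, and approximately normal, and the function name, the 95% level, and the sample values are all hypothetical.

```python
import math


def inside_confidence_ellipse(z1, z2, var1, var2, level=0.95):
    """Return True if the scores (z1, z2) fall inside the confidence ellipse.

    Assumption: z1 and z2 are centered, uncorrelated principal component
    scores with variances var1 and var2 (the corresponding eigenvalues).
    """
    # For 2 degrees of freedom the chi-square quantile has a closed form.
    threshold = -2.0 * math.log(1.0 - level)
    # Squared standardized distance from the center of the ellipse.
    statistic = z1 ** 2 / var1 + z2 ** 2 / var2
    return statistic <= threshold


# Hypothetical scores and eigenvalues, for illustration only.
near_center = inside_confidence_ellipse(1.0, 0.5, var1=2.0, var2=0.5)
far_away = inside_confidence_ellipse(4.0, 2.0, var1=2.0, var2=0.5)
print(near_center, far_away)
```

Under this sketch, a subject whose scores fall outside the ellipse built from the normal group would be flagged as abnormal.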

5.
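In discriminant analysis, traditional methods perform poorly when some of the class labels in the training data are wrong. To overcome this shortcoming, a robust discriminant method based on a mixture model is proposed, with parameter estimation carried out in two steps. Results on one simulated data set and one real data set show that the method significantly improves classification accuracy and has a clear advantage over traditional methods.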

6.
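Because EEG signals are non-stationary, complex, and have a low signal-to-noise ratio, the empirical mode decomposition (EMD) algorithm is used to decompose them, and feature values are extracted from the main IMF components. Fuzzy C-means (FCM) is then used for classification, and the approach is compared with several existing EEG classification methods. The results show a classification accuracy of 78% on the EEG data set of the second BCI Competition (2003), which offers a useful reference for existing EEG classification methods.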

7.
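A new algorithm for ordered discriminant analysis and its applications   Total citations: 1, self-citations: 1, citations by others: 0
Discriminant analysis builds a model from data of known class to classify data of unknown class; neither the data nor the classes are ordered. Discriminant analysis of data that are both ordered and periodic therefore calls for a new, ordered method, one in which the classes are ordered and interference from the periodicity of the underlying process can be excluded. This paper presents the principle, modeling workflow, application workflow, and worked examples of a new ordered discriminant analysis method for multivariate data. The method separates model building from classification. When modeling multivariate data, it builds multiple sliding sub-models within the multi-class model; in application, domain knowledge is used to make a preliminary estimate of which class a sample belongs to, and the program then selects the relevant sub-model for classification. The method resolves the problem of inverted sample classification caused by periodicity in multivariate time series data, opens a new route to classifying and forecasting time series data, and has performed well in practical applications.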

8.
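This paper proposes a new multivariate statistical method, double-screening stepwise discriminant classification, which addresses how to classify samples into two classes when few samples of known class are available. Applied in geology to discriminate ore-bearing from barren cases, the method gave good results.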

9.
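Using fuzzy set theory, this paper treats the degree of roof stability of a coal face as a fuzzy subset A of the universe of discourse U of coal face roofs, and presents a method for constructing a multivariate membership function. The method was used to classify the roofs of 47 coal faces, with a discrimination accuracy of 95.74%. We believe it can serve as an effective method in coal production practice.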

10.
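Diagnosing coagulation function from the four routine coagulation indicators is a standard clinical test, but diagnosis based on experience has low accuracy. After the clinically most important indicator, FIB, was removed, the accuracy of a support vector machine model did not differ significantly from that of diagnosis based on all four coagulation indicators; the mean accuracies over 100 simulations were 95.4496% and 95.5039%, respectively.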

11.
The purpose of the present paper is to explore the ability of neural networks such as multilayer perceptrons and modular neural networks, and traditional techniques such as linear discriminant analysis and logistic regression, in building credit scoring models in the credit union environment. Also, since funding and small sample size often preclude the use of customized credit scoring models at small credit unions, we investigate the performance of generic models and compare them with customized models. Our results indicate that customized neural networks offer a very promising avenue if the measure of performance is percentage of bad loans correctly classified. However, if the measure of performance is percentage of good and bad loans correctly classified, logistic regression models are comparable to the neural networks approach. The performance of generic models was not as good as the customized models, particularly when it came to correctly classifying bad loans. Although we found significant differences in the results for the three credit unions, our modular neural network could not accommodate these differences, indicating that more innovative architectures might be necessary for building effective generic models.

12.
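To address the complexity of the factors influencing water inrush from coal seam floors, and the low accuracy and difficulty of predicting it, principal component analysis (PCA) was combined with Fisher discriminant analysis to build a PCA-Fisher discriminant model for floor water inrush. The model was applied to predict the floor water inrush risk of the No. 9 coal seam of the Yueliangtian coal mine in Liupanshui, Guizhou Province. Six indicators, including aquifer water pressure, aquiclude thickness, and coal seam dip angle, were taken as the main factors influencing the floor water inrush risk, and 18 sets of measured data were fed into the PCA-Fisher model. The results show that the three principal components F1, F2, and F3 extracted by PCA have a cumulative variance contribution rate of 94.179%, and that the model classified the first 14 sets (the training samples) with 85.7% accuracy. The last 4 sets, which were not used in training, were then classified with a misclassification rate of 0%, that is, 100% accuracy, which supports the validity of the PCA-Fisher model for predicting coal seam floor water inrush.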
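
The cumulative variance contribution rate quoted in this abstract is the share of total variance carried by the retained principal components. A minimal sketch of that calculation follows; the function name and the eigenvalues are hypothetical and are not taken from the paper.

```python
def cumulative_contribution(eigenvalues, k):
    """Share of total variance explained by the k largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical eigenvalues for six standardized indicators (they sum to 6).
eigenvalues = [3.0, 1.8, 0.9, 0.15, 0.1, 0.05]
rate = cumulative_contribution(eigenvalues, 3)
print(f"Cumulative contribution of the first 3 components: {rate:.1%}")
```

A common rule of thumb is to keep enough components for this rate to exceed a chosen threshold such as 85%; the threshold used in the paper is not stated in the abstract.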

13.
A hybrid genetic model for the prediction of corporate failure   Total citations: 1, self-citations: 0, citations by others: 1
This study examines the potential of a neural network (NN) model, whose inputs and structure are automatically selected by means of a genetic algorithm (GA), for the prediction of corporate failure using information drawn from financial statements. The results of this model are compared with those of a linear discriminant analysis (LDA) model. Data from a matched sample of 178 publicly quoted, failed and non-failed, US firms, drawn from the period 1991 to 2000, are used to train and test the models. The best evolved neural network correctly classified 86.7 (76.6)% of the firms in the training set, one (three) year(s) prior to failure, and 80.7 (66.0)% in the out-of-sample validation set. The LDA model correctly categorised 81.7 (75.0)% and 76.0 (64.7)% respectively. The results provide support for a hypothesis that corporate failure can be anticipated, and that a hybrid GA/NN model can outperform an LDA model in this domain. MSC codes: 62M45, 68W10, 90B50, 91C20

14.
Fuzzy ordered classifiers were used to assign fuzzy labels to river sites expressing their suitability as a habitat for a certain macroinvertebrate taxon, given up to three abiotic properties of the considered river site. The models were built using expert knowledge and evaluated on data collected in the Province of Overijssel in the Netherlands. Apart from a performance measure for crisp classifiers common in the aquatic ecology domain, the percentage of correctly classified instances (% CCI), two performance measures for fuzzy (ordered) classifiers are introduced in this paper: the percentage of correctly fuzzy classified instances (% CFCI) and the average deviation (AD). Furthermore, results of an interpretability-preserving genetic optimization of the linguistic terms, applying once binary encoding and once real encoding, are presented.
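
The crisp measure mentioned here, % CCI, is simply the share of instances whose predicted label matches the observed one. A minimal sketch follows; the function name and the labels are hypothetical, and the fuzzy measures (% CFCI and AD) are not reproduced because the abstract does not define them.

```python
def percent_cci(predicted, observed):
    """Percentage of correctly classified instances (% CCI)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical habitat-suitability labels for four river sites.
predicted = ["low", "high", "high", "moderate"]
observed = ["low", "high", "moderate", "moderate"]
print(percent_cci(predicted, observed))
```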

15.
Soltysik and Yarnold propose, as a method for two-group multivariate optimal discriminant analysis (MultiODA), selecting a linear discriminant function based on an algorithm by Warmack and Gonzalez. An important assumption underlying the Warmack–Gonzalez algorithm is likely to be violated when the data in the discriminant training samples are discrete, and in particular when they are nominal, causing the algorithm to fail. We offer modest changes to the algorithm that overcome this limitation.

16.
Evidence of deficiencies in basic mathematical skills of beginning undergraduates has been documented worldwide. Many different theories have been set out as to why these declines in mathematical competency levels have occurred over time. One such theory is the widening access to higher education which has resulted in a less mathematically prepared profile of beginning undergraduates than ever before. In response to this situation, the present study details the examination of a range of methods through which a student's mathematical performance in higher education could be predicted at the beginning of their third-level studies. Several statistical prediction methods were examined and the most effective method in predicting students’ mathematical performance was discriminant analysis. The discriminant analysis correctly classified 71.3% of students in terms of mathematics performance. An ability to carry out such a prediction in turn allows for appropriate mathematics remediation to be offered to students predicted to fail third-level mathematics. The results of the prediction of mathematical performance, which was carried out using a database consisting of over 1000 beginning undergraduates over a 3-year period, are detailed in this article along with the implications of such findings to educational policy and practice.

17.
We consider the problem of classifying a p × 1 observation into one of two multivariate normal populations when the training samples contain a block of missing observations. A new classification procedure is proposed which is a linear combination of two discriminant functions, one based on the complete samples and the other on the incomplete samples. The new discriminant function is easy to use. We consider the estimation of error rate of the linear combination classification procedure by using the leave-one-out estimation and bootstrap estimation. A Monte Carlo study is conducted to evaluate the error rate and the estimation of it. A numerical example is given to illustrate its use.
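
Leave-one-out estimation of an error rate, as mentioned in this abstract, can be sketched as follows. The classifier here is a one-dimensional nearest-class-mean rule used purely as a stand-in; it is not the linear combination procedure proposed in the paper, and the function names and data are hypothetical.

```python
def nearest_mean_predict(train, x):
    """Assign x to the class whose training mean is closest (stand-in rule)."""
    means = {}
    for label in {lab for _, lab in train}:
        values = [v for v, lab in train if lab == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda lab: abs(x - means[lab]))


def leave_one_out_error(data):
    """Hold out each observation in turn, refit on the rest, count the errors."""
    errors = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if nearest_mean_predict(train, x) != label:
            errors += 1
    return errors / len(data)


# Hypothetical one-dimensional training sample with two classes.
data = [(1.0, "A"), (1.2, "A"), (0.8, "A"), (2.6, "A"),
        (3.0, "B"), (3.2, "B"), (2.8, "B"), (3.4, "B")]
print(leave_one_out_error(data))
```

Because each observation is classified by a rule fitted without it, the estimate avoids the optimism of simply re-classifying the training data.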

18.
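Application of the logistic regression model in credit risk analysis   Total citations: 2, self-citations: 0, citations by others: 2
A logistic regression credit evaluation model was built in SPSS and used to classify 106 companies listed in China in 2000 into two groups, "poor" and "normal", according to their operating condition. For each listed company, four main financial indicators of operating condition were considered: earnings per share, net assets per share, return on net assets, and cash flow per share. The simulation results show that the logistic regression credit evaluation model achieved a classification accuracy of 99.06% on all 106 samples. The study also found that, when a linear discriminant analysis model is built from the coefficients given by SPSS Discriminant and a logistic regression model from the parameters given by SPSS Multinomial Logistic, the logistic regression model classifies less well than the linear discriminant model. However, removing unqualified samples or normalizing the sample data improves the classification accuracy of the logistic regression model.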

19.
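Bayes discriminant analysis takes into account the prior probability of each population, the prior probability of the prediction, and the losses caused by misclassification, and its discriminant performance is better than that of other discriminant methods. After a detailed introduction to the Bayes discriminant method, R is used to apply Bayes discriminant analysis, Fisher discriminant analysis, and distance-based discriminant analysis to a data set of diastolic blood pressure and cholesterol, and the results of the three methods are compared. The results show that Bayes discriminant analysis gives more accurate classifications and has good prospects for application in medicine.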
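
The role of prior probabilities and misclassification losses in a Bayes discriminant rule can be sketched as follows. This is a simplified one-variable illustration written in Python rather than R; it assumes normal class densities and a loss that depends only on the true class, and the group labels, parameters, and function names are hypothetical rather than taken from the paper.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def bayes_classify(x, classes):
    """classes maps label -> (prior, mean, sd, cost of misclassifying that class).

    Assigning x to the label with the largest prior * cost * density minimizes
    the expected loss when the cost depends only on the true class.
    """
    scores = {label: prior * cost * normal_pdf(x, mean, sd)
              for label, (prior, mean, sd, cost) in classes.items()}
    return max(scores, key=scores.get)


# Hypothetical diastolic-pressure model for two groups, G1 and G2.
equal_cost = {"G1": (0.7, 80.0, 8.0, 1.0), "G2": (0.3, 95.0, 8.0, 1.0)}
costly_g2 = {"G1": (0.7, 80.0, 8.0, 1.0), "G2": (0.3, 95.0, 8.0, 5.0)}
print(bayes_classify(86.0, equal_cost), bayes_classify(86.0, costly_g2))
```

With equal costs the higher prior of G1 decides the borderline value; raising the cost of missing a G2 case shifts the same value to G2.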

20.
A procedure is presented for finding maximum likelihood estimates of the parameters of a mixture of two Weibull distributions. Estimation of a nonlinear discriminant function on the basis of small sample size is considered. Throughout simulation experiments, the total probabilities of misclassification and percentage biases are evaluated and discussed. The problem of updating a nonlinear discriminant function on the basis of two Weibull distributions is studied in situations when the additional observations are mixed or classified. The performance of all results is investigated using a series of simulation experiments by means of relative efficiencies.
