18 similar documents found; search took 779 ms
1.
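The Mahalanobis-Taguchi system (MTS) is a classification technique that uses the Mahalanobis distance as its measurement scale and constructs a Mahalanobis space from selected normal samples in order to diagnose and forecast multivariate systems. Because the Mahalanobis distance is highly sensitive to changes in the sample data, the quality of the normal samples used to build the Mahalanobis space directly affects classification accuracy. In practice, normal samples are mostly selected by subjective experience, and an objective, standardized selection mechanism is lacking. This paper proposes a control-chart-based mechanism for generating the Mahalanobis space: an initial Mahalanobis space is built from normal samples chosen by experts; the increment in each normal sample's Mahalanobis distance between the initial Mahalanobis space and the corresponding reduced Mahalanobis space then serves as a new measurement scale, from which an individuals control chart is built; and the control chart's stability rules are applied to remove abnormal data, yielding a Mahalanobis space in a stable state. Experimental results show that the method is effective and improves the classification accuracy of the MTS.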
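The screening step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the input list stands in for the per-sample Mahalanobis distance increments (the numbers are made up), the chart is a standard individuals chart with moving-range limits, and only the "point beyond the control limits" rule is applied, whereas the paper may use further stability rules. The function names are hypothetical.

```python
def individuals_chart_limits(values):
    """Return (lower, center, upper) limits of an individuals control chart."""
    center = sum(values) / len(values)
    # Moving ranges between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of two observations.
    half_width = 2.66 * mr_bar
    return center - half_width, center, center + half_width


def screen_until_stable(values, max_rounds=20):
    """Drop points outside the limits and recompute until none are dropped.

    Returns the original indices of the samples that remain.
    """
    kept = list(enumerate(values))
    for _round in range(max_rounds):
        lower, _center, upper = individuals_chart_limits([v for _, v in kept])
        inside = [(i, v) for i, v in kept if lower <= v <= upper]
        if len(inside) == len(kept):
            break  # chart is stable under the single rule used here
        kept = inside
    return [i for i, _ in kept]


# Hypothetical distance increments; sample 5 is an obvious outlier.
increments = [0.9, 1.1, 1.0, 0.8, 1.2, 6.0, 1.0, 0.9, 1.1, 1.0]
stable_indices = screen_until_stable(increments)
print(stable_indices)
```

The samples that survive the screening would then be used to rebuild the Mahalanobis space.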
2.
3.
4.
5.
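Traditional clustering methods cannot extract local correspondences between samples and variables, and they perform poorly on high-dimensional, sparse data. Researchers therefore proposed biclustering, which uses the local relationships between samples and variables to cluster both simultaneously, producing a set of submatrices as the clustering result. Biclustering has developed rapidly in recent years and is widely applied in gene analysis, text clustering, recommender systems, and other fields. This paper first surveys and summarizes biclustering methods, focusing on three families (sparse biclustering, spectral biclustering, and information-based biclustering), analyzes their differences and connections, and describes the current state and trends of these three families in integrative analysis of multi-source data, multi-layer clustering, semi-supervised learning, and ensemble learning. It then reviews applications of biclustering in gene analysis, text clustering, recommender systems, and other fields. Finally, in light of the characteristics of data in the big-data era and the open problems of biclustering, it outlines future research directions.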
6.
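The difference in numerical magnitude between intra-class and inter-class distances prevents the two from being combined directly, which in turn hampers the design of fuzzy C-means (FCM) clustering models. First, this paper comprehensively reviews classical and improved FCM clustering models, builds a model relating the traces of the intra-class and inter-class distances, and analyzes the shortcomings of existing FCM models from two angles: the inconsistent variation of intra-class and inter-class distances, and their difference in magnitude. Second, it replaces the traditional Euclidean distance with a Gaussian kernel distance to characterize intra-class and inter-class distances, designs a method for balancing the two based on minimizing the difference between intra-class compactness and inter-class separation, and proposes an improved FCM objective function and algorithm. Finally, a numerical example illustrates the effectiveness and superiority of the method.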
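As a rough illustration of why a Gaussian kernel distance helps with the magnitude problem, the sketch below uses the usual kernel-induced squared distance, K(x,x) + K(v,v) - 2K(x,v) = 2(1 - K(x,v)). It is an assumption that the paper uses exactly this form; the function names and the bandwidth `sigma` are hypothetical.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    squared_euclidean = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-squared_euclidean / (2.0 * sigma ** 2))


def kernel_distance_sq(x, v, sigma=1.0):
    """Squared distance induced by the Gaussian kernel: 2 * (1 - K(x, v))."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


near = kernel_distance_sq((0.0, 0.0), (0.1, 0.0))
far = kernel_distance_sq((0.0, 0.0), (100.0, 100.0))
print(near, far)
```

Unlike the Euclidean distance, this quantity stays within [0, 2] however far apart the points are, so distances to nearby points (typically within a class) and to distant points (typically between classes) share a common scale.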
7.
8.
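For anomaly detection in multivariate data that contain several normal classes, a semi-supervised anomaly detection method based on a multi-class Mahalanobis-Taguchi system is proposed. A Mahalanobis space is built separately for each normal class in the training set, giving a multi-class measurement scale based on the Mahalanobis distance. The method classifies the normal data in the test set and, at the same time, detects abnormal data. Its effectiveness is verified on simulated Gaussian mixture model data with outliers.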
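The idea of one Mahalanobis space per normal class can be sketched as below. This is a minimal illustration rather than the paper's procedure: it is restricted to two features so that the covariance matrix can be inverted in closed form, it uses the plain squared Mahalanobis distance, and the threshold value and all names are hypothetical.

```python
def fit_mahalanobis_space(samples):
    """Estimate the mean and inverse covariance of two-feature samples."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    det = sxx * syy - sxy ** 2  # must be nonzero for the inverse to exist
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inverse


def mahalanobis_sq(point, space):
    """Squared Mahalanobis distance of a point to one class's space."""
    (mx, my), inv = space
    dx, dy = point[0] - mx, point[1] - my
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def classify_or_flag(point, spaces, threshold):
    """Assign the nearest normal class, or flag the point as an anomaly."""
    distances = {label: mahalanobis_sq(point, sp) for label, sp in spaces.items()}
    best = min(distances, key=distances.get)
    return best if distances[best] <= threshold else "anomaly"


# Two hypothetical normal classes used as training data.
spaces = {
    "A": fit_mahalanobis_space([(0, 0), (1, 1), (-1, -1), (1, -1), (-1, 1)]),
    "B": fit_mahalanobis_space([(10, 10), (11, 11), (9, 9), (11, 9), (9, 11)]),
}
threshold = 9.0  # assumed cut-off, chosen only for this example
print(classify_or_flag((0.5, 0.2), spaces, threshold))
print(classify_or_flag((5.0, 5.0), spaces, threshold))
```

A point that is close to some class in that class's own metric is classified; a point that is far from every class is reported as abnormal.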
9.
10.
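Interval-valued symbolic data are an important type of symbolic data. The existing literature usually assumes that the point data within an interval are uniformly distributed, which limits applicability. Assuming a general distribution instead, this paper gives an extended Hausdorff distance measure for general-distribution interval-valued symbolic data and, on that basis, proposes a SOM clustering algorithm for such data. Random simulation experiments show that the SOM clustering algorithm based on the extended Hausdorff distance is more effective than SOM clustering algorithms based on the traditional Hausdorff distance and on the μσ distance. Finally, the method is applied to a cluster analysis of meteorological data to illustrate its application steps and operability and to further evaluate its effectiveness on a practical problem.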
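For orientation, the traditional Hausdorff distance that serves as the baseline here has a simple closed form for two closed intervals: the larger of the gaps between the lower endpoints and between the upper endpoints. The sketch below shows only this baseline, not the paper's extended measure; summing over variables is one common aggregation and is an assumption, as are the function names and the sample values.

```python
def hausdorff_interval(a, b):
    """Hausdorff distance between closed intervals a = (lo, hi), b = (lo, hi)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def hausdorff_interval_vectors(x, y):
    """Sum of per-variable Hausdorff distances between two interval vectors."""
    return sum(hausdorff_interval(a, b) for a, b in zip(x, y))


# Hypothetical observations: (temperature range, precipitation range).
station_1 = [(10, 20), (0, 5)]
station_2 = [(12, 25), (1, 4)]
print(hausdorff_interval_vectors(station_1, station_2))
```

This baseline depends only on the interval endpoints and ignores how points are distributed inside each interval, which is the limitation the extended measure is meant to address.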
11.
A local geometrical properties application to fuzzy clustering  Cited by: 1 (self-citations: 0, citations by others: 1)
Possibilistic clustering is seen increasingly as a suitable means to resolve the limitations resulting from the constraints imposed in the fuzzy C-means algorithm. Studying the metric derived from the covariance matrix we obtain a membership function and an objective function whether the Mahalanobis distance or the Euclidean distance is used. Applying the theoretical results using the Euclidean distance we obtain a new algorithm called fuzzy-minimals, which detects the possible prototypes of the groups of a sample. We illustrate the new algorithm with several examples.
12.
To classify time series by nearest neighbors, we need to specify or learn one or several distance measures. We consider variations of the Mahalanobis distance measures which rely on the inverse covariance matrix of the data. Unfortunately, for time series data, the covariance matrix often has low rank. To alleviate this problem we can either use a pseudoinverse, covariance shrinking or limit the matrix to its diagonal. We review these alternatives and benchmark them against competitive methods such as the related Large Margin Nearest Neighbor Classification (LMNN) and the Dynamic Time Warping (DTW) distance. As we expected, we find that the DTW is superior, but the Mahalanobis distance measures are one to two orders of magnitude faster. To get the best results with Mahalanobis distance measures, we recommend learning one distance measure per class using either covariance shrinking or the diagonal approach.
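A minimal sketch of the diagonal variant with one distance measure per class is given below. It is an illustration under assumptions, not the authors' implementation: the variance floor, the function names and the toy series are all hypothetical. With three series of length four per class, the full sample covariance matrix has rank at most two and cannot be inverted, which is the situation the diagonal restriction sidesteps.

```python
def class_variances(series_list, floor=1e-12):
    """Per-time-step sample variances of one class (covariance diagonal)."""
    n = len(series_list)
    variances = []
    for t in range(len(series_list[0])):
        column = [s[t] for s in series_list]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / (n - 1)
        variances.append(max(var, floor))  # floor avoids division by zero
    return variances


def diagonal_mahalanobis_sq(x, y, variances):
    """Squared Mahalanobis distance with a diagonal covariance matrix."""
    return sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances))


def nearest_neighbor_label(query, training, variances_by_class):
    """1-NN where each training series is compared in its own class metric."""
    best_label, best_dist = None, float("inf")
    for label, series_list in training.items():
        for series in series_list:
            d = diagonal_mahalanobis_sq(query, series, variances_by_class[label])
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label


training = {
    "flat": [[0.0, 0.0, 0.0, 0.0], [0.1, -0.1, 0.1, -0.1], [-0.1, 0.1, 0.0, 0.1]],
    "ramp": [[0.0, 1.0, 2.0, 3.0], [0.2, 1.1, 1.9, 3.2], [-0.1, 0.9, 2.1, 2.9]],
}
variances_by_class = {label: class_variances(s) for label, s in training.items()}
print(nearest_neighbor_label([0.1, 1.0, 2.0, 3.1], training, variances_by_class))
```

Each comparison costs time linear in the series length, which is consistent with the speed advantage over DTW reported in the abstract.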
13.
An appropriate distance is an essential ingredient in various real-world learning tasks. Distance metric learning proposes to study a metric that reflects the data configuration much better than the commonly used methods. We offer an algorithm for simultaneously learning a Mahalanobis-like distance and a K-means clustering, aiming to combine data rescaling and clustering so that the data separability grows iteratively in the rescaled space with its sequential clustering. At each step of the algorithm's execution, a global optimization problem is solved in order to minimize the cluster distortions based on the current cluster configuration. The obtained weight matrix can also be used as a cluster validation characteristic. Namely, closeness of such matrices learned during a sampling process can indicate the clusters' readiness; that is, it can be used to estimate the true number of clusters. Numerical experiments performed on synthetic and real datasets verify the high reliability of the proposed method.
14.
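Minimum-distance discrimination when the population covariance matrix is the identity matrix  Cited by: 1 (self-citations: 0, citations by others: 1)
This paper introduces the definition of a minimum-distance region and obtains a method for computing it. Building on the Mahalanobis distance discrimination theorem, it establishes a minimum-distance discrimination rule and derives a discrimination method that uses this rule.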
15.
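For multi-attribute group decision-making problems with interval numbers, an improved grey target decision method based on a set-valued statistical model is proposed. First, the set-valued statistical model is used to estimate the interval evaluations of multiple experts, yielding a sample matrix of decision indicators that meets the credibility requirement. Then a grey target decision method based on the weighted generalized Mahalanobis distance is used to rank the alternatives, and a grey target decision model is given for decision samples in the form of an interval-number group decision matrix. Finally, a worked example illustrates the decision procedure. The method avoids the case in which the Mahalanobis distance does not exist and overcomes the influence of correlation among decision indicators, differences in their importance, and differing units of measurement on the decision process and result; its feasibility and effectiveness are verified.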
16.
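To address correlation among decision indicators, the Mahalanobis distance is introduced into the traditional TOPSIS method, giving a Mahalanobis-distance-based TOPSIS method. On this basis, the properties of the closeness coefficient under the Mahalanobis-distance modification are analyzed and illustrated with an example of selecting an investment alternative. The results show that the Mahalanobis-distance-based TOPSIS method is invariant under nonsingular linear transformations of the decision data. Because the covariance matrix captures the correlation among decision indicators, the method can effectively prevent that correlation from distorting the decision outcome.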
17.
Jianhui Zhou. Journal of Multivariate Analysis, 2009, 100(1): 195-209
The canonical correlation (CANCOR) method for dimension reduction in a regression setting is based on the classical estimates of the first and second moments of the data, and therefore sensitive to outliers. In this paper, we study a weighted canonical correlation (WCANCOR) method, which captures a subspace of the central dimension reduction subspace, as well as its asymptotic properties. In the proposed WCANCOR method, each observation is weighted based on its Mahalanobis distance to the location of the predictor distribution. Robust estimates of the location and scatter, such as the minimum covariance determinant (MCD) estimator of Rousseeuw [P.J. Rousseeuw, Multivariate estimation with high breakdown point, Mathematical Statistics and Applications B (1985) 283-297], can be used to compute the Mahalanobis distance. To determine the number of significant dimensions in WCANCOR, a weighted permutation test is considered. A comparison of SIR, CANCOR and WCANCOR is also made through simulation studies to show the robustness of WCANCOR to outlying observations. As an example, the Boston housing data is analyzed using the proposed WCANCOR method.
18.
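As the meeting point of the "Belt and Road" strategy, Jiangsu has good reason to examine the capacity of its regions for export-oriented economic development. The Mahalanobis distance removes the correlation among indicators and is unaffected by units of measurement, so it replaces the Euclidean distance in TOPSIS; the grey relational degree is used to judge how the indicators are related; and a TOPSIS model based on the Mahalanobis distance and the grey relational degree is built to evaluate export-oriented economic development capacity. An empirical study is carried out on Jiangsu's 13 prefecture-level cities. It shows that the evaluation model helps to judge each city's export-oriented development capacity comprehensively, to identify weaknesses and correct them, and to advance "Belt and Road" construction.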