Similar Documents
19 similar documents found (search time: 171 ms)
1.
In profile monitoring, the quality characteristic of a product or process is represented by a specific functional relationship. If the functional form of the profile is known, parametric methods can be used to monitor it. When the profile shape is complex, however, parametric methods may fail to identify abnormal profiles correctly because of model misspecification. This paper therefore proposes a new method based on nonparametric regression for the complex-profile monitoring problems common in manufacturing. The proposed method combines nonparametric B-spline regression with an iterative clustering procedure and requires no restrictive assumptions about the form of the profile. Simulation studies evaluate the performance of the monitoring method under different shift scenarios, and comparisons with existing methods verify its effectiveness and superiority. Finally, a classic case from the profile-monitoring literature illustrates the new method's practical use.

2.
Change-point identification for profiles is an active topic in quality management. Most existing research targets changes in the profile as a whole; local changes have received relatively little attention, and systematic methods that both detect the change time and locate the changed region on the individual profile curve are rarer still. For the problem of identifying local changes in profiles, this paper proposes a method based on the wavelet transform and cluster analysis. Simulation-based performance evaluation and comparisons with existing methods show that the proposed method detects changes at smaller shift magnitudes and accurately locates the changed region. A real example at the end of the paper verifies the method's effectiveness.

3.
To address the problem that model identification and parameter estimation in ARMA modeling are easily distorted by outlying observations, an ARMA model accounting for both additive outliers and innovational outliers is constructed. A Bayesian Markov chain Monte Carlo approach based on Gibbs sampling is used to estimate the parameters of the robust ARMA model while simultaneously locating the outliers among the observations and classifying their type. A simulation study using China's natural population growth data shows that the Bayesian approach can effectively identify outliers in ARMA series.
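The Bayesian machinery itself is beyond a short sketch, but the additive-outlier idea above can be illustrated with a deliberately simplified, non-Bayesian stand-in: fit an AR(1) model by least squares and flag times whose one-step residuals are far beyond a robust scale estimate. The model order, the threshold `k`, and the simulated data are all illustrative assumptions, not the paper's method:

```python
import random
import statistics

def ar1_residual_outliers(y, k=4.0):
    # least-squares AR(1) fit, then flag times whose one-step residual
    # exceeds k times a robust (MAD-based) scale estimate
    num = sum(a * b for a, b in zip(y[:-1], y[1:]))
    den = sum(a * a for a in y[:-1]) or 1e-12
    phi = num / den
    resid = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
    scale = statistics.median(abs(r) for r in resid) / 0.6745 or 1e-12
    return [t + 1 for t, r in enumerate(resid) if abs(r) > k * scale]

random.seed(2)
y = [0.0]
for _ in range(199):
    y.append(0.7 * y[-1] + random.gauss(0, 1))
y[120] += 8.0                      # inject one additive outlier
flags = ar1_residual_outliers(y)
print(flags)
```

Note that an additive outlier typically contaminates two consecutive residuals (its own and the next one), which is one reason the paper's joint Bayesian treatment of location and type is preferable to simple residual screening.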

4.
Traditional outlier-identification methods for linear models are prone to misclassification: normal points are labeled as outliers, or outliers as normal points. To address this, the idea of identifying outliers with reversible-jump Markov chain Monte Carlo (RJMCMC) is proposed and tested on real data; its identification performance is clearly better than that of traditional methods.

5.
Taking as its object distribution-free change-point identification, which requires no assumption about the process distribution, this paper proposes a change-point identification procedure for single-observation data sequences based on the Kolmogorov-Smirnov (K-S) test. The method's effectiveness is verified on the daily closing-price series of Shanghai A-shares, and simulation tests show that the new method suits many different types of distributions and that its overall performance is superior to other methods.
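A minimal sketch of the K-S scanning idea described above (the split grid, minimum segment length, and simulated data are illustrative assumptions; the paper's full identification procedure is not reproduced):

```python
import bisect
import random

def ks_stat(a, b):
    # two-sample Kolmogorov-Smirnov statistic: the largest gap
    # between the two empirical distribution functions
    a, b = sorted(a), sorted(b)
    return max(abs(bisect.bisect_right(a, x) / len(a)
                   - bisect.bisect_right(b, x) / len(b)) for x in a + b)

def ks_change_point(series, min_seg=10):
    # scan every admissible split point and return the one that
    # maximizes the K-S distance between the two resulting segments
    best_t, best_d = min_seg, -1.0
    for t in range(min_seg, len(series) - min_seg):
        d = ks_stat(series[:t], series[t:])
        if d > best_d:
            best_t, best_d = t, d
    return best_t, best_d

random.seed(0)
series = ([random.gauss(0, 1) for _ in range(100)]
          + [random.gauss(3, 1) for _ in range(100)])   # mean shift at t = 100
t, d = ks_change_point(series)
print(t, round(d, 2))
```

Because the K-S statistic compares entire empirical distributions, the same scan also picks up variance or shape changes, which is what makes the approach distribution-free.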

6.
Outlier identification and normalization (dimensionless processing) methods in comprehensive evaluation
Three questions concerning outliers in comprehensive evaluation are discussed: whether the raw data contain outliers, how to identify them if they do, and how to normalize evaluation data that contain outliers. For outlier identification, an approach is given that takes the median as the reference point and compares how far the sorted data at the two ends deviate from it. For normalizing data that contain outliers, a piecewise normalization method is proposed: building on the common min-max ("extreme-value") method, separate target intervals are assigned to outliers and non-outliers. Finally, comparison with the outlier-identification and normalization results of the existing literature verifies the method's effectiveness: it screens outliers to an appropriate degree and makes the distribution of the normalized data more balanced.
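The two steps above can be sketched as follows; the MAD-based cut-off and the target intervals are illustrative assumptions standing in for the paper's exact rules:

```python
import statistics

def flag_outliers(x, k=3.0):
    # median as the reference point: a point is an outlier if its distance
    # from the median exceeds k times the median absolute deviation (MAD)
    med = statistics.median(x)
    mad = statistics.median(abs(v - med) for v in x) or 1e-12
    return [abs(v - med) > k * mad for v in x]

def piecewise_minmax(x, flags, inner=(0.2, 0.8)):
    # non-outliers get the inner interval via ordinary min-max scaling;
    # outliers are placed in the remaining margins, so an extreme value
    # cannot compress the normalized range of the normal points
    normal = [v for v, f in zip(x, flags) if not f]
    lo, hi = min(normal), max(normal)
    span = (hi - lo) or 1e-12
    out = []
    for v, f in zip(x, flags):
        if not f:
            out.append(inner[0] + (v - lo) / span * (inner[1] - inner[0]))
        elif v < lo:
            out.append(inner[0] / 2)          # low outliers land below the band
        else:
            out.append((1 + inner[1]) / 2)    # high outliers land above it
    return out

x = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0]
flags = flag_outliers(x)
scaled = piecewise_minmax(x, flags)
print(flags)
print([round(s, 2) for s in scaled])
```

Under a plain min-max scaling of this sample, the five normal values would all be squeezed into roughly the bottom 4% of [0, 1]; the piecewise scheme keeps them spread across the inner interval.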

7.
Considering the parameters generated during ATM transactions, such as transaction volume, success rate, and response time, transaction-state features are analyzed and an anomaly-detection model is built. For success rate and response time, a clustering algorithm divides the data points into three classes: normal points, suspected anomalies, and anomalies; a suspected anomaly is then confirmed or rejected according to the distribution of the surrounding points in the time series. For transaction volume, outliers are first identified with the LOF (local outlier factor), then screened further using the moving average and standard deviation of volume over time to obtain preliminary suspected anomalies, which are finally confirmed by comparison with data at the same time of day on other days. Based on this model, anomalies are divided into three warning levels and major failures are forecast.
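The moving-average/standard-deviation screening step for transaction volume can be sketched with a trailing-window z-score; the window length and data are illustrative, and the LOF and cross-day comparison stages are omitted:

```python
import statistics

def rolling_zscores(series, window=5):
    # z-score of each point against the trailing window's moving average
    # and standard deviation; a large |z| marks a suspected anomaly
    z = []
    for i, v in enumerate(series):
        w = series[max(0, i - window):i]
        if len(w) < 2:
            z.append(0.0)         # not enough history yet
            continue
        mu = statistics.mean(w)
        sd = statistics.stdev(w) or 1e-12
        z.append((v - mu) / sd)
    return z

volume = [10, 11, 10, 12, 11, 10, 50, 11, 10]   # one spike at index 6
z = rolling_zscores(volume)
print(max(range(len(z)), key=lambda i: abs(z[i])))
```

A trailing window (rather than a centered one) is used so the score can be computed online, as an ATM monitoring system would need.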

8.
Variable-selection control charts are an important tool for high-dimensional statistical process monitoring. Because traditional variable-selection charts rarely account for spatial correlation in high-dimensional processes, their monitoring efficiency suffers; this paper proposes a monitoring model for high-dimensional spatially correlated processes based on the fused LASSO. First, the fused-LASSO algorithm is used to modify the likelihood-ratio test; a monitoring statistic based on the penalized likelihood ratio is then derived; finally, the model's performance is analyzed through simulation and a real case. Both show that, in high-dimensional spatially correlated processes, when adjacent monitored variables shift simultaneously, the proposed method accurately identifies the potentially faulty variables and achieves good monitoring performance.

9.
《数理统计与管理》2019,(2):326-333
Processing and mining time-series data is a long-standing industry focus, and sea-surface temperature has long been an important object of observation, research, and forecasting. This paper considers cutting the one-year span into intervals so that the sea-surface temperature data falling in each interval best fit a normal distribution, in order to test remote-sensing data for anomalies. Using sea-surface temperature data sets for the South China Sea and the East China Sea from 2003-2011, the paper introduces the Floyd algorithm, converting the optimal segmentation of the data set into a shortest-path problem on a network: the distance between points no more than 30 days apart is set to infinity to prevent cut points from clustering too densely, and the error defined by the distance between frequency and probability becomes the edge weight, yielding a dynamic, globally optimal segmentation. Under the fitted normal distributions, the 3σ rule then identifies the outliers.
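Because the segmentation graph is acyclic (edges only run forward in time), the shortest-path computation can be sketched with a simple dynamic program rather than the full Floyd algorithm; the toy series, the within-segment-variance cost, and the scaled-down minimum gap are illustrative assumptions:

```python
def optimal_cuts(cost, n, min_gap=3):
    # shortest-path view of 1-D optimal segmentation: nodes are cut
    # positions 0..n, edge (i, j) carries the cost of segment [i, j),
    # and the cheapest 0 -> n path gives the globally optimal cut set.
    # Segments shorter than min_gap are forbidden (the paper's
    # "no cuts within 30 days" rule, scaled down here).
    INF = float("inf")
    dist = [INF] * (n + 1)
    prev = [-1] * (n + 1)
    dist[0] = 0.0
    for j in range(1, n + 1):
        for i in range(0, j - min_gap + 1):
            w = cost(i, j)
            if dist[i] + w < dist[j]:
                dist[j], prev[j] = dist[i] + w, i
    cuts, j = [], n
    while j > 0:
        cuts.append(j)
        j = prev[j]
    return sorted(cuts)

# toy series with two regimes; segment cost = within-segment sum of squares
data = [0.0] * 10 + [5.0] * 10
def seg_cost(i, j):
    seg = data[i:j]
    mu = sum(seg) / len(seg)
    return sum((v - mu) ** 2 for v in seg)

cuts = optimal_cuts(seg_cost, len(data))
print(cuts)
```

Any segment cost can be plugged in here; the paper's frequency-versus-probability distance would replace the within-segment variance used in this toy.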

10.
This paper presents an effective method for high-dimensional change-point problems. The data matrix is first projected into a low-dimensional space by principal component analysis, and traditional change-point methods are then applied for estimation. When the number of change points is unknown, cross-validation is used to estimate it. In simulation studies, the new method is compared with existing methods and outperforms them in both estimation accuracy and computation time.
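A compact sketch of the two-stage idea, assuming a single change point (the cross-validation step for an unknown number of change points is omitted); the power-iteration PCA and the simulated data are illustrative:

```python
import random

def first_pc_scores(X):
    # power iteration on the sample covariance matrix to obtain the leading
    # principal direction, then project each centered row onto it
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    C = [[row[j] - means[j] for j in range(p)] for row in X]
    v = [1.0] * p
    for _ in range(200):
        Cv = [sum(row[j] * v[j] for j in range(p)) for row in C]           # C @ v
        w = [sum(C[i][j] * Cv[i] for i in range(n)) / n for j in range(p)] # C.T @ Cv / n
        norm = sum(x * x for x in w) ** 0.5 or 1e-12
        v = [x / norm for x in w]
    return [sum(C[i][j] * v[j] for j in range(p)) for i in range(n)]

def mean_shift_cp(y):
    # classical least-squares estimate of a single mean-shift change point;
    # maximizing this gain is equivalent to minimizing the two-segment SSE
    n, total = len(y), sum(y)
    best_t, best_gain, left = 1, -1.0, 0.0
    for t in range(1, n):
        left += y[t - 1]
        right = total - left
        gain = left * left / t + right * right / (n - t)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t

random.seed(1)
X = [[random.gauss(0, 1) for _ in range(10)] for _ in range(60)]
for row in X[30:]:
    for j in range(10):
        row[j] += 2.0          # mean shift in every coordinate after t = 30
cp = mean_shift_cp(first_pc_scores(X))
print(cp)
```

The projection helps because a shift spread over many coordinates concentrates along the leading principal direction, where a one-dimensional change-point estimator sees it clearly.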

11.
In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results, particularly for two-stage analyses. In the DEA-related literature, prior work on this issue has focused on the efficient frontier as a basis for detecting outliers. An iterative approach for dealing with the potential for one outlier to mask the presence of another has been proposed but not demonstrated. This paper proposes using both the efficient frontier and the inefficient frontier to identify outliers and thereby improve the accuracy of second stage results in two-stage nonparametric analysis. The iterative outlier detection approach is implemented in a leave-one-out method using both the efficient frontier and the inefficient frontier and demonstrated in a two-stage semi-parametric bootstrapping analysis of a classic data set. The results show that the conclusions drawn can be different when outlier identification includes consideration of the inefficient frontier.

12.
High-quality decisions increasingly depend on high-quality data mining and analysis, and high-quality data mining is impossible without high-quality data. In surveys of large-instrument utilization, subjective and objective factors always leave some data abnormal, degrading data quality, so suitable methods are needed to detect and handle abnormal data, and different data types often call for different outlier-detection methods. This paper analyzes the overall characteristics of large-instrument utilization survey data and the general methods for it, and, using the operating-hours and shared-hours data from the "Survey of China's Large Instrument Resources" (2009) organized by the Platform Center of the Ministry of Science and Technology, compares the suitability of regression methods, depth-based methods, and box-plot methods for detecting outliers in different types of data. Examining the data from different angles and applying the appropriate methods to find suspicious outliers supports the subsequent analysis and handling of abnormal utilization data, improves data quality, lays a foundation for comprehensive evaluation of large-instrument utilization, and offers a useful reference for outlier detection in the preprocessing of science-and-technology resource survey data.

13.
We propose new tools for visualizing large amounts of functional data in the form of smooth curves. The proposed tools include functional versions of the bagplot and boxplot, which make use of the first two robust principal component scores, Tukey’s data depth and highest density regions.

By-products of our graphical displays are outlier detection methods for functional data. We compare these new outlier detection methods with existing methods for detecting outliers in functional data, and show that our methods are better able to identify outliers.

An R-package containing computer code and datasets is available in the online supplements.

14.
Cluster-based outlier detection
Outlier detection has important applications in the field of data mining, such as fraud detection, customer behavior analysis, and intrusion detection. Outlier detection is the process of detecting the data objects which are grossly different from or inconsistent with the remaining set of data. Outliers are traditionally considered as single points; however, there is a key observation that many abnormal events have both temporal and spatial locality, which might form small clusters that also need to be deemed as outliers. In other words, not only a single point but also a small cluster can probably be an outlier. In this paper, we present a new definition for outliers: cluster-based outlier, which is meaningful and provides importance to the local data behavior, and show how to detect outliers by the clustering algorithm LDBSCAN (Duan et al. in Inf. Syst. 32(7):978–986, 2007), which is capable of finding clusters and assigning LOF (Breunig et al. in Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, ACM Press, pp. 93–104, 2000) to single points.
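A plain LOF computation on a toy 2-D dataset, as a sketch of the scoring step the paper builds on (LDBSCAN itself is not reproduced; `k` and the data are illustrative):

```python
import math

def lof(points, k=3):
    # plain Local Outlier Factor: ratio of the average local reachability
    # density of a point's k nearest neighbours to the point's own density
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    nbrs = [sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])[:k]
            for i in range(n)]
    kdist = [d[i][nbrs[i][-1]] for i in range(n)]   # distance to k-th neighbour
    def lrd(i):
        # local reachability density: inverse mean reachability distance
        reach = sum(max(kdist[j], d[i][j]) for j in nbrs[i])
        return k / (reach or 1e-12)
    lrds = [lrd(i) for i in range(n)]
    return [sum(lrds[j] for j in nbrs[i]) / (k * lrds[i]) for i in range(n)]

cluster = [(x * 0.1, y * 0.1) for x in range(4) for y in range(4)]
points = cluster + [(3.0, 3.0)]        # one isolated point far from the cluster
scores = lof(points)
print(round(scores[-1], 1))
```

Points inside a cluster score near 1 while the isolated point scores far above it; the paper's contribution is to extend this per-point notion so that a small dense group can itself be scored as an outlier.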

15.
Outlier diagnosis is a classical problem in statistics, and finding outliers and reducing their influence on tax-assessment data analysis is a worthwhile study. Conventional outlier diagnosis, however, generally uses global identification methods suited to unimodal distributions. Drawing on the theory of the local correlation integral, an identification method based on nonparametric density estimation is proposed. The method suits multimodal distributions, can identify locally anomalous points, and retains strong identification ability even when outliers make up a relatively high share of the sample. An empirical comparison, based on a sample of 10,920 enterprises in one city, of the tax-assessment method currently used by the tax bureau with the proposed method shows that the bureau's method carries considerable assessment (misjudgment) risk.
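The local-correlation-integral machinery is not reproduced here, but the core point above, that a nonparametric density estimate handles multimodal data where a global mean/sigma rule fails, can be sketched with a one-dimensional Gaussian kernel density estimate; the bandwidth and data are illustrative assumptions:

```python
import math

def kde_scores(x, bandwidth=0.5):
    # Gaussian kernel density estimate evaluated at each sample point;
    # a point with low estimated density relative to the rest of the
    # sample is a candidate outlier, even in multimodal data
    n = len(x)
    c = n * bandwidth * math.sqrt(2 * math.pi)
    def density(v):
        return sum(math.exp(-((v - u) / bandwidth) ** 2 / 2) for u in x) / c
    return [density(v) for v in x]

# two modes plus one point in the gap between them
x = [0.0, 0.1, -0.1, 0.05, 10.0, 10.1, 9.9, 10.05, 5.0]
dens = kde_scores(x)
print(dens.index(min(dens)))
```

Note that the flagged point (5.0) sits almost exactly at the global mean of the sample (about 5.01), so a unimodal mean ± kσ rule would call it perfectly normal, which is precisely the failure mode the paper targets.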

16.
The detection of multidimensional outliers is a fundamental and important problem in applied statistics. The unreliability of multivariate outlier-detection techniques such as Mahalanobis distance and hat-matrix leverage has been known in the statistical community for well over a decade and has driven the development of alternative techniques; the literature on this subject is vast and growing. In this paper, we propose to use the artificial-intelligence technique of the self-organizing map (SOM) for detecting multiple outliers in multidimensional datasets. SOM, which produces a topology-preserving mapping of the multidimensional data cloud onto a lower-dimensional visualizable plane, provides an easy way of detecting multidimensional outliers in the data at their respective levels of leverage. The proposed SOM-based method not only identifies the multidimensional outliers, it actually provides information about the entire outlier neighbourhood. Being an artificial-intelligence technique, SOM-based outlier detection is nonparametric and can be used on very large multidimensional datasets. The method is applied to varied types of simulated multivariate datasets, a benchmark dataset, and a real-life cheque-processing dataset. The results show that SOM can be used effectively for multidimensional outlier detection.

17.
This paper explains some drawbacks of previous approaches to detecting influential observations in deterministic nonparametric data envelopment analysis models, as developed by Yang et al. (Annals of Operations Research 173:89–103, 2010). For example, the efficiency scores and relative entropies obtained in that model are uninformative for outlier detection, and the empirical distribution of all estimated relative entropies is not a Monte Carlo approximation. In this paper we develop a new method to detect whether a specific DMU is truly influential, and a statistical test is applied to determine the significance level. An application measuring the efficiency of hospitals shows the superiority of this method, which leads to significant advancements in outlier detection.

18.
A multivariate outlier detection method for interval data is proposed that makes use of a parametric approach to model the interval data. The trimmed maximum likelihood principle is adapted in order to robustly estimate the model parameters. A simulation study demonstrates the usefulness of the robust estimates for outlier detection, and new diagnostic plots allow gaining deeper insight into the structure of real world interval data.

19.
Many data-clustering algorithms exist, but they cannot adequately handle the number of clusters or the cluster shapes, and their performance depends mainly on the choice of algorithm parameters. Our approach to data clustering does not require such a choice; it can be treated as a natural adaptation to the existing structure of distances between data points. The outlier factor introduced by the author specifies a degree of outlyingness for each data point. The notion is based on the difference between the frequency distribution of interpoint distances in a given dataset and the corresponding distribution for uniformly distributed points. Data clusters are then determined by maximizing the outlier-factor function, with the data points divided into clusters according to the attractor regions of local optima. An experimental evaluation shows that the proposed method can identify complex cluster shapes. Key advantages of the approach are its good clustering behavior on datasets with comparatively large amounts of noise (additional data points) and the absence of critical parameters whose choice determines the quality of the results.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号