Similar literature
 20 similar documents found (search time: 31 ms)
1.
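In regression analysis of time series models, testing for correlation and for homogeneity of variance is a fundamental problem. This paper discusses tests for correlation and variance homogeneity in nonlinear regression models with bilinear BL(1,1,1,1) errors. Using the score test method, it gives test statistics for testing the bilinear terms, for testing correlation, for testing variance homogeneity, and for testing correlation and variance homogeneity simultaneously. This extends and develops the results for regression models with linear time-series error terms. Numerical examples illustrate the practical value of the tests.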

2.
In this paper we introduce COV, a novel information retrieval (IR) algorithm for massive databases based on vector space modeling and spectral analysis of the covariance matrix of the document vectors, which reduces the scale of the problem. Since the dimension of the covariance matrix depends on the attribute space and is independent of the number of documents, COV can be applied to databases that are too massive for methods based on the singular value decomposition of the document-attribute matrix, such as latent semantic indexing (LSI). In addition to improved scalability, theoretical considerations indicate that results from our algorithm tend to be more accurate than those from LSI, particularly in detecting subtle differences in document vectors. We demonstrate the power and accuracy of COV through an important topic in data mining, known as outlier cluster detection. We propose two new algorithms for detecting major and outlier clusters in databases: the first is based on LSI, and the second on COV. Our implementation studies indicate that our cluster detection algorithms outperform the basic LSI and COV algorithms in detecting outlier clusters.

3.
Summary  The detection of multidimensional outliers is a fundamental and important problem in applied statistics. The unreliability of multivariate outlier detection techniques such as Mahalanobis distance and hat matrix leverage has led to the development of techniques which have been known in the statistical community for well over a decade. The literature on this subject is vast and growing. In this paper, we propose to use the artificial intelligence technique of self-organizing map (SOM) for detecting multiple outliers in multidimensional datasets. SOM, which produces a topology-preserving mapping of the multidimensional data cloud onto a lower-dimensional visualizable plane, provides an easy way of detecting multidimensional outliers in the data, at respective levels of leverage. The proposed SOM-based method for outlier detection not only identifies the multidimensional outliers but also provides information about the entire outlier neighbourhood. Being an artificial intelligence technique, the SOM-based outlier detection technique is non-parametric and can be used to detect outliers in very large multidimensional datasets. The method is applied to detect outliers in varied types of simulated multivariate datasets, a benchmark dataset, and a real-life cheque processing dataset. The results show that SOM can be used effectively for multidimensional outlier detection.

4.
This article proposes a new technique for detecting outliers in autoregressive models and identifying the type as either innovation or additive. This technique can be used without knowledge of the true model order, outlier location, or outlier type. Specifically, we perturb an observation to obtain the perturbation size that minimizes the resulting residual sum of squares (SSE). The reduction in the SSE yields outlier detection and identification measures. In addition, the perturbation size can be used to gauge the magnitude of the outlier. Monte Carlo studies and empirical examples are presented to illustrate the performance of the proposed method as well as the impact of outliers on model selection and parameter estimation. We also obtain robust estimators and model selection criteria, which are shown in simulation studies to perform well when large outliers occur.

5.
This paper suggests an outlier detection procedure which applies a nonparametric model accounting for undesired outputs and exogenous influences in the sample. Although efficiency is estimated in a deterministic frontier approach, each potential outlier is initially given the benefit of the doubt of not being an outlier. We survey several outlier detection procedures and select five complementary methodologies which, taken together, are able to detect all influential observations. To exploit the singularity of the leverage and the peer count, the super-efficiency and the order-m method, and the peer index, it is proposed to select as outliers those observations which are simultaneously revealed as atypical by at least two of the procedures. A simulated example demonstrates the usefulness of this approach. The model is applied to the Portuguese drinking water sector, for which we have an unusually rich data set.
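
The "at least two of the procedures" rule in the abstract above can be illustrated with a short sketch. It assumes each procedure has already produced a list of flagged observation identifiers; the function name `consensus_outliers`, the dictionary keys, and the sample identifiers are hypothetical and are not taken from the paper.

```python
from collections import Counter


def consensus_outliers(flags_by_method, min_votes=2):
    """Return observations flagged by at least `min_votes` procedures.

    `flags_by_method` maps a procedure name to the identifiers that the
    procedure marked as atypical (hypothetical input format).
    """
    votes = Counter()
    for flagged in flags_by_method.values():
        # Use a set so one procedure cannot vote twice for the same observation.
        votes.update(set(flagged))
    return sorted(obs for obs, count in votes.items() if count >= min_votes)


# Made-up flags from five procedures, for illustration only.
example_flags = {
    "leverage": [1, 4],
    "peer_count": [4, 7],
    "super_efficiency": [2],
    "order_m": [7, 4],
    "peer_index": [],
}
print(consensus_outliers(example_flags))
```

Requiring agreement between procedures trades some sensitivity for fewer false alarms; the threshold of two is the one stated in the abstract.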

6.
We consider the problem of deleting bad influential observations (outliers) in linear regression models. The problem is formulated as a Quadratic Mixed Integer Programming (QMIP) problem, where penalty costs for discarding outliers are incorporated into the objective function. The optimum solution defines a robust regression estimator called penalized trimmed squares (PTS). Due to the high computational complexity of the resulting QMIP problem, the proposed robust procedure is computationally suitable only for small samples. The computational performance and the effectiveness of the new procedure are improved significantly by using the idea of the ε-insensitive loss function from support vector machine regression. Small errors are ignored, and the mathematical formulation gains the sparseness property. The good performance of the ε-Insensitive PTS (IPTS) estimator allows identification of multiple outliers while avoiding masking or swamping effects. The computational effectiveness and successful outlier detection of the proposed method are demonstrated via simulated experiments. This research has been partially funded by the Greek Ministry of Education under the program Pythagoras II.

7.
We consider a square random matrix of size N of the form A + Y where A is deterministic and Y has i.i.d. entries with variance 1/N. Under mild assumptions, as N grows the empirical distribution of the eigenvalues of A + Y converges weakly to a limit probability measure β on the complex plane. This work is devoted to the study of the outlier eigenvalues, i.e., eigenvalues in the complement of the support of β. Even in the simplest cases, a variety of interesting phenomena can occur. As in earlier works, we give a sufficient condition to guarantee that outliers are stable and provide examples where their fluctuations vary with the particular distribution of the entries of Y or the Jordan decomposition of A. We also exhibit concrete examples where the outlier eigenvalues converge in distribution to the zeros of a Gaussian analytic function. © 2016 Wiley Periodicals, Inc.

8.
9.
This paper explains some drawbacks of previous approaches for detecting influential observations in deterministic nonparametric data envelopment analysis models, as developed by Yang et al. (Annals of Operations Research 173:89–103, 2010). For example, the efficiency scores and relative entropies obtained in this model are unimportant to outlier detection, and the empirical distribution of all estimated relative entropies is not a Monte Carlo approximation. In this paper we develop a new method to detect whether a specific DMU is truly influential, and a statistical test is applied to determine the significance level. An application to measuring the efficiency of hospitals is used to show the superiority of this method, which leads to significant advancements in outlier detection.

10.
We propose new tools for visualizing large amounts of functional data in the form of smooth curves. The proposed tools include functional versions of the bagplot and boxplot, which make use of the first two robust principal component scores, Tukey’s data depth and highest density regions.

By-products of our graphical displays are outlier detection methods for functional data. We compare these new outlier detection methods with existing methods for detecting outliers in functional data, and show that our methods are better able to identify outliers.

An R-package containing computer code and datasets is available in the online supplements.

11.
Cluster-based outlier detection   Cited by: 1 (self-citations: 0, citations by others: 1)
Outlier detection has important applications in the field of data mining, such as fraud detection, customer behavior analysis, and intrusion detection. Outlier detection is the process of detecting the data objects which are grossly different from or inconsistent with the remaining set of data. Outliers are traditionally considered as single points; however, a key observation is that many abnormal events have both temporal and spatial locality, and so may form small clusters that also need to be deemed outliers. In other words, not only a single point but also a small cluster can be an outlier. In this paper, we present a new definition for outliers, the cluster-based outlier, which is meaningful and gives importance to the local data behavior, and we show how to detect outliers with the clustering algorithm LDBSCAN (Duan et al. in Inf. Syst. 32(7):978–986, 2007), which is capable of finding clusters and assigning LOF (Breunig et al. in Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, ACM Press, pp. 93–104, 2000) to single points.

12.
We fit parametric models to survival data in the case of censoring and (outlier) contamination. To do so, we adapt the robust density power divergence methodology of Basu, Harris, Hjort, and Jones (Biometrika, 85, 549–559, 1998) to the case of censored survival data. Asymptotic properties, simulation performance and application to data are provided.

13.
We address the statistical problem of detecting change points in the stress‐strength reliability R=P(X<Y) in a sequence of paired variables (X,Y). Without specifying their underlying distributions, we embed this nonparametric problem into a parametric framework and apply the maximum likelihood method via a dynamic programming approach to determine the locations of the change points in R. Under some mild conditions, we show the consistency and asymptotic properties of the procedure to locate the change points. Simulation experiments reveal that, in comparison with existing parametric and nonparametric change‐point detection methods, our proposed method performs well in detecting both single and multiple change points in R in terms of the accuracy of the location estimation and the computation time. Applications to real data demonstrate the usefulness of our proposed methodology for detecting the change points in the stress‐strength reliability R. Supplementary materials are available online.

14.
In [10] it is claimed that the set of predicate tautologies of all complete BL‐chains and the set of all standard tautologies (i.e., the set of predicate formulas valid in all standard BL‐algebras) coincide. As noticed in [11], this claim is wrong. In this paper we show that a complete BL‐chain B satisfies all standard BL‐tautologies iff for any transfinite sequence $(a_i : i \in I)$ of elements of B, the condition $\bigwedge_{i \in I} (a_i^2) = \left(\bigwedge_{i \in I} a_i\right)^2$ holds in B. (© 2008 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

15.
Likelihood ratio tests for detecting a single outlier in multivariate linear models are considered, where an observation is called an outlier if there has been a shift in the mean. The test statistics are the maximum of n nonindependent statistics, where n is the number of observations. The distributions needed to apply the upper and lower Bonferroni inequalities are given.
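
As a reminder of how the Bonferroni inequalities bound the tail of a maximum of dependent statistics, the standard first- and second-order bounds are sketched below. The symbols $T_1, \dots, T_n$ (the individual statistics) and $c$ (the critical value) are notation assumed here for illustration, not taken from the paper.

```latex
% Upper and lower Bonferroni bounds for the maximum of n dependent statistics.
% T_i and c are assumed notation.
\[
  \sum_{i=1}^{n} P(T_i > c) \;-\; \sum_{1 \le i < j \le n} P(T_i > c,\ T_j > c)
  \;\le\;
  P\Bigl(\max_{1 \le i \le n} T_i > c\Bigr)
  \;\le\;
  \sum_{i=1}^{n} P(T_i > c).
\]
```

The upper bound needs only the marginal distribution of each statistic, while the lower bound also needs the joint distribution of each pair.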

16.
Abstract

The implementation of the Hill estimator, which estimates the heaviness of the tail of a distribution, requires a choice of the number of extreme observations in the tails, r, from a sample of size n, where 2 ≤ r + 1 ≤ n. This article is concerned with a robust procedure for choosing an optimal r. Thus, an estimation procedure, $\delta_s$, based on the idea of spacing statistics, H(r), is developed. The proposed decision rule for choosing r under the squared error loss is found to be a simple function of the sample size. The proposed rule is then illustrated across a wide range of data, including insurance claims, currency exchange rate returns, and city size.
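
For orientation, a minimal sketch of the classical Hill estimator for a given r follows. It uses the standard textbook form (the mean log-excess of the r largest observations over the (r + 1)-th largest) and assumes positive data; it does not implement the article's rule for choosing r, and the function name `hill_estimate` is hypothetical.

```python
import math


def hill_estimate(sample, r):
    """Classical Hill estimate based on the r largest observations.

    Standard textbook form, assumed here: the average of
    log X_(i) - log X_(r+1) for i = 1..r, where X_(1) >= X_(2) >= ...
    are the order statistics. Requires 2 <= r + 1 <= n.
    """
    n = len(sample)
    if not 1 <= r <= n - 1:
        raise ValueError("r must satisfy 2 <= r + 1 <= n")
    ordered = sorted(sample, reverse=True)
    if ordered[r] <= 0:
        raise ValueError("the r + 1 largest observations must be positive")
    log_threshold = math.log(ordered[r])  # log of the (r + 1)-th largest value
    return sum(math.log(x) - log_threshold for x in ordered[:r]) / r


# Tiny made-up sample, for illustration only.
print(hill_estimate([2.0, 9.0, 1.5, 30.0, 4.0, 1.1], r=3))
```

The estimate depends strongly on r, which is why a data-driven rule for selecting r, such as the one the article proposes, matters in practice.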

17.
Summary A k-in-a-row procedure is proposed to select the most demanded element in a set of n elements. We show that the least favorable configuration of the proposed procedure, which always selects the element when the same element has been demanded (or observed) k times in a row, has a simple form similar to those of classical selection procedures. Moreover, numerical evidence is provided to illustrate that the k-in-a-row procedure is better than the usual inverse sampling procedure and the fixed sample size procedure when the distance between the most demanded element and the other elements is large and when the number of elements is small.

18.
Generalizations of Boolean elements of a BL‐algebra L are studied. By utilizing the MV‐center MV(L) of L, it is reproved that an element $x \in L$ is Boolean iff $x \vee x^* = 1$. L is called semi‐Boolean if for all $x \in L$, $x^*$ is Boolean. An MV‐algebra L is semi‐Boolean iff L is a Boolean algebra. A BL‐algebra L is semi‐Boolean iff L is an SBL‐algebra. A BL‐algebra L is called hyper‐Archimedean if for all $x \in L$, $x^n$ is Boolean for some finite n ≥ 1. It is proved that hyper‐Archimedean BL‐algebras are MV‐algebras. The study has application in mathematical fuzzy logics whose Lindenbaum algebras are MV‐algebras or BL‐algebras. (© 2007 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

19.
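A study on the identification of abnormal trading behavior   Cited by: 1 (self-citations: 1, citations by others: 0)
Within an unsupervised learning framework, this paper uses quantile regression models combined with change-point tests to identify abnormal trading behavior in the Chinese securities market. By analyzing anomalies in the co-evolution of changes in shareholding ratios and stock returns, it provides a new method for setting discrimination criteria and objectively defining thresholds for identifying abnormal trading behavior. Based on this method, regulators can build a real-time supervision system organized by stage, level, and category, improving regulatory efficiency.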

20.
For the several-sample problem, a vector of estimable parameters is considered. For a fixed total sample size, a multistage (sequential) procedure based on generalized U-statistics is developed for choosing a partition of this sample size into individual sample sizes for which the generalized variance of the estimator of the parameter vector is asymptotically minimized.
