Similar literature
1.
The following mixture model-based clustering methods are compared in a simulation study with one-dimensional data, fixed number of clusters and a focus on outliers and uniform “noise”: an ML-estimator (MLE) for Gaussian mixtures, an MLE for a mixture of Gaussians and a uniform distribution (interpreted as “noise component” to catch outliers), an MLE for a mixture of Gaussian distributions where a uniform distribution over the range of the data is fixed (Fraley and Raftery in Comput J 41:578–588, 1998), a pseudo-MLE for a Gaussian mixture with improper fixed constant over the real line to catch “noise” (RIMLE; Hennig in Ann Stat 32(4): 1313–1340, 2004), and MLEs for mixtures of t-distributions with and without estimation of the degrees of freedom (McLachlan and Peel in Stat Comput 10(4):339–348, 2000). The RIMLE (using a method to choose the fixed constant first proposed in Coretto, The noise component in model-based clustering. Ph.D. thesis, Department of Statistical Science, University College London, 2008) is the best method in some, and acceptable in all, simulation setups, and can therefore be recommended.
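
To make the third method in the list above concrete, here is a minimal sketch of EM for k Gaussians plus a uniform noise component whose density is fixed over the range of the data. It is an illustration under stated assumptions, not the code used in the study: the function name `em_gauss_uniform`, the quantile-based initialisation, the fixed iteration count and the variance floor are all choices made here.

```python
import numpy as np

def em_gauss_uniform(x, k, n_iter=200):
    """EM for k 1-D Gaussians plus one uniform noise component on [min(x), max(x)].

    Hypothetical helper for illustration; initialisation and stopping rule are assumptions.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    noise_density = 1.0 / (x.max() - x.min())          # fixed, never re-estimated
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)      # deterministic starting means
    sigma = np.full(k, x.std())
    pi = np.full(k + 1, 1.0 / (k + 1))                 # last weight is the noise component
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation
        dens = np.empty((n, k + 1))
        z = (x[:, None] - mu) / sigma
        dens[:, :k] = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        dens[:, k] = noise_density
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means and standard deviations of the Gaussians only
        nk = resp.sum(axis=0)
        pi = nk / n
        nk_gauss = np.maximum(nk[:k], 1e-12)           # guard against empty components
        mu = (resp[:, :k] * x[:, None]).sum(axis=0) / nk_gauss
        var = (resp[:, :k] * (x[:, None] - mu) ** 2).sum(axis=0) / nk_gauss
        sigma = np.sqrt(np.maximum(var, 1e-6))         # variance floor avoids degenerate spikes
    return pi, mu, sigma, resp
```

Observations whose largest responsibility falls in the last column of `resp` would be labelled as noise rather than assigned to a cluster.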

2.
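This paper discusses testing whether a sample comes from a single normal population or from a mixture of two normal distributions with unknown means and variances, using the log-likelihood ratio test. Hartigan pointed out that, without restrictions, the limiting distribution of the test statistic is not the usual χ² distribution. After imposing a restriction on the mixture, we obtain the limiting distribution and give a numerical table of its quantiles.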

3.
We adopt the Bayesian paradigm and discuss certain properties of posterior median estimators of possibly sparse sequences. The prior distribution considered is a mixture of an atom of probability at zero and a symmetric unimodal distribution, and the noise distribution is taken as another symmetric unimodal distribution. We derive an explicit form of the corresponding posterior median and show that it is an antisymmetric function and, under some conditions, a shrinkage and a thresholding rule. Furthermore we show that, as long as the tails of the nonzero part of the prior distribution are heavier than the tails of the noise distribution, the posterior median, under some constraints on the involved parameters, has the bounded shrinkage property, thus extending recent results to larger families of prior and noise distributions. Expressions of posterior distributions and posterior medians in particular cases of interest are obtained. The asymptotes of the derived posterior medians, which provide valuable information on how the corresponding estimators treat large coefficients, are also given. These results could be particularly useful for studying frequentist optimality properties and developing statistical techniques of the resulting posterior median estimators of possibly sparse sequences for a wider set of prior and noise distributions.

4.
Two conditions are shown under which elliptical distributions are scale mixtures of normal distributions with respect to probability distributions. The issue of finding the mixing distribution function is also considered. As a unified theoretical framework, it is also shown that any scale mixture of normal distributions is always a term of a sequence of elliptical distributions, increasing in dimension, and that all the terms of this sequence are also scale mixtures of normal distributions sharing the same mixing distribution function. Some examples are shown as applications of these concepts, showing the way of finding the mixing distribution function.
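
As a reminder of what a scale mixture of normals means here, the standard definition can be written as follows. The symbols p, μ, Σ, W and G are notation introduced for this illustration and are not taken from the paper; the Student t case is the usual textbook example of a known mixing distribution.

```latex
X \mid W = w \sim N_p(\mu,\, w\,\Sigma), \qquad W \sim G \ \text{on } (0, \infty),
\\[4pt]
f_X(x) = \int_0^\infty (2\pi w)^{-p/2}\, |\Sigma|^{-1/2}
         \exp\!\left( -\frac{(x-\mu)^\top \Sigma^{-1} (x-\mu)}{2w} \right) dG(w),
\\[4pt]
W \sim \text{Inverse-Gamma}\!\left(\tfrac{\nu}{2}, \tfrac{\nu}{2}\right)
\;\Longrightarrow\; X \sim t_p(\mu, \Sigma, \nu).
```

Because the density depends on x only through the quadratic form, every such mixture is elliptical; the abstract concerns the converse direction and how to recover G.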

5.
Summary: In this paper we extend Ruben's [4] result for quadratic forms in normal variables. He represented the distribution function of the quadratic form in normal variables as an infinite mixture of chi-square distribution functions. In the central case, we show that the distribution function of a quadratic form in t-variables can be represented as a mixture of beta distribution functions. In the noncentral case, the distribution function presented is an infinite series in beta distribution functions. An application to quadratic discrimination is given.

6.
This article explores the possibility of modeling the wage distribution using a mixture of density functions. We have dealt with this issue for a long time, and we build on our earlier work. Classical models use a single probability distribution such as the normal, lognormal, or Pareto, but their results have not been very good in recent years. Changes in the parameters of the probability density over time have degraded such models, making it necessary to choose a different probability distribution. In previous articles we used the idea of mixtures of distributions instead of a single classical density. In our models we tried mixtures of probability distributions (normal, lognormal and a mixture of Johnson’s distribution densities). The achieved results were very good. We used data from the Czech Statistical Office covering wages over the last 18 years in the Czech Republic.

7.
There are a number of cases where the moments of a distribution are easily obtained, but theoretical distributions are not available in closed form. This paper shows how to use moment methods to approximate a theoretical univariate distribution with mixtures of known distributions. The methods are illustrated with gamma mixtures. It is shown that for a certain class of mixture distributions, which includes the normal and gamma mixture families, one can solve for a p-point mixing distribution such that the corresponding mixture has exactly the same first 2p moments as the targeted univariate distribution. The gamma mixture approximation to the distribution of positive weighted sums of independent central χ² variables is demonstrated and compared with a number of existing approximations. The numerical results show that the new approximation is generally superior to these alternatives.

8.
In model-based clustering, the density of each cluster is usually assumed to be a certain basic parametric distribution, for example, the normal distribution. In practice, it is often difficult to decide which parametric distribution is suitable to characterize a cluster, especially for multivariate data. Moreover, the densities of individual clusters may be multimodal themselves, and therefore cannot be accurately modeled by basic parametric distributions. This article explores a clustering approach that models each cluster by a mixture of normals. The resulting overall model is a multilayer mixture of normals. Algorithms to estimate the model and perform clustering are developed based on the classification maximum likelihood (CML) and mixture maximum likelihood (MML) criteria. BIC and ICL-BIC are examined for choosing the number of normal components per cluster. Experiments on both simulated and real data are presented.

9.
The problem of constructing an estimate of a signal function from noisy observations, assuming that this function is uniformly Lipschitz regular, is considered. The thresholding of empirical wavelet coefficients is used to reduce the noise. As a rule, it is assumed that the noise distribution is Gaussian and the optimal parameters of thresholding are known for various classes of signal functions. In this paper a model of additive noise whose distribution belongs to a fairly wide class is considered. The mean-square risk estimate of thresholding is analyzed. It is shown that under certain conditions, this estimate is strongly consistent and asymptotically normal.
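
For readers unfamiliar with the thresholding step mentioned above, the following sketch shows the two usual rules applied to an array of coefficients. It assumes the coefficients have already been computed by some wavelet transform, which is not shown. The function names are hypothetical, and the universal threshold sigma * sqrt(2 log n) is the classical choice for Gaussian noise, included only as an example; the paper's own noise class and threshold may differ.

```python
import numpy as np

def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; set the rest to zero."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.where(np.abs(coeffs) > t, coeffs, 0.0)

def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the remaining ones towards zero by t."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

def universal_threshold(sigma, n):
    """Classical threshold sigma * sqrt(2 log n) for n coefficients (Gaussian-noise assumption)."""
    return sigma * np.sqrt(2.0 * np.log(n))
```

Hard thresholding leaves large coefficients untouched, while soft thresholding shrinks them by the threshold, so the two rules trade off bias against variance differently.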

10.
This paper focuses on the question of specification of measurement error distribution and the distribution of true predictors in generalized linear models when the predictors are subject to measurement errors. The standard measurement error model typically assumes that the measurement error distribution and the distribution of covariates unobservable in the main study are normal. To make the model flexible enough, we instead assume that the measurement error distribution is multivariate t and the distribution of true covariates is a finite mixture of normal densities. A likelihood-based method is developed to estimate the regression parameters. However, direct maximization of the marginal likelihood is numerically difficult. As an alternative, we apply the EM algorithm, which makes the computation of likelihood estimates feasible. The performance of the proposed model is investigated by a simulation study.

11.
The normal inverse Gaussian (NIG) distribution is a promising alternative for modelling financial data since it is a continuous distribution that allows for skewness and fat tails. There is an increasing number of applications of the NIG distribution to financial problems. Due to the complicated nature of its density, estimation procedures are not simple. In this paper we propose Bayesian estimation for the parameters of the NIG distribution via an MCMC scheme based on the Gibbs sampler. Our approach makes use of the data augmentation provided by the mixture representation of the distribution. We also extend the model to allow for modelling heteroscedastic regression situations. Examples with financial and simulated data are provided. Copyright © 2004 John Wiley & Sons, Ltd.

12.
Reference growth curves estimate the distribution of a measurement as it changes according to some covariate, often age. We present a new methodology to estimate growth curves based on mixture models and splines. We model the distribution of the measurement with a mixture of normal distributions with an unknown number of components, and model dependence on the covariate through the weights, using smooth functions based on B-splines. In this way the growth curves respect the continuity of the covariate and there is no need for arbitrary grouping of the observations. The method is illustrated with data on triceps skinfold in Gambian girls and women.

13.
A major problem in statistical quality control is to detect a change in the distribution of independent sequentially observed random vectors. The case of a Gaussian pre-change distribution has been extensively analyzed. Here we are concerned with the non-normal multivariate case. In this setup it is natural to use tolerance regions as detection tools. These regions are defined in terms of density level sets, which can be estimated in a plug-in fashion. Under a normal mixture model we compare, through a simulation study, the performance of such a detection scheme for two density estimators: a (parametric) normal mixture and a (nonparametric) kernel estimator. The problem of the bandwidth choice for the latter is addressed. We also obtain a result concerning the convergence rates of the error probabilities under a general parametric model. Finally, a real data example is discussed.

14.
A Gaussian measurement error assumption, that is, an assumption that the data are observed up to Gaussian noise, can bias any parameter estimation in the presence of outliers. A heavy tailed error assumption based on Student’s t distribution helps reduce the bias. However, it may be less efficient in estimating parameters if the heavy tailed assumption is uniformly applied to all of the data when most of them are normally observed. We propose a mixture error assumption that selectively converts Gaussian errors into Student’s t errors according to latent outlier indicators, leveraging the best of the Gaussian and Student’s t errors, so that parameter estimation can be not only robust but also accurate. Using simulated hospital profiling data and astronomical time series of brightness data, we demonstrate the potential for the proposed mixture error assumption to estimate parameters accurately in the presence of outliers. Supplemental materials for this article are available online.

15.
Model Misspecification: Finite Mixture or Homogeneous?   Cited by: 1 (self-citations: 0; citations by others: 1)
A common problem in statistical modelling is to distinguish between a finite mixture distribution and a homogeneous non-mixture distribution. Finite mixture models are widely used in practice and often mixtures of normal densities are indistinguishable from homogeneous non-normal densities. This paper illustrates what happens when the EM algorithm for normal mixtures is applied to a distribution that is a homogeneous non-mixture distribution. In particular, a population-based EM algorithm for finite mixtures is introduced and applied directly to density functions instead of sample data. The population-based EM algorithm is used to find finite mixture approximations to common homogeneous distributions. An example regarding the nature of a placebo response in drug treated depressed subjects is used to illustrate ideas.

16.
The estimating equations derived from minimising an L2 distance between the empirical distribution function and the parametric distribution representing a mixture of k normal distributions with possibly different means and/or different dispersion parameters are given explicitly. The equations are of the M-estimator form in which the function is smooth, bounded and has bounded partial derivatives. As a consequence it is shown that there is a solution of the equations which is robust. In particular there exists a weakly continuous, Fréchet differentiable root and hence there is a consistent root of the equations which is asymptotically normal. These estimating equations offer a robust alternative to the maximum likelihood equations, which are known to yield nonrobust estimators.

17.
The Behrens-Fisher distribution is generally defined as the convolution of two Student t distributions and it is well known that it can be represented as a scale mixture of normals. By extending this standardized distribution to a location-scale family in the usual way we prove that this generalised Behrens-Fisher distribution can also be represented as a location mixture of t distributions when the mixing distribution is, in turn, a Student t. This characterization is applied to the computation of certain predictive distributions appearing in the Bayesian analysis of two sample problems.

18.
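Recordings from ion channels with multiple conductance levels follow a mixture distribution formed as a weighted combination of several normal distributions. This paper uses the iterative EM algorithm to obtain maximum likelihood estimates of the parameters of that mixture. On this basis, the most probable mixture component is used to determine the channel state, thereby restoring the underlying channel signal and overcoming the shortcomings in parameter estimation and state restoration of the ion channel analysis software PCLAMP.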
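
The state-restoration step described above, picking the most probable mixture component for each sample, can be sketched as follows. This assumes the weights, means and standard deviations have already been estimated, for example by EM. The function name `most_probable_state` and its signature are hypothetical and are not taken from the paper or from PCLAMP.

```python
import numpy as np

def most_probable_state(x, weights, means, sds):
    """Return, for each sample, the index of the normal component with the highest posterior."""
    x = np.asarray(x, dtype=float)[:, None]
    weights, means, sds = (np.asarray(a, dtype=float) for a in (weights, means, sds))
    # Log posterior up to an additive constant shared by all components
    log_post = np.log(weights) - np.log(sds) - 0.5 * ((x - means) / sds) ** 2
    return np.argmax(log_post, axis=1)
```

Each returned index corresponds to one conductance level, so the sequence of indices is the restored idealised signal under these assumptions.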

19.
In a structural measurement error model the structural quasi-score (SQS) estimator is based on the distribution of the latent regressor variable. If this distribution is misspecified, the SQS estimator is (asymptotically) biased. Two types of misspecification are considered. Both assume that the statistician erroneously adopts a normal distribution as his model for the regressor distribution. In the first type of misspecification, the true model consists of a mixture of normal distributions which cluster around a single normal distribution; in the second type, the true distribution is a normal distribution admixed with a second normal distribution of low weight. In both cases of misspecification, the bias, of course, tends to zero when the size of misspecification tends to zero. However, in the first case the bias goes to zero in a flat way so that small deviations from the true model lead to a negligible bias, whereas in the second case the bias is noticeable even for small deviations from the true model.

20.
The quasi-likelihood estimator and the Bayesian type estimator of the volatility parameter are in general asymptotically mixed normal. In case the limit is normal, the asymptotic expansion was derived by Yoshida [28] as an application of the martingale expansion. The expansion for the asymptotically mixed normal distribution is then indispensable to develop the higher-order approximation and inference for the volatility. The classical approaches in limit theorems, where the limit is a process with independent increments or a simple mixture, do not work. We present an asymptotic expansion of a martingale with asymptotically mixed normal distribution. The expansion formula is expressed by the adjoint of a random symbol with coefficients described by the Malliavin calculus, differently from the standard invariance principle. Applications to a quadratic form of a diffusion process (“realized volatility”) are discussed.
