Similar articles
 Found 20 similar articles; search took 218 ms
1.
In model-based clustering, the density of each cluster is usually assumed to be a certain basic parametric distribution, for example, the normal distribution. In practice, it is often difficult to decide which parametric distribution is suitable to characterize a cluster, especially for multivariate data. Moreover, the densities of individual clusters may be multimodal themselves, and therefore cannot be accurately modeled by basic parametric distributions. This article explores a clustering approach that models each cluster by a mixture of normals. The resulting overall model is a multilayer mixture of normals. Algorithms to estimate the model and perform clustering are developed based on the classification maximum likelihood (CML) and mixture maximum likelihood (MML) criteria. BIC and ICL-BIC are examined for choosing the number of normal components per cluster. Experiments on both simulated and real data are presented.

2.
Time series models serve several purposes: understanding the functional dependence of the variable of interest on covariates, forecasting the dependent variable for future values of the covariates, and estimating variance decomposition, co-integration, and steady-state relations. Although the regression function in a time series model has been extensively modeled both parametrically and nonparametrically, modeling of the error autocorrelation is mainly restricted to the parametric setup. Proper modeling of the autocorrelation not only helps to reduce the bias in the regression function estimate, but also improves forecasting through a better forecast of the error term. In this article, we present a nonparametric model of the autocorrelation function under a Bayesian framework. Moving from the time domain into the frequency domain, we place a Gaussian process prior on the log of the spectral density, which is then updated using a Whittle approximation to the likelihood function (the Whittle likelihood). The posterior computation is simplified because the Whittle likelihood is approximated by the likelihood of a normal mixture distribution with the log-spectral density as a location shift parameter, where the mixture has only five components with known means, variances, and mixture probabilities. The problem then becomes conjugate conditional on the mixture components, and a Gibbs sampler is used to sample the unknown mixture components, treated as latent variables. We present a simulation study for performance comparison and apply our method to two real data examples.
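The Whittle approximation mentioned in this abstract can be illustrated with a short sketch. It assumes the common textbook form, in which the log-likelihood is approximated by minus the sum over Fourier frequencies of log f(w_j) + I(w_j)/f(w_j), with I the periodogram and f a candidate spectral density. The function names `periodogram` and `whittle_loglik` are hypothetical, and the Gaussian process prior and five-component mixture from the abstract are not reproduced here.

```python
import cmath
import math


def periodogram(x):
    """Periodogram I(w_j) = |sum_t x_t exp(-i w_j t)|^2 / (2 pi n)
    at the Fourier frequencies w_j = 2 pi j / n, j = 1..floor((n-1)/2)."""
    n = len(x)
    freqs, values = [], []
    for j in range(1, (n - 1) // 2 + 1):
        w = 2.0 * math.pi * j / n
        dft = sum(x[t] * cmath.exp(-1j * w * t) for t in range(n))
        freqs.append(w)
        values.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return freqs, values


def whittle_loglik(x, spectral_density):
    """Whittle approximation to the log-likelihood (up to an additive constant).

    `spectral_density` is any positive function of frequency; it is an
    assumption of this sketch, not a model taken from the article.
    """
    freqs, pgram = periodogram(x)
    return -sum(
        math.log(spectral_density(w)) + i_w / spectral_density(w)
        for w, i_w in zip(freqs, pgram)
    )
```

For example, `whittle_loglik(series, lambda w: sigma2 / (2 * math.pi))` would score a white-noise model with variance `sigma2` against an observed `series`. The direct DFT loop costs O(n^2); an FFT would be the usual choice for long series.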

3.
Regression density estimation is the problem of flexibly estimating a response distribution as a function of covariates. An important approach to regression density estimation uses finite mixture models and our article considers flexible mixtures of heteroscedastic regression (MHR) models where the response distribution is a normal mixture, with the component means, variances, and mixture weights all varying as a function of covariates. Our article develops fast variational approximation (VA) methods for inference. Our motivation is that alternative computationally intensive Markov chain Monte Carlo (MCMC) methods for fitting mixture models are difficult to apply when it is desired to fit models repeatedly in exploratory analysis and model choice. Our article makes three contributions. First, a VA for MHR models is described where the variational lower bound is in closed form. Second, the basic approximation can be improved by using stochastic approximation (SA) methods to perturb the initial solution to attain higher accuracy. Third, the advantages of our approach for model choice and evaluation compared with MCMC-based approaches are illustrated. These advantages are particularly compelling for time series data where repeated refitting for one-step-ahead prediction in model choice and diagnostics and in rolling-window computations is very common. Supplementary materials for the article are available online.

4.
Univariate or multivariate ordinal responses are often assumed to arise from a latent continuous parametric distribution, with covariate effects that enter linearly. We introduce a Bayesian nonparametric modeling approach for univariate and multivariate ordinal regression, which is based on mixture modeling for the joint distribution of latent responses and covariates. The modeling framework enables highly flexible inference for ordinal regression relationships, avoiding assumptions of linearity or additivity in the covariate effects. In standard parametric ordinal regression models, computational challenges arise from identifiability constraints and estimation of parameters requiring nonstandard inferential techniques. A key feature of the nonparametric model is that it achieves inferential flexibility, while avoiding these difficulties. In particular, we establish full support of the nonparametric mixture model under fixed cut-off points that relate through discretization the latent continuous responses with the ordinal responses. The practical utility of the modeling approach is illustrated through application to two datasets from econometrics, an example involving regression relationships for ozone concentration, and a multirater agreement problem. Supplementary materials with technical details on theoretical results and on computation are available online.

5.
Spatial scan density (SSD) estimation via mixture models is an important problem in the field of spatial statistical analysis and has wide applications in image analysis. The “borrowed strength” density estimation (BSDE) method via mixture models enables one to estimate the local probability density function in a random field wherein potential similarities between the density functions for the subregions are exploited. This article proposes efficient methods for SSD estimation by integrating the borrowed strength technique into the alternative EM framework, which combines the statistical basis of the BSDE approach with the stability and improved convergence rate of the alternative EM methods. In addition, we propose adaptive SSD estimation methods that extend the aforementioned approach by eliminating the need to find the posterior probability of membership of the component densities afresh in each subregion. Simulation results and an application to the detection and identification of man-made regions of interest in an unmanned aerial vehicle imagery experiment show that the adaptive methods significantly outperform the BSDE method. Other applications include automatic target recognition, mammographic image analysis, and minefield detection.

6.
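Parameter estimation and testing for competing-risks mixture models   Total citations: 1, self-citations: 0, citations by others: 1
Under independent and identically distributed type I interval censoring, this paper studies the properties of the maximum likelihood estimator of the parameters of a competing-risks mixture model when the true parameter value is an interior point, and establishes its strong consistency and asymptotic normality. Under fairly mild conditions, a method is given for testing hypotheses about order relations among the parameters of the competing-risks mixture model. The likelihood ratio test statistic is obtained, and its asymptotic distribution under the null hypothesis is shown to be a weighted chi-square distribution. An example is given and a power comparison is carried out.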

7.
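The return distributions of the Shanghai and Shenzhen composite indices do not follow the normal distribution that is commonly assumed. A new method, nonparametric kernel density estimation, is therefore used to fit the return distributions of the Shanghai and Shenzhen stock indices. The method not only captures the leptokurtic and fat-tailed features of the return distribution well, but the VaR model built on it also captures the risk characteristics of the market better than standard VaR models based on parametric distributions, and its conclusions are more accurate.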
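A kernel-density-based VaR of the kind this abstract describes can be sketched as follows. The sketch assumes a Gaussian kernel, Silverman's rule-of-thumb bandwidth, and a bisection search for the quantile; none of these choices is stated in the abstract, and the function names are hypothetical.

```python
import math
import statistics


def silverman_bandwidth(returns):
    """Rule-of-thumb bandwidth h = 1.06 * s * n^(-1/5) (an assumed choice)."""
    n = len(returns)
    return 1.06 * statistics.stdev(returns) * n ** (-0.2)


def kde_cdf(x, returns, h):
    """CDF of a Gaussian kernel density estimate: the average of normal CDFs
    centered at each observation with standard deviation h."""
    return sum(
        0.5 * (1.0 + math.erf((x - r) / (h * math.sqrt(2.0)))) for r in returns
    ) / len(returns)


def kde_var(returns, alpha=0.05, tol=1e-10):
    """Value at Risk at level alpha: minus the alpha-quantile of the KDE,
    located by bisection on the (monotone) KDE CDF."""
    h = silverman_bandwidth(returns)
    lo = min(returns) - 10.0 * h  # CDF is essentially 0 here
    hi = max(returns) + 10.0 * h  # CDF is essentially 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, returns, h) < alpha:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)  # reported as a positive loss
```

Because the kernel estimate smooths the empirical tail rather than imposing a normal shape, the resulting quantile can reflect fat tails in the data; the bandwidth choice strongly affects the tail estimate and would need care in practice.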

8.
Model Misspecification: Finite Mixture or Homogeneous?   Total citations: 1, self-citations: 0, citations by others: 1
A common problem in statistical modelling is to distinguish between a finite mixture distribution and a homogeneous non-mixture distribution. Finite mixture models are widely used in practice, and mixtures of normal densities are often indistinguishable from homogeneous non-normal densities. This paper illustrates what happens when the EM algorithm for normal mixtures is applied to a distribution that is a homogeneous non-mixture distribution. In particular, a population-based EM algorithm for finite mixtures is introduced and applied directly to density functions instead of sample data. The population-based EM algorithm is used to find finite mixture approximations to common homogeneous distributions. An example regarding the nature of a placebo response in drug-treated depressed subjects is used to illustrate the ideas.

9.
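Estimating the loss distribution has always been an important problem for insurance companies. Many parametric and nonparametric methods exist for fitting loss distributions. The authors propose a method that combines parametric and nonparametric approaches to fit the loss distribution. First, the threshold separating large and small losses is determined from the mean excess plot. A generalized Pareto distribution is then fitted to the losses above the threshold, and a transformed kernel density estimator to the losses below it. Finally, an empirical analysis compares the errors of this method with those of other methods, with satisfactory results.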
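The mean excess plot used to pick the threshold can be sketched in a few lines. The empirical mean excess at a threshold u is the average of x - u over the losses x that exceed u; for a generalized Pareto tail with shape parameter below 1 this function is linear in u, so a threshold is often read off where the plot becomes roughly linear. The function names below are hypothetical, and the abstract does not say how the authors choose the threshold from the plot.

```python
def mean_excess(losses, threshold):
    """Empirical mean excess e(u): average of (x - u) over losses x > u."""
    excesses = [x - threshold for x in losses if x > threshold]
    if not excesses:
        raise ValueError("no losses exceed the threshold")
    return sum(excesses) / len(excesses)


def mean_excess_curve(losses, thresholds):
    """Points (u, e(u)) of the mean excess plot for candidate thresholds."""
    return [(u, mean_excess(losses, u)) for u in thresholds]
```

Plotting the pairs returned by `mean_excess_curve` against a grid of candidate thresholds gives the mean excess plot; the estimate becomes noisy at high thresholds where few losses remain.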

10.
In this paper, several properties of the one-way classification model with skew-normal random effects are obtained, such as the moment generating function, the density function, and the noncentral skew chi-square distribution. Based on the EM algorithm, we discuss maximum likelihood (ML) estimation of the unknown parameters. For the problem of testing the fixed effect, a parametric bootstrap (PB) approach is developed. Finally, simulation results on the Type I error rates and powers of the PB approach are obtained, which show that the PB approach performs satisfactorily on both, even for small samples. For illustration, our main results are applied to a real data problem.

11.
This article explores the possibility of modeling the wage distribution using a mixture of density functions. We have worked on this problem for a long time and build on our earlier work. Classical models use a single probability distribution such as the normal, lognormal, or Pareto, but in recent years their results have not been very good. Changes in the parameters of the probability density over time have degraded such models, making it necessary to choose a different probability distribution. In previous articles we used mixtures of distributions instead of a single classical density. In our models we tried mixtures of probability distributions (normal, lognormal, and a mixture of Johnson’s distribution densities). The results achieved were very good. We used data from the Czech Statistical Office covering wages in the Czech Republic over the last 18 years.

12.
The main purpose of this paper is the study of the multivariate Behrens-Fisher distribution. It is defined as the convolution of two independent multivariate Student t distributions. Some representations of this distribution as the mixture of known distributions are shown. An important result presented in the paper is the elliptical condition of this distribution in the special case of proportional scale matrices of the Student t distributions in the defining convolution. For the bivariate Behrens-Fisher problem, the authors propose a non-informative prior distribution leading to highest posterior density (H.P.D.) regions for the difference of the mean vectors whose coverage probability matches the frequentist coverage probability more accurately than that obtained using the independence-Jeffreys prior distribution, even with small samples.

13.
In a structural measurement error model, the structural quasi-score (SQS) estimator is based on the distribution of the latent regressor variable. If this distribution is misspecified, the SQS estimator is (asymptotically) biased. Two types of misspecification are considered. Both assume that the statistician erroneously adopts a normal distribution as the model for the regressor distribution. In the first type of misspecification, the true model consists of a mixture of normal distributions which cluster around a single normal distribution; in the second type, the true distribution is a normal distribution admixed with a second normal distribution of low weight. In both cases, the bias, of course, tends to zero when the size of the misspecification tends to zero. However, in the first case the bias goes to zero in a flat way, so that small deviations from the true model lead to a negligible bias, whereas in the second case the bias is noticeable even for small deviations from the true model.

14.
In the context of the semi-functional partial linear regression model, we study the problem of error density estimation. The unknown error density is approximated by a mixture of Gaussian densities whose means are the individual residuals and whose variance is a constant parameter. This mixture error density has the form of a kernel density estimator of the residuals, where the regression function, consisting of parametric and nonparametric components, is estimated by the ordinary least squares and functional Nadaraya–Watson estimators. The estimation accuracy of the ordinary least squares and functional Nadaraya–Watson estimators jointly depends on the same bandwidth parameter. A Bayesian approach is proposed to simultaneously estimate the bandwidths in the kernel-form error density and in the regression function. Under the kernel-form error density, we derive a kernel likelihood and posterior for the bandwidth parameters. For estimating the regression function and error density, a series of simulation studies shows that the Bayesian approach yields better accuracy than the benchmark functional cross validation. Using a spectroscopy data set as an illustration, we find that the Bayesian approach gives better point forecast accuracy of the regression function than functional cross validation, and that it is capable of producing prediction intervals nonparametrically.

15.
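Proposal distributions for the parameters of asymmetric heavy-tailed GARCH models are difficult to determine. To address this problem, data augmentation is applied to the parameter space of the model: the heavy-tailed residual distribution is expressed as a mixture of normal and inverse gamma distributions, and proposal distributions for the parameters are then obtained by transforming their posterior conditional distributions, so that a Bayesian analysis of the asymmetric heavy-tailed GARCH model can be carried out using Metropolis-Hastings (M-H) sampling. An empirical study of the volatility of Chinese crude oil returns finds that it is leptokurtic and heavy-tailed but shows no "leverage effect". An in-sample forecast evaluation finds that the Bayesian method based on M-H sampling outperforms the maximum likelihood method, which demonstrates the effectiveness of the M-H sampling scheme.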

16.
We consider the problem of the construction of the goodness-of-fit test in the case of continuous time observations of a diffusion process with small noise. The null hypothesis is parametric and we use a minimum distance estimator of the unknown parameter. We propose an asymptotically distribution free test for this model.

17.
The paper is devoted to the problem of statistical estimation of a multivariate distribution density, which is a discrete mixture of Gaussian distributions. A heuristic approach is considered, based on the use of the EM algorithm and nonparametric density estimation with a sequential increase in the number of components of the mixture. Criteria for testing of model adequacy are discussed.
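The EM step at the core of this kind of approach can be sketched for the simplest case. The code below is a minimal illustration assuming univariate data and exactly two components, with a crude initialization at the sample minimum and maximum; the paper's multivariate setting, its sequential increase in the number of components, and its adequacy criteria are not reproduced. The function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of the normal distribution N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_normals(data, n_iter=100, var_floor=1e-6):
    """EM for a two-component univariate normal mixture (illustrative only).

    No log-sum-exp safeguard is used, so points extremely far from both
    components could cause underflow; var_floor guards against degenerate
    zero-variance components.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, var_floor)
    weights = [0.5, 0.5]
    means = [min(data), max(data)]  # assumed initialization
    variances = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted updates of weights, means, variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk,
                var_floor,
            )
    return weights, means, variances
```

A sequential scheme of the kind the abstract mentions would wrap such a fit in an outer loop that adds components until an adequacy criterion is satisfied; that outer loop is left out because the abstract does not specify the criterion.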

18.
In this paper we investigate the problem of thermal explosion in a two-phase polydisperse combustible mixture (oxygen and fuel concentrations are taken into account). The current work presents a new, simplified model of the thermal explosion in a combustible gaseous mixture containing vaporizing fuel droplets of different radii (polydisperse). The polydispersity is modeled using a probability density function (PDF). The evolution of the size distribution of droplets due to the evaporation process is described by the kinetic equation for the PDF. An explicit expression of the critical condition for the thermal explosion limit is derived analytically and represents a generalization of the critical parameter of the classical Semenov theory.

19.
In this article, we propose a novel Bayesian nonparametric clustering algorithm based on a Dirichlet process mixture of Dirichlet distributions, which have been shown to be very flexible for modeling proportional data. The idea is to let the number of mixture components increase as new data to cluster arrive, so that the model selection problem (i.e., determination of the number of clusters) can be answered without recourse to classic selection criteria. Thus, the proposed model can be considered an infinite Dirichlet mixture model. An expectation propagation inference framework is developed to learn this model by obtaining a full posterior distribution on its parameters. Within this learning framework, the model complexity and all the involved parameters are evaluated simultaneously. To show the practical relevance and efficiency of our model, we perform a detailed analysis using extensive simulations based on both synthetic and real data. In particular, real data are generated from three challenging applications, namely image categorization, anomaly intrusion detection, and video summarization.

20.

Assume that there are multiple data streams (channels, sensors) and in each stream the process of interest produces generally dependent and non-identically distributed observations. When the process is in a normal mode (in-control), the (pre-change) distribution is known, but when the process becomes abnormal there is a parametric uncertainty, i.e., the post-change (out-of-control) distribution is known only partially up to a parameter. Both the change point and the post-change parameter are unknown. Moreover, the change affects an unknown subset of streams, so that the number of affected streams and their location are unknown in advance. A good changepoint detection procedure should detect the change as soon as possible after its occurrence while controlling for a risk of false alarms. We consider a Bayesian setup with a given prior distribution of the change point and propose two sequential mixture-based change detection rules: one mixes a Shiryaev-type statistic over both the unknown subset of affected streams and the unknown post-change parameter, and the other mixes a Shiryaev–Roberts-type statistic. These rules generalize the mixture detection procedures studied by Tartakovsky (IEEE Trans Inf Theory 65(3):1413–1429, 2019) in a single-stream case. We provide sufficient conditions under which the proposed multistream change detection procedures are first-order asymptotically optimal with respect to moments of the delay to detection as the probability of false alarm approaches zero.

