Similar Literature
20 similar documents found.
1.
Equally weighted mixture models are recommended for situations where it is required to draw precise finite sample inferences concerning population parameters, but where the population distribution is not constrained to belong to a simple parametric family. They lead to an alternative procedure to the Laird-DerSimonian maximum likelihood algorithm for unequally weighted mixture models. Their primary purpose lies in the facilitation of exact Bayesian computations via importance sampling. Under very general sampling and prior specifications, exact Bayesian computations can be based upon an application of importance sampling, referred to as Permutable Bayesian Marginalization (PBM). An importance function based upon a truncated multivariate t-distribution is proposed, constructed from a generalization of the maximum likelihood procedure. The estimation of discrete distributions, by binomial mixtures, and inference for survivor distributions, via mixtures of exponential or Weibull distributions, are considered. Equally weighted mixture models are also shown to lead to an alternative Gibbs sampling methodology to the Lavine-West approach.
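The PBM computations above rest on self-normalized importance sampling with a truncated t importance function. As a minimal illustrative sketch (not the paper's procedure; the data, prior, and proposal settings are hypothetical), the following Python code estimates the posterior mean of a binomial success probability using a truncated t proposal; the truncation constant cancels in the self-normalized weights:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: x successes in m trials, with a Beta(1, 1) prior on theta.
x, m = 7, 20

# Importance function: a t-distribution centred near the MLE (a univariate
# stand-in for the truncated multivariate-t proposal), truncated to (0, 1).
proposal = stats.t(df=4, loc=x / m, scale=0.15)
theta = proposal.rvs(size=20000, random_state=rng)
theta = theta[(theta > 0) & (theta < 1)]      # truncation to the support

log_w = (stats.binom.logpmf(x, m, theta)      # likelihood
         + stats.beta.logpdf(theta, 1, 1)     # prior
         - proposal.logpdf(theta))            # importance density (the constant
                                              # truncation factor cancels below)
w = np.exp(log_w - log_w.max())               # self-normalized weights
print("posterior mean of theta:", np.sum(w * theta) / np.sum(w))
```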

2.
Hu Hongchang, Zeng Zhen. Acta Mathematica Sinica (数学学报), 2017, 60(6): 961-976
Consider the generalized linear model y_i = h(x_i^T β) + e_i, i = 1, 2, …, n, where e_i = G(…, ε_{i-1}, ε_i), h is a continuously differentiable function, and the ε_i are independent and identically distributed random variables with mean 0 and finite variance σ². We construct an M-estimator of the parameter β and obtain its Bahadur representation, which generalizes the corresponding results for linear models. Applying the Bahadur representation, we derive asymptotic properties for linear regression models with dependent errors, Poisson models, logistic models, and generalized linear models with independent errors.
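As a minimal sketch of M-estimation in this setting, assuming a hypothetical logistic choice of h and a Huber ρ-function (the paper's ρ and error structure may differ), one can minimize the empirical M-objective numerically:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

def h(u):                                    # assumed link: logistic
    return 1.0 / (1.0 + np.exp(-u))

n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -1.0])
y = h(X @ beta_true) + 0.1 * rng.standard_t(df=3, size=n)  # heavy-tailed errors

def huber(r, c=0.1):                         # Huber rho-function
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r**2, c * (a - 0.5 * c))

def m_objective(beta):
    return np.sum(huber(y - h(X @ beta)))

beta_hat = minimize(m_objective, x0=np.zeros(2), method="Nelder-Mead").x
print("M-estimate of beta:", beta_hat)
```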

3.
Regression density estimation is the problem of flexibly estimating a response distribution as a function of covariates. An important approach to regression density estimation uses finite mixture models and our article considers flexible mixtures of heteroscedastic regression (MHR) models where the response distribution is a normal mixture, with the component means, variances, and mixture weights all varying as a function of covariates. Our article develops fast variational approximation (VA) methods for inference. Our motivation is that alternative computationally intensive Markov chain Monte Carlo (MCMC) methods for fitting mixture models are difficult to apply when it is desired to fit models repeatedly in exploratory analysis and model choice. Our article makes three contributions. First, a VA for MHR models is described where the variational lower bound is in closed form. Second, the basic approximation can be improved by using stochastic approximation (SA) methods to perturb the initial solution to attain higher accuracy. Third, the advantages of our approach for model choice and evaluation compared with MCMC-based approaches are illustrated. These advantages are particularly compelling for time series data where repeated refitting for one-step-ahead prediction in model choice and diagnostics and in rolling-window computations is very common. Supplementary materials for the article are available online.
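The VA updates themselves are model-specific, but the underlying mixture-of-regressions likelihood is easy to exhibit. As a rough stand-in (plain EM rather than the article's variational method, with constant rather than covariate-dependent weights and variances), here is a two-component mixture of linear regressions with unequal variances:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data from two regression components with different noise levels.
n = 300
x = rng.uniform(-2, 2, n)
X = np.column_stack([np.ones(n), x])
z = rng.random(n) < 0.4
y = np.where(z, 1.0 + 2.0 * x + rng.normal(0, 0.3, n),
                -1.0 - 1.0 * x + rng.normal(0, 1.0, n))

pi1 = 0.5
beta = [np.array([1.0, 1.0]), np.array([-1.0, -0.5])]
sig2 = [1.0, 1.0]
for _ in range(200):
    # E-step: responsibilities (the common normalising constant cancels).
    d0 = pi1 * np.exp(-(y - X @ beta[0])**2 / (2 * sig2[0])) / np.sqrt(sig2[0])
    d1 = (1 - pi1) * np.exp(-(y - X @ beta[1])**2 / (2 * sig2[1])) / np.sqrt(sig2[1])
    r = d0 / (d0 + d1)
    # M-step: weighted least squares and weighted variance per component.
    for k, w in enumerate([r, 1 - r]):
        Xw = X * w[:, None]
        beta[k] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        sig2[k] = np.sum(w * (y - X @ beta[k])**2) / np.sum(w)
    pi1 = r.mean()

print("mixture weights:", pi1, 1 - pi1)
print("component fits:", beta[0], beta[1], "variances:", sig2)
```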

4.
In this article, we first propose a semiparametric mixture of generalized linear models (GLMs) and a nonparametric mixture of GLMs, and then establish identifiability results under mild conditions.

5.
The analysis of finite mixture models for exponential repeated data is considered. The mixture components correspond to different unknown groups of the statistical units. Dependency and variability of repeated data are taken into account through random effects. For each component, an exponential mixed model is thus defined. When considering parameter estimation in this mixture of exponential mixed models, the EM-algorithm cannot be directly used since the marginal distribution of each mixture component cannot be analytically derived. In this paper, we propose two parameter estimation methods. The first one uses a linearisation specific to the exponential distribution hypothesis within each component. The second approach uses a Metropolis–Hastings algorithm as a building block of a general MCEM-algorithm.
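The second approach hinges on a Metropolis–Hastings step for the unobservable random effects inside each MCEM iteration. A minimal random-walk sampler, with a hypothetical one-dimensional target (an exponential observation with a normal random effect on the log-rate), might look like this:

```python
import numpy as np

rng = np.random.default_rng(3)

def metropolis_hastings(log_target, x0, n_iter=5000, step=0.5):
    """Random-walk MH, the building block of the MCEM E-step."""
    x, lp = x0, log_target(x0)
    draws = np.empty(n_iter)
    for t in range(n_iter):
        prop = x + step * rng.normal()
        lp_prop = log_target(prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept / reject
            x, lp = prop, lp_prop
        draws[t] = x
    return draws

# Hypothetical target: b | y with y | b ~ Exponential(rate = exp(b)) and
# b ~ N(0, 1); log density up to an additive constant.
y = 2.0
log_target = lambda b: (b - np.exp(b) * y) - 0.5 * b**2

draws = metropolis_hastings(log_target, x0=0.0)
print("posterior mean of b:", draws[1000:].mean())   # discard burn-in
```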

6.
Mixture cure models were originally proposed in medical statistics to model long-term survival of cancer patients in terms of two distinct subpopulations: those that are cured of the event of interest and will never relapse, along with those that are uncured and are susceptible to the event. In the present paper, we introduce mixture cure models to the area of credit scoring, where, similarly to the medical setting, a large proportion of the dataset may not experience the event of interest during the loan term, i.e., default. We estimate a mixture cure model predicting (time to) default on a UK personal loan portfolio, and compare its performance to the Cox proportional hazards method and standard logistic regression. Results for credit scoring at an account level and prediction of the number of defaults at a portfolio level are presented; model performance is evaluated through cross validation on discrimination and calibration measures. Discrimination performance for all three approaches was found to be high and competitive. Calibration performance for the survival approaches was found to be superior to logistic regression for intermediate time intervals and useful for fixed 12 month time horizon estimates, reinforcing the flexibility of survival analysis as both a risk ranking tool and for providing robust estimates of probability of default over time. Furthermore, the mixture cure model's ability to distinguish between two subpopulations can offer additional insights by estimating the parameters that determine susceptibility to default in addition to parameters that influence time to default of a borrower.
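A minimal sketch of the mixture cure likelihood, assuming a logistic model for susceptibility and a hypothetical Weibull baseline for time to default (the paper's exact specification may differ; all data below are simulated):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

# Simulated loan data: x drives susceptibility to default; susceptible
# accounts get a Weibull time-to-default, others never default (cured).
n = 1000
x = rng.normal(size=n)
p_susc = 1 / (1 + np.exp(-(-1.0 + 0.8 * x)))
suscept = rng.random(n) < p_susc
t_event = 12.0 * rng.weibull(1.5, n)
t_cens = rng.uniform(0, 36, n)                       # observation window
time = np.where(suscept, np.minimum(t_event, t_cens), t_cens)
event = suscept & (t_event <= t_cens)

def negloglik(par):
    b0, b1, log_k, log_lam = par
    p = 1 / (1 + np.exp(-(b0 + b1 * x)))             # P(susceptible | x)
    k, lam = np.exp(log_k), np.exp(log_lam)
    S = np.exp(-(time / lam) ** k)                   # Weibull survivor function
    f = (k / lam) * (time / lam) ** (k - 1) * S      # Weibull density
    lik = np.where(event, p * f, 1 - p + p * S)      # cure-model mixture
    return -np.sum(np.log(lik + 1e-300))

fit = minimize(negloglik, x0=np.array([0.0, 0.0, 0.0, 2.0]),
               method="Nelder-Mead", options={"maxiter": 4000})
print("(b0, b1, log k, log lambda):", fit.x)
```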

7.
Normal mixture regression models are among the most important statistical tools for analyzing data from heterogeneous populations. When the data involve asymmetric outcomes, the skew-normal distribution, which over the last two decades has proved beneficial in a variety of theoretical and applied problems, is a natural choice. In this paper, we propose and study a novel class of models: a skew-normal mixture of joint location, scale, and skewness models for analyzing heteroscedastic skew-normal data from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, an Expectation-Maximization (EM) algorithm for estimating the model parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo experiments. Results from the analysis of a real data set on Body Mass Index (BMI) are presented.

8.
This paper deals with iterative gradient and subgradient methods with random feasibility steps for solving constrained convex minimization problems, where the constraint set is specified as the intersection of possibly infinitely many constraint sets. Each constraint set is assumed to be given as a level set of a convex but not necessarily differentiable function. The proposed algorithms are applicable to the situation where the whole constraint set of the problem is not known in advance, but it is rather learned in time through observations. Also, the algorithms are of interest for constrained optimization problems where the constraints are known but the number of constraints is either large or not finite. We analyze the proposed algorithm for the case when the objective function is differentiable with Lipschitz gradients and the case when the objective function is not necessarily differentiable. The behavior of the algorithm is investigated both for diminishing and non-diminishing stepsize values. The almost sure convergence to an optimal solution is established for diminishing stepsize. For non-diminishing stepsize, the error bounds are established for the expected distances of the weighted averages of the iterates from the constraint set, as well as for the expected sub-optimality of the function values along the weighted averages.
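A minimal sketch for the halfspace case: a gradient step on a smooth objective followed by a Polyak feasibility step onto a single randomly drawn constraint g_i(x) = a_i^T x - b_i <= 0 (the problem data here are synthetic):

```python
import numpy as np

rng = np.random.default_rng(5)

# Minimize f(x) = ||x - c||^2 over the intersection of many halfspaces
# {x : a_i^T x <= b_i}, observing one random constraint per iteration.
d, m = 5, 200
A = rng.normal(size=(m, d))
b = np.ones(m)
c = 3.0 * np.ones(d)

x = np.zeros(d)
for t in range(1, 20001):
    alpha = 1.0 / t                            # diminishing stepsize
    x = x - alpha * 2.0 * (x - c)              # gradient step on f
    i = rng.integers(m)                        # random feasibility step:
    viol = A[i] @ x - b[i]
    if viol > 0:                               # project onto {g_i <= 0} along
        x = x - (viol / (A[i] @ A[i])) * A[i]  # the subgradient a_i of g_i
print("max constraint violation:", np.max(A @ x - b))
print("objective value:", np.sum((x - c) ** 2))
```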

9.
In this paper, we consider variable selection in partial linear single-index models under the assumption that the vector of regression coefficients is sparse. We apply penalized splines to estimate the nonparametric function and the SCAD penalty to obtain sparse estimates of the regression parameters in both the linear and single-index parts of the model. Under some mild conditions, it is shown that the penalized estimators have the oracle property, in the sense that they are asymptotically normal with the same mean and covariance they would have if the zero coefficients were known in advance. Our model admits a least-squares representation, so standard least-squares algorithms can be applied without extra programming effort. Moreover, parametric estimation, variable selection, and nonparametric estimation can be carried out in one step, which greatly improves computational stability. The finite sample performance of the penalized estimators is evaluated through Monte Carlo studies and illustrated with a real data set.
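For reference, a sketch of the SCAD penalty of Fan and Li (with the customary a = 3.7) used on the parametric part, written coordinatewise:

```python
import numpy as np

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty p_lam(|b|), applied coordinatewise."""
    b = np.abs(beta)
    small = b <= lam
    mid = (b > lam) & (b <= a * lam)
    return np.where(small, lam * b,
           np.where(mid, (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1)),
                    lam**2 * (a + 1) / 2))

def scad_derivative(beta, lam, a=3.7):
    """p'_lam(|b|): lam near zero, decaying linearly to 0 beyond a*lam."""
    b = np.abs(beta)
    return lam * ((b <= lam)
                  + np.maximum(a * lam - b, 0) / ((a - 1) * lam) * (b > lam))

print(scad_penalty(np.array([0.1, 1.0, 5.0]), lam=0.5))   # flat beyond a*lam
```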

10.
Mixture of Experts (MoE) regression models are widely studied in statistics and machine learning for modeling heterogeneity in data for regression, clustering, and classification. The Laplace distribution is one of the most important statistical tools for analyzing heavy-tailed data, and Laplace Mixture of Linear Experts (LMoLE) regression models, being based on the Laplace distribution, are more robust. Analogously to modeling the variance parameter in a homogeneous population, we propose and study a novel class of models in this paper: heteroscedastic Laplace mixture of experts regression models for analyzing heteroscedastic data from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, a Minorization-Maximization (MM) algorithm for estimating the regression parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo simulations. Results from the analysis of two real data sets are presented.
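Since the Laplace MLE in a single-component linear model is least absolute deviations, the flavor of an MM step is easy to show: majorizing |r| by (r²/|r_old| + |r_old|)/2 turns each update into a weighted least-squares solve. A sketch on simulated data (single component only, not the full mixture of experts):

```python
import numpy as np

rng = np.random.default_rng(6)

n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.laplace(scale=0.5, size=n)

beta = np.linalg.lstsq(X, y, rcond=None)[0]        # OLS start
for _ in range(100):
    r = np.abs(y - X @ beta) + 1e-8                # guard against division by 0
    w = 1.0 / r                                    # MM weights from the majorizer
    Xw = X * w[:, None]
    beta = np.linalg.solve(Xw.T @ X, Xw.T @ y)     # weighted LS = MM update
print("Laplace/LAD estimate via MM:", beta)
```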

11.
A new method is proposed for constructing mortality forecasts. This parameterized approach utilizes Generalized Linear Models (GLMs), based on heteroscedastic Poisson (non-additive) error structures and an orthonormal polynomial design matrix. Principal Component (PC) analysis is then applied to the cross-sectional fitted parameters. The resulting model can be viewed either as a one-factor parameterized model where the time series are the fitted parameters, or as a principal component model, namely a log-bilinear hierarchical statistical association model of Goodman [Goodman, L.A., 1991. Measures, models, and graphical displays in the analysis of cross-classified data. J. Amer. Statist. Assoc. 86(416), 1085-1111], or equivalently as a generalized Lee-Carter model with p interaction terms. Mortality forecasts are obtained by applying dynamic linear regression models to the PCs. Two applications are presented: Sweden (1751-2006) and Greece (1957-2006).
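The principal-component step can be sketched with an SVD on a toy log-mortality surface; a random walk with drift stands in for the paper's dynamic linear regression of the PCs (all data below are synthetic):

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy mortality surface: log m(x, t) over 40 ages and 30 years.
ages, years = 40, 30
a_true = np.linspace(-8, -2, ages)
b_true = np.full(ages, 1.0 / ages)
k_true = np.linspace(5, -5, years)
log_m = a_true[:, None] + np.outer(b_true, k_true) \
        + rng.normal(0, 0.02, (ages, years))

# Lee-Carter-type decomposition: log m(x, t) = a_x + b_x k_t (adding
# higher PCs gives the generalized model with p interaction terms).
a_x = log_m.mean(axis=1)
U, s, Vt = np.linalg.svd(log_m - a_x[:, None], full_matrices=False)
b_x = U[:, 0] / U[:, 0].sum()                  # normalisation sum(b_x) = 1
k_t = s[0] * Vt[0] * U[:, 0].sum()

# Forecast k_t with a random walk with drift (simplest dynamic model).
drift = np.diff(k_t).mean()
k_forecast = k_t[-1] + drift * np.arange(1, 11)
print("share of variance in PC1:", s[0]**2 / np.sum(s**2))
print("10-year k_t forecast:", k_forecast)
```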

12.

In the mean regression context, this study considers several frequently encountered heteroscedastic error models where the regression mean and variance functions are specified up to certain parameters. An important point we note through a series of analyses is that different assumptions on standardized regression errors yield quite different efficiency bounds for the corresponding estimators. Consequently, all aspects of the assumptions need to be specifically taken into account in constructing the corresponding efficient estimators. This study clarifies the relation between the regression error assumptions and their respective efficiency bounds under the general regression framework with heteroscedastic errors. Our simulation results support our findings; we also carry out a real data analysis using the proposed methods, in which the Cobb–Douglas cost model serves as the regression mean.


13.
In some situations, the failure time of interest is defined as the gap time between two related events, and the observations on both event times can suffer either right or interval censoring. Such data are usually referred to as doubly censored data and are frequently encountered in many clinical and observational studies. Additionally, there may also exist a cured subgroup in the whole population, meaning that not every individual under study will eventually experience the failure time of interest. In this paper, we consider regression analysis of doubly censored data with a cured subgroup under a wide class of flexible transformation cure models. Specifically, we consider marginal likelihood estimation and develop a two-step approach combining multiple imputation with a new expectation-maximization (EM) algorithm for its implementation. The resulting estimators are shown to be consistent and asymptotically normal. The finite sample performance of the proposed method is investigated through simulation studies. The proposed method is also applied to a real dataset arising from an AIDS cohort study for illustration.

14.
Definitive screening designs (DSDs) are a class of experimental designs that, under effect sparsity, allow the estimation of linear, quadratic, and interaction effects with little experimental effort: the number of runs is twice the number of factors of interest plus one. Many industrial experiments involve nonnormal responses, and generalized linear models (GLMs) are a useful alternative for analyzing such data. The analysis of GLMs is based on asymptotic theory, which is questionable in, for example, a DSD with only 13 experimental runs. So far, analyses of DSDs have assumed a normal response. In this work, we present a five-step strategy that uses tools from the Bayesian approach to analyze this kind of experiment when the response is nonnormal. We consider binomial, gamma, and Poisson responses without having to resort to asymptotic approximations. We use posterior odds that effects are active and posterior probability intervals for the effects to evaluate their significance. We also combine the results of the Bayesian procedure with the lasso estimation procedure to enhance the scope of the method.

15.
Advances in Data Analysis and Classification - In statistical analysis, particularly in econometrics, the finite mixture of regression models based on the normality assumption is routinely used to...

16.
A new exact penalty function for solving constrained finite minimax problems
A new exact smooth penalty function is proposed for solving constrained minimax problems. By adding only one extra variable, this exact smooth penalty function transforms the constrained minimax problem into an unconstrained optimization problem. It is proved that, under reasonable assumptions, when the penalty parameter is sufficiently large, a minimizer of the penalized problem is a minimizer of the original problem. Furthermore, local exactness properties are studied. Numerical results show that the penalty-function algorithm is an effective method for solving constrained finite minimax problems.
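A minimal numerical sketch of the reformulation (using a plain nonsmooth exact penalty and a derivative-free solver, rather than the smoothed penalty the paper constructs): with one extra variable t, min_x max_i f_i(x) subject to g(x) <= 0 becomes an unconstrained problem.

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: min_x max(f1, f2) subject to x2 <= 0.5.
f = [lambda x: (x[0] - 1.0) ** 2 + x[1] ** 2,
     lambda x: (x[0] + 1.0) ** 2 + x[1] ** 2]
g = lambda x: x[1] - 0.5

def penalty(z, rho=50.0):
    # t + rho * (violations of f_i(x) <= t and of g(x) <= 0)
    x, t = z[:2], z[2]
    return t + rho * (sum(max(fi(x) - t, 0.0) for fi in f)
                      + max(g(x), 0.0))

res = minimize(penalty, x0=np.array([0.5, 0.5, 5.0]), method="Nelder-Mead",
               options={"xatol": 1e-9, "fatol": 1e-9, "maxiter": 5000})
x_opt = res.x[:2]
print("x* ~", x_opt, " minimax value ~", max(fi(x_opt) for fi in f))
```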

17.
We derive an asymptotic expansion for the log-likelihood of Gaussian mixture models (GMMs) with equal covariance matrices in the low signal-to-noise regime. The expansion reveals an intimate connection between two types of algorithms for parameter estimation: the method of moments and likelihood optimizing algorithms such as Expectation-Maximization (EM). We show that likelihood optimization in the low SNR regime reduces to a sequence of least squares optimization problems that match the moments of the estimate to the ground truth moments one by one. This connection is a stepping stone towards the analysis of EM and maximum likelihood estimation in a wide range of models. A motivating application for the study of low SNR mixture models is cryo-electron microscopy data, which can be modeled as a GMM with algebraic constraints imposed on the mixture centers. We discuss the application of our expansion to algebraically constrained GMMs, among other example models of interest.
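A small illustration of the moment-matching side of this connection: for a one-dimensional GMM with unit variances and equal weights, the mixture means can be recovered by matching the first few empirical moments (a toy stand-in for the paper's sequence of least-squares problems):

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(8)

# Equal-weight 1-D Gaussian mixture with unit component variances.
mu_true = np.array([-1.0, 2.0])
n = 100_000
y = mu_true[rng.integers(0, 2, n)] + rng.normal(size=n)

emp = np.array([np.mean(y), np.mean(y**2), np.mean(y**3)])

def moment_gap(mu):
    m1 = mu.mean()                        # E[y]   = E[mu_Z]
    m2 = np.mean(mu**2) + 1.0             # E[y^2] = E[mu_Z^2] + 1
    m3 = np.mean(mu**3) + 3.0 * m1        # E[y^3] = E[mu_Z^3] + 3 E[mu_Z]
    return np.array([m1, m2, m3]) - emp

fit = least_squares(moment_gap, x0=np.array([-0.5, 0.5]))
print("moment-matched means:", np.sort(fit.x))
```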

18.
This paper examines the analysis of an extended finite mixture of factor analyzers (MFA) in which both the continuous latent variable (common factor) and the categorical latent variable (component label) are assumed to be influenced by the effects of fixed observed covariates. A polytomous logistic regression model is used to link the categorical latent variable to its corresponding covariate, while a traditional linear model with normal noise is used to model the effect of the covariate on the continuous latent variable. The proposed model turns out to be, in various ways, an extension of many existing related models, and as such offers the potential to address some of the issues not fully handled by those previous models. A detailed derivation of an EM algorithm is proposed for parameter estimation, and latent variable estimates are obtained as by-products of the overall estimation procedure.

19.
Smith (1976, J. R. Statist. Soc. A, 139, 183-204) has argued that survey statisticians should attempt to model finite population structures in the same way that statisticians in other disciplines have to provide models of finite or infinite populations. Following this argument, we suggest in this paper that an obvious model for a stratified population, when auxiliary information regarding variate values is available, is the one-way analysis of covariance model with unequal variances, and we consider the problem of estimating the finite population mean. Finally, a possible extension of this result is discussed.
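Under a one-way ANCOVA model with a common slope and unequal stratum variances, a natural estimator of the finite population mean combines within-stratum regression estimators. A small simulation sketch (the population and sample sizes below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(9)

# Finite population with 3 strata; auxiliary x is known for every unit.
N_h = [400, 300, 300]
pop_x, pop_y = [], []
for h, (a, v) in enumerate([(1.0, 0.5), (2.0, 0.9), (0.5, 1.4)]):
    x = rng.uniform(0, 10, N_h[h])
    y = a + 1.5 * x + rng.normal(0, np.sqrt(v), N_h[h])  # common slope 1.5
    pop_x.append(x); pop_y.append(y)

# Simple random sample of n_h units in each stratum; regression estimator
# of each stratum mean: ybar_h + b_h * (Xbar_h - xbar_h).
n_h = [40, 30, 30]
N = sum(N_h)
est = 0.0
for h in range(3):
    idx = rng.choice(N_h[h], n_h[h], replace=False)
    xs, ys = pop_x[h][idx], pop_y[h][idx]
    slope = np.cov(xs, ys)[0, 1] / np.var(xs, ddof=1)
    est += N_h[h] / N * (ys.mean() + slope * (pop_x[h].mean() - xs.mean()))

print(f"regression estimate: {est:.3f}")
print(f"true population mean: {np.concatenate(pop_y).mean():.3f}")
```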

20.
We study additive function-on-function regression where the mean response at a particular time point depends on the time point itself, as well as the entire covariate trajectory. We develop a computationally efficient estimation methodology based on a novel combination of spline bases with an eigenbasis to represent the trivariate kernel function. We discuss prediction of a new response trajectory, propose an inference procedure that accounts for total variability in the predicted response curves, and construct pointwise prediction intervals. The estimation/inferential procedure accommodates realistic scenarios, such as correlated error structure as well as sparse and/or irregular designs. We investigate our methodology in finite sample size through simulations and two real data applications. Supplementary material for this article is available online.
