Found 10 similar articles (search time: 109 ms)
1.
Ta‐Hsin Li 《Applied Stochastic Models in Business and Industry》2019,35(5):1185-1201
This paper addresses the problem of data fragmentation when incorporating imbalanced categorical covariates in nonparametric survival models. The problem arises in an application of demand forecasting where certain categorical covariates are important explanatory factors for the diversity of survival patterns but are severely imbalanced in the sense that a large percentage of data segments defined by these covariates have very small sample sizes. Two general approaches, called the class‐based approach and the fusion‐based approach, are proposed to handle the problem. Both rely on judicious utilization of a data segment hierarchy defined by the covariates. The class‐based approach allows certain segments in the hierarchy to have their private survival functions and aggregates the others to share a common survival function. The fusion‐based approach allows all survival functions to borrow and share information from all segments based on their positions in the hierarchy. A nonparametric Bayesian estimator with Dirichlet process priors provides the data‐sharing mechanism in the fusion‐based approach. The hyperparameters in the priors are treated as fixed quantities and learned from data by taking advantage of the data segment hierarchy. The proposed methods are motivated and validated by a case study with real‐world data from a software development service operation.
2.
Victor De Oliveira 《Annals of the Institute of Statistical Mathematics》2012,64(1):107-133
Conditional autoregressive (CAR) models have been extensively used for the analysis of spatial data in diverse areas, such
as demography, economy, epidemiology and geography, as models for both latent and observed variables. In the latter case,
the most common inferential method has been maximum likelihood, and the Bayesian approach has not been used much. This work
proposes default (automatic) Bayesian analyses of CAR models. Two versions of Jeffreys prior, the independence Jeffreys and
Jeffreys-rule priors, are derived for the parameters of CAR models and properties of the priors and resulting posterior distributions
are obtained. The two priors and their respective posteriors are compared based on simulated data. Also, frequentist properties
of inferences based on maximum likelihood are compared with those based on the Jeffreys priors and the uniform prior. Finally,
the proposed Bayesian analysis is illustrated by fitting a CAR model to a phosphate dataset from an archaeological region.
3.
In count data regression there can be several problems that prevent the use of the standard Poisson log‐linear model: overdispersion, caused by unobserved heterogeneity or correlation, excess of zeros, non‐linear effects of continuous covariates or of time scales, and spatial effects. We develop Bayesian count data models that can deal with these issues simultaneously and within a unified inferential approach. Models for overdispersed or zero‐inflated data are combined with semiparametrically structured additive predictors, resulting in a rich class of count data regression models. Inference is fully Bayesian and is carried out by computationally efficient MCMC techniques. Simulation studies investigate performance, in particular how well different model components can be identified. Applications to patent data and to car insurance data illustrate the potential and, to some extent, the limitations of our approach. Copyright © 2006 John Wiley & Sons, Ltd.
4.
《Journal of computational and graphical statistics》2013,22(4):811-830
This article proposes a new Bayesian approach to prediction on continuous covariates. The Bayesian partition model constructs arbitrarily complex regression and classification surfaces by splitting the covariate space into an unknown number of disjoint regions. Within each region the data are assumed to be exchangeable and come from some simple distribution. Using conjugate priors, the marginal likelihoods of the models can be obtained analytically for any proposed partitioning of the space where the number and location of the regions is assumed unknown a priori. Markov chain Monte Carlo simulation techniques are used to obtain predictive distributions at the design points by averaging across posterior samples of partitions.
5.
Gaussian Markov random fields (GMRF) are important families of distributions for the modeling of spatial data and have been extensively used in different areas of spatial statistics such as disease mapping, image analysis and remote sensing. GMRFs have been used for the modeling of spatial data, both as models for the sampling distribution of the observed data and as models for the prior of latent processes/random effects; we consider mainly the former use of GMRFs. We study a large class of GMRF models that includes several models previously proposed in the literature. An objective Bayesian analysis is presented for the parameters of the above class of GMRFs, where explicit expressions for the Jeffreys (two versions) and reference priors are derived, and for each of these priors results on posterior propriety of the model parameters are established. We describe a simple MCMC algorithm for sampling from the posterior distribution of the model parameters, and study frequentist properties of the Bayesian inferences resulting from the use of these automatic priors. Finally, we illustrate the use of the proposed GMRF model and reference prior for studying the spatial variability of lip cancer cases in the districts of Scotland over the period 1975-1980.
6.
We consider several Bayesian multivariate spatial models for estimating the crash rates from different kinds of crashes. Multivariate conditional autoregressive (CAR) models are considered to account for the spatial effect. The models considered are fully Bayesian. A general theorem for each case is proved to ensure posterior propriety under noninformative priors. The different models are compared according to some Bayesian criterion. Markov chain Monte Carlo (MCMC) is used for computation. We illustrate these methods with Texas Crash Data.
7.
GARCH models are commonly used for describing, estimating and predicting the dynamics of financial returns. Here, we relax the usual parametric distributional assumptions of GARCH models and develop a Bayesian semiparametric approach based on modeling the innovations using the class of scale mixtures of Gaussian distributions with a Dirichlet process prior on the mixing distribution. The proposed specification allows for greater flexibility in capturing the usual patterns observed in financial returns. It is also shown how to undertake Bayesian prediction of the Value at Risk (VaR). The performance of the proposed semiparametric method is illustrated using simulated and real data from the Hang Seng Index (HSI) and Bombay Stock Exchange index (BSE30).
8.
This work presents a Bayesian semiparametric approach for dealing with regression models where the covariate is measured with error. Given that (1) the error normality assumption is very restrictive, and (2) assuming a specific elliptical distribution for the errors (Student-t, for example) may be somewhat presumptuous, there is a need for more flexible methods that assume only symmetry of the errors (admitting unknown kurtosis). In this sense, the main advantage of this extended Bayesian approach is the possibility of considering generalizations of the elliptical family of models by using Dirichlet process priors in dependent and independent situations. Conditional posterior distributions are implemented, allowing the use of Markov chain Monte Carlo (MCMC) to generate the posterior distributions. An interesting result is that the Dirichlet process prior is not updated in the case of the dependent elliptical model. Furthermore, an analysis of a real data set is reported to illustrate the usefulness of our approach in dealing with outliers. Finally, the proposed semiparametric models and the parametric normal model are compared graphically through the posterior densities of the coefficients.
9.
Generalized linear mixed models (GLMMs) have been applied widely in the analysis of longitudinal data. This model confers
two important advantages, namely, the flexibility to include random effects and the ability to make inference about complex
covariances. In practice, however, the inference of variance components can be a difficult task due to the complexity of the
model itself and the dimensionality of the covariance matrix of random effects. Here we first discuss for GLMMs the relation
between Bayesian posterior estimates and penalized quasi-likelihood (PQL) estimates, based on the generalization of Harville’s
result for general linear models. Next, we perform fully Bayesian analyses for the random covariance matrix using three different
reference priors, two with Jeffreys’ priors derived from approximate likelihoods and one with the approximate uniform shrinkage
prior. Computations are carried out via the combination of asymptotic approximations and Markov chain Monte Carlo methods.
Under the criterion of the squared Euclidean norm, we compare the performances of Bayesian estimates of variance components
with that of PQL estimates when the responses are non-normal, and with that of the restricted maximum likelihood (REML) estimates
when data are assumed normal. Three applications and simulations of binary, normal, and count responses with multiple random
effects and of small sample sizes are illustrated. The analyses examine the differences in estimation performance when the
covariance structure is complex, and demonstrate the equivalence between PQL and the posterior modes when the former can be
derived. The results also show that the Bayesian approach, particularly under the approximate Jeffreys’ priors, outperforms
other procedures.
10.
Ismaël Castillo 《Probability Theory and Related Fields》2012,152(1-2):53-99
This paper is a contribution to the Bayesian theory of semiparametric estimation. We are interested in the so-called Bernstein–von Mises theorem, in a semiparametric framework where the unknown quantity is (θ, f), with θ the parameter of interest and f an infinite-dimensional nuisance parameter. Two theorems are established, one in the case with no loss of information and one in the information loss case with Gaussian process priors. The general theory is applied to three specific models: the estimation of the center of symmetry of a symmetric function in Gaussian white noise, a time-discrete functional data analysis model and Cox’s proportional hazards model. In all cases, the range of application of the theorems is investigated by using a family of Gaussian priors parametrized by a continuous parameter.