Similar Documents
20 similar documents found.
1.
This article proposes a new approach for Bayesian and maximum likelihood parameter estimation for stationary Gaussian processes observed on a large lattice with missing values. We propose a Markov chain Monte Carlo approach for Bayesian inference, and a Monte Carlo expectation-maximization algorithm for maximum likelihood inference. Our approach uses data augmentation and circulant embedding of the covariance matrix, and provides likelihood-based inference for the parameters and the missing data. Using simulated data and an application to satellite sea surface temperatures in the Pacific Ocean, we show that our method provides accurate inference on lattices of sizes up to 512 × 512, and is competitive with two popular methods: composite likelihood and spectral approximations.
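
As a point of reference, the sketch below illustrates the circulant-embedding device exploited above: simulating a stationary Gaussian field on a regular lattice via the FFT. The exponential covariance model, grid size, and embedding factor are illustrative assumptions, and this is only one building block of the MCMC/EM machinery, not the authors' algorithm.

```python
# Minimal sketch of circulant embedding for simulating a stationary
# Gaussian process on a regular 2-D lattice (one building block of the
# methods above, not the authors' full algorithm).
import numpy as np

def circulant_embedding_sample(n, cov, rng):
    """Draw one stationary Gaussian field on an n x n lattice.

    cov(h1, h2) must return the covariance at integer lag (h1, h2).
    """
    m = 2 * n                       # embedding size; enlarge if needed
    wrapped = np.minimum(np.arange(m), m - np.arange(m))   # torus lags
    h1, h2 = np.meshgrid(wrapped, wrapped, indexing="ij")
    c = cov(h1, h2)                 # first block-column of the embedding
    lam = np.fft.fft2(c).real       # eigenvalues of the circulant matrix
    # In practice one enlarges m until lam >= 0; here tiny negative
    # eigenvalues from rounding are simply clipped.
    lam = np.clip(lam, 0.0, None)
    z = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    f = np.fft.fft2(np.sqrt(lam / (m * m)) * z)
    return f.real[:n, :n]           # restrict the torus sample to the grid

rng = np.random.default_rng(0)
expcov = lambda h1, h2: np.exp(-np.hypot(h1, h2) / 10.0)  # assumed model
field = circulant_embedding_sample(128, expcov, rng)
print(field.shape, round(field.std(), 3))
```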

2.
The problem of formal likelihood-based (either classical or Bayesian) inference for discretely observed multidimensional diffusions is particularly challenging. In principle, this involves data augmentation of the observation data to give representations of the entire diffusion trajectory. Most currently proposed methodology splits broadly into two classes: either through the discretization of idealized approaches for the continuous-time diffusion setup, or through the use of standard finite-dimensional methodologies applied to a discretization of the diffusion model. The connections between these approaches have not been well studied. This article provides a unified framework that brings together these approaches, demonstrating connections, and in some cases surprising differences. As a result, we provide, for the first time, theoretical justification for the various methods of imputing missing data. The inference problems are particularly challenging for irreducible diffusions, and our framework is correspondingly more complex in that case. Therefore, we treat the reducible and irreducible cases differently within the article. Supplementary materials for the article are available online.
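
A minimal sketch of the basic imputation step in this setting: between two observations, intermediate path points can be proposed from a Brownian bridge, a standard choice when the diffusion has unit volatility (e.g., after a Lamperti transform). The function below is illustrative only and does not reproduce the article's framework.

```python
# Minimal sketch of the data-augmentation step for discretely observed
# diffusions: propose intermediate path points between two observations
# with a Brownian bridge. Illustrative, not the article's full scheme.
import numpy as np

def brownian_bridge_impute(x0, x1, t0, t1, n_imputed, rng):
    """Sample n_imputed intermediate points of a Brownian bridge
    pinned at (t0, x0) and (t1, x1), sequentially in time."""
    ts = np.linspace(t0, t1, n_imputed + 2)
    path = np.empty(n_imputed + 2)
    path[0], path[-1] = x0, x1
    for k in range(1, n_imputed + 1):
        # Conditional law of B(t_k) given B(t_{k-1}) and the endpoint.
        a, t, b = ts[k - 1], ts[k], ts[-1]
        mean = path[k - 1] + (t - a) / (b - a) * (x1 - path[k - 1])
        var = (t - a) * (b - t) / (b - a)
        path[k] = mean + np.sqrt(var) * rng.standard_normal()
    return ts, path

rng = np.random.default_rng(1)
ts, path = brownian_bridge_impute(0.0, 1.3, 0.0, 1.0, 8, rng)
print(np.round(path, 3))
```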

3.
In sensitivity inference for propellant and explosive products, obtaining a reasonably precise inference for the standard deviation of the response distribution is a fundamental task. To this end, based on the logistic response distribution and binary response data, this paper applies the saddlepoint approximation method to construct an approximate confidence interval for the scale parameter, and carries out a simulation study. Finally, the method is applied to the QD-8 electric detonator. The simulation results and the case study show that, for small and moderate sample sizes, the proposed method yields fairly precise inference for the scale parameter, significantly improving on the current method based on asymptotic normality.

4.
This paper presents a decomposition for the posterior distribution of the covariance matrix of normal models under a family of prior distributions when missing data are ignorable and monotone. This decomposition is an extension of Bartlett's decomposition of the Wishart distribution to monotone missing data. It is not only theoretically interesting but also practically useful. First, with monotone missing data, it allows more efficient drawing of parameters from the posterior distribution than the factorized likelihood approach. Furthermore, with nonmonotone missing data, it allows for a very efficient monotone data augmentation algorithm and thereby multiple imputation of the missing data needed to create a monotone pattern.
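
For orientation, the sketch below implements the classical complete-data version that the paper extends: drawing from a Wishart distribution via Bartlett's decomposition, with chi-distributed diagonal entries and standard-normal entries below the diagonal. Parameter values are illustrative.

```python
# Minimal sketch of the classical (complete-data) Bartlett decomposition
# that the paper extends to monotone missing data: W ~ Wishart(df, Sigma)
# via W = (L A)(L A)^T, Sigma = L L^T, A lower triangular.
import numpy as np

def wishart_bartlett(df, sigma, rng):
    p = sigma.shape[0]
    L = np.linalg.cholesky(sigma)
    A = np.zeros((p, p))
    for i in range(p):
        A[i, i] = np.sqrt(rng.chisquare(df - i))   # chi(df - i) diagonal
        A[i, :i] = rng.standard_normal(i)          # N(0,1) below diagonal
    LA = L @ A
    return LA @ LA.T

rng = np.random.default_rng(2)
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
draws = [wishart_bartlett(10, sigma, rng) for _ in range(5000)]
print(np.mean(draws, axis=0) / 10)  # E[W]/df should be close to sigma
```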

5.
Principled techniques for incomplete-data problems are increasingly part of mainstream statistical practice. Among the many techniques proposed so far, inference by multiple imputation (MI) has emerged as one of the most popular. While many strategies leading to inference by MI are available in cross-sectional settings, the same richness does not exist in multilevel applications. The limited methods available for multilevel applications rely on multivariate adaptations of mixed-effects models. This approach preserves the mean structure across clusters and incorporates distinct variance components into the imputation process. In this paper, I add to these methods by considering a random covariance structure and develop computational algorithms. The attraction of this new imputation modeling strategy is that it correctly reflects the mean and variance structure of the joint distribution of the data, and allows the covariances to differ across clusters. Using Markov chain Monte Carlo techniques, a predictive distribution of missing data given observed data is simulated, leading to the creation of multiple imputations. To circumvent the large sample size required to support independent covariance estimates for the level-1 error term, I consider distributional impositions mimicking random-effects distributions assigned a priori. These techniques are illustrated in an example exploring relationships between victimization and individual- and contextual-level factors that raise the risk of violent crime.

6.
In this article, empirical likelihood inference for estimating equations with missing data is considered. Based on the weighted-corrected estimating function, an empirical log-likelihood ratio is shown to be asymptotically standard chi-square under some suitable conditions. This result differs from those derived before, so it is convenient for constructing confidence regions for the parameters of interest. We also prove that our proposed maximum empirical likelihood estimator θ is asymptotically normal and attains the semiparametric efficiency bound for missing data. Simulations indicate that the proposed method performs best.
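
A minimal sketch of the complete-data building block behind such results: Owen's empirical log-likelihood ratio for a scalar mean, computed by solving for the Lagrange multiplier. The article's weighted-corrected estimating function for missing data would replace the plain residuals x_i − μ used here.

```python
# Minimal sketch of the complete-data empirical likelihood ratio for a
# scalar mean: -2 log R(mu) is asymptotically chi-square(1). The
# missing-data correction in the article replaces x - mu with a
# weighted-corrected estimating function.
import numpy as np
from scipy.optimize import brentq

def neg2_el_ratio(x, mu):
    d = x - mu
    if d.min() >= 0 or d.max() <= 0:
        return np.inf  # mu outside the convex hull: EL ratio is zero
    # The multiplier solves sum d_i / (1 + lam * d_i) = 0 on the interval
    # where all weights stay positive.
    lo = (-1 + 1e-10) / d.max()
    hi = (-1 + 1e-10) / d.min()
    lam = brentq(lambda l: np.sum(d / (1 + l * d)), lo, hi)
    return 2 * np.sum(np.log1p(lam * d))

rng = np.random.default_rng(3)
x = rng.exponential(size=100)
print(neg2_el_ratio(x, 1.0))  # compare to a chi2(1) quantile, e.g. 3.84
```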

7.
This article takes up Bayesian inference in linear models with disturbances from a noncentral Student-t distribution. The distribution is useful when both long tails and asymmetry are features of the data. The distribution can be expressed as a location-scale mixture of normals with inverse weights distributed according to a chi-square distribution. The computations are performed using Gibbs sampling with data augmentation. An empirical application to Standard and Poor's stock returns indicates that posterior odds strongly favor a noncentral Student-t specification over its symmetric counterpart.
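
The sketch below shows the data-augmentation Gibbs step in the simpler symmetric-t case: the t error is written as a scale mixture of normals, latent precisions are drawn from their gamma full conditional, and the coefficients from a weighted least-squares normal. Fixed ν and σ and the flat prior are simplifying assumptions; the article's noncentral-t specification is not reproduced.

```python
# Minimal sketch of Gibbs sampling with data augmentation for a linear
# model with symmetric Student-t errors, written as a scale mixture of
# normals. Degrees of freedom nu and error scale sigma are held fixed
# for brevity (assumptions, not the article's noncentral specification).
import numpy as np

def gibbs_t_regression(X, y, nu=5.0, sigma=1.0, n_iter=2000, seed=4):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        # lambda_i | beta ~ Gamma((nu+1)/2, rate = (nu + e_i^2/sigma^2)/2)
        e = y - X @ beta
        lam = rng.gamma((nu + 1) / 2, 2 / (nu + e**2 / sigma**2))
        # beta | lambda: a weighted least-squares normal draw (flat prior)
        W = lam / sigma**2
        cov = np.linalg.inv(X.T @ (W[:, None] * X))
        mean = cov @ (X.T @ (W * y))
        beta = rng.multivariate_normal(mean, cov)
        draws[it] = beta
    return draws

rng = np.random.default_rng(5)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
y = X @ np.array([1.0, -2.0]) + rng.standard_t(5, size=200)
print(gibbs_t_regression(X, y)[1000:].mean(axis=0))  # roughly [1, -2]
```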

8.
Probabilistic Decision Graphs (PDGs) are a class of graphical models that can naturally encode some context-specific independencies that cannot always be efficiently captured by other popular models, such as Bayesian Networks. Furthermore, inference can be carried out efficiently over a PDG, in time linear in the size of the model. The problem of learning PDGs from data has been studied in the literature, but only for the case of complete data. We propose an algorithm for learning PDGs in the presence of missing data. The proposed method is based on the Expectation-Maximisation principle for estimating both the structure of the model and the parameters. We test our proposal on both artificially generated data with different rates of missing cells and real incomplete data. We also compare the PDG models learnt by our approach to the commonly used Bayesian Network (BN) model. The results indicate that the PDG model is less sensitive to the rate of missing data than the BN model. Also, though the BN models usually attain a higher likelihood, the PDGs come close both in likelihood and in size, which makes the learnt PDGs preferable for probabilistic inference purposes.

9.
The mixture of Dirichlet process (MDP) defines a flexible prior distribution on the space of probability measures. This study shows that the ordinary least-squares (OLS) estimator, as a functional of the MDP posterior distribution, has posterior mean given by weighted least-squares (WLS), and posterior covariance matrix given by the (weighted) heteroscedastic-consistent sandwich estimator. This follows from a pairs-bootstrap approximation of the posterior distribution based on a Pólya urn scheme. Also, when the MDP prior baseline distribution is specified as a product of independent probability measures, this WLS solution provides a new type of generalized ridge regression estimator. Such an estimator can handle multicollinear or singular design matrices even when the number of covariates exceeds the sample size, and can shrink the coefficient estimates of irrelevant covariates towards zero, which makes it useful for nonlinear regression via basis expansions. The MDP/OLS functional methodology can also be extended to methods for analyzing the sensitivity of the heteroscedasticity-consistent causal effect size over a range of hidden biases due to covariates omitted from the regression, and, more generally, to a Vibration of Effects analysis. The methodology is illustrated through the analysis of simulated and real data sets. Overall, this study establishes new connections between Dirichlet process functional inference, the bootstrap, consistent sandwich covariance estimation, ridge shrinkage regression, WLS, and sensitivity analysis, providing regression methodology useful for inference about the mean dependent response.
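
A minimal sketch of the pairs/Bayesian-bootstrap reading of this functional: draw uniform Dirichlet weights and compute the WLS functional for each draw; the empirical mean and covariance of the draws then approximate the posterior mean (WLS) and the sandwich covariance described above. The baseline-prior and ridge extensions from the study are omitted.

```python
# Minimal sketch: the OLS functional under Bayesian-bootstrap weights
# (a limiting case of the Polya urn for the Dirichlet process posterior),
# giving a WLS posterior mean and a sandwich-type posterior covariance.
import numpy as np

def bayesian_bootstrap_ols(X, y, n_draws=4000, seed=6):
    rng = np.random.default_rng(seed)
    n = len(y)
    betas = np.empty((n_draws, X.shape[1]))
    for s in range(n_draws):
        w = rng.dirichlet(np.ones(n))                 # bootstrap weights
        XtW = X.T * w                                 # weight each pair
        betas[s] = np.linalg.solve(XtW @ X, XtW @ y)  # WLS functional
    return betas

rng = np.random.default_rng(7)
X = np.column_stack([np.ones(150), rng.standard_normal(150)])
y = X @ np.array([0.5, 2.0]) + rng.standard_normal(150) * (1 + np.abs(X[:, 1]))
b = bayesian_bootstrap_ols(X, y)
print(b.mean(axis=0))   # posterior mean: a WLS-type estimate
print(np.cov(b.T))      # ~ heteroscedasticity-consistent sandwich
```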

10.
We establish computationally flexible methods and algorithms for the analysis of multivariate skew normal models when missing values occur in the data. To facilitate the computation and simplify the theoretical derivation, two auxiliary permutation matrices are incorporated into the model to determine the observed and missing components of each observation. Under missing at random mechanisms, we formulate an analytically simple ECM algorithm for calculating parameter estimates and retrieving each missing value with a single-valued imputation. Gibbs sampling is used to perform Bayesian inference on the model parameters and to create multiple imputations for missing values. The proposed methodologies are illustrated through a real data set, and comparisons are made with results obtained from fitting the normal counterparts.

11.
We report on a sequence of two classroom teaching experiments that investigated high school students’ understandings as they explored connections among the ideas comprising the inner logic of statistical inference—ideas involving a core image of sampling as a repeatable process, and the organization of its outcomes into a distribution of sample statistics as a basis for making inferences. Students’ responses to post-instruction test questions indicate that despite understanding various individual components of inference—a sample, a population, and a distribution of a sample statistic—their abilities to coordinate and compose these into a coherent and well-connected scheme of ideas were usually tenuous. We argue that the coordination and composition required to assemble these component ideas into a coherent scheme is a major source of difficulty in developing a deep understanding of inference.

12.
Maximum likelihood estimation in finite mixture distributions is typically approached as an incomplete data problem to allow application of the expectation-maximization (EM) algorithm. In its general formulation, the EM algorithm involves the notion of a complete data space, in which the observed measurements and incomplete data are embedded. An advantage is that many difficult estimation problems are facilitated when viewed in this way. One drawback is that the simultaneous update used by standard EM requires overly informative complete data spaces, which leads to slow convergence in some situations. In the incomplete data context, it has been shown that the use of less informative complete data spaces, or equivalently smaller missing data spaces, can lead to faster convergence without sacrificing simplicity. However, in the mixture case, little progress has been made in speeding up EM. In this article we propose a component-wise EM for mixtures. It uses, at each iteration, the smallest admissible missing data space by intrinsically decoupling the parameter updates. Monotonicity is maintained, although the estimated proportions may not sum to one during the course of the iteration. However, we prove that the mixing proportions will satisfy this constraint upon convergence. Our proof of convergence relies on the interpretation of our procedure as a proximal point algorithm. For performance comparison, we consider standard EM as well as two other algorithms based on missing data space reduction, namely the SAGE and AECME algorithms. We provide adaptations of these general procedures to the mixture case. We also consider the ECME algorithm, which is not a data augmentation scheme but still aims at accelerating EM. Our numerical experiments illustrate the advantages of the component-wise EM algorithm relative to these other methods.
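
The sketch below gives one simplified reading of a component-wise EM for a univariate Gaussian mixture: each update recomputes responsibilities and refreshes a single component's proportion, mean, and variance, so the proportions need not sum to one mid-iteration. It is illustrative, not the article's exact formulation.

```python
# Minimal sketch of a component-wise EM for a univariate Gaussian
# mixture: one component is updated at a time from freshly computed
# responsibilities, so proportions may drift from summing to one during
# the run. A simplified reading of the article, not its exact algorithm.
import numpy as np
from scipy.stats import norm

def componentwise_em(x, K=2, n_sweeps=200, seed=8):
    rng = np.random.default_rng(seed)
    n = len(x)
    pi = np.full(K, 1.0 / K)
    mu = rng.choice(x, K, replace=False)
    sd = np.full(K, x.std())
    for _ in range(n_sweeps):
        for k in range(K):                            # update one component
            dens = pi * norm.pdf(x[:, None], mu, sd)  # (n, K)
            r = dens[:, k] / dens.sum(axis=1)         # responsibilities of k
            nk = r.sum()
            pi[k] = nk / n                            # others stay fixed
            mu[k] = (r @ x) / nk
            sd[k] = np.sqrt((r @ (x - mu[k]) ** 2) / nk)
    return pi / pi.sum(), mu, sd                      # renormalize to report

rng = np.random.default_rng(9)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])
print(componentwise_em(x))
```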

13.
We propose a new robustness diagnostic scheme for bootstrap inference procedures. The scheme is adaptive to the data actually observed, applies readily to bootstrap inference output of diverse format, and therefore provides robustness diagnostics practically more relevant than most conventional robustness measures. Specifically, it monitors the sensitivity of the bootstrap distribution of inference output to specially designed omnidirectional data perturbations, and quantifies findings by a standardized measure with the aid of repeated resampling. The resulting measure, displayed in the form of an R-value plot, permits direct comparisons across different bootstrap procedures and across inference output of different types. Numerical examples are presented using both simulated and real-life data to illustrate applications of the scheme to estimation and hypothesis testing problems. This article has supplementary material online.

14.
The multiset sampler, an MCMC algorithm recently proposed by Leman and coauthors, is an easy-to-implement algorithm which is especially well-suited to drawing samples from a multimodal distribution. We generalize the algorithm by redefining the multiset sampler with an explicit link between target distribution and sampling distribution. The generalized formulation replaces the multiset with a K-tuple, which allows us to use the algorithm on unbounded parameter spaces, improves estimation, and sets up further extensions to adaptive MCMC techniques. Theoretical properties of the algorithm are provided and guidance is given on its implementation. Examples, both simulated and real, confirm that the generalized multiset sampler provides a simple, general and effective approach to sampling from multimodal distributions. Supplementary materials for this article are available online.

15.
Stochastic epidemic models describe the dynamics of an epidemic as a disease spreads through a population. Typically, only a fraction of cases are observed at a set of discrete times. The absence of complete information about the time evolution of an epidemic gives rise to a complicated latent variable problem in which the state space size of the epidemic grows large as the population size increases. This makes analytically integrating over the missing data infeasible for populations of even moderate size. We present a data augmentation Markov chain Monte Carlo (MCMC) framework for Bayesian estimation of stochastic epidemic model parameters, in which measurements are augmented with subject-level disease histories. In our MCMC algorithm, we propose each new subject-level path, conditional on the data, using a time-inhomogeneous continuous-time Markov process with rates determined by the infection histories of other individuals. The method is general, and may be applied to a broad class of epidemic models with only minimal modifications to the model dynamics and/or emission distribution. We present our algorithm in the context of multiple stochastic epidemic models in which the data are binomially sampled prevalence counts, and apply our method to data from an outbreak of influenza in a British boarding school. Supplementary material for this article is available online.

16.
Incorporating statistical multiple comparisons techniques with credit risk measurement, a new methodology is proposed to construct exact confidence sets and exact confidence bands for a beta distribution. This involves simultaneous inference on the two parameters of the beta distribution, based upon the inversion of Kolmogorov tests. Some monotonicity properties of the distribution function of the beta distribution are established which enable the derivation of an efficient algorithm for the implementation of the procedure. The methodology has important applications to financial risk management. Specifically, loss given default (LGD) data are often modeled with a beta distribution. This new approach properly addresses model risk caused by inadequate sample sizes of LGD data, and can be used in conjunction with the standard recommendations provided by regulators to provide enhanced and more informative analyses.
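
A brute-force sketch of the construction: the confidence set consists of all (a, b) for which a Kolmogorov test does not reject Beta(a, b) against the sample. The grid search below stands in for the article's efficient monotonicity-based algorithm; the grids and sample are illustrative.

```python
# Minimal sketch of an exact confidence set for beta parameters by
# inverting Kolmogorov tests: keep every (a, b) whose fully specified
# Beta(a, b) is not rejected at the chosen level. A grid search stands
# in for the article's efficient algorithm.
import numpy as np
from scipy import stats

def beta_confidence_set(x, a_grid, b_grid, level=0.95):
    keep = []
    for a in a_grid:
        for b in b_grid:
            # Two-sided KS test against the fully specified Beta(a, b).
            p = stats.kstest(x, stats.beta(a, b).cdf).pvalue
            if p > 1 - level:     # not rejected: (a, b) is in the set
                keep.append((a, b))
    return keep

rng = np.random.default_rng(10)
lgd = rng.beta(2.0, 5.0, size=60)          # stand-in for an LGD sample
grid = np.linspace(0.5, 8, 40)
cs = beta_confidence_set(lgd, grid, grid)
print(len(cs), cs[:3])
```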

17.
Multiple imputation (MI) has become a standard statistical technique for dealing with missing values. The CDC Anthrax Vaccine Research Program (AVRP) dataset created new challenges for MI due to the large number of variables of different types and the limited sample size. A common method for imputing missing data in such complex studies is to specify, for each of J variables with missing values, a univariate conditional distribution given all other variables, and then to draw imputations by iterating over the J conditional distributions. Such fully conditional imputation strategies have the theoretical drawback that the conditional distributions may be incompatible. When the missingness pattern is monotone, a theoretically valid approach is to specify, for each variable with missing values, a conditional distribution given the variables with fewer or the same number of missing values and sequentially draw from these distributions. In this article, we propose the “multiple imputation by ordered monotone blocks” approach, which combines these two basic approaches by decomposing any missingness pattern into a collection of smaller “constructed” monotone missingness patterns, and iterating. We apply this strategy to impute the missing data in the AVRP interim data. Supplemental materials, including all source code and a synthetic example dataset, are available online.
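
A toy sketch of the monotone ingredient: order variables by missingness and impute each from a regression on the more fully observed ones, drawing from the fitted predictive distribution. The ordered-block decomposition, the iteration over constructed monotone patterns, and the parameter-uncertainty draws of proper MI are all omitted here.

```python
# Minimal sketch of sequential imputation under a monotone missingness
# pattern: each variable is imputed from a regression on the more fully
# observed variables. Illustrative only; proper MI would also draw the
# regression parameters from their posterior.
import numpy as np

def monotone_impute(X, rng):
    """Impute NaNs in X (n, J), assuming a monotone missingness pattern."""
    X = X.copy()
    counts = np.isnan(X).sum(axis=0)
    order = np.argsort(counts)                 # fewest missing first
    for j in order:
        miss = np.isnan(X[:, j])
        if not miss.any():
            continue
        preds = [k for k in order if counts[k] < counts[j]]  # already filled
        Z = np.column_stack([np.ones(len(X))] + [X[:, k] for k in preds])
        obs = ~miss
        coef, *_ = np.linalg.lstsq(Z[obs], X[obs, j], rcond=None)
        sigma = np.std(X[obs, j] - Z[obs] @ coef)
        X[miss, j] = Z[miss] @ coef + sigma * rng.standard_normal(miss.sum())
    return X

rng = np.random.default_rng(11)
n = 200
x1 = rng.standard_normal(n)
x2 = x1 + rng.standard_normal(n)
x2[rng.random(n) < 0.3] = np.nan               # monotone: only x2 missing
print(np.isnan(monotone_impute(np.column_stack([x1, x2]), rng)).sum())
```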

18.
We offer an inference methodology for the upper endpoint of a regularly varying distribution with finite endpoint. We apply it to the IDL and GRG data sets of lifespans of super-centenarians. As in the comprehensive analysis of Rootzén and Zholud, our results underscore the effect of the data sampling scheme and censoring on the conclusions. We also quantify the statistical difficulty of distinguishing between the hypotheses of finite and infinite lifespan by providing estimates of the required sample size.

19.
In this paper, a stochastic approximation (SA) algorithm with a new adaptive step size scheme is proposed. The new scheme uses a fixed number of previous noisy function values to adjust the step at every iteration. The algorithm is formulated for a general descent direction, and almost sure convergence is established. The case in which the negative gradient is chosen as the search direction is also considered. The algorithm is tested on a set of standard test problems. Numerical results show good performance and verify the efficiency of the algorithm compared to some existing algorithms with adaptive step sizes.
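
A sketch of such an iteration, with an illustrative adaptation rule (not the paper's scheme): the step grows while a window of recent noisy function values keeps decreasing and shrinks otherwise, and the negative noisy gradient serves as the descent direction.

```python
# Minimal sketch of stochastic approximation with a step size adapted
# from a fixed window of recent noisy function values. The adaptation
# rule (grow on improvement, shrink otherwise) is illustrative only.
import numpy as np
from collections import deque

def sa_adaptive(grad_noisy, f_noisy, x0, n_iter=500, window=5,
                step0=0.5, up=1.05, down=0.7, seed=12):
    rng = np.random.default_rng(seed)
    x, step = np.asarray(x0, float), step0
    recent = deque(maxlen=window)                  # last noisy f-values
    for _ in range(n_iter):
        recent.append(f_noisy(x, rng))
        if len(recent) == window:
            half = window // 2
            older = np.mean(list(recent)[:half])
            newer = np.mean(list(recent)[half:])
            step *= up if newer < older else down  # adapt from the window
        x = x - step * grad_noisy(x, rng)          # negative-gradient step
    return x

f = lambda x, rng: np.sum((x - 3) ** 2) + 0.1 * rng.standard_normal()
g = lambda x, rng: 2 * (x - 3) + 0.1 * rng.standard_normal(x.shape)
print(sa_adaptive(g, f, np.zeros(2)))   # should approach [3, 3]
```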

20.
In this paper we consider the generalized gamma distribution as introduced in Gåsemyr and Natvig (1998). This distribution enters naturally in Bayesian inference in exponential survival models with left censoring. In the paper mentioned above it is shown that the weighted sum of products of generalized gamma distributions is a conjugate prior for the parameters of component lifetimes, given autopsy data in a Marshall-Olkin shock model. A corresponding result is shown in Gåsemyr and Natvig (1999) for independent, exponentially distributed component lifetimes in a model with partial monitoring of components, with applications to preventive system maintenance. A discussion in the present paper strongly indicates that expressing the posterior distribution in terms of the generalized gamma distribution is computationally efficient compared to using the ordinary gamma distribution in such models. Furthermore, we present two types of sequential Metropolis-Hastings algorithms that may be used in Bayesian inference in situations where exact methods are intractable. Finally, these types of algorithms are compared with standard simulation techniques and analytical results in arriving at the posterior distribution of the parameters of component lifetimes in special cases of the mentioned models. It seems that one of these types of algorithms may be very favorable when prior assessments are updated by several data sets and when there are significant discrepancies between the prior assessments and the data.
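
For reference, the sketch below shows the generic random-walk Metropolis-Hastings step that such samplers build on, applied to a simple exponential-lifetime posterior whose answer is known in closed form. The paper's sequential variants and the generalized gamma machinery are not reproduced; the prior and data here are illustrative.

```python
# Minimal sketch of the generic random-walk Metropolis-Hastings step
# underlying samplers like those compared above: only the log posterior
# density up to a constant is needed. Not the paper's sequential scheme.
import numpy as np

def random_walk_mh(log_post, x0, n_iter=5000, scale=0.5, seed=13):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, float)
    lp = log_post(x)
    out = np.empty((n_iter, x.size))
    for t in range(n_iter):
        prop = x + scale * rng.standard_normal(x.size)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:    # Metropolis accept step
            x, lp = prop, lp_prop
        out[t] = x
    return out

# Illustrative target: exponential lifetimes with a Gamma(2, 1) prior on
# the rate, so the posterior is Gamma(2 + n, 1 + sum(data)) in closed form.
data = np.random.default_rng(14).exponential(1 / 2.5, size=50)

def log_post(th):
    rate = th[0]
    if rate <= 0:
        return -np.inf
    return (2 - 1 + len(data)) * np.log(rate) - rate * (1 + data.sum())

draws = random_walk_mh(log_post, np.array([1.0]))
print(draws[1000:].mean(), (2 + len(data)) / (1 + data.sum()))  # should agree
```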
