Similar Literature

20 similar articles found.
1.
A Gaussian measurement error assumption, that is, an assumption that the data are observed up to Gaussian noise, can bias any parameter estimation in the presence of outliers. A heavy-tailed error assumption based on Student’s t distribution helps reduce the bias. However, it may be less efficient in estimating parameters if the heavy-tailed assumption is applied uniformly to all of the data when most of them are normally observed. We propose a mixture error assumption that selectively converts Gaussian errors into Student’s t errors according to latent outlier indicators, leveraging the best of the Gaussian and Student’s t errors: parameter estimation can be not only robust but also accurate. Using simulated hospital profiling data and astronomical time series of brightness data, we demonstrate the potential of the proposed mixture error assumption to estimate parameters accurately in the presence of outliers. Supplemental materials for this article are available online.
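As a quick illustration of the latent-indicator idea, the sketch below computes the posterior probability that each observation is an outlier under an assumed Gaussian/Student's t mixture of errors. All parameter values (the outlier proportion p_out, scale sigma, and degrees of freedom nu) are illustrative assumptions, not the article's fitted values.

```python
# Minimal sketch (not the article's exact estimator): posterior outlier
# probabilities under a Gaussian / Student-t mixture error model.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
beta0, beta1, sigma, nu, p_out = 1.0, 2.0, 1.0, 3.0, 0.05

# Simulate data whose errors are mostly Gaussian with a few t-distributed outliers
is_out = rng.random(n) < p_out
eps = np.where(is_out,
               stats.t.rvs(nu, scale=sigma, size=n, random_state=rng),
               rng.normal(0, sigma, n))
y = beta0 + beta1 * x + eps

# Given current parameter values, the latent indicator's posterior probability
# follows from Bayes' rule applied to the two candidate error densities
resid = y - (beta0 + beta1 * x)
lik_t = stats.t.pdf(resid, nu, scale=sigma)
lik_norm = stats.norm.pdf(resid, scale=sigma)
post_out = p_out * lik_t / (p_out * lik_t + (1 - p_out) * lik_norm)
print("observations flagged as outliers:", np.sum(post_out > 0.5))
```

In a full estimation scheme these posterior probabilities would be updated jointly with the regression parameters; here they only illustrate how the latent indicators separate Gaussian from heavy-tailed errors.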

2.
This paper addresses the problem of modelling nonstationary time series from a finite number of observations. Problems encountered with time-varying parameters in regression-type models led to smoothing techniques. Smoothing methods basically rely on the finiteness of the error variance, and thus, when this requirement fails, particularly when the error distribution is heavy tailed, the existing smoothing methods due to [1] are no longer optimal. In this paper, we propose a penalized minimum dispersion method for time-varying parameter estimation when a regression model is generated by an infinite-variance stable process with characteristic exponent α ∈ (1, 2). Recursive estimates are evaluated, and it is shown that the estimates for a nonstationary process with normal errors are a special case.
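The paper's penalized minimum dispersion recursions are not reproduced here; as an illustration of the underlying problem, the sketch below generates regression errors from an α-stable law with α = 1.5 (infinite variance) and contrasts ordinary least squares with a least absolute deviations (LAD) fit, a simple dispersion-type criterion that remains sensible when second moments do not exist.

```python
# Illustrative sketch only: alpha-stable errors with alpha in (1, 2) break the
# finite-variance assumption behind classical smoothing; a least-absolute-
# deviations fit is one robust alternative (not the paper's method).
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)
n, alpha = 300, 1.5
x = np.linspace(0, 1, n)
eps = stats.levy_stable.rvs(alpha, 0.0, size=n, random_state=rng)  # symmetric stable
y = 2.0 + 3.0 * x + eps

X = np.column_stack([np.ones(n), x])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# LAD: minimize the sum of absolute residuals (finite first moment since alpha > 1)
b_lad = optimize.minimize(lambda b: np.abs(y - X @ b).sum(),
                          x0=b_ols, method="Nelder-Mead").x
print("OLS:", b_ols, " LAD:", b_lad)
```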

3.
In the Gaussian Kriging model, errors are assumed to follow a Gaussian process. This is reasonable in many cases, but such an assumption is not appropriate when outliers are present. Large prediction errors may occur in those cases, and more robust estimation is critical. In this article, we propose a robust estimation of Kriging parameters by utilizing loss functions other than the classical L2. In the Gaussian Kriging model, regression parameters are estimated by generalized least squares, also referred to as the L2 criterion. To make these estimators more robust to outliers, the L1 and ε-insensitive loss functions are introduced in place of L2 in this article. Mathematical programming formulations are developed upon the idea of the support vector machine. Data from a machining experiment are analysed to verify the usefulness of the proposed method.
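A minimal sketch of the loss-substitution idea follows: the same regression parameters are estimated by minimizing L2, L1, or ε-insensitive loss on the residuals. The ε value and the simple linear trend are assumptions for illustration; the article embeds these losses in a Kriging/support-vector formulation rather than the plain fit shown here.

```python
# Sketch: swapping the loss function changes sensitivity to outliers.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
x = np.linspace(0, 5, 60)
y = 1.0 + 0.8 * x + rng.normal(0, 0.2, x.size)
y[::15] += 4.0                                    # inject a few outliers
X = np.column_stack([np.ones_like(x), x])

def fit(loss):
    # Estimate (intercept, slope) by minimizing the summed loss on residuals
    return minimize(lambda b: loss(y - X @ b).sum(),
                    x0=np.zeros(2), method="Nelder-Mead").x

eps = 0.1
b_l2 = fit(lambda r: r ** 2)
b_l1 = fit(lambda r: np.abs(r))
b_ei = fit(lambda r: np.maximum(np.abs(r) - eps, 0.0))  # epsilon-insensitive
print("L2:", b_l2, " L1:", b_l1, " eps-insensitive:", b_ei)
```

The L1 and ε-insensitive fits are pulled far less by the injected outliers than the L2 fit, which is the motivation for the substitution.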

4.
Compactly supported autocovariance functions reduce computations needed for estimation and prediction under Gaussian process models, which are commonly used to model spatial and spatial-temporal data. A critical issue in using such models is the loss in statistical efficiency caused when the true autocovariance function is not compactly supported. Theoretical results indicate the value of specifying the local behavior of the process correctly. One way to obtain a compactly supported autocovariance function that has similar local behavior to an autocovariance function K of interest is to multiply K by some smooth compactly supported autocovariance function, which is called covariance tapering. This work extends previous theoretical results showing that covariance tapering has some asymptotic optimality properties as the number of observations in a fixed and bounded domain increases. However, numerical experiments show that for purposes of parameter estimation, covariance tapering often does not work as well as the simple alternative of breaking the observations into blocks and ignoring dependence across blocks. When covariance tapering is used for spatial prediction, predictions near the boundary of the observation domain are affected most. This article proposes an approach to modifying the taper to ameliorate this edge effect. In addition, a justification for a specific approach to carrying out conditional simulations based on tapered covariances is given. Supplementary materials for this article are available online.
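The basic tapering operation is easy to sketch: multiply a covariance K elementwise by a compactly supported taper, which zeroes out long-range entries and makes the covariance matrix sparse while preserving positive definiteness (Schur product theorem). The exponential covariance and the Wendland-type taper range below are illustrative assumptions, not the article's modified taper.

```python
# Sketch of covariance tapering: K_tapered = K * taper (elementwise).
import numpy as np

rng = np.random.default_rng(3)
s = np.sort(rng.uniform(0, 10, 100))          # 1-D observation locations
d = np.abs(s[:, None] - s[None, :])           # pairwise distances

K = np.exp(-d / 2.0)                          # exponential covariance, range 2
gamma = 3.0                                   # taper range (assumed)
taper = np.clip(1 - d / gamma, 0, None) ** 4 * (1 + 4 * d / gamma)  # Wendland taper
K_tapered = K * taper                         # compactly supported, still valid

print("fraction of nonzero entries after tapering:", np.mean(K_tapered > 0))
```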

5.
This article proposes a four-pronged approach to efficient Bayesian estimation and prediction for complex Bayesian hierarchical Gaussian models for spatial and spatiotemporal data. The method involves reparameterizing the covariance structure of the model, reformulating the mean structure, marginalizing the joint posterior distribution, and applying a simplex-based slice sampling algorithm. The approach permits fusion of point-source data and areal data measured at different resolutions and accommodates nonspatial correlation and variance heterogeneity as well as spatial and/or temporal correlation. The method produces Markov chain Monte Carlo samplers with low autocorrelation in the output, so that fewer iterations are needed for Bayesian inference than would be the case with other sampling algorithms. Supplemental materials are available online.
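For readers unfamiliar with the sampling ingredient, here is a generic univariate slice sampler with stepping-out (Neal, 2003) as a sketch; the article's simplex-based variant and the hierarchical spatial model itself are not reproduced.

```python
# Generic univariate slice sampler (stepping-out scheme), for illustration only.
import numpy as np

def slice_sample(logp, x0, n_iter=5000, w=1.0, seed=4):
    rng = np.random.default_rng(seed)
    xs, x = [], x0
    for _ in range(n_iter):
        log_y = logp(x) + np.log(rng.random())   # vertical level under the density
        lo = x - w * rng.random()                # randomly positioned initial bracket
        hi = lo + w
        while logp(lo) > log_y:                  # step out until outside the slice
            lo -= w
        while logp(hi) > log_y:
            hi += w
        while True:                              # sample in bracket, shrink on rejects
            x_new = rng.uniform(lo, hi)
            if logp(x_new) > log_y:
                x = x_new
                break
            if x_new < x:
                lo = x_new
            else:
                hi = x_new
        xs.append(x)
    return np.array(xs)

# Example: sample a standard normal from its log-density
draws = slice_sample(lambda z: -0.5 * z * z, x0=0.0)
print(round(draws.mean(), 2), round(draws.std(), 2))   # near 0 and 1
```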

6.

A multiple linear regression model based on normally distributed and uncorrelated errors is a popular statistical tool with applications in various fields. But these assumptions of normality and no serial correlation are rarely met in real life. Hence, this study considers a linear regression time series model for series with outliers and autocorrelated errors. The autocorrelated errors are represented by a covariance-stationary autoregressive process whose independent innovations are driven by a shape mixture of the skew-t normal distribution. The shape mixture of the skew-t normal distribution is a flexible extension of the skew-t normal with an additional shape parameter that controls skewness and kurtosis. With this error model, stochastic modeling of multiple outliers is possible, with an adaptive robust maximum likelihood estimation of all the parameters. An Expectation Conditional Maximization Either (ECME) algorithm is developed to carry out the maximum likelihood estimation. We derive asymptotic standard errors of the estimators through an information-based approximation. The performance of the estimation procedure is evaluated through Monte Carlo simulations and real-life data analysis.


7.
The Gaussian geostatistical model has been widely used for modeling spatial data. However, this model suffers from a severe difficulty in computation: it requires users to invert a large covariance matrix. This is infeasible when the number of observations is large. In this article, we propose an auxiliary lattice-based approach for tackling this difficulty. By introducing an auxiliary lattice to the space of observations and defining a Gaussian Markov random field on the auxiliary lattice, our model completely avoids the requirement of matrix inversion. It is remarkable that the computational complexity of our method is only O(n), where n is the number of observations. Hence, our method can be applied to very large datasets with reasonable computational (CPU) times. The numerical results indicate that our model can approximate Gaussian random fields very well in terms of predictions, even for those with long correlation lengths. For real data examples, our model can generally outperform conventional Gaussian random field models in both prediction errors and CPU times. Supplemental materials for the article are available online.
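The source of the computational saving is that a GMRF is specified through a sparse precision (inverse covariance) matrix rather than a dense covariance. The sketch below builds the precision matrix of a first-order GMRF on an m-by-m lattice to show that its nonzero count grows linearly in n; the precision parameters kappa and tau are illustrative assumptions, and the article's auxiliary-lattice construction is not reproduced.

```python
# Sketch: sparse precision matrix of a first-order GMRF on a 2-D lattice.
import numpy as np
import scipy.sparse as sp

m = 50                                             # lattice side; n = m * m nodes
I = sp.identity(m, format="csr")
D = sp.diags([np.ones(m - 1), np.ones(m - 1)], [-1, 1], format="csr")
A = sp.kron(I, D) + sp.kron(D, I)                  # adjacency of the 2-D lattice

kappa, tau = 0.1, 1.0                              # assumed precision parameters
deg = np.asarray(A.sum(axis=1)).ravel()            # node degrees
Q = tau * (sp.diags(deg) + kappa * sp.identity(m * m) - A)  # diagonally dominant

print("n =", m * m, " nonzeros in Q:", Q.nnz)      # nonzeros grow as O(n)
```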

8.
This paper establishes the uniform closeness of a weighted residual empirical process to its natural estimate in the linear regression setting when the errors are Gaussian, or a function of Gaussian random variables, and are strictly stationary and long-range dependent. This result is used to yield the asymptotic uniform linearity of a class of rank statistics in linear regression models with long-range dependent errors. The latter result, in turn, yields the asymptotic distribution of the Jaeckel (1972) rank estimators. The paper also studies the least absolute deviation and a class of certain minimum distance estimators of regression parameters, and kernel-type density estimators of the marginal error density, when the errors are long-range dependent.

9.
In this article, a new model for decision making under uncertainty is presented. We model human attitude toward risk to show that an individual's estimate of the expected utility of a lottery follows a generalized Beta distribution, with a random error that follows a similar distribution. An individual is said to maximize his stochastic utility when asked to state his preference between risky lotteries. Hypothetically, risky lotteries are those exhibiting wider ranges of rewards, where the human estimate will fall neither below the utility of the lowest reward nor above that of the highest reward of the lottery. The Beta distribution is bounded and complies with such intuitive preconditions, with a variance depending on those bounds. The proposed model overestimates or underestimates the expected utility of a lottery according to the lottery's probability mass and the individual's risk attitude. By such estimation, our model conforms to the fourfold choice pattern. The model also explains the violations present as inconsistencies in expected utility theory, such as the Allais paradox, the common consequence effect, the common ratio effect, and the violation of betweenness found in the fourfold choice pattern. For validation purposes, 13 datasets from the literature were collected and tested. The β-SU model fits the data at least as well as other approaches such as the CPT/StEUT, and it yields higher prediction log-likelihoods and smaller sums of squared errors in most cases, which supports the proposition that human estimates of the expected utility may be drawn from a generalized Beta distribution.
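A hedged sketch of the bounded-estimate idea: an individual's estimate of a lottery's expected utility is drawn from a Beta distribution rescaled to the interval between the utilities of the worst and best rewards, so estimates can never leave those bounds. The shape parameters below are illustrative, not the paper's fitted values.

```python
# Sketch: stochastic expected-utility estimates from a scaled Beta distribution.
import numpy as np

rng = np.random.default_rng(5)
u_low, u_high = 0.2, 0.9          # utilities of the lottery's worst and best rewards
eu_true = 0.55                    # "true" expected utility, inside the bounds

# Pick Beta shapes whose mean matches eu_true on the rescaled [0, 1] range
mean01 = (eu_true - u_low) / (u_high - u_low)
conc = 20.0                       # concentration: larger => less variable estimates
a, b = mean01 * conc, (1 - mean01) * conc

estimates = u_low + (u_high - u_low) * rng.beta(a, b, size=10_000)
print("mean estimate:", estimates.mean().round(3),
      " all within bounds:", bool(np.all((estimates >= u_low) & (estimates <= u_high))))
```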

10.
Complete randomization of many industrial and agricultural experiments is frequently impractical due to constraints in time, cost, or the existence of one or more hard-to-change factors. In these situations, restrictions on randomization lead to split-plot designs, in which certain factor levels are randomly applied to the whole-plot units and the remaining factor levels are randomly applied to the subplot units. The separate random errors in the whole-plot and subplot units arising from the two randomizations result in a compound symmetric error structure, which affects estimation, inference, and choice of design. In this article, we consider the prediction properties of split-plot designs, expanding the comparison between designs beyond parameter estimation properties, and present three-dimensional variance dispersion graphs. These graphs can be used for evaluating the prediction capability of split-plot designs as well as for developing design strategies. We demonstrate, through several examples, that three-dimensional variance dispersion plots offer a more comprehensive study of competing designs than comparisons based on single-number optimality criteria (such as the D-, G-, and V-criteria).
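The quantity such graphs summarize is the prediction variance over the design region. Under a compound symmetric error covariance V, the (unscaled) prediction variance at a point x is x'(X'V⁻¹X)⁻¹x. The tiny two-factor split-plot design and variance components below are assumptions for illustration only.

```python
# Sketch: prediction variance under a split-plot compound symmetric error structure.
import numpy as np

# Whole-plot factor z (hard to change), subplot factor s; 2 whole plots of 4 runs
z = np.repeat([-1.0, 1.0], 4)
s = np.tile([-1.0, -1.0, 1.0, 1.0], 2)
X = np.column_stack([np.ones(8), z, s, z * s])     # intercept, z, s, interaction

sigma2, sigma2_w = 1.0, 0.5                        # subplot and whole-plot variances
same_wp = (z[:, None] == z[None, :]).astype(float) # same-whole-plot indicator
V = sigma2 * np.eye(8) + sigma2_w * same_wp        # compound symmetric covariance

M_inv = np.linalg.inv(X.T @ np.linalg.solve(V, X)) # (X' V^-1 X)^-1

def pred_var(zz, ss):
    x = np.array([1.0, zz, ss, zz * ss])
    return float(x @ M_inv @ x)

print("center:", pred_var(0.0, 0.0), " corner:", pred_var(1.0, 1.0))
```

A variance dispersion graph plots summaries of this quantity (min/avg/max) over spheres or shells of the design region; the sketch only evaluates it at two points.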

11.

Simultaneous confidence bands of a regression curve may be used to quantify the uncertainty of an estimate of the curve. The tube formula for volumes of tubular neighborhoods of a manifold provides a very powerful method for obtaining such bands at a prescribed level when errors are Gaussian. This article studies the robustness of the tube formula for non-Gaussian errors. The formula holds without modification for an error vector with a spherically symmetric distribution. Simulations are used for a variety of independent non-Gaussian error distributions. The results are acceptable for contaminated and heavy-tailed error distributions. The formula can break down in some extreme cases for discrete and highly skewed errors. Computational issues involved in applying the tube formula are also discussed.

12.
Multilevel (hierarchical) modeling is a generalization of linear and generalized linear modeling in which regression coefficients are themselves given a model, whose parameters are also estimated from data. A multilevel model typically fails to fit well via the EM algorithm once one of the level error variances tends to infinity (as for a Cauchy distribution). This paper proposes a composite multilevel quantile regression model that combines the nested structure of multilevel data with the robustness of composite quantile regression, which greatly improves the efficiency and precision of the estimation. The new approach, which is based on Gauss-Seidel iteration and takes full advantage of composite quantile regression and multilevel models, still works well when the error variance tends to infinity. We show that even when the error distribution is normal, the MSE of the composite multilevel quantile regression estimator nearly equals that of mean regression. When the error distribution is not normal, our method still enjoys great advantages in terms of estimation efficiency.
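A minimal composite quantile regression sketch follows: one shared slope, one intercept per quantile level, fitted by minimizing the summed pinball (check) losses. The Cauchy errors mimic the infinite-variance setting; the paper's Gauss-Seidel iteration over a multilevel structure is not reproduced.

```python
# Sketch: composite quantile regression with a shared slope across quantiles.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
n = 200
x = rng.uniform(-2, 2, n)
y = 1.5 * x + rng.standard_cauchy(n)         # errors with infinite variance

taus = np.array([0.1, 0.3, 0.5, 0.7, 0.9])   # quantile levels being composited

def loss(theta):
    b, intercepts = theta[0], theta[1:]      # shared slope b, intercept per tau
    total = 0.0
    for tau, a in zip(taus, intercepts):
        r = y - (a + b * x)
        total += np.sum(np.maximum(tau * r, (tau - 1) * r))   # check loss
    return total

fit = minimize(loss, np.zeros(1 + taus.size),
               method="Nelder-Mead", options={"maxiter": 5000})
print("composite slope estimate:", round(fit.x[0], 3))        # near 1.5
```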

13.
This paper describes the identification of nonlinear dynamic systems with a Gaussian process (GP) prior model. This model is an example of the use of a probabilistic non-parametric modelling approach. GPs are flexible models capable of modelling complex nonlinear systems. Also, an attractive feature of this model is that the variance associated with the model response is readily obtained, and it can be used to highlight areas of the input space where prediction quality is poor, owing to the lack of data or complexity (high variance). We illustrate the GP modelling technique on a simulated example of a nonlinear system.
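A standard GP regression sketch illustrates the point about the predictive variance flagging data-poor regions; the squared-exponential kernel and noise level are illustrative choices, not the paper's identified model.

```python
# Sketch: GP posterior mean and variance; variance inflates where data are absent.
import numpy as np

def sqexp(a, b, ell=0.5, sf=1.0):
    # Squared-exponential kernel on 1-D inputs
    return sf**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

rng = np.random.default_rng(7)
x_tr = np.concatenate([rng.uniform(0, 2, 15), rng.uniform(4, 5, 5)])  # gap in (2, 4)
y_tr = np.sin(2 * x_tr) + rng.normal(0, 0.1, x_tr.size)
x_te = np.linspace(0, 5, 200)

sn2 = 0.1**2                                     # observation noise variance
K = sqexp(x_tr, x_tr) + sn2 * np.eye(x_tr.size)
Ks = sqexp(x_te, x_tr)

mean = Ks @ np.linalg.solve(K, y_tr)
var = sqexp(x_te, x_te).diagonal() - np.einsum(
    "ij,ji->i", Ks, np.linalg.solve(K, Ks.T))    # posterior variance at test points

gap = (x_te > 2) & (x_te < 4)
print("max predictive sd inside the data gap:", np.sqrt(var[gap]).max().round(3))
```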

14.
We introduce a class of spatiotemporal models for Gaussian areal data. These models assume a latent random field process that evolves through time with random field convolutions; the convolving fields follow proper Gaussian Markov random field (PGMRF) processes. At each time, the latent random field process is linearly related to observations through an observational equation with errors that also follow a PGMRF. The use of PGMRF errors brings modeling and computational advantages. With respect to modeling, it allows more flexible model structures such as different but interacting temporal trends for each region, as well as distinct temporal gradients for each region. Computationally, building upon the fact that PGMRF errors have proper density functions, we have developed an efficient Bayesian estimation procedure based on Markov chain Monte Carlo with an embedded forward information filter backward sampler (FIFBS) algorithm. We show that, when compared with the traditional one-at-a-time Gibbs sampler, our novel FIFBS-based algorithm explores the posterior distribution much more efficiently. Finally, we have developed a simulation-based conditional Bayes factor suitable for the comparison of nonnested spatiotemporal models. An analysis of the number of homicides in Rio de Janeiro State illustrates the power of the proposed spatiotemporal framework.

Supplemental materials for this article are available online on the journal’s webpage.

15.
In this paper we consider the estimation of the error distribution in a heteroscedastic nonparametric regression model with multivariate covariates. As estimator we consider the empirical distribution function of residuals, which are obtained from multivariate local polynomial fits of the regression and variance functions, respectively. Weak convergence of the empirical residual process to a Gaussian process is proved. We also consider various applications for testing model assumptions in nonparametric multiple regression. The model tests obtained are able to detect local alternatives that converge to zero at an n^{-1/2} rate, independent of the covariate dimension. We consider in detail a test for additivity of the regression function.

16.
In this paper, we use simulations to investigate the relationship between data envelopment analysis (DEA) efficiency and major production functions: Cobb-Douglas, the constant elasticity of substitution, and the transcendental logarithmic. Two DEA models were used: a constant returns to scale model (CCR) and a variable returns to scale model (BCC). Each of the models was investigated in two versions: with bounded and unbounded weights. Two cases were simulated: with and without errors in the estimation of the production functions. Various degrees of homogeneity (of the production function) were tested, reflecting constant, increasing, and decreasing returns to scale. For the case with errors, three distribution functions were utilized: uniform, normal, and double exponential. For each distribution, 16 levels of the coefficient of variation (CV) were used. In all the tested cases, two measures were analysed: the percentage of efficient units (out of the total number of units) and the average efficiency score. We applied a regression analysis to test the relationship between these two efficiency measures and the above parameters. Overall, we found that the degree of homogeneity has the largest effect on efficiency. Efficiency declines as the errors grow (as reflected by a larger CV and by the expansion of the probability distribution function away from the centre). Bounds on the weights tend to smooth the effect and bring the various DEA versions closer to one another. The type of efficiency measure shows similar regression tendencies. Finally, the relationship between the efficiency measures and the explanatory variables is quadratic.
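For concreteness, the CCR (constant returns to scale) model in multiplier form can be solved as a linear program per decision-making unit: maximize the weighted output of unit o subject to its weighted input equaling 1 and no unit exceeding efficiency 1. The data below are made up for illustration; the BCC variant and the weight bounds discussed above would add constraints to this sketch.

```python
# Sketch: input-oriented CCR DEA efficiency via linear programming.
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0, 3.0], [4.0, 1.0], [3.0, 3.0], [5.0, 2.0]])   # inputs (4 DMUs x 2)
Y = np.array([[1.0], [1.0], [1.5], [2.0]])                        # outputs (4 DMUs x 1)

def ccr_efficiency(o):
    n_in, n_out = X.shape[1], Y.shape[1]
    # Decision variables: [v (input weights), u (output weights)], all >= 0
    c = np.concatenate([np.zeros(n_in), -Y[o]])            # maximize u'y_o
    A_ub = np.hstack([-X, Y])                              # u'y_j - v'x_j <= 0 for all j
    b_ub = np.zeros(X.shape[0])
    A_eq = np.concatenate([X[o], np.zeros(n_out)])[None]   # normalization v'x_o = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=[1.0], bounds=(0, None))
    return -res.fun

print([round(ccr_efficiency(o), 3) for o in range(X.shape[0])])
```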

17.
This paper studies properties of an estimator of mean-variance portfolio weights in a market model with multiple risky assets and a riskless asset. Theoretical formulas for the mean square error are derived in the case when asset excess returns are multivariate normally distributed and serially independent. The sensitivity of the portfolio estimator to errors arising from the estimation of the covariance matrix and the mean vector is quantified. It turns out that the relative contribution of the covariance matrix error depends mainly on the Sharpe ratio of the market portfolio and the sampling frequency of historical data. The theoretical studies are complemented by an investigation of the distribution of the portfolio estimator for empirical datasets. An appropriately crafted bootstrapping method is employed to compute the empirical mean square error. Empirical and theoretical estimates are in good agreement, with the empirical values being, in general, higher.
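A sketch of the plug-in weights and a simple bootstrap estimate of their mean square error follows; the paper's theoretical MSE formulas and its specific bootstrap construction are not reproduced, and simulated normal excess returns stand in for the empirical datasets.

```python
# Sketch: plug-in tangency-portfolio weights and a bootstrap MSE estimate.
import numpy as np

rng = np.random.default_rng(8)
mu = np.array([0.05, 0.08, 0.03])                   # assumed mean excess returns
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.03]])              # assumed covariance
R = rng.multivariate_normal(mu, Sigma, size=250)    # serially independent sample

def tangency(returns):
    # Plug-in weights proportional to Sigma_hat^{-1} mu_hat, normalized to sum to 1
    m, S = returns.mean(axis=0), np.cov(returns, rowvar=False)
    w = np.linalg.solve(S, m)
    return w / w.sum()

w_true = np.linalg.solve(Sigma, mu)
w_true /= w_true.sum()

boot = np.array([tangency(R[rng.integers(0, len(R), len(R))]) for _ in range(500)])
print("bootstrap MSE per asset:", ((boot - w_true) ** 2).mean(axis=0))
```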

18.
Knowledge of the probability distribution of the error in a regression problem plays an important role in verifying an assumed regression model, making inferences about predictions, finding optimal regression estimates, and suggesting confidence bands and goodness-of-fit tests, as well as in many other issues of regression analysis. This article is devoted to optimal estimation of the error probability density in a general heteroscedastic regression model with possibly dependent predictors and regression errors. Neither the design density nor the regression function nor the scale function is assumed to be known, but they are supposed to be differentiable, and the estimated error density is supposed to have finite support and to be at least twice differentiable. Under these assumptions the article proves, for the first time in the literature, that it is possible to estimate the regression error density with the accuracy of an oracle that knows the “true” underlying regression errors. Real and simulated examples illustrate the importance of error density estimation as well as the suggested oracle methodology and the method of estimation.

19.
In the context of the semi-functional partial linear regression model, we study the problem of error density estimation. The unknown error density is approximated by a mixture of Gaussian densities with means equal to the individual residuals and variance a constant parameter. This mixture error density has the form of a kernel density estimator of the residuals, where the regression function, consisting of parametric and nonparametric components, is estimated by ordinary least squares and functional Nadaraya–Watson estimators. The estimation accuracy of the ordinary least squares and functional Nadaraya–Watson estimators jointly depends on the same bandwidth parameter. A Bayesian approach is proposed to simultaneously estimate the bandwidths in the kernel-form error density and in the regression function. Under the kernel-form error density, we derive a kernel likelihood and posterior for the bandwidth parameters. For estimating the regression function and error density, a series of simulation studies shows that the Bayesian approach yields better accuracy than the benchmark functional cross-validation. In an illustration with a spectroscopy data set, we find that the Bayesian approach gives better point forecast accuracy of the regression function than functional cross-validation and is capable of producing prediction intervals nonparametrically.
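The kernel-form error density described above is easy to sketch: a mixture of Gaussians centered at the residuals with a common bandwidth, i.e., a kernel density estimator of the residuals. Here the residuals are simulated and the bandwidth is fixed by assumption; in the article they come from the semi-functional partial linear fit, with the bandwidth estimated by the Bayesian method.

```python
# Sketch: kernel-form error density = Gaussian mixture centered at residuals.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(9)
resid = rng.standard_t(5, size=150)     # stand-in residuals
h = 0.4                                 # bandwidth (assumed, not Bayes-estimated)

def error_density(e):
    # f_hat(e) = (1/n) * sum_i N(e; resid_i, h^2)
    return norm.pdf((e - resid[:, None]) / h).mean(axis=0) / h

grid = np.linspace(-5, 5, 201)
print("integrates to ~1:", round(float(np.trapz(error_density(grid), grid)), 3))
```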

20.
A linear functional errors-in-variables model with unknown slope parameter and Gaussian errors is considered. The measurement error variance is supposed to be known, while the variance of the errors in the equation is unknown. In this model, a risk bound of asymptotic minimax type for arbitrary estimators is established. The bound lies above the one found previously for the case in which both variances are known. The bound is attained by an adjusted least-squares estimator.
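A worked sketch of the adjusted least-squares idea in the no-intercept case: naive least squares is attenuated toward zero because the observed regressors carry measurement error, and subtracting the known measurement error variance from the denominator corrects the bias. All numbers below are illustrative.

```python
# Sketch: adjusted least-squares slope in a functional errors-in-variables model.
import numpy as np

rng = np.random.default_rng(10)
n, beta, sig_delta = 500, 2.0, 0.5       # sig_delta^2 is the known ME variance
xi = rng.uniform(-2, 2, n)               # unobserved true regressors
x = xi + rng.normal(0, sig_delta, n)     # regressors observed with error
y = beta * xi + rng.normal(0, 0.3, n)    # equation error variance unknown

b_naive = np.sum(x * y) / np.sum(x ** 2)                        # biased toward zero
b_als = np.sum(x * y) / (np.sum(x ** 2) - n * sig_delta ** 2)   # adjusted LS
print("naive:", round(b_naive, 3), " adjusted:", round(b_als, 3))
```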
