1.

Selecting the Number of Knots for Penalized Splines

《Journal of computational and graphical statistics》2013,22(4):735-757

Penalized splines, or P-splines, are regression splines fit by least-squares with a roughness penalty. P-splines have much in common with smoothing splines, but the type of penalty used with a P-spline is somewhat more general than for a smoothing spline. Also, the number and location of the knots of a P-spline are not fixed as with a smoothing spline. Generally, the knots of a P-spline are at fixed quantiles of the independent variable and the only tuning parameters to choose are the number of knots and the penalty parameter. In this article, the effects of the number of knots on the performance of P-splines are studied. Two algorithms are proposed for the automatic selection of the number of knots. The myopic algorithm stops when no improvement in the generalized cross-validation statistic (GCV) is noticed with the last increase in the number of knots. The full search examines all candidates in a fixed sequence of possible numbers of knots and chooses the candidate that minimizes GCV. The myopic algorithm works well in many cases but can stop prematurely. The full-search algorithm worked well in all examples examined. A Demmler–Reinsch type diagonalization for computing univariate and additive P-splines is described. The Demmler–Reinsch basis is not effective for smoothing splines because smoothing splines have too many knots. For P-splines, however, the Demmler–Reinsch basis is very useful for super-fast generalized cross-validation.
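The two knot-selection strategies described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the truncated-power basis, the ridge-type penalty on the knot coefficients, the penalty grid `lambdas`, the candidate sequence, and every function name are assumptions chosen for the example, and the stopping rule is a simplified "stop at the first increase that does not lower GCV".

```python
import numpy as np


def design_matrix(x, n_knots, degree=1):
    """Truncated-power basis with knots at equally spaced sample quantiles of x."""
    probs = np.arange(1, n_knots + 1) / (n_knots + 1)
    knots = np.quantile(x, probs)
    poly = np.vander(x, degree + 1, increasing=True)  # columns 1, x, ..., x^degree
    trunc = np.clip(x[:, None] - knots[None, :], 0.0, None) ** degree
    return np.hstack([poly, trunc]), degree + 1


def gcv_score(x, y, n_knots, lam, degree=1):
    """GCV = n * RSS / (n - edf)^2 with a ridge-type penalty on knot coefficients."""
    X, n_poly = design_matrix(x, n_knots, degree)
    penalty = np.zeros(X.shape[1])
    penalty[n_poly:] = 1.0  # the polynomial part is left unpenalized
    solve_part = np.linalg.solve(X.T @ X + lam * np.diag(penalty), X.T)
    fitted = X @ (solve_part @ y)
    edf = np.sum(X * solve_part.T)  # trace of the smoother matrix
    n = len(y)
    return n * np.sum((y - fitted) ** 2) / (n - edf) ** 2


def best_gcv(x, y, n_knots, lambdas):
    """Smallest GCV over the penalty grid for a given number of knots."""
    return min(gcv_score(x, y, n_knots, lam) for lam in lambdas)


def full_search(x, y, candidates, lambdas):
    """Examine every candidate number of knots and return the GCV minimizer."""
    scores = [best_gcv(x, y, k, lambdas) for k in candidates]
    return candidates[int(np.argmin(scores))]


def myopic_search(x, y, candidates, lambdas):
    """Stop at the first increase in knots that does not improve GCV."""
    best_k = candidates[0]
    best = best_gcv(x, y, best_k, lambdas)
    for k in candidates[1:]:
        score = best_gcv(x, y, k, lambdas)
        if score >= best:
            break
        best_k, best = k, score
    return best_k


# Hypothetical demo data: a function whose wiggliness grows with x.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(6 * np.pi * x**2) + rng.normal(0.0, 0.3, 200)
candidates = [5, 10, 20, 40]
lambdas = np.logspace(-4, 2, 20)

k_full = full_search(x, y, candidates, lambdas)
k_myopic = myopic_search(x, y, candidates, lambdas)
print("full search:", k_full, "myopic:", k_myopic)
```

Because the myopic rule only ever returns one of the candidates, its GCV can never be lower than the full-search value; the two agree whenever GCV decreases monotonically up to its minimum along the candidate sequence.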

2.

Variable Selection in Varying-Coefficient Models Using P-Splines

Anestis Antoniadis Irène Gijbels Anneleen Verhasselt 《Journal of computational and graphical statistics》2013,22(3):638-661

In this article, we consider nonparametric smoothing and variable selection in varying-coefficient models. Varying-coefficient models are commonly used for analyzing the time-dependent effects of covariates on responses measured repeatedly (such as longitudinal data). We present the P-spline estimator in this context and show its estimation consistency for a diverging number of knots (or B-spline basis functions). The combination of P-splines with nonnegative garrote (which is a variable selection method) leads to good estimation and variable selection. Moreover, we consider APSO (additive P-spline selection operator), which combines a P-spline penalty with a regularization penalty, and show its estimation and variable selection consistency. The methods are illustrated with a simulation study and real-data examples. The proofs of the theoretical results as well as one of the real-data examples are provided in the online supplementary materials.

3.

Noninformative priors for the ratio of the shape parameters of two Weibull distributions

Sang Gil Kang Woo Dong Lee Yongku Kim 《Computational Statistics》2017,32(1):35-50

The Weibull distribution is one of the most widely used lifetime distributions in reliability engineering. Here, the noninformative priors for the ratio of the shape parameters of two Weibull models are introduced. The first criterion used is the asymptotic matching of the coverage probabilities of Bayesian credible intervals with the corresponding frequentist coverage probabilities. We develop the probability matching priors for the ratio of the shape parameters using the following matching criteria: quantile matching, matching of the distribution function, highest posterior density matching, and matching via inversion of the test statistics. We obtain one particular prior that meets all the matching criteria. Next, we derive the reference priors for different groups of ordering. Our findings show that some of the reference priors satisfy a first-order matching criterion and the one-at-a-time reference prior is a second-order matching prior. Lastly, we perform a simulation study and provide a real-world example.

4.

Coverage of generalized confidence intervals

Anindya Roy Arup Bose 《Journal of multivariate analysis》2009,100(7):1384-1397

Generalized confidence intervals provide confidence intervals for complicated parametric functions in many common practical problems. They do not have exact frequentist coverage in general, but often provide coverage close to the nominal value and have the correct asymptotic coverage. However, in many applications generalized confidence intervals do not have satisfactory finite sample performance. We derive expansions of coverage probabilities of one-sided generalized confidence intervals and use the expansions to explain the nonuniform performance of the generalized intervals. We then show how to use these expansions to obtain improved coverage by suitable calibration. The benefits of the proposed modification are illustrated via several examples.

5.

A note on approximate Bayesian credible sets based on modified loglikelihood ratios

Laura Ventura Erlis Ruli Walter Racugno 《Statistics & probability letters》2013

Higher-order asymptotic arguments for a scalar parameter of interest have been widely investigated for Bayesian inference. In this paper the theory of asymptotic expansions is discussed for a vector parameter of interest. A modified loglikelihood ratio is suggested, which can be used to derive approximate Bayesian credible sets with accurate frequentist coverage. Three examples are illustrated.

6.

Objective Bayesian analysis for CAR models

Cuirong Ren Dongchu Sun 《Annals of the Institute of Statistical Mathematics》2013,65(3):457-472

Objective priors, especially reference priors, have been studied extensively for spatial data in the last decade. In this paper, we study objective priors for a CAR model. In particular, the properties of the reference prior and the corresponding posterior are studied. Furthermore, we show that the frequentist coverage probabilities of posterior credible intervals depend only on the spatial dependence parameter $\rho$, and not on the regression coefficient or the error variance. In a simulation study comparing the reference and Jeffreys priors, the two reference priors perform similarly, and both perform better than the Jeffreys priors. One spatial dataset is used for illustration.

7.

Simultaneous credible intervals for small area estimation problems

N. Ganesh 《Journal of multivariate analysis》2009,100(8):1610-1621

In this paper, we fill in an important research gap in small area literature, namely the problem of constructing simultaneous credible intervals. We illustrate how the Bayesian approach can be applied to develop different simultaneous credible interval procedures. The utility of our method is illustrated through simulation and data analysis.

8.

Empirical likelihood for single-index models

Liu-Gen Xue Lixing Zhu 《Journal of multivariate analysis》2006,97(6):1295-1312

The empirical likelihood method is especially useful for constructing confidence intervals or regions of the parameter of interest. This method has been extensively applied to linear regression and generalized linear regression models. In this paper, the empirical likelihood method for single-index regression models is studied. An estimated empirical log-likelihood approach to construct the confidence region of the regression parameter is developed. An adjusted empirical log-likelihood ratio is proved to be asymptotically standard chi-square. A simulation study indicates that, compared with a normal approximation-based approach, the proposed method works better in terms of coverage probabilities and areas (lengths) of confidence regions (intervals).

9.

Bayesian networks with a logistic regression model for the conditional probabilities

Frank Rijmen 《International Journal of Approximate Reasoning》2008,48(2):659-666

Logistic regression techniques can be used to restrict the conditional probabilities of a Bayesian network for discrete variables. More specifically, each variable of the network can be modeled through a logistic regression model, in which the parents of the variable define the covariates. When all main effects and interactions between the parent variables are incorporated as covariates, the conditional probabilities are estimated without restrictions, as in a traditional Bayesian network. By incorporating interaction terms up to a specific order only, the number of parameters can be drastically reduced. Furthermore, ordered logistic regression can be used when the categories of a variable are ordered, resulting in even more parsimonious models. Parameters are estimated by a modified junction tree algorithm. The approach is illustrated with the Alarm network.

10.

Priors for Bayesian adaptive spline smoothing

Yu Ryan Yue Paul L. Speckman Dongchu Sun 《Annals of the Institute of Statistical Mathematics》2012,64(3):577-613

Adaptive smoothing has been proposed for curve-fitting problems where the underlying function is spatially inhomogeneous. Two Bayesian adaptive smoothing models, Bayesian adaptive smoothing splines on a lattice and Bayesian adaptive P-splines, are studied in this paper. Estimation is fully Bayesian and carried out by efficient Gibbs sampling. Choice of prior is critical in any Bayesian non-parametric regression method. We use objective priors on the first level parameters where feasible, specifically independent Jeffreys priors (right Haar priors) on the implied base linear model and error variance, and we derive sufficient conditions on higher level components to ensure that the posterior is proper. Through simulation, we demonstrate that the common practice of approximating improper priors by proper but diffuse priors may lead to invalid inference, and we show how appropriate choices of proper but only weakly informative priors yield satisfactory inference.

11.

Bayesian Calibration and Uncertainty Analysis for Computationally Expensive Models Using Optimization and Radial Basis Function Approximation

《Journal of computational and graphical statistics》2013,22(2):270-294

We present a Bayesian approach to model calibration when evaluation of the model is computationally expensive. Here, calibration is a nonlinear regression problem: given a data vector Y corresponding to the regression model f(β), find plausible values of β. As an intermediate step, Y and f are embedded into a statistical model allowing transformation and dependence. Typically, this problem is solved by sampling from the posterior distribution of β given Y using MCMC. To reduce computational cost, we limit evaluation of f to a small number of points chosen on a high posterior density region found by optimization. Then, we approximate the logarithm of the posterior density using radial basis functions and use the resulting cheap-to-evaluate surface in MCMC. We illustrate our approach on simulated data for a pollutant diffusion problem and study the frequentist coverage properties of credible intervals. Our experiments indicate that our method can produce results similar to those when the true “expensive” posterior density is sampled by MCMC while reducing computational costs by well over an order of magnitude.

12.

Modelling LGD for unsecured retail loans using Bayesian methods

Katarzyna Bijak Lyn C Thomas 《The Journal of the Operational Research Society》2015,66(2):342-352

Loss Given Default (LGD) is the loss borne by the bank when a customer defaults on a loan. LGD for unsecured retail loans is often found difficult to model. In the frequentist (non-Bayesian) two-step approach, two separate regression models are estimated independently, which can be considered potentially problematic when trying to combine them to make predictions about LGD. The result is a point estimate of LGD for each loan. Alternatively, LGD can be modelled using Bayesian methods. In the Bayesian framework, one can build a single, hierarchical model instead of two separate ones, which makes this a more coherent approach. In this paper, Bayesian methods as well as the frequentist approach are applied to the data on personal loans provided by a large UK bank. As expected, the posterior means of parameters that have been produced using Bayesian methods are very similar to the frequentist estimates. The most important advantage of the Bayesian model is that it generates an individual predictive distribution of LGD for each loan. Potential applications of such distributions include the downturn LGD and the stressed LGD under Basel II.

13.

Simultaneous confidence intervals for several inverse Gaussian populations

《Statistics & probability letters》2014

In this research, we propose simultaneous confidence intervals for all pairwise comparisons of means from inverse Gaussian distributions. Our method is based on fiducial generalized pivotal quantities for vector parameters. We prove that the constructed confidence intervals have asymptotically correct coverage probabilities. Simulation results show that the simulated Type-I errors are close to the nominal level even for small samples. The proposed approach is illustrated by an example.

14.

Simultaneous estimation and variable selection in median regression using Lasso-type penalty

Jinfeng Xu Zhiliang Ying 《Annals of the Institute of Statistical Mathematics》2010,62(3):487-514

We consider median regression with a LASSO-type penalty term for variable selection. With a fixed number of variables in the regression model, a two-stage method is proposed for simultaneous estimation and variable selection where the degree of penalty is adaptively chosen. A Bayesian information criterion type approach is proposed and used to obtain a data-driven procedure which is proved to automatically select asymptotically optimal tuning parameters. It is shown that the resultant estimator achieves the so-called oracle property. The combination of median regression and the LASSO penalty is computationally easy to implement via standard linear programming. A random perturbation scheme can be used to obtain a simple estimator of the standard error. Simulation studies are conducted to assess the finite-sample performance of the proposed method. We illustrate the methodology with a real example.
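The remark that median regression with a LASSO penalty reduces to linear programming can be illustrated with a short sketch. It assumes a plain L1 penalty with a single fixed tuning parameter `lam`, not the adaptive two-stage procedure of the article; the function name `lad_lasso` and the demo data are hypothetical. The coefficients and residuals are each split into nonnegative parts, so that minimizing sum |y_i - x_i'β| + lam * sum |β_j| becomes a linear objective under an equality constraint.

```python
import numpy as np
from scipy.optimize import linprog


def lad_lasso(X, y, lam):
    """Minimize sum|y - X b| + lam * sum|b| as a linear program.

    Variables: b_plus, b_minus (p each) and u_plus, u_minus (n each), all >= 0,
    subject to X (b_plus - b_minus) + u_plus - u_minus = y.
    """
    n, p = X.shape
    cost = np.concatenate([lam * np.ones(2 * p), np.ones(2 * n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:2 * p]


# Hypothetical demo: sparse true coefficients and heavy-tailed noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
beta_true = np.array([2.0, 0.0, -1.0, 0.0])
y = X @ beta_true + rng.standard_t(df=2, size=60)
print(lad_lasso(X, y, lam=5.0))
```

Setting `lam` to zero gives ordinary least-absolute-deviations (median) regression, while a sufficiently large `lam` shrinks every coefficient to exactly zero.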

15.

MARS: selecting basis functions and knots with an empirical Bayes method

Wataru Sakamoto 《Computational Statistics》2007,22(4):583-597

An empirical Bayes method to select basis functions and knots in multivariate adaptive regression splines (MARS) is proposed, which combines the advantages of frequentist model selection approaches and Bayesian approaches. A penalized likelihood is maximized to estimate regression coefficients for selected basis functions, and an approximated marginal likelihood is maximized to select knots and variables involved in basis functions. Moreover, the Akaike Bayes information criterion (ABIC) is used to determine the number of basis functions. It is shown that the proposed method gives an estimate of the regression structure that is relatively parsimonious and more stable for some example data sets.

16.

Bayesian lasso binary quantile regression

Dries F. Benoit Rahim Alhamzawi Keming Yu 《Computational Statistics》2013,28(6):2861-2873

In this paper, a Bayesian hierarchical model for variable selection and estimation in the context of binary quantile regression is proposed. Existing approaches to variable selection in a binary classification context are sensitive to outliers, heteroskedasticity or other anomalies of the latent response. The method proposed in this study overcomes these problems in an attractive and straightforward way. A Laplace likelihood and Laplace priors for the regression parameters are proposed and estimated with Bayesian Markov Chain Monte Carlo. The resulting model is equivalent to the frequentist lasso procedure. A conceptual result is that by doing so, the binary regression model is moved from a Gaussian to a full Laplacian framework without sacrificing much computational efficiency. In addition, an efficient Gibbs sampler to estimate the model parameters is proposed that is superior to the Metropolis algorithm that is used in previous studies on Bayesian binary quantile regression. Both the simulation studies and the real data analysis indicate that the proposed method performs well in comparison to the other methods. Moreover, as the base model is binary quantile regression, a much more detailed insight into the effects of the covariates is provided by the approach. An implementation of the lasso procedure for binary quantile regression models is available in the R-package bayesQR.

17.

Combined analysis of unique and repetitive events in quantitative risk assessment

《International Journal of Approximate Reasoning》2016

For risk assessment to be a relevant tool in the study of any type of system or activity, it needs to be based on a framework that allows for jointly analyzing both unique and repetitive events. Separately, unique events may be handled by predictive probability assignments on the events, and repetitive events with unknown/uncertain frequencies are typically handled by the probability of frequency (or Bayesian) approach. Regardless of the nature of the events involved, there may be a problem with imprecision in the probability assignments. Several uncertainty representations with the interpretation of lower and upper probability have been developed for reflecting such imprecision. In particular, several methods exist for jointly propagating precise and imprecise probabilistic input in the probability of frequency setting. In the present position paper we outline a framework for the combined analysis of unique and repetitive events in quantitative risk assessment using both precise and imprecise probability. In particular, we extend an existing method for jointly propagating probabilistic and possibilistic input by relaxing the assumption that all events involved have frequentist probabilities; instead we assume that frequentist probabilities may be introduced for some but not all events involved, i.e. some events are assumed to be unique and require predictive – possibly imprecise – probabilistic assignments, i.e. subjective probability assignments on the unique events without introducing underlying frequentist probabilities for these. A numerical example related to environmental risk assessment of the drilling of an oil well is included to illustrate the application of the resulting method.

18.

Controlling Type II Error While Constructing Triple Sampling Fixed Precision Confidence Intervals for the Normal Mean

M. S. Son L. D. Haugh H. I. Hamdy M. C. Costanza 《Annals of the Institute of Statistical Mathematics》1997,49(4):681-692

The rationale and methodology for estimating a mean with a fixed width confidence interval through sampling in three stages are extended to cover the additional problem of testing hypotheses concerning shifts in the mean with controlled Type II error. The coverage probability and operating characteristic function of the confidence interval based on the integrated approach are derived and compared with those of the usual triple sampling confidence interval. The extended methodology leads to better coverage probability and uniformly better Type II error probabilities. Achieving the additional objective of controlling Type II error inevitably implies a two- to threefold increase in the required optimal sample size. Some suggestions for dealing with this apparent limitation are discussed from a practical viewpoint. It is recommended that an integrated approach to estimation and testing based on confidence intervals be incorporated in the design stage for credible inferences.

19.

Frequentist and Bayesian measures of confidence via multiscale bootstrap for testing three regions

Hidetoshi Shimodaira 《Annals of the Institute of Statistical Mathematics》2010,62(1):189-208

A new method for computing frequentist p values and Bayesian posterior probabilities based on the bootstrap probability is discussed for the multivariate normal model with unknown expectation parameter vector. The null hypothesis is represented as an arbitrary-shaped region of the parameter vector. We introduce new functional forms for the scaling-law of bootstrap probability so that the multiscale bootstrap method, which was designed for a one-sided test, can also compute confidence measures of a two-sided test, extending applicability to a wider class of hypotheses. Parameter estimation for the scaling-law is improved by the two-step multiscale bootstrap and also by including higher order terms. Model selection is important not only as a motivating application of our method, but also as an essential ingredient in the method. A compromise between frequentist and Bayesian is attempted by showing that the Bayesian posterior probability with a noninformative prior is interpreted as a frequentist p value of a “zero-sided” test.

20.

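B-spline Bayesian estimation for generalized nonparametric regression

Yiqiang Lu Shisong Mao 《应用数学》2005,18(1):8-13

This paper studies B-spline Bayes estimation for generalized nonparametric models. The regression function is expanded in a B-spline basis. Rather than choosing a specific number of knots, we place a uniform noninformative prior on the number of knots and a normal prior on the spline coefficients, and we estimate the regression function by the posterior mean of the B-spline function. An MCMC simulation method for computing the B-spline Bayes estimate of the regression function is also given. A simulation study of nonparametric logistic regression shows that the B-spline Bayes estimator performs well.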