1.
Quantile regression for longitudinal data (total citations: 18; self-citations: 0; citations by others: 18)
Roger Koenker 《Journal of multivariate analysis》2004,91(1):74-89
The penalized least squares interpretation of the classical random effects estimator suggests a possible way forward for quantile regression models with a large number of “fixed effects”. The introduction of a large number of individual fixed effects can significantly inflate the variability of estimates of other covariate effects. Regularization, or shrinkage of these individual effects toward a common value, can help to modify this inflation effect. A general approach to estimating quantile regression models for longitudinal data is proposed employing \(\ell_1\) regularization methods. Sparse linear algebra and interior point methods for solving large linear programs are essential computational tools.
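As a reading aid, the penalized objective this abstract alludes to can be sketched as follows. The notation is an assumption, not taken from the abstract: \(y_{ij}\) is the \(j\)-th observation on individual \(i\), \(\alpha_i\) is that individual's fixed effect, \(\rho_\tau\) is the quantile check function, \(w_k\) are weights on the quantile levels \(\tau_k\), and \(\lambda \ge 0\) is the shrinkage parameter.

```latex
\[
\min_{\alpha,\beta}\;
\sum_{k=1}^{q}\sum_{i=1}^{n}\sum_{j=1}^{m_i}
  w_k\,\rho_{\tau_k}\!\bigl(y_{ij}-\alpha_i-x_{ij}^{\top}\beta(\tau_k)\bigr)
\;+\;\lambda\sum_{i=1}^{n}\lvert\alpha_i\rvert,
\qquad
\rho_\tau(u)=u\,\bigl(\tau-\mathbf{1}\{u<0\}\bigr).
\]
```

With \(\lambda = 0\) this reduces to an unpenalized fixed-effects fit, and as \(\lambda\) grows the \(\alpha_i\) are shrunk toward zero. Both the loss and the penalty are piecewise linear, which is why the problem can be posed as a linear program.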
2.
D Ghosh 《Statistics & probability letters》2012,82(11):1898-1902
In this note, we address the problem of surrogacy using a causal modelling framework that differs substantially from the potential outcomes model that pervades the biostatistical literature. The framework comes from econometrics and conceptualizes direct effects of the surrogate endpoint on the true endpoint. While this framework can incorporate the so-called semi-competing risks data structure, we also derive a fundamental non-identifiability result. Relationships to existing causal modelling frameworks are also discussed.
3.
Length-biased data arise in many important fields, including epidemiological cohort studies, cancer screening trials and labor economics. Analysis of such data has attracted much attention in the literature. In this paper we propose a quantile regression approach for analyzing right-censored and length-biased data. We derive an inverse probability weighted estimating equation corresponding to the quantile regression to correct the bias due to length-biased sampling and informative censoring. This method can easily handle informative censoring induced by length-biased sampling. This is an appealing feature of our proposed method, since it is generally difficult to obtain unbiased estimates of risk factors in the presence of length bias and informative censoring. We establish the consistency and asymptotic distribution of the proposed estimator using empirical process techniques. A resampling method is adopted to estimate the variance of the estimator. We conduct simulation studies to evaluate its finite sample performance and use a real data set to illustrate the application of the proposed method.
4.
Szeman Tse 《Annals of the Institute of Statistical Mathematics》2005,57(1):61-69
In this paper, we consider the product-limit quantile estimator of an unknown quantile function when the data are subject to random left truncation and right censorship. This is a parallel problem to the estimation of the unknown distribution function by the product-limit estimator under the same model. Simultaneous strong Gaussian approximations of the product-limit process and product-limit quantile process are constructed with rate […]. A functional law of the iterated logarithm for the maximal deviation of the estimator from the estimand is derived from the construction.
Work partially supported by NSC Grant 89-2118-M-259-011.
5.
Reda Boukezzoula Sylvie Galichet Amory Bisserier 《International Journal of Approximate Reasoning》2011,52(9):1257-1271
In this paper, a revisited interval approach for linear regression is proposed. In this context, according to the Midpoint-Radius (MR) representation, the uncertainty attached to the set-valued model can be decoupled from its trend. The estimated interval model is built from interval input-output data with the objective of covering all available data. The constrained optimization problem is addressed using a linear programming approach in which a new criterion is proposed for representing the global uncertainty of the interval model. The potential of the proposed method is illustrated by simulation examples.
6.
A sliced inverse regression approach for data stream (total citations: 1; self-citations: 0; citations by others: 1)
Marie Chavent Stéphane Girard Vanessa Kuentz-Simonet Benoit Liquet Thi Mong Ngoc Nguyen Jérôme Saracco 《Computational Statistics》2014,29(5):1129-1152
In this article, we focus on data arriving sequentially by blocks in a stream. A semiparametric regression model involving a common effective dimension reduction (EDR) direction \(\beta \) is assumed in each block. Our goal is to estimate this direction at each arrival of a new block. A simple direct approach consists of pooling all the observed blocks and estimating the EDR direction by the sliced inverse regression (SIR) method. In practice, however, this has disadvantages such as the storage of the blocks and the running time for large sample sizes. To overcome these drawbacks, we propose an adaptive SIR estimator of \(\beta \) based on the optimization of a quality measure. The corresponding approach is faster both in terms of computational complexity and running time, and provides data storage benefits. The consistency of our estimator is established and its asymptotic distribution is given. An extension to the multiple-indices model is proposed. A graphical tool is also provided in order to detect changes in the underlying model, i.e., drift in the EDR direction or aberrant blocks in the data stream. A simulation study illustrates the numerical behavior of our estimator. Finally, an application to real data concerning the estimation of physical properties of the Mars surface is presented.
7.
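Using the local polynomial method, we study a nonparametric regression model whose errors have a heteroscedastic structure. A composite quantile regression estimator of the regression function is constructed under left-truncated data, and its asymptotic normality is established. Finally, simulations show that, when the errors follow certain non-normal distributions, the proposed estimator is more efficient than the local linear estimator.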
8.
Ridge regression is an important approach in linear regression when explanatory variables are highly correlated. Although expressions for estimators of ridge regression parameters have been successfully obtained via matrix operations after the observed data are standardized, they cannot be applied to big data, since it is impossible to load the entire data set into the memory of a single computer and it is hard to standardize the original observed data. To overcome these difficulties, the present article proposes new methods and algorithms. The basic idea is to compute a matrix of sufficient statistics by rows. Once the matrix is derived, it is not necessary to use the original data again. Since the entire data set is scanned only once, the proposed methods and algorithms can be extremely efficient in the computation of estimates of ridge regression parameters. It is expected that the basic knowledge gained in this article will have a great impact on statistical approaches to big data.
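The one-pass idea in this abstract can be illustrated with a minimal sketch. It is not the article's algorithm: the choice of sufficient statistics (row count, sums, and cross-product sums) and the function names `accumulate`, `solve` and `ridge_from_stats` are assumptions made for illustration. Standardization is applied afterwards to the accumulated statistics, so the raw rows are read only once.

```python
def accumulate(rows, p):
    """Single pass over (x, y) rows; x is a list of p floats. Returns sufficient statistics."""
    n, sy = 0, 0.0
    sx = [0.0] * p                        # sum of x_j
    sxy = [0.0] * p                       # sum of x_j * y
    sxx = [[0.0] * p for _ in range(p)]   # sum of x_j * x_k
    for x, y in rows:
        n += 1
        sy += y
        for j in range(p):
            sx[j] += x[j]
            sxy[j] += x[j] * y
            for k in range(p):
                sxx[j][k] += x[j] * x[k]
    return n, sx, sy, sxx, sxy


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def ridge_from_stats(stats, lam):
    """Ridge fit on standardized predictors, using only the accumulated statistics."""
    n, sx, sy, sxx, sxy = stats
    p = len(sx)
    mx = [s / n for s in sx]
    my = sy / n
    # Centered cross-products. Subtracting like this can lose precision on
    # badly scaled data; a production version would use a more stable update.
    cxx = [[sxx[j][k] - n * mx[j] * mx[k] for k in range(p)] for j in range(p)]
    cxy = [sxy[j] - n * mx[j] * my for j in range(p)]
    sd = [(cxx[j][j] / n) ** 0.5 for j in range(p)]
    # Solve (R + lam * I) b = r, where R is the correlation matrix of the predictors.
    a = [[cxx[j][k] / (n * sd[j] * sd[k]) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    r = [cxy[j] / (n * sd[j]) for j in range(p)]
    b_std = solve(a, r)
    beta = [b_std[j] / sd[j] for j in range(p)]   # back to the original scale
    intercept = my - sum(beta[j] * mx[j] for j in range(p))
    return intercept, beta
```

Because `rows` can be any iterable, it may be a generator that streams a file from disk. Memory use then depends only on the number of predictors `p`, not on the number of rows.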
9.
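We discuss the existence and uniqueness of the MLE of the parameters of a linear regression model under grouped data. An approximate solution for the MLE is obtained via the EM algorithm, and the asymptotic covariance matrix of the MLE is obtained via the SEM algorithm.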
10.
Risk concentration is used as a measurement of diversification benefits in the context of risk aggregation. Expectiles, which are known to possess many good properties, have attracted increasing interest in recent years. In this paper, we aim to study the asymptotic properties of risk concentration based on Expectiles. Firstly, we extend the results on the second-order asymptotics of Expectiles in Mao et al. (2015). Secondly, we investigate the second-order asymptotics of tail probabilities and then apply them to risk concentrations based on Expectiles as well as on VaR.
11.
Process capability indices (PCIs) have been widely used to measure the actual process information with respect to the manufacturing specifications, and have become the common language for process quality between the customer and the supplier. Most existing research on capability testing is based on the traditional frequentist point of view, and statistical properties of the estimated PCIs are derived under the assumption of a single sample. In this paper, we consider the problem of estimating and testing process capability using a Bayesian approach based on subsamples collected over time from an in-control process. The posterior probability and the credible interval for the most popular index, Cpk, under a non-informative prior are derived. Manufacturers can use the presented approach to perform capability testing and determine whether their processes are capable of reproducing product items satisfying customers’ stringent quality requirements when a daily-based or weekly-based production control plan is implemented for monitoring process stability.
12.
ShuWen Wan 《Science in China Series A (English Edition)》2008,51(11):2020-2032
We propose a semiparametric Wald statistic to test the validity of logistic regression models based on case-control data. The test statistic is constructed using a semiparametric ROC curve estimator and a nonparametric ROC curve estimator. The statistic has an asymptotic chi-squared distribution and is an alternative to the Kolmogorov-Smirnov-type statistic proposed by Qin and Zhang in 1997, the chi-squared-type statistic proposed by Zhang in 1999 and the information matrix test statistic proposed by Zhang in 2001. The statistic is easy to compute in the sense that it requires none of the following methods: using a bootstrap method to find its critical values, partitioning the sample data or inverting a high-dimensional matrix. We present some results on simulation and on analysis of two real examples. Moreover, we discuss how to extend our statistic to a family of statistics and how to construct its Kolmogorov-Smirnov counterpart.
This work was supported by the 11.5 Natural Scientific Plan (Grant No. 2006BAD09A04) and Nanjing University Start Fund (Grant No. 020822410110).
13.
Pchelintsev Evgeny Pergamenshchikov Serguei Leshchinskaya Maria 《Statistical Inference for Stochastic Processes》2022,25(3):537-576
In this paper we study a high dimension (Big Data) regression model in continuous time observed in the discrete time moments with dependent noises...
14.
Yunong Zhang W.E. Leithead D.J. Leith L. Walshe 《Journal of Computational and Applied Mathematics》2008,220(1-2):198-214
Maximum likelihood estimation (MLE) of hyperparameters in Gaussian process regression, as well as in other computational models, frequently requires the evaluation of the logarithm of the determinant of a positive-definite matrix (denoted by C hereafter). In general, the exact computation of \(\log\det(C)\) requires \(O(N^3)\) operations, where N is the matrix dimension. An approximation of \(\log\det(C)\) can be developed with \(O(N^2)\) operations based on a power-series expansion and a randomized trace estimator. In this paper, the accuracy and effectiveness of using uniformly distributed seeds for the approximation are investigated. The research shows that uniform-seed based approximation is an equally good alternative to Gaussian-seed based approximation, having slightly better approximation accuracy and smaller variance. Gaussian process regression examples also substantiate the effectiveness of such a uniform-seed based log-det approximation scheme.
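A minimal sketch of how a power-series expansion and a randomized trace estimator can combine into a log-det approximation is given below. It is an illustration under stated assumptions, not the paper's scheme. The scaling constant `alpha` is taken to be the infinity norm of C, the seeds are drawn uniformly on \([-\sqrt{3}, \sqrt{3}]\) so that each entry has zero mean and unit variance, and the function name `logdet_approx` and its parameters are hypothetical. It relies on the identity \(\log\det(C) = N\log\alpha - \sum_{k\ge 1} \operatorname{tr}(B^k)/k\) with \(B = I - C/\alpha\), and estimates each trace by averaging \(z^{\top}B^k z\) over random seed vectors \(z\).

```python
import math
import random


def logdet_approx(c, num_probes=200, num_terms=50, seed=0):
    """Approximate log det(C) for a symmetric positive-definite matrix C (list of lists)."""
    n = len(c)
    rng = random.Random(seed)
    # The infinity norm bounds the largest eigenvalue, so B = I - C/alpha
    # has eigenvalues in [0, 1) and the power series converges.
    alpha = max(sum(abs(v) for v in row) for row in c)
    half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has unit variance
    total = 0.0
    for _ in range(num_probes):
        z = [rng.uniform(-half_width, half_width) for _ in range(n)]
        v = z[:]
        acc = 0.0
        for k in range(1, num_terms + 1):
            # v <- B v, computed as v - (C v) / alpha: one O(N^2) product per term
            cv = [sum(c[i][j] * v[j] for j in range(n)) for i in range(n)]
            v = [v[i] - cv[i] / alpha for i in range(n)]
            acc += sum(z[i] * v[i] for i in range(n)) / k   # z^T B^k z / k
        total += acc
    return n * math.log(alpha) - total / num_probes
```

Each probe costs `num_terms` matrix-vector products, so the work is \(O(N^2)\) per probe and term rather than the \(O(N^3)\) of a factorization. The result is a random estimate whose spread shrinks as `num_probes` grows, and the truncation error grows when C is poorly conditioned.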
15.
Simple outlier labeling based on quantile regression, with application to the steelmaking process
This paper introduces some methods for outlier identification in the regression setting, motivated by the analysis of steelmaking process data. The proposed methodology extends to the regression setting the boxplot rule, commonly used for outlier screening with univariate data. The focus here is on bivariate settings with a single covariate, but extensions are possible. The proposal is based on quantile regression, including an additional transformation parameter for selecting the best scale for linearity of the conditional quantiles. The resulting method is used to perform effective labeling of potential outliers, with a quite low computational complexity, allowing for simple implementation within statistical software as well as commonly used spreadsheets. Some simulation experiments have been carried out to study the swamping and masking properties of the proposal. The methodology is also illustrated by some real life examples, taking as the response variable the energy consumed in the melting process. Copyright © 2015 John Wiley & Sons, Ltd.
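The core idea, a boxplot rule whose quartiles depend on the covariate, can be sketched as follows. This is a simplified illustration rather than the paper's method. The transformation parameter is omitted, the conventional fence multiplier 1.5 is assumed, and the quantile lines are fitted by brute force over pairs of points, which relies on the fact that a two-parameter quantile regression has an optimal line passing through two observations. The function names are hypothetical.

```python
from itertools import combinations


def check_loss(r, tau):
    """Quantile check function: tau * r for r >= 0, (tau - 1) * r otherwise."""
    return r * (tau if r >= 0 else tau - 1.0)


def fit_quantile_line(xs, ys, tau):
    """Fit y = a + b * x at quantile level tau by searching all two-point lines."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a valid candidate
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(check_loss(y - (intercept + slope * x), tau)
                   for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


def label_outliers(xs, ys, k=1.5):
    """Return indices of points outside the conditional boxplot fences."""
    a_lo, b_lo = fit_quantile_line(xs, ys, 0.25)
    a_hi, b_hi = fit_quantile_line(xs, ys, 0.75)
    flagged = []
    for idx, (x, y) in enumerate(zip(xs, ys)):
        q1 = a_lo + b_lo * x          # conditional first quartile at x
        q3 = a_hi + b_hi * x          # conditional third quartile at x
        iqr = q3 - q1
        if y < q1 - k * iqr or y > q3 + k * iqr:
            flagged.append(idx)
    return flagged
```

The pairwise search costs on the order of n cubed operations, so it suits only small samples. A linear programming solver would be the usual choice for larger data.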
16.
Elcin Kartal Koc Cem Iyigun İnci Batmaz Gerhard-Wilhelm Weber 《Journal of Global Optimization》2014,60(1):103-120
Multivariate adaptive regression splines (MARS) has become a popular data mining (DM) tool due to its flexible model building strategy for high dimensional data. Compared with other well-known methods, it performs better in many areas such as finance, informatics, technology and science. Many studies have been conducted on improving its performance. For this purpose, an alternative backward stepwise algorithm has been proposed through the Conic-MARS (CMARS) method, which uses a penalized residual sum of squares for MARS as a Tikhonov regularization problem. Additionally, by modifying the forward step of MARS via a mapping approach, a time efficient procedure has been introduced in S-FMARS. Inspired by the advantages of MARS, CMARS and S-FMARS, two hybrid methods are proposed in this study, aiming to produce time efficient DM tools without degrading their performance, especially for large datasets. The resulting methods, called SMARS and SCMARS, are tested in terms of several performance criteria such as accuracy, complexity, stability and robustness via simulated and real life datasets. As a DM application, the hybrid methods are also applied to an important field of finance for predicting interest rates offered by a Turkish bank to its customers. The results show that the proposed hybrid methods, being the most time efficient with competing performances, can be considered as powerful choices particularly for large datasets.
17.
K. Benhenni S. Hedli-Griche M. Rachdi P. Vieu 《Statistics & probability letters》2008,78(8):1043-1049
We study the nonparametric regression estimation when the explanatory variable takes values in some abstract functional space. We establish some asymptotic results and we give the (pointwise and uniform) convergence of the kernel type estimator constructed from functional data under long memory conditions.
18.
《European Journal of Operational Research》2005,165(3):685-695
Using process capability indices to quantify manufacturing process precision (consistency) and performance is an essential part of implementing any quality improvement program. Most research works for testing the capability indices have focused on using the traditional distribution frequency approaches. Cheng and Spiring [IIE Trans. 21 (1) 97] proposed a Bayesian procedure for assessing process capability index Cp based on one single sample. In practice, manufacturing information regarding product quality characteristics is often derived from multiple samples, particularly when a routine-based quality control plan is implemented for monitoring process stability. In this paper, we consider estimating and testing Cp with multiple samples using a Bayesian approach, and accordingly propose a Bayesian procedure for capability testing. The posterior probability, p, that the process under investigation is capable is derived. The credible interval, a Bayesian analogue of the classical lower confidence interval, is obtained. The results obtained in this paper are generalizations of those obtained in Cheng and Spiring [IIE Trans. 21 (1), 97]. Practitioners can use the proposed procedure to determine whether their manufacturing processes are capable of reproducing products satisfying the preset precision requirement.
19.
Conceição Rocha Teresa Mendonça Maria Eduarda Silva 《Mathematical and Computer Modelling of Dynamical Systems: Methods, Tools and Applications in Engineering and Related Sciences》2013,19(6):540-556
During surgical interventions, a muscle relaxant drug is frequently administered with the objective of inducing muscle paralysis. Clinical environment and patient safety issues lead to a huge variety of situations that must be taken into account, requiring intensive simulation studies. Hence, population models are crucial for research and development in this field. This work develops a stochastic population model for the neuromuscular blockade (NMB) (muscle paralysis) level induced by atracurium, based on a deterministic individual model already proposed in the literature. To achieve this goal, a joint Lognormal distribution is considered for the patient-dependent parameters. This study is based on clinical data collected during general anaesthesia. The procedure developed makes it possible to construct a reliable reference bank of parametrized models that reproduces not only the overall features of the NMB but also the inter-individual variability characteristic of physiological signals. It turns out that this bank constitutes a fundamental tool to support research on identification and control algorithms and is suitable to be integrated in clinical decision support systems.
20.
We consider adaptive maximum likelihood type estimation of both drift and diffusion coefficient parameters for an ergodic diffusion process based on discrete observations. Two kinds of adaptive maximum likelihood type estimators are proposed and asymptotic properties of the adaptive estimators, including convergence of moments, are obtained.