Similar literature
20 similar documents found.
1.
The task of fitting smoothing spline surfaces to meteorological data such as temperature or rainfall observations is computationally intensive. The generalized cross validation (GCV) smoothing algorithm, if implemented using direct matrix techniques, is O(n^3) computationally, and memory requirements are O(n^2). Thus, for data sets larger than a few hundred observations, the algorithm is prohibitively slow. The core of the algorithm consists of solving a series of shifted linear systems, and iterative techniques have been used to lower the computational complexity and facilitate implementation on a variety of supercomputer architectures. For large data sets, though, the execution time is still quite high. In this paper we describe a Lanczos-based approach that avoids explicitly solving the linear systems and dramatically reduces the amount of time required to fit surfaces to sets of data.
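For orientation, the GCV criterion that this abstract refers to is usually written as below. This is the standard textbook form, not a formula taken from the paper; the symbols $A(\lambda)$ (the influence matrix mapping data $y$ to fitted values), $K$ and $\lambda$ are assumptions introduced here, and the kernel-ridge form of $A(\lambda)$ ignores the polynomial null space that a thin-plate spline would add.

```latex
V(\lambda) \;=\;
\frac{\tfrac{1}{n}\,\bigl\lVert \bigl(I - A(\lambda)\bigr)\,y \bigr\rVert^{2}}
     {\Bigl[\tfrac{1}{n}\,\operatorname{tr}\bigl(I - A(\lambda)\bigr)\Bigr]^{2}},
\qquad
A(\lambda) = K\,(K + n\lambda I)^{-1}
\;\Longrightarrow\;
I - A(\lambda) = n\lambda\,(K + n\lambda I)^{-1}.
```

Under these assumptions, evaluating $V(\lambda)$ on a grid of $\lambda$ values means working with $(K + n\lambda I)^{-1}$ for many shifts $n\lambda$ of the same matrix, which is where the "series of shifted linear systems" comes from.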

2.
In many problems involving generalized linear models, the covariates are subject to measurement error. When the number of covariates p exceeds the sample size n, regularized methods like the lasso or Dantzig selector are required. Several recent papers have studied methods which correct for measurement error in the lasso or Dantzig selector for linear models in the p > n setting. We study a correction for generalized linear models, based on Rosenbaum and Tsybakov’s matrix uncertainty selector. By not requiring an estimate of the measurement error covariance matrix, this generalized matrix uncertainty selector has a great practical advantage in problems involving high-dimensional data. We further derive an alternative method based on the lasso, and develop efficient algorithms for both methods. In our simulation studies of logistic and Poisson regression with measurement error, the proposed methods outperform the standard lasso and Dantzig selector with respect to covariate selection, by reducing the number of false positives considerably. We also consider classification of patients on the basis of gene expression data with noisy measurements. Supplementary materials for this article are available online.

3.
This article presents an algorithm for accommodating missing data in situations where a natural set of estimating equations exists for the complete data setting. The complete data estimating equations can correspond to the score functions from a standard, partial, or quasi-likelihood, or they can be generalized estimating equations (GEEs). In analogy to the EM algorithm, which is a special case, the method is called the ES algorithm, because it iterates between an E-Step, wherein functions of the complete data are replaced by their expected values, and an S-Step, where these expected values are substituted into the complete-data estimating equation, which is then solved. Convergence properties of the algorithm are established by appealing to general theory for iterative solutions to nonlinear equations. In particular, the ES algorithm (and indeed the EM algorithm) is shown to be an example of a nonlinear Gauss-Seidel algorithm. An added advantage of the approach is that it yields a computationally simple method for estimating the variance of the resulting parameter estimates.

4.
Developing models to predict tree mortality from long‐term repeated measurement data sets can be difficult because of the nature of mortality and the dependence among observations. Marginal (population‐averaged) generalized estimating equations (GEE) and random effects (subject‐specific) models offer two possible ways to overcome these effects. For this study, standard logistic, marginal logistic based on the GEE approach, and random effects logistic regression models were fitted and compared. In addition, four model evaluation statistics were calculated by means of K‐fold cross‐validation: the mean prediction error, the mean absolute prediction error, the variance of prediction error, and the mean square error. Results from this study suggest that the random effects model produced the smallest evaluation statistics among the three models. Although marginal logistic regression accounted for correlations between observations, it did not provide noticeable improvements in model performance compared to the standard logistic regression model that assumed independence. This study indicates that the random effects model was able to increase the overall accuracy of mortality modeling. Moreover, it was able to capture the correlation derived from the hierarchical data structure as well as the serial correlation generated through repeated measurements.
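The four evaluation statistics named in this abstract can be sketched as follows. This is a minimal illustration, not the study's code: the function name `prediction_error_summary`, the sign convention (observed minus predicted) and the use of the population variance are all assumptions made here.

```python
from statistics import fmean, pvariance


def prediction_error_summary(observed, predicted):
    """Summarize held-out prediction errors (observed minus predicted).

    Hypothetical helper: in a K-fold setting, `observed` and `predicted`
    would hold the pooled values from all held-out folds.
    """
    errors = [float(o) - float(p) for o, p in zip(observed, predicted)]
    return {
        "mean_error": fmean(errors),                              # bias
        "mean_absolute_error": fmean([abs(e) for e in errors]),
        "error_variance": pvariance(errors),                      # population variance
        "mean_square_error": fmean([e * e for e in errors]),
    }


summary = prediction_error_summary([3.0, 1.0, 4.0, 0.0], [2.0, 2.0, 2.0, 2.0])
```

With the population variance, the mean square error equals the error variance plus the squared mean error, so the four statistics are not independent of one another.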

5.
This paper, of which a preliminary version appeared in ISTCS'92, is concerned with generalized network flow problems. In a generalized network, each edge e = (u, v) has a positive flow multiplier a_e associated with it. The interpretation is that if a flow of x_e enters the edge at node u, then a flow of a_e x_e exits the edge at v. The uncapacitated generalized transshipment problem (UGT) is defined on a generalized network where demands and supplies (real numbers) are associated with the vertices and costs (real numbers) are associated with the edges. The goal is to find a flow such that the excess or deficit at each vertex equals the desired value of the supply or demand, and the sum over the edges of the product of the cost and the flow is minimized. Adler and Cosares [Operations Research 39 (1991) 955–960] reduced the restricted uncapacitated generalized transshipment problem, where only demand nodes are present, to a system of linear inequalities with two variables per inequality. The algorithms presented by the authors in [SIAM Journal on Computing, to appear] result in a faster algorithm for restricted UGT.
Generalized circulation is defined on a generalized network with demands at the nodes and capacity constraints on the edges (i.e., upper bounds on the amount of flow). The goal is to find a flow such that the excesses at the nodes are proportional to the demands and maximized. We present a new algorithm that solves the capacitated generalized flow problem by iteratively solving instances of UGT. The algorithm can be used to find an optimal flow or an approximation thereof. When used to find a constant factor approximation, the algorithm is not only more efficient than previous algorithms but also strongly polynomial. It is believed to be the first strongly polynomial approximation algorithm for generalized circulation. The existence of such an approximation algorithm is interesting since it is not known whether the exact problem has a strongly polynomial algorithm.
Research was done while the first author was attending Stanford University and IBM Almaden Research Center. Research partially supported by ONR-N00014-91-C-0026 and by NSF PYI Grant CCR-8858097, with matching funds from AT&T and DEC. Research partially supported by ONR-N00014-91-C-0026.
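The flow-multiplier convention in this abstract (a flow of x_e entering edge e = (u, v) leaves it as a_e x_e) can be made concrete with a small sketch. It only evaluates a given flow; it is not the paper's algorithm. The function names, the dictionary layout and the three-node example network are hypothetical.

```python
def node_excesses(nodes, edges, flow):
    """Net flow arriving at each node of a generalized network.

    edges: dict mapping an edge name to (u, v, multiplier a_e), a_e > 0.
    flow:  dict mapping an edge name to the flow x_e entering the edge at u.
    """
    excess = {node: 0.0 for node in nodes}
    for name, (u, v, multiplier) in edges.items():
        x = flow.get(name, 0.0)
        excess[u] -= x               # x_e leaves node u
        excess[v] += multiplier * x  # a_e * x_e arrives at node v
    return excess


def total_cost(costs, flow):
    """UGT objective: sum over edges of cost times flow."""
    return sum(costs[name] * x for name, x in flow.items())


# Hypothetical example: s -> w loses half the flow, w -> t triples it.
edges = {"e1": ("s", "w", 0.5), "e2": ("w", "t", 3.0)}
flow = {"e1": 4.0, "e2": 2.0}
excess = node_excesses(["s", "w", "t"], edges, flow)
cost = total_cost({"e1": 1.0, "e2": 2.0}, flow)
```

In this example node w is balanced (2 units arrive, 2 leave), node s supplies 4 units and node t receives 6, which shows that flow is not conserved along a path when multipliers differ from 1.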

6.
The paper presents a generalized regression technique centered on a superquantile (also called conditional value-at-risk) that is consistent with that coherent measure of risk and yields more conservatively fitted curves than classical least-squares and quantile regression. In contrast to other generalized regression techniques that approximate conditional superquantiles by various combinations of conditional quantiles, we directly and in perfect analogy to classical regression obtain superquantile regression functions as optimal solutions of certain error minimization problems. We show the existence and possible uniqueness of regression functions, discuss the stability of regression functions under perturbations and approximation of the underlying data, and propose an extension of the coefficient of determination R-squared for assessing the goodness of fit. The paper presents two numerical methods for solving the error minimization problems and illustrates the methodology in several numerical examples in the areas of uncertainty quantification, reliability engineering, and financial risk management.
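As background for the term "superquantile", the sketch below computes the empirical superquantile of a sample through the well-known Rockafellar-Uryasev minimization formula. It illustrates the risk measure only, not the regression method of the paper; the function name `superquantile` and the brute-force search over sample points are choices made here.

```python
def superquantile(sample, alpha):
    """Empirical superquantile (conditional value-at-risk) at level alpha.

    Uses the formula  min over c of  c + E[max(X - c, 0)] / (1 - alpha).
    For an empirical distribution the objective is convex and piecewise
    linear in c with kinks at the sample points, so it suffices to try
    each sample point as a candidate c.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def objective(c):
        expected_excess = sum(max(x - c, 0.0) for x in sample) / n
        return c + expected_excess / (1.0 - alpha)

    return min(objective(c) for c in sample)


upper_half_mean = superquantile([1, 2, 3, 4], 0.5)
```

For the sample 1, 2, 3, 4 the level-0.5 superquantile is the mean of the upper half of the sample, and at level 0 it reduces to the ordinary mean.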

7.
We develop algorithms for performing semiparametric regression analysis in real time, with data processed as it is collected and made immediately available via modern telecommunications technologies. Our definition of semiparametric regression is quite broad and includes, as special cases, generalized linear mixed models, generalized additive models, geostatistical models, wavelet nonparametric regression models and their various combinations. Fast updating of regression fits is achieved by couching semiparametric regression into a Bayesian hierarchical model or, equivalently, graphical model framework and employing online mean field variational ideas. An Internet site attached to this article, realtime-semiparametric-regression.net, illustrates the methodology for continually arriving stock market, real estate, and airline data. Flexible real-time analyses based on increasingly ubiquitous streaming data sources stand to benefit. This article has online supplementary material.

8.
Model selection for regression on a fixed design
We deal with the problem of estimating an unknown regression function in a regression framework with deterministic design points. To this end, we consider a collection of finite-dimensional linear spaces (models) and the least-squares estimator built on a data-driven model selected from this collection. This data-driven choice is performed via the minimization of a penalized model selection criterion that generalizes Mallows' C_p. We provide nonasymptotic risk bounds for the so-defined estimator, from which we deduce adaptivity properties. Our results hold under mild moment conditions on the errors. The statement and the use of a new moment inequality for empirical processes are at the heart of the techniques involved in our approach. Received: 2 July 1997 / Revised version: 20 September 1999 / Published online: 6 July 2000

9.
Some simple models are introduced which may be used for modelling or generating sequences of dependent discrete random variables with generalized Poisson marginal distribution. Our approach for building these models is similar to that of the Poisson ARMA processes considered by Al-Osh and Alzaid (1987, J. Time Ser. Anal., 8, 261–275; 1988, Statist. Hefte, 29, 281–300) and McKenzie (1988, Adv. in Appl. Probab., 20, 822–835). The models have the same autocorrelation structure as their counterparts of standard ARMA models. Various properties, such as joint distribution, time reversibility and regression behavior, for each model are investigated.

10.
A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This article introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. Supplementary materials for this article are available online.

11.
In this paper, we consider a generalized vector variational-like inequality problem (for short, GVVLIP), which includes generalized vector variational inequalities, vector variational inequalities and classical variational inequalities as special cases. The concepts of generalized C-pseudomonotone-like and generalized H-hemicontinuous-like operators are introduced. Some existence results for GVVLIP are obtained under the assumptions of generalized C-pseudomonotone-like property and generalized H-hemicontinuous-like property. These results appear to be new and interesting. New existence results of the classical variational inequality are also obtained. In this research, the first author was partially supported by the Teaching and Research Award Fund for Outstanding Young Teachers in Higher Education Institutions of MOE, China and the Dawn Program Foundation in Shanghai. The third author was partially supported by Grant NSC 94-2213-E-110-035.

12.
Modeling the mean and covariance simultaneously is a common strategy to efficiently estimate the mean parameters when applying generalized estimating equation techniques to longitudinal data. In this article, using generalized estimating equation techniques, we propose a new kind of regression model for parameterizing covariance structures. Using a novel Cholesky factor, the entries in this decomposition have moving average and log innovation interpretations and are modeled as linear functions of covariates. The resulting estimators for the regression coefficients in both the mean and the covariance are shown to be consistent and asymptotically normally distributed. Simulation studies and a real data analysis show that the proposed approach yields highly efficient estimators for the parameters in the mean, and provides parsimonious estimation for the covariance structure.

13.
We define and characterize inner generalized inverses with prescribed idempotents. These classes of generalized inverses are a natural algebraic extension of generalized inverses of linear operators with prescribed range and kernel. We consider the reverse order rule for inner generalized inverses of elements of a ring, some perturbation bounds, and we construct an iterative method for computing a (p, q)-inner inverse in Banach algebras.

14.
We find two-sided inequalities for the generalized hypergeometric function of the form ${}_{q+1}F_q(-x)$ with positive parameters restricted by certain additional conditions. Both lower and upper bounds agree with the value of ${}_{q+1}F_q(-x)$ at the endpoints of the positive semi-axis and are asymptotically precise at one of the endpoints. The inequalities are derived from a theorem asserting the monotonicity of the quotient of two generalized hypergeometric functions with shifted parameters. The proofs hinge on a generalized Stieltjes representation of the generalized hypergeometric function. This representation also provides yet another method to deduce the second Thomae relation for ${}_3F_2(1)$ and leads to an integral representation of ${}_4F_3(x)$ in terms of the Appell function $F_3$. In the last section of the paper we list some open questions and conjectures.

15.
The central mean and central subspaces of the generalized multiple index model are the main inference targets of sufficient dimension reduction in regression. In this article, we propose an integral transform (ITM) method for estimating these two subspaces. Applying the ITM method, estimates are derived separately for two scenarios: (i) no distributional assumptions are imposed on the predictors, and (ii) the predictors are assumed to follow an elliptically contoured distribution. These estimates are shown to be asymptotically normal with the usual root-n convergence rate. The ITM method differs from other existing methods in that it avoids estimation of the unknown link function between the response and the predictors, and it does not rely on distributional assumptions of the predictors under scenario (i) mentioned above.

16.
In distribution theory the pullback of a general distribution by a C^∞-function is well-defined whenever the normal bundle of the C^∞-function does not intersect the wave front set of the distribution. However, the Colombeau theory of generalized functions allows for a pullback by an arbitrary c-bounded generalized function. It has been shown in previous work that in the case of multiplication of Colombeau functions (which is a special case of a C^∞ pullback), the generalized wave front set of the product satisfies the same inclusion relation as in the distributional case, if the factors have their wave front sets in favorable position. We prove a microlocal inclusion relation for the generalized pullback (by a c-bounded generalized map) of Colombeau functions. The proof of this result relies on a stationary phase theorem for generalized phase functions, which is given in the Appendix. Furthermore we study an example (due to Hurd and Sattinger), where the pullback function stems from the generalized characteristic flow of a partial differential equation.

17.
In this paper, semiparametric generalized partially linear models (GPLMs) for longitudinal data are studied. We approximate the nonparametric function in the GPLMs by a regression spline, and use quadratic inference functions (QIF) to take the within-cluster correlation into account without involving direct estimation of nuisance parameters in the correlation matrix. We establish the asymptotic normality of the resulting estimators. The finite sample performance of the proposed methods is evaluated through simulation studies and a real data analysis.

18.
We present a new computational and statistical approach for fitting isotonic models under convex differentiable loss functions through recursive partitioning. Models along the partitioning path are also isotonic and can be viewed as regularized solutions to the problem. Our approach generalizes and subsumes the well-known work of Barlow and Brunk on fitting isotonic regressions subject to specially structured loss functions, and expands the range of loss functions that can be used (e.g., adding Huber loss for robust regression). This is accomplished through an algorithmic adjustment to a recursive partitioning approach recently developed for solving large-scale ℓ2-loss isotonic regression problems. We prove that the new algorithm solves the generalized problem while maintaining the favorable computational and statistical properties of the ℓ2 algorithm. The results are demonstrated on both real and synthetic data in two settings: fitting count data using negative Poisson log-likelihood loss, and fitting robust isotonic regressions using Huber loss. Proofs of theorems and a MATLAB-based software package implementing our algorithm are available in the online supplementary materials.
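To make "isotonic regression" concrete, here is a minimal sketch of the classical pool-adjacent-violators procedure for the one-dimensional squared-error case. It is deliberately not the recursive-partitioning algorithm of this paper and does not handle Huber or Poisson losses; the function name `isotonic_l2` is hypothetical.

```python
def isotonic_l2(y):
    """Nondecreasing least-squares fit to the sequence y.

    Pool-adjacent-violators: keep a stack of blocks, each stored as
    [sum of values, number of values]; whenever the newest block has a
    smaller mean than the block before it, merge the two.
    """
    blocks = []
    for value in y:
        blocks.append([float(value), 1])
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)  # every point gets its block mean
    return fitted


fit = isotonic_l2([1, 3, 2, 4])
```

Here the out-of-order pair 3, 2 is pooled into a single block and both points receive the block mean 2.5, while the first and last points are left unchanged.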

19.
In [H. Brézis, A. Friedman, Nonlinear parabolic equations involving measures as initial conditions, J. Math. Pure Appl. (9) (1983) 73–97.] Brézis and Friedman prove that certain nonlinear parabolic equations, with the δ-measure as initial data, have no solution. However, in [J.F. Colombeau, M. Langlais, Generalized solutions of nonlinear parabolic equations with distributions as initial conditions, J. Math. Anal. Appl (1990) 186–196.] Colombeau and Langlais prove that these equations have a unique solution even if the δ-measure is substituted by any Colombeau generalized function of compact support. Here we generalize Colombeau and Langlais’ result, proving that we may take any generalized function as the initial data. Our approach relies on recent algebraic and topological developments of the theory of Colombeau generalized functions and results from [J. Aragona, Colombeau generalized functions on quasi-regular sets, Publ. Math. Debrecen (2006) 371–399.].

20.
Hidden Markov models (HMM) can be applied to the study of time varying unobserved categorical variables for which only indirect measurements are available. An S-Plus module to fit HMMs in continuous time to this type of longitudinal data is presented. Covariates affecting the transition intensities of the hidden Markov process or the conditional distribution of the measured response (given the hidden states of the process) are handled under a generalized regression framework. Users can provide C subroutines specifying the parameterization of the model to adapt the software to a wide variety of data types. HMM analysis using the S-Plus module is illustrated on a dataset from a prospective study of human papillomavirus infection in young women and on simulated data.
