Similar Articles
20 similar articles found (search time: 437 ms)
1.
We introduce the functional generalized additive model (FGAM), a novel regression model for association studies between a scalar response and a functional predictor. We model the link-transformed mean response as the integral with respect to t of F{X(t), t}, where F( ·, ·) is an unknown regression function and X(t) is a functional covariate. Rather than having an additive model in a finite number of principal components, as in Müller and Yao (2008, "Functional Additive Models," Journal of the American Statistical Association, 103, 1534–1544), our model incorporates the functional predictor directly and thus can be viewed as the natural functional extension of generalized additive models. We estimate F( ·, ·) using tensor-product B-splines with roughness penalties. A pointwise quantile transformation of the functional predictor is also considered to ensure each tensor-product B-spline has observed data on its support. The methods are evaluated using simulated data and their predictive performance is compared with other competing scalar-on-function regression alternatives. We illustrate the usefulness of our approach through an application to brain tractography, where X(t) is a signal from diffusion tensor imaging at position t along a tract in the brain. In one example, the response is disease status (case or control) and in a second example, it is the score on a cognitive test. The FGAM is implemented in R in the refund package. Additional supplementary materials are available online.
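The core construction — a scalar response modeled as the integral of F{X(t), t} over a penalized tensor-product basis — can be sketched as follows. This is a minimal illustration, not the refund implementation: it substitutes a monomial tensor basis for B-splines, a plain ridge penalty for the roughness penalties, assumes an identity link, and uses simulated data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 200, 50                      # number of curves, grid points per curve
t = np.linspace(0, 1, T)

# simulated functional covariates: smooth random curves
X = np.array([np.sin(2 * np.pi * rng.uniform(0.5, 2) * t) +
              rng.normal(0, 0.1, T) for _ in range(n)])

# true surface F(x, t) = x^2 * t; response is its integral over t plus noise
y = (X**2 * t).mean(axis=1) + rng.normal(0, 0.05, n)

# tensor-product monomial basis x^j * t^k (stand-in for tensor-product B-splines);
# each design column is a Riemann-sum approximation of the integral of X(t)^j t^k
deg = 3
cols = [((X**j) * (t**k)).mean(axis=1)
        for j in range(deg + 1) for k in range(deg + 1)]
Z = np.column_stack(cols)

lam = 1e-4                          # ridge penalty (roughness-penalty stand-in)
beta = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)
mse = float(np.mean((y - Z @ beta)**2))
```

Because the true surface x^2 * t lies in the span of this tensor basis, the fitted in-sample error is close to the noise variance.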

2.
Recently there has been a lot of effort to model extremes of spatially dependent data. These efforts fall into two distinct groups: the study of max-stable processes, together with the development of statistical models within this framework; and the use of more pragmatic, flexible Bayesian hierarchical models (BHM) with simulation-based inference techniques. Each modeling strategy has its strong and weak points. While max-stable models capture the local behavior of spatial extremes correctly, hierarchical models based on the conditional independence assumption lack the asymptotic arguments that max-stable models enjoy. On the other hand, they are very flexible in allowing the introduction of physical plausibility into the model. When the objective of the data analysis is to estimate return levels or to krige extreme values in space, capturing the correct dependence structure between the extremes is crucial, and max-stable processes are better suited for these purposes. However, when the primary interest is to explain the sources of variation in extreme events, Bayesian hierarchical modeling is a very flexible tool due to the ease with which random effects are incorporated in the model. In this paper, we model a data set on Portuguese wildfires to show the flexibility of BHM in incorporating spatial dependencies acting at different resolutions.

3.
This article presents and compares two approaches of principal component (PC) analysis for two-dimensional functional data on a possibly irregular domain. The first approach applies the singular value decomposition of the data matrix obtained from a fine discretization of the two-dimensional functions. When the functions are only observed at discrete points that are possibly sparse and may differ from function to function, this approach incorporates an initial smoothing step prior to the singular value decomposition. The second approach employs a mixed effects model that specifies the PC functions as bivariate splines on triangulations and the PC scores as random effects. We apply the thin-plate penalty for regularizing the function estimation and develop an effective expectation–maximization algorithm for calculating the penalized likelihood estimates of the parameters. The mixed effects model-based approach integrates scatterplot smoothing and functional PC analysis in a unified framework and is shown in a simulation study to be more efficient than the two-step approach that separately performs smoothing and PC analysis. The proposed methods are applied to analyze the temperature variation in Texas using 100 years of temperature data recorded by Texas weather stations. Supplementary materials for this article are available online.
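The first, SVD-based approach can be illustrated on a fine regular grid. A minimal sketch with simulated curves (no initial smoothing step, and a one-dimensional domain rather than the two-dimensional domains treated in the article):

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 100, 60
t = np.linspace(0, 1, T)

# two orthogonal modes of variation plus observation noise
phi1 = np.sqrt(2) * np.sin(2 * np.pi * t)
phi2 = np.sqrt(2) * np.cos(2 * np.pi * t)
scores = rng.normal(0, [2.0, 0.5], size=(n, 2))     # mode 1 dominates
Y = scores[:, :1] * phi1 + scores[:, 1:] * phi2 + rng.normal(0, 0.1, (n, T))

# SVD of the centered data matrix: right singular vectors are PC functions
Yc = Y - Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
pc1 = Vt[0]                                         # leading PC (up to sign)
var_explained = s[0]**2 / np.sum(s**2)

# cosine alignment of the estimated PC with the dominant true mode
align = abs(pc1 @ phi1) / (np.linalg.norm(pc1) * np.linalg.norm(phi1))
```

The leading right singular vector recovers the dominant mode of variation, and the squared singular values give the variance decomposition.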

4.
Functional regression modeling via regularized Gaussian basis expansions
We consider the problem of constructing functional regression models for scalar responses and functional predictors, using Gaussian basis functions along with the technique of regularization. An advantage of regularized Gaussian basis expansions for functional data analysis is that they provide a much more flexible instrument for transforming each individual's observations into functional form. In constructing functional regression models there remains the problem of how to determine the number of basis functions and an appropriate value of the regularization parameter. We present model selection criteria for evaluating models estimated by the method of regularization in the context of functional regression models. The proposed functional regression models are applied to Canadian temperature data. Monte Carlo simulations are conducted to examine the efficiency of our modeling strategies. The simulation results show that the proposed procedure performs well, especially in terms of flexibility and stability of the estimates.

5.
We extend the definition of functional data registration to encompass a larger class of registration models. In contrast to traditional registration models, we allow for registered functions that have more than one primary direction of variation. The proposed Bayesian hierarchical model simultaneously registers the observed functions and estimates the two primary factors that characterize variation in the registered functions. Each registered function is assumed to be predominantly composed of a linear combination of these two primary factors, and the function-specific weights for each observation are estimated within the registration model. We show how these estimated weights can easily be used to classify functions after registration using both simulated data and a juggling dataset. Supplementary materials for this article are available online.

6.
7.
We develop algorithms for performing semiparametric regression analysis in real time, with data processed as it is collected and made immediately available via modern telecommunications technologies. Our definition of semiparametric regression is quite broad and includes, as special cases, generalized linear mixed models, generalized additive models, geostatistical models, wavelet nonparametric regression models and their various combinations. Fast updating of regression fits is achieved by couching semiparametric regression into a Bayesian hierarchical model or, equivalently, graphical model framework and employing online mean field variational ideas. An Internet site attached to this article, realtime-semiparametric-regression.net, illustrates the methodology for continually arriving stock market, real estate, and airline data. Flexible real-time analyses based on increasingly ubiquitous streaming data sources stand to benefit. This article has online supplementary material.
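The online-updating flavor can be illustrated in the conjugate Gaussian special case, where one-pass streaming updates of the posterior are exact; the mean field variational updates in the article generalize this pattern to nonconjugate semiparametric models. A hypothetical simulated stream:

```python
import numpy as np

rng = np.random.default_rng(2)
true_beta = np.array([1.5, -2.0])
sigma2 = 0.25                      # known noise variance (simplification)

P = np.eye(2) / 10.0               # prior precision: beta ~ N(0, 10 I)
b = np.zeros(2)                    # precision-weighted posterior mean

for _ in range(500):               # observations arrive one at a time
    x = np.array([1.0, rng.uniform(-1, 1)])        # intercept + covariate
    y = x @ true_beta + rng.normal(0, np.sqrt(sigma2))
    # O(d^2) update per observation; past data are never revisited
    P += np.outer(x, x) / sigma2
    b += x * y / sigma2

post_mean = np.linalg.solve(P, b)
```

Each arriving observation touches only the sufficient statistics (P, b), which is what makes real-time fitting of streaming data feasible.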

8.
The current parameterization and algorithm used to fit a smoothing spline analysis of variance (SSANOVA) model are computationally expensive, making a generalized additive model (GAM) the preferred method for multivariate smoothing. In this article, we propose an efficient reparameterization of the smoothing parameters in SSANOVA models, and a scalable algorithm for estimating multiple smoothing parameters in SSANOVAs. To validate our approach, we present two simulation studies comparing our reparameterization and algorithm to implementations of SSANOVAs and GAMs that are currently available in R. Our simulation results demonstrate that (a) our scalable SSANOVA algorithm outperforms the currently used SSANOVA algorithm, and (b) SSANOVAs can be a fast and reliable alternative to GAMs. We also provide an example with oceanographic data that demonstrates the practical advantage of our SSANOVA framework. Supplementary materials that are available online can be used to replicate the analyses in this article.

9.
Recurrent event data frequently occur in longitudinal studies, and it is often of interest to estimate the effects of covariates on the recurrent event rate. This paper considers a class of semiparametric transformation rate models for recurrent event data, which uses an additive Aalen model as its covariate dependent baseline. The new models are flexible in that they allow for both additive and multiplicative covariate effects, and some covariate effects are allowed to be nonparametric and time-varying. An estimating procedure is proposed for parameter estimation, and the resulting estimators are shown to be consistent and asymptotically normal. Simulation studies and a real data analysis demonstrate that the proposed method performs well and is appropriate for practical use.

10.
Factor models for multivariate count data
We develop a general class of factor-analytic models for the analysis of multivariate (truncated) count data. Dependencies in multivariate counts are of interest in many applications, but few approaches have been proposed for their analysis. Our model class allows for a variety of distributions of the factors in the exponential family. The proposed framework includes a large number of previously proposed factor and random effect models as special cases and leads to many new models that have not been considered so far. Whereas previously these models were proposed separately as different cases, our framework unifies these models and enables one to study them simultaneously. We estimate the Poisson factor models with the method of simulated maximum likelihood. A Monte Carlo study investigates the performance of this approach in terms of estimation bias and precision. We illustrate the approach in an analysis of TV channels data.
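Simulated maximum likelihood can be sketched for the simplest case: a one-factor Poisson model in which the likelihood integrates over the latent factor by Monte Carlo. This is a hypothetical toy (a single known loading, a grid search over the intercept only), far simpler than the model class in the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
n, J, S = 400, 3, 200                 # subjects, count variables, MC draws

alpha_true, lam_true = 1.0, 0.5       # intercept and factor loading
f = rng.normal(size=n)                # latent factor, standard normal
Y = rng.poisson(np.exp(alpha_true + lam_true * f)[:, None], size=(n, J))

fs = rng.normal(size=S)               # common random draws, reused across evaluations

def sim_loglik(alpha, lam):
    """Simulated log-likelihood: Monte Carlo integral over the latent factor."""
    mu = np.exp(alpha + lam * fs)                 # (S,) conditional Poisson rates
    # Poisson log-kernel y*log(mu) - mu (the y! term does not involve parameters)
    logp = Y[:, :, None] * np.log(mu) - mu        # shape (n, J, S)
    per_draw = logp.sum(axis=1)                   # joint over the J count variables
    m = per_draw.max(axis=1, keepdims=True)       # log-sum-exp for stability
    return float(np.sum(m[:, 0] + np.log(np.mean(np.exp(per_draw - m), axis=1))))

grid = np.linspace(0.5, 1.5, 21)
alpha_hat = grid[np.argmax([sim_loglik(a, lam_true) for a in grid])]
```

Reusing common random draws across parameter values keeps the simulated likelihood surface smooth in the parameters, which matters for optimization.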

11.
The generalized partially linear additive model (GPLAM) is a flexible and interpretable approach to building predictive models. It combines features in an additive manner, allowing each to have either a linear or nonlinear effect on the response. However, the choice of which features to treat as linear or nonlinear is typically assumed known. Thus, to make a GPLAM a viable approach in situations in which little is known a priori about the features, one must overcome two primary model selection challenges: deciding which features to include in the model and determining which of these features to treat nonlinearly. We introduce the sparse partially linear additive model (SPLAM), which combines model fitting and both of these model selection challenges into a single convex optimization problem. SPLAM provides a bridge between the lasso and sparse additive models. Through a statistical oracle inequality and thorough simulation, we demonstrate that SPLAM can outperform other methods across a broad spectrum of statistical regimes, including the high-dimensional (p ≫ N) setting. We develop efficient algorithms that are applied to real datasets with half a million samples and over 45,000 features with excellent predictive performance. Supplementary materials for this article are available online.

12.
Conventional data envelopment analysis (DEA) methods assume that input and output variables are continuous. However, in many real managerial cases, some inputs and/or outputs can only take integer values. Simply rounding the performance targets to the nearest integers can lead to misleading solutions and efficiency evaluation. Addressing this kind of integer-valued data, the current paper proposes models that deal directly with slacks to calculate efficiency and super-efficiency scores when integer values are present. Compared with standard radial models, additive (super-efficiency) models demonstrate higher discrimination power among decision making units, especially for integer-valued data. We use an empirical application in early-stage ventures to illustrate our approach.

13.
We study additive function-on-function regression where the mean response at a particular time point depends on the time point itself, as well as the entire covariate trajectory. We develop a computationally efficient estimation methodology based on a novel combination of spline bases with an eigenbasis to represent the trivariate kernel function. We discuss prediction of a new response trajectory, propose an inference procedure that accounts for total variability in the predicted response curves, and construct pointwise prediction intervals. The estimation/inferential procedure accommodates realistic scenarios, such as correlated error structure as well as sparse and/or irregular designs. We investigate our methodology in finite sample size through simulations and two real data applications. Supplementary material for this article is available online.

14.
In this paper, we consider the robust regression problem associated with the Huber loss in the framework of functional linear models and reproducing kernel Hilbert spaces. We propose an Ivanov regularized empirical risk minimization estimation procedure to approximate the slope function of the linear model in the presence of outliers or heavy-tailed noise. By appropriately tuning the scale parameter of the Huber loss, we establish explicit rates of convergence for our estimates in terms of excess prediction risk under mild assumptions. Our study justifies the efficiency of Huber regression for functional data from a theoretical viewpoint.
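The idea can be sketched in finite dimensions: a functional linear model with heavy-tailed noise, the slope function expressed in a small basis, fitted by gradient descent on the Huber loss. Two labeled simplifications: a Tikhonov (penalized) term stands in for the paper's Ivanov (constrained) regularization, and the Huber scale parameter is fixed rather than tuned.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T = 300, 40
t = np.linspace(0, 1, T)
X = rng.normal(size=(n, T)).cumsum(axis=1) / np.sqrt(T)    # rough random paths
beta_true = np.sin(np.pi * t)                              # true slope function
y = X @ beta_true / T + 0.1 * rng.standard_t(df=2, size=n) # heavy-tailed noise

# small fixed basis for the slope function
B = np.column_stack([np.ones(T), t, t**2, np.sin(np.pi * t), np.cos(np.pi * t)])
Z = X @ B / T                           # design matrix for the basis coefficients

def huber_grad(r, delta=0.3):
    """Derivative of the Huber loss: linear near zero, clipped in the tails."""
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

c = np.zeros(B.shape[1])
lr, lam = 0.5, 1e-3                     # step size; Tikhonov penalty (stand-in)
for _ in range(2000):
    r = Z @ c - y
    c -= lr * (Z.T @ huber_grad(r) / n + lam * c)

beta_hat = B @ c                        # estimated slope function on the grid
```

Clipping the residual gradient is what keeps the occasional enormous t-distributed error from dominating the fit, which squared loss would not survive as gracefully.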

15.
Recurrent event gap times data frequently arise in biomedical studies and often more than one type of event is of interest. To evaluate the effects of covariates on the marginal recurrent event hazards...

16.
One useful approach for fitting linear models with scalar outcomes and functional predictors involves transforming the functional data to wavelet domain and converting the data-fitting problem to a variable selection problem. Applying the LASSO procedure in this situation has been shown to be efficient and powerful. In this article, we explore two potential directions for improvements to this method: techniques for prescreening and methods for weighting the LASSO-type penalty. We consider several strategies for each of these directions which have never been investigated, either numerically or theoretically, in a functional linear regression context. We compare the finite-sample performance of the proposed methods through both simulations and real-data applications with both 1D signals and 2D image predictors. We also discuss asymptotic aspects. We show that applying these procedures can lead to improved estimation and prediction as well as better stability. Supplementary materials for this article are available online.
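The wavelet-plus-LASSO pipeline can be sketched end-to-end with a Haar transform and plain coordinate-descent LASSO — i.e., the baseline the article improves on, with no prescreening or penalty weighting. The Haar family is an assumption here; the article's wavelet choice may differ.

```python
import numpy as np

def haar(v):
    """Orthonormal Haar wavelet transform of a length-2^k vector."""
    v = v.astype(float).copy()
    out = np.empty_like(v)
    m = len(v)
    while m > 1:
        half = m // 2
        a = (v[0:m:2] + v[1:m:2]) / np.sqrt(2)             # approximations
        out[half:m] = (v[0:m:2] - v[1:m:2]) / np.sqrt(2)   # details
        v[:half] = a
        m = half
    out[0] = v[0]
    return out

def lasso_cd(A, y, lam, sweeps=200):
    """LASSO by cyclic coordinate descent with soft thresholding."""
    n, p = A.shape
    beta, r = np.zeros(p), y.copy()
    col_ss = (A**2).sum(axis=0)
    for _ in range(sweeps):
        for j in range(p):
            r += A[:, j] * beta[j]                  # put coordinate j back
            rho = A[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            r -= A[:, j] * beta[j]
    return beta

rng = np.random.default_rng(5)
n, T = 100, 16
X = rng.normal(size=(n, T))                 # discretized functional predictors
W = np.array([haar(x) for x in X])          # predictors in the wavelet domain

support, coefs = [1, 3, 8], [2.0, -1.5, 1.0]   # sparse truth in wavelet domain
y = W[:, support] @ np.array(coefs) + rng.normal(0, 0.5, n)

beta_hat = lasso_cd(W, y, lam=15.0)
found = np.nonzero(np.abs(beta_hat) > 0.3)[0]
```

Because the wavelet transform concentrates smooth signals into a few coefficients, variable selection in the transformed domain recovers the sparse support.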

17.
We present a unified semiparametric Bayesian approach based on Markov random field priors for analyzing the dependence of multicategorical response variables on time, space and further covariates. The general model extends dynamic, or state space, models for categorical time series and longitudinal data by including spatial effects as well as nonlinear effects of metrical covariates in flexible semiparametric form. Trend and seasonal components, different types of covariates and spatial effects are all treated within the same general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference is fully Bayesian and uses MCMC techniques for posterior analysis. The approach in this paper is based on latent semiparametric utility models and is particularly useful for probit models. The methods are illustrated by applications to unemployment data and a forest damage survey.

18.
Conventional DEA models were introduced to deal with non-negative data. In practice, however, some outputs and/or inputs can take negative values. In the DEA literature, several approaches have been presented for evaluating the performance of units that operate with negative data. In this paper, we first give a brief review of these works and then present a new additive-based approach in this framework. The proposed model is designed to provide a non-negative target for each observed unit with negative components, where other methods fail. An empirical application in banking is then used to show the applicability of the proposed method and make a comparison with the other approaches in the literature.

19.
New challenges in knowledge extraction include interpreting and classifying data sets while simultaneously considering related information to confirm results or identify false positives. We discuss a data fusion algorithmic framework targeted at this problem. It includes separate base classifiers for each data type and a fusion method for combining the individual classifiers. The fusion method is an extension of current ensemble classification techniques and has the advantage of allowing data to remain in heterogeneous databases. In this paper, we focus on the applicability of such a framework to the protein phosphorylation prediction problem.

20.
Discrete Markov random field models provide a natural framework for representing images or spatial datasets. They model the spatial association present while providing a convenient Markovian dependency structure and strong edge-preservation properties. However, parameter estimation for discrete Markov random field models is difficult due to the complex form of the associated normalizing constant for the likelihood function. For large lattices, the reduced dependence approximation to the normalizing constant is based on the concept of performing computationally efficient and feasible forward recursions on smaller sublattices, which are then suitably combined to estimate the constant for the entire lattice. We present an efficient computational extension of the forward recursion approach for the autologistic model to lattices that have an irregularly shaped boundary and that may contain regions with no data; these lattices are typical in applications. Consequently, we also extend the reduced dependence approximation to these scenarios, enabling us to implement a practical and efficient nonsimulation-based approach for spatial data analysis within the variational Bayesian framework. The methodology is illustrated through application to simulated data and example images. The online supplementary materials include our C++ source code for computing the approximate normalizing constant and simulation studies.
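The forward-recursion idea can be sketched on a toy rectangular lattice: sweep column by column, carrying a vector of partial sums indexed by the 2^H configurations of one column, then verify against brute-force enumeration. This covers only the regular-boundary case; the article's contribution is extending such recursions to irregular boundaries and regions with no data.

```python
import itertools
import math

alpha, beta = 0.1, 0.4            # field and association parameters
H, W = 3, 4                       # lattice height x width (toy size)

cols = list(itertools.product([0, 1], repeat=H))   # all 2^H column states

def within(col):
    """Log-potential of one column: field terms plus vertical edges."""
    s = alpha * sum(col)
    s += beta * sum(col[i] * col[i + 1] for i in range(H - 1))
    return s

def between(c1, c2):
    """Log-potential of horizontal edges between adjacent columns."""
    return beta * sum(a * b for a, b in zip(c1, c2))

# forward recursion: cost O(W * 4^H) instead of O(2^(H*W))
f = [math.exp(within(c)) for c in cols]
for _ in range(W - 1):
    f = [math.exp(within(c2)) * sum(fi * math.exp(between(c1, c2))
                                    for fi, c1 in zip(f, cols))
         for c2 in cols]
logZ = math.log(sum(f))

# brute-force check over all 2^(H*W) lattice configurations
Z = 0.0
for x in itertools.product([0, 1], repeat=H * W):
    g = [x[j * H:(j + 1) * H] for j in range(W)]
    e = (sum(within(c) for c in g) +
         sum(between(g[j], g[j + 1]) for j in range(W - 1)))
    Z += math.exp(e)
```

The recursion exploits the Markov property across columns: the joint distribution factorizes over adjacent column pairs, so only a vector over single-column states needs to be carried forward.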


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号