Similar Literature (10 results)
1.
Multiblock component methods are applied to data sets in which several blocks of variables are measured on the same set of observations, with the goal of analyzing the relationships between these blocks of variables. In this article, we focus on multiblock component methods that integrate the information found in several blocks of explanatory variables in order to describe and explain one set of dependent variables. In what follows, multiblock PLS and multiblock redundancy analysis are chosen as particular cases of multiblock component methods in which one set of variables is explained by a set of predictor variables organized into blocks. Because these multiblock techniques assume that the observations come from a homogeneous population, they will provide suboptimal results when the observations actually come from different populations. A strategy to alleviate this problem, presented in this article, is to use a technique such as clusterwise regression to identify homogeneous clusters of observations. This approach yields two new methods in which each cluster has its own set of regression coefficients. The combination of clustering and regression improves the overall quality of the prediction and facilitates interpretation. In addition, the minimization of a well-defined criterion, by means of a sequential algorithm, ensures that the algorithm converges monotonically. Finally, the proposed method is distribution-free and can be used when the explanatory variables outnumber the observations within clusters. The proposed clusterwise multiblock methods are illustrated with a simulation study and a (simulated) example from marketing.
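The clusterwise idea in this abstract can be sketched as an alternating algorithm: refit a regression inside each cluster, then reassign each observation to the cluster whose model fits it best, so the criterion decreases at each step. The sketch below uses plain OLS rather than the article's multiblock PLS / redundancy analysis components, and all function names and the restart scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def clusterwise_ols(X, y, k=2, n_iter=20, restarts=5, seed=0):
    """Toy clusterwise regression: alternate between (a) refitting a
    least-squares model inside each cluster and (b) reassigning every
    observation to the cluster whose model gives it the smallest squared
    residual.  Several random restarts guard against poor local optima."""
    rng = np.random.default_rng(seed)
    best = (np.inf, None, None)
    for _ in range(restarts):
        labels = rng.integers(0, k, size=len(y))      # random initial partition
        coefs = np.zeros((k, X.shape[1]))
        for _ in range(n_iter):
            for c in range(k):
                idx = labels == c
                if idx.sum() >= X.shape[1]:           # enough points to fit
                    coefs[c], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
            resid2 = (y[:, None] - X @ coefs.T) ** 2  # squared residual per cluster
            labels = resid2.argmin(axis=1)            # reassign observations
        sse = resid2[np.arange(len(y)), labels].sum()
        best = min(best, (sse, coefs, labels), key=lambda t: t[0])
    return best
```

Because each of the two steps can only lower the total within-cluster sum of squared residuals, the criterion decreases monotonically, which is the convergence property the abstract refers to.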

2.
This paper presents an extension of the standard regression tree method to clustered data. Previous work extending tree methods to accommodate correlated data is mainly based on the multivariate repeated-measures approach. We propose a "mixed effects regression tree" method in which the correlated observations are viewed as nested within clusters rather than as vectors of multivariate repeated responses. The proposed method can handle unbalanced clusters, allows observations within clusters to be split, and can incorporate random effects and observation-level covariates. We implemented the proposed method using a standard tree algorithm within the framework of the expectation-maximization (EM) algorithm. The simulation results show that the proposed regression tree method provides substantial improvements over standard trees when the random effects are non-negligible. A real data example is used to illustrate the method.
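The EM-style alternation described above can be sketched in a few lines: fit the tree to the response minus the current random effects, then re-estimate the random effects from the residuals. This is only a minimal sketch, assuming a one-split stump in place of a full tree, random intercepts only, and known variance components (the actual method estimates the variance components inside EM as well).

```python
import numpy as np

def stump_fit(x, y):
    """Best single-split regression stump (stand-in for a full tree)."""
    best = (np.inf, None, y.mean(), y.mean())
    for s in np.unique(x)[:-1]:
        left, right = y[x <= s], y[x > s]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, s, left.mean(), right.mean())
    return best[1], best[2], best[3]

def mixed_tree_sketch(x, y, cluster, n_iter=10, s2_b=1.0, s2_e=1.0):
    """EM-flavoured alternation behind a mixed effects regression tree:
    (1) fit the tree to y minus the current random intercepts, then
    (2) re-estimate per-cluster intercepts by BLUP-style shrinkage of
    the residuals.  s2_b, s2_e are the (here fixed) random-intercept
    and error variances."""
    b = np.zeros(cluster.max() + 1)
    for _ in range(n_iter):
        s, mu_l, mu_r = stump_fit(x, y - b[cluster])        # tree step
        resid = y - np.where(x <= s, mu_l, mu_r)
        for c in range(len(b)):                             # random-effects step
            r = resid[cluster == c]
            b[c] = r.sum() * s2_b / (len(r) * s2_b + s2_e)
    return s, mu_l, mu_r, b
```

The shrinkage factor s2_b / (n_c * s2_b + s2_e) is the standard BLUP weight for a random intercept, so clusters with more observations get less shrinkage.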

3.
A numerical comparison is made between three integration methods for semi-discrete parabolic partial differential equations in two space variables with a mixed derivative. Linear as well as non-linear equations are considered. The integration methods are the well-known one-step line hopscotch method, a four-step line hopscotch method, and a stabilized explicit Runge-Kutta method.
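The hopscotch pattern underlying the first two methods can be illustrated on a much simpler problem. The sketch below is a 1-D odd-even point hopscotch scheme for the heat equation, not the 2-D line hopscotch with a mixed derivative that the paper compares; it only shows the characteristic alternation of an explicit sweep with a formally implicit sweep that becomes pointwise because its neighbours were just updated.

```python
import numpy as np

def hopscotch_heat_1d(u0, r, steps):
    """Odd-even hopscotch for u_t = u_xx on a periodic 1-D grid with
    mesh ratio r = dt/dx**2.  Each step sweeps the grid twice: an
    explicit update on one parity class, then an 'implicit' update on
    the other parity class that is computed pointwise because its
    neighbours already hold new-time values."""
    u = u0.astype(float).copy()
    i = np.arange(len(u))
    for k in range(steps):
        old = u.copy()
        e = (i + k) % 2 == 0                       # explicit sweep
        u[e] = old[e] + r * (np.roll(old, 1)[e] - 2 * old[e] + np.roll(old, -1)[e])
        o = ~e                                     # 'implicit' sweep, pointwise
        u[o] = (old[o] + r * (np.roll(u, 1)[o] + np.roll(u, -1)[o])) / (1 + 2 * r)
    return u
```

Swapping the parity roles at every step, as above, is what makes the scheme consistent; the implicit half-sweep is also what gives hopscotch its favourable stability at mesh ratios where a fully explicit scheme would be restricted.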

4.
Additive models and tree-based regression models are two main classes of statistical models used to predict the scores on a continuous response variable. It is known that additive models become very complex in the presence of higher-order interaction effects, whereas some tree-based models, such as CART, have problems capturing linear main effects of continuous predictors. To overcome these drawbacks, the regression trunk model has been proposed: a multiple regression model with main effects and a parsimonious number of higher-order interaction effects. The interaction effects can be represented by a small tree: a regression trunk. This article proposes a new algorithm, the Simultaneous Threshold Interaction Modeling Algorithm (STIMA), to estimate a regression trunk model; it is more general and more efficient than the initial algorithm (RTA) and is implemented in the R package stima. Results from a simulation study show that the performance of STIMA is satisfactory for sample sizes of 200 or higher. For sample sizes of 300 or higher, the 0.50 SE rule is the best pruning rule for a regression trunk in terms of power and Type I error; for sample sizes of 200, the 0.80 SE rule is recommended. Results from a comparative study of eight regression methods applied to ten benchmark datasets suggest that STIMA and GUIDE are the best performers in terms of cross-validated prediction error, and STIMA appeared to be the best method for datasets containing many categorical variables. The characteristics of a regression trunk model are illustrated using the Boston house price dataset.

Supplemental materials for this article, including the R package stima, are available online.
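The core idea of a regression trunk, a linear model in the main effects augmented with threshold-defined interaction terms, can be sketched with a single split. The code below scans candidate thresholds and keeps the one whose augmented linear model fits best; it is a one-node illustration only, whereas STIMA estimates the linear part and a multi-node trunk simultaneously and then prunes the tree.

```python
import numpy as np

def trunk_sketch(x1, x2, y):
    """One-split regression-trunk flavour: fit a linear model with main
    effects of x1 and x2 plus an interaction term (x1 > s) * x2, choosing
    the threshold s by exhaustive scan over candidate split points."""
    best = (np.inf, None, None)
    for s in np.unique(x1)[:-1]:
        Z = np.c_[np.ones_like(y), x1, x2, (x1 > s) * x2]  # indicator interaction
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        sse = ((y - Z @ beta) ** 2).sum()
        if sse < best[0]:
            best = (sse, s, beta)
    return best          # (sse, threshold, coefficients)
```

Unlike a pure CART fit, the linear main effects are carried by the regression part, so the tree only has to represent the interaction structure.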

5.
In this paper it is argued that all multivariate estimation methods, such as OLS regression, simultaneous linear equations systems, and, more widely, what are known as LISREL methods, have merit as geometric approximation methods, even if the observations are not drawn from a multivariate normal parent distribution and consequently cannot be viewed as ML estimators. It is shown that for large samples the asymptotic distribution of any estimator that is a totally differentiable function of the covariances may be assessed by the δ method. Finally, we stress that the design of the sample and a priori knowledge about the parent distribution may be incorporated to obtain more specific results. It turns out that some fairly traditional assumptions, such as assuming some variables to be non-random and fixed over repeated samples, or the existence of a normal parent distribution, may have dramatic effects on the assessment of standard deviations and confidence bounds if such assumptions are not realistic. The method elaborated here does not make use of such assumptions.
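The δ method invoked above says that for a smooth function g of an estimator, Var(g(θ̂)) ≈ g'(θ)² Var(θ̂) in large samples. The snippet below is a minimal Monte Carlo check of that approximation for g = log applied to a sample mean; the distributional setup (Exponential(1) draws) is an illustrative choice, not taken from the paper.

```python
import numpy as np

# Delta-method check: theta_hat is the mean of n Exponential(1) draws,
# so Var(theta_hat) = 1/n and the true theta is 1.  With g = log and
# g'(1) = 1, the delta method predicts Var(log theta_hat) ~= 1/n.
rng = np.random.default_rng(1)
n, reps = 400, 20000
theta_hat = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
mc_var = np.log(theta_hat).var()   # Monte Carlo variance of g(theta_hat)
delta_var = 1.0 / n                # delta-method approximation
```

For n = 400 the two variances agree to within a few percent, which is the sense in which the asymptotic distribution of a differentiable function of the estimates "may be assessed by the δ method".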

6.
Multivariate cubic polynomial optimization problems, as a special case of general polynomial optimization, have many practical applications in the real world. In this paper, necessary local optimality conditions and necessary global optimality conditions for cubic polynomial optimization problems with mixed variables are established. Local optimization methods, including weakly local optimization methods for general problems with mixed variables and strongly local optimization methods for cubic polynomial optimization problems with mixed variables, are then proposed by exploiting these conditions. A global optimization method is proposed for cubic polynomial optimization problems by combining these local optimization methods with some auxiliary functions. Numerical examples are also given to illustrate that these approaches are very efficient.

7.
Hybrid methodology involving differential-equation modeling and statistical regression is developed in order to test basic ideas in asset price dynamics. In particular, the method provides a mechanism for testing the relative importance of price trend compared with valuation. The significance of yearly highs in prices can also be understood through this procedure. A large data set of 52 closed-end funds comprising about 61,500 data points is used with the mixed effects model in S-Plus. The model suggests that the role of the trend is as significant as that of valuation. Upon determination of the coefficients, one has a model that can be used for short-term forecasts of asset prices. The model incorporates the finiteness of assets and the importance of "liquidity", or "excess cash". The statistics utilize data on the number of shares and the national money supply. The methodology can easily be extended to other behavioral effects.

8.
It is increasingly common to be faced with longitudinal or multi-level data sets that have large numbers of predictors and/or a large sample size. Current methods of fitting and inference for mixed effects models tend to perform poorly in such settings. When there are many variables, it is appealing to allow uncertainty in subset selection and to obtain a sparse characterization of the data. Bayesian methods are available to address these goals using Markov chain Monte Carlo (MCMC), but MCMC is very computationally expensive and can be infeasible in large p and/or large n problems. As a fast approximate Bayes solution, we recommend a novel approximation to the posterior relying on variational methods. Variational methods are used to approximate the posterior of the parameters in a decomposition of the variance components, with priors chosen to obtain a sparse solution that allows selection of random effects. The method is evaluated through a simulation study and applied to an epidemiological study.

9.
A data analysis method is proposed to derive a latent structure matrix from a sample covariance matrix. The matrix can be used to explore the linear latent effect between two sets of observed variables. Procedures to estimate a set of dependent variables from a set of explanatory variables by using the latent structure matrix are also proposed. The proposed method can assist researchers in improving the effectiveness of SEM models by exploring the latent structure between two sets of variables. In addition, a structure residual matrix can be derived as a by-product of the proposed method, with which researchers can conduct experimental procedures for variable combinations and selections to build various models for hypothesis testing. These capabilities can improve the effectiveness of traditional SEM methods in data property characterization and model hypothesis testing. Case studies are provided to demonstrate the step-by-step derivation of the latent structure matrix, and the latent structure estimation results are quite close to those of PLS regression. A structure coefficient index is suggested to explore the relationships among various combinations of variables and their effects on the variance of the latent structure.

10.
This paper provides a new methodology to solve bilinear, non-convex mathematical programming problems by a suitable transformation of variables. Schur's decomposition and special ordered set (SOS) type 2 constraints are used, resulting in a mixed integer linear or quadratic program in the two applications shown. While Beale, Tomlin, and others developed the use of SOS type 2 variables to handle non-convexities, our approach is novel in two respects. First, the use of Schur's decomposition as an integral part of the approximation step is new and leads to a numerically viable method to separate the variables. Second, the combination of our approach with the handling of bilinear side constraints in a complementarity or equilibrium problem setting is also new and opens the way to many interesting and realistic modifications of such models. We contrast our approach with other methods for solving bilinear problems, also known as indefinite quadratic programs. From a practical point of view our methodology is helpful because no specialized procedures need to be created, so existing solvers can be used. The approach is illustrated with two engineering examples, and the mathematical analysis appears in the Appendices.
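The separation-plus-SOS2 idea can be illustrated on a single bilinear term. Using the elementary identity x·y = ((x+y)² − (x−y)²)/4 in place of the paper's general Schur-decomposition transformation, each square can be modeled by the SOS2 "lambda" formulation: the argument is written as a convex combination of two adjacent breakpoints, and the square is replaced by the same combination of the breakpoint squares. The sketch below evaluates that approximation directly rather than enforcing SOS2 inside a mixed-integer program, as a solver would.

```python
import numpy as np

def pwl_square(u, grid):
    """SOS2-style piecewise-linear model of u**2: u is expressed as a
    convex combination of two adjacent breakpoints (the SOS2 condition:
    at most two nonzero, adjacent weights), and u**2 is approximated by
    the same combination of the breakpoint squares."""
    i = np.clip(np.searchsorted(grid, u) - 1, 0, len(grid) - 2)
    w = (grid[i + 1] - u) / (grid[i + 1] - grid[i])   # weight on left breakpoint
    return w * grid[i] ** 2 + (1 - w) * grid[i + 1] ** 2

def bilinear_pwl(x, y, grid):
    """Separate the bilinear term via x*y = ((x+y)**2 - (x-y)**2) / 4,
    then approximate each square with the SOS2 model above."""
    return (pwl_square(x + y, grid) - pwl_square(x - y, grid)) / 4.0
```

Because each piecewise-linear square is exact at the breakpoints and within (h/2)² of the true square for breakpoint spacing h, the error in the recovered product shrinks quadratically as the grid is refined, which is why the approach stays numerically viable with off-the-shelf MILP/MIQP solvers.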
