Similar literature
 20 similar documents found (search time: 31 ms)
1.
A data analysis method is proposed to derive a latent structure matrix from a sample covariance matrix. The matrix can be used to explore the linear latent effect between two sets of observed variables. Procedures for estimating a set of dependent variables from a set of explanatory variables using the latent structure matrix are also proposed. The proposed method can help researchers improve the effectiveness of SEM models by exploring the latent structure between two sets of variables. In addition, a structure residual matrix can be derived as a by-product of the proposed method, with which researchers can conduct experimental procedures for variable combination and selection to build various models for hypothesis testing. These capabilities can improve the effectiveness of traditional SEM methods in data property characterization and model hypothesis testing. Case studies are provided to demonstrate, step by step, the procedure for deriving the latent structure matrix, and the latent structure estimation results are quite close to the results of PLS regression. A structure coefficient index is suggested to explore the relationships among various combinations of variables and their effects on the variance of the latent structure.

2.
We extend the least angle regression algorithm using the information geometry of dually flat spaces. The extended least angle regression algorithm is used for estimating parameters in generalized linear regression, and it can also be used for selecting explanatory variables. We use the fact that a model manifold of an exponential family is a dually flat space. In estimating parameters, curves corresponding to bisectors in Euclidean space play an important role. Originally, the least angle regression algorithm was used for estimating parameters and selecting explanatory variables in linear regression. It is an efficient algorithm in the sense that the number of iterations is the same as the number of explanatory variables. We extend the algorithm while keeping this efficiency. However, the extended least angle regression algorithm differs significantly from the original algorithm: the extended algorithm removes one explanatory variable in each iteration, while the original algorithm adds one explanatory variable in each iteration. We show results of the extended least angle regression algorithm for two types of datasets and describe its behavior. In particular, the parameter estimates become smaller and smaller and vanish in turn.

3.
This paper studies the behavior of the optimum value of a two-stage stochastic program with recourse (random right-hand sides) as the mean and covariance matrices defining the random variables in the program are perturbed. Several results for convex programs are developed and are used to study the effect such perturbations have on the regularity properties of the stochastic programs. Costs associated with incorrectly specifying the mean and covariance matrices are discussed and estimated. A stochastic programming model in which the random variable is dependent on the first-stage decision is presented.

4.
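Most variable selection methods based on linear mixed-effects models select fixed effects and random effects in separate stages, which is cumbersome and prone to model bias. Moreover, most nonparametric and semiparametric linear mixed-effects models address only the smoothness of the nonparametric part or the selection of fixed effects, not the selection of nonparametric variables or random effects. This paper approximates the nonparametric function part with B-spline functions, thereby converting the semiparametric linear mixed-effects model into a linear mixed-effects model with approximation error. A modified Cholesky decomposition is applied to the covariance matrix of the random effects and the linear mixed-effects model is reparameterized; two kinds of penalties, a group ALASSO penalty and an ALASSO penalty, are then imposed on the likelihood function of the model. The method achieves joint variable selection of nonparametric variables, fixed effects, and random effects, and the resulting estimators satisfy consistency, sparsity, and the oracle property. Finally, a numerical simulation shows that the proposed estimation method performs well in both the accuracy of variable selection and the precision of parameter estimation.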

5.
Certain constructions of copulas can be interpreted as an eigendecomposition of a kernel. We study some properties of the eigenfunctions, and of their integrals, of a covariance kernel related to a bivariate distribution. The covariance between functions of random variables, expressed in terms of the cumulative distribution function, is used. Some bounds for the trace of the kernel and some inequalities for a continuous random variable concerning a function and its derivative are obtained. We also obtain relations to diagonal expansions and canonical correlation analysis and, as a by-product, series of constants for some particular distributions.

6.
A model-based clustering procedure for data of mixed type, clustMD, is developed using a latent variable model. It is proposed that a latent variable, following a mixture of Gaussian distributions, generates the observed data of mixed type. The observed data may be any combination of continuous, binary, ordinal or nominal variables. clustMD employs a parsimonious covariance structure for the latent variables, leading to a suite of six clustering models that vary in complexity and provide an elegant and unified approach to clustering mixed data. An expectation maximisation (EM) algorithm is used to estimate clustMD; in the presence of nominal data a Monte Carlo EM algorithm is required. The clustMD model is illustrated by clustering simulated mixed-type data and prostate cancer patients, on whom mixed data have been recorded.

7.
8.
We present the autoregressive Hilbertian with exogenous variables model (ARHX), which is intended to take into account the dependence structure of random curves viewed as H-valued random variables, where H is a Hilbert space of functions, under the influence of explanatory variables. Limit theorems and consistent estimators are derived from an autoregressive representation. A simulation study illustrates the accuracy of the estimation by comparing forecasts with those of other functional models.

9.
With advanced capability in data collection, applications of linear regression analysis now often involve a large number of predictors. Variable selection has thus become an increasingly important issue in building a linear regression model. For a given selection criterion, variable selection is essentially an optimization problem that seeks the optimal solution over $2^m$ possible linear regression models, where $m$ is the total number of candidate predictors. When $m$ is large, exhaustive search becomes practically impossible. Simple suboptimal procedures such as forward addition, backward elimination, and backward-forward stepwise procedures are fast but can easily be trapped in a local solution. In this article we propose a relatively simple algorithm for selecting explanatory variables in a linear regression for a given variable selection criterion. Although the algorithm is still suboptimal, it has been shown to perform well in extensive empirical studies. The main idea of the procedure is to partition the candidate predictors into a small number of groups. By working with various combinations of the groups and iterating the search through random regrouping, the search space is substantially reduced, increasing the probability of finding the global optimum. By identifying and collecting “important” variables throughout the iterations, the algorithm finds increasingly better models until convergence. The proposed algorithm performs well in simulation studies with 60 to 300 predictors. As a by-product of the proposed procedure, we are able to study the behavior of variable selection criteria when the number of predictors is large. Such a study has not been possible with traditional search algorithms.

This article has supplementary material online.
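
To make the size of the search space concrete, the following minimal Python sketch enumerates every subset of m candidate predictors and picks the one that minimizes a criterion by brute force. It illustrates only the exhaustive search that the abstract describes as impractical for large m, not the grouping algorithm proposed in the article. The names `all_subsets`, `exhaustive_search`, `toy_criterion` and `TRUE_SET` are hypothetical, and the toy criterion stands in for a real selection criterion such as AIC or BIC.

```python
from itertools import combinations


def all_subsets(m):
    """Yield every subset of predictor indices 0..m-1, including the empty set."""
    for k in range(m + 1):
        yield from combinations(range(m), k)


def exhaustive_search(m, criterion):
    """Return the subset that minimizes `criterion`, checking all 2**m subsets."""
    return min(all_subsets(m), key=criterion)


# Hypothetical toy criterion: a heavy cost for each missing "true" predictor
# and a light cost for each superfluous one.
TRUE_SET = {0, 3}


def toy_criterion(subset):
    chosen = set(subset)
    return 10 * len(TRUE_SET - chosen) + len(chosen - TRUE_SET)


best = exhaustive_search(6, toy_criterion)
n_models = sum(1 for _ in all_subsets(6))

print(best)       # subset found by brute force
print(n_models)   # number of candidate models for m = 6
print(2 ** 60)    # number of candidate models for m = 60
```

With m = 6 the loop visits only 64 models, but the count doubles with every added predictor, which is why brute force is not an option in the 60 to 300 predictor range mentioned above.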

10.
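Panel data models are widely used in economics, biology, statistics, and other fields. Classical panel data models assume that the coefficients of the explanatory variables do not change over time. In practice, however, these coefficients may exhibit multiple unknown structural change points owing to a variety of factors. This paper assumes that a panel data model with interactive fixed effects contains multiple unknown structural change points. It finds that a pairwise-penalized estimation method, which adds to the objective function a penalty on the explanatory-variable coefficients of adjacent time periods, can perform parameter estimation and change-point detection simultaneously. Monte Carlo simulation results show that, whether or not the homoscedasticity assumption holds, the coefficient estimates produced by the method have small bias and the change-point detection error rate is low.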
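
The idea of penalizing coefficients in adjacent time periods can be sketched as follows. This is a simplified illustration under strong assumptions, not the paper's estimator: it uses a single regressor, omits the interactive fixed effects, and assumes an absolute-difference (L1) penalty, whose exact form the abstract does not specify. The function names `pairwise_penalized_objective` and `detected_breaks` are hypothetical.

```python
def pairwise_penalized_objective(y, x, betas, lam):
    """Squared-error loss plus lam times the sum of |beta_t - beta_{t-1}|.

    y[t] and x[t] hold the observations for period t (one regressor, no
    fixed effects); betas[t] is the coefficient for period t.
    The L1 form of the penalty is an assumption made for illustration.
    """
    loss = sum(
        (yi - b * xi) ** 2
        for yt, xt, b in zip(y, x, betas)
        for yi, xi in zip(yt, xt)
    )
    penalty = sum(abs(b1 - b0) for b0, b1 in zip(betas, betas[1:]))
    return loss + lam * penalty


def detected_breaks(betas, tol=1e-8):
    """Periods t at which the coefficient differs from that of period t - 1."""
    return [t for t in range(1, len(betas)) if abs(betas[t] - betas[t - 1]) > tol]


# Illustrative data: four periods, two units, coefficient jumps from 1 to 3.
true_betas = [1.0, 1.0, 3.0, 3.0]
x = [[1.0, 2.0] for _ in true_betas]
y = [[b * xi for xi in xt] for b, xt in zip(true_betas, x)]

objective = pairwise_penalized_objective(y, x, true_betas, lam=0.5)
breaks = detected_breaks(true_betas)
print(objective, breaks)
```

Because the penalty is zero wherever neighbouring coefficients coincide, minimizing an objective of this kind tends to fuse adjacent periods into regimes, and the remaining jumps are read off as change points.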

11.
Regularization of covariance matrices in high dimensions usually either is based on a known ordering of variables or ignores the ordering entirely. This article proposes a method for discovering meaningful orderings of variables based on their correlations using the Isomap, a nonlinear dimension reduction technique designed for manifold embeddings. These orderings are then used to construct a sparse covariance estimator, which is block-diagonal and/or banded. Finding an ordering to which banding can be applied is desirable because banded estimators have been shown to be consistent in high dimensions. We show that in situations where the variables do have such a structure, the Isomap does very well at discovering it, and the resulting regularized estimator performs better for covariance estimation than other regularization methods that ignore variable order, such as thresholding. We also propose a bootstrap approach to constructing the neighborhood graph used by the Isomap, and show that it leads to better estimation. We illustrate our method on data on protein consumption, where the variables (food types) have a structure that cannot easily be described a priori, and on a gene expression dataset. Supplementary materials are available online.

12.
13.
The goal of the present paper is to perform a comprehensive study of the covariance structures in balanced linear models containing random factors which are invariant with respect to marginal permutations of the random factors. We focus on model formulation and interpretation rather than the estimation of parameters. It is proven that permutation invariance implies a specific structure for the covariance matrices. Useful results are obtained for the spectra of permutation invariant covariance matrices. In particular, the reparameterization of random effects, i.e., imposing certain constraints, is considered. There are many ways to choose reparameterization constraints in a linear model; however, not every reparameterization preserves permutation invariance. The question is whether there are natural restrictions on the random effects in a given model, i.e., reparameterizations which are defined by the covariance structure of the corresponding factor. By examining relationships between the reparameterization conditions applied to the random factors of the models and the spectrum of the corresponding covariance matrices when permutation invariance is assumed, restrictions on the spectrum of the covariance matrix are obtained which lead to “sum-to-zero” reparameterization of the corresponding factor.

14.
In this paper, we consider the problem of selecting the variables of the fixed effects in linear mixed models where random effects are present and the observation vectors have been obtained from many clusters. As the variable selection procedure, we use the Akaike Information Criterion (AIC). In the context of mixed linear models, two kinds of AIC have been proposed: marginal AIC and conditional AIC. In this paper, we derive three versions of conditional AIC depending upon different estimators of the regression coefficients and the random effects. Simulation studies show that the proposed conditional AICs are superior to the marginal and conditional AICs proposed in the literature in the sense of selecting the true model. Finally, the results are extended to the case when the random effects in all the clusters are of the same dimension but have a common unknown covariance matrix.

15.
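This paper studies variance decomposition for additive and multiplicative forms. It proves that when the dependent variable equals the sum of the independent variables, the variance of the dependent variable equals the sum of the covariances between each independent variable and the dependent variable; when the dependent variable equals the product of the independent variables, the variance of the logarithm of the dependent variable equals the sum of the covariances between the logarithm of each independent variable and the logarithm of the dependent variable. Statistical data on per capita household income and its influencing factors for 31 Chinese provinces over 2005-2012 are used to illustrate the application of the additive and multiplicative variance decomposition methods.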
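
Both identities follow from the bilinearity of covariance and can be checked numerically. The short Python sketch below does so on simulated data; the data, the variable names, and the helper `cov` are illustrative assumptions and have nothing to do with the provincial income data used in the paper.

```python
import math
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


random.seed(0)
n = 1000

# Additive case: y = x1 + x2 + x3, with x2 deliberately correlated with x1.
x1 = [random.gauss(10, 2) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
x3 = [random.uniform(0, 5) for _ in range(n)]
y = [a + b + c for a, b, c in zip(x1, x2, x3)]

var_y = cov(y, y)
contributions = [cov(x, y) for x in (x1, x2, x3)]
shares = [c / var_y for c in contributions]  # each component's share of Var(y)

# Multiplicative case: p = z1 * z2 with strictly positive factors,
# so the same identity holds after taking logarithms.
z1 = [random.uniform(1, 3) for _ in range(n)]
z2 = [random.uniform(1, 3) for _ in range(n)]
log_z1 = [math.log(v) for v in z1]
log_z2 = [math.log(v) for v in z2]
log_p = [math.log(a * b) for a, b in zip(z1, z2)]

var_log_p = cov(log_p, log_p)
log_contributions = [cov(lz, log_p) for lz in (log_z1, log_z2)]

print(var_y, sum(contributions))
print(var_log_p, sum(log_contributions))
```

The printed pairs agree up to floating-point rounding, and `shares` sums to one, which is what allows each covariance term to be read as that component's contribution to the total variance.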

16.
Märt Möls 《Acta Appl Math》2003,79(1-2):17-23
In the theory of mixed linear models, the structure of the random effects covariance matrix is assumed to be known. The suggestions in the literature are sometimes contradictory, especially if the model includes interactions between fixed effects and random effects. This paper presents conditions under which two different random effects variance matrices yield equal estimation and prediction results. Some examples and simulation results are also given.

17.
This paper investigates an economic order quantity (EOQ) problem with imperfect quality items, where the percentage of imperfect quality items in each lot is characterized as a random fuzzy variable, while the setup cost per lot, the holding cost of each unit item per day, and the inspection cost of each unit item are characterized as fuzzy variables. In order to maximize the expected long-run average profit, a random fuzzy EOQ model is constructed. Since it is almost impossible to find an analytic method to solve the proposed model, a particle swarm optimization (PSO) algorithm based on random fuzzy simulation is designed. Finally, the effectiveness of the designed algorithm is illustrated by a numerical example.

18.
This paper first reduces the problem of detecting structural breaks in a random walk to that of finding the best subset of explanatory variables in a regression model and then tailors various subset selection criteria to this specific problem. Of particular interest are those new criteria which are obtained by means of simulation using the efficient algorithm of Bai and Perron (J Appl Econom 18:1–22, 2003). Unlike conventional variable selection methods, which penalize new variables entering a model either in the same way as already included variables (e.g., AIC and BIC) or more mildly (e.g., MRIC and $\mathrm{FPE}_{\mathrm{sub}}$), they do not follow any monotonic penalizing scheme. In general, their non-monotonicity is more pronounced in the case of fat tails. The characteristics of the different criteria are illustrated using bootstrap samples from the Nile data set.

19.
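This paper is the part of the research project "Applied Research on the Traffic Management System for Xiamen Port and Nearby Waters" (《厦门港及附近水域交管系统应用研究》) that deals with forecasting port cargo throughput. The project has passed expert appraisal. A regression model is used to forecast the cargo throughput of Xiamen Port in 2000. Selecting suitable explanatory variables from among many candidates yields good forecasts. The results indicate that, when forecasting with mathematical models, what matters most is the mutual fit among the model, the variables, and the data rather than the complexity of the model, especially when little historical data is available.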

20.
This paper studies estimation in partial functional linear quantile regression, in which the dependent variable is related to both a vector of finite length and a function-valued random variable as predictor variables. The slope function is estimated using the functional principal component basis. The asymptotic distribution of the estimator of the vector of slope parameters is derived, and the global convergence rate of the quantile estimator of the unknown slope function is established under a suitable norm. It is shown that this rate is optimal in a minimax sense under some smoothness assumptions on the covariance kernel of the covariate and the slope function. The convergence rate of the mean squared prediction error for the proposed estimators is also established. Finite sample properties of our procedures are studied through Monte Carlo simulations. A real data example based on the Berkeley growth data is used to illustrate the proposed methodology.
