Similar literature
 Found 10 similar articles (search time: 140 ms)
1.
The discrete multi-way layout is an abstract data type associated with regression, experimental designs, digital images or videos, spatial statistics, gene or protein chips, and more. The factors influencing response can be nominal or ordinal. The observed factor level combinations are finitely discrete and often incomplete or irregularly spaced. This paper develops low-risk biased estimators of the means at the observed factor level combinations and extrapolates the estimated means to larger discrete complete layouts. Candidate penalized least squares (PLS) estimators with multiple quadratic penalties express competing conjectures about each of the main effects and interactions in the analysis of variance decomposition of the means. The candidate PLS estimator with smallest estimated quadratic risk attains, asymptotically, the smallest risk over all candidate PLS estimators. In the theoretical analysis, the dimension of the regression space tends to infinity. No assumptions are made about the unknown means or about replication.

2.
The unknown matrix M is the mean of the observed response matrix in a multivariate linear model with independent random errors. This paper constructs regularized estimators of M that dominate, in asymptotic risk, least squares fits to the model and to specified nested submodels. In the first construction, the response matrix is expressed as the sum of orthogonal components determined by the submodels; each component is replaced by an adaptive total least squares fit of possibly lower rank; and these fits are then summed. The second, lower risk, construction differs only in the second step: each orthogonal component is replaced by a modified Efron-Morris fit before summation. Singular value decompositions yield computable formulae for the estimators and their asymptotic and estimated risks. In the asymptotics, the row dimension of M tends to infinity while the column dimension remains fixed. Convergences are uniform when signal-to-noise ratio is bounded. This research was supported in part by National Science Foundation Grant DMS 0404547.

3.
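Credit classification is an important part of credit risk management. Its main purpose is to distinguish creditworthy customers from defaulting customers among applicants, based on the information the applicants provide, so as to give credit decision-makers a basis for their decisions. To distinguish different credit customers correctly, and defaulting customers in particular, kernel principal component analysis (KPCA) is combined with the support vector machine algorithm to construct a KPCA-based least squares fuzzy support vector machine model with variable penalty factors, which is then used to classify credit data. In this model, the sample data are first preprocessed; KPCA is then used to reduce the dimension of the data in a nonlinear way; finally, the least squares fuzzy support vector machine with variable penalty factors classifies the dimension-reduced data. For validation, two public credit datasets are used in an empirical analysis. The empirical results show that the KPCA-based least squares fuzzy support vector machine model with variable penalty factors achieves good classification results and can provide an important reference for credit decision-makers.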

4.
Estimation of the mean function in nonparametric regression is usefully separated into estimating the means at the observed factor levels (a one-way layout problem) and interpolation between the estimated means at adjacent factor levels. Candidate penalized least squares (PLS) estimators for the mean vector of a one-way layout are expressed as shrinkage estimators relative to an orthogonal regression basis determined by the penalty matrix. The shrinkage representation of PLS suggests a larger class of candidate monotone shrinkage (MS) estimators. Adaptive PLS and MS estimators choose the shrinkage vector and penalty matrix to minimize estimated risk. The actual risks of shrinkage-adaptive estimators depend strongly upon the economy of the penalty basis in representing the unknown mean vector. Local annihilators of polynomials, among them difference operators, generate penalty bases that are economical in a range of examples. Diagnostic techniques for adaptive PLS or MS estimators include basis-economy plots and estimates of loss or risk.

5.
We establish the consistency, asymptotic normality, and efficiency for estimators derived by minimizing the median of a loss function in a Bayesian context. We contrast this procedure with the behavior of two Frequentist procedures, the least median of squares (LMS) and the least trimmed squares (LTS) estimators, in regression problems. The LMS estimator is the Frequentist version of our estimator, and the LTS estimator approaches a median-based estimator as the trimming approaches 50% on each side. We argue that the Bayesian median-based method is a good tradeoff between the two Frequentist estimators.

6.
Multilevel modeling is a popular statistical technique for analyzing data in hierarchical format, and thus naturally fits within a distributed database framework. We consider the computational aspects of multilevel modeling across distributed databases. In addition, we consider these aspects under a generalization of the multilevel model where the distributed groups (or databases) are allowed to specify different models at both level-1 (individual) and level-2 (group). For a variety of scenarios, we develop the distributed computation algorithm for two-step least squares (LS) estimators and also for iterative MLE estimators of the parameters of interest; in particular, we determine the required data structure at each computing site, the necessary information (original data, cross-product matrices, coefficient vectors), and the order in which such information needs to be passed between sites. Finally, we discuss recursive updating, fault tolerance, and security issues.

7.
We present a new approach to univariate partial least squares regression (PLSR) based on directional signal-to-noise ratios (SNRs). We show how PLSR, unlike principal components regression, takes into account the actual value and not only the variance of the ordinary least squares (OLS) estimator. We find an orthogonal sequence of directions associated with decreasing SNR. Then, we state partial least squares estimators as least squares estimators constrained to be null on the last directions. We also give another procedure that shows how PLSR rebuilds the OLS estimator iteratively by seeking at each step the direction with the largest difference of signals over the noise. The latter approach does not involve any arbitrary scale or orthogonality constraints.

8.
Time series data with periodic trends, such as daily temperatures or sales of seasonal products, fluctuate between highs and lows throughout the year. Generalized least squares estimators are often computed for such time series data because these estimators have minimum variance among all linear unbiased estimators. However, the generalized least squares solution can require extremely demanding computation when the data set is large. This paper studies an efficient algorithm for generalized least squares estimation in periodic trended regression with autoregressive errors. We develop an algorithm that can substantially simplify generalized least squares computation by reducing large sets of data to smaller sets. This is accomplished by constructing a structured matrix for dimension reduction. Simulations show that the new computation methods using our algorithm can drastically reduce computing time. Our algorithm can be easily adapted to big data that show periodic trends, which are often pertinent to economics, environmental studies, and engineering practices.

9.
For estimating the parameters of models for financial market data, the use of robust techniques is of particular interest. Conditional forecasts, based on the capital asset pricing model, and a factor model are considered. It is proposed to consider least median of squares estimators as one possible alternative to ordinary least squares. Given the complexity of the objective function for the least median of squares estimator, the estimates are obtained by means of optimization heuristics. The performance of two heuristics is compared, namely differential evolution and threshold accepting. It is shown that these methods are well suited to obtain least median of squares estimators for real world problems. Furthermore, it is analyzed to what extent parameter estimates and conditional forecasts differ between the two estimators. The empirical analysis considers daily and monthly data on some stocks from the Dow Jones Industrial Average Index.

10.
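Under linear mixed-effects models, the analysis of variance (ANOVA) estimate and the spectral decomposition (SD) estimate play a very important role in constructing exact tests and generalized p-value pivotal quantities. Although the two estimates are based on different methods, they share many similar advantages, such as unbiasedness and exact closed-form expressions. Using existing results on the spectral decomposition of the covariance matrix, this paper reveals the relationship between the ANOVA estimate and the SD estimate under general linear mixed-effects models with balanced data. For two covariance structures, the nested structure and the multi-way classification random effects structure, it gives necessary and sufficient conditions for the ANOVA estimate and the SD estimate to be equivalent.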
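
Items 5 and 9 above refer to the least median of squares (LMS) estimator as a robust alternative to ordinary least squares (OLS). As a rough illustration only, the Python sketch below contrasts the two on a straight-line fit with one gross outlier. The function names `ols_line` and `lms_line`, the toy data, and the exhaustive search over lines through pairs of points are all assumptions made here; none of them comes from the listed papers, and the pair search only approximates the exact LMS minimizer (item 9 uses optimization heuristics instead).

```python
from itertools import combinations
from statistics import median


def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def lms_line(xs, ys):
    """Approximate least median of squares fit.

    Candidate lines are those passing through each pair of data points;
    the candidate with the smallest median squared residual is returned.
    This is an illustrative approximation, not an exact LMS solver.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, no finite slope
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        criterion = median(
            (y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)
        )
        if best is None or criterion < best[0]:
            best = (criterion, intercept, slope)
    return best[1], best[2]


# Hypothetical toy data: y = 1 + 2x, with the last response corrupted.
xs = list(range(10))
ys = [1 + 2 * x for x in xs]
ys[9] = 60

print("OLS (intercept, slope):", ols_line(xs, ys))
print("LMS (intercept, slope):", lms_line(xs, ys))
```

On this toy data the single outlier pulls the OLS slope well away from 2, while the median-based criterion ignores it. The pair search costs on the order of n^3 operations here, which is one reason heuristics are of interest for larger problems.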
