Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
The general multivariate analysis of variance model has been extensively studied in the statistical literature and successfully applied in many different fields for analyzing longitudinal data. In this article, we consider the extension of this model having two sets of regressors constituting a growth curve portion and a multivariate analysis of variance portion, respectively. Nowadays, the data collected in empirical studies have relatively complex structures, yet often demand parsimonious modeling. This can be achieved, for example, by imposing rank constraints on the regression coefficient matrices. The reduced rank regression structure also provides a theoretical interpretation in terms of latent variables. We derive likelihood-based estimators for the mean parameters and covariance matrix in this type of model. A numerical example is provided to illustrate the obtained results.

2.
吕晶, 郭朝会, 杨虎, 李婷婷. 《数学学报》 (Acta Mathematica Sinica), 2018, 61(4): 549-568
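Based on the modified Cholesky decomposition, this paper proposes a new method for estimating the within-subject covariance matrix in longitudinal rank regression, and then proposes a new unbiased estimating function to improve estimation efficiency for unbalanced longitudinal data. Under some regularity conditions, the asymptotic normality of the proposed estimator is established. Furthermore, a robust rank score test statistic is proposed for hypothesis testing on the regression coefficients. Simulation studies and an empirical analysis show that the proposed method yields highly efficient estimates and that the proposed test outperforms existing methods.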
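
As a rough illustration of the modified Cholesky decomposition mentioned in this abstract, the Python sketch below factors a covariance matrix Sigma so that T Sigma T' = D, with T unit lower triangular and D diagonal. The function name `modified_cholesky` and the AR(1)-type example matrix are assumptions made for illustration; this is not the estimator proposed in the paper.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular, D diagonal, T @ sigma @ T.T == D."""
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T with L lower triangular
    d = np.diag(L)                  # square roots of the diagonal entries of D
    C = L / d                       # scale column j by 1/d[j]: unit lower triangular
    T = np.linalg.inv(C)            # inverse of a unit lower triangular matrix
    D = np.diag(d ** 2)
    return T, D

# Illustrative AR(1)-type covariance with correlation 0.5 (assumed example data).
idx = np.arange(3)
sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
T, D = modified_cholesky(sigma)
```

In the longitudinal setting, the below-diagonal entries of T are usually read as negatives of autoregressive coefficients and the diagonal of D as innovation variances; both sets of parameters are unconstrained (the variances after a log transform), which is what makes a regression approach to covariance modeling possible.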

3.

Quantile regression is a powerful complement to the usual mean regression and has become increasingly popular due to its desirable properties. In longitudinal studies, it is necessary to consider the intra-subject correlation among repeated measures over time to improve the estimation efficiency. In this paper, we focus on longitudinal single-index models. Firstly, we apply the modified Cholesky decomposition to parameterize the intra-subject covariance matrix and develop a regression approach to estimate the parameters of the covariance matrix. Secondly, we propose efficient quantile estimating equations for the index coefficients and the link function based on the estimated covariance matrix. Since the proposed estimating equations include a discrete indicator function, we propose smoothed estimating equations for fast and accurate computation of the index coefficients, as well as their asymptotic covariances. Thirdly, we establish the asymptotic properties of the proposed estimators. Finally, simulation studies and a real data analysis illustrate the efficiency of the proposed approach.


4.
We consider regression models with multiple correlated responses for each design point. Under the null hypothesis, a linear regression is assumed. For the least-squares residuals of this linear regression, we establish the limit of the partial sums. This limit is a projection onto a certain subspace of the reproducing kernel Hilbert space of a multivariate Brownian motion. Based on this limit, we propose a significance test of Kolmogorov-Smirnov type to test the null hypothesis and show that this result can be used to study a change-point problem in the case of linear profile data (panel data). We compare our proposed method, which does not rely on any distributional assumptions, with the likelihood ratio test in a simulation study.

5.
In this paper, we propose a joint mean-variance-correlation modeling approach for longitudinal studies. By applying partial autocorrelations, we obtain an unconstrained parametrization for the correlation matrix that automatically guarantees its positive definiteness, and develop a regression approach to model the correlation matrix of the longitudinal measurements by exploiting the parametrization. The proposed modeling framework is parsimonious, interpretable, and flexible for analyzing longitudinal data. A real data example and simulations support the effectiveness of the proposed approach.

6.
The analysis of data generated by animal habitat selection studies, by family studies of genetic diseases, or by longitudinal follow-up of households often involves fitting a mixed conditional logistic regression model to longitudinal data composed of clusters of matched case-control strata. The estimation of model parameters by maximum likelihood is especially difficult when the number of cases per stratum is greater than one. In this case, the denominator of each cluster contribution to the conditional likelihood involves a complex integral in high dimension, which leads to convergence problems in the numerical maximization. In this article we show how these computational complexities can be bypassed using a global two-step analysis for nonlinear mixed effects models. The first step estimates the cluster-specific parameters and can be achieved with standard statistical methods and software based on maximum likelihood for independent data. The second step uses the EM-algorithm in conjunction with conditional restricted maximum likelihood to estimate the population parameters. We use simulations to demonstrate that the method works well when the analysis is based on a large number of strata per cluster, as in many ecological studies. We apply the proposed two-step approach to evaluate habitat selection by pairs of bison roaming freely in their natural environment. This article has supplementary material online.

7.
In many medical applications, longitudinal data sets are available. Longitudinal data, as well as observations from paired organs, show a dependency structure which should be respected in the evaluation. Adler et al. (Comput Stat Data Anal 53(3):718–729, 2009) proposed various bootstrapping strategies for ensemble methods based on classification trees for two measurements of paired organs. These strategies have been shown to improve the classification performance compared to the traditional approach, where only one observation per subject is used. We extend the methodology to the situation where an arbitrary number of observations per individual are available, and investigate the performance of the proposed methods with bagged classification trees (bagging) and random forests in the situation of longitudinal data. Moreover, we adapt the estimation of classification performance criteria to repeated measurements data. The clinical data set consists of morphological examinations of both eyes of glaucoma patients and healthy controls over a time period of up to 7 years. The performance of our modified classifiers is evaluated by a subject-based leave-one-out bootstrap ROC analysis. Simulation results and results for the glaucoma data set demonstrate that our proposal is an improvement over ad hoc strategies and over using all measurements of each subject or a block strategy.

8.
1. Introduction. Linear regression models are widely used in statistical analysis of experimental and observational data, that is, one often employs a standard linear model y = α + β'z + ε, a.s., (1.1) to do statistical analysis, where y denotes a scalar outcome variable and z denotes a p-dimensional column vector of regressor variables. This model means that the projection of the p-dimensional explanatory z onto the one-dimensional subspace β'z captures all the information we need to know about the outcome variable y. This is a dimension-reduction model. Hence we may reach the goal of d…

9.
An extension of univariate quantiles in the multivariate set-up has been proposed and studied. The proposed approach is affine equivariant, and it is based on an adaptive transformation-retransformation procedure. Bahadur-type linear representations of the proposed quantiles are established and consequently asymptotic distributions are also derived. As applications of these multivariate quantiles, we develop some affine equivariant quantile contour plots which can be used to study the geometry of the data cloud as well as the underlying probability distribution and to detect outliers. These quantiles can also be used to construct affine invariant versions of multivariate Q-Q plots which are useful in checking how well a given multivariate probability distribution fits the data and for comparing the distributions of two data sets. We illustrate these applications with some simulated and real data sets. We also indicate a way of extending the notion of univariate L-estimates and trimmed means in the multivariate set-up using these affine equivariant quantiles.

10.
Pair-copula constructions of multiple dependence   (Total citations: 9; self-citations: 0; citations by others: 9)
Building on the work of Bedford, Cooke and Joe, we show how multivariate data, which exhibit complex patterns of dependence in the tails, can be modelled using a cascade of pair-copulae, acting on two variables at a time. We use the pair-copula decomposition of a general multivariate distribution and propose a method for performing inference. The model construction is hierarchical in nature, the various levels corresponding to the incorporation of more variables in the conditioning sets, using pair-copulae as simple building blocks. Pair-copula decomposed models also represent a very flexible way to construct higher-dimensional copulae. We apply the methodology to a financial data set. Our approach represents the first step towards the development of an unsupervised algorithm that explores the space of possible pair-copula models and that can also be applied to huge data sets automatically.
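
For concreteness, the three-variable case of such a pair-copula decomposition can be written as below. The notation (marginal densities f_i, distribution functions F_i, pair-copula densities c) is assumed here for illustration and may differ from the paper's.

```latex
f(x_1, x_2, x_3) = f_1(x_1)\, f_2(x_2)\, f_3(x_3)
  \cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)
  \cdot c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr)
  \cdot c_{13|2}\bigl(F(x_1 \mid x_2), F(x_3 \mid x_2)\bigr)
```

The first two copulae act on the margins, while the third acts on conditional distributions given x_2, which corresponds to adding a variable to the conditioning set at the next level of the hierarchy.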

11.
This paper develops credibility predictors of aggregate losses using a longitudinal data framework. For a model of aggregate losses, the interest is in predicting both the claims number process as well as the claims amount process. In a longitudinal data framework, one encounters data from a cross-section of risk classes with a history of insurance claims available for each risk class. Further, explanatory variables for each risk class over time are available to help explain and predict both the claims number and claims amount process. For the marginal claims distributions, this paper uses generalized linear models, an extension of linear regression, to describe cross-sectional characteristics. Elliptical copulas are used to model the dependencies over time, extending prior work that used multivariate t-copulas. The claims number process is represented using a Poisson regression model that is conditioned on a sequence of latent variables. These latent variables drive the serial dependencies among claims numbers; their joint distribution is represented using an elliptical copula. In this way, the paper provides a unified treatment of both the continuous claims amount and discrete claims number processes. The paper presents an illustrative example of Massachusetts automobile claims. Estimates of the latent claims process parameters are derived and simulated predictions are provided.

12.
An essential feature of longitudinal data is the existence of autocorrelation among the observations from the same unit or subject. Two-stage random-effects linear models are commonly used to analyze longitudinal data. These models are not flexible enough, however, for exploring the underlying data structures and, especially, for describing time trends. Semi-parametric models have been proposed recently to accommodate general time trends. But these semi-parametric models do not provide a convenient way to explore interactions among time and other covariates, although such interactions exist in many applications. Moreover, semi-parametric models require specifying the design matrix of the covariates (time excluded). We propose nonparametric models to resolve these issues. To fit nonparametric models, we use the novel technique of multivariate adaptive regression splines for the estimation of the mean curve and then apply an EM-like iterative procedure for covariance estimation. After giving a general algorithm for model building, we show how to design a fast algorithm. We use both simulated and published data to illustrate the use of our proposed method.

13.
Multivariate longitudinal data arise frequently in a variety of applications, where multiple outcomes are measured repeatedly from the same subject. In this paper, we first propose a two-stage weighted least squares estimation procedure for the regression coefficients when the random error follows an irregular autoregressive (AR) process, and establish asymptotic normality properties for the resulting estimators. We then apply the smoothly clipped absolute deviation (SCAD) variable selection approach to determine the order of the AR error process. We further propose a test statistic to check whether multiple responses are correlated at the same observation time, and derive the asymptotic distribution of the proposed test statistic. Several simulated examples and a real data analysis are presented to illustrate the finite-sample performance of the proposed method.
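
The SCAD penalty referred to in this abstract has a standard closed form, sketched here in Python. The function name `scad_penalty` and the default a = 3.7 (a commonly used value) are assumptions for illustration; how the penalty enters the AR order selection in the paper is not shown.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty, evaluated elementwise on |theta| (requires lam > 0 and a > 2)."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                                 # |theta| <= lam
    quad = (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))     # lam < |theta| <= a*lam
    flat = lam ** 2 * (a + 1) / 2                                    # |theta| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, flat))
```

The penalty grows linearly near zero, like the lasso, and then levels off to a constant, so large coefficients are not shrunk further; small coefficients, such as those of unnecessary AR lags, are the ones pushed to zero.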

14.
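Panel data arise frequently in many research fields, such as longitudinal follow-up studies. In many situations, the longitudinal response variable is related to both the observation times and the censoring time. In this paper, under biased sampling and in the presence of such dependence, we use an unobservable latent variable to propose a joint modeling approach that characterizes the dependence of the longitudinal response on the observation times and the censoring time. We obtain estimating equations for the regression parameters of the model and the asymptotic properties of the estimators, verify through numerical simulation that the estimators also perform well in small samples, and apply the estimation method to the analysis of a real bladder cancer data set.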

15.
This paper presents an extension of the standard regression tree method to clustered data. Previous works extending tree methods to accommodate correlated data are mainly based on the multivariate repeated-measures approach. We propose a “mixed effects regression tree” method where the correlated observations are viewed as nested within clusters rather than as vectors of multivariate repeated responses. The proposed method can handle unbalanced clusters, allows observations within clusters to be split, and can incorporate random effects and observation-level covariates. We implemented the proposed method using a standard tree algorithm within the framework of the expectation-maximization (EM) algorithm. The simulation results show that the proposed regression tree method provides substantial improvements over standard trees when the random effects are non-negligible. A real data example is used to illustrate the method.

16.
Clusterwise regression consists of finding a number of regression functions, each approximating a subset of the data. In this paper, a new approach for solving clusterwise linear regression problems is proposed based on a nonsmooth nonconvex formulation. We present an algorithm for minimizing this nonsmooth nonconvex function. This algorithm incrementally divides the whole data set into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate a good starting point for solving global optimization problems at each iteration of the incremental algorithm. Such an approach allows one to find a global or near-global solution to the problem when the data sets are sufficiently dense. The algorithm is compared with the multistart Späth algorithm on several publicly available data sets for regression analysis.
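
The abstract does not state the objective function. One common way to write a nonsmooth nonconvex clusterwise objective, assumed here for illustration, is the sum over data points of the smallest squared residual among the k regression functions. The Python sketch below evaluates that objective; the name `clusterwise_objective` and the toy data are hypothetical, and this is not the paper's incremental algorithm.

```python
import numpy as np

def clusterwise_objective(X, y, coefs, intercepts):
    """Sum over points of the minimum squared residual across k linear regressions.

    X: (n, p) inputs, y: (n,) responses,
    coefs: (k, p) slopes, intercepts: (k,) intercepts.
    Returns the objective value and the index of the best-fitting line per point.
    """
    preds = X @ coefs.T + intercepts          # (n, k): prediction of each line per point
    sq_err = (preds - y[:, None]) ** 2        # squared residuals, shape (n, k)
    return sq_err.min(axis=1).sum(), sq_err.argmin(axis=1)

# Toy data (assumed): two points on y = x and two points on y = 3x.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 6.0, 9.0])
coefs = np.array([[1.0], [3.0]])
intercepts = np.array([0.0, 0.0])
value, labels = clusterwise_objective(X, y, coefs, intercepts)
```

The inner minimum over lines is what makes the function nonsmooth, and the implicit assignment of points to lines is what makes it nonconvex.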

17.
In this paper, we develop a semi-parametric Bayesian estimation approach through the Dirichlet process (DP) mixture in fitting linear mixed models. The random-effects distribution is specified by introducing a multivariate skew-normal distribution as the base for the Dirichlet process. The proposed approach efficiently deals with modeling issues in a wide range of non-normally distributed random effects. We adopt Gibbs sampling techniques to obtain the parameter estimates. A small simulation study is conducted to show that the proposed DP prior is better at the prediction of random effects. Two real data sets are analyzed and tested by several hypothetical models to illustrate the usefulness of the proposed approach.

18.
Non-random missing data pose serious problems in longitudinal studies. The binomial distribution parameter becomes unidentifiable without any other auxiliary information or assumption when it suffers from non-ignorable missing data. Existing methods are mostly based on the log-linear regression model. In this article, a model is proposed for longitudinal data with non-ignorable non-response. We use the pre-test baseline data to improve the identifiability of the post-test parameter. Furthermore, we derive the identified estimation (IE), the maximum likelihood estimation (MLE) and its associated variance for the post-test parameter. The simulation study based on the model of this paper shows that the proposed approach gives promising results.

19.
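Building regression models for causal relationships that objectively exist between phenomena is common practice. In this article, following the basic principles of multivariate regression analysis and using observational data collected on the production floor, we establish a multivariate multiple regression equation relating two quality characteristics of a product to their five key influencing factors. To illustrate the feasibility of applying multivariate regression, we also give, through a worked example, two forms of the estimate of the response vector, as well as confidence intervals for unconditional prediction.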
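
A minimal sketch of multivariate multiple regression with two responses and five predictors, matching the dimensions in this abstract, is given below. The data are simulated purely for illustration (the production data are not available here), and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 50, 5, 2                        # observations, predictors, responses

# Simulated design matrix with an intercept column (assumed data, not the paper's).
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
B_true = rng.normal(size=(p + 1, q))
Y = X @ B_true + 0.1 * rng.normal(size=(n, q))

# Least-squares estimate of the (p + 1) x q coefficient matrix, all responses at once.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Residuals and the usual unbiased estimate of the q x q error covariance matrix.
resid = Y - X @ B_hat
Sigma_hat = resid.T @ resid / (n - p - 1)
```

Each column of `B_hat` equals the ordinary least-squares fit for the corresponding response; what the multivariate formulation adds is `Sigma_hat`, the estimated covariance between the two responses' errors, which joint prediction regions rely on.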

20.
This paper investigates solving the knapsack problem with imprecise weight coefficients using genetic algorithms. This work is based on the assumption that each weight coefficient is imprecise due to decimal truncation or rough estimation of the coefficient by the decision-maker. To deal with this kind of imprecise data, fuzzy sets provide a powerful tool to model and solve this problem. We investigate the possibility of using genetic algorithms in solving the fuzzy knapsack problem without defining membership functions for each imprecise weight coefficient. The proposed approach simulates a fuzzy number by distributing it into some partition points. We use genetic algorithms to evolve the values in each partition point so that the final values represent the membership grade of a fuzzy number. The empirical results show that the proposed approach obtains better solutions within the given bound of each imprecise weight coefficient than the fuzzy knapsack approach. The fuzzy genetic algorithm approach differs from the traditional fuzzy approach but gives better results.
