Similar articles
 20 similar articles found (search time: 218 ms)
1.
The multivariate probit model is very useful for analyzing correlated multivariate dichotomous data. Recently, this model has been generalized with a confirmatory factor analysis structure to accommodate a more general covariance structure; the result is called the MPCFA model. The main purpose of this paper is to consider local influence analysis of the MPCFA model, a well-recognized and important step of data analysis beyond maximum likelihood estimation. As the observed-data likelihood associated with the MPCFA model is intractable, Cook's well-known approach cannot be applied to obtain local influence measures. Hence, the local influence measures are developed via Zhu and Lee's [Local influence for incomplete data model, J. Roy. Statist. Soc. Ser. B 63 (2001) 111-126.] approach, which is closely related to the EM algorithm. The diagnostic measures are derived from the conformal normal curvature of an appropriate function. The building blocks are computed via a sufficiently large random sample of the latent response strengths and latent variables that are generated by the Gibbs sampler. Some useful perturbation schemes are discussed. Results obtained from analyses of an artificial example and a real example are presented to illustrate the newly developed methodology.

2.
Our aim is to construct a general measurement framework for analyzing the effects of measurement errors in multivariate measurement scales. We define a measurement model, which forms the core of the framework. The measurement scales in turn are often produced by methods of multivariate statistical analysis. As a central element of the framework, we introduce a new, general method of estimating the reliability of measurement scales. It is more appropriate than the classical procedures, especially in the context of multivariate analyses. The framework provides methods for various topics related to the quality of measurement, such as assessing the structural validity of the measurement model, estimating the standard errors of measurement, and correcting the predictive validity of a measurement scale for attenuation. A proper estimate of reliability is a requisite in each task. We illustrate the idea of the measurement framework with an example based on real data.

3.
Analysis of repeated measures under unequal variances   (Total citations: 1; self-citations: 0; citations by others: 1)
The problem of making inferences on a widely used repeated measures model is considered without the assumption of equal error variances. By taking the generalized approach to statistical inference, we derive the formulae needed to compute exact generalized p-values for testing the equality of treatment effects, occasion effects, and their interactions. We also provide formulae for making inferences about the variance components of the model. The advantage of the generalized p-values over the classical F-test is demonstrated by means of an example.

4.
A weighted multivariate signed-rank test is introduced for the analysis of multivariate clustered data. Observations in different clusters may then get different weights. The test provides a robust and efficient alternative to normal theory based methods. Asymptotic theory is developed to find the approximate p-value as well as to calculate the limiting Pitman efficiency of the test. A conditionally distribution-free version of the test is also discussed. The finite-sample behavior of different versions of the test statistic is explored by simulations, and the new test is compared to the unweighted and weighted versions of Hotelling’s T² test and the multivariate spatial sign test introduced in [D. Larocque, J. Nevalainen, H. Oja, A weighted multivariate sign test for cluster-correlated data, Biometrika 94 (2007) 267-283]. Finally, a real data example is used to illustrate the theory.

5.
In longitudinal studies with small samples and incomplete data, multivariate normal-based models continue to be a powerful tool for analysis. This has included a broad scope of biomedical studies. Testing the assumption of multivariate normality (MVN) is critical. Although many methods are available for testing normality in complete data with large samples, few deal with testing in small samples. For example, Liang et al. (J. Statist. Planning and Inference 86 (2000) 129) propose a projection procedure for testing MVN for complete data with small samples where the sample sizes may be close to the dimension. To our knowledge, no statistical methods for testing MVN in incomplete data with small samples are yet available. This article develops a test procedure in such a setting using multiple imputations and the projection test. To utilize the incomplete data structure in multiple imputation, we adopt a noniterative inverse Bayes formulae (IBF) sampling procedure instead of the iterative Gibbs sampling to generate iid samples. Simulations are performed for both complete and incomplete data when the sample size is less than the dimension. The method is illustrated with a real study on an anticancer drug.

6.
For two multivariate normal populations with unequal covariance matrices, a procedure is developed for testing the equality of the mean vectors based on the concept of generalized p-values. The generalized p-values we have developed are functions of the sufficient statistics. The computation of the generalized p-values is discussed and illustrated with an example. Numerical results show that one of our generalized p-value tests has a type I error probability not exceeding the nominal level. A formula involving only a finite number of chi-square random variables is provided for computing this generalized p-value. The formula is useful in a Bayesian solution as well. The problem of constructing a confidence region for the difference between the mean vectors is also addressed using the concept of generalized confidence regions. Finally, using the generalized p-value approach, a solution is developed for the heteroscedastic MANOVA problem.

7.
This paper establishes a link between a generalized matrix Matsumoto-Yor (MY) property and the Wishart distribution. This link highlights certain conditional independence properties within blocks of the Wishart and leads to a new characterization of the Wishart distribution similar to the one recently obtained by Geiger and Heckerman but involving independences for only three pairs of block partitionings of the random matrix. In the process, we obtain two other main results. The first one is an extension of the MY independence property to random matrices of different dimensions. The second result is its converse. It extends previous characterizations of the matrix generalized inverse Gaussian and Wishart seen as a couple of distributions. We present two proofs for the generalized MY property. The first proof relies on a new version of Herz's identity for Bessel functions of matrix arguments. The second proof uses a representation of the MY property through the structure of the Wishart.

8.
Bayesian l0‐regularized least squares is a variable selection technique for high‐dimensional predictors. The challenge is optimizing a nonconvex objective function via search over model space consisting of all possible predictor combinations. Spike‐and‐slab (aka Bernoulli‐Gaussian) priors are the gold standard for Bayesian variable selection, with a caveat of computational speed and scalability. Single best replacement (SBR) provides a fast scalable alternative. We provide a link between Bayesian regularization and proximal updating, which provides an equivalence between finding a posterior mode and a posterior mean with a different regularization prior. This allows us to use SBR to find the spike‐and‐slab estimator. To illustrate our methodology, we provide simulation evidence and a real data example on the statistical properties and computational efficiency of SBR versus direct posterior sampling using spike‐and‐slab priors. Finally, we conclude with directions for future research.

9.
In this paper we consider the problem of testing a hypothesis about the sub-mean vector. For this purpose, the asymptotic expansion of the null distribution of Rao's U-statistic under a general condition is obtained up to order n⁻¹. The same problem in the k-sample case is also investigated. We find that the asymptotic distribution of the generalized U-statistic in the k-sample case is identical to that of the generalized Hotelling's T² distribution up to order n⁻¹. A simulation experiment is carried out and its results are presented. It shows that the asymptotic distributions improve significantly on the limiting distributions in both the small sample case and the large sample case. It also demonstrates the equivalence of the two test statistics mentioned above.

10.
Our aim is to construct a factor analysis method that can resist the effect of outliers. For this we start with a highly robust initial covariance estimator, after which the factors can be obtained from maximum likelihood or from principal factor analysis (PFA). We find that PFA based on the minimum covariance determinant scatter matrix works well. We also derive the influence function of the PFA method based on either the classical scatter matrix or a robust matrix. These results are applied to the construction of a new type of empirical influence function (EIF), which is very effective for detecting influential data. To facilitate the interpretation, we compute a cutoff value for this EIF. Our findings are illustrated with several real data examples.

11.
In many reliability analyses, the probability of obtaining a defective unit in a production process should not be considered constant even though the process is stable and in control. Engineering experience or previous data of similar or related products may often be used in the proper selection of a prior model to describe the random fluctuations in the fraction defective. A generalized beta family of priors, several maximum entropy priors and other prior models are considered for this purpose. In order to determine the acceptability of a product based on the lifelengths of some test units, failure-censored reliability sampling plans for location-scale distributions using average producer and consumer risks are designed. Our procedure allows the practitioners to incorporate a restricted parameter space into the reliability analysis, and it is reasonably insensitive to small disturbances in the prior information. Impartial priors are used to reflect prior neutrality between the producer and the consumer when a consensus on the elicited prior model is required. Nonetheless, our approach also enables the producer and the consumer to assume their own prior distributions. The use of substantial prior information can, in many cases, significantly reduce the amount of testing required. However, the main advantage of utilizing a prior model for the fraction defective is not necessarily reduced sample size but improved assessment of the true sampling risks. An example involving shifted exponential lifetimes is considered to illustrate the results.

12.
In this paper we introduce generalized S-estimators for the multivariate regression model. This class of estimators combines high robustness and high efficiency. They are defined by minimizing the determinant of a robust estimator of the scatter matrix of differences of residuals. In the special case of a multivariate location model, the generalized S-estimator has the important independence property, and can be used for high breakdown estimation in independent component analysis. Robustness properties of the estimators are investigated by deriving their breakdown point and the influence function. We also study the efficiency of the estimators, both asymptotically and at finite samples. To obtain inference for the regression parameters, we discuss the fast and robust bootstrap for multivariate generalized S-estimators. The method is illustrated on a real data example.

13.
Many applications aim to learn a high dimensional parameter of a data generating distribution based on a sample of independent and identically distributed observations. For example, the goal might be to estimate the conditional mean of an outcome given a list of input variables. In this prediction context, bootstrap aggregating (bagging) has been introduced as a method to reduce the variance of a given estimator at little cost to bias. Bagging involves applying an estimator to multiple bootstrap samples and averaging the result across bootstrap samples. In order to address the curse of dimensionality, a common practice has been to apply bagging to estimators which themselves use cross-validation, thereby using cross-validation within a bootstrap sample to select fine-tuning parameters trading off bias and variance of the bootstrap sample-specific candidate estimators. In this article we point out that in order to achieve the correct bias variance trade-off for the parameter of interest, one should apply the cross-validation selector externally to candidate bagged estimators indexed by these fine-tuning parameters. We use three simulations to compare the new cross-validated bagging method with bagging of cross-validated estimators and bagging of non-cross-validated estimators.
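The basic bagging step described in this abstract (apply an estimator to several bootstrap samples, then average) can be sketched as follows. This is a minimal illustration only, not the cross-validated bagging method the article proposes; the function name `bagged_estimate`, its parameters, and the data values are hypothetical.

```python
import random
from statistics import mean, median


def bagged_estimate(sample, estimator, n_boot=200, seed=0):
    """Average `estimator` over `n_boot` bootstrap resamples of `sample`.

    Hypothetical helper for illustration; not taken from the article.
    """
    rng = random.Random(seed)  # fixed seed so the resampling is reproducible
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        # A bootstrap sample draws n observations with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    # Bagging: average the estimator's output across bootstrap samples.
    return mean(estimates)


# Illustrative values only (one gross outlier at 40.0).
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 40.0]
bagged_median = bagged_estimate(data, median)
```

Any estimator that maps a list of observations to a number can be passed in place of `median`; tuning parameters of such an estimator would, per the abstract, be selected by cross-validation applied to the bagged estimator rather than inside each bootstrap sample.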

14.
We compare correspondence analysis (CA) and the alternative approach using Hellinger distance (HD), for representing categorical data in a contingency table. As both methods may be appropriate, we introduce a parameter and define a generalized version of correspondence analysis (GCA) which contains CA and HD as particular cases. Comparisons with alternative approaches are performed. We propose a coefficient which globally measures the similarity between CA and GCA, which can be decomposed into several components, one component for each principal dimension, indicating the contribution of the dimensions to the difference between both representations. Two criteria for choosing the best value of the parameter are proposed.
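To make the two distances in this abstract concrete, the sketch below computes the chi-square distance (the metric underlying CA) and the Hellinger distance between row profiles of a small contingency table. The table values are made up, the function names are hypothetical, and the Hellinger distance is written without a 1/2 normalizing factor, which is an assumption about the convention; the article's parameterized GCA family is not reproduced here.

```python
import math


def row_profiles(table):
    """Divide each row of a contingency table by its row total."""
    return [[x / sum(row) for x in row] for row in table]


def hellinger(p, q):
    """Hellinger distance between two profiles (no 1/2 factor: an assumed convention)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


def chi_square_distance(p, q, col_mass):
    """Chi-square distance between two profiles, weighted by inverse column masses."""
    return math.sqrt(sum((a - b) ** 2 / m for a, b, m in zip(p, q, col_mass)))


# Illustrative contingency table (made-up counts).
table = [[20, 10, 5],
         [10, 20, 5],
         [5, 5, 30]]
total = sum(sum(row) for row in table)
col_mass = [sum(row[j] for row in table) / total for j in range(len(table[0]))]

profiles = row_profiles(table)
d_hd = hellinger(profiles[0], profiles[1])
d_ca = chi_square_distance(profiles[0], profiles[1], col_mass)
```

Unlike the chi-square distance, the Hellinger distance does not depend on the column masses, which is one reason the two representations can differ.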

15.
This paper presents a method of determining joint distributions by known conditional distributions. A generalization of the Factorization Theorem is proposed. The generalized theorem is proved under the assumption that the support of unknown joint distribution may be divided into a countable number of sets, which all satisfy the relative weak positivity condition. This condition is defined in the paper and it generalizes the positivity condition introduced by Hammersley and Clifford. The theorem is illustrated with three examples. In the first example we determine a joint density in the case when the support of an unknown density is a continuous nonproduct set from Euclidean space. In the second example we seek the joint probability for the number of trials and the number of successes in Bernoulli's scheme. We also examine a simple example given by Kaiser and Cressie (J. Multivariate Anal. 73 (2000) 199).

16.
In this paper, we carry out an in-depth theoretical investigation for inference with missing response and covariate data for general regression models. We assume that the missing data are missing at random (MAR) or missing completely at random (MCAR) throughout. Previous theoretical investigations in the literature have focused only on missing covariates or missing responses, but not both. Here, we consider theoretical properties of the estimates under three different estimation settings: complete case (CC) analysis, a complete response (CR) analysis that involves an analysis of those subjects with only completely observed responses, and the all case (AC) analysis, which is an analysis based on all of the cases. Under each scenario, we derive general expressions for the likelihood and devise estimation schemes based on the EM algorithm. We carry out a theoretical investigation of the three estimation methods in the normal linear model and analytically characterize the loss of information for each method, as well as derive and compare the asymptotic variances for each method assuming the missing data are MAR or MCAR. In addition, a theoretical investigation of bias for the CC method is also carried out. A simulation study and real dataset are given to illustrate the methodology.
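The three estimation settings differ in which subjects enter the analysis. The sketch below shows only that subject selection, as read from the abstract; the `Record` type, the function names, and the data are hypothetical, and the EM-based estimation schemes themselves are not shown.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    y: Optional[float]  # response; None marks a missing value
    x: Optional[float]  # covariate; None marks a missing value


def complete_cases(records):
    """CC: subjects with both the response and the covariate observed."""
    return [r for r in records if r.y is not None and r.x is not None]


def complete_responses(records):
    """CR: subjects with an observed response (the covariate may be missing)."""
    return [r for r in records if r.y is not None]


def all_cases(records):
    """AC: every subject, whatever is missing."""
    return list(records)


# Illustrative data only.
records = [Record(1.2, 0.5), Record(None, 0.7), Record(2.3, None), Record(0.9, 1.1)]
cc = complete_cases(records)
cr = complete_responses(records)
ac = all_cases(records)
```

By construction the CC subjects are a subset of the CR subjects, which are a subset of the AC subjects, so the three analyses use progressively more of the data.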

17.
Data in social and behavioral sciences are often hierarchically organized. Multilevel statistical methodology was developed to analyze such data. Most of the procedures for analyzing multilevel data are derived from maximum likelihood based on the normal distribution assumption. Standard errors for parameter estimates in these procedures are obtained from the corresponding information matrix. Because practical data typically contain heterogeneous marginal skewnesses and kurtoses, this paper studies how nonnormally distributed data affect the standard errors of parameter estimates in a two-level structural equation model. Specifically, we study how skewness and kurtosis in one level affect standard errors of parameter estimates within its level and outside its level. We also show that, parallel to asymptotic robustness theory in conventional factor analysis, conditions exist for asymptotic robustness of standard errors in a multilevel factor analysis model.

18.
The theory of Gaussian graphical models is a powerful tool for independence analysis between continuous variables. In this framework, various methods have been conceived to infer independence relations from data samples. However, most of them result in stepwise, deterministic, descent algorithms that are inadequate for solving this issue. More recent developments have focused on stochastic procedures, yet they all base their research on strong a priori knowledge and are unable to perform model selection among the set of all possible models. Moreover, convergence of the corresponding algorithms is slow, precluding applications on a large scale. In this paper, we propose a novel Bayesian strategy to deal with structure learning. Relating graphs to their supports, we convert the problem of model selection into that of parameter estimation. Use of non-informative priors and asymptotic results yields a posterior probability for independence graph supports in closed form. Gibbs sampling is then applied to approximate the full joint posterior density. We finally give three examples of structure learning, one from synthetic data, and the two others from real data.

19.
Statistical modeling is an important area of biomarker research of important genes for new drug targets, drug candidate validation, disease diagnoses, personalized treatment, and prediction of clinical outcome of a treatment. A widely adopted technology is the use of microarray data that are typically very high dimensional. After screening chromosomes for relevant genes using methods such as quantitative trait locus mapping, there may still be a few thousand genes related to the clinical outcome of interest. On the other hand, the sample size (the number of subjects) in a clinical study is typically much smaller. Under the assumption that only a few important genes are actually related to the clinical outcome, we propose a variable screening procedure to eliminate genes having negligible effects on the clinical outcome. Once the dimension of microarray data is reduced to a manageable number relative to the sample size, one can select a final set of genes via a well-known variable selection method such as cross-validation. We establish the asymptotic consistency of the proposed variable screening procedure. Some simulation results are also presented.

20.
Objective functions that are applied in ordinal data analysis must be adequate, i.e. carefully adapted to the structure of the observed data. In addition, any analysis of data that is based upon objective functions must lead to interpretable results. After a general characterization of adequate objective functions in ordinal data analysis, therefore, the particular problems of constructing adequate and interpretable dissimilarity coefficients and correlation coefficients in ordinal data analysis, stress measures (stress functions) in non-metric scaling and generalized stress measures or correlation coefficients in any theory of rank estimation will be discussed.
