Similar Literature
20 similar documents found.
1.
The analysis of variance (ANOVA) is widely used in biological studies, yet there remains considerable confusion among researchers about the interpretation of the hypotheses being tested. Ambiguities arise when statistical designs are unbalanced, and in particular when not all combinations of design factors are represented in the data. This paper clarifies the relationship among hypothesis testing, statistical modelling and computing procedures in ANOVA for unbalanced data. A simple two-factor fixed-effects design is used to illustrate three common parametrizations for ANOVA models, and some associations among these parametrizations are developed. Biologically meaningful hypotheses for main effects and interactions are given in terms of each parametrization, and procedures for testing the hypotheses are described. The standard statistical computing procedures in ANOVA are given along with their corresponding hypotheses. Throughout the development unbalanced designs are assumed, and attention is given to problems that arise with missing cells.
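
As a minimal illustration of the computing issue (hypothetical data and factor names, not the paper's example), the sketch below contrasts Type I (sequential) and Type III (partial) tests for an unbalanced two-factor layout in Python:

```python
# A minimal sketch: Type I vs Type III ANOVA tables for an unbalanced
# two-factor fixed-effects layout. Data frame and factor names (A, B, y)
# are invented for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
# Unbalanced layout: unequal cell counts across the 2 x 3 design.
cells = [("a1", "b1", 8), ("a1", "b2", 3), ("a1", "b3", 5),
         ("a2", "b1", 2), ("a2", "b2", 9), ("a2", "b3", 4)]
rows = [(a, b) for a, b, n in cells for _ in range(n)]
df = pd.DataFrame(rows, columns=["A", "B"])
df["y"] = rng.normal(size=len(df)) + (df["A"] == "a2") * 1.5

# Sum-to-zero contrasts so that Type III main-effect tests are meaningful.
model = smf.ols("y ~ C(A, Sum) * C(B, Sum)", data=df).fit()
print(anova_lm(model, typ=1))  # sequential SS: order-dependent when unbalanced
print(anova_lm(model, typ=3))  # partial SS: each term adjusted for the rest
```

With balanced data the two tables coincide; the discrepancy under imbalance is precisely the source of the ambiguity the paper addresses.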

2.
Editorial     
Complete randomization for many industrial and agricultural experiments is frequently impractical due to constraints in time or cost, or the existence of one or more hard-to-change factors. In these situations, restrictions on randomization may lead to split-plot designs, in which certain factor levels are randomly applied to the whole-plot units and the remaining factor levels are randomly applied to the subplot units. The separate random errors contributed by the two randomizations, in the whole-plot and subplot units, result in a compound symmetric error structure, which affects estimation, inference, and choice of design. In this article, we consider the prediction properties of split-plot designs, expanding the comparison between designs beyond parameter estimation properties, and present three-dimensional variance dispersion graphs. These graphs can be used for evaluating the prediction capability of split-plot designs as well as for developing design strategies. We demonstrate, through several examples, that three-dimensional variance dispersion plots offer a more comprehensive study of competing designs than comparisons based on single-number optimality criteria (such as the D-, G-, and V-criteria).
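
The prediction-variance surface that these graphs summarize can be computed directly. Below is a minimal sketch, assuming a small split-plot design and an arbitrary whole-plot-to-subplot variance ratio d; none of it is taken from the article:

```python
# Prediction variance for a split-plot design under the compound-symmetric
# error structure Var(y) = sigma2_sp * I + sigma2_wp * (block of ones per
# whole plot). The design and the ratio d = sigma2_wp / sigma2_sp are assumed.
import numpy as np

d = 1.0                                      # variance ratio (assumed)
whole_plot = np.repeat([-1, -1, 1, 1], 2)    # hard-to-change factor z
subplot    = np.tile([-1, 1], 4)             # easy-to-change factor x
X = np.column_stack([np.ones(8), whole_plot, subplot,
                     whole_plot * subplot])  # intercept, z, x, z*x

# Block-diagonal compound-symmetric covariance: one 2x2 block per whole plot.
block = np.eye(2) + d * np.ones((2, 2))
V = np.kron(np.eye(4), block)
M_inv = np.linalg.inv(X.T @ np.linalg.inv(V) @ X)

def pred_variance(z, x):
    """Unscaled prediction variance f(z,x)' (X'V^-1 X)^-1 f(z,x)."""
    f = np.array([1.0, z, x, z * x])
    return f @ M_inv @ f

# Evaluating this on a grid over (z, x) gives the surface that the
# three-dimensional variance dispersion graphs display.
for z in (-1.0, 0.0, 1.0):
    print([round(pred_variance(z, x), 3) for x in (-1.0, 0.0, 1.0)])
```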

3.
We consider the class of saturated main-effect plans for the 2^k factorial. With these saturated designs, the overall mean and all main effects can be unbiasedly estimated provided that there are no interactions. However, there is no way to estimate the error variance with such designs. Because of this and other reasons, we would like to add some runs to the set of (k+1) runs in the D-optimal design in this class. Our goals here are: (1) to search for s additional runs so that the resulting design based on (k+s+1) runs yields a D-optimal design in the class of augmented designs; (2) to classify all the runs into equivalence classes so that the runs in the same equivalence class give the same value of the determinant of the information matrix, which allows us to trade runs for runs if this becomes necessary; (3) to obtain upper bounds for the determinants of the information matrices of augmented designs. In this article we address these approaches and present some new results. © 2002 Wiley Periodicals, Inc. J Combin Designs 11: 51–77, 2003; published online in Wiley InterScience ( www.interscience.wiley.com ). DOI 10.1002/jcd.10026
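
A crude way to see goal (1) in action is a greedy search over the full candidate set; the sketch below is a hypothetical illustration, not the authors' algorithm, and starts from a one-factor-at-a-time saturated plan rather than the D-optimal one:

```python
# Greedily augment a saturated main-effect plan for the 2^k factorial with
# s extra runs, each chosen from the full candidate set to maximize det(X'X)
# for the intercept-plus-main-effects model. Illustration only.
import itertools
import numpy as np

k, s = 4, 3
runs = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
cand = np.column_stack([np.ones(len(runs)), runs])   # intercept + main effects

# Saturated (k+1)-run start: the all-low run plus one-factor-at-a-time runs.
start = np.vstack([-np.ones(k)] +
                  [np.where(np.arange(k) == i, 1.0, -1.0) for i in range(k)])
design = np.column_stack([np.ones(k + 1), start])

for _ in range(s):
    dets = [np.linalg.det(np.vstack([design, r]).T @ np.vstack([design, r]))
            for r in cand]
    design = np.vstack([design, cand[int(np.argmax(dets))]])
    print("runs:", len(design),
          " det(X'X):", round(np.linalg.det(design.T @ design), 1))
```

Once s > 0, the replicated or extra runs also provide the degrees of freedom for an error-variance estimate that the saturated plan lacks.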

4.
Developing models to predict tree mortality from long-term repeated-measurement data sets can be difficult and challenging due to the nature of mortality as well as the dependence among observations. Marginal (population-averaged) generalized estimating equations (GEE) and random effects (subject-specific) models offer two possible ways to overcome these effects. For this study, standard logistic, marginal logistic based on the GEE approach, and random-effects logistic regression models were fitted and compared. In addition, four model evaluation statistics were calculated by means of K-fold cross-validation: the mean prediction error, the mean absolute prediction error, the variance of prediction error, and the mean square error. Results from this study suggest that the random effects model produced the smallest evaluation statistics among the three models. Although marginal logistic regression accounted for correlations between observations, it did not provide noticeable improvement in model performance over the standard logistic regression model that assumed independence. This study indicates that the random effects model was able to increase the overall accuracy of mortality modeling, and that it was able to account for the correlation arising from the hierarchical data structure as well as the serial correlation generated through repeated measurements.
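
For a flavor of the marginal approach and of the four evaluation statistics (on invented data; plot, dbh, and dead are hypothetical names, and the simulation is mine, not the study's), a sketch:

```python
# Marginal (GEE, exchangeable) logistic regression for clustered binary
# mortality, evaluated with K-fold versions of the four statistics named
# in the abstract. All data and names are invented.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n_plots, n_trees = 40, 12
plot = np.repeat(np.arange(n_plots), n_trees)
dbh = rng.normal(25, 6, n_plots * n_trees)
u = np.repeat(rng.normal(0, 0.8, n_plots), n_trees)   # shared plot effect
p = 1 / (1 + np.exp(-(-1.0 - 0.05 * (dbh - 25) + u)))
df = pd.DataFrame({"plot": plot, "dbh": dbh, "dead": rng.binomial(1, p)})

errs = []
for train, test in KFold(n_splits=5, shuffle=True, random_state=1).split(df):
    fit = sm.GEE.from_formula("dead ~ dbh", groups="plot",
                              data=df.iloc[train],
                              family=sm.families.Binomial(),
                              cov_struct=sm.cov_struct.Exchangeable()).fit()
    errs.append(np.asarray(df["dead"].iloc[test]) - fit.predict(df.iloc[test]))
e = np.concatenate(errs)
print("mean error:", e.mean(), " mean |error|:", np.abs(e).mean())
print("var of error:", e.var(ddof=1), " MSE:", (e ** 2).mean())
```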

5.
The analysis of local changes in sequence data is of interest for various applications, such as the segmentation of DNA and other genetic sequences, or financial data sequences. Patterns of change that can be characterized as local jump changes or slope changes are of special interest. We propose simple graphical tools to visualize such patterns of local change. The concept of mode trees, developed for the visualization of local patterns in densities, is adapted to visualize patterns of local change as a function of a threshold parameter by means of a change tree. The simultaneous visualization of scale effects, in analogy to SiZer, motivates another graphical device, the mutagram. We illustrate these concepts with several sets of sequence data.

6.
A two-factor fixed-effects unbalanced nested design model without the assumption of equal error variances is considered. Using the generalized definition of p-values, exact tests under heteroscedasticity are derived for testing the main effects of both factors. These generalized F-tests can be utilized in significance testing or in fixed-level testing under the Neyman-Pearson theory. Two examples are given to illustrate the proposed test and to demonstrate its advantages over the classical F-test. Extensions of the procedure to three-factor nested designs are briefly discussed.

7.
This article introduces a class of central composite designs with nested sub-experiments, which allow for the estimation of both response surface effects (fixed effects of crossed factors) and variance components arising from nested random effects. An iterated least squares method using sufficient statistics is given for obtaining maximum likelihood estimates of the parameters in a mixed model. Simulation results show that the advantages for unbalanced designs are greatest when the error variance is small.

8.
Triangle-free quasi-symmetric 2-(v, k, λ) designs with intersection numbers x, y, where 0 < x < y < k and λ > 1, are investigated. It is proved that λ ≥ 2y − x − 3. As a consequence it is seen that for fixed λ, there are finitely many triangle-free quasi-symmetric designs. It is also proved that k ≤ y(y − x) + x. Copyright © 2011 Wiley Periodicals, Inc. J Combin Designs 19: 422–426, 2011

9.
Blocking is often used to reduce known variability in designed experiments by collecting together homogeneous experimental units. A common modeling assumption for such experiments is that responses from units within a block are dependent. Accounting for such dependencies in both the design of the experiment and the modeling of the resulting data when the response is not normally distributed can be challenging, particularly in terms of the computation required to find an optimal design. The application of copulas and marginal modeling provides a computationally efficient approach for estimating population-average treatment effects. Motivated by an experiment from materials testing, we develop and demonstrate designs with blocks of size two using copula models. Such designs are also important in applications ranging from microarray experiments to experiments on human eyes or limbs with naturally occurring blocks of size two. We present a methodology for design selection, make comparisons to existing approaches in the literature, and assess the robustness of the designs to modeling assumptions.
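
As an illustration of the copula-plus-marginals idea for blocks of size two, here is a minimal sketch with a Gaussian copula and exponential marginals; the specific model choice is mine, not the authors':

```python
# Log-likelihood of paired responses in blocks of size two, joined by a
# Gaussian copula with exponential marginals. Illustration under assumed
# marginals and correlation; not the article's model.
import numpy as np
from scipy import stats

def gaussian_copula_loglik(y1, y2, rate1, rate2, rho):
    """Gaussian copula on exponential marginals, summed over blocks."""
    z1 = stats.norm.ppf(stats.expon.cdf(y1, scale=1 / rate1))
    z2 = stats.norm.ppf(stats.expon.cdf(y2, scale=1 / rate2))
    # log copula density c(u1, u2; rho) of the bivariate normal copula
    log_c = (-0.5 * np.log(1 - rho ** 2)
             - (rho ** 2 * (z1 ** 2 + z2 ** 2) - 2 * rho * z1 * z2)
             / (2 * (1 - rho ** 2)))
    log_marg = (stats.expon.logpdf(y1, scale=1 / rate1)
                + stats.expon.logpdf(y2, scale=1 / rate2))
    return np.sum(log_c + log_marg)

rng = np.random.default_rng(2)
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=200)
y = stats.expon.ppf(stats.norm.cdf(z), scale=2.0)     # correlated pairs
print(gaussian_copula_loglik(y[:, 0], y[:, 1], 0.5, 0.5, 0.6))
```

The marginal parameters keep their population-averaged interpretation while the copula carries the within-block dependence, which is what makes design evaluation computationally cheap.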

10.
The generalized information criterion (GIC) proposed by Rao and Wu [A strongly consistent procedure for model selection in a regression problem, Biometrika 76 (1989) 369-374] is a generalization of Akaike's information criterion (AIC) and the Bayesian information criterion (BIC). In this paper, we extend the GIC to select linear mixed-effects models that are widely applied in analyzing longitudinal data. The procedure for selecting fixed effects and random effects based on the extended GIC is provided. The asymptotic behavior of the extended GIC method for selecting fixed effects is studied. We prove that, under mild conditions, the selection procedure is asymptotically loss efficient regardless of the existence of a true model and consistent if a true model exists. A simulation study is carried out to empirically evaluate the performance of the extended GIC procedure. The results from the simulation show that if the signal-to-noise ratio is moderate or high, the percentages of choosing the correct fixed effects by the GIC procedure are close to one for finite samples, while the procedure performs relatively poorly when it is used to select random effects.
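
The GIC family penalizes -2 log-likelihood by a weight lambda_n times the number of parameters, with lambda_n = 2 recovering AIC and lambda_n = log n recovering BIC. A minimal sketch on invented data (the variance-component parameter count below is a simplification, flagged in the comments):

```python
# Ranking candidate fixed-effect sets in a linear mixed model by a
# GIC-style criterion, fitted by ML. Data and names are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
g = np.repeat(np.arange(30), 5)                     # 30 subjects, 5 visits
x1, x2 = rng.normal(size=150), rng.normal(size=150)
y = 1 + 2 * x1 + np.repeat(rng.normal(0, 1, 30), 5) + rng.normal(0, 1, 150)
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2, "g": g})

lam = np.log(len(df))                               # BIC-like penalty weight
for formula in ["y ~ x1", "y ~ x2", "y ~ x1 + x2"]:
    fit = smf.mixedlm(formula, df, groups="g").fit(reml=False)
    # fit.params holds fixed effects plus random-effect variance terms;
    # +1 roughly accounts for the profiled residual variance (a simplification).
    gic = -2 * fit.llf + lam * (fit.params.shape[0] + 1)
    print(f"{formula:15s} GIC = {gic:.1f}")
```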

11.
Generalized linear mixed effects models (GLMM) provide useful tools for correlated and/or over-dispersed non-Gaussian data. This article considers generalized nonparametric mixed effects models (GNMM), which relax the rigid linear assumption on the conditional predictor in a GLMM. We use smoothing splines to model fixed effects. The random effects are general and may also contain stochastic processes corresponding to smoothing splines. We show how to construct smoothing spline ANOVA (SS ANOVA) decompositions for the predictor function. Components in an SS ANOVA decomposition have nice interpretations as main effects and interactions. Experimental design considerations help determine which components are fixed or random. We estimate all parameters and spline functions using stochastic approximation with Markov chain Monte Carlo (MCMC). As the iterations proceed, we increase the MCMC sample size and decrease the step size of the parameter update. This approach guarantees convergence of the estimates to the expected fixed points. We evaluate our methods through a simulation study.

12.
Let Ψ(t, k) denote the set of pairs (v, λ) for which there exists a graphical t-(v, k, λ) design. Most results on graphical designs have gone to show the finiteness of Ψ(t, k) when t and k satisfy certain conditions. The exact determination of Ψ(t, k) for specified t and k is a hard problem, and only Ψ(2, 3), Ψ(2, 4), Ψ(3, 4), Ψ(4, 5), and Ψ(5, 6) have been determined. In this article, we determine completely the sets Ψ(2, 5) and Ψ(3, 5). As a result, we find more than 270,000 inequivalent graphical designs and more than 8,000 new parameter sets for which there exists a graphical design. Prior to this, graphical designs were known for only 574 parameter sets. © 2006 Wiley Periodicals, Inc. J Combin Designs 16: 70–85, 2008

13.
The prediction of breeding values depends on the reliable estimation of variance components. This complex task leads to nonlinear minimization problems that have to be solved by numerical algorithms. In order to evaluate the reliability of these algorithms, benchmark problems have to be constructed for which the exact solution is known a priori. We develop techniques to construct such benchmark problems for mixed models including fixed and random effects, for ANOVA, ML and REML predictors, and for balanced and unbalanced data in the 1-way classification. Besides the construction of artificial data that produce the desired variance components, we describe a projection method to construct benchmark data from simulated data. We discuss the cases where exact expressions for the projection can be given and those where a numerical approximation procedure has to be used.
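
The artificial-data idea can be made concrete for the balanced 1-way layout: rescale centered residuals and group effects so that the ANOVA mean squares, and hence the variance-component estimates, hit prescribed targets exactly. The sketch below is my reconstruction under that reading, with hypothetical target values:

```python
# Balanced 1-way data built so the ANOVA (method-of-moments) estimators
# return prescribed variance components exactly, giving a benchmark problem
# with a known solution. My construction; targets are arbitrary.
import numpy as np

a, n = 6, 4                       # groups, replicates per group
sa2, se2 = 2.0, 0.5               # target between/within variance components
rng = np.random.default_rng(4)

e = rng.normal(size=(a, n))
e -= e.mean(axis=1, keepdims=True)                 # residuals sum to 0 per group
e *= np.sqrt(a * (n - 1) * se2 / (e ** 2).sum())   # force SSW = a(n-1)*se2

u = rng.normal(size=a)
u -= u.mean()                                      # group effects sum to 0
u *= np.sqrt((a - 1) * (se2 + n * sa2) / (n * (u ** 2).sum()))  # force MSB

y = 10.0 + u[:, None] + e                          # mu + alpha_i + e_ij

# Verify that the ANOVA estimators recover the targets exactly.
msw = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (a * (n - 1))
msb = n * ((y.mean(axis=1) - y.mean()) ** 2).sum() / (a - 1)
print("sigma_e^2 =", round(msw, 10), " sigma_a^2 =", round((msb - msw) / n, 10))
```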

14.
The following results are obtained. (i) It is possible to obtain a time series of market data {y(t)} in which the fluctuations in fundamental value have been compensated for. An objective test of the efficient market hypothesis (EMH), which would predict random correlations about a constant value, is thereby possible. (ii) A time series procedure can be used to determine the extent to which the differences in the data and the moving averages are significant. This provides a model of the form y(t) − y(t−1) = 0.5{y(t−1) − y(t−2)} + ε(t) + 0.8ε(t−1), where ε(t) is the error at time t, and the coefficients 0.5 and 0.8 are determined from the data. One concludes that today's price is not a random perturbation from yesterday's; rather, yesterday's rate of change is a significant predictor of today's rate of change. This confirms the concept of momentum that is crucial to market participants. (iii) The model provides out-of-sample predictions that can be tested statistically. (iv) The model and coefficients obtained in this way can be used to make predictions on laboratory experiments to establish an objective and quantitative link between the experiments and the market data. These methods circumvent the central difficulty in testing market data, namely, that changes in fundamentals obscure intrinsic trends and autocorrelations. The procedure is implemented by considering the ratio of two similar funds (Germany and Future Germany) with the same manager and performing a set of statistical tests that exclude fluctuations in fundamental factors. For the entire data set of the first 1149 days beginning with the introduction of the latter fund, a standard runs test indicates that the data are 29 standard deviations away from what would be expected under a hypothesis of random fluctuations about the fundamental value. This and other tests provide strong evidence against the efficient market hypothesis and in favour of autocorrelations in the data. An ARIMA time series analysis finds strong evidence (9.6 and 21.6 standard deviations in the two coefficients) that the data are described by a model that involves the first difference, indicating that momentum is the significant factor. The first quarter's data are used to make out-of-sample predictions for the second quarter, with results that are significant to 3 standard deviations. Finally, the ARIMA model and coefficients are used to make predictions on the laboratory experiments of Porter and Smith, in which the intrinsic value is clear. The model's forecasts are decidedly more accurate than those of the null hypothesis of random fluctuations about the fundamental value.
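
The fitted model in (ii) is an ARIMA(1,1,1) for y(t). A minimal sketch on simulated data (not the funds series) showing the fit and the out-of-sample forecasting of (iii):

```python
# Fitting the ARIMA(1,1,1) form
#   y(t) - y(t-1) = phi*(y(t-1) - y(t-2)) + eps(t) + theta*eps(t-1)
# on a simulated series with phi = 0.5, theta = 0.8, then forecasting
# out of sample. Illustration only.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
eps = rng.normal(scale=0.01, size=600)
dy = np.zeros(600)
for t in range(1, 600):                       # simulate the differenced series
    dy[t] = 0.5 * dy[t - 1] + eps[t] + 0.8 * eps[t - 1]
y = 100 + np.cumsum(dy)

train, test = y[:450], y[450:]
fit = ARIMA(train, order=(1, 1, 1)).fit()
print(fit.params)                             # AR, MA estimates near 0.5, 0.8
forecast = fit.forecast(steps=len(test))      # out-of-sample predictions
print("RMSE:", np.sqrt(np.mean((forecast - test) ** 2)))
```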

15.
This paper deals with nonparametric regression estimation under arbitrary sampling with an unknown distribution. The effect of the distribution of the design, which is a nuisance parameter, can be eliminated by conditioning. An upper bound for the conditional mean squared error of kNN estimates leads us to consider an optimal number of neighbors, which is a random function of the sampling. The corresponding estimate can be used for nonasymptotic inference and is also consistent under a minimal recurrence condition. Some deterministic equivalents are found for the random rate of convergence of this optimal estimate, for deterministic and random designs with vanishing or diverging densities. The proposed estimate is rate optimal for standard designs.
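
As a practical stand-in for the sampling-dependent optimal number of neighbors (the paper's choice is driven by a conditional-MSE bound; cross-validation below is my substitution), a sketch:

```python
# kNN regression with a data-driven number of neighbors chosen by
# cross-validation, on an arbitrary (unknown-distribution) design.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(6)
x = rng.uniform(-2, 2, size=(300, 1))         # design with unknown density
y = np.sin(3 * x[:, 0]) + rng.normal(0, 0.3, 300)

search = GridSearchCV(KNeighborsRegressor(),
                      {"n_neighbors": list(range(1, 60))}, cv=5,
                      scoring="neg_mean_squared_error").fit(x, y)
print("selected k:", search.best_params_["n_neighbors"])
print("CV MSE:", -search.best_score_)
```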

16.
We propose a two-stage model selection procedure for linear mixed-effects models. The procedure consists of two steps: first, a penalized restricted log-likelihood is used to select the random effects, which is done by adopting a Newton-type algorithm; next, the penalized log-likelihood is used to select the fixed effects via pathwise coordinate optimization to improve computational efficiency. We prove that our procedure has the oracle properties. Both simulation studies and a real data example are carried out to examine the finite-sample performance of the proposed fixed and random effects selection procedure. Supplementary materials, including the R code used in this article and proofs of the theorems, are available online.

17.
Multivariate analysis of variance (MANOVA) extends the ideas and methods of univariate ANOVA in simple and straightforward ways. But the familiar graphical methods typically used for univariate ANOVA are inadequate for showing how the measures in a multivariate response vary with each other, and how their means vary with explanatory factors. Similarly, the graphical methods commonly used in multiple regression are not widely available or used in multivariate multiple regression analysis (MMRA). We describe a variety of graphical methods for multiple-response (MANOVA and MMRA) data aimed at understanding what is being tested in a multivariate test, and how factor/predictor effects are expressed across multiple response measures.

In particular, we describe and illustrate: (a) Data ellipses and biplots for multivariate data; (b) HE plots, showing the hypothesis and error covariance matrices for a given pair of responses, and a given effect; (c) HE plot matrices, showing all pairwise HE plots; and (d) reduced-rank analogs of HE plots, showing all observations, group means, and their relations to the response variables. All of these methods are implemented in a collection of easily used SAS macro programs.
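
The H and E matrices that HE plots display are ordinary MANOVA sums-of-squares-and-cross-products matrices. A numpy-only sketch on invented one-way data (not using the SAS macros mentioned above):

```python
# Hypothesis (H) and error (E) SSCP matrices for a one-way MANOVA with
# two responses; HE plots render these as ellipses. Data are invented.
import numpy as np

rng = np.random.default_rng(7)
groups = np.repeat([0, 1, 2], 20)             # 3 groups, 2 responses
Y = rng.normal(size=(60, 2)) + np.array([[0, 0], [1.0, 0.5], [2.0, -0.5]])[groups]

grand = Y.mean(axis=0)
E = np.zeros((2, 2))                          # within-group (error) SSCP
H = np.zeros((2, 2))                          # between-group (hypothesis) SSCP
for g in np.unique(groups):
    Yg = Y[groups == g]
    d = Yg.mean(axis=0) - grand
    H += len(Yg) * np.outer(d, d)
    E += (Yg - Yg.mean(axis=0)).T @ (Yg - Yg.mean(axis=0))

# Wilks' lambda for the group effect: |E| / |H + E|.
print("Wilks lambda:", np.linalg.det(E) / np.linalg.det(H + E))
```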

18.
A Menon design of order h² is a symmetric (4h², 2h² − h, h² − h)-design. Quasi-residual and quasi-derived designs of a Menon design have parameters 2-(2h² + h, h², h² − h) and 2-(2h² − h, h² − h, h² − h − 1), respectively. In this article, regular Hadamard matrices are used to construct non-embeddable quasi-residual and quasi-derived Menon designs. As applications, we construct the first two new infinite families of non-embeddable quasi-residual and quasi-derived Menon designs. © 2008 Wiley Periodicals, Inc. J Combin Designs 17: 53–62, 2009

19.
Since the birth of experimental design, analysis of variance has been the statistical method for assessing whether the factors in an experiment are significant; for orthogonal experimental designs it has been the only analysis method. However, when every column of the orthogonal array is occupied by the factors under consideration and their interactions, and only a single run can be performed under each treatment combination, the error term in the ANOVA is identically 0, so ANOVA can no longer be used to analyze such an experiment. To address this, for single-replicate orthogonal experiments based on multi-level complete orthogonal arrays, this paper proposes a new statistical analysis method. Examples show that the proposed test not only solves the problem that ANOVA cannot handle, but also has greater local power than ANOVA in cases where the header design leaves blank columns and ANOVA therefore remains applicable.

20.
We analyze the reliability of NASA composite pressure vessels by using a new Bayesian semiparametric model. The data set consists of lifetimes of pressure vessels, wrapped with a Kevlar fiber, grouped by spool and subject to different stress levels; 10% of the data are right censored. The model that we consider is a regression on the log-scale for the lifetimes, with fixed (stress) and random (spool) effects. The prior of the spool parameters is nonparametric, namely they are a sample from a normalized generalized gamma process, which encompasses the well-known Dirichlet process. The nonparametric prior is adopted to robustify inferences against misspecification of a parametric prior. Here, this choice of likelihood and prior yields a new Bayesian model in reliability analysis. Via a Bayesian hierarchical approach, it is easy to analyze the reliability of the Kevlar fiber by predicting quantiles of the failure time when a new spool is selected at random from the population of spools. Moreover, for comparative purposes, we review the most interesting frequentist and Bayesian models analyzing this data set. Our credibility intervals for the quantiles of interest for a new random spool are narrower than those derived in the previous Bayesian parametric literature, although the predictive goodness-of-fit performances are similar. Finally, as an original feature of our model, by means of the discreteness of the random-effects distribution, we are able to cluster the spools into three different groups. Copyright © 2012 John Wiley & Sons, Ltd.
