Similar Documents (20 results)
1.
For the well-known Fay-Herriot small area model, standard variance component estimation methods frequently produce zero estimates of the strictly positive model variance. As a consequence, an empirical best linear unbiased predictor of a small area mean, commonly used in small area estimation, could reduce to a simple regression estimator, which typically has an overshrinking problem. We propose an adjusted maximum likelihood estimator of the model variance that maximizes an adjusted likelihood defined as a product of the model variance and a standard likelihood (e.g., a profile or residual likelihood) function. The adjustment factor was suggested earlier by Carl Morris in the context of approximating a hierarchical Bayes solution where the hyperparameters, including the model variance, are assumed to follow a prior distribution. Interestingly, the proposed adjustment does not affect the mean squared error property of the model variance estimator or the corresponding empirical best linear unbiased predictors of the small area means in a higher order asymptotic sense. However, as demonstrated in our simulation study, the proposed adjustment has a considerable advantage in small sample inference, especially in estimating the shrinkage parameters and in constructing the parametric bootstrap prediction intervals of the small area means, which require the use of a strictly positive consistent model variance estimate.
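The mechanics of the adjusted likelihood are simple to sketch: multiply the profile likelihood of the model variance A by A itself, which forces the maximizer away from zero. Below is a minimal illustration assuming a Fay-Herriot model with a common mean and no covariates; the function name `adjusted_ml_variance` is ours, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def adjusted_ml_variance(y, D):
    """Adjusted ML estimate of the Fay-Herriot model variance A:
    maximize A * L_P(A), where L_P is the likelihood with the common
    mean profiled out.  Illustrative sketch only (no covariates)."""
    def neg_adj_loglik(A):
        V = A + D                                  # total variances A + D_i
        mu = np.sum(y / V) / np.sum(1.0 / V)       # profiled (GLS) mean
        ll = -0.5 * np.sum(np.log(V) + (y - mu) ** 2 / V)
        return -(np.log(A) + ll)                   # log(A) is the adjustment
    res = minimize_scalar(neg_adj_loglik, bounds=(1e-8, 100.0),
                          method="bounded")
    return res.x

rng = np.random.default_rng(0)
D = rng.uniform(0.5, 2.0, size=15)                 # known sampling variances
theta = rng.normal(0.0, np.sqrt(1.5), 15)          # true model variance 1.5
y = theta + rng.normal(0.0, np.sqrt(D))
A_hat = adjusted_ml_variance(y, D)
print(A_hat)
```

Because of the log(A) term, the criterion diverges to minus infinity as A approaches zero, so the estimate is strictly positive by construction, which is exactly the property the abstract highlights.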

2.
In this paper Bayesian statistical analysis of masked data is considered based on the Pareto distribution. The likelihood function is simplified by introducing auxiliary variables, which describe the causes of failure. Three Bayesian approaches (Bayes using subjective priors, hierarchical Bayes and empirical Bayes) are utilized to estimate the parameters, and we compare these methods by analyzing a real data set. Finally, we discuss the method of avoiding the choice of the hyperparameters in the prior distributions.
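Setting the masked-data machinery aside, the basic conjugate update for a Pareto shape parameter under a subjective gamma prior (the first of the three approaches) can be sketched as below; the helper name `pareto_shape_posterior` and the prior settings are ours for illustration.

```python
import numpy as np

def pareto_shape_posterior(x, x_m, a, b):
    """Conjugate Bayes update for the Pareto shape alpha with known
    scale x_m.  Likelihood is alpha^n * exp(-alpha * T) with
    T = sum(log(x_i / x_m)); a Gamma(a, b) prior (rate b) gives a
    Gamma(a + n, b + T) posterior.  (The paper's masked-data likelihood
    with auxiliary cause-of-failure variables is more involved.)"""
    x = np.asarray(x, dtype=float)
    n = len(x)
    T = np.sum(np.log(x / x_m))
    a_post, b_post = a + n, b + T
    return a_post, b_post, a_post / b_post     # posterior mean of alpha

rng = np.random.default_rng(1)
x_m, alpha_true = 1.0, 2.5
# inverse-CDF sampling: F(x) = 1 - (x_m / x)^alpha
x = x_m * (1.0 - rng.uniform(size=500)) ** (-1.0 / alpha_true)
a_post, b_post, alpha_hat = pareto_shape_posterior(x, x_m, a=1.0, b=1.0)
print(alpha_hat)
```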

3.
Inference on the largest mean of a multivariate normal distribution is a surprisingly difficult and unexplored topic. Difficulties arise when two or more of the means are simultaneously the largest mean. Our proposed solution is based on an extension of R.A. Fisher’s fiducial inference methods termed generalized fiducial inference. We use a model selection technique along with the generalized fiducial distribution to allow for equal largest means and alleviate the overestimation that commonly occurs. Our proposed confidence intervals for the largest mean have asymptotically correct frequentist coverage and simulation results suggest that they possess promising small sample empirical properties. In addition to the theoretical calculations and simulations we also applied this approach to the air quality index of the four largest cities in the northeastern United States (Baltimore, Boston, New York, and Philadelphia).

4.
Finding predictive gene groups from microarray data
Microarray experiments generate large datasets with expression values for thousands of genes, but no more than a few dozen samples. A challenging task with these data is to reveal groups of genes which act together and whose collective expression is strongly associated with an outcome variable of interest. To find these groups, we suggest the use of supervised algorithms: these are procedures which use external information about the response variable for grouping the genes. We present Pelora, an algorithm based on penalized logistic regression analysis, that combines gene selection, gene grouping and sample classification in a supervised, simultaneous way. With an empirical study on six different microarray datasets, we show that Pelora identifies gene groups whose expression centroids have very good predictive potential and yield results that can keep up with state-of-the-art classification methods based on single genes. Thus, our gene groups can be beneficial in medical diagnostics and prognostics, but they may also provide more biological insights into gene function and regulation.

5.
Outcome-dependent sampling designs are commonly used in economics, market research and epidemiological studies. Case-control sampling design is a classic example of outcome-dependent sampling, where exposure information is collected on subjects conditional on their disease status. In many situations, the outcome under consideration may have multiple categories instead of a simple dichotomization. For example, in a case-control study, there may be disease sub-classification among the “cases” based on progression of the disease, or in terms of other histological and morphological characteristics of the disease. In this note, we investigate the issue of fitting prospective multivariate generalized linear models to such multiple-category outcome data, ignoring the retrospective nature of the sampling design. We first provide a set of necessary and sufficient conditions for the link functions that will allow for equivalence of prospective and retrospective inference for the parameters of interest. We show that for categorical outcomes, prospective-retrospective equivalence does not hold beyond the generalized multinomial logit link. We then derive an approximate expression for the bias incurred when link functions outside this class are used. Most popular models for ordinal response fall outside the multiplicative intercept class and one should be cautious while performing a naive prospective analysis of such data as the bias could be substantial. We illustrate the extent of bias through a real data example, based on the ongoing Prostate, Lung, Colorectal and Ovarian (PLCO) cancer screening trial by the National Cancer Institute. The simulations based on the real study illustrate that the bias approximations work well in practice.
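The equivalence result for the (multinomial) logit link is easy to see in the binary special case: under case-control sampling the logistic slope is unchanged and only the intercept absorbs the sampling fractions. A rough simulation sketch, assuming a simple one-covariate logistic model (the helper `logit_fit` is ours):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def logit_fit(x, y):
    """Logistic regression MLE (intercept + one slope) by direct
    minimization of the Bernoulli negative log-likelihood."""
    def nll(beta):
        eta = beta[0] + beta[1] * x
        # log(1 + exp(eta)) - y * eta, computed stably
        return np.sum(np.logaddexp(0.0, eta) - y * eta)
    return minimize(nll, np.zeros(2), method="BFGS").x

rng = np.random.default_rng(2)
n = 50_000
x = rng.normal(size=n)
y = rng.uniform(size=n) < expit(-2.0 + 1.0 * x)    # rare-ish outcome

# case-control design: keep every case, subsample an equal number of controls
cases = np.flatnonzero(y)
controls = rng.choice(np.flatnonzero(~y), size=len(cases), replace=False)
idx = np.concatenate([cases, controls])

b_full = logit_fit(x, y.astype(float))
b_cc = logit_fit(x[idx], y[idx].astype(float))
print(b_full, b_cc)   # slopes agree; intercepts differ by the sampling odds
```

For links outside this multiplicative-intercept class (e.g., popular ordinal links), the analogous naive prospective fit is biased, which is the caution the abstract raises.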

6.
In this study, the theory of statistical kernel density estimation is applied to derive a non-parametric kernel prior for empirical Bayes estimation, freeing the Bayesian inference from the subjectivity that has concerned some statisticians. To compare the empirical Bayes approach based on the kernel prior with the fully Bayes approach based on an informative prior, the mean square error and the mean percentage error for the Weibull model parameters are studied under both symmetric and asymmetric loss functions, via Monte Carlo simulations. The results are quite favorable to the empirical Bayes approach, which provides better estimates and outperforms the fully Bayes approach for different sample sizes and several values of the true parameters. Finally, a numerical example is given to demonstrate the efficiency of the empirical Bayes approach.
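The kernel-prior idea can be sketched concretely: build a Gaussian kernel density estimate from pilot estimates of the parameter and use it as the prior in a grid-based posterior computation. The sketch below treats the Weibull shape as known and targets the scale; the pilot estimates, bandwidth, and function name are our illustrative choices, not the paper's exact construction.

```python
import numpy as np

def kernel_prior_posterior(x, k, grid, pilot, bw):
    """Empirical Bayes with a Gaussian-KDE prior for the Weibull scale
    lambda (shape k known).  `pilot` holds pilot estimates of lambda
    (here bootstrap MLEs) from which the non-parametric prior is built;
    the posterior mean is computed by numerical integration on `grid`."""
    # kernel density prior evaluated on the grid (unnormalized is fine)
    prior = np.exp(-0.5 * ((grid[:, None] - pilot[None, :]) / bw) ** 2).sum(1)
    # Weibull log-likelihood on the grid
    ll = np.array([np.sum(np.log(k / g) + (k - 1) * np.log(x / g)
                          - (x / g) ** k) for g in grid])
    post = prior * np.exp(ll - ll.max())
    dg = grid[1] - grid[0]
    post /= post.sum() * dg
    return np.sum(grid * post) * dg            # posterior mean

rng = np.random.default_rng(3)
k, lam = 2.0, 3.0
x = lam * rng.weibull(k, size=100)
# bootstrap MLEs of lambda serve as pilot estimates for the kernel prior
boots = np.array([(np.mean(rng.choice(x, len(x)) ** k)) ** (1 / k)
                  for _ in range(200)])
grid = np.linspace(1.0, 6.0, 400)
lam_hat = kernel_prior_posterior(x, k, grid, boots, bw=0.2)
print(lam_hat)
```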

7.
Clustering is one of the most widely used procedures in the analysis of microarray data, for example with the goal of discovering cancer subtypes based on observed heterogeneity of genetic marks between different tissues. It is well known that in such high-dimensional settings, the existence of many noise variables can overwhelm the few signals embedded in the high-dimensional space. We propose a novel Bayesian approach based on a Dirichlet process with a sparsity prior that simultaneously performs variable selection and clustering, and also discovers variables that only distinguish a subset of the cluster components. Unlike previous Bayesian formulations, we use the Dirichlet process (DP) both for clustering of samples and for regularizing the high-dimensional mean/variance structure. To address the computational challenge brought by this double usage of the DP, we propose a sequential sampling scheme embedded within Markov chain Monte Carlo (MCMC) updates to improve on the naive implementation of existing algorithms for DP mixture models. Our method is demonstrated on a simulation study and illustrated with the leukemia gene expression dataset.

8.
This paper obtains conditions for minimaxity of hierarchical Bayes estimators in the estimation of a mean vector of a multivariate normal distribution. Hierarchical prior distributions with three types of second stage priors are treated. Conditions for admissibility and inadmissibility of the hierarchical Bayes estimators are also derived using the arguments in Berger and Strawderman [Choice of hierarchical priors: admissibility in estimation of normal means, Ann. Statist. 24 (1996) 931-951]. Combining these results yields admissible and minimax hierarchical Bayes estimators.

9.
The model we discuss in this paper deals with inequality in distribution in the presence of a covariate. To elucidate that dependence, we propose to consider the composition of the cumulative quantile regression (CQR) function and the Goldie concentration curve, the standardized counterpart of which gives a fraction to fraction plot of the response and the covariate. It has the merit of enhancing the visibility of inequality in distribution when the latter is present. We shall examine the asymptotic properties of the corresponding empirical estimator. The associated empirical process involves a randomly stopped partial sum process of induced order statistics. Strong Gaussian approximations of the processes are constructed. The result forms the basis for the asymptotic theory of functional statistics based on these processes.

10.
We study empirical and hierarchical Bayes approaches to the problem of estimating an infinite-dimensional parameter in mildly ill-posed inverse problems. We consider a class of prior distributions indexed by a hyperparameter that quantifies regularity. We prove that both methods we consider succeed in automatically selecting this parameter optimally, resulting in optimal convergence rates for truths with Sobolev or analytic “smoothness”, without using knowledge about this regularity. Both methods are illustrated by simulation examples.

11.
Because of the high costs of microarray experiments and the availability of only limited biological materials, microarray experiments are often performed with a small number of replicates; investigators often have to work with low replication or without replication. However, the heterogeneous error variability observed in microarray experiments increases the difficulty of analyzing microarray data without replication, and no current analysis techniques are practically applicable to such data. We here introduce a statistical method, the so-called unreplicated heterogeneous error model (UHEM), for microarray data analysis without replication. The method works by utilizing many genes of adjacent intensity to estimate the local error variance, after nonparametric elimination of differentially expressed genes between different biological conditions. We compared the performance of UHEM with three empirical Bayes prior specification methods: between-condition local pooled error, pseudo standard error, and adaptive standard error-based HEM. We found that our unreplicated HEM method is effective for microarray data analysis when replication of an array experiment is impractical or prohibited.
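The local-pooling step can be sketched as a sliding window over genes sorted by mean intensity: each gene's error variance is estimated from the squared residuals of its intensity neighbors. This is only a flavor of the idea, not the authors' exact estimator; the function name and window size are our choices.

```python
import numpy as np

def local_pooled_variance(mean_intensity, resid_sq, window=50):
    """Estimate per-gene error variance by pooling squared residuals of
    genes with adjacent mean intensities (sliding-window sketch of the
    local-pooling idea behind UHEM)."""
    order = np.argsort(mean_intensity)
    pooled = np.empty_like(resid_sq)
    n = len(resid_sq)
    for rank, g in enumerate(order):
        lo = max(0, rank - window // 2)
        hi = min(n, lo + window)
        pooled[g] = resid_sq[order[lo:hi]].mean()
    return pooled

rng = np.random.default_rng(4)
n_genes = 2000
intensity = np.sort(rng.uniform(4.0, 14.0, n_genes))
true_sd = 0.2 + 0.1 * intensity            # error SD depends on intensity
resid = rng.normal(0.0, true_sd)
var_hat = local_pooled_variance(intensity, resid ** 2, window=100)
print(np.corrcoef(var_hat, true_sd ** 2)[0, 1])   # tracks the variance trend
```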

12.
In this paper hierarchical Bayes and empirical Bayes results are used to obtain confidence intervals of the population means in the case of real problems. This is achieved by approximating the posterior distribution with a Pearson distribution. In the first example hierarchical Bayes confidence intervals for the Efron and Morris (1975, J. Amer. Statist. Assoc., 70, 311-319) baseball data are obtained. The same methods are used in the second example to obtain confidence intervals of treatment effects as well as the difference between treatment effects in an analysis of variance experiment. In the third example hierarchical Bayes intervals of treatment effects are obtained and compared with normal approximations in the unequal variance case. Financially supported by the CSIR and the University of the Orange Free State, Central Research Fund.

13.
The penalized profile sampler for semiparametric inference is an extension of the profile sampler method [B.L. Lee, M.R. Kosorok, J.P. Fine, The profile sampler, Journal of the American Statistical Association 100 (2005) 960-969] obtained by profiling a penalized log-likelihood. The idea is to base inference on the posterior distribution obtained by multiplying a profiled penalized log-likelihood by a prior for the parametric component, where the profiling and penalization are applied to the nuisance parameter. Because the prior is not applied to the full likelihood, the method is not strictly Bayesian. A benefit of this approximately Bayesian method is that it circumvents the need to put a prior on the possibly infinite-dimensional nuisance components of the model. We investigate the first and second order frequentist performance of the penalized profile sampler, and demonstrate that the accuracy of the procedure can be adjusted by the size of the assigned smoothing parameter. The theoretical validity of the procedure is illustrated for two examples: a partly linear model with normal error for current status data and a semiparametric logistic regression model. Simulation studies are used to verify the theoretical results.

14.
This paper treats the problem of estimating the restricted means of normal distributions with a known variance, where the means are restricted to a polyhedral convex cone which includes various restrictions such as positive orthant, simple order, tree order and umbrella order restrictions. In the context of the simultaneous estimation of the restricted means, it is of great interest to investigate decision-theoretic properties of the generalized Bayes estimator against the uniform prior distribution over the polyhedral convex cone. In this paper, the generalized Bayes estimator is shown to be minimax. It is also proved that it is admissible in the one- or two-dimensional case, but is improved on by a shrinkage estimator in the three- or more-dimensional case. This means that the so-called Stein phenomenon on the minimax generalized Bayes estimator can be extended to the case where the means are restricted to the polyhedral convex cone. The risk behaviors of the estimators are investigated through Monte Carlo simulation, and it is revealed that the shrinkage estimator has a substantial risk reduction.
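For the simplest of the cones mentioned, the positive orthant, the generalized Bayes estimator against the uniform prior has a closed form: coordinatewise, the posterior is a normal truncated to [0, inf), and the standard truncated-normal mean identity applies. A minimal sketch of this special case (the function name is ours):

```python
import numpy as np
from scipy.stats import norm

def gb_positive_orthant(y):
    """Generalized Bayes estimator of theta under the uniform prior on
    the positive orthant, for y ~ N(theta, I).  Coordinatewise, the
    posterior is N(y_i, 1) truncated to [0, inf), whose mean is
    y_i + phi(y_i) / Phi(y_i) (a standard truncated-normal identity).
    Illustrative special case of the polyhedral-cone setting."""
    y = np.asarray(y, dtype=float)
    return y + norm.pdf(y) / norm.cdf(y)

y = np.array([-1.5, -0.2, 0.3, 2.0])
est = gb_positive_orthant(y)
print(est)   # every coordinate is strictly positive
```

Note that each coordinate is pulled above both zero and the observation y_i, which is why overshrinkage toward the cone boundary is avoided.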

15.
General procedures are proposed for nonparametric classification in the presence of missing covariates. Both kernel-based imputation as well as Horvitz-Thompson-type inverse weighting approaches are employed to handle the presence of missing covariates. In the case of imputation, it is a certain regression function which is being imputed (and not the missing values). Using the theory of empirical processes, the performance of the resulting classifiers is assessed by obtaining exponential bounds on the deviations of their conditional errors from that of the Bayes classifier. These bounds, in conjunction with the Borel-Cantelli lemma, immediately provide various strong consistency results.
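The inverse-weighting route can be sketched with a simple weighted kernel vote: complete cases are up-weighted by the reciprocal of their observation probability, so the complete-case sample is reweighted back toward the full population. The sketch below assumes the observation probabilities are known, and all names (`ipw_kernel_classify`, the bandwidth, the missingness pattern) are our illustrative choices.

```python
import numpy as np

def ipw_kernel_classify(x_train, y_train, observed, pi, x0, h=0.5):
    """Horvitz-Thompson-type inverse-probability-weighted kernel
    classifier: complete cases are up-weighted by 1 / pi_i, where pi_i
    is the probability the covariate was observed (assumed known here).
    Predicts the class with the larger weighted kernel vote at x0."""
    m = observed
    w = 1.0 / pi[m]                                    # HT weights
    kern = np.exp(-0.5 * ((x_train[m] - x0) / h) ** 2)
    score1 = np.sum(w * kern * (y_train[m] == 1))
    score0 = np.sum(w * kern * (y_train[m] == 0))
    return int(score1 > score0)

rng = np.random.default_rng(5)
n = 4000
y = rng.integers(0, 2, n)
x = rng.normal(loc=2.0 * y, scale=1.0)     # class 1 centered at 2
pi = np.where(y == 1, 0.4, 0.9)            # known observation probabilities
obs = rng.uniform(size=n) < pi             # covariate observed or missing
print(ipw_kernel_classify(x, y, obs, pi, x0=1.8))
```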

16.
The conjugate prior for the exponential family, referred to also as the natural conjugate prior, is represented in terms of the Kullback-Leibler separator. This representation permits us to extend the conjugate prior to that for a general family of sampling distributions. Further, by replacing the Kullback-Leibler separator with its dual form, we define another form of a prior, which will be called the mean conjugate prior. Various results on duality between the two conjugate priors are shown. Implications of this approach include richer families of prior distributions induced by a sampling distribution and the empirical Bayes estimation of a high-dimensional mean parameter.

17.
Admissibility and minimaxity of Bayes estimators for a normal mean matrix
In some invariant estimation problems under a group, the Bayes estimator against an invariant prior has equivariance as well. This is useful notably for evaluating the frequentist risk of the Bayes estimator. This paper addresses the problem of estimating a matrix of means in normal distributions relative to quadratic loss. It is shown that a matricial shrinkage Bayes estimator against an orthogonally invariant hierarchical prior is admissible and minimax by means of equivariance. The analytical improvement upon every over-shrinkage equivariant estimator is also considered and this paper justifies the corresponding positive-part estimator preserving the order of the sample singular values.

18.
This paper provides an asymptotic look at generalized inference by showing connections between generalized inference and two widely used asymptotic methods, the bootstrap and the plug-in method. A generalized bootstrap method and a generalized plug-in method are introduced. The generalized bootstrap method can not only be used to prove asymptotic frequentist properties of existing generalized confidence regions, by viewing fiducial generalized pivotal quantities as generalized bootstrap variables, but can also yield new confidence regions for situations where generalized inference is unavailable. Some examples are presented to illustrate the method. In addition, the generalized F-test (Weerahandi, 1995 [26]) can be derived by the generalized plug-in method, and its asymptotic validity is thereby obtained.
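The pivotal-quantity-as-bootstrap-variable idea is easy to illustrate for a normal variance, a textbook case where the fiducial generalized pivotal quantity has a simple form. A small sketch, with our own function name, under the assumption of i.i.d. normal data:

```python
import numpy as np

def gpq_interval_variance(x, level=0.95, B=20_000, seed=0):
    """Fiducial generalized pivotal quantity for a normal variance:
    draws of (n - 1) * s^2 / chi2_{n-1} act as generalized bootstrap
    variables, and their quantiles give a confidence interval for
    sigma^2.  Sketch of the textbook case only."""
    rng = np.random.default_rng(seed)
    n = len(x)
    s2 = np.var(x, ddof=1)
    draws = (n - 1) * s2 / rng.chisquare(n - 1, size=B)
    alpha = 1.0 - level
    return np.quantile(draws, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(6)
x = rng.normal(10.0, 2.0, size=60)          # true sigma^2 = 4
lo, hi = gpq_interval_variance(x)
print(lo, hi)
```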

19.
Evaluation of reproducibility is important in assessing whether a new method or instrument can reproduce the results from a traditional gold standard approach. In this paper, we propose a measure to assess measurement agreement for functional data, which are frequently encountered in medical research and many other research fields. Formulae to compute the standard error of the proposed estimator and confidence intervals for the proposed measure are derived. The estimators and the coverage probabilities of the confidence intervals are empirically tested for small-to-moderate sample sizes via Monte Carlo simulations. A real data example from a physiology study is used to illustrate the proposed statistical inference procedures.

20.
Clustering and classification are important tasks for the analysis of microarray gene expression data. Classification of tissue samples can be a valuable diagnostic tool for diseases such as cancer. Clustering samples or experiments may lead to the discovery of subclasses of diseases. Clustering genes can help identify groups of genes that respond similarly to a set of experimental conditions. We also need validation tools for clustering and classification. Here, we focus on the identification of outliers: units that may have been misallocated, or mislabeled, or are not representative of the classes or clusters. We present two new methods, DDclust and DDclass, for clustering and classification. These non-parametric methods are based on the intuitively simple concept of data depth. We apply the methods to several gene expression and simulated data sets. We also discuss a convenient visualization and validation tool, the relative data depth plot.
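To give a flavor of depth-based classification: a point is assigned to the class within which it lies "deepest". The sketch below uses Mahalanobis depth, one of the simplest depth notions; DDclust and DDclass are built on their own depth functions, so this is only an illustration of the general idea, with function names of our choosing.

```python
import numpy as np

def mahalanobis_depth(x0, X):
    """Mahalanobis depth of x0 with respect to sample X:
    1 / (1 + (x0 - xbar)' S^{-1} (x0 - xbar))."""
    mu = X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d = x0 - mu
    return 1.0 / (1.0 + d @ S_inv @ d)

def depth_classify(x0, classes):
    """Assign x0 to the class within which it is deepest."""
    depths = [mahalanobis_depth(x0, X) for X in classes]
    return int(np.argmax(depths))

rng = np.random.default_rng(7)
A = rng.normal([0.0, 0.0], 1.0, size=(200, 2))   # class 0 around the origin
B = rng.normal([4.0, 4.0], 1.0, size=(200, 2))   # class 1 around (4, 4)
print(depth_classify(np.array([3.5, 4.2]), [A, B]))
```

A point with low depth in every class is a candidate outlier, which is exactly the validation use the abstract describes.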
