1.
Annals of the Institute of Statistical Mathematics - In this paper, we propose improved statistical inference and variable selection methods for generalized linear models based on empirical...
2.
With high-dimensional data, the number of covariates is considerably larger than the sample size. We propose a sound method for analyzing these data. It performs clustering and variable selection simultaneously. The method is inspired by the plaid model and may be seen as a multiplicative mixture model that allows for overlapping clustering. Unlike conventional clustering, within this model an observation may be explained by several clusters. This characteristic makes it especially suitable for gene expression data. Parameter estimation is performed with the Monte Carlo expectation-maximization algorithm and importance sampling. Using extensive simulations and comparisons with competing methods, we show the advantages of our methodology in terms of both variable selection and clustering. An application of our approach to gene expression data for kidney renal cell carcinoma taken from The Cancer Genome Atlas validates some previously identified cancer biomarkers.
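For context, the plaid model that inspires the method (Lazzeroni and Owen) writes each entry of the data matrix as a sum of overlapping layers; the sketch below is the classical additive plaid decomposition, not necessarily the paper's exact multiplicative mixture:

\[
Y_{ij} \approx \sum_{k=0}^{K} (\mu_k + \alpha_{ik} + \beta_{jk})\, \rho_{ik}\, \kappa_{jk},
\qquad \rho_{ik}, \kappa_{jk} \in \{0,1\},
\]

where \(\rho_{ik}\) and \(\kappa_{jk}\) indicate whether observation \(i\) and variable \(j\) belong to layer \(k\); because the indicators are not constrained to partition the data, one observation may be explained by several clusters.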
3.
Sándor Bozóki, Linda Dezső, Attila Poesz, József Temesi, Annals of Operations Research (2013) 211(1):511-528
Pairwise comparison (PC) matrices are used in multi-attribute decision making (MADM) problems to express the preferences of the decision maker. Our research focused on testing various characteristics of PC matrices. In a controlled experiment with university students (N=227) we obtained 454 PC matrices. The cases were divided into 18 subgroups according to the key factors to be analyzed. Our team conducted experiments with matrices of different sizes drawn from different types of MADM problems. Additionally, the matrix elements were obtained by different questioning procedures differing in the order of the questions. The results are organized to answer five research questions, three of which are directly connected to the inconsistency of a PC matrix. Various types of inconsistency indices were applied. We found that the type of the problem and the size of the matrix had an impact on the inconsistency of the PC matrix; however, we found no impact of the questioning order. Incomplete PC matrices played an important role in our research. The decision makers' behavioral consistency was also analyzed in the case of incomplete matrices, using indicators measuring the deviation from the final order of alternatives and from the final score vector.
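One widely used inconsistency measure for PC matrices is Saaty's consistency index and ratio; the study applies several indices, and this minimal sketch shows only that classical one (the RI table is Saaty's standard random index):

```python
import numpy as np

def consistency_ratio(A):
    """Saaty's consistency index CI = (lambda_max - n)/(n - 1) and
    consistency ratio CR = CI / RI(n) for a reciprocal PC matrix A."""
    n = A.shape[0]
    lam_max = np.max(np.real(np.linalg.eigvals(A)))  # principal eigenvalue
    ci = (lam_max - n) / (n - 1)
    random_index = [0.0, 0.0, 0.58, 0.90, 1.12, 1.24, 1.32, 1.41, 1.45]
    ri = random_index[n - 1]
    return ci, ci / ri if ri > 0 else float("nan")

# A perfectly consistent 3x3 reciprocal matrix (a_ik = a_ij * a_jk)
A = np.array([[1.0, 2.0, 6.0],
              [0.5, 1.0, 3.0],
              [1/6, 1/3, 1.0]])
ci, cr = consistency_ratio(A)
print(f"CI = {ci:.4f}, CR = {cr:.4f}")  # both ~0 for a consistent matrix
```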
4.
Extremes - The distributed Hill estimator is a divide-and-conquer algorithm for estimating the extreme value index when data are stored in multiple machines. In applications, estimates based on the...
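The abstract is truncated, but the divide-and-conquer idea can be sketched under one assumption: each machine computes a Hill estimate from its own top order statistics, and the global estimate averages them (function and variable names are illustrative):

```python
import numpy as np

def hill(x, k):
    """Hill estimator of the extreme value index from a single sample,
    using the k largest observations."""
    xs = np.sort(x)
    return np.mean(np.log(xs[-k:]) - np.log(xs[-k - 1]))

def distributed_hill(samples, k):
    """Divide-and-conquer estimate: average the machine-level estimates."""
    return np.mean([hill(s, k) for s in samples])

rng = np.random.default_rng(0)
# Pareto data with true extreme value index 1/2, split over 10 machines
machines = [rng.pareto(2.0, size=2000) + 1.0 for _ in range(10)]
print(distributed_hill(machines, k=100))  # should be close to 0.5
```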
5.
In this paper, we propose a robust empirical likelihood (REL) inference for the parametric component in a generalized partial linear model (GPLM) with longitudinal data. We make use of bounded scores and leverage-based weights in the auxiliary random vectors to achieve robustness against outliers in both the response and covariates. Simulation studies demonstrate the good performance of our proposed REL method, which is more accurate and efficient than the robust generalized estimating equation (GEE) method (X. He, W.K. Fung, Z.Y. Zhu, Robust estimation in generalized partial linear models for clustered data, Journal of the American Statistical Association 100 (2005) 1176-1184). The proposed robust method is also illustrated by analyzing a real data set.
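The "bounded scores" and "leverage-based weights" can be illustrated with common robust-statistics choices; the paper's exact functions are not given in the abstract, so Huber's psi and the Mahalanobis-based weights below are hypothetical stand-ins:

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Bounded score: Huber's psi truncates large standardized residuals,
    limiting the influence of response outliers."""
    return np.clip(r, -c, c)

def leverage_weights(X, c=3.0):
    """Downweight high-leverage points via the Mahalanobis distance of
    each covariate vector from the sample mean."""
    d_centered = X - X.mean(axis=0)
    s_inv = np.linalg.inv(np.cov(X, rowvar=False))
    dist = np.sqrt(np.einsum("ij,jk,ik->i", d_centered, s_inv, d_centered))
    return np.minimum(1.0, c / dist)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[0] += 10.0                       # inject a high-leverage point
print(leverage_weights(X)[:3])     # first weight is heavily shrunk
```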
6.
Empirical likelihood inference for censored median regression with weighted empirical hazard functions
In recent years, median regression models have been shown to be useful for analyzing a variety of censored survival data in clinical trials. For inference on the regression parameter, a variety of semiparametric procedures exist. However, the accuracy of such procedures in terms of coverage probability can be quite low when censoring is heavy. In this paper, based on weighted empirical hazard functions, we apply an empirical likelihood (EL) ratio method to the median regression model with censored data and derive the limiting distribution of the EL ratio. A confidence region for the regression parameter can then be obtained accordingly. Furthermore, we compare the proposed method with the standard method through extensive simulation studies. The proposed method almost always outperforms the existing method.
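For context (the abstract does not state the model), censored median regression is commonly formulated as follows: with survival time \(T_i\), censoring time \(C_i\), covariates \(Z_i\), and observed data \((Y_i, \delta_i) = (\min(T_i, C_i),\, 1\{T_i \le C_i\})\), one assumes

\[
\operatorname{med}(\log T_i \mid Z_i) = \beta_0^{\top} Z_i ,
\]

and once the limiting distribution of the EL ratio \(R(\beta)\) is known, a confidence region for \(\beta_0\) is \(\{\beta : -2\log R(\beta) \le c_\alpha\}\) with \(c_\alpha\) the appropriate critical value.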
7.
Dirk Temme, Computational Statistics (2006) 21(1):151-182
Graphical methods for the discovery of structural models from observational data provide interesting tools for applied researchers. A problem often faced in empirical studies is the presence of latent confounders, which produce associations between the observed variables. Although causal inference algorithms exist that can cope with latent confounders, empirical applications assessing the performance of such algorithms are largely lacking. In this study, we apply the constraint-based Fast Causal Inference algorithm implemented in the software program TETRAD to a data set containing strategy and performance information about 608 business units. In contrast to the informative and reasonable results for the empirical data, simulation findings reveal problems in recovering some of the structural relations.
8.
Mathematical and Computer Modelling (1995) 21(7):29-42
Two models for the dynamics of an epidemic of S-I-R type are described in which the active population is randomly screened; in one of them, infectivity is not required to be constant. Positively screened individuals move into the class of “removed” together with the immune. Global existence and uniqueness results are established.
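A hedged sketch of such a system, assuming constant infectivity and a screening rate \(\sigma\) at which infectives are detected and removed (the papers' actual formulations, including the time-varying infectivity variant, differ in detail):

\[
\frac{dS}{dt} = -\beta S I, \qquad
\frac{dI}{dt} = \beta S I - (\gamma + \sigma) I, \qquad
\frac{dR}{dt} = (\gamma + \sigma) I,
\]

where \(\gamma I\) is natural recovery into the immune class and \(\sigma I\) the flow of positively screened individuals into the removed class.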
9.
In this paper, we consider the semiparametric regression model for longitudinal data. To account for the correlation within groups, a generalized empirical log-likelihood ratio statistic for the unknown parameters in the model is proposed by introducing a working covariance matrix. It is proved that the proposed statistic is asymptotically standard chi-squared under suitable conditions, and hence it can be used to construct confidence regions for the parameters. A simulation study compares the proposed method with the generalized least squares method in terms of coverage accuracy and average length of the confidence intervals.
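The chi-squared limit referenced here is the standard Wilks-type result for empirical likelihood: writing \(R(\beta)\) for the generalized empirical likelihood ratio and \(\beta_0\) for the true parameter value,

\[
-2 \log R(\beta_0) \xrightarrow{\;d\;} \chi^2_p,
\]

so an asymptotic \(1-\alpha\) confidence region is \(\{\beta : -2\log R(\beta) \le \chi^2_{p,1-\alpha}\}\), where \(p\) is the dimension of \(\beta\).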
10.
European Journal of Operational Research (1998) 111(2):248-267
Experimental and verbal protocol research suggests that consumers use noncompensatory screening strategies to remove alternatives and simplify complex choice situations prior to making a choice. Existing multi-phased choice models assume that the consumer initially evaluates each alternative to determine whether it should pass the first-stage screen and enter the choice set. The feature-based elimination model proposed in this study allows the consumer to avoid processing information for each alternative when forming the choice set. The consumer is assumed to apply a sequence of noncompensatory screens, similar to the elimination-by-aspects strategy, to form the choice set. An empirical application of the model demonstrates that cross-sectional heterogeneity in screening strategies can also be accommodated. One finding from this application is that heterogeneity in screening strategies may be at least as prevalent as heterogeneity in preferences. A comprehensive empirical comparison of the proposed model with existing two-stage models for scanner panel data shows that the model performs at least as well as all existing models and substantially better than most. The empirical performance of the model, coupled with its theoretical appeal and consistency with actual accounts of decision making in complex situations, makes the proposed model an appealing alternative to existing multi-phased choice models.
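The screening logic resembles Tversky's elimination-by-aspects; the paper's model is probabilistic and estimated from scanner panel data, so this deterministic sketch only illustrates the mechanics of sequential noncompensatory screens:

```python
def eba_screen(alternatives, aspects):
    """Apply noncompensatory screens in order of importance: alternatives
    lacking an aspect are eliminated, unless that would empty the set."""
    surviving = list(alternatives)
    for aspect in aspects:
        passed = [a for a in surviving if aspect in a["features"]]
        if passed:
            surviving = passed
    return surviving

brands = [
    {"name": "A", "features": {"low_price", "known_brand"}},
    {"name": "B", "features": {"low_price"}},
    {"name": "C", "features": {"known_brand", "on_promo"}},
]
choice_set = eba_screen(brands, ["low_price", "known_brand"])
print([b["name"] for b in choice_set])  # -> ['A']
```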
11.
Julien Villemonteix, Emmanuel Vazquez, Maryan Sidorkiewicz, Eric Walter, Journal of Global Optimization (2009) 43(2-3):373-389
In many global optimization problems motivated by engineering applications, the number of function evaluations is severely limited by time or cost. To ensure that each of these evaluations usefully contributes to the localization of good candidates for the role of global minimizer, a stochastic model of the function can be built to conduct a sequential choice of evaluation points. Based on Gaussian processes and Kriging, the authors have recently introduced the informational approach to global optimization (IAGO), which provides a one-step optimal choice of evaluation points in terms of reduction of uncertainty on the location of the minimizers. To do so, the probability density of the minimizers is approximated using conditional simulations of the Gaussian process model behind Kriging. In this paper, an empirical comparison between the underlying sampling criterion, called conditional minimizer entropy (CME), and the standard expected improvement (EI) sampling criterion is presented. Classical test functions are used, as well as sample paths of the Gaussian model and an industrial application; the results demonstrate the benefit of the CME sampling criterion in terms of evaluation savings.
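The EI criterion has a closed form under the Gaussian predictive distribution delivered by Kriging, whereas CME must be approximated through conditional simulations; a minimal sketch of EI for minimization:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_min):
    """EI at candidate points with Kriging predictive mean mu and
    standard deviation sigma, given current best observed value f_min."""
    sigma = np.maximum(sigma, 1e-12)     # guard against zero variance
    z = (f_min - mu) / sigma
    return (f_min - mu) * norm.cdf(z) + sigma * norm.pdf(z)

mu = np.array([0.2, -0.1, 0.4])
sigma = np.array([0.5, 0.1, 1.0])
print(expected_improvement(mu, sigma, f_min=0.0))
```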
12.
We propose randomized inference (RI), a new statistical inference approach. RI may be realized through a randomized estimate (RE) of a parameter vector, which is a random vector that takes values in the parameter space with a probability density function (PDF) that depends on the sample or sufficient statistics, such as the posterior distributions in Bayesian inference. Based on the PDF of an RE of an unknown parameter, we propose a framework for both the vertical density representation (VDR) test and the construction of a confidence region. This approach is explained with the aid of examples. For the equality hypothesis of multiple normal means without the condition of variance homogeneity, we present an exact VDR test, which is shown to be an extension of one-way analysis of variance (ANOVA). In the case of two populations, the PDF of the Welch statistic is obtained using the RE. Furthermore, through simulations, we show that the empirical distribution function, the approximate t distribution function, and the RE distribution function of the Welch statistic are almost equal. The VDR test of homogeneity of variance is shown to be more efficient than both the Bartlett test and the revised Bartlett test. Finally, we discuss the prospects of RI.
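For reference, the classical Welch statistic (and Welch-Satterthwaite degrees of freedom) whose distribution function the RE-based approach is compared against; a minimal sketch:

```python
import numpy as np
from scipy import stats

def welch(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    n1, n2 = len(x), len(y)
    v1, v2 = x.var(ddof=1) / n1, y.var(ddof=1) / n2
    t = (x.mean() - y.mean()) / np.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

rng = np.random.default_rng(1)
x, y = rng.normal(0, 1, 30), rng.normal(0, 2, 50)  # unequal variances
t, df = welch(x, y)
print(t, df, 2 * stats.t.sf(abs(t), df))           # two-sided p-value
```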
13.
14.
On the statistical analysis of high-dimensional, dependent, and incomplete data
This paper consists of three parts. The first part briefly reviews the development of statistics and the challenges it faces, showing that the statistical analysis of high-dimensional, dependent, and incomplete data poses difficult problems that arise widely in modern science, technology, and socioeconomic applications. The second part surveys the results obtained by Chinese researchers in the related areas. Finally, we present a personal view of current research trends in this field.
15.
This paper proposes an estimator combining empirical likelihood (EL) and the generalized method of moments (GMM) by allowing the sample average moment vector to deviate from zero and the sample weights to deviate from 1/n. The new estimator may be adjusted through a free parameter δ∈(0,1), with GMM behavior attained as δ→0 and EL behavior as δ→1. When the sample size is small and the number of moment conditions is large, the parameter space over which the EL estimator is defined may be restricted at or near the population parameter value. The support of the parameter space for the new estimator may be adjusted through δ. The new estimator performs well in Monte Carlo simulations.
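For orientation, the two endpoints of the δ-interpolation are the standard criteria (the paper's combined objective itself is not reproduced here). With moment function \(g\) and sample moments \(\bar g(\theta) = n^{-1}\sum_i g(x_i,\theta)\):

\[
\text{GMM:}\quad \hat\theta = \arg\min_\theta\, \bar g(\theta)^{\top} W\, \bar g(\theta),
\qquad
\text{EL:}\quad \hat\theta = \arg\max_\theta \max_{p}\Big\{ \sum_i \log(n p_i) : p_i \ge 0,\ \sum_i p_i = 1,\ \sum_i p_i\, g(x_i,\theta) = 0 \Big\}.
\]

The new estimator relaxes the constraints \(\sum_i p_i\, g(x_i,\theta) = 0\) and \(p_i \approx 1/n\), with δ controlling how far the solution may move from each endpoint.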
16.
R. A. Bandaliev, Mathematical Notes (2008) 84(3-4):303-313
The main goal of this paper is to obtain an analog of the generalized Minkowski inequality and an embedding between Lebesgue spaces with mixed norm and variable summability exponent.
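For orientation, the constant-exponent prototype being generalized is the integral Minkowski inequality: for \(p \ge 1\),

\[
\Big\| \int f(\cdot, y)\, dy \Big\|_{L^p} \le \int \big\| f(\cdot, y) \big\|_{L^p}\, dy ;
\]

the paper establishes an analog in mixed-norm Lebesgue spaces with variable summability exponent \(p(\cdot)\).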
17.
18.
Variational Bayesian methods aim to address some of the weaknesses (computation time, storage costs, and convergence monitoring) of mainstream Markov chain Monte Carlo based inference, at the cost of a biased but more tractable approximation to the posterior distribution. We investigate the performance of variational approximations in the context of the mixed logit model, which is one of the most widely used models for discrete choice data. A typical treatment using the variational Bayesian methodology is hindered by the fact that the expectation of the so-called log-sum-exponential function has no explicit expression. Therefore, additional approximations are required to maintain tractability. In this paper we compare seven different possible bounds or approximations. We find that quadratic bounds are not sufficiently accurate, whereas a recently proposed non-quadratic bound performs well. We also find that the Taylor series approximation used in a previous study of variational Bayes for mixed logit models is only accurate in specific settings. Our proposed approximation based on quasi Monte Carlo sampling performed consistently well across all simulation settings while remaining computationally tractable.
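The quasi Monte Carlo idea for the log-sum-exponential expectation can be sketched as follows; the names and the Gaussian parameterization are illustrative, not the paper's exact scheme:

```python
import numpy as np
from scipy.stats import norm, qmc
from scipy.special import logsumexp

def qmc_expected_lse(mu, L, n=1024, seed=0):
    """Quasi Monte Carlo estimate of E[log sum_j exp(V_j)] where the
    utilities V ~ N(mu, L @ L.T); Sobol points are mapped to Gaussian
    draws through the inverse normal CDF."""
    d = len(mu)
    u = qmc.Sobol(d, scramble=True, seed=seed).random(n)
    u = np.clip(u, 1e-12, 1 - 1e-12)   # keep the inverse CDF finite
    v = mu + norm.ppf(u) @ L.T         # correlated Gaussian utilities
    return logsumexp(v, axis=1).mean()

mu = np.array([0.5, -0.2, 0.0])
cov = np.array([[1.0, 0.3, 0.0],
                [0.3, 1.0, 0.0],
                [0.0, 0.0, 0.5]])
print(qmc_expected_lse(mu, np.linalg.cholesky(cov)))
```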
19.
20.
European Journal of Operational Research (2005) 164(3):760-777
We consider the multiattribute design problem (MADP), which contains a considerable number of alternatives resulting from the combination of a limited number of discrete levels of several quantitative and/or qualitative attributes. In order to solve such problems, the preferences of individual decision makers have to be measured. Though a considerable number of methods is available from different research areas, only a subset is applicable to MADP. In this paper, we report on an empirical study which considered the problem of designing a university and involved more than 300 respondents. Because of this large scale, we performed a paper-and-pencil investigation and selected methods that could concisely be applied in such a setting: the analytic hierarchy process (AHP) and conjoint analysis (CA). The results show that both methods give useful models of the respondents' preferences. However, inspecting the determined utility functions in detail reveals considerable discrepancies between them. Most of the measures used for comparison indicate AHP to be the better choice for the particular decision situation considered. In order to arrive at a more general recommendation, we categorize different types of MADP and discuss the applicability of AHP and CA.