Similar literature
20 similar articles found
1.
2.
Abstract

This article describes estimation of the cell probabilities in an R × C contingency table with ignorable missing data. Popular methods for maximizing the incomplete data likelihood are the EM-algorithm and the Newton-Raphson algorithm. Both of these methods require some modification of existing statistical software to get the MLEs of the cell probabilities as well as the variance estimates. We make the connection between the multinomial and Poisson likelihoods to show that the MLEs can be obtained in any generalized linear models program without additional programming or iteration loops.
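
As a rough illustration of the EM approach that the abstract above names as one of the popular methods (this is not the authors' GLM-based method), the sketch below estimates cell probabilities when some observations are classified on only one factor. The function name `em_cell_probabilities` and the split of the data into `full`, `row_only`, and `col_only` counts are assumptions made for this example.

```python
def em_cell_probabilities(full, row_only, col_only, iterations=200):
    """EM estimates of cell probabilities for an R x C table (illustrative sketch).

    full[i][j]  : counts with both factors observed
    row_only[i] : counts with only the row factor observed
    col_only[j] : counts with only the column factor observed
    Missingness is assumed ignorable.
    """
    n_rows, n_cols = len(full), len(full[0])
    total = sum(map(sum, full)) + sum(row_only) + sum(col_only)
    # Start from a uniform table so every cell has positive probability.
    p = [[1.0 / (n_rows * n_cols)] * n_cols for _ in range(n_rows)]
    for _ in range(iterations):
        row_tot = [sum(p[i]) for i in range(n_rows)]
        col_tot = [sum(p[i][j] for i in range(n_rows)) for j in range(n_cols)]
        # E-step: spread partially classified counts over the cells in
        # proportion to the current conditional probabilities.
        expected = [[float(full[i][j]) for j in range(n_cols)] for i in range(n_rows)]
        for i in range(n_rows):
            for j in range(n_cols):
                if row_tot[i] > 0:
                    expected[i][j] += row_only[i] * p[i][j] / row_tot[i]
                if col_tot[j] > 0:
                    expected[i][j] += col_only[j] * p[i][j] / col_tot[j]
        # M-step: the multinomial MLE is the expected count divided by the total.
        p = [[expected[i][j] / total for j in range(n_cols)] for i in range(n_rows)]
    return p


# Hypothetical counts: 40 fully classified observations, 20 with only the row known.
probs = em_cell_probabilities([[10, 10], [10, 10]], [20, 0], [0, 0])
```

With these made-up counts the row margin is estimated from all 60 observations, while the column split within each row comes from the fully classified cases.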

3.
In this paper we use profile empirical likelihood to construct confidence regions for regression coefficients in a partially linear model with longitudinal data. The main contribution is that the within-subject correlation is considered to improve estimation efficiency. We suppose a semi-parametric structure for the covariances of observation errors in each subject and employ both the first order and the second order moment conditions of the observation errors to construct the estimating equations. Although there are nonparametric estimators, the empirical log-likelihood ratio statistic still tends to a standard χ²_p variable in distribution after the nuisance parameters are profiled away. A data simulation is also conducted.

4.
In multivariate categorical data, models based on conditional independence assumptions, such as latent class models, offer efficient estimation of complex dependencies. However, Bayesian versions of latent structure models for categorical data typically do not appropriately handle impossible combinations of variables, also known as structural zeros. Allowing nonzero probability for impossible combinations results in inaccurate estimates of joint and conditional probabilities, even for feasible combinations. We present an approach for estimating posterior distributions in Bayesian latent structure models with potentially many structural zeros. The basic idea is to treat the observed data as a truncated sample from an augmented dataset, thereby allowing us to exploit the conditional independence assumptions for computational expediency. As part of the approach, we develop an algorithm for collapsing a large set of structural zero combinations into a much smaller set of disjoint marginal conditions, which speeds up computation. We apply the approach to sample from a semiparametric version of the latent class model with structural zeros in the context of a key issue faced by national statistical agencies seeking to disseminate confidential data to the public: estimating the number of records in a sample that are unique in the population on a set of publicly available categorical variables. The latent class model offers remarkably accurate estimates of population uniqueness, even in the presence of a large number of structural zeros.

5.
Zighera (App Stoch Mod Data Anal 1:93–108 1985) introduced a new parameterization of log-linear models for analyzing categorical data, directly linked to a thorough analysis of discrimination information through Kullback-Leibler divergence. The method mainly aims at quantifying in terms of information the variations of a binary variable of interest, by comparing two contingency tables – or sub-tables – through effects of explanatory categorical variables. The present paper settles the mathematical background necessary to rigorously apply Zighera’s parameterization to any categorical data. In particular, identifiability and good properties of asymptotically χ²-distributed test statistics are proven to hold. Determination of parameters and all tests of effects due to explanatory variables are simultaneous. Application to classical data sets illustrates contribution with respect to existing methods.
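
To make the Kullback-Leibler divergence mentioned in the abstract above concrete, here is a minimal sketch that compares two contingency tables given as flattened count lists. It computes the generic divergence only and does not implement Zighera's parameterization; the function name `kl_divergence` and the example counts are assumptions for illustration.

```python
import math


def kl_divergence(counts_p, counts_q):
    """Kullback-Leibler divergence D(P || Q) between two tables of counts.

    Both tables are flattened to lists of equal length and normalized to
    probabilities before the divergence is computed (natural logarithm).
    """
    if len(counts_p) != len(counts_q):
        raise ValueError("tables must have the same number of cells")
    total_p, total_q = sum(counts_p), sum(counts_q)
    divergence = 0.0
    for cp, cq in zip(counts_p, counts_q):
        p, q = cp / total_p, cq / total_q
        if p == 0:
            continue  # by convention 0 * log(0 / q) = 0
        if q == 0:
            return math.inf  # P puts mass where Q has none
        divergence += p * math.log(p / q)
    return divergence


# Hypothetical two-cell tables: P is uniform, Q puts 1/4 and 3/4 on the cells.
d = kl_divergence([1, 1], [1, 3])
```

The divergence is zero when the two tables have the same cell proportions and grows as they diverge; it is not symmetric in its arguments.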

6.
We investigate GI^X/M(n)//N systems with a stochastic customer acceptance policy that is a function of the customer batch size and the number of customers in the system at its arrival. We address the time-dependent and long-run analysis of the number of customers in the system at prearrivals and postarrivals of batches and seen by customers at their arrival to the system, as well as customer blocking probabilities. These results are then used to derive the continuous-time long-run distribution of the number of customers in the system. Our analysis combines Markov chain embedding with uniformization and uses stochastic ordering as a way to bound the errors of the computed performance measures.

7.
A normal copula with a correlation coefficient between -1 and 1 is tail independent, and so it severely underestimates extreme probabilities. By letting the correlation coefficient in a normal copula depend on the sample size, Hüsler and Reiss (1989) showed that the tail can become asymptotically dependent. We extend this result by deriving the limit of the normalized maximum of n independent observations, where the i-th observation follows from a normal copula with its correlation coefficient being either a parametric or a nonparametric function of i/n. Furthermore, both parametric and nonparametric inference for this unknown function are studied, which can be employed to test the condition by Hüsler and Reiss (1989). A simulation study and real data analysis are also presented.

8.
This paper studies the connections between relational probabilistic models and reference classes, with specific focus on the ability of these models to generate the correct answers to probabilistic queries. We distinguish between relational models that represent only observed relations and those which additionally represent latent properties of individuals. We show how both types of relational models can be understood in terms of reference classes, and that learning such models corresponds to different ways of identifying reference classes. Rather than examining the impact of philosophical issues associated with reference classes on relational learning, we directly assess whether relational models can represent the correct probabilities of a simple generative process for relational data. We show that models with only observed properties and relations can only represent the correct probabilities under restrictive conditions, whilst models that also represent latent properties avoid such restrictions. As such, methods for acquiring latent-property models are an attractive alternative to traditional ways of identifying reference classes. Our experiments on synthetic as well as real-world domains support the analysis, demonstrating that models with latent relations are significantly more accurate than those without latent relations.

9.
The independent variables of linear mixed models are subject to measurement errors in practice. In this paper, we present a unified method for the estimation in linear mixed models with errors-in-variables, based upon the corrected score function of Nakamura (1990, Biometrika, 77, 127–137). Asymptotic normality properties of the estimators are obtained. The estimators are shown to be consistent and convergent at the order of n^(-1/2). The performance of the proposed method is studied via simulation and the analysis of a data set on hedonic housing prices.

10.
In this paper we give a new proof that for controllable and observable linear systems every L²[0,T] function can be approximated in the L²[0,T] sense with an output function generated by an L²[0,T] input function. We also give a new characterization of how continuous functions on [0,T] are uniformly approximated by an output generated by a continuous input function. The relative degree of the transfer function of the system determines those functions that can be approximated. We further show that if the initial data is allowed to vary then every continuous function is uniformly approximated by outputs generated by continuous functions.

11.
Many problems in management science and telecommunications can be solved by the analysis of a D^X/Dm/1 queueing model. In this paper, we use the zeros, both inside and outside the unit circle, of the denominator of the generating function of the model to obtain an explicit closed-form solution for the equilibrium probabilities of the number of customers in the system. The moments of the number of customers in the queue or in the system are also studied. When there are infinitely many zeros outside the unit circle, we propose an approximation method using polynomials. This method yields correct values for a finite number of the probabilities, the number depending on the degree of the polynomial approximation.

12.
13.
We propose an approach to compute the boundary crossing probabilities for a class of diffusion processes which can be expressed as piecewise monotone (not necessarily one-to-one) functionals of a standard Brownian motion. This class includes many interesting processes in real applications, e.g., Ornstein–Uhlenbeck, growth processes and geometric Brownian motion with time dependent drift. This method applies to both one-sided and two-sided general nonlinear boundaries, which may be discontinuous. Using this approach, explicit formulas for boundary crossing probabilities for certain nonlinear boundaries are obtained, which are useful in evaluation and comparison of various computational algorithms. Moreover, numerical computation can be easily done by Monte Carlo integration and the approximation errors for general boundaries are automatically calculated. Some numerical examples are presented.

14.
Type II topoisomerases are enzymes that change the topology of DNA by performing strand-passage. In particular, they unknot knotted DNA very efficiently. Motivated by this experimental observation, we investigate transition probabilities between knots. We use the BFACF algorithm to generate ensembles of polygons in Z³ of fixed knot type. We introduce a novel strand-passage algorithm which generates a Markov chain in knot space. The entries of the corresponding transition probability matrix determine state-transitions in knot space and can track the evolution of different knots after repeated strand-passage events. We outline future applications of this work to DNA unknotting.
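
The abstract above says that a transition probability matrix can track the evolution of knots after repeated strand-passage events. A minimal sketch of that bookkeeping follows: a distribution over knot types is multiplied by the matrix once per event. The three state labels and every number in `TRANSITIONS` are placeholder values chosen for easy arithmetic, not estimates from the paper, and the function name `evolve` is hypothetical.

```python
# Hypothetical state labels; the paper's knot space is not reproduced here.
STATES = ["unknot", "trefoil", "figure-eight"]

# Placeholder row-stochastic matrix: TRANSITIONS[i][j] is the assumed
# probability of moving from state i to state j in one strand passage.
TRANSITIONS = [
    [0.90, 0.10, 0.00],
    [0.50, 0.50, 0.00],
    [0.25, 0.25, 0.50],
]


def evolve(distribution, transitions, steps):
    """Return the distribution over states after `steps` transitions."""
    current = list(distribution)
    n = len(current)
    for _ in range(steps):
        # One step of the chain: new[j] = sum_i current[i] * P[i][j].
        current = [
            sum(current[i] * transitions[i][j] for i in range(n))
            for j in range(n)
        ]
    return current


# Start with certainty in the second state and apply two strand passages.
after_two = evolve([0.0, 1.0, 0.0], TRANSITIONS, 2)
```

Because each row of the matrix sums to one, the evolved distribution still sums to one after any number of steps.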

15.
Abstract

A simple matrix formula is given for the observed information matrix when the EM algorithm is applied to categorical data with missing values. The formula requires only the design matrices, a matrix linking the complete and incomplete data, and a few simple derivatives. It can be easily programmed using a computer language with operators for matrix multiplication, element-by-element multiplication and division, matrix concatenation, and creation of diagonal and block diagonal arrays. The formula is applicable whenever the incomplete data can be expressed as a linear function of the complete data, such as when the observed counts represent the sum of latent classes, a supplemental margin, or the number censored. In addition, the formula applies to a wide variety of models for categorical data, including those with linear, logistic, and log-linear components. Examples include a linear model for genetics, a log-linear model for two variables and nonignorable nonresponse, the product of a log-linear model for two variables and a logit model for nonignorable nonresponse, a latent class model for the results of two diagnostic tests, and a product of linear models under double sampling.

16.
We consider semiparametric partially linear regression models with mean function X^Tβ + g(z), where X and z are functional data. New estimators of β and g(z) are presented and some asymptotic results are given. The strong convergence rates of the proposed estimators are obtained. In our estimation, the number of observations for each subject is completely flexible. A simulation study is conducted to investigate the finite sample performance of the proposed estimators.

17.

We introduce and study the class of holographic models, which can be defined by copying some of their finite parts by means of automorphisms. We prove that this class differs from the class of countably categorical models. Characterizations of the classes of holographic Boolean algebras, abelian groups, linear orderings, fields, and equivalences are given.


18.
We propose a multinomial probit (MNP) model that is defined by a factor analysis model with covariates for analyzing unordered categorical data, and discuss its identification. Some useful MNP models are special cases of the proposed model. To obtain maximum likelihood estimates, we use the EM algorithm with its M-step greatly simplified under Conditional Maximization and its E-step made feasible by Monte Carlo simulation. Standard errors are calculated by inverting a Monte Carlo approximation of the information matrix using Louis’s method. The methodology is illustrated with simulated data.

19.
Asymptotic expansions for large deviation probabilities are used to approximate the cumulative distribution functions of noncentral generalized chi-square distributions, preferably in the far tails. The basic idea of how to deal with the tail probabilities consists in first rewriting these probabilities as large parameter values of the Laplace transform of a suitably defined function f_k; second making a series expansion of this function, and third applying a certain modification of Watson's lemma. The function f_k is deduced by applying a geometric representation formula for spherical measures to the multivariate domain of large deviations under consideration. At the so-called dominating point, the largest main curvature of the boundary of this domain tends to one as the large deviation parameter approaches infinity. Therefore, the dominating point degenerates asymptotically. For this reason the recent multivariate asymptotic expansion for large deviations in Breitung and Richter (1996, J. Multivariate Anal. 58, 1–20) does not apply. Assuming a suitably parametrized expansion for the inverse g^(-1) of the negative logarithm of the density-generating function, we derive a series expansion for the function f_k. Note that low-order coefficients from the expansion of g^(-1) influence practically all coefficients in the expansion of the tail probabilities. As an application, classification probabilities when using the quadratic discriminant function are discussed.

20.
In this paper, we investigate the problem of determining the number of clusters in the k-modes based categorical data clustering process. We propose a new categorical data clustering algorithm with automatic selection of k. The new algorithm extends the k-modes clustering algorithm by introducing a penalty term to the objective function to make more clusters compete for objects. In the new objective function, we employ a regularization parameter to control the number of clusters in a clustering process. Instead of finding k directly, we choose a suitable value of the regularization parameter such that the corresponding clustering result is the most stable one among all the generated clustering results. Experimental results on synthetic and real data sets are used to demonstrate the effectiveness of the proposed algorithm.
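
For orientation, the baseline k-modes procedure that the abstract above extends can be sketched as follows: assign each record to the nearest mode under simple matching distance, then recompute each mode attribute-wise. This is only the plain algorithm with a fixed k; the penalty term and regularization parameter are not implemented because the abstract does not specify them. The function names and the toy records are assumptions for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Simple matching distance: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, initial_modes, max_iter=100):
    """Plain k-modes clustering with a fixed number of clusters.

    records       : list of equal-length tuples of categorical values
    initial_modes : list of k starting modes (same shape as a record)
    Returns (labels, modes).
    """
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under the matching distance.
        new_labels = [
            min(range(len(modes)), key=lambda c: hamming(r, modes[c]))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes are too
        labels = new_labels
        # Update step: each mode takes the most frequent value per attribute.
        for c in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == c]
            if members:
                modes[c] = tuple(
                    Counter(column).most_common(1)[0][0]
                    for column in zip(*members)
                )
    return labels, modes


# Toy records with two categorical attributes (made-up values).
records = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z"), ("b", "z"), ("c", "z")]
labels, modes = k_modes(records, [("a", "x"), ("b", "z")])
```

In this plain version k is fixed by the number of initial modes, which is exactly the choice the proposed algorithm aims to automate.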
