Similar literature
 20 similar articles found (search time: 15 ms)
1.
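To address the limitation that clustering algorithms based on the classical partitioning idea represent each cluster by a single point, a clustering algorithm for categorical data based on generalized centers is proposed. The algorithm represents a cluster by a generalized center that contains multiple points and can therefore reflect the data distribution of the cluster. New measures of the distance between generalized centers and of the distance between clusters are further proposed, together with a method for determining the generalized centers and a clustering strategy that assigns objects to clusters based on them; in general, a single partitioning iteration suffices to obtain the final clustering. The generalized-center algorithm is applied to four benchmark datasets and compared with the well-known partitional clustering algorithm K-modes and two of its improved variants. The results show that the generalized-center algorithm achieves higher clustering accuracy with fewer iterations and is effective and feasible.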
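For orientation, the following is a minimal sketch of the single-point K-modes baseline that the abstract compares against, not of the generalized-center algorithm itself, whose definitions the abstract does not give. The function names (`matching_distance`, `mode_of`, `assign`) are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def matching_distance(a, b):
    """Simple matching dissimilarity: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_of(records):
    """Attribute-wise mode of a non-empty list of categorical records.

    This is the single representative point used by K-modes (an assumption
    of this sketch: ties are broken by first occurrence).
    """
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def assign(records, centers):
    """Assign each record to the index of its nearest center.

    Ties go to the center with the lowest index.
    """
    return [
        min(range(len(centers)), key=lambda k: matching_distance(r, centers[k]))
        for r in records
    ]
```

A generalized center, as described in the abstract, would replace the single tuple returned by `mode_of` with a set of several points per cluster; how those points are chosen is not specified here.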

2.
Markov models are widely used as a method for describing categorical data that exhibit stationary and nonstationary autocorrelation. However, diagnostic methods are a largely overlooked topic for Markov models. We introduce two types of residuals for this purpose: one for assessing the length of runs between state changes, and the other for assessing the frequency with which the process moves from any given state to the other states. Methods for calculating the sampling distribution of both types of residuals are presented, enabling objective interpretation through graphical summaries. The graphical summaries are formed using a modification of the probability integral transformation that is applicable for discrete data. Residuals from simulated datasets are presented to demonstrate when the model is, and is not, adequate for the data. The two types of residuals are used to highlight inadequacies of a model posed for real data on seabed fauna from the marine environment.

Supplemental materials, including an R package, RMC, with functions to perform the diagnostic measures on the class of models considered in this article, are available at the journal’s website. The R package is also available at CRAN.

3.
4.
We present a categorical/denotational semantics for the Lambek Syntactic Calculus (LSC), indeed for a λlD-typed version Curry-Howard isomorphic to it. The main novelty of our approach is an abstract noncommutative construction with right and left adjoints, called sequential product. It is defined through a hierarchical structure of categories reflecting the implicit permission to sequence expressions and the inductive construction of compound expressions. We claim that Lambek's noncommutative product corresponds to a noncommutative bi-endofunctor into a category, which encloses all categories of such hierarchical structure. A soundness theorem for LSC is shown with respect to this semantical framework.

5.
Graphical models are widely used to describe conditional dependence relationships among interacting random variables. Among the statistical inference problems for a graphical model, one of particular interest is using its interaction structure to reduce model complexity. As an important approach to exploiting structural information, decomposition allows a statistical inference problem to be divided into sub-problems of lower complexity. In this paper, to investigate the decomposition of covariate-dependent graphical models, we propose some useful definitions of decomposition for covariate-dependent graphical models with categorical data in the form of contingency tables. Based on such a decomposition, a covariate-dependent graphical model can be split into sub-models, and the maximum likelihood estimation of the model can be factorized into the maximum likelihood estimations of the sub-models. Moreover, some necessary and sufficient conditions for the proposed definitions of decomposition are studied.

6.
Fuzzy BCC Model for Data Envelopment Analysis   Cited by: 2 (self-citations: 0, citations by others: 2)
Fuzzy Data Envelopment Analysis (FDEA) is a tool for comparing the performance of a set of activities or organizations in an uncertain environment. Imprecise data in FDEA models are represented by fuzzy sets, and FDEA models take the form of fuzzy linear programming models. Previous research focused on solving the FDEA model of the CCR (named after Charnes, Cooper, and Rhodes) type (FCCR). In this paper, the FDEA model of the BCC (named after Banker, Charnes, and Cooper) type (FBCC) is studied. Possibility and credibility approaches are provided and compared with an α-level based approach for solving the FDEA models. Using the possibility approach, the relationship between the primal and dual models of FBCC models is revealed and fuzzy efficiency can be constructed. Using the credibility approach, an efficiency value for each DMU (Decision Making Unit) is obtained as a representative of its possible range. A numerical example is given to illustrate the proposed approaches, and the results are compared with those obtained with the α-level based approach.

7.
Abstract

We present an efficient algorithm for generating exact permutational distributions for linear rank statistics defined on stratified 2 × c contingency tables. The algorithm can compute exact p values and confidence intervals for a rich class of nonparametric problems. These include exact p values for stratified two-population Wilcoxon, Logrank, and Van der Waerden tests, exact p values for stratified tests of trend across several binomial populations, exact p values for stratified permutation tests with arbitrary scores, and exact confidence intervals for odds ratios embedded in stratified 2 × c tables. The algorithm uses network-based recursions to generate stratum-specific distributions and then combines them into an overall permutation distribution by convolution. Where only the tail area of a permutation distribution is desired, additional efficiency gains are achieved by backward induction and branch-and-bound processing of the network. The algorithm is especially efficient for highly imbalanced categorical data, a situation where the asymptotic theory is unreliable. The backward induction component of the algorithm can also be used to evaluate the conditional maximum likelihood, and its higher order derivatives, for the logistic regression model with grouped data. We illustrate the techniques with an analysis of two data sets: the leukemia data on survivors of the Hiroshima atomic bomb and data from an animal toxicology experiment provided by the U.S. Food and Drug Administration.
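The convolution step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: each stratum-specific distribution is taken as already computed and stored as a dictionary from statistic value to exact probability, the overall statistic is the sum of the stratum statistics, and the network-based recursion, backward induction, and branch-and-bound steps are not shown. The names `convolve`, `combine_strata`, and `upper_tail` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def convolve(dist_a, dist_b):
    """Distribution of X + Y for independent X ~ dist_a and Y ~ dist_b."""
    out = defaultdict(Fraction)  # Fraction() is an exact zero
    for va, pa in dist_a.items():
        for vb, pb in dist_b.items():
            out[va + vb] += pa * pb
    return dict(out)


def combine_strata(stratum_dists):
    """Fold the stratum-specific distributions into one overall distribution."""
    total = {0: Fraction(1)}  # point mass at zero: identity for convolution
    for dist in stratum_dists:
        total = convolve(total, dist)
    return total


def upper_tail(dist, observed):
    """Exact one-sided tail area P(T >= observed)."""
    return sum(p for v, p in dist.items() if v >= observed)
```

Using `Fraction` keeps the probabilities exact, which matches the spirit of an exact test; floating-point values would work the same way at the cost of rounding error.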

8.
In this article, we model multivariate categorical (binary and ordinal) response data using a very rich class of scale mixture of multivariate normal (SMMVN) link functions to accommodate heavy-tailed distributions. We consider both noninformative and informative prior distributions for SMMVN-link models. The notion of informative prior elicitation is based on available similar historical studies. The main objectives of this article are (i) to derive theoretical properties of noninformative and informative priors as well as the resulting posteriors and (ii) to develop an efficient Markov chain Monte Carlo algorithm to sample from the resulting posterior distribution. A real data example from prostate cancer studies is used to illustrate the proposed methodologies.

9.
In survival analysis, most existing approaches for analysing right-censored failure time data assume that the censoring time is independent of the failure time. However, investigators often face problems involving dependent censoring, i.e., the failure time and the censoring time are possibly dependent and may censor one another, especially in clinical trials. Without accounting for such dependence, survival distributions cannot be estimated consistently. Numerous attempts to model this dependence have been made. Among them, copula models are of particular interest because of their simple structure. Proportional hazards (PH) model analysis for informative right-censored data is discussed in this paper. An Archimedean copula is assumed for the joint distribution function of the failure time and censoring time variables. Under conditions ensuring identifiability of the parameter of the Archimedean copula, the maximum likelihood estimators of the copula parameter and of the parameters and cumulative hazard function of the PH model are worked out. Extensive simulation studies show the feasibility of the proposed method and the consistency of the estimators.

10.
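A Data Envelopment Analysis Model with a Time Variable   Cited by: 1 (self-citations: 0, citations by others: 1)
Most real production processes are multi-stage, whereas the traditional data envelopment analysis (DEA) model can evaluate only a single-stage production process, which severely limits its applicability. Building on the traditional DEA model, this paper introduces a discrete time variable to construct a DEA model that evaluates an entire multi-stage production process.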

11.
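The air-defense modeling problem for a surface ship formation consisting of one destroyer and four frigates is studied, and a modified Lanchester warfare model that emphasizes the degree of information balance is established. For Problem 1, the model is solved using time-varying relative position and velocity data for the formation and the incoming missiles, yielding the optimal formation. For Problem 2, a geometric model of incoming missiles pursuing the ship formation is built, and a method for estimating the formation's minimum defense depth is derived by analyzing the interception regions from a single ship up to the whole formation. For Problem 3, a method similar to that of Problem 2 shows that 15 batches of incoming missiles can be intercepted with information support, an improvement of 37.20% over Problem 2. For Problem 4, cluster analysis is applied to the attached data with known intent, the resulting cluster centers are taken as intent-recognition patterns, and the possible intent of suspicious aerial targets is identified using a discriminant analysis method based on the proposed intent deviation degree. For Problem 5, based on an analysis of the characteristics of information superiority and the important role of the degree of information balance, a model for computing the degree of information balance between the two sides is given in terms of each side's ability to control combat information about the other and how current that information is. Using the Gulf War as a case study, comparative calculations with the two kinds of models provide a preliminary validation of the modified model.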

12.
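Latent variable models are a statistical method widely used to characterize the correlation among multiple observed variables. To describe the association in multivariate categorical data, such models usually assume that each categorical variable is linked to a latent continuous variable or vector, and that the value of the categorical variable is determined by the window (interval) in which the latent variable or vector is observed, which defines the category. This approach has a weakness, however: the observed likelihood, or the model, suffers from an identifiability problem. The model's lack of identifiability inevitably affects estimation....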

13.
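For the semiparametric regression model with longitudinal data, estimators of the parametric and nonparametric components are constructed based on generalized estimating equations and a general weight function method. Under suitable conditions, the parameter estimator is shown to be asymptotically normal, and the optimal convergence rate of the estimator of the nonparametric regression function is obtained. A simulation study illustrates the accuracy of the proposed estimators in finite samples.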

14.
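Influenza models without and with vaccination are established and studied using the qualitative and stability theory of ordinary differential equations together with computational tools. Using data from the Chinese Center for Disease Control and Prevention, parameters are simulated in MATLAB to obtain the range of the basic reproduction number of influenza and an estimate of the annual vaccine production. In addition, the disease-free and endemic equilibria of the model are derived; the disease-free equilibrium is proved to be globally asymptotically stable when the basic reproduction number is less than 1, and the endemic equilibrium is proved to be locally stable whenever it exists.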

15.
Abstract

A simple matrix formula is given for the observed information matrix when the EM algorithm is applied to categorical data with missing values. The formula requires only the design matrices, a matrix linking the complete and incomplete data, and a few simple derivatives. It can be easily programmed using a computer language with operators for matrix multiplication, element-by-element multiplication and division, matrix concatenation, and creation of diagonal and block diagonal arrays. The formula is applicable whenever the incomplete data can be expressed as a linear function of the complete data, such as when the observed counts represent the sum of latent classes, a supplemental margin, or the number censored. In addition, the formula applies to a wide variety of models for categorical data, including those with linear, logistic, and log-linear components. Examples include a linear model for genetics, a log-linear model for two variables and nonignorable nonresponse, the product of a log-linear model for two variables and a logit model for nonignorable nonresponse, a latent class model for the results of two diagnostic tests, and a product of linear models under double sampling.

16.
Acta Mathematicae Applicatae Sinica, English Series - In survival analysis, data are frequently collected by some complex sampling schemes, e.g., length biased sampling, case-cohort sampling and so...

17.
This paper proposes and analyzes a method called meshless parameterization for reconstructing curves from unordered point samples. The method solves a linear system of equations based on convex combinations so as to map the sampled points into corresponding parameter values, whose natural ordering provides the ordering of the points. Using the theory of M-matrices, we derive natural conditions on the point sample which guarantee the correct ordering. A sufficient condition is that the underlying curve be tangent-continuous and free of self-intersections and that the sample be dense enough.
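The idea of ordering points through a convex-combination linear system can be sketched as below. This is a rough illustration, not the paper's method in detail: it assumes an open curve whose two end points are known, takes the neighbours of a point to be all samples within a fixed `radius`, and uses uniform weights, none of which the abstract specifies. The helper names `solve_linear` and `order_points` are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def order_points(points, start, end, radius):
    """Return point indices ordered along the curve from start to end.

    Each interior point's parameter t_i is required to be the average
    (a convex combination with uniform weights) of its neighbours'
    parameters; the end points are pinned to t = 0 and t = 1.
    """
    n = len(points)
    a = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(n):
        a[i][i] = 1.0
        if i == start:
            b[i] = 0.0
        elif i == end:
            b[i] = 1.0
        else:
            nbrs = [j for j in range(n)
                    if j != i and math.dist(points[i], points[j]) <= radius]
            for j in nbrs:
                a[i][j] = -1.0 / len(nbrs)
    t = solve_linear(a, b)
    return sorted(range(n), key=lambda i: t[i])
```

The radius has to be small enough that neighbourhoods do not bridge distant parts of the curve, yet large enough that every interior point has neighbours on both sides, which is one way to read the abstract's requirement that the sample be dense enough.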

18.
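Based on categorical logic and type theory, the theories of commutative groups, rings, and left R-modules in type theory are introduced. It is proved that the models of the theory of commutative groups in a category satisfying the distributive law are commutative group objects, the models of the theory of rings are ring objects, and the models of the theory of left R-modules are left R-module objects. Models of the theory of left R-modules are also given in several concrete categories, including the category of sets and categories of sheaves.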

19.
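Treating the random effects as missing data, this paper obtains an estimation method for the semiparametric Beta regression model with longitudinal data, based on the Q-function and the EM algorithm and using P-splines to fit the nonparametric part. Based on the case-deletion model, we obtain the generalized Cook's distance for the parametric part and the generalized DFIT for the nonparametric part. We also study local influence analysis of the model under four different perturbation schemes and obtain the corresponding influence matrices. Finally, two numerical examples verify the effectiveness of the proposed diagnostic statistics.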

20.
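To address shortcomings of traditional methods, and building on the standard cure rate model, a modeling approach that applies an extended cure rate model to the reliability analysis of masked data is proposed. It is shown that the approach requires only a few prior assumptions on the model and that, when information is insufficient, it can identify the effect of covariates on the system lifetime distribution, thereby improving the robustness of model estimation. Markov chains for the posterior distributions of the relevant parameters are simulated dynamically by an MCMC method based on Gibbs sampling, giving Bayesian estimates of the model parameters under random censoring. The results of a case study demonstrate the intuitiveness and effectiveness of the model in reliability applications.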
