Similar Articles
20 similar articles found.
1.
There is a growing interest in planning and implementing broad‐scale clinical trials with a focus on prevention and screening. Often, the data‐generating mechanism for such experiments can be viewed as a semi‐Markov process. In this communication, we develop general expressions for the steady‐state probabilities for regenerative semi‐Markov processes. Hence, the probability of being in a certain state at the time of recruitment to a clinical trial can be calculated. An application to breast cancer prevention is demonstrated. Copyright © 1999 John Wiley & Sons, Ltd.
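For a regenerative semi-Markov process, the long-run probability of occupying state i can be written as π_i μ_i / Σ_j π_j μ_j, where π is the stationary distribution of the embedded jump chain and μ_i is the mean sojourn time in state i. A minimal sketch of this computation (the states, transition matrix, and sojourn times below are hypothetical, not from the paper):

```python
def embedded_stationary(P, iters=1000):
    """Stationary distribution of the embedded jump chain via damped
    power iteration (the damping handles periodic chains)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(pi[j] * P[j][i] for j in range(n)) for i in range(n)]
        pi = [0.5 * a + 0.5 * b for a, b in zip(pi, step)]
    return pi

def semi_markov_steady_state(P, mu):
    """Long-run occupancy probabilities of a regenerative semi-Markov
    process: embedded stationary weights times mean sojourn times."""
    pi = embedded_stationary(P)
    w = [p * m for p, m in zip(pi, mu)]
    total = sum(w)
    return [x / total for x in w]

# Hypothetical 2-state process alternating between states with mean
# sojourn times 1 and 3.
P = [[0.0, 1.0], [1.0, 0.0]]
probs = semi_markov_steady_state(P, mu=[1.0, 3.0])   # [0.25, 0.75]
```

The state with the longer mean sojourn time accumulates proportionally more long-run occupancy, which is exactly what makes the recruitment-time state probabilities computable.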

2.
Yang Yuehan; Zhu Ji. Science China Mathematics, 2020, 63(6): 1203-1218
The problem of estimating high-dimensional Gaussian graphical models has gained much attention in recent years. Most existing methods can be considered as one-step approaches, being either regression-based or likelihood-based. In this paper, we propose a two-step method for estimating the high-dimensional Gaussian graphical model. Specifically, the first step serves as a screening step, in which many entries of the concentration matrix are identified as zeros and thus removed from further consideration. Then in the second step, we focus on the remaining entries of the concentration matrix and perform selection and estimation for nonzero entries of the concentration matrix. Since the dimension of the parameter space is effectively reduced by the screening step, the estimation accuracy of the estimated concentration matrix can be potentially improved. We show that the proposed method enjoys desirable asymptotic properties. Numerical comparisons of the proposed method with several existing methods indicate that the proposed method works well. We also apply the proposed method to a breast cancer microarray data set and obtain some biologically meaningful results.
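The paper screens entries of the concentration matrix itself; as a simplified stand-in for that idea, the sketch below screens candidate edges by thresholding marginal sample correlations, so that only the surviving pairs would be passed to the second estimation step (data and threshold are hypothetical):

```python
import math

def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def screen_edges(columns, tau):
    """Step 1 (screening): keep only variable pairs whose absolute sample
    correlation exceeds tau; step 2 would re-estimate these entries only."""
    p = len(columns)
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pearson(columns[i], columns[j])) > tau}

# Toy data: variables 0 and 1 are strongly related, variable 2 is not.
cols = [[1, 2, 3, 4, 5],
        [1.1, 1.9, 3.2, 3.9, 5.1],
        [5, 1, 4, 2, 3]]
edges = screen_edges(cols, tau=0.6)   # only the (0, 1) edge survives
```

The point of the screening step is the reduction of the parameter space: here two of the three candidate entries are discarded before any refined estimation is attempted.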

3.
In this paper, we propose a new optimization framework for improving feature selection in medical data classification. We call this framework Support Feature Machine (SFM). SFM is used in feature selection to find the optimal group of features that shows strong separability between two classes. The separability is measured in terms of inter-class and intra-class distances. The objective of the SFM optimization model is to maximize the number of correctly classified data samples in the training set, whose intra-class distances are smaller than inter-class distances. This concept can be incorporated with the modified nearest neighbor rule for unbalanced data. In addition, a variation of SFM that provides feature weights (prioritization) is also presented. The proposed SFM framework and its extensions were tested on 5 real medical datasets related to the diagnosis of epilepsy, breast cancer, heart disease, diabetes, and liver disorders. The classification performance of SFM is compared with those of support vector machine (SVM) classification and Logical Analysis of Data (LAD), which is also an optimization-based feature selection technique. SFM gives very good classification results, yet uses far fewer features to make the decision than SVM and LAD. This result has significant implications for diagnostic practice. The outcome of this study suggests that the SFM framework can be used as a quick decision-making tool in real clinical settings.
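The core separability criterion (a sample counts as correctly classified when its nearest intra-class distance is smaller than its nearest inter-class distance) can be sketched directly; the data below are hypothetical, and this scores a fixed feature set rather than solving the full SFM optimization:

```python
import math

def nearest(sample, others):
    """Euclidean distance from a sample to its closest neighbour."""
    return min(math.dist(sample, o) for o in others)

def separability_score(X, y):
    """Fraction of samples whose nearest same-class neighbour is closer
    than their nearest other-class neighbour (modified NN rule)."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        same = [x for j, (x, c) in enumerate(zip(X, y)) if c == yi and j != i]
        other = [x for x, c in zip(X, y) if c != yi]
        if nearest(xi, same) < nearest(xi, other):
            correct += 1
    return correct / len(X)

# Two well-separated toy classes in a 2-feature space.
X = [(0, 0), (0.1, 0), (0, 0.1), (1, 1), (1.1, 1), (1, 0.9)]
y = [0, 0, 0, 1, 1, 1]
score = separability_score(X, y)   # 1.0: every sample is separable
```

In the SFM framework this score would be evaluated over candidate feature subsets, with the optimizer searching for the subset that maximizes it.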

4.
In developing decision-making models for the evaluation of medical procedures, the model parameters can be estimated by fitting the model to data observed in (randomized) trials. For complex models that are implemented by discrete event simulation (microsimulation) of individual life histories, the Score Function (SF) method can potentially be an appropriate approach for such estimation exercises. We test this approach for a microsimulation model of breast cancer screening that is fitted to data from the HIP randomized trial for early detection of breast cancer. Comparison of the parameter values estimated using the SF method and the analytical solution shows that the method performs well on this simple model. The precision of the estimated parameter values depends (as expected) on the size of the sample of simulated life histories, and on the number of parameters estimated. Using analytical representations for parts of the microsimulation model can increase the precision of the estimated parameter values. Compared to the Nelder-Mead simplex method, which is often used in stochastic simulation because of its ease of implementation, the SF method is clearly more efficient in terms of the ratio of computing time to precision of the estimates. The additional analytical investment needed to implement the SF method in an (existing) simulation model may well be worth the effort.
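The SF method estimates derivatives of an expected performance measure from simulated histories via E[f(X) ∂θ log p(X; θ)]. A minimal sketch under simplified assumptions (X ~ Exponential(θ) with f(x) = x, where the exact derivative of E[X] = 1/θ is −1/θ²; the microsimulation model itself is far richer):

```python
import random

def score_function_gradient(theta, f, n=200_000, seed=1):
    """Monte Carlo estimate of d/dtheta E[f(X)] for X ~ Exp(theta),
    using the score  d log p(x; theta) / d theta = 1/theta - x."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(theta)
        total += f(x) * (1.0 / theta - x)
    return total / n

grad = score_function_gradient(1.0, lambda x: x)   # close to the exact -1
```

Because the same simulated histories yield both the objective value and its gradient, a gradient-based fit needs far fewer simulation runs than derivative-free search such as Nelder-Mead, which is the efficiency advantage the abstract refers to.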

5.

We investigate semiparametric estimation of regression coefficients through generalized estimating equations with single-index models when some covariates are missing at random. Existing popular semiparametric estimators may run into difficulties when some selection probabilities are small or the dimension of the covariates is not low. We propose a new simple parameter estimator using a kernel-assisted estimator for the augmentation by a single-index model without using the inverse of selection probabilities. We show that under certain conditions the proposed estimator is as efficient as the existing methods based on standard kernel smoothing, which are often practically infeasible in the case of multiple covariates. A simulation study and a real data example are presented to illustrate the proposed method. The numerical results show that the proposed estimator avoids some numerical issues caused by estimated small selection probabilities that are needed in other estimators.
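The kernel-assisted augmentation rests on standard kernel smoothing of a response on a (single-index) covariate. A minimal Nadaraya-Watson sketch, with hypothetical design points and bandwidth, rather than the authors' full estimating-equation procedure:

```python
import math

def nadaraya_watson(x0, xs, ys, h):
    """Kernel regression estimate of E[Y | X = x0] with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

xs = [i / 10 for i in range(11)]   # design points 0.0, 0.1, ..., 1.0
ys = xs                            # Y = X exactly, no noise
m_half = nadaraya_watson(0.5, xs, ys, h=0.2)   # symmetry gives 0.5
```

In the single-index approach, the multivariate covariate vector is first reduced to a scalar index before smoothing, which is what avoids the curse of dimensionality that makes standard multivariate kernel smoothing "practically infeasible in the case of multiple covariates".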


6.
The case-cohort design is widely used in large epidemiological studies and prevention trials for cost reduction. In such a design, covariates are assembled only for a subcohort which is a random subset of the entire cohort and any additional cases outside the subcohort. In this paper, we discuss the case-cohort analysis with a class of general additive-multiplicative hazard models which includes the commonly used Cox model and additive hazard model as special cases. Two sampling schemes for the subcohort, Bernoulli sampling with arbitrary selection probabilities and stratified simple random sampling with fixed subcohort sizes, are discussed. In each setting, an estimating function is constructed to estimate the regression parameters. The resulting estimator is shown to be consistent and asymptotically normally distributed. The limiting variance-covariance matrix can be consistently estimated by the case-cohort data. A simulation study is conducted to assess the finite sample performances of the proposed method and a real example is provided.
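Under Bernoulli sampling, each cohort member enters the subcohort independently with probability p, all cases are retained, and non-case contributions are inverse-probability weighted. A sketch on entirely hypothetical data, checking only the weighting logic (not the full estimating function):

```python
import random

def case_cohort_sample(cohort, p, seed=7):
    """cohort: list of (covariate, is_case). Keep every case; keep controls
    with probability p and weight them by 1/p (Horvitz-Thompson)."""
    rng = random.Random(seed)
    sample = []
    for x, is_case in cohort:
        if is_case:
            sample.append((x, is_case, 1.0))
        elif rng.random() < p:
            sample.append((x, is_case, 1.0 / p))
    return sample

# Hypothetical cohort: Gaussian covariate, ~5% cases.
rng = random.Random(0)
cohort = [(rng.gauss(2.0, 1.0), rng.random() < 0.05) for _ in range(20_000)]
sub = case_cohort_sample(cohort, p=0.1)
wmean = (sum(w * x for x, c, w in sub if not c) /
         sum(w for x, c, w in sub if not c))   # close to the control mean 2.0
```

Only about a tenth of the controls need covariate assembly, yet the weighted sample still recovers full-cohort quantities, which is the cost-reduction rationale of the design.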

7.
Cancer virotherapy is studied in mathematical modeling to improve tumor elimination. Since various oncolytic viruses are used for cancer therapy and virus selection is an important research problem, we constructed deterministic and stochastic models of cancer-virus dynamics. We investigated the sensitivities of characteristic virus parameters using a reproduction ratio. Equilibrium points that are locally and globally asymptotically stable, corresponding respectively to therapy failure or partial success and to therapy failure, were determined. A stochastic system was derived from the deterministic model. Tumor extinction probabilities under changing parameter values were investigated. Results suggest that viruses with high infection rates and optimal cytotoxicity are effective for cancer treatment.
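A deterministic tumor-virus sketch under an entirely hypothetical parameterization (uninfected tumor cells T, infected cells I, free virus V; the model form and all rates are invented for illustration, not taken from the paper). In this toy model the reproduction ratio is R0 = p·b·K/(d·m): each infected cell releases on average p/d virions, and each virion infects roughly b·K/m cells near carrying capacity.

```python
def simulate(T0=100.0, I0=0.0, V0=10.0, r=0.1, K=200.0,
             b=0.002, d=0.2, p=0.4, m=0.1, dt=0.01, steps=5000):
    """Forward-Euler integration of a hypothetical tumor-virus ODE model:
       T' = r T (1 - T/K) - b T V
       I' = b T V - d I
       V' = p I - m V
    """
    T, I, V = T0, I0, V0
    for _ in range(steps):
        dT = r * T * (1 - T / K) - b * T * V
        dI = b * T * V - d * I
        dV = p * I - m * V
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
    return T, I, V

R0 = 0.4 * 0.002 * 200.0 / (0.2 * 0.1)   # = 8 > 1: the infection can spread
T, I, V = simulate()
```

R0 > 1 means the virus can establish itself in the tumor; sweeping the infection rate b and cytotoxicity d in such a simulation is one simple way to explore the parameter sensitivities the abstract describes.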

8.
In the selection of investment projects, it is important to account for exogenous uncertainties (such as macroeconomic developments) which may impact the performance of projects. These uncertainties can be addressed by examining how the projects perform across several scenarios; but it may be difficult to assign well-founded probabilities to such scenarios, or to characterize the decision makers’ risk preferences through a uniquely defined utility function. Motivated by these considerations, we develop a portfolio selection framework which (i) uses set inclusion to capture incomplete information about scenario probabilities and utility functions, (ii) identifies all the non-dominated project portfolios in view of this information, and (iii) offers decision support for rejection and selection of projects. The proposed framework enables interactive decision support processes where the implications of additional probability and utility information or further risk constraints are shown in terms of corresponding decision recommendations.
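With scenario probabilities known only up to a set, one portfolio dominates another when its expected utility is at least as high under every extreme point of that set and strictly higher under some. A sketch with hypothetical payoffs and two hypothetical extreme probability vectors (the paper's framework also handles incompletely specified utilities):

```python
def expected_utility(payoffs, probs, utility=lambda x: x):
    """Expected utility of a payoff vector under one probability vector."""
    return sum(p * utility(v) for p, v in zip(probs, payoffs))

def dominates(a, b, prob_vertices, utility=lambda x: x):
    """a dominates b iff EU(a) >= EU(b) at every extreme point of the
    probability set, with strict inequality at some point."""
    diffs = [expected_utility(a, p, utility) - expected_utility(b, p, utility)
             for p in prob_vertices]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)

vertices = [(0.5, 0.3, 0.2), (0.3, 0.4, 0.3)]   # incomplete probability info
A, B = (10, 8, 6), (9, 7, 7)
a_beats_b = dominates(A, B, vertices)   # True: A wins at both vertices
```

Portfolios that no other portfolio dominates form the non-dominated set; as additional probability or utility information shrinks the set of vertices, more pairs become comparable and the recommendation sharpens.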

9.
This paper addresses a kind of risk decision-making problem that exists widely in public administration and business management, characterized by the following: (1) occurrence probabilities of states of nature can be estimated by analysing historical observations, but historical observations of different objects are non-homogeneous; (2) the relation between observations and occurrence probabilities of states of nature is affected by some qualitative and quantitative indicators; (3) it is a real-time decision-making problem, that is, many decisions for different objects must be made in a limited time; (4) considering the execution of decisions, the impact of resource constraints is an important issue in the decision-making process. In this paper, we develop a rule-based approach to address the problem. In the proposed approach, a two-step clustering method is employed to classify objects into categories, such that the observations in each category can be approximately viewed as homogeneous. For objects in each category, occurrence probabilities of states of nature are estimated by logistic regression, and the decision rule is obtained by solving an optimization model that minimizes the total decision risk while satisfying resource constraints. The effectiveness and efficiency of our approach are illustrated through its application to China's customs inspection decisions.
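Within one category of approximately homogeneous objects, the occurrence probability of a state of nature can be estimated by logistic regression on the available indicators. A stdlib sketch with a single hypothetical indicator, trained by gradient ascent on the log-likelihood (the paper's pipeline adds clustering and an optimization model on top of this step):

```python
import math

def fit_logistic(xs, ys, lr=0.1, iters=3000):
    """One-feature logistic regression trained by batch gradient ascent."""
    w, b = 0.0, 0.0
    for _ in range(iters):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (y - p) * x
            gb += (y - p)
        w += lr * gw / len(xs)
        b += lr * gb / len(xs)
    return w, b

def predict(w, b, x):
    """Estimated occurrence probability for indicator value x."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

# Hypothetical inspection data: indicator value vs. violation found.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
w, b = fit_logistic(xs, ys)
```

The fitted probabilities then feed the downstream optimization: objects whose estimated risk exceeds a threshold chosen under the resource constraint are the ones selected for inspection.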

10.
This paper distinguishes between objective probability (or chance) and subjective probability. Most statistical methods in machine learning are based on the hypothesis that there is a random experiment from which we get a set of observations. This random experiment could be identified with a chance or objective probability, but these probabilities depend on some unknown parameters. Our knowledge of these parameters is not objective, and in order to learn about them, we must assess some epistemic probabilities about their values. In some cases, our objective knowledge about these parameters is vacuous, so the question is: what epistemic probabilities should be assumed? In this paper we argue for the assumption of non-vacuous (a proper subset of [0, 1]) interval probabilities. There are several reasons for this; some are based on the betting interpretation of epistemic probabilities, while others are based on the learning capabilities under the vacuous representation. The implications of the selection of epistemic probabilities for concepts such as conditioning and learning are studied. It is shown that in order to maintain some reasonable learning capabilities, we have to assume more informative prior models than those frequently used in the literature, such as the imprecise Dirichlet model.

11.
Cancer stem cells are responsible for tumor survival and resurgence and are thus essential in developing novel therapeutic strategies against cancer. Mathematical models can help understand the interaction of cancer stem and differentiated cells in tumor growth, and thus have the potential to help in designing experiments for developing novel therapeutic strategies against cancer. In this paper, using the theory of functional and ordinary differential equations, we study the existence and stability of nonlinear growth kinetics of breast cancer stem cells. First, we provide a sufficient condition for the existence and uniqueness of the solution for nonlinear growth kinetics of breast cancer stem cells. Then we study the uniform asymptotic stability of the zero solution. By using linearization techniques, we also provide a criterion for uniform asymptotic stability of a nontrivial steady‐state solution with and without time delays. We present a theorem from complex analysis that gives conditions under which this criterion is satisfied. Next, we apply these theorems to a special case of the system of functional differential equations that has been used to model nonlinear growth kinetics of breast cancer stem cells. The theoretical results are further justified by numerical testing examples. Consistent with the theory, our numerical examples show that time delays can disrupt stability. All the results can be easily extended to study more general cell lineage models. Copyright © 2017 John Wiley & Sons, Ltd.

12.
A population-based cohort consisting of 126,141 men and 122,208 women born between 1874 and 1931 and at risk for breast or colorectal cancer after 1965 was identified by linking the Utah Population Data Base and the Utah Cancer Registry. The hazard function for cancer incidence is estimated from left-truncated and right-censored data based on the conditional likelihood. Four estimation procedures based on the conditional likelihood are used to estimate the age-specific hazard function from the data: the life-table method, a kernel method based on the Nelson-Aalen estimator, a spline estimate, and a proportional hazards estimate based on splines with birth year as the sole covariate. The results are consistent with an increasing hazard for both breast and colorectal cancer through age 85 or 90. After age 85 or 90, the hazard function for female breast and colorectal cancer may reach a plateau or decrease, although the hazard function for male colorectal cancer appears to continue to rise through age 105. The hazard function for both breast and colorectal cancer appears to be higher for more recent birth cohorts, with a more pronounced birth-cohort effect for breast cancer than for colorectal cancer. The age-specific hazard for colorectal cancer appears to be higher for men than for women. The shape of the hazard function for both breast and colorectal cancer appears to be consistent with a two-stage model of spontaneous carcinogenesis in which the initiation rate is constant or increasing. Inheritance of initiated cells appears to play a minor role.
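The kernel/Nelson-Aalen approach starts from cumulative-hazard increments d/Y(t), where the risk set Y(t) counts subjects with entry time < t ≤ exit time, which is how left truncation and right censoring enter. A sketch on a tiny hypothetical cohort (the kernel smoothing of these increments is omitted):

```python
def nelson_aalen(subjects):
    """subjects: (entry, exit, event) triples with left truncation.
    Returns (event_time, hazard_increment d/Y) pairs in time order."""
    event_times = sorted({exit for entry, exit, event in subjects if event})
    increments = []
    for t in event_times:
        at_risk = sum(1 for entry, exit, _ in subjects if entry < t <= exit)
        deaths = sum(1 for _, exit, event in subjects if event and exit == t)
        increments.append((t, deaths / at_risk))
    return increments

# Hypothetical cohort: entry age, exit age, event indicator (0 = censored).
data = [(0, 5, 1), (0, 3, 1), (1, 4, 0), (2, 6, 1)]
inc = nelson_aalen(data)
cum_hazard = sum(h for _, h in inc)   # 1/4 + 1/2 + 1/1 = 1.75
```

Smoothing these raw increments with a kernel yields the age-specific hazard estimate; the entry-time condition in the risk set is what prevents the bias that left truncation would otherwise introduce.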

13.
In this article, a conditional likelihood approach is developed for dealing with ordinal data with missing covariates in the proportional odds model. Based on the validation data set, we propose Breslow and Cain (Biometrika 75:11–20, 1988) type estimators using different estimates of the selection probabilities, which may be treated as nuisance parameters. Under the assumption that the observed covariates and surrogate variables are categorical, we present large-sample theory for the proposed estimators and show that they are more efficient than the estimator using the true selection probabilities. Simulation results support the theoretical analysis. We also illustrate the approaches using data from a survey of cable TV satisfaction.

14.
Logistic regression techniques can be used to restrict the conditional probabilities of a Bayesian network for discrete variables. More specifically, each variable of the network can be modeled through a logistic regression model, in which the parents of the variable define the covariates. When all main effects and interactions between the parent variables are incorporated as covariates, the conditional probabilities are estimated without restrictions, as in a traditional Bayesian network. By incorporating interaction terms up to a specific order only, the number of parameters can be drastically reduced. Furthermore, ordered logistic regression can be used when the categories of a variable are ordered, resulting in even more parsimonious models. Parameters are estimated by a modified junction tree algorithm. The approach is illustrated with the Alarm network.
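The parameter saving is easy to see for a binary child with k binary parents: an unrestricted CPT needs 2^k free parameters, while a main-effects-only logistic model needs just k + 1. A sketch that expands hypothetical main-effect coefficients into the full conditional probability table:

```python
import math
from itertools import product

def logistic_cpt(intercept, main_effects):
    """Conditional probability table P(child = 1 | parents) induced by a
    main-effects-only logistic regression on binary parents."""
    k = len(main_effects)
    cpt = {}
    for parents in product([0, 1], repeat=k):
        eta = intercept + sum(b * x for b, x in zip(main_effects, parents))
        cpt[parents] = 1.0 / (1.0 + math.exp(-eta))
    return cpt

# Hypothetical coefficients: 4 parameters generate all 2^3 = 8 CPT entries.
cpt = logistic_cpt(-1.0, [2.0, 0.5, -0.5])
```

Adding interaction terms up to order 2, 3, ... interpolates between this parsimonious model and the fully unrestricted CPT of a traditional Bayesian network.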

15.
Model selection algorithms are required to efficiently traverse the space of models. In problems with high-dimensional and possibly correlated covariates, efficient exploration of the model space becomes a challenge. To overcome this, a multiset is placed on the model space to enable efficient exploration of multiple model modes with minimal tuning. The multiset model selection (MSMS) framework is based on independent priors for the parameters and model indicators on variables. Posterior model probabilities can be easily obtained from multiset averaged posterior model probabilities in MSMS. The effectiveness of MSMS is demonstrated for linear and generalized linear models. Supplementary material for this article is available online.

16.
For a general class of order selection criteria, we establish analytic and non-asymptotic evaluations of both the underfitting and overfitting sets of selected models. These evaluations are further specified in various situations including regressions and autoregressions with finite or infinite variances. We also show how upper bounds for the misfitting probabilities and hence conditions ensuring the weak consistency can be derived from the given evaluations. Moreover, it is demonstrated how these evaluations, combined with a law of the iterated logarithm for some relevant statistic, can provide conditions ensuring the strong consistency of the model selection criterion used.

17.
We present an alternative model for multifactorial inheritance. By changing the way the malformation (and selection) is determined from the genetic information, we arrive at a model that can be properly handled in the mathematical sense. This includes the proof of population convergence and computation of conditional malformation probabilities in a closed form. We also present a comparison to similar models and results of fitting our model to Hungarian data.

18.
19.
When different values of the long-term liability coefficient γ are used for a listed bank, the default probabilities calculated by the KMV model differ dramatically. A listed bank's default probability PD_{i,CS} can be inferred from the actual credit spreads of its bonds, while its theoretical default probability PD_{i,KMV} is determined by the KMV model for a given long-term liability coefficient γ. This paper builds an optimization model that minimizes the total discrepancy Σ_{i=1}^{n} |PD_{i,KMV} − PD_{i,CS}| between the theoretical and actual default probabilities, and thereby determines the optimal long-term liability coefficient γ for the KMV model. With this optimal coefficient, we construct a model for estimating the default probabilities of listed banks that have not issued bonds, and empirically estimate the default probabilities of all 14 listed banks in China. The contributions of this paper are threefold. First, the optimal long-term liability coefficient γ of the KMV model is determined by minimizing the total discrepancy Σ_{i=1}^{n} |PD_{i,KMV} − PD_{i,CS}| between the default probabilities calculated by the KMV model and those implied by actual credit spreads; this makes the coefficient consistent with actual spreads in the capital market and resolves the problem in existing research that default probability estimates change drastically as γ varies between 0 and 1. Second, the empirical study shows that with γ = 0.7654, the default probabilities of Chinese listed banks computed by the KMV model are closest to those accepted by the Chinese bond market. Third, the empirical results show that state-owned listed banks have the lowest default probabilities, regional listed banks have relatively high default probabilities, and the other listed banks lie in between.
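With asset value and volatility taken as given (in practice they are backed out from equity prices via the Merton model), the default probability underlying the KMV calculation is PD = Φ(−DD), with the default point set to short-term debt plus γ times long-term debt. A sketch in which all balance-sheet figures are hypothetical:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kmv_pd(assets, sigma, short_debt, long_debt, gamma, mu=0.0, horizon=1.0):
    """Merton-style default probability with default point
    DP = short-term debt + gamma * long-term debt."""
    dp = short_debt + gamma * long_debt
    dd = ((math.log(assets / dp) + (mu - 0.5 * sigma ** 2) * horizon)
          / (sigma * math.sqrt(horizon)))          # distance to default
    return norm_cdf(-dd)

# A larger gamma raises the default point and hence the default probability,
# which is why the choice of gamma drives the estimates so strongly.
pd_low = kmv_pd(1000.0, 0.25, 300.0, 400.0, gamma=0.1)
pd_high = kmv_pd(1000.0, 0.25, 300.0, 400.0, gamma=0.7654)
```

The paper's optimization chooses γ so that these model-implied probabilities, computed across all bond-issuing listed banks, best match the probabilities implied by observed credit spreads.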

20.
The combination of mathematical models and uncertainty measures can be applied in the area of data mining for diverse objectives, with the final aim of supporting decision making. The maximum entropy function is an excellent measure of uncertainty when the information is represented by a mathematical model based on imprecise probabilities. In this paper, we present algorithms to obtain the maximum entropy value when the available information is represented by a new model based on imprecise probabilities: the nonparametric predictive inference model for multinomial data (NPI-M), which leads to a type of entropy-linear program. To reduce the complexity of the model, we prove that the NPI-M lower and upper probabilities for any general event can be expressed as a combination of the lower and upper probabilities for the singleton events, and that this model cannot be associated with a closed polyhedral set of probabilities. An algorithm to obtain the maximum entropy probability distribution on the set associated with NPI-M is presented. We also consider a model which uses the closed and convex set of probability distributions generated by the NPI-M singleton probabilities, a closed polyhedral set. We call this model A-NPI-M. A-NPI-M can be seen as an approximation of NPI-M that is simpler to use, because it is not necessary to consider the set of constraints associated with the exact model.
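For the polyhedral approximation A-NPI-M, where each singleton probability is constrained to an interval [l_i, u_i], the maximum-entropy distribution makes the probabilities as uniform as the intervals allow. A greedy water-filling sketch (interval bounds hypothetical; assumes Σl ≤ 1 ≤ Σu; the paper's exact algorithm for the non-polyhedral NPI-M set is more involved):

```python
def max_entropy_in_box(lower, upper):
    """Make probabilities as equal as possible subject to
    lower[i] <= p[i] <= upper[i] and sum(p) == 1: clamp one violated
    bound at a time, then share the remaining mass equally."""
    n = len(lower)
    p, free, mass = [0.0] * n, set(range(n)), 1.0
    while free:
        share = mass / len(free)
        violator = next((i for i in sorted(free)
                         if share < lower[i] or share > upper[i]), None)
        if violator is None:
            for i in free:
                p[i] = share          # uniform share fits all remaining bounds
            break
        bound = lower[violator] if share < lower[violator] else upper[violator]
        p[violator] = bound
        mass -= bound
        free.discard(violator)
    return p

# One singleton forced to at least 0.5; the rest split the remainder evenly.
p = max_entropy_in_box([0.5, 0.0, 0.0], [1.0, 1.0, 1.0])   # [0.5, 0.25, 0.25]
```

With vacuous bounds the result is uniform (maximal entropy); tighter singleton intervals force the distribution away from uniformity and thus lower the attainable maximum entropy.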
