Similar literature
 20 similar articles found
1.
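A central problem in DNA microarray data analysis is how to isolate the small number of specific genes that distinguish different tissue types; constructing an appropriate statistical model to characterize the expression patterns of these tissue types is especially important. To this end, based on the characteristics of DNA microarray data, we assume that the log-transformed microarray data follow a mixture of normal distributions. We use hierarchical Bayesian priors to capture the correlation among genes, build the model with a hierarchical Bayesian approach, propose a criterion that characterizes differences in gene expression between tissues, and compute this criterion iteratively by MCMC. Simulations show that our model has good discriminating ability.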

2.
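A new classification method   Total citations: 5, self-citations: 0, citations by others: 5
Building on the attribute clustering network, this paper proposes a heap nearest-neighbor classification method. By adding supervised information to unsupervised attribute clustering, the number of heaps can be selected adaptively. The number of neighbors examined for a sample depends on the size of the heap it belongs to, so the number of neighbors considered is not the same for every sample. The method is applicable to classification problems with high-dimensional, small-sample data. We apply it to cancer identification from gene expression profiles, and the results show a considerable improvement in classification performance.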

3.
The proposed model describes the interaction among normal, immune and tumor cells in a tumor with a chemotherapeutic drug, using a system of four coupled partial differential equations. The dimensions of the tumor and initial conditions of tumor cells are chosen under the assumption that the tumor is already large enough in size to be detectable with the available clinical devices. The pattern of distribution of tumor cells is drafted on the basis of clinical observations. The stability of the system is established with tumor and tumor-free equilibria. The process of tumor regression with the introduction of different diffusion coefficients of tumor and immune cells is considered along with normal cells of tissue without any diffusive movement. It is shown that the results of chemotherapy treatment are in agreement with Jeff’s phenomenon. The response of three different levels of immune system strength to the pulsed chemotherapy is investigated. It is observed that the treatment performs better if a chemotherapeutic drug is injected near the invasive fronts of the tumor.

4.
Finding predictive gene groups from microarray data   Total citations: 1, self-citations: 0, citations by others: 1
Microarray experiments generate large datasets with expression values for thousands of genes, but not more than a few dozen samples. A challenging task with these data is to reveal groups of genes which act together and whose collective expression is strongly associated with an outcome variable of interest. To find these groups, we suggest the use of supervised algorithms: these are procedures which use external information about the response variable for grouping the genes. We present Pelora, an algorithm based on penalized logistic regression analysis, that combines gene selection, gene grouping and sample classification in a supervised, simultaneous way. With an empirical study on six different microarray datasets, we show that Pelora identifies gene groups whose expression centroids have very good predictive potential and yield results that can keep up with state-of-the-art classification methods based on single genes. Thus, our gene groups can be beneficial in medical diagnostics and prognostics, but they may also provide more biological insights into gene function and regulation.

5.
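For early tumor diagnosis, a feature extraction method based on the lifting wavelet transform is proposed for analyzing and discriminating tumor data samples. The method applies the lifting wavelet transform to gene expression microarray data from 190 liver cancer cases (including controls) and 107 lung cancer cases (including controls), extracts the low-frequency information of the signal, and trains a support vector machine to build a classifier that distinguishes cancer from non-cancer samples. Experimental results show that feature genes extracted by the lifting wavelet transform yield a high classification rate when fed to the classifier, and that either a linear kernel or a radial basis function kernel gives good classification performance in the support vector machine. The model was tested on 20 randomly selected gene expression microarray samples with very good results, so the proposed method has some practical value for tumor diagnosis.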
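As a rough illustration of the lifting scheme named in the abstract above, the sketch below runs split, predict and update steps with the Haar wavelet and keeps only the low-frequency coefficients. The choice of Haar, the function names, the number of levels and the toy expression profile are all assumptions made for illustration; the abstract does not state which wavelet or how many levels were used, and the support vector machine stage is omitted.

```python
def haar_lifting_step(signal):
    """One level of the Haar lifting scheme on an even-length sequence.

    Returns (approx, detail): approx holds pairwise means (low frequency),
    detail holds pairwise differences (high frequency).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    even = signal[0::2]                                  # split
    odd = signal[1::2]
    detail = [o - e for e, o in zip(even, odd)]          # predict
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update
    return approx, detail


def low_frequency_features(signal, levels):
    """Apply `levels` lifting steps and return the final approximation."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_lifting_step(approx)
    return approx


# Hypothetical expression profile of one sample (8 genes).
profile = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
features = low_frequency_features(profile, levels=2)
```

In a pipeline like the one described, vectors such as `features` would be the inputs to the classifier; each lifting level halves the number of coefficients.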

6.
Recently, a Bayesian network model for inferring non-stationary regulatory processes from gene expression time series has been proposed. The Bayesian Gaussian Mixture (BGM) Bayesian network model divides the data into disjunct compartments (data subsets) by a free allocation model, and infers network structures, which are kept fixed for all compartments. Fixing the network structure allows for some information sharing among compartments, and each compartment is modelled separately and independently with the Gaussian BGe scoring metric for Bayesian networks. The BGM model can equally be applied to both static (steady-state) and dynamic (time series) gene expression data. However, it is this flexibility that renders its application to time series data suboptimal. To improve the performance of the BGM model on time series data we propose a revised approach in which the free allocation of data points is replaced by a changepoint process so as to take the temporal structure into account. The practical inference follows the Bayesian paradigm and approximately samples the network, the number of compartments and the changepoint locations from the posterior distribution with Markov chain Monte Carlo (MCMC). Our empirical results show that the proposed modification leads to a more efficient inference tool for analysing gene expression time series.

7.
8.
We discuss the theoretical structure and constructive methodology for large-scale graphical models, motivated by their potential in evaluating and aiding the exploration of patterns of association in gene expression data. The theoretical discussion covers basic ideas and connections between Gaussian graphical models, dependency networks and specific classes of directed acyclic graphs we refer to as compositional networks. We describe a constructive approach to generating interesting graphical models for very high-dimensional distributions that builds on the relationships between these various stylized graphical representations. Issues of consistency of models and priors across dimension are key. The resulting methods are of value in evaluating patterns of association in large-scale gene expression data with a view to generating biological insights about genes related to a known molecular pathway or set of specified genes. Some initial examples relate to the estrogen receptor pathway in breast cancer, and the Rb-E2F cell proliferation control pathway.

9.
In certain cancer chemoprevention experiments both the number of observed tumors per animal and their times to detection are used in subsequent statistical analyses. The mathematical models used to represent these experiments usually include the Poisson distribution to characterize the tumor multiplicity data. Very often, however, there is excess variance due to interanimal heterogeneity of tumor response. Thus, the number of induced tumors is better characterized by the negative binomial distribution. In this paper we modify an existing statistical technique, which explicitly acknowledges the confounding inherent in these systems, in order to provide a more efficient procedure for utilizing the information in a sample and to more accurately assess treatment effects.
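The excess variance mentioned in the abstract above can be seen in a small simulation. The sketch below, which is only an illustration and not the paper's procedure, draws tumor counts once from a plain Poisson model and once from a gamma-Poisson mixture (one standard way to obtain the negative binomial), where each animal has its own rate. The mean of 5 tumors, the gamma shape of 2, the sample size and all function names are assumptions chosen for the example.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_negative_binomial(mean, shape, rng):
    """Gamma-Poisson mixture: each animal gets its own Poisson rate."""
    rate = rng.gammavariate(shape, mean / shape)  # gamma mean = shape * scale = mean
    return sample_poisson(rate, rng)


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    v = sum((x - m) ** 2 for x in values) / (n - 1)
    return m, v


rng = random.Random(0)
mean_tumors, shape = 5.0, 2.0  # hypothetical values
poisson_counts = [sample_poisson(mean_tumors, rng) for _ in range(5000)]
nb_counts = [sample_negative_binomial(mean_tumors, shape, rng) for _ in range(5000)]

poisson_mean, poisson_var = mean_and_variance(poisson_counts)
nb_mean, nb_var = mean_and_variance(nb_counts)
```

Under these assumptions both samples have a mean near 5, but the theoretical variance is 5 for the Poisson model and 5 + 5²/2 = 17.5 for the mixture, which is the kind of overdispersion that interanimal heterogeneity produces.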

10.
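Classification of two-population gene expression data based on Bayesian statistical methods   Total citations: 1, self-citations: 0, citations by others: 1
In disease diagnosis, accurate classification of the disease is a crucial step for improving diagnostic accuracy and cure rates, and the advent of DNA chip technology lets us obtain, at the microscopic level, gene function information closely related to disease classification and diagnosis. However, gene expression pattern data obtained from DNA chips have many variables and few samples, which makes classification highly unstable. We therefore first select the genes whose expression patterns change significantly as a feature gene set to reduce the number of variables, and then build a classifier on this feature set to classify the samples. In this paper, feature genes are selected with a likelihood ratio test, a statistical classification model is built on Bayesian methods, and Markov chain Monte Carlo (MCMC) sampling is used to compute the posterior probability of class membership for each sample. Finally, we apply the model to two real DNA chip data sets and classify the samples successfully.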

11.
Sufficient dimension reduction (SDR) is a paradigm for reducing the dimension of the predictors without losing regression information. Most SDR methods require inverting the covariance matrix of the predictors. This hinders their use in the analysis of contemporary datasets where the number of predictors exceeds the available sample size and the predictors are highly correlated. To this end, by incorporating the seeded SDR idea and the sequential dimension-reduction framework, we propose an SDR method for high-dimensional data with correlated predictors. The performance of the proposed method is studied via extensive simulations. To demonstrate its use, an application to microarray gene expression data where the response is the production rate of riboflavin (vitamin B2) is presented.

12.
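Mass transfer in malignant tumors (I): the fluid dynamics part   Total citations: 3, self-citations: 0, citations by others: 3
This paper proposes a three-medium model for the mass transfer of fluid and drugs inside tumors. This part studies the roles of interstitial pressure and convection. Analytical solutions are obtained for an isolated tumor and for a tumor surrounded by normal tissue. The computed results agree with experiments: high interstitial pressure is the main obstacle to drug entry into the tumor. The parameters that lower the interstitial pressure are analyzed in detail.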

13.
The class of microarray games and the relevance index for genes   Total citations: 2, self-citations: 1, citations by others: 1
Nowadays, microarray technology is available to generate a huge amount of information on gene expression. This information must be statistically processed and analyzed, in particular, to identify those genes which are useful for the diagnosis and prognosis of specific diseases. We discuss the possibility of applying game-theoretical tools, like the Shapley value, to the analysis of gene expression data. Via a “truncation” technique, we build a coalitional game whose aim is to stress the relevance (“sufficiency”) of groups of genes for the specific disease we are interested in. The Shapley value of this game is used to select those genes which deserve further investigation. To justify the use of the Shapley value in this context, we axiomatically characterize it using properties with a genetic interpretation. The authors are grateful to two anonymous referees for their extremely helpful comments. An earlier version of this paper was presented at the VI Spanish Meeting on Game Theory and Practice, July 12–14, 2004, Elche, Spain. S. Moretti gratefully acknowledges the financial support of the EU project NewGeneris, European Union 6th FP (FOOD-CT-2005-016320).
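To make the Shapley-value ranking in the abstract above more concrete, the following sketch computes exact Shapley values for a tiny coalitional game by averaging marginal contributions over all gene orderings. It assumes, as one plausible reading of the "sufficiency" idea, that each sample is summarized by the set of genes flagged as abnormal after truncation (its support) and that a coalition's worth is the fraction of samples whose non-empty support lies inside the coalition. That definition, the gene names and the three toy samples are assumptions of this sketch, not details taken from the abstract.

```python
from fractions import Fraction
from itertools import permutations


def coalition_value(coalition, supports):
    """Fraction of samples whose non-empty support set lies inside the coalition."""
    hits = sum(1 for s in supports if s and s <= coalition)
    return Fraction(hits, len(supports))


def shapley_values(genes, supports):
    """Exact Shapley value: average marginal contribution over all orderings."""
    totals = {g: Fraction(0) for g in genes}
    orderings = list(permutations(genes))
    for order in orderings:
        coalition = frozenset()
        for gene in order:
            grown = coalition | {gene}
            totals[gene] += (coalition_value(grown, supports)
                             - coalition_value(coalition, supports))
            coalition = grown
    return {g: total / len(orderings) for g, total in totals.items()}


# Hypothetical data: three samples, each with the genes flagged in it.
genes = ["g1", "g2", "g3"]
supports = [frozenset({"g1"}), frozenset({"g1", "g2"}), frozenset({"g2", "g3"})]
relevance = shapley_values(genes, supports)
```

Enumerating all orderings grows factorially with the number of genes, so this brute-force form is only practical for toy inputs. For the game assumed here, each sample splits a weight of 1/n equally among the genes in its support, which gives the same values without enumeration.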

14.
The development of sensor networks has enabled detailed tracking of customer behavior in stores. Shopping path data which records each customer's position and time information is attracting attention as new marketing data. However, there are no proposed marketing models which can identify good customers from huge amounts of time series data on customer movement in the store. This research aims to use shopping path data resulting from tracking customer behavior in the store, using information on the sequence of visiting each product zone in the store and staying time at each product zone, to find how they affect purchasing. To discover useful knowledge for store management, shopping path data has been transformed into sequence data including information on visit sequence and staying times in the store, and LCMseq has been applied to them to extract frequent sequence patterns. In this paper, we find characteristic in-store behavior patterns of good customers by using actual data of a Japanese supermarket.

15.
A mathematical model of tumor cell population dynamics is considered. The tumor is assumed to consist of cells of two types: amenable and resistant to chemotherapeutic treatment. It is assumed that the growth of the cell populations of both types is governed by logistic equations. The effect of a chemotherapeutic drug on the tumor is specified by a therapy function. Two types of therapy functions are considered: a monotonically increasing function and a nonmonotone one with a threshold. In the former case, the effect of a drug on the tumor is stronger at a higher drug concentration. In the latter case, a threshold drug concentration exists above which the effect of the therapy reduces. The case when the total drug amount is subject to an integral constraint is also studied. A similar problem was previously studied in the case of a linear therapy function with no constraint imposed on the drug amount. By applying the Pontryagin maximum principle, necessary optimality conditions are found, which are used to draw important conclusions about the character of the optimal therapy strategy. The optimal control problem of minimizing the total number of tumor cells is solved numerically in the case of a monotone or threshold therapy function with allowance for the integral constraint on the drug amount.
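The two kinds of therapy function described in the abstract above can be compared with a small forward simulation. The sketch below is a minimal illustration under stated assumptions, not the paper's model: each population follows its own logistic equation, the drug is held at a constant concentration, only the sensitive cells are killed, the monotone therapy function is linear, and the threshold one is a tent shape that peaks at `u_star`. All parameter values and function names are hypothetical, and the optimal control problem itself is not solved here.

```python
def monotone_therapy(u, k=1.0):
    """Kill rate that keeps increasing with drug concentration u."""
    return k * u


def threshold_therapy(u, k=1.0, u_star=1.0):
    """Kill rate that rises up to u_star and declines above it (tent shape)."""
    return k * u if u <= u_star else max(0.0, k * (2.0 * u_star - u))


def simulate(therapy, dose, t_end=20.0, dt=0.01):
    """Explicit Euler integration of sensitive (x) and resistant (y) cells."""
    x, y = 0.5, 0.05               # assumed initial population sizes
    r_x, r_y, cap = 0.5, 0.3, 1.0  # assumed growth rates and carrying capacity
    kill = therapy(dose)           # constant dose, so constant kill rate
    for _ in range(int(round(t_end / dt))):
        dx = r_x * x * (1.0 - x / cap) - kill * x
        dy = r_y * y * (1.0 - y / cap)
        x = max(0.0, x + dt * dx)
        y = max(0.0, y + dt * dy)
    return x, y


untreated = simulate(monotone_therapy, dose=0.0)
high_dose_monotone = simulate(monotone_therapy, dose=1.5)
high_dose_threshold = simulate(threshold_therapy, dose=1.5)
```

With these assumed numbers, a dose above the threshold leaves more sensitive cells under the threshold function than under the monotone one, while the resistant population grows the same way in every run. That is the qualitative reason the two cases can call for different dosing strategies.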

16.
17.
The effective treatment of brain diseases, such as malignant brain tumours, is generally constricted by the controlled contribution of therapeutic agents. Novel brain tumour therapy proceeds from a direct infusion of the drug into the extra-vascular space of the nervous brain tissue (convection-enhanced delivery). This is carried out using a catheter to bypass the blood-brain barrier, which effectively separates brain tissue from the intra-vascular space and hence hampers drug delivery through the bloodstream. The dilation of the target tissue, as a response to the local pressure increase, initiates interstitial fluid flow and, thus, the distribution of the chemical agents. An adequate constitutive model of the complex tissue aggregate in the framework of the Theory of Porous Media is essential in order to assist modern clinical application via numerical simulations. The presented model consists of an elastically deformable solid skeleton, provided by the tissue cells, permeated by two viscous, materially incompressible pore-liquid phases, interstitial fluid and blood plasma. Both liquids are mobile within the solid skeleton and separated from each other. To simulate a drug infusion process in the extra-vascular space, the interstitial fluid is treated as a solution of a liquid solvent and a dissolved therapeutic solute. The constitutive assumptions for the involved constituents are adjusted in order to describe the physical behaviour of human brain tissue. The presented numerical examples illustrate the fundamental effects during an infusion process. For this purpose, the resulting set of coupled partial differential equations is spatially discretised using hexahedral mixed finite elements with an implicit (backward) Euler time integration scheme to solve the considered problem in a monolithic manner for the primary variables. (© 2010 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim)

18.
F. Fenneteau, J. Li, L. Couture, J. Turgeon, F. Nekka. PAMM, 2007, 7(1): 1121905-1121906
In order to improve understanding and prediction of drug disposition prior to in vivo experiments, we aimed to develop a PBPK model that accounts for the involvement of P-glycoprotein activity and expression in mouse brain, liver, kidney and heart tissues. Model parameters of P-gp activity and drug diffusion were mainly extrapolated from in vitro data. Model simulations, compared with tissue concentration of 3H-domperidone intravenously administered to WT and KO mice, suggest the involvement of additional membrane transporters in heart and brain tissues. The global sensitivity analysis showed that the variability of model predictions is related to the variability of the unbound fraction to plasma protein, whereas the uncertainty of the model predictions is associated with the uncertainty of the parameters related to P-gp genetic expression, and to the activity of additional transporters in heart and brain tissues. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

19.
Rounding methods are common techniques in many statistical offices to protect sensitive information when publishing data in tabular form. Classical versions of these methods do not consider protection levels while searching patterns with minimum information loss, and therefore typically the so-called auditing phase is required to check the protection of the proposed patterns. This paper presents a mathematical model for the whole problem of finding a protected pattern with minimum loss of information, and proposes a branch-and-cut algorithm to solve it. It also describes a new methodology closely related to the classical Controlled Rounding methods but with several advantages. The new methodology is named Cell Perturbation and leads to a different optimization problem which is simpler to solve than the previous problem. This paper presents a cutting-plane algorithm for finding an exact solution of the new problem, which is a pattern guaranteeing the same protection level requirements but with smaller loss of information when compared with the classical Controlled Rounding optimal patterns. The auditing phase is unnecessary on the solutions generated by the two algorithms. The paper concludes with computational results on real-world instances and discusses a modification in the objective function to guarantee statistical properties in the solutions. Received: April, 2004

20.
Clustering and classification are important tasks for the analysis of microarray gene expression data. Classification of tissue samples can be a valuable diagnostic tool for diseases such as cancer. Clustering samples or experiments may lead to the discovery of subclasses of diseases. Clustering genes can help identify groups of genes that respond similarly to a set of experimental conditions. We also need validation tools for clustering and classification. Here, we focus on the identification of outliers: units that may have been misallocated, or mislabeled, or are not representative of the classes or clusters. We present two new methods, DDclust and DDclass, for clustering and classification. These non-parametric methods are based on the intuitively simple concept of data depth. We apply the methods to several gene expression and simulated data sets. We also discuss a convenient visualization and validation tool, the relative data depth plot.
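The idea of data depth in the abstract above can be sketched in a few lines. The example below uses spatial depth (one minus the length of the average unit vector pointing from the data to the query point) as a stand-in, assigns a point to the class in which it lies deepest, and flags possible mislabeling through the difference between the depth in the labeled class and the depth in a competing class. The choice of depth function, this definition of relative depth, the function names and the toy two-dimensional classes are assumptions for illustration; the abstract does not specify how DDclust and DDclass define them.

```python
import math


def spatial_depth(x, points):
    """Spatial depth of x relative to points: 1 - |mean unit vector|."""
    dim = len(x)
    total = [0.0] * dim
    for p in points:
        diff = [xi - pi for xi, pi in zip(x, p)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # coincident points contribute the zero vector
            for i in range(dim):
                total[i] += diff[i] / norm
    mean_norm = math.sqrt(sum(t * t for t in total)) / len(points)
    return 1.0 - mean_norm


def classify_by_depth(x, classes):
    """Assign x to the class in which it is deepest."""
    return max(classes, key=lambda label: spatial_depth(x, classes[label]))


def relative_depth(x, own_points, other_points):
    """Depth in the labeled class minus depth in a competing class."""
    return spatial_depth(x, own_points) - spatial_depth(x, other_points)


# Hypothetical two-class data in two dimensions.
classes = {
    "A": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)],
}
label_center = classify_by_depth((0.5, 0.5), classes)
suspect = relative_depth((5.4, 5.6), classes["A"], classes["B"])
```

A point at the center of class A has depth 1 there and depth near 0 in class B. A sample labeled A but lying among the B points gets a negative relative depth, which is the kind of signal that would mark it as a candidate outlier or mislabeled unit.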
