Similar literature
 20 similar documents found
1.
Because driver pathways play a crucial role in the formation and progression of cancer, identifying them is essential and offers important information for precision or personalized medicine. In this paper, an improved maximum weight submatrix problem model is proposed that integrates three kinds of omics data: somatic mutations, copy number variations, and gene expression. The model adjusts coverage and mutual exclusivity using the average weight of the genes in a pathway and simultaneously considers the correlation among genes, so that pathways with high coverage but moderate mutual exclusivity can be identified. By introducing a short chromosome code and a greedy recombination operator, a parthenogenetic algorithm, PGA-MWS, is presented to solve the model. The algorithms GA, MOGA, iMCMC, and PGA-MWS were compared experimentally on biological and simulated data sets. The results show that, compared with the other three algorithms, PGA-MWS based on the improved model identifies gene sets with high coverage but moderate mutual exclusivity and scales well. Many of the identified gene sets are involved in known signaling pathways, and most of the implicated genes are oncogenes or tumor suppressors previously reported in the literature. The experimental results indicate that the proposed approach may become a useful complementary tool for detecting cancer pathways.
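
The trade-off between coverage and mutual exclusivity described in this abstract can be illustrated with the classic (non-improved) maximum weight submatrix score, W(M) = 2|Γ(M)| − Σ_g |Γ(g)|, where Γ(g) is the set of patients mutated in gene g and Γ(M) is the union over the gene set M. This is only a sketch of that baseline weight; the improved model in the abstract also uses average gene weights and gene correlations, which are not reproduced here. The function name and the toy mutation data are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set


def coverage_exclusivity_weight(mutations: Dict[str, Set[str]],
                                gene_set: Iterable[str]) -> int:
    """Baseline maximum weight submatrix score for a gene set.

    mutations maps a gene name to the set of patients mutated in that gene.
    The score rewards coverage (patients hit by at least one gene) and
    penalizes overlap (patients hit by more than one gene).
    """
    covered: Set[str] = set()
    total_mutations = 0
    for gene in gene_set:
        patients = mutations.get(gene, set())
        covered |= patients               # union: coverage of the gene set
        total_mutations += len(patients)  # counts overlapping patients repeatedly
    return 2 * len(covered) - total_mutations


# Toy data (hypothetical patients, chosen only to show the arithmetic).
toy_mutations = {
    "TP53": {"p1", "p2", "p3"},
    "KRAS": {"p4", "p5"},
    "EGFR": {"p3", "p6"},
}

exclusive_pair = coverage_exclusivity_weight(toy_mutations, ["TP53", "KRAS"])
overlapping_pair = coverage_exclusivity_weight(toy_mutations, ["TP53", "EGFR"])
```

In this toy input, the fully exclusive pair scores higher than the pair that shares patient p3, even though both pairs contain five mutations in total.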

2.
A large body of studies has shown that the occurrence of cancer is related to pathway dysfunction. Identifying cancer-related pathways could help researchers better understand the mechanisms of complex diseases. However, most current signaling pathway analysis methods take no account of variations in gene interactions within pathways. Furthermore, some pathways are connected with two or more cancer types, while others are likely to be cancer-type specific. Identifying cancer-type-specific pathways helps to interpret the different mechanisms of different cancer types. In this study, we first proposed a pathway analysis method named Pathway Analysis of Intergenic Regulation (PAIGR) to identify pathways with dysregulation between genes, and compared its performance with four existing methods on four colorectal cancer (CRC) datasets. The results showed that PAIGR could find cancer-related pathways more accurately. Moreover, to explore the relationship between the identified pathways and the cancer type, we constructed a pathway interaction network in which nodes represent pathways and edges represent interactions between pathways. Highly connected pathways were considered to play a central role in an extensive range of biological processes, while sparsely connected pathways were considered to have a certain specificity. Our results showed that the pathways identified by PAIGR had a low nodal degree (i.e., few interactions), which suggested that most of these pathways were cancer-type specific.
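
The nodal degree used in this abstract to separate central from specific pathways can be computed directly from an edge list. The sketch below is a minimal illustration, not the PAIGR implementation; the function name and the pathway labels are placeholders invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def node_degrees(edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Return the number of distinct neighbors of each node in an undirected network."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbors.items()}


# Hypothetical pathway interaction network (labels are placeholders).
toy_edges = [
    ("pathway_A", "pathway_B"),
    ("pathway_A", "pathway_C"),
    ("pathway_A", "pathway_D"),
    ("pathway_B", "pathway_C"),
    ("pathway_D", "pathway_E"),
]

degrees = node_degrees(toy_edges)
# Nodes with the lowest degree are candidates for "specific" pathways.
least_connected = min(degrees, key=degrees.get)
```

Under the reading given in the abstract, `pathway_A` would count as a hub and `pathway_E` as a sparsely connected, potentially cancer-type-specific pathway.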

3.
The molecular mechanism underlying the development of prostate cancer (PCA) is not well defined. We set out to determine the changes in gene expression in PCA tissues and to compare them with those in non-cancerous samples. Prostate tissue samples were collected by needle biopsy from 21 PCA patients and 10 benign prostatic hyperplasia (BPH) patients. Total RNA was isolated, cDNA was synthesized, and gene expression levels were determined by microarray analysis. In the progression to PCA, 738 up-regulated and 515 down-regulated genes were detected in the samples. Analysis using Ingenuity Pathway Analysis (IPA) software revealed that 466 network-eligible and 423 functions-pathways-eligible genes were up-regulated, and 363 network-eligible and 342 functions-pathways-eligible genes were down-regulated. Up-regulated networks were identified around the IL-1beta and insulin-like growth factor-1 (IGF-1) genes. The NFKB gene was central to two up- and down-regulated networks. Up-regulated canonical pathways were assigned, and four of them were evaluated in detail: the acute phase response, hepatic fibrosis, actin cytoskeleton, and coagulation pathways. Axonal guidance signaling was the most significant down-regulated canonical pathway. Our data provide not only gene networks for understanding the biological properties of PCA but also useful pathway maps for future understanding of the disease and the identification of new therapeutic targets.

4.
We analyze publicly available data on Affymetrix microarray spike-in experiments on the human HGU133 chipset in which sequences are added in solution at known concentrations. The spike-in set contains sequences of bacterial, human, and artificial origin. Our analysis is based on a recently introduced molecular-based model (Carlon, E.; Heim, T. Physica A 2006, 362, 433) that takes into account both probe-target hybridization and target-target partial hybridization in solution. The hybridization free energies are obtained from the nearest-neighbor model with experimentally determined parameters. The molecular-based model suggests a rescaling that should result in a "collapse" of the data at different concentrations into a single universal curve. We indeed find such a collapse, with the same parameters as obtained previously for the older HGU95 chip set. The quality of the collapse varies according to the probe set considered. Artificial sequences, chosen by Affymetrix to be as different as possible from any other human genome sequence, generally show a much better collapse and thus a better agreement with the model than all other sequences. This suggests that the observed deviations from the predicted collapse are related to the choice of probes or have a biological origin rather than being a problem with the proposed model.  相似文献   

5.
Motivation: Microarrays have allowed the expression level of thousands of genes or proteins to be measured simultaneously. Data sets generated by these arrays consist of a small number of observations (e.g., 20-100 samples) on a very large number of variables (e.g., 10,000 genes or proteins). The observations in these data sets often have other attributes associated with them such as a class label denoting the pathology of the subject. Finding the genes or proteins that are correlated to these attributes is often a difficult task since most of the variables do not contain information about the pathology and as such can mask the identity of the relevant features. We describe a genetic algorithm (GA) that employs both supervised and unsupervised learning to mine gene expression and proteomic data. The pattern recognition GA selects features that increase clustering, while simultaneously searching for features that optimize the separation of the classes in a plot of the two or three largest principal components of the data. Because the largest principal components capture the bulk of the variance in the data, the features chosen by the GA contain information primarily about differences between classes in the data set. The principal component analysis routine embedded in the fitness function of the GA acts as an information filter, significantly reducing the size of the search space since it restricts the search to feature sets whose principal component plots show clustering on the basis of class. The algorithm integrates aspects of artificial intelligence and evolutionary computations to yield a smart one pass procedure for feature selection, clustering, classification, and prediction.  相似文献   

6.
Many microarray experiments involve examining the time elapsed prior to the occurrence of a specific event. One purpose of these studies is to relate the gene expressions to the survival times. The Cox proportional hazards model has been the major tool for analyzing such data. The transformation model provides a viable alternative to the classical Cox's model. We investigate the use of transformation models in microarray survival data in this paper. The transformation model, which can be viewed as a generalization of proportional hazards model and the proportional odds model, is more robust than the proportional hazards model, because it is not susceptible to erroneous results for cases when the assumption of proportional hazards is violated. We analyze a gene expression dataset from Beer et al. [Beer, D.G., Kardia, S.L., Huang, C.C., Giordano, T.J., Levin, A.M., Misek, D.E., Lin, L., Chen, G., Gharib, T.G., Thomas, D.G., Lizyness, M.L., Kuick, R., Hayasaka, S., Taylor, J.M., Iannettoni, M.D., Orringer, M.B., Hanash, S., 2002. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. 8 (8), 816-824] and show that the transformation model provides higher prediction precision than the proportional hazards model.  相似文献   
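
The statement that the transformation model generalizes both the proportional hazards and proportional odds models can be made explicit with the commonly used linear transformation form. The notation below is an assumption, since the abstract gives none: $T$ is the survival time, $Z$ the vector of gene expression covariates, $H$ an unspecified increasing function, and $\varepsilon$ an error term with a known distribution.

```latex
% Linear transformation model (notation assumed for illustration)
H(T) = -\beta^{\top} Z + \varepsilon

% Case 1: \varepsilon has the extreme value distribution,
% P(\varepsilon \le s) = 1 - \exp(-e^{s}). Then
P(T > t \mid Z) = \exp\!\left(-e^{H(t)}\, e^{\beta^{\top} Z}\right),
% i.e. the proportional hazards model with baseline cumulative hazard e^{H(t)}.

% Case 2: \varepsilon has the standard logistic distribution,
% P(\varepsilon \le s) = e^{s} / (1 + e^{s}). Then
\frac{P(T \le t \mid Z)}{P(T > t \mid Z)} = e^{H(t)}\, e^{\beta^{\top} Z},
% i.e. the proportional odds model with baseline odds e^{H(t)}.
```

Choosing a different error distribution therefore moves between the two classical models without changing the linear predictor.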

7.
A DNA microarray can track the expression levels of thousands of genes simultaneously. Previous research has demonstrated that this technology can be useful in the classification of cancers. Cancer microarray data normally contain a small number of samples, each with a large number of gene expression levels as features. Selecting the relevant genes involved in different types of cancer remains a challenge. To extract useful gene information from cancer microarray data and reduce dimensionality, feature selection algorithms were systematically investigated in this study. Using a correlation-based feature selector combined with machine learning algorithms such as decision trees, naive Bayes, and support vector machines, we show that classification performance at least as good as published results can be obtained on acute leukemia and diffuse large B-cell lymphoma microarray data sets. We also demonstrate that the combined use of different classification and feature selection approaches makes it possible to select relevant genes with high confidence. This is also the first paper to discuss both computational and biological evidence for the involvement of zyxin in leukaemogenesis.
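
The abstract does not say which correlation-based feature selector was used. Assuming it refers to the widely used CFS heuristic, a gene subset is scored by a merit that grows with the mean feature-class correlation and shrinks with the mean feature-feature correlation. The sketch below shows only that scoring formula; the function name and the input values are illustrative assumptions.

```python
from math import sqrt


def cfs_merit(k: int, mean_feature_class_corr: float,
              mean_feature_feature_corr: float) -> float:
    """CFS merit of a subset of k features.

    merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff)
    where r_cf is the mean feature-class correlation and
    r_ff is the mean pairwise feature-feature correlation.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    denominator = sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return k * mean_feature_class_corr / denominator


# Same relevance to the class, different redundancy among the genes
# (values are made up for illustration).
low_redundancy = cfs_merit(3, 0.6, 0.2)
high_redundancy = cfs_merit(3, 0.6, 0.8)
```

With equal relevance, the subset of less redundant genes receives the higher merit, which is why such a selector tends to keep small, complementary gene sets.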

8.
This paper proposes a novel approach for the estimation of spectroscopic data by combining the predictions of an ensemble of estimators using the induced ordered weighted averaging (IOWA) fusion operators. For ensemble generation, we use Gaussian process regression (GPR) and extreme learning machine (ELM) estimators associated with different kernels. To render the model selection issue of ELM as efficiently as in the GPR Bayesian estimation method, we develop an automatic solution based on the powerful differential evolution (DE) algorithm. During the fusion process, the IOWA operator needs two things: (1) an order‐inducing value; and (2) a way to determine its weights. For the order‐inducing value, we propose to use the residual of each estimated output value. Because we cannot compute the true residual, we explore the idea of estimating the residuals themselves by associating to each estimator of the ensemble a second estimator of the same kind called a residual estimator. To learn the weights associated with these nonlinear operators, the proposed method relies on the concept of prioritized aggregation, where we generate the weights directly from the estimated residuals. Experimental results obtained on three real spectroscopic datasets confirm the interesting capabilities of the proposed IOWA fusion method. Copyright © 2013 John Wiley & Sons, Ltd.  相似文献   
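
The two ingredients of the IOWA operator named in this abstract, an order-inducing value and a weight vector, can be sketched as follows. This is a generic IOWA aggregation under stated assumptions, not the paper's method: the order-inducing value is taken to be the negated estimated residual (so the estimator with the smallest estimated residual comes first), and the weights are hand-picked rather than derived by prioritized aggregation. All names and numbers are illustrative.

```python
from typing import List, Sequence, Tuple


def iowa(pairs: Sequence[Tuple[float, float]], weights: Sequence[float]) -> float:
    """Induced ordered weighted average.

    pairs holds (order_inducing_value, argument) tuples. The arguments are
    reordered by decreasing order-inducing value, then combined with weights.
    """
    if len(pairs) != len(weights):
        raise ValueError("need exactly one weight per argument")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(pairs, key=lambda pair: pair[0], reverse=True)
    return sum(w * argument for w, (_, argument) in zip(weights, ordered))


# Three hypothetical estimators predicting the same quantity.
predictions: List[float] = [10.0, 12.0, 11.0]
estimated_residuals: List[float] = [0.5, 2.0, 1.0]

# Assumption: smaller estimated residual means higher priority.
pairs = [(-r, p) for r, p in zip(estimated_residuals, predictions)]
fused = iowa(pairs, weights=[0.5, 0.3, 0.2])
```

Here the prediction with the smallest estimated residual (10.0) receives the largest weight, so the fused value is pulled toward it.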

9.
The classification of cancer is a major research topic in bioinformatics. The high dimensionality and small sample size of gene expression data, however, make the classification quite challenging. Although principal component analysis (PCA) is of particular interest for high-dimensional data, it may overemphasize some aspects and ignore other important information contained in the richly complex data, because it displays only the difference in the first two- or three-dimensional PC subsp...

10.
Gene expression data sets hold the promise to provide cancer diagnosis on the molecular level. However, using all the gene profiles for diagnosis may be suboptimal. Detection of the molecular signatures not only reduces the number of genes needed for discrimination purposes, but may elucidate the roles they play in the biological processes. Therefore, a central part of diagnosis is to detect a small set of tumor biomarkers which can be used for accurate multiclass cancer classification. This task calls for effective multiclass classifiers with built-in biomarker selection mechanism. We propose the sparse optimal scoring (SOS) method for multiclass cancer characterization. SOS is a simple prototype classifier based on linear discriminant analysis, in which predictive biomarkers can be automatically determined together with accurate classification. Thus, SOS differentiates itself from many other commonly used classifiers, where gene preselection must be applied before classification. We obtain satisfactory performance while applying SOS to several public data sets.  相似文献   

11.
Mining patterns of genes co-expressed across a subset of conditions helps to narrow the search space for the analysis of gene expression data. Identifying condition-specific key genes from large-scale gene expression data is a challenging task. A condition-specific key gene signifies the functional behavior of a group of co-expressed genes across a subset of conditions and can act as a biomarker of disease. In this paper, we propose a novel approach for identifying condition-specific key genes in Basal-Like Breast Cancer (BLBC) using a biclustering algorithm and a Gene Co-expression Network (GCN). The proposed approach has two stages. In the first stage, significant biclusters are extracted with the ‘runibic’ biclustering algorithm. The second stage identifies condition-specific key genes from the extracted significant biclusters with the help of the GCN. Using a difference matrix and a gene correlation matrix, we constructed a biologically meaningful and statistically strong GCN. We also present the proposed approach with a process diagram and demonstrate the procedure using bicluster number 93 (Bic93) as an example. In the experimental results, 95% and 85% of the extracted biclusters were found to be biologically significant at p-values less than 0.05 and 0.01, respectively. We compared the proposed approach with an approach based on Weighted Gene Co-expression Network Analysis (WGCNA). In this comparison, our approach performed effectively and extracted biologically significant biclusters. It also identified condition-specific key genes that cannot be extracted using the WGCNA-based approach. Some of the important known key genes identified are PIK3CA, SHC3, ERBB2, SHC4, PTOV1, STAG1, and ZNF215. After rigorous analysis, these key genes can be used as diagnostic and prognostic biomarkers for BLBC.
The identified condition-specific key genes can help to reduce analysis time and increase the accuracy of further research such as biomarker identification and drug target discovery.
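
The second stage described in this abstract builds a co-expression network from a gene correlation matrix. The abstract does not give the construction details (in particular how the difference matrix is used), so the sketch below shows only a generic step under stated assumptions: Pearson correlation between the expression profiles of genes within one bicluster, with an edge kept when the absolute correlation reaches a threshold. The threshold, gene labels, and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sd_x * sd_y)


def coexpression_edges(expression: Dict[str, List[float]],
                       threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """Keep gene pairs whose absolute correlation reaches the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression values of three genes over four conditions.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.2, 7.9],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}

edges = coexpression_edges(toy_expression)
```

In such a network, a gene with many retained edges would be a natural candidate key gene, although the ranking criterion used in the paper is not stated in the abstract.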

12.
13.
Bacterial contamination of indoor air is a serious threat to human health. Pathogenic germs can be transferred from the liquid to the aerosol phase, for instance, when water is sprayed in the air, such as in shower rooms, air conditioners, or fountains. Existing analytical methods for biological indoor air-quality assessment and contamination monitoring are mostly time consuming as they generally require a cultivation step. The need for a rapid, sensitive, and selective detection method for bioaerosols is evident. Our approach is based on the combination of a commercial wet particle sampler (Coriolis μ, Bertin Technologies, France) and a label-free microarray readout based on surface-enhanced Raman scattering (SERS) for detection, which was established in our laboratories. Heat-inactivated Escherichia coli bacteria were used as test microorganisms. An E. coli suspension was sprayed into the chamber by a jet air nebulizer. The resulting bioaerosol was dried, neutralized, and then collected by a Coriolis μ sampler. The bacteria collected were detected by a recently developed microarray readout system, based on label-free SERS detection. A special data evaluation procedure was applied in order to fully exploit the selectivity of the detection scheme, resulting in a detection limit of 144 particles per cubic centimeter.  相似文献   

14.
15.
Cancer is in general not the result of an abnormality in a single gene but a consequence of changes in many genes; it is therefore of great importance to understand the roles of different oncogenic and tumor suppressor pathways in tumorigenesis. In recent years, many computational models have been developed to study the genetic alterations of different pathways in the evolutionary process of cancer. However, most of these methods are knowledge-based enrichment analyses and are too inflexible to analyze user-defined pathways or gene sets. In this paper, we develop a nonparametric, data-driven approach to testing for dynamic changes of pathways over the course of cancer progression. Our method is based on an expansion and refinement of the pathway being studied, followed by a graph-based multivariate test, which is very easy to implement in practice. The new test is applied to the rich Cancer Genome Atlas data to study the (epi)genetic alterations of 186 KEGG pathways in the development of serous ovarian cancer. To make use of the comprehensive data, we incorporate three data types in the analysis, representing gene expression level, copy number, and DNA methylation level. Our analysis suggests a list of nine pathways that are closely associated with serous ovarian cancer progression, including the cell cycle, ERBB, JAK-STAT signaling, and p53 signaling pathways. By pairwise tests, we found that most of the identified pathways contribute only to a particular transition step. For instance, the cell cycle and ERBB pathways play key roles in the early-stage transition, while the ECM receptor and apoptosis pathways contribute to the progression from stage III to stage IV. The proposed computational pipeline is powerful in detecting important pathways and gene sets that drive cancers at certain stage(s). It offers new insights into the molecular mechanisms of cancer initiation and progression.

16.
As rationally designable materials, the variety and number of synthesised metal–organic cages (MOCs) and organic cages (OCs) are expected to grow in the Cambridge Structural Database (CSD). In this regard, two of the most important questions are, which structures are already present in the CSD and how can they be identified? Here, we present a cage mining methodology based on topological data analysis and a combination of supervised and unsupervised learning that led to the derivation of – to the best of our knowledge – the first and only MOC dataset of 1839 structures and the largest experimental OC dataset of 7736 cages, as of March 2022. We illustrate the use of such datasets with a high-throughput screening of MOCs and OCs for xenon/krypton separation, important gases in multiple industries, including healthcare.

We mined the Cambridge Structural Database for porous cages using topological data analysis, which resulted in the first and only dataset of metal-organic cages and the largest dataset of organic cages.  相似文献   

17.
Enriching the surface density of immobilized capture antibodies enhances the detection signal of antibody sandwich microarrays. In this study, we improved the detection sensitivity of our previously developed P-Si (porous silicon) antibody microarray by optimizing concentrations of the capturing antibody. We investigated immunoassays using a P-Si microarray at three different capture antibody (PSA – prostate specific antigen) concentrations, analyzing the influence of the antibody density on the assay detection sensitivity. The LOD (limit of detection) for PSA was 2.5 ng mL−1, 80 pg mL−1, and 800 fg mL−1 when arraying the PSA antibody, H117 at the concentration 15 μg mL−1, 35 μg mL−1, and 154 μg mL−1, respectively. We further investigated PSA spiked into human female serum in the range of 800 fg mL−1 to 500 ng mL−1. The microarray showed a LOD of 800 fg mL−1 and a dynamic range of 800 fg mL−1 to 80 ng mL−1 in serum spiked samples.  相似文献   

18.
Genetic variability has received increasing attention in the diagnosis and treatment of tumors. Herein, we describe a multiple genotyping method based on magnetic enrichment multiplex PCR (MEM-PCR) and microarray technology. Monodisperse magnetic beads were fabricated and modified with streptavidin. Four loci on two genes (the M235T and A-6G loci on the AGT gene, and the A1298C and C677T loci on the MTHFR gene) were selected to study single nucleotide polymorphisms (SNPs). Target sequences of these SNP loci were amplified using Cy3-labeled primers through multiplex PCR in one tube after the templates were enriched and purified by functional magnetic beads (MB). Four pairs of NH2-labeled probes, corresponding to each locus, were fixed on a CHO-modified glass slide by covalent binding. Hybridization between target sequences and probes was performed under suitable conditions. The spotting locations on the microarray and the ratio of fluorescence intensity produced by different loci were used to distinguish the SNP genotypes. Finally, three gastric cancer samples were collected, and genotyping analysis for these four SNP loci was successfully carried out simultaneously by this method.

19.
Cancer cells couple heightened lipogenesis with lipolysis to produce fatty acid networks that support malignancy. Monoacylglycerol lipase (MAGL) plays a principal role in this process by converting monoglycerides, including the endocannabinoid 2-arachidonoylglycerol (2-AG), to free fatty acids. Here, we show that MAGL is elevated in androgen-independent versus androgen-dependent human prostate cancer cell lines, and that pharmacological or RNA-interference disruption of this enzyme impairs prostate cancer aggressiveness. These effects were partially reversed by treatment with fatty acids or a cannabinoid receptor-1 (CB1) antagonist, and fully reversed by cotreatment with both agents. We further show that MAGL is part of a gene signature correlated with epithelial-to-mesenchymal transition and the stem-like properties of cancer cells, supporting a role for this enzyme in protumorigenic metabolism that, for prostate cancer, involves the dual control of endocannabinoid and fatty acid pathways.  相似文献   

20.
Wu Z, Luo J, Ge Q, Zhang D, Wang Y, Jia C, Lu Z. Analytica Chimica Acta, 2007, 603(2): 199-204.
Aberrant DNA methylation of CpG sites in gene promoter regions has been confirmed to be closely associated with carcinogenesis. In the present study, a new method based on allele-specific extension on a microarray was developed for detecting changes in DNA methylation in cancer. The target gene regions were amplified from bisulfite-treated genomic DNA (gDNA) with modified primers and treated with exonuclease to generate single-stranded targets. Allele-specific extension of the immobilized primers took place along a stretch of target sequence in the presence of DNA polymerase and Cy5-labeled dGTP. To control false positive signals, the hybridization conditions, DNA polymerase, extension time, and primer design were optimized. Two breast tumor-related genes (P16 and E-cadherin) were successfully analyzed with this method, and all the results were consistent with those of traditional methylation-specific PCR. The experimental results demonstrated that this DNA microarray-based method could be applied as a high-throughput tool for analyzing the methylation status of cancer-related genes, which could be widely used in cancer diagnosis or the detection of recurrence.
