Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Microarrays have been widely used to identify differentially expressed genes. A related problem is to estimate the proportion of differentially expressed genes. For some complex diseases, the number of differentially expressed genes may be relatively small, and these genes may show only subtle differential expression. For such microarray data, it is generally difficult to estimate the proportion of differentially expressed genes efficiently. In this study, I propose a likelihood-based method coupled with an expectation-maximization (EM) algorithm for estimating the proportion of differentially expressed genes. The proposed method performs favorably if either (i) the P values of differentially expressed genes are homogeneously distributed or (ii) the proportion of differentially expressed genes is relatively small. In both situations, simulations showed that the proposed method performed satisfactorily compared with other existing methods. As applications, these methods were applied to two microarray gene expression data sets generated from different platforms.
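As a rough illustration of how an EM algorithm can estimate the proportion of differentially expressed genes from P values, here is a minimal sketch. It is not the method of the abstract, whose likelihood is not given here. It assumes a two-component mixture in which null P values are Uniform(0, 1) and alternative P values follow a Beta(a, 1) density; the function name, the starting values and the iteration count are all hypothetical.

```python
import math


def estimate_de_proportion(p_values, n_iter=300):
    """EM for an assumed mixture: Uniform(0,1) nulls + Beta(a, 1) alternatives.

    Returns (pi, a): pi is the estimated proportion of differentially
    expressed genes, a is the shape of the assumed alternative density.
    """
    # Clamp to avoid log(0) for extremely small P values.
    ps = [min(max(p, 1e-300), 1.0) for p in p_values]
    pi, a = 0.5, 0.5  # arbitrary starting values (assumption)
    for _ in range(n_iter):
        # E-step: posterior probability that each gene is differentially expressed.
        weights = []
        for p in ps:
            alt = pi * a * p ** (a - 1.0)
            weights.append(alt / (alt + (1.0 - pi)))
        # M-step: closed-form updates for the mixing proportion and the Beta shape.
        total = sum(weights)
        pi = total / len(ps)
        a = -total / sum(w * math.log(p) for w, p in zip(weights, ps))
    return pi, a
```

The uniform null component reflects the standard property that P values are uniformly distributed under the null hypothesis. The Beta(a, 1) alternative is chosen only because it gives closed-form M-step updates.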

2.
With the rapid development of DNA microarray technology and next-generation technology, a large amount of genomic data has been generated, and extracting more differentially expressed genes from these data has become an urgent problem. Because Low-Rank Representation (LRR) performs well in studying low-dimensional subspace structures, it has attracted considerable attention in recent years. However, it does not take into consideration the intrinsic geometric structures in data. In this paper, a new method named Laplacian regularized Low-Rank Representation (LLRR), which introduces graph regularization into LRR, is proposed and applied to genomic data. By taking full advantage of the graph regularization, the LLRR method can capture the intrinsic non-linear geometric information in the data. The LLRR method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix by solving an optimization problem. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as sparse perturbation signals. Therefore, the differentially expressed genes can be selected according to the sparse matrix. Finally, we use the GO tool to analyze the selected genes and compare the P-values with those of other methods. The results on simulated data and two real genomic data sets illustrate that this method outperforms some other methods in differentially expressed gene selection.

3.
4.
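A feature-variable selection method based on disjoint principal component analysis (Disjoint PCA) and a genetic algorithm (GA) was developed and used to identify differentially expressed genes from gene expression profile data. In this method, Disjoint PCA evaluates how well a group of genes discriminates between two classes of samples, the GA searches for the gene group with the strongest discriminating power, and the chance correlation of the identified genes is assessed statistically. Because the method accounts for cooperative effects among genes, which is closer to the biological processes in which genes take part, the identified genes show better differential expression. The method was applied to microarray data from hepatocellular carcinoma (HCC) samples. The results show that the identified genes have strong discriminating power and that the method outperforms the commonly used significance analysis of microarrays (SAM) method.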

5.
With the proliferation of related microarray studies by independent groups, a natural approach to analysis would be to combine the results across studies. In this article, we address a meta-analysis of the gene expression data on imatinib resistance in chronic myelogenous leukemia (CML). First, an analysis of the overlap among 6 published studies revealed that only 3 genes coincided between 2 studies. A later reprocessing using different methods on 4 publicly available datasets revealed that 2 additional genes overlapped between two sets. Both poor overlaps may be due to large differences in the sample source and the microarray platforms used, and to the small difference in gene expression between imatinib non-responder and responder patients. A search for common genes across the 4 public datasets yielded 404 well-defined genes. Nevertheless, this necessary condition for meta-analysis caused the loss of many genes of possible interest. The expression signals of the common genes in the four datasets were reanalyzed using three summary statistical methods for combining quantitative information: Fisher, Stouffer and effect-size. Taking the three methods together and using an FDR < 0.10 threshold, a gene list with 33 differentially expressed genes was found. Considering all the reanalysis approaches used in this work, a final gene list with 38 differentially expressed genes is reported. Despite the important limitations of this microarray meta-analysis, the presented procedures and integrated gene list may have some potential value as regards imatinib resistance in CML patients, since it is the first attempt to integrate evidence about gene lists in this area.
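Two of the combination methods named above, Fisher and Stouffer, can be sketched in a few lines. This is an illustrative sketch, not the authors' pipeline: it assumes independent studies and one-sided P values for a single gene, the function names are hypothetical, and the effect-size method is not shown.

```python
import math
from statistics import NormalDist


def fisher_combined(p_values):
    """Fisher's method: X = -2 * sum(ln p_i) is chi-square with 2k df under H0."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # X / 2
    # The chi-square survival function has a closed form for even df = 2k.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


def stouffer_combined(p_values, weights=None):
    """Stouffer's method on one-sided P values, with optional study weights."""
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(p_values)
    z = sum(w * nd.inv_cdf(1.0 - p) for w, p in zip(weights, p_values))
    z /= math.sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(z)
```

Fisher's statistic is dominated by the smallest P values, whereas Stouffer's averages the evidence across studies and allows weighting, for example by sample size. This difference is one reason a meta-analysis might report both.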

6.
7.
It has been shown that generalized F-statistics can perform satisfactorily in identifying differentially expressed genes with microarray data. However, for some complex diseases, a high proportion of false positives may still be identified because of the modest differential expression of disease-related genes and the systematic noise of microarrays. The main purpose of this study is to develop statistical methods for Affymetrix microarray gene expression data so that the impact of non-expressed genes on false positives can be reduced. I propose two novel generalized F-statistics for identifying differentially expressed genes and a novel approach for estimating adjusting factors. The proposed statistical methods systematically combine filtering of non-expressed genes with identification of differentially expressed genes. For comparison, the discussed statistical methods were applied to an experimental data set from a type 2 diabetes study. In both two- and three-sample analyses, the proposed statistics improved the control of false positives.

8.
The use of DNA microarrays for gene expression analysis has become a common technique in many research laboratories and in industry. Several target-labeling techniques have been devised to reduce the amount of RNA required for microarray experiments. To facilitate comparison and sharing of microarray data across laboratories, it is crucial to determine the relative effects of these different sample-labeling techniques on the final results obtained from these experiments. We have compared two labeling methods designed for small RNA samples, an enzyme-based tyramide method (TSA) and a nucleic acid-based dendrimer method, to a more typical direct-labeling method that requires larger amounts of RNA. We observed comparable levels of reproducibility between replicate spots with all the techniques. The dendrimer method gave the smallest number of spots (0.08%) that showed differential labeling due to a bias in the dyes used, but it gave the highest background, with only 71.4% of the spots measurable (above background) as compared to 93.3% for the TSA technique and 79.7% for the direct-labeling method. The results from differential labeling experiments showed that the dendrimer method performed better than the TSA method in detecting the same set of differentially expressed genes as observed with the direct method. Overall, our results show that the dendrimer method performs better than the TSA method. Differential labeling experiments using the TSA method show a non-linearity in the data at high intensities, leading to skewing of a portion of the data.

9.
Introduction: LTF has been reported to have a radiation-resistance effect, and its expression in nasopharyngeal carcinoma (NPC) is significantly down-regulated. However, the mechanism by which down-regulated LTF affects radiotherapy sensitivity has remained elusive. Methods: We re-analyzed the microarray data sets GSE36972 and GSE48503 to find differentially expressed genes (DEGs) in the NPC cell line 5-8F transfected with LTF or a vector control, and the DEGs between radio-resistant and radio-sensitive NPC cell lines. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses and a protein-protein interaction (PPI) network analysis of the DEGs were performed to obtain the node genes. Because miR-214 can directly target LTF, its target genes were also predicted to complement the mechanism associated with radiotherapy resistance. Results: This study identified 1190 and 1279 DEGs in the two comparisons, respectively. GO and KEGG analyses showed that the apoptotic process, proliferation and the PI3K-Akt signaling pathway were significantly enriched. Four nodes (DUSP1, PPARGC1A, FOS and SMARCA1) associated with LTF were screened, and 42 target genes of miR-214 were linked to radiotherapy sensitivity. Conclusions: The present study demonstrates a possible molecular mechanism in which down-regulated LTF enhances the radiosensitivity of NPC cells through interaction with DUSP1, PPARGC1A, FOS and SMARCA1, and in which miR-214, as its upstream negative regulator, may play a role in regulating the radiotherapy effect.

10.
11.
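Screening and cluster analysis of differentially expressed genes during rat BMSC differentiation induced by Shuanglong Formula components   Total citations: 2 (self-citations: 1, citations by others: 1)
Gene chips were used to screen for genes differentially expressed during the cardiomyocyte-like differentiation of rat bone marrow mesenchymal stem cells (BMSCs) induced by the active components of Shuanglong Formula (total ginsenosides and total salvianolic acids), and cluster analysis was performed on these genes, in order to study the effect of the Shuanglong Formula components on rat BMSC differentiation at the gene level. Rat BMSCs were cultured in groups, and cell samples were collected at 10, 20, 30 and 40 d. tRNA was extracted and analyzed on gene chips to screen for the genes differentially expressed as the BMSCs changed, followed by bioinformatic analysis. The samples were also subjected to hierarchical clustering based on the differentially expressed genes. During BMSC differentiation, 179 differentially expressed genes were identified, and analysis showed that they are closely related to several classes of genes, including those for energy metabolism and signal transduction. Cluster analysis grouped the samples into two major clusters: the 10 and 20 d samples formed one cluster, and the 30 and 40 d samples formed the other. This suggests that the BMSCs may undergo a marked change between 20 and 30 d.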

12.
13.
Li-Juan Tang  Hai-Long Wu 《Talanta》2009,79(2):260-1694
One problem with discriminant analysis of microarray data is that each sample is represented by a large number of genes, many of which may be irrelevant, insignificant or redundant. Methods of variable selection are, therefore, of great significance in microarray data analysis. To circumvent the problem, a new gene mining approach is proposed based on the similarity between the probability density functions of each gene for the class of interest and for the other classes. This method identifies significant genes that are informative for discriminating each individual class, rather than maximizing the separability of all classes. One can then select genes containing important information about particular subtypes of diseases. Based on the significant genes mined for individual classes, a support vector machine with local kernel transform is constructed for the classification of different diseases. The combination of the gene mining approach with the support vector machine is demonstrated for cancer classification using two public data sets. The results reveal that significant genes are identified for each cancer, and the classification model shows satisfactory performance in training and prediction for both data sets.

14.
Our ability to detect differentially expressed genes in a microarray experiment can be hampered when the number of biological samples of interest is limited. In this situation, we propose the use of information from self-self hybridizations to sharpen our inference of differential expression. A unified modelling strategy is developed to allow better estimation of the error variance. This principle is similar to the use of a pooled variance estimate in the two-sample t-test. The results from real dataset examples suggest that we can detect more differentially expressed genes with the combined models. Our simulation study provides evidence that this method increases sensitivity compared to using the information from comparative hybridizations alone, given the same control of the false discovery rate. The largest increase in sensitivity occurs when the amount of information in the comparative hybridizations is limited.
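The pooled variance estimate that the abstract uses as an analogy can be written out as follows. This sketch shows only the textbook two-sample case, not the unified model of the abstract; the function names are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_variance(x, y):
    """Pooled variance: sample variances weighted by their degrees of freedom."""
    nx, ny = len(x), len(y)
    return ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)


def pooled_t(x, y):
    """Two-sample t statistic using the pooled variance estimate."""
    sp2 = pooled_variance(x, y)
    return (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / len(x) + 1 / len(y)))
```

Pooling borrows degrees of freedom from both groups to stabilize the error variance. Adding self-self hybridizations to the model serves a similar purpose when comparative samples are few.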

15.
16.
When using microarray data to study a complex disease such as cancer, it is common practice to normalize the data so that all arrays have the same distribution of probe intensities regardless of the biological groups of samples. The assumption underlying such normalization is that in a disease the majority of genes are not differentially expressed (DE) and the numbers of up- and down-regulated genes are roughly equal. However, accumulating evidence suggests that gene expression could be widely altered in cancer, so we need to evaluate how sensitive biological discoveries are to violation of the normalization assumption. Here, we analyzed 7 large Affymetrix datasets of pair-matched normal and cancer samples collected in the NCBI GEO database. We showed that in 6 of these 7 datasets, the medians of perfect match (PM) probe intensities increased in the cancer state, and the increases were significant in three datasets, suggesting that the assumption that all arrays have the same median probe intensities regardless of the biological groups of samples might be misleading. Then, we evaluated the effects of the three most widely used normalization algorithms (RMA, MAS5.0 and dChip) on the selection of DE genes by comparing them with LVS, which relies less on the above-mentioned assumption. The results showed that using RMA, MAS5.0 and dChip may produce many false down-regulated DE genes while missing many up-regulated DE genes. At least for cancer studies, normalizing all arrays to have the same distribution of probe intensities regardless of the biological groups of samples might be misleading. Thus, most current normalizations, which are based on unreliable assumptions, may distort biological differences between normal and cancer samples. The LVS algorithm might perform relatively well because it relies less on the above-mentioned assumption. Our results also indicate that genes may be widely up-regulated in most human cancers.
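To see why forcing all arrays onto one distribution can hide a global shift, consider a minimal quantile normalization sketch (quantile normalization is the normalization step used in RMA). The toy intensities and the function name are hypothetical, and ties are handled naively by sort order.

```python
def quantile_normalize(arrays):
    """Force every array to share one distribution (mean of the sorted values)."""
    n = len(arrays[0])
    sorted_cols = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [sum(col[r] for col in sorted_cols) / len(arrays) for r in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices from lowest to highest
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


normal = [1.0, 2.0, 3.0]
cancer = [8.0, 4.0, 6.0]  # hypothetical array with globally higher intensities
print(quantile_normalize([normal, cancer]))
```

After normalization both toy arrays contain exactly the same set of values, so the globally higher intensities of the second array are gone. This is the effect the abstract warns about when expression is broadly up-regulated in one group.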

17.
Colorectal cancer is one of the leading causes of cancer-related deaths worldwide. The gemini nanoparticle formulation of polyphenolic curcumin significantly inhibits the viability of cancer cells. However, the molecular mechanisms and pathways underlying its toxicity in colon cancer are unclear. Here, we aimed to uncover possible novel targets of gemini curcumin (Gemini-Cur) in colorectal cancer and the related cellular pathways. After confirming the cytotoxic effect of Gemini-Cur by MTT and apoptotic assays, RNA sequencing was employed to identify differentially expressed genes (DEGs) in HCT-116 cells. Of a total of 3892 DEGs (padj < 0.01), 442 genes showed |log2 FC| > 2 (244 upregulated and 198 downregulated). Gene ontology (GO) enrichment analysis was performed. Protein–protein interaction (PPI) and gene-pathway networks were constructed using STRING and Cytoscape. The pathway analysis showed that Gemini-Cur predominantly modulates pathways related to the cell cycle. The gene network analysis revealed five central genes, namely GADD45G, ATF3, BUB1B, CCNA2 and CDK1. Real-time PCR and Western blotting analysis confirmed the significant modulation of these genes in Gemini-Cur-treated cells compared to non-treated cells. In conclusion, RNA sequencing revealed novel potential targets of curcumin in cancer cells. Further studies are required to elucidate the molecular mechanism of action of Gemini-Cur regarding the modulation of the expression of hub genes.
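The two-threshold DEG filter described above (adjusted P value and absolute log2 fold change) can be sketched as follows. The gene names and values in the toy table are made up for illustration and are not taken from the study; the function name and the data layout are hypothetical.

```python
def filter_degs(results, padj_cutoff=0.01, lfc_cutoff=2.0):
    """Split DE results into up- and down-regulated genes passing both cutoffs."""
    up, down = [], []
    for gene, (log2_fc, padj) in results.items():
        if padj < padj_cutoff and abs(log2_fc) > lfc_cutoff:
            (up if log2_fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Hypothetical fold changes and adjusted P values, for illustration only.
toy = {
    "GENE_A": (2.7, 0.0004),
    "GENE_B": (-3.1, 0.002),
    "GENE_C": (2.5, 0.2),      # fails the padj cutoff
    "GENE_D": (0.8, 0.0001),   # fails the fold-change cutoff
}
up, down = filter_degs(toy)
print(up, down)
```

Applying the fold-change cutoff to the absolute value keeps strongly down-regulated genes as well as strongly up-regulated ones.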

18.
Purpose: To identify potential biomarkers and to uncover the mechanisms underlying asthma based on Gibbs sampling. Methods: The molecular functions (MFs) containing more than 5 genes were determined using AnnotationMFGO of the BAGS package, and the obtained MFs were then transformed into a Markov chain (MC). Gibbs sampling was conducted to obtain a new MC. Meanwhile, the average probabilities of the MFs were computed via the MC Monte Carlo (MCMC) algorithm, followed by identification of differentially expressed MFs as those with a probability greater than 0.6. Moreover, the differentially expressed genes (DEGs) and their correlated genes were screened and merged; the merged set is referred to as co-expressed genes. Pathway enrichment analysis was performed for the co-expressed genes. Results: Based on gene sets with more than 5 genes, 396 MFs in total were determined. After Gibbs sampling, 5 differentially expressed MFs were obtained according to alfa.pi > 0.6. The genes in these 5 differentially expressed MFs were merged, and 110 DEGs were identified. Subsequently, 338 co-expressed genes were obtained. At P < 0.01, the co-expressed genes were significantly enriched in 6 pathways. Among these, ubiquitin mediated proteolysis contained the largest number of co-expressed genes (35), and cell cycle contained the second largest number (11). Conclusions: The identified pathways, such as ubiquitin mediated proteolysis and cell cycle, might play important roles in the development of asthma and may be useful for developing credible approaches for the diagnosis and treatment of asthma in the future.

19.
Dysregulated and reprogrammed metabolism is one of the most important characteristics of cancer, and exploiting cancer cell metabolism can aid in understanding the diverse clinical outcomes for patients. To investigate the differences in metabolic pathways among patients with acute myeloid leukemia (AML) and differing survival outcomes, we systematically analyzed microarray data on the metabolic gene expression profiles of 384 patients available from the Gene Expression Omnibus and Cancer Genome Atlas databases. Pathway enrichment analysis of differentially expressed genes (DEGs) showed that the metabolic differences between low-risk and high-risk patients mainly existed in two pathways: biosynthesis of unsaturated fatty acids and oxidative phosphorylation. Using the gene-pathway bipartite network, 62 metabolic genes were identified from 272 DEGs involved in 88 metabolic pathways. Based on the expression patterns of the 62 genes, patients with shorter overall survival (OS) durations in the training set (hazard ratio (HR) = 1.58, p = 0.038) and in two test sets (HR = 1.69 and 1.56 and p = 0.089 and 0.029, respectively) were well discriminated by hierarchical clustering analysis. Notably, the expression profiles of ALAS2, BCAT1, BLVRB, and HK3 showed distinct differences between the low-risk and high-risk patients. In addition, models for predicting the OS outcome of AML from the 62-gene signature achieved improved performance compared with previous studies. In conclusion, our findings reveal significant differences in the metabolic processes of patients with AML with diverse survival durations and provide valuable information for clinical translation.

20.
Background: Colorectal cancer (CRC) is one of the most frequently diagnosed diseases. Accumulating evidence shows that mRNAs and noncoding RNAs play important regulatory roles in tumorigenesis. Identifying them and determining the relationships between them can help in the diagnosis and treatment of cancer. Methods: Here we analyzed three microarray datasets, GSE110715, GSE32323 and GSE21510, to identify differentially expressed lncRNAs and mRNAs in CRC. An adjusted p-value ≤ 0.05 was considered statistically significant. Gene set enrichment analysis was carried out using the DAVID tool. The miRCancer database was searched to obtain differentially expressed miRNAs in colorectal cancer, and the miRDB database was used to obtain the targets of these miRNAs. To predict the lncRNA-miRNA interactions we used DIANA-LncBase v2 and RegRNA 2.0. Finally, the lncRNA-miRNA-mRNA-signaling pathway network was constructed using Cytoscape v3.1. Results: By analyzing the three datasets, a total of 21 mRNAs (15 up- and 6 down-regulated) and 24 lncRNAs (18 up- and 6 down-regulated) were identified as common differentially expressed genes between CRC tumor and marginal tissues. Nevertheless, the constructed lncRNA-miRNA-mRNA-signaling pathway network revealed a convergence on 6 lncRNAs (3 up- and 3 down-regulated), 7 mRNAs (2 up- and 5 down-regulated) and 6 miRNAs (3 up- and 3 down-regulated). We found that dysregulation of lncRNAs such as PCBP1-AS1, UCA1 and SNHG16 could sequester several miRNAs, such as hsa-miR-582-5p and hsa-miR-198, and promote the proliferation, invasion and drug resistance of colorectal cancer cells. Conclusions: We introduced a set of lncRNAs, mRNAs and miRNAs differentially expressed in CRC which might be considered for further experimental research as potential biomarkers of CRC development.
