Similar literature
 20 similar documents found (search time: 78 ms)
1.
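This paper takes the 2005 undergraduate cohort of the authors' university as its survey population and examines the factors that affect students' scores on the College English Test Band Four (CET-4). Using the bootstrap method and t-tests, the analysis finds that CET-4 scores differ between students from urban and rural areas; that they differ significantly among students whose English performance differed over the four semesters of the first and second years; and that they also differ between liberal-arts and science students. The authors then use regression analysis from multivariate statistics to build a regression model for CET-4 scores, and use a logistic model to predict the CET-4 pass rate.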
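The abstract does not say which bootstrap variant was used, so the following is only a minimal sketch of one common choice, a percentile bootstrap confidence interval for the difference between two group means. The function name `bootstrap_mean_diff_ci` and the scores are invented for illustration and are not taken from the paper.

```python
import random
import statistics


def bootstrap_mean_diff_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resampled_a = [rng.choice(a) for _ in a]
        resampled_b = [rng.choice(b) for _ in b]
        diffs.append(statistics.fmean(resampled_a) - statistics.fmean(resampled_b))
    diffs.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled differences.
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical scores, invented for illustration only.
urban = [455, 470, 482, 498, 510, 520, 533, 541, 560, 575]
rural = [410, 425, 431, 440, 452, 460, 468, 475, 490, 502]

low, high = bootstrap_mean_diff_ci(urban, rural)
print(f"95% bootstrap CI for the mean difference: [{low:.1f}, {high:.1f}]")
```

If the interval excludes zero, the two groups' means are unlikely to be equal at the chosen level, which is the kind of conclusion the abstract reports for urban and rural students.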

2.
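To address the practical difficulty of measuring soil moisture content, regression analysis is applied to identify the factors related to soil moisture content and to derive a regression equation from which the moisture content can be calculated. This simplifies the measurement and makes it easier to apply in practice.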

3.
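Application of statistical methods in securities information analysis   Total citations: 1, self-citations: 0, citations by others: 1
Weng Xiaoqing, Zhen Zengrong. Application of statistical methods in securities information analysis. 数理统计与管理, 1997, 16(4), 19-24. This paper applies statistical methods such as the median, the paired t-test, and analysis of variance to the quantitative analysis of securities information, providing a reliable basis for securities investment.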

4.
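Implementation of partial least squares modeling in R and an empirical analysis   Total citations: 2, self-citations: 0, citations by others: 2
By introducing the principles of partial least squares (PLS) modeling and significance testing, the paper addresses regression problems with small samples, many variables, and multicollinearity among the variables, and builds a regression model of multiple responses on multiple predictors. The PLS model is implemented in R (version R i386 2.15.1), and an empirical analysis is carried out on physicochemical indicator data for grapes and wine.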

5.
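An ICA-based time series clustering method and its application to stock data analysis   Total citations: 1, self-citations: 0, citations by others: 1
Time series clustering is one of the important tasks in time series data mining. Because of the special structure of time series data, general clustering algorithms usually cannot be applied to it directly. This paper proposes a time series clustering algorithm that combines independent component analysis (ICA) with an improved k-means algorithm: ICA is first used to extract features from the time series, and the improved k-means algorithm then clusters the extracted features, giving a new feature-based time series clustering method. To verify its effectiveness and feasibility, the method is applied to clustering real stock time series data, with good numerical results.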

6.
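To strengthen vehicle monitoring, collect data accurately, regulate and direct traffic flow sensibly, and improve traffic management, equipment for collecting traffic flow data is indispensable. A scientific, objective, and effective testing standard is also needed for the traffic data detection instruments that manufacturers produce, so that traffic management remains scientific and accurate. Commonly collected traffic data include traffic volume, vehicle size class, and whether a stopped vehicle has crossed the stop line. Based on the data collected by an instrument, this paper gives a statistical analysis method for deciding whether the instrument's measurement accuracy meets the specified standard.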

7.
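The GM(1,1) model is used to forecast the gross output value of China's agriculture, forestry, animal husbandry, and fishery sector. The data are first preprocessed so that they satisfy the conditions of the grey forecasting model, and the model is solved by programming in MATLAB. Residual tests and rolling tests show errors below 10% in all cases. Finally, a grey relational analysis is carried out. The conclusion is that fishery has the greatest influence on the gross output value, followed by animal husbandry, forestry, and agriculture.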
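For readers unfamiliar with GM(1,1), the following is a minimal sketch of the standard textbook formulation: accumulate the series, fit the development coefficient a and grey input b by least squares on x0(k) + a·z1(k) = b, then difference the fitted accumulated curve. It is written in Python rather than the paper's MATLAB, the function names are hypothetical, and the sample series is invented; the paper's preprocessing and tests are not reproduced.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) from the original series x0."""
    # Accumulated generating series x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values z1(k): means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Ordinary least squares for y = slope * z + b, where slope = -a.
    m = len(z)
    z_bar, y_bar = sum(z) / m, sum(y) / m
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    slope = s_zy / s_zz
    return -slope, y_bar - slope * z_bar


def gm11_predict(x0, a, b, steps):
    """Fitted values for x0 followed by `steps` forecasts of the original series."""
    def x1_hat(k):
        # Time response of the accumulated series; k = 0 is the first observation.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    fitted = [x0[0]]
    for k in range(1, len(x0) + steps):
        fitted.append(x1_hat(k) - x1_hat(k - 1))
    return fitted


# Hypothetical output values, invented for illustration only.
series = [102.0, 111.5, 121.9, 134.2, 146.8, 161.0]
a, b = gm11_fit(series)
forecast = gm11_predict(series, a, b, steps=2)
print("a =", round(a, 4), "b =", round(b, 2))
print("next two forecasts:", [round(v, 1) for v in forecast[-2:]])
```

A residual check of the kind the abstract mentions would compare `forecast[:len(series)]` with `series` and confirm that the relative errors stay under the chosen threshold.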

8.
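Application of kernel principal component analysis (KPCA) to the evaluation of enterprise economic performance   Total citations: 8, self-citations: 0, citations by others: 8
A new model for evaluating economic performance is proposed: kernel principal component analysis (KPCA). Through a nonlinear transformation it maps the original variable space into a high-dimensional feature space and then performs linear principal component analysis there. With the kernel trick, only inner products in the original space need to be computed, and the contribution rate of the first principal component can exceed 90%, which effectively avoids the weakened evaluation that results in PCA when the contribution rates of the indicators are too dispersed. The model is applied to evaluate eight cigarette manufacturers in Guangdong, with fairly satisfactory results.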

9.
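This paper uses statistical methods such as principal component analysis and factor analysis to analyze physical examination data from recent years for faculty and staff at the associate-professor level and above at 科大 (the authors' university). The analyses indicate that the health of the university's faculty and staff gives cause for concern: the proportion with obesity is about 38.6%, with hypertension about 78%, with fatty liver about 29.8%, and with diabetes about 9.6%. The results show that factors such as teachers' working and living environments have a major effect on their health.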

10.
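Multiple testing methods are used for statistical inference on high-dimensional data. First, least absolute deviation estimation is applied to multiple testing to construct a new method for estimating the number of true null hypotheses. Second, a simulation compares the accuracy of least absolute deviation and least squares in estimating the number of true null hypotheses; the results show that the former is more accurate. Finally, the estimation method is applied to breast cancer microarray data to look for differentially expressed genes. The tests identify 118 differentially expressed genes, of which 85 are biologically valid, which shows empirically that the method has some practical value.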
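The abstract does not give the details of its least-absolute-deviation estimator, so the sketch below does not reproduce it. As a stand-in, it shows two widely used building blocks of multiple testing: a simple threshold-based estimate of the number of true null hypotheses (Storey-type, with an assumed tuning value `lam`) and the Benjamini-Hochberg step-up procedure. The function names and p-values are hypothetical.

```python
def estimate_true_nulls(pvalues, lam=0.5):
    """Threshold-based estimate of the number of true null hypotheses.

    P-values above `lam` are assumed to come mostly from true nulls, which
    are uniform on [0, 1], so their count is scaled up by 1 / (1 - lam).
    """
    above = sum(1 for p in pvalues if p > lam)
    return min(len(pvalues), above / (1.0 - lam))


def benjamini_hochberg(pvalues, q=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        # Step-up rule: find the largest rank whose p-value is below rank * q / m.
        if pvalues[index] <= rank * q / m:
            largest_rank = rank
    return sorted(order[:largest_rank])


# Hypothetical p-values, invented for illustration only.
pvals = [0.6, 0.001, 0.9, 0.003, 0.3, 0.002, 0.7, 0.004]
print("estimated true nulls:", estimate_true_nulls(pvals))
print("rejected indices:", benjamini_hochberg(pvals))
```

An estimate of the number of true nulls is typically used to sharpen the rejection threshold; how the paper combines the two steps is not stated in the abstract.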

11.
12.
Microarrays are part of a new class of biotechnologies which allow the monitoring of expression levels for thousands of genes simultaneously. Image analysis is an important aspect of microarray experiments, one that can have a potentially large impact on subsequent analyses such as clustering or the identification of differentially expressed genes. This article reviews a number of existing image analysis approaches for cDNA microarray experiments and proposes new addressing, segmentation, and background correction methods for extracting information from microarray scanned images. The segmentation component uses a seeded region growing algorithm which makes provision for spots of different shapes and sizes. The background estimation approach is based on an image analysis technique known as morphological opening. These new image analysis procedures are implemented in a software package named Spot, built on the R environment for statistical computing. The statistical properties of the different segmentation and background adjustment methods are examined using microarray data from a study of lipid metabolism in mice. It is shown that in some cases background adjustment can substantially reduce the precision (that is, increase the variability) of low-intensity spot values. In contrast, the choice of segmentation procedure has a smaller impact. The comparison further suggests that seeded region growing segmentation with morphological background correction provides precise and accurate estimates of foreground and background intensities.
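To illustrate the idea behind background estimation by morphological opening, here is a deliberately simplified one-dimensional sketch in Python (the Spot package itself works on two-dimensional images in R, and its actual structuring element is not described in the abstract). An opening is an erosion (sliding minimum) followed by a dilation (sliding maximum); with a window wider than a spot, the spot is removed and what remains approximates the background. All names and values are hypothetical.

```python
def erode(signal, half_width):
    """Sliding minimum over a window of 2 * half_width + 1 samples (clipped at the edges)."""
    n = len(signal)
    return [min(signal[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def dilate(signal, half_width):
    """Sliding maximum over the same window."""
    n = len(signal)
    return [max(signal[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def opening(signal, half_width):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(signal, half_width), half_width)


# Hypothetical intensity profile across one spot, invented for illustration:
# a flat background of 1 with a two-sample spot of intensity 9.
profile = [1, 1, 1, 9, 9, 1, 1, 1, 1]
background = opening(profile, half_width=2)   # window of 5 is wider than the spot
corrected = [p - b for p, b in zip(profile, background)]
print("background:", background)
print("corrected: ", corrected)
```

Because the opening never exceeds the original signal, the corrected intensities are never negative, which is one reason this kind of estimate behaves gently on low-intensity spots.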

13.
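Breast cancer related genes are abstracted as nodes, and gene expression profiles are used to build five kinds of correlation network (Pearson, Spearman, Kendall, mutual information, and MIC) for normal tissue samples (group C) and for breast cancer samples (group E). The differences in node degree between group C and group E are then analyzed, and 17 structurally key genes are identified, 12 of which have been confirmed in the literature to be significantly associated with breast cancer. It...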

14.
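Starting from complex network theory and building on an analysis of existing data on breast cancer susceptibility genes, this paper statistically analyzes the associations among the susceptibility genes and their relationship to breast cancer, and uses them to construct a protein network of breast cancer disease genes. Computing and studying indicators such as node degree and the clustering coefficient shows that the network is highly clustered, that is, a small number of core nodes control the stability of the whole network structure. This provides a new theoretical basis and method for further research into and discovery of breast cancer disease genes.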
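The two indicators named in this abstract, node degree and the clustering coefficient, can be computed with a few lines of code. The sketch below assumes an undirected, unweighted network and uses the usual local clustering coefficient (the fraction of a node's neighbor pairs that are themselves linked). The node labels and edges are hypothetical and do not come from the paper.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs; self-loops are ignored."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree(adjacency, node):
    """Number of neighbors of `node`."""
    return len(adjacency[node])


def local_clustering(adjacency, node):
    """Fraction of pairs of neighbors of `node` that are connected to each other."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical interaction edges, invented for illustration only.
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g1", "g4")]
network = build_adjacency(edges)
for node in sorted(network):
    print(node, degree(network, node), round(local_clustering(network, node), 3))
```

Averaging `local_clustering` over all nodes gives a network-level clustering coefficient, and sorting nodes by `degree` is a simple way to surface candidate core nodes.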

15.
Microarrays offer unprecedented possibilities for so-called omic research, e.g., genomic and proteomic research. However, they also produce data that are quite challenging to analyze. The aim of this paper is to provide a short tutorial on the most common approaches to pattern discovery and cluster analysis as they are currently used for microarrays, in the hope of drawing the attention of the Algorithmic Community to novel aspects of classification and data analysis that deserve attention and have potential for high reward. R. Giancarlo is partially supported by Italian MIUR grants PRIN “Metodi Combinatori ed Algoritmici per la Scoperta di Patterns in Biosequenze” and FIRB “Bioinformatica per la Genomica e la Proteomica” and Italy-Israel FIRB Project “Pattern Discovery Algorithms in Discrete Structures, with Applications to Bioinformatics”. D. Scaturro is supported by a MIUR Fellowship in the Italy-Israel FIRB Project “Pattern Discovery Algorithms in Discrete Structures, with Applications to Bioinformatics”.

16.
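Application of Bayes discriminant analysis to medical data processing   Total citations: 1, self-citations: 0, citations by others: 1
Using the basic principles and methods of discriminant analysis, this paper builds a mathematical model for medical data on liver cirrhosis and solves it with SPSS 16.0, obtaining three meaningful discriminant functions that can be used for classification.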

17.
A method is developed here for characterizing the empirical distribution of the efficient units in data envelopment analysis. Two empirical applications illustrate the various uses of the distribution approach. One involves the cost frontier which exhibits increasing returns to scale and the other involves a dynamic production frontier, where technological change causes a shift of the production frontier over time.

18.
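Cluster analysis of multi-indicator panel data and its application   Total citations: 8, self-citations: 0, citations by others: 8
Multivariate statistical analysis of multi-indicator panel data is still a gap in domestic research. This paper analyzes the data format and numerical characteristics of panel data and, following the principles of cluster analysis, reconstructs the distance function and the sum-of-squared-deviations function for multi-indicator panel data. On this basis it describes the clustering procedure for such data. Finally, an empirical cluster analysis of the production efficiency of industrial enterprises across China's regions shows good results.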

19.
The identification of breast cancer patients for whom chemotherapy could prolong survival time is treated here as a data mining problem. This identification is achieved by clustering 253 breast cancer patients into three prognostic groups: Good, Poor and Intermediate. Each of the three groups has a significantly distinct Kaplan-Meier survival curve. Of particular significance is the Intermediate group, because patients with chemotherapy in this group do better than those without chemotherapy in the same group. This is the reverse case to that of the overall population of 253 patients for which patients undergoing chemotherapy have worse survival than those who do not. We also prescribe a procedure that utilizes three nonlinear smooth support vector machines (SSVMs) for classifying breast cancer patients into the three above prognostic groups. These results suggest that the patients in the Good group should not receive chemotherapy while those in the Intermediate group should receive chemotherapy based on our survival curve analysis. To our knowledge this is the first instance of a classifiable group of breast cancer patients for which chemotherapy can possibly enhance survival.
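The Kaplan-Meier survival curves mentioned in this abstract come from the standard product-limit estimator, which can be sketched as follows. This is a generic illustration, not the authors' code; the function name and the follow-up data are hypothetical. An event flag of 1 marks an observed death and 0 marks a censored observation.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival probability) at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (in months) and event flags, invented for illustration.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(f"t = {t}: S(t) = {s:.3f}")
```

Computing one such curve per prognostic group and comparing them is the kind of analysis the abstract describes for the Good, Intermediate, and Poor groups.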

20.
Given two finite sets of points $X^+$ and $X^-$ in $\mathbb{R}^n$, the maximum box problem consists of finding an interval (box) $B = \{x : l \le x \le u\}$ such that $B \cap X^- = \emptyset$, and the cardinality of $B \cap X^+$ is maximized. A simple generalization can be obtained by instead maximizing a weighted sum of the elements of $B \cap X^+$. While polynomial for any fixed $n$, the maximum box problem is NP-hard in general. We construct an efficient branch-and-bound algorithm for this problem and apply it to a standard problem in data analysis. We test this method on nine data sets, seven of which are drawn from the UCI standard machine learning repository.
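To make the problem statement concrete, the sketch below solves tiny instances by exhaustive search. It is not the branch-and-bound algorithm of the paper. It relies on one observation: any feasible box can be shrunk to the bounding box of the positive points it contains without losing any of them, so only bounds taken from positive-point coordinates need to be tried. The function name and the points are hypothetical.

```python
from itertools import product


def max_box_bruteforce(positive, negative):
    """Exhaustively find a box with no negative points and the most positive points.

    Only practical for tiny instances: the number of candidate boxes grows
    exponentially with the dimension.
    """
    dim = len(positive[0])
    # Candidate closed intervals [lo, hi] per coordinate, from positive-point coordinates.
    per_dim = []
    for j in range(dim):
        values = sorted({p[j] for p in positive})
        per_dim.append([(lo, hi) for lo in values for hi in values if lo <= hi])

    def inside(point, box):
        return all(lo <= x <= hi for x, (lo, hi) in zip(point, box))

    best_box, best_count = None, 0
    for box in product(*per_dim):
        if any(inside(q, box) for q in negative):
            continue  # infeasible: the box contains a negative point
        count = sum(1 for p in positive if inside(p, box))
        if count > best_count:
            best_box, best_count = box, count
    return best_box, best_count


# Hypothetical points in the plane, invented for illustration only.
positive = [(1, 1), (2, 2), (3, 1), (5, 5)]
negative = [(4, 4)]
box, count = max_box_bruteforce(positive, negative)
print("box bounds per coordinate:", box, "positive points covered:", count)
```

The weighted generalization mentioned in the abstract would replace the count with a sum of per-point weights; the search structure stays the same.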
