7 results found (search time: 15 ms)
1.
Predicting phenotypes on the basis of gene expression profiles is a classification task that is becoming increasingly important in the field of precision medicine. Although these expression signals are real-valued, it is questionable whether they can be analyzed on an interval scale. As with many biological signals, their influence on, e.g., protein levels is usually non-linear and can thus be misinterpreted. In this article we study gene expression profiles with up to 54,000 dimensions. We analyze these measurements on an ordinal scale by replacing the real-valued profiles with their ranks. This type of rank transformation can be used to construct invariant classifiers that are not affected by noise induced by data transformations, which can occur in the measurement setup. Our 10 × 10-fold cross-validation experiments on 86 different data sets and 19 different classification models indicate that classifiers largely benefit from this transformation. In particular, random forests and support vector machines achieve improved classification results on a significant majority of datasets.
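The rank transformation described in this abstract can be illustrated with a minimal sketch. The function name `rank_transform`, the toy profile, and the choice of average ranks for ties are assumptions made for illustration; the abstract does not state how ties are handled. The example shows the invariance property: a strictly monotone transformation of the raw values (here a log transform) leaves the ranks unchanged.

```python
import math


def rank_transform(profile):
    """Replace each value by its rank (1 = smallest); tied values share the average rank."""
    order = sorted(range(len(profile)), key=lambda i: profile[i])
    ranks = [0.0] * len(profile)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all values tied with the current one.
        while end + 1 < len(order) and profile[order[end + 1]] == profile[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # average of the 1-based positions
        for j in range(pos, end + 1):
            ranks[order[j]] = average_rank
        pos = end + 1
    return ranks


# Hypothetical expression profile of one sample (five genes).
profile = [5.2, 0.3, 12.7, 0.3, 8.1]
# A strictly monotone transformation, as might occur in a measurement setup.
log_profile = [math.log2(value + 1.0) for value in profile]

ranks = rank_transform(profile)
log_ranks = rank_transform(log_profile)
print(ranks)               # [3.0, 1.5, 5.0, 1.5, 4.0]
print(ranks == log_ranks)  # True
```

A classifier trained on `ranks` instead of `profile` would therefore see identical inputs before and after such a transformation.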
2.
We study ensembles of simple threshold classifiers for the categorization of high-dimensional data of low cardinality and give a compression bound on their prediction risk. Two approaches are utilized to produce such classifiers. One is based on univariate feature selection employing the area under the ROC curve as ranking criterion. The other approach uses a greedy selection strategy. The methods are applied to artificial data, published microarray expression profiles, and highly imbalanced data.
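As a rough illustration of the first approach (univariate selection by area under the ROC curve, followed by an ensemble of single-feature threshold classifiers), the sketch below may help. It is not the authors' implementation: the function names, the midpoint-of-class-means threshold, the unweighted majority vote, and the toy data are all assumptions, and the greedy selection strategy is not shown.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_stump(values, labels):
    """Single-feature threshold classifier (assumed rule: midpoint of the class means)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    direction = 1 if mean_pos >= mean_neg else -1  # -1: class 1 lies below the threshold
    return threshold, direction


def fit_ensemble(X, y, k):
    """Keep the k features whose AUC is farthest from 0.5 and fit one stump per feature."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    ranked = sorted(range(n_features),
                    key=lambda j: abs(auc(columns[j], y) - 0.5),
                    reverse=True)
    return [(j, *fit_stump(columns[j], y)) for j in ranked[:k]]


def predict(ensemble, x):
    """Unweighted majority vote of the threshold classifiers."""
    votes = sum(1 for j, threshold, direction in ensemble
                if direction * (x[j] - threshold) > 0)
    return 1 if 2 * votes > len(ensemble) else 0


# Toy data (hypothetical): 6 samples, 4 features, binary labels.
X = [
    [0.9, 0.2, 0.1, 0.5],
    [0.8, 0.7, 0.2, 0.4],
    [0.7, 0.4, 0.3, 0.6],
    [0.2, 0.3, 0.8, 0.5],
    [0.1, 0.6, 0.9, 0.3],
    [0.3, 0.5, 0.7, 0.4],
]
y = [1, 1, 1, 0, 0, 0]

ensemble = fit_ensemble(X, y, k=3)
selected = [j for j, _, _ in ensemble]
predictions = [predict(ensemble, row) for row in X]
print(selected)     # [0, 2, 3]
print(predictions)  # [1, 1, 1, 0, 0, 0]
```

Ranking by the distance of the AUC from 0.5 lets a feature that separates the classes in the inverted direction (feature 2 in the toy data) score as highly as one that separates them in the usual direction.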
3.
Feature selection is an essential step when dealing with high-dimensional data. In a diagnostic setting, marker genes have to be selected for specialized low-dimensional gene expression assays. A meaningful biomarker selection is expected to produce stable results in different resampling settings. We define an index to quantify stability and introduce a statistical testing procedure for stability. We also present new methods of visualizing stability and associating it with the accuracy of a subsequent classification process.
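The abstract does not give the definition of its stability index, so the sketch below uses a common stand-in, the mean pairwise Jaccard similarity between the feature sets selected in different resampling runs, purely to illustrate what "quantifying stability" can look like. The function names and the gene identifiers are hypothetical, and this is not the index defined by the authors.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(selections):
    """Average Jaccard similarity over all pairs of selection runs (stand-in stability score)."""
    pairs = list(combinations(selections, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical marker sets selected in three resampling runs.
stable_runs = [{"g1", "g2", "g3"}, {"g1", "g2", "g3"}, {"g1", "g2", "g4"}]
unstable_runs = [{"g1", "g2", "g3"}, {"g4", "g5", "g6"}, {"g7", "g8", "g9"}]

print(mean_pairwise_jaccard(stable_runs))    # about 0.667
print(mean_pairwise_jaccard(unstable_runs))  # 0.0
```

A score of 1 means every run selected the same markers; a score of 0 means no two runs shared a marker.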
4.
Europium(III) fluoride mesocrystals were synthesised in an organic matrix. This matrix is a gel formed by Eu³⁺ ions and a polycarboxylate/sulfonate copolymer, ACUSOL 588G. In the gel phase, the local amount of europium ions is very high since Eu³⁺ acts as a crosslinker, and crystallisation occurs upon addition of F⁻. Nucleated seed crystals in the gel phase grow by further ion attachment and form mesocrystals by mutual orientation of the EuF₃ particles in the gel. We propose a dipole field as the reason for this alignment and that the dipolar character of the particles originates from adsorption of the polyelectrolyte on charged crystal faces.
5.

Visualising data as diagrams using visual attributes such as colour, shape, size, and orientation is challenging. In particular, large data sets demand graphical display as an essential step in the analysis. In order to achieve comprehension often different attributes need to be displayed simultaneously. In this work a comprehensible bivariate, perceptually optimised visualisation scheme for high-dimensional data is proposed and evaluated. It can be used to show fold changes together with confidence values within a single diagram. The visualisation scheme consists of two parts: a uniform, symmetric, two-sided colour scale and a patch grid representation. Evaluation of uniformity and symmetry of the two-sided colour scale was performed in comparison to a standard RGB scale by twenty-five observers. Furthermore, the readability of the generated map was validated and compared to a bivariate heat map scheme.

6.
Advances in Data Analysis and Classification - Data-driven algorithms stand and fall with the availability and quality of existing data sources. Both can be limited in high-dimensional settings (...
7.
The $k$-Nearest Neighbour classifier is widely used and popular due to its inherent simplicity and the avoidance of model assumptions. Although the approach has been shown to yield a near-optimal classification performance for an infinite number of samples, a selection of the most decisive data points can improve the classification accuracy considerably in real settings with a limited number of samples. At the same time, a selection of a subset of representative training samples reduces the required amount of storage and computational resources. We devised a new approach that selects a representative training subset on the basis of an evolutionary optimization procedure. This method chooses those training samples that have a strong influence on the correct prediction of other training samples, in particular those that have uncertain labels. The performance of the algorithm is evaluated on different data sets. Additionally, we provide graphical examples of the selection procedure.
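To make the idea of evolutionary training-subset selection concrete, here is a minimal sketch under stated assumptions. It uses a simple (1+1) evolutionary scheme with bit-flip mutation, a 1-nearest-neighbour rule, and a fitness that rewards correct prediction of the training samples while penalising subset size. The abstract does not specify the authors' encoding, operators, or fitness function, so all names, parameters, and the toy data here are hypothetical.

```python
import random


def nearest_label(x, prototypes, skip=None):
    """Label of the closest prototype (squared Euclidean distance), ignoring index `skip`."""
    best = min(
        (p for p in prototypes if p[0] != skip),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p[1])),
    )
    return best[2]


def fitness(mask, X, y, size_penalty=0.01):
    """1-NN accuracy on all training samples using only the kept prototypes, minus a size penalty."""
    prototypes = [(i, X[i], y[i]) for i, keep in enumerate(mask) if keep]
    if len(prototypes) < 2:
        return -1.0  # too few prototypes to classify a kept sample without itself
    correct = sum(nearest_label(X[i], prototypes, skip=i) == y[i] for i in range(len(X)))
    return correct / len(X) - size_penalty * len(prototypes)


def evolve_subset(X, y, generations=200, seed=0):
    """(1+1) evolutionary search: flip each bit with probability 1/n, keep the child if not worse."""
    rng = random.Random(seed)
    n = len(X)
    best = [True] * n  # start from the full training set
    best_fit = fitness(best, X, y)
    for _ in range(generations):
        child = [(not bit) if rng.random() < 1.0 / n else bit for bit in best]
        child_fit = fitness(child, X, y)
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best, best_fit


# Toy data (hypothetical): two well-separated clusters in the plane.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
     (1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

full_fit = fitness([True] * len(X), X, y)
mask, mask_fit = evolve_subset(X, y)
print(sum(mask), "of", len(X), "training samples kept")
```

Because a child replaces the parent only when its fitness is at least as high, the returned subset never scores worse than the full training set under this fitness function.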