1.
Untargeted metabolomics based on liquid chromatography coupled with mass spectrometry (LC–MS) can detect thousands of features in a sample and produces highly complex datasets. Accurately extracting meaningful features and building discriminant models are two crucial steps in the untargeted metabolomics data analysis pipeline. In this study, pure ion chromatograms were extracted from a liquor dataset and a left-sided colon cancer (LCC) dataset using the K-means-clustering-based Pure Ion Chromatogram extraction method version 2.0 (KPIC2). A nonlinear low-dimensional embedding computed by uniform manifold approximation and projection (UMAP) then showed that samples from different groups separate in the reduced dimensions. Discriminant models were built with extreme gradient boosting (XGBoost) on the features extracted by KPIC2. These models achieved 100% classification accuracy on the test sets of both the liquor dataset and the LCC dataset, compared with 92% and 96%, respectively, for features extracted by XCMS, which supports building the XGBoost model on KPIC2 features. XGBoost also performed better than the linear method and traditional nonlinear modeling methods on these datasets. UMAP and XGBoost have been integrated into the KPIC2 package to extend its performance in complex situations; together they can process nonlinear datasets effectively and greatly improve the accuracy of data analysis in untargeted metabolomics.
2.
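Terahertz (THz) radiation offers low photon energy, transient behavior, and strong spectral analysis capability, which gives it broad prospects for substance identification. Existing THz-based identification methods have achieved some success, but they tend to become trapped in local optima, which limits recognition accuracy. Uniform manifold approximation and projection (UMAP) is a nonlinear dimensionality reduction method that assumes the data are uniformly distributed on a Riemannian manifold and can model manifolds with a fuzzy topological structure. UMAP reduces dimensionality by minimizing the cross-entropy between two topological representations, thereby optimizing the layout of the data representation in the low-dimensional space. In traditional fuzzy C-means clustering (FCM), the initial cluster centers are usually assigned at random, and a poor choice of initial centers easily leads to incorrect clustering. To address this, a UMAP-assisted fuzzy C-means clustering algorithm is proposed. First, UMAP is applied to reduce the dimensionality of the input THz sample matrix; next, suitable initial cluster centers are selected according to the principle of maximizing the distance between classes; finally, the samples are clustered with fuzzy C-means. The proposed method not only relieves overcrowding between classes during clustering but also reflects the distance information between classes, which makes it easier to select suitable initial cluster centers for the samples. To verify the reliability of the proposed clustering method, terahertz time-domain spectroscopy was used to measure four different types of transgenic cotton seed, Lumianyan 28 (鲁棉研28), Lumianyan 29 (鲁棉研29), Lumianyan 36 (鲁棉研36), and Zhongmian 28 (中棉28), and, based on U...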
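The last two steps described in this abstract (choosing well-separated initial centers, then running fuzzy C-means) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact center-selection rule, so a greedy farthest-point rule is assumed here; the UMAP step is omitted, and a small hand-made 2-D array (`embedding`, hypothetical) stands in for the UMAP output; the fuzzifier `m = 2` is likewise an assumed default.

```python
import math

def farthest_point_centers(points, c):
    """Greedy max-min selection of c initial centers (an assumed rule)."""
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    # Start from the point farthest from the overall centroid.
    centers = [max(points, key=lambda p: math.dist(p, centroid))]
    while len(centers) < c:
        # Next center: the point whose nearest chosen center is farthest away.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, q) for q in centers))
        )
    return [list(q) for q in centers]

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy C-means, started from well-separated centers."""
    n, dim = len(points), len(points[0])
    centers = farthest_point_centers(points, c)
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: u_ji = 1 / sum_k (d_ji / d_jk)^(2/(m-1)).
        memberships = []
        for p in points:
            dists = [max(math.dist(p, q), 1e-12) for q in centers]
            memberships.append(
                [1.0 / sum((dists[i] / dists[k]) ** power for k in range(c))
                 for i in range(c)]
            )
        # Center update: mean of the points weighted by u_ji^m.
        new_centers = []
        for i in range(c):
            weights = [memberships[j][i] ** m for j in range(n)]
            total = sum(weights)
            new_centers.append(
                [sum(weights[j] * points[j][k] for j in range(n)) / total
                 for k in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships

# Hypothetical 2-D points standing in for a UMAP embedding of three classes.
embedding = [
    (0.0, 0.0), (0.1, 0.2), (-0.1, 0.1), (0.2, -0.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (0.0, 5.0), (0.1, 5.1), (-0.2, 4.9), (0.1, 4.8),
]
centers, memberships = fuzzy_c_means(embedding, c=3)
```

Each row of `memberships` holds one sample's degree of membership in every cluster; taking the largest entry per row gives a hard label. Because the starting centers are chosen deterministically, repeated runs on the same embedding give the same result, in contrast to random initialization.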