Machining removes material from a workpiece using cutting tools. Any variation in tool condition affects the quality of the finished job and introduces disturbances, so a tool monitoring scheme (TMS) for categorizing and supervising failures has become a priority. In response, this paper advocates a traditional TMS followed by machine learning (ML) analysis. Classification in ML is a supervised learning method in which the algorithm learns from the training data fed to it and then uses the resulting model to assign new observations to a class. In the current study, a single-point cutting tool is investigated while turning a stainless steel (SS) workpiece on a manual lathe trainer. The vibrations developed during turning are examined for the failure-free state and for various failure states of the tool. Statistical modeling is then used to extract informative features from the vibration signals. A multiple-binary-rule-based model for categorization is designed using a decision tree. Lastly, various tree-based algorithms are used to categorize the tool conditions. Random Forest offered the highest classification accuracy, 92.6%.
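The statistical feature extraction step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which statistics were used, so the feature set here (mean, standard deviation, RMS, peak, crest factor, skewness, kurtosis) and the function name `vibration_features` are assumptions.

```python
import math
from statistics import fmean, pstdev


def vibration_features(signal):
    """Return simple time-domain statistics for one vibration record.

    The choice of statistics is an assumption made for illustration;
    the abstract does not list the features it used.
    """
    mean = fmean(signal)
    std = pstdev(signal, mu=mean)  # population standard deviation
    rms = math.sqrt(fmean(x * x for x in signal))
    peak = max(abs(x) for x in signal)
    # Standardized third and fourth central moments (0.0 for a constant signal).
    skew = fmean(((x - mean) / std) ** 3 for x in signal) if std else 0.0
    kurt = fmean(((x - mean) / std) ** 4 for x in signal) if std else 0.0
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms else 0.0,
        "skewness": skew,
        "kurtosis": kurt,
    }


if __name__ == "__main__":
    # Hypothetical accelerometer samples, not data from the study.
    record = [0.12, -0.08, 0.31, -0.27, 0.05, -0.14, 0.22, -0.19]
    for name, value in vibration_features(record).items():
        print(f"{name}: {value:.4f}")
```

One feature dictionary per recorded signal, labelled with the tool state, would then form a row of the training table for a tree-based classifier such as a decision tree or random forest.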
Various Higgs factories have been proposed to study the Higgs boson precisely and systematically in a model-independent way. In this study, the Particle Flow Network and ParticleNet techniques are used to classify Higgs decays into multiple categories, with the ultimate goal of realizing an "end-to-end" analysis. A Monte Carlo simulation study is performed to demonstrate feasibility, and the performance looks promising. This result could be the basis of a "one-stop" analysis that measures all the branching fractions of the Higgs decays simultaneously.
An innovative volatolomic approach employs the detection of biomarkers present in cerumen (earwax) to identify cattle intoxication by Stryphnodendron rotundifolium Mart., Fabaceae (popularly known as barbatimão). S. rotundifolium is a poisonous plant whose toxic compound has not been identified and which is widely distributed throughout Brazil. Cerumen samples from cattle of two local Brazilian breeds (‘Curraleiro Pé-Duro’ and ‘Pantaneiro’) were collected during an experimental intoxication protocol and analyzed using headspace (HS)/GC–MS followed by multivariate analysis (genetic algorithm for partial least squares, cluster analysis, and classification and regression trees). A total of 106 volatile organic metabolites were identified in the bovine cerumen samples. Intoxication by S. rotundifolium influenced the cerumen volatolomic profile of the bovines throughout the intoxication protocol, which made it possible to detect biomarkers for cattle intoxication. Among the biomarkers, 2-octyldecanol and 9-tetradecen-1-ol discriminated all samples between intoxicated and nonintoxicated bovines. Intoxication of cattle by S. rotundifolium was diagnosed by cerumen analysis using HS/GC–MS in an easy, accurate, and noninvasive way. Thus, the proposed bioanalytical chromatography protocol is a useful tool in veterinary applications for determining this kind of intoxication.
DNA microarray data have been widely used in cancer research because they help to distinguish tumor classes successfully. However, typical gene expression data are high-dimensional and class-imbalanced, which makes it very difficult for traditional machine learning methods to build a robust classifier that performs well on both minority and majority classes. As one of the most successful feature weighting techniques, Relief is considered particularly well suited to high-dimensional problems. Unfortunately, almost no Relief-based method takes class imbalance into account. This study shows that existing Relief-based algorithms may underestimate features that discriminate minority classes and ignore the distribution of minority class samples. As a result, they can introduce an additional bias towards classification into the majority classes. To address this, a new method named imRelief is proposed for efficiently handling high-dimensional imbalanced gene expression data. imRelief corrects the bias towards the majority classes and accounts for the scattered distribution of minority class samples when estimating feature weights. In this way, imRelief rewards features that separate the minority classes well from the other classes. Experiments on four microarray gene expression data sets demonstrate the effectiveness of imRelief in both feature weighting and feature subset selection.
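For orientation, the baseline that imRelief modifies can be sketched as below. This is the classic two-class Relief weight update, not imRelief itself: the abstract does not give the imRelief correction, so none is attempted here. The function name `relief_weights` and the choice to visit every sample once (instead of random sampling) are assumptions made to keep the sketch deterministic.

```python
def relief_weights(X, y):
    """Classic two-class Relief feature weights (not imRelief).

    A feature gains weight when it differs between a sample and its
    nearest miss (closest sample of the other class) and loses weight
    when it differs between the sample and its nearest hit (closest
    sample of the same class). Each class needs at least two samples.
    """
    n, d = len(X), len(X[0])
    # Per-feature value ranges, used to scale differences into [0, 1].
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0
             for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):  # assumption: visit every sample once
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


if __name__ == "__main__":
    # Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
    X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
    y = [0, 0, 1, 1]
    print(relief_weights(X, y))
```

Because every sample contributes equally to the update, a class with few samples contributes few updates, which is one way to see how the majority-class bias described above can arise.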
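Excessive concentrations of organic matter in water are highly harmful: they cause serious environmental pollution and endanger human health. Traditional chemical methods for measuring chemical oxygen demand (COD) in water involve tedious steps and are slow, which makes them unsuitable for rapid quantitative COD detection. To address these problems, a rapid quantitative COD detection method is proposed that combines ultraviolet (UV) spectroscopy with a combined-weight model. The model uses backward interval partial least squares (BiPLS) together with synergy interval partial least squares (SiPLS) to screen and combine characteristic subintervals of the UV spectrum, and then builds a prediction model according to the weights of those subintervals. First, 45 COD standard solution samples were prepared along a concentration gradient and their UV spectra were acquired experimentally. The spectra were preprocessed with a first derivative and Savitzky-Golay (S-G) filtering to eliminate baseline drift and environmental noise, and the SPXY (sample set partitioning based on joint X-Y distances) algorithm was applied to divide the samples into a calibration set and a prediction set. Wavelength selection over the full spectrum was then performed with BiPLS. Because the number of subintervals strongly affects modeling during BiPLS screening, this number was optimized: the spectrum was divided into 15 to 25 subintervals, a partial least squares (PLS) model was built for each setting, and the optimal number was selected by the root mean square error of cross-validation (RMSECV). The model performed best with 18 subintervals. From these 18 wavelength intervals, six characteristic subintervals were selected: subintervals 2, 1, 3, 11, 7, and 6, corresponding to wavelengths of 234–240, 262–268, 269–275, 290–296, 297–303, and 304–310 nm. These six intervals contain a large amount of spectral information and contribute strongly to the final prediction model. Next, SiPLS was used to further screen and combine the six preselected intervals. PLS models were built on the feature intervals for different numbers of combined intervals; for each combination number the best interval combination was selected, and the errors and correlations of the prediction models were compared across combination numbers. The six intervals were thereby combined into three characteristic wavelength intervals, 234–240, 262–275, and 290–310 nm, with optimal numbers of factors of 4, 4, and 3, respectively. The conventional SiPLS method of combining feature intervals was then improved: instead of combining the intervals directly, the three intervals were combined linearly according to their weights. The weights of the three intervals, computed from the weight formula, were 0.509, 0.318, and 0.173, and a linearly combined weighted COD concentration prediction model was finally established. To verify the accuracy of the combined-weight model, a full-wavelength PLS model, PLS models on the individual characteristic intervals, and a PLS model on the directly combined characteristic intervals were also built, and the models were evaluated using the squared correlation coefficient (R²), the root mean square error of prediction (RMSEP) between predicted and true concentrations, and the prediction recovery rate (T). The validation results show that, compared with the other models, the combined-weight model reached an R² of 0.9997, clearly better than the 0.9680 of the directly combined model; its RMSEP was 0.532, which is 29.3% lower than that of the directly combined model; and its prediction recovery rate was 96.4%–103.1%, a significant improvement in prediction accuracy. The method is simple and feasible, produces no secondary pollution, and can provide technical support for online monitoring of COD concentration in water.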
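The final combination and evaluation step can be sketched as follows. Only the three weights (0.509, 0.318, 0.173) come from the abstract. Reading the "linear combination" as a weighted sum of the three per-interval PLS predictions, and defining the recovery rate as predicted divided by true concentration, are assumptions; the weight formula itself is not given and is not reproduced here. All function names and sample values are hypothetical.

```python
import math

# Interval weights reported in the abstract for 234-240, 262-275 and 290-310 nm.
WEIGHTS = (0.509, 0.318, 0.173)


def combine(preds_by_interval, weights=WEIGHTS):
    """Weighted sum of per-interval predictions, sample by sample.

    Assumption: each inner list holds one interval model's predictions
    for the same samples, in the same order.
    """
    return [sum(w * p for w, p in zip(weights, sample))
            for sample in zip(*preds_by_interval)]


def rmsep(pred, true):
    """Root mean square error of prediction."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def recovery(pred, true):
    """Recovery rate in percent per sample (assumed to be predicted / true)."""
    return [100.0 * p / t for p, t in zip(pred, true)]


if __name__ == "__main__":
    # Hypothetical predictions from three interval models, not data from the study.
    true = [20.0, 50.0, 80.0]
    preds = [[20.4, 49.1, 80.9], [19.2, 51.0, 79.0], [21.0, 50.6, 78.5]]
    combined = combine(preds)
    print(combined, rmsep(combined, true), recovery(combined, true))
```

Since the three reported weights sum to 1, the combined value stays on the same concentration scale as the individual predictions.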