Multi-Class Classification of Medical Data Based on Neural Network Pruning and Information-Entropy Measures
Authors: Máximo Eduardo Sánchez-Gutiérrez, Pedro Pablo González-Pérez
Institution: 1. Colegio de Ciencia y Tecnología, Universidad Autónoma de la Ciudad de México, Ciudad de Mexico 06720, Mexico; 2. Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana-Cuajimalpa, Ciudad de Mexico 05348, Mexico
Abstract: Medical data includes clinical trials and clinical data such as patient-generated health data, laboratory results, medical imaging, and signals from continuous health monitoring. Commonly used data analysis techniques include text mining, big data analytics, and data mining, which can be applied to classification, clustering, and machine learning tasks. Machine learning can be described as an automatic learning process, derived from concepts and knowledge, that requires no deliberate system coding. However, finding a suitable machine learning architecture for a specific task is still an open problem. In this work, we propose a machine learning model for the multi-class classification of medical data. The model comprises two components: a restricted Boltzmann machine and a classifier system. It uses a discriminant pruning method to select the most salient neurons in the hidden layer of the neural network, which implicitly selects features for the input patterns that feed the classifier system. This study investigates whether information-entropy measures can provide evidence for guiding discriminative pruning in a neural network for medical data processing, particularly cancer research, using three cancer databases: Breast Cancer, Cervical Cancer, and Primary Tumour. Our proposal examines a post-training neuronal pruning methodology based on dissimilarity measures inspired by information-entropy theory; the results obtained after pruning the neural network were favourable. Specifically, for the Breast Cancer dataset, the reported results indicate a 10.68% error rate, while our error rates range from 10% to 15%; for the Cervical Cancer dataset, the best reported error rate is 31%, while our error rates range from 4% to 6%; lastly, for the Primary Tumour dataset, the reported error rate is 20.35%, and our best error rate is 31%.
Keywords: medical data and signals; machine learning; restricted Boltzmann machine; feature selection; discriminant pruning; information-entropy measures
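
The pruning idea summarised in the abstract (score each hidden neuron by how well it separates the classes, then keep only the most salient ones) can be illustrated with a minimal sketch. The abstract does not say which dissimilarity measure or pruning rule the authors use, so everything here is an assumption made for illustration: the Jensen-Shannon divergence stands in for the "information-entropy measure", the hidden layer uses random weights rather than a trained restricted Boltzmann machine, and the function names `hidden_activations`, `js_divergence`, `unit_scores`, and `select_units` are hypothetical.

```python
import numpy as np


def hidden_activations(X, W, b):
    """Sigmoid activation probabilities of an RBM-style hidden layer."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two histograms.

    Assumed stand-in for the paper's entropy-based dissimilarity measure.
    Returns a value in [0, 1]; 0 means identical, 1 means disjoint support.
    """
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def unit_scores(H, y, bins=10):
    """Score each hidden unit by the mean pairwise JS divergence between
    its class-conditional activation histograms."""
    classes = np.unique(y)
    edges = np.linspace(0.0, 1.0, bins + 1)
    scores = np.zeros(H.shape[1])
    for j in range(H.shape[1]):
        hists = [np.histogram(H[y == c, j], bins=edges)[0].astype(float)
                 for c in classes]
        pairs = [js_divergence(hists[a], hists[b])
                 for a in range(len(classes))
                 for b in range(a + 1, len(classes))]
        scores[j] = np.mean(pairs) if pairs else 0.0
    return scores


def select_units(scores, keep):
    """Indices of the `keep` highest-scoring hidden units, in ascending order."""
    return np.sort(np.argsort(scores)[::-1][:keep])


# Toy demonstration on synthetic data (not one of the paper's datasets).
rng = np.random.default_rng(0)
X = rng.random((60, 8))                 # 60 samples, 8 input features
y = rng.integers(0, 3, size=60)         # 3 synthetic class labels
W = rng.normal(size=(8, 16))            # random weights, NOT a trained RBM
b = np.zeros(16)

H = hidden_activations(X, W, b)
kept = select_units(unit_scores(H, y), keep=4)
H_pruned = H[:, kept]                   # reduced features for a downstream classifier
print("kept hidden units:", kept)
```

In this sketch the pruned activations `H_pruned` play the role of the selected features that would feed the classifier system; a real experiment would train the hidden layer first and choose the number of retained units by validation.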