Data preparation using data quality matrices for classification mining |
| |
Authors: | Ian Davidson, Giri Tayi |
| |
Affiliation: | 1. Department of Computer Science, University of California, Davis, CA, USA; 2. School of Business, State University of New York, Albany, NY, USA |
| |
Abstract: | Data mining aims to find patterns in organizational databases. However, most mining techniques do not take into account knowledge of the quality of the database. In this work, we show how to incorporate into classification mining recent advances in the data quality field that view a database as the product of an imprecise manufacturing process whose flaws and defects are captured in quality matrices. We develop a general-purpose method of incorporating data quality matrices into the data mining classification task. Our work differs from existing data preparation techniques: whereas other approaches detect and fix errors to ensure consistency with the entire data set, our work makes use of a priori knowledge of how the data is produced or manufactured. |
| |
Keywords: | Data manufacturing; Data quality; Data preparation; Application of data mining |
This article is indexed in ScienceDirect and other databases. |
|