SAR modeling of unbalanced data sets
Authors:H. S. Rosenkranz  A. R. Cunningham
Affiliation:Department of Environmental and Occupational Health, Graduate School of Public Health, University of Pittsburgh, 111 Parran Hall, 130 DeSoto Street, Pittsburgh, PA, 15261, USA
Abstract:

The increased acceptance of SAR approaches to hazard identification has led us to investigate methods to improve the predictive performance of SAR models. In the present study we demonstrate that, although on theoretical grounds the ratio of active to inactive chemicals in the learning set should be unity, SAR models can 'tolerate' an unbalanced range of ratios from 3 : 1 (i.e., 75% actives) to 1 : 2 (i.e., 33% actives) and still perform adequately. On the other hand, SAR models derived from learning sets with ratios in excess of 4 : 1 (80% actives) do not perform satisfactorily, even when corrected for the initial ratio.
Keywords:Unbalanced data  SAR  CASE/MULTICASE  Optimum models
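
The ratios quoted in the abstract can be converted to the percentage of actives with simple arithmetic. The following is a minimal sketch of that conversion only; the function names are hypothetical and are not part of CASE/MULTICASE or of the authors' method. The range check covers just the 1 : 2 to 3 : 1 interval the abstract reports as tolerated; the abstract does not say how models behave between 3 : 1 and 4 : 1.

```python
from fractions import Fraction


def active_fraction(actives: int, inactives: int) -> Fraction:
    """Fraction of actives in a learning set with ratio actives : inactives."""
    return Fraction(actives, actives + inactives)


def in_reported_tolerated_range(actives: int, inactives: int) -> bool:
    """True if the ratio lies between 1 : 2 and 3 : 1 inclusive.

    Illustrative helper: these bounds are the range the abstract reports
    as still giving adequate performance, nothing more.
    """
    frac = active_fraction(actives, inactives)
    return Fraction(1, 3) <= frac <= Fraction(3, 4)


if __name__ == "__main__":
    # Ratios mentioned in the abstract, plus the balanced 1 : 1 case.
    for a, i in [(3, 1), (1, 1), (1, 2), (4, 1)]:
        frac = active_fraction(a, i)
        print(f"{a} : {i} = {float(frac):.0%} actives, "
              f"in reported range: {in_reported_tolerated_range(a, i)}")
```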