Abstract

Studies on drug design datasets continue to grow in number. These datasets are generally considered hard to model because they have a large number of features and a small number of samples. The most common problems in the drug design area are regression problems. Committee machines (ensembles) have become popular in machine learning because of their high performance. In this study, the dynamics of ensembles on regression-related drug design problems are investigated on a large collection of datasets. The study aims to determine the most successful ensemble algorithm, the base algorithm-ensemble pairs with the best and worst results, the most successful single algorithm, and the similarities among algorithms according to their performance. We also discuss whether ensembles always generate better results than single algorithms.

Date of this Version

9-2009
