Ensemble Methods for Top-N Recommendation

Ziwei Fan, Purdue University


As the amount of available information grows, so does the need to filter out irrelevant information efficiently and retain what is relevant or of interest to people. Recommender systems address this need: they are information filtering systems that predict a user's preference for an item and, based on users' historical data, make relevant recommendations. Because of their usefulness, recommender systems have been widely used in many applications, including e-commerce and healthcare information systems. However, existing recommender systems suffer from several issues, including data sparsity and user/item heterogeneity. In this thesis, a hybrid dynamic and multi-collaborative filtering based recommendation technique has been developed to recommend search terms to physicians as they review information on a large number of patients. In addition, a local sparse linear method ensemble has been developed to tackle the issues of data sparsity and user/item heterogeneity.

In health information technology systems, most physicians suffer from information overload when they review patient information. A novel hybrid dynamic and multi-collaborative filtering method has been developed to improve information retrieval from electronic health records. It addresses the problem of recommending the next search term to a physician while the physician is searching for information about a patient. The method combines a first-order Markov chain with multi-collaborative filtering methods, for which I have developed physician-patient collaborative filtering and transition-involved collaborative filtering. The developed method is tested using electronic health record data from the Indiana Network for Patient Care.
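To make the first-order Markov chain component concrete, the following is a minimal sketch, not the thesis implementation. It assumes that each search session is an ordered list of terms and that the next term is ranked only by how often it followed the current term in past sessions. The function names and the toy sessions are hypothetical, and the collaborative filtering components are not modeled here.

```python
from collections import Counter, defaultdict

def fit_transitions(sessions):
    """Count term-to-next-term transitions across search sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each term with the term that immediately follows it.
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return counts

def recommend_next(counts, current_term, n=5):
    """Return up to n (term, probability) pairs ranked by estimated P(next | current)."""
    following = counts.get(current_term)
    if not following:
        return []  # no history for this term
    total = sum(following.values())
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(following.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, count / total) for term, count in ranked[:n]]

# Hypothetical toy sessions, for illustration only.
sessions = [
    ["diabetes", "insulin", "a1c"],
    ["diabetes", "a1c"],
    ["diabetes", "insulin", "glucose"],
]
model = fit_transitions(sessions)
top = recommend_next(model, "diabetes", n=2)
```

In this toy data, "insulin" follows "diabetes" in two of three sessions, so it is ranked first. A hybrid method of the kind described above would blend such transition scores with collaborative filtering scores before ranking.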
The experimental results demonstrate that for 46.7% of test cases, the hybrid method correctly ranks the information that physicians are truly interested in among its top-5 recommendations.

The local sparse linear model ensemble has been developed to tackle both the data sparsity and the user/item heterogeneity issues in top-N recommendation. Multiple local sparse linear models are learned for all the users and items in the system. I have developed similarity-based and popularity-based methods to determine the local training data for each local model. Each local model is trained with the Sparse Linear Method (SLIM), a powerful technique for top-N recommendation. The learned local models are then combined to produce top-N recommendations, using the model results combination and model combination methods I have developed. The developed methods are tested on a benchmark dataset and its sparsified versions. The experiments demonstrate an 18.4% improvement from such ensemble models, particularly on sparse datasets.
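One way to picture combining local models at the level of their results is sketched below. This is an illustration under stated assumptions, not the thesis method: each local model is assumed to be an item-item weight matrix of the kind SLIM learns (non-negative, zero diagonal), the combination is assumed to be a plain average of per-model scores, and the matrices, user vector, and function names are all hypothetical.

```python
def score_items(user_row, weights):
    """Score each item as a weighted sum of the user's past interactions."""
    n_items = len(user_row)
    return [sum(user_row[i] * weights[i][j] for i in range(n_items))
            for j in range(n_items)]

def ensemble_top_n(user_row, local_models, n):
    """Average the local models' scores, then rank items the user has not seen."""
    per_model = [score_items(user_row, w) for w in local_models]
    avg = [sum(col) / len(per_model) for col in zip(*per_model)]
    unseen = [j for j, seen in enumerate(user_row) if not seen]
    # Highest average score first; ties broken by item index.
    return sorted(unseen, key=lambda j: (-avg[j], j))[:n]

# Two hypothetical local item-item weight matrices (assumed, not learned here).
model_a = [
    [0.0, 0.2, 0.6, 0.0],
    [0.1, 0.0, 0.3, 0.4],
    [0.5, 0.2, 0.0, 0.1],
    [0.0, 0.3, 0.1, 0.0],
]
model_b = [
    [0.0, 0.1, 0.2, 0.5],
    [0.2, 0.0, 0.1, 0.6],
    [0.3, 0.1, 0.0, 0.2],
    [0.4, 0.2, 0.1, 0.0],
]
user = [1, 1, 0, 0]  # the user has interacted with items 0 and 1
recommended = ensemble_top_n(user, [model_a, model_b], n=2)
```

Here model_a alone would rank item 2 first, while the averaged scores rank item 3 first, which shows how an ensemble can change the final top-N list. Learning the local matrices and choosing each model's local training data are outside the scope of this sketch.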




Ning, Purdue University.

Subject Area

Computer science
