A Weighted Curve Fitting Method for Result Merging in Federated Search
Abstract
Result merging is an important step in federated search to merge the documents returned from multiple source-specific ranked lists for a user query. Previous result merging methods such as Semi-Supervised Learning (SSL) and Sample- Agglomerate Fitting Estimate (SAFE) use regression methods to estimate global document scores from document ranks in individual ranked lists. SSL relies on overlapping documents that exist in both individual ranked lists and a centralized sample database. SAFE goes a step further by using both overlapping documents with accurate rank information and documents with estimated rank information for regression. However, existing methods do not distinguish the accurate rank information from the estimated information. Furthermore, all documents are assigned equal weights in regression while intuitively, documents in the top should carry higher weights. This paper proposes a weighted curve fitting method for result merging in federated search. The new method explicitly models the importance of information from overlapping documents over non-overlapping ones. It also weights documents at different positions differently. Empirically results on two datasets clearly demonstrate the advantage of the proposed algorithm.
Keywords
curve fitting, federated search, result merging, retrieval models
Date of this Version
2011
DOI
10.1145/2009916.2010107
Comments
SIGIR '11 Proceedings of the 34th international ACM SIGIR conference on Research and development in Information