Sentiment detection with auxiliary data
As an important application in text mining and social media, sentiment detection has aroused more and more research interests, due to the expanding volume of available online information such as microblogging messages and review comments. Many machine learning methods have been proposed for sentiment detection. As a branch of machine learning, transfer learning is an important technique that tries to transfer knowledge from one domain to another one. When applied to sentiment detection, existing transfer learning methods employ articles with human labeled sentiments from other domains to help the sentiment detection on a target domain. Although most existing transfer learning methods are devoted to handle the data distribution difference between different domains, they only resort to some approximation methods, which may introduce some unnecessary biases. Furthermore, the popular assumption of existing transfer learning techniques on conditional probability is often too strong for practical applications. In this paper, we propose a novel method to model the distribution difference between different domains in sentiment detection by directly modeling the underlying joint distributions for different domains. Some of the important properties of the proposed method, such as the convergence rate and time complexity, are analyzed. The experimental results on the product review dataset and the twitter dataset demonstrate the advantages of the proposed method over the state-of-the-art methods.
text mining, microblogging, machine learning, transfer learning, twitter datasetm, conditional probability
Date of this Version