Co-training for Semi-supervised Sentiment Classification Based on Dual-view Bags-of-words Representation

Rui Xia, Cheng Wang, Xin-Yu Dai, Tao Li


Abstract

A review text is normally represented as a bag-of-words (BOW) in sentiment classification. Such a simplified BOW model has fundamental deficiencies in modeling some complex linguistic phenomena such as negation. In this work, we propose a dual-view co-training algorithm based on dual-view BOW representation for semi-supervised sentiment classification. In dual-view BOW, we automatically construct antonymous reviews and model a review text by a pair of bags-of-words with opposite views. We make use of the original and antonymous views in pairs, in the training, bootstrapping and testing process, all based on a joint observation of two views. The experimental results demonstrate the advantages of our approach, in meeting the two co-training requirements, addressing the negation problem, and enhancing the semi-supervised sentiment classification efficiency.