Learning Topic Hierarchies for Wikipedia Categories

Linmei Hu, Xuzhong Wang, Mengdi Zhang, Juanzi Li, Xiaoli Li, Chao Shao, Jie Tang, Yongbin Liu


Abstract

Existing studies have used Wikipedia for various knowledge acquisition tasks. However, no attempts have been made to explore the multi-level topic knowledge contained in the Contents tables of Wikipedia articles. Articles on similar subjects are grouped into Wikipedia categories. In this work, we propose novel methods to automatically construct comprehensive topic hierarchies for given categories, based on the structured Contents tables and the corresponding unstructured text descriptions. Such a hierarchy is important for information browsing, document organization, and topic prediction. Experimental results show that our proposed approach, which incorporates both structural and textual information, produces high-quality topic hierarchies.