Negation and Speculation Identification in Chinese Language

Bowei Zou, Qiaoming Zhu, Guodong Zhou


Abstract

Identifying negative or speculative narrative fragments from fact is crucial for natural language processing (NLP) applications. Previous studies on negation and speculation identification in Chinese language suffers much from two problems: corpus scarcity and the bottleneck in fundamental Chinese information processing. To resolve these problems, this paper constructs a Chinese corpus which consists of three sub-corpora from different resources. In order to detect the negative and speculative cues, a sequence labeling model is proposed. Moreover, a bilingual cue expansion method is proposed to increase the coverage in cue detection. In addition, this paper presents a new syntactic structure-based framework to identify the linguistic scope of a cue, instead of the traditional chunking-based frame-work. Experimental results justify the usefulness of our Chinese corpus and the appropriateness of our syntactic structure-based framework which obtained significant improvement over the state-of-the-art on negation and speculation identification in Chinese language.