Unsupervised extractive summarization via coverage maximization with syntactic and semantic concepts

Natalie Schluter and Anders Søgaard


Abstract

Coverage maximization with bigram concepts is a state-of-the-art approach to unsupervised extractive summarization. It has been argued that bigram concepts are adequate and, unlike more linguistically informed concepts such as named entities or syntactic dependencies, more robust, since they do not rely on automatic processing. In this paper, we show that, while this seems to hold for a commonly used newswire dataset, syntactic and semantic concepts lead to significant performance improvements in other domains.
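
To make the notion of coverage maximization with bigram concepts concrete, the following is a minimal illustrative sketch, not the system evaluated in this paper. It assumes whitespace tokenization, weights each bigram by the number of input sentences containing it, and substitutes a simple greedy selection for exact optimization. The names `bigrams` and `greedy_coverage_summary` are hypothetical.

```python
from collections import Counter


def bigrams(sentence):
    """Return the set of lowercased word bigrams (whitespace tokenization)."""
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


def greedy_coverage_summary(sentences, budget):
    """Greedily select sentences covering the most concept weight per word.

    `budget` is the maximum summary length in words. Greedy selection is an
    approximation chosen for brevity; it is not guaranteed to be optimal.
    """
    concept_sets = [bigrams(s) for s in sentences]
    # Assumption: a concept's weight is the number of sentences it occurs in.
    weights = Counter(c for concepts in concept_sets for c in concepts)
    covered, chosen, used = set(), [], 0
    while True:
        best, best_gain = None, 0.0
        for i, sentence in enumerate(sentences):
            length = len(sentence.split())
            if i in chosen or length == 0 or used + length > budget:
                continue
            # Only concepts not yet covered contribute to the gain.
            new_concepts = concept_sets[i] - covered
            gain = sum(weights[c] for c in new_concepts) / length
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            break
        chosen.append(best)
        covered |= concept_sets[best]
        used += len(sentences[best].split())
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]
```

Replacing `bigrams` with a function that returns other concept types, such as named entities or dependency arcs produced by an external tool, would leave the selection procedure unchanged; only the definition of a concept varies.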