Parse Imputation for Dependency Annotations

Jason Mielens, Liang Sun, Jason Baldridge


Abstract

Syntactic annotation is a hard task, but it can be made easier by giving annotators the flexibility to leave many aspects of a sentence underspecified. Unfortunately, partial annotations are not directly usable for training parsers. We describe a method for imputing the missing dependencies in sentences that have been partially annotated using the Graph Fragment Language, so that a standard dependency parser can then be trained on all annotations. We show that this strategy improves performance over training without the partial annotations for English, Chinese, Portuguese, and Kinyarwanda, and that performance competitive with state-of-the-art unsupervised and weakly-supervised parsers can be reached with just a few hours of annotation.