Labeled Grammar Induction with Minimal Supervision

Yonatan Bisk, Christos Christodoulopoulos, Julia Hockenmaier


Abstract

Nearly all work in unsupervised grammar induction aims to induce unlabeled dependency trees from gold part-of-speech-tagged text. These clean linguistic classes provide a very important, though unrealistic, inductive bias. Conversely, induced clusters are very noisy. We show here, for the first time, that very limited human supervision (three frequent words per cluster) may suffice to induce labeled dependencies from automatically induced word clusters.
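The form of supervision described above, a handful of frequent words per cluster, can be sketched as a simple cluster-labeling step. The snippet below is purely illustrative and is not the paper's method: the seed words, labels, and cluster data are hypothetical, and it assigns each induced cluster the label whose seed words overlap it most.

```python
# Illustrative sketch (not the authors' algorithm): label noisy induced
# word clusters using a tiny seed lexicon of three frequent words per label.
from collections import Counter

# Hypothetical seed lexicon: three frequent words per syntactic label.
SEEDS = {
    "DET":  {"the", "a", "an"},
    "VERB": {"is", "was", "said"},
    "NOUN": {"time", "year", "people"},
}

def label_cluster(cluster_words):
    """Assign the label whose seed words cover the most cluster mass.

    cluster_words: Counter mapping word -> frequency within the cluster.
    Returns the best-matching label, or None if no seed word occurs.
    """
    scores = {
        label: sum(cluster_words[w] for w in seeds if w in cluster_words)
        for label, seeds in SEEDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

# Example: a noisy induced cluster dominated by determiners.
cluster = Counter({"the": 120, "a": 80, "this": 30, "time": 2})
print(label_cluster(cluster))  # -> DET
```

Once every cluster carries a label, the induced clusters can stand in for gold part-of-speech tags as input to a labeled dependency induction system.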