Low-Rank Regularization for Sparse Conjunctive Feature Spaces: An Application to Named Entity Classification

Audi Primadhanty, Xavier Carreras, Ariadna Quattoni


Abstract

Entity classification, like many other important problems in NLP, involves learning classifiers over sparse high-dimensional feature spaces that result from the conjunction of elementary features of the entity mention and its context. In this paper we develop a spectral regularization framework for training max-entropy models in such sparse conjunctive feature spaces. Our approach handles conjunctive feature spaces using matrices and induces an implicit low-dimensional representation via low-rank constraints. We show that when learning entity classifiers under minimal supervision, using a seed set, our approach is more effective in controlling model capacity than standard techniques for linear classifiers.
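
The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the text: the elementary features of the mention and of the context are given as two dense vectors (the hypothetical `phi_entity` and `phi_context`), each class has one weight matrix `W`, and the low-rank constraint is relaxed to a nuclear-norm penalty handled by singular value thresholding. The sketch shows that a bilinear score equals a linear score over all pairwise feature conjunctions, and that shrinking the singular values of `W` lowers its rank.

```python
import numpy as np


def bilinear_score(W, phi_entity, phi_context):
    """Class score phi_entity^T W phi_context.

    This equals a linear model over the outer product of the two
    feature vectors, i.e. over every entity/context feature conjunction.
    """
    return phi_entity @ W @ phi_context


def svd_threshold(W, tau):
    """Soft-threshold the singular values of W by tau.

    This is the proximal operator of tau times the nuclear norm
    (an assumed choice of spectral regularizer for this sketch).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt


# Hypothetical toy data: 5 entity features, 7 context features.
rng = np.random.default_rng(0)
phi_entity = rng.random(5)
phi_context = rng.random(7)
W = rng.standard_normal((5, 7))

# The bilinear score matches a linear score over explicit conjunctions.
conjunctions = np.outer(phi_entity, phi_context)
score_bilinear = bilinear_score(W, phi_entity, phi_context)
score_linear = np.sum(W * conjunctions)

# Shrinking singular values yields a matrix of equal or lower rank.
W_low = svd_threshold(W, tau=1.0)
rank_before = np.linalg.matrix_rank(W)
rank_after = np.linalg.matrix_rank(W_low)
```

In a training loop, a step such as `svd_threshold` would typically follow each gradient update on the max-entropy loss, so that model capacity is controlled through the spectrum of `W` rather than through the individual conjunction weights.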