Online Multitask Learning for Machine Translation Quality Estimation

José G. C. de Souza, Matteo Negri, Elisa Ricci, Marco Turchi


Abstract

We present a method for predicting machine translation output quality geared to the needs of computer-assisted translation. These include the capability to: i) continuously learn and self-adapt to a stream of data coming from multiple translation jobs, ii) react to data diversity by exploiting human feedback, and iii) leverage data similarity by learning and transferring knowledge across domains. To achieve these goals, we combine two supervised machine learning paradigms, online and multitask learning, adapting and unifying them in a single framework. We show the effectiveness of our approach in a regression task (HTER prediction), in which online multitask learning outperforms the competitive online single-task and pooling methods used for comparison. This indicates the feasibility of integrating in a CAT tool a single QE component capable to simultaneously serve (and continuously learn from) multiple translation jobs involving different domains and users.