[MTH18a] Revisiting Multi-Task Learning with ROCK: a Deep Residual Auxiliary Block for Visual Detection
International peer-reviewed conference:
Advances in Neural Information Processing Systems (NIPS),
December 2018,
Canada,
Keywords: Multi-task learning, deep learning, primary task, residual block
Abstract:
Multi-Task Learning (MTL) is appealing for deep learning regularization. In this
paper, we tackle a specific MTL context denoted as
primary MTL, where the ultimate goal is to improve the performance of a given primary task by leveraging
several other auxiliary tasks. Our main methodological contribution is to introduce
ROCK, a new generic multi-modal fusion block for deep learning tailored to the
primary MTL context. The ROCK architecture is based on a residual connection, which
makes the forward prediction explicitly impacted by the intermediate auxiliary
representations. The auxiliary predictor’s architecture is also specifically designed for
our primary MTL context, incorporating intensive pooling operators to maximize
the complementarity of intermediate representations. Extensive experiments
on the NYUv2 dataset (object detection with scene classification, depth prediction,
and surface normal estimation as auxiliary tasks) validate the relevance of the
approach and its superiority over flat MTL approaches. Our method outperforms
state-of-the-art object detection models on NYUv2 by a large margin, and is also
able to handle large-scale heterogeneous inputs (real and synthetic images) with
missing annotation modalities.
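
The residual fusion idea described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`conv1x1`, `residual_aux_block`), the dictionary keys (`encode`, `predict`, `decode`), and the use of global average pooling as a stand-in for the paper's pooling operators are all assumptions made for illustration. Each auxiliary head encodes the shared feature map into a task-specific representation, pools it to produce an auxiliary prediction, then decodes it back and adds it to the shared features through the residual connection.

```python
def conv1x1(weights, fmap):
    """Mix channels independently at each spatial position.

    weights: list of output rows, each with one weight per input channel.
    fmap: list of channels, each a flat list of spatial values.
    """
    n_pos = len(fmap[0])
    return [
        [sum(w * fmap[c][p] for c, w in enumerate(row)) for p in range(n_pos)]
        for row in weights
    ]


def linear(weights, vector):
    """Dense layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def residual_aux_block(fmap, heads):
    """Hypothetical residual auxiliary block (illustrative only).

    For every auxiliary head:
      1. encode the shared features into a task-specific map,
      2. pool over all positions and predict the auxiliary output,
      3. decode the task-specific map back to the shared channel count
         and add it to the shared features (residual connection).
    Returns the refined feature map and the list of auxiliary predictions.
    """
    refined = [list(channel) for channel in fmap]  # copy, input stays intact
    predictions = []
    for head in heads:
        aux = conv1x1(head["encode"], fmap)
        # Assumed pooling: global average over spatial positions.
        pooled = [sum(channel) / len(channel) for channel in aux]
        predictions.append(linear(head["predict"], pooled))
        back = conv1x1(head["decode"], aux)
        for c, channel in enumerate(back):
            for p, value in enumerate(channel):
                refined[c][p] += value
    return refined, predictions


# Toy usage: 2 channels, 2 spatial positions, one auxiliary head.
identity = [[1.0, 0.0], [0.0, 1.0]]
toy_features = [[1.0, 3.0], [2.0, 4.0]]
toy_head = {"encode": identity, "predict": [[1.0, 1.0]], "decode": identity}
toy_refined, toy_predictions = residual_aux_block(toy_features, [toy_head])
```

In this sketch the refined features, not the raw shared features, would be passed to the primary-task predictor, so the auxiliary representations influence the primary prediction in the forward pass rather than only through shared gradients.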