Robust Mixed Linear Regression using heterogeneous batches

Presented at University of Washington, Seattle, WA, USA, 2020

By exploiting the similarities across tasks, one can hope to overcome data scarcity. Under a canonical scenario where each task is drawn from a mixture of \(k\) linear regressions, we study a fundamental question: can abundant small-data tasks compensate for the lack of big-data tasks? This work introduces a spectral approach that is simultaneously robust to data scarcity and to outlier tasks. We design an outlier-robust principal component analysis algorithm that achieves optimal accuracy, followed by a sum-of-squares algorithm that exploits the information in higher-order moments. Together, this approach is robust against outliers and achieves a graceful statistical trade-off between the number of tasks and the amount of data per task.
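To illustrate the spectral idea on a toy instance (this is a simplified sketch, not the algorithm from the papers, and it omits the outlier-robustness and sum-of-squares components): for isotropic Gaussian covariates, \(\mathbb{E}[y\,x] = \beta\), so even a tiny batch yields a noisy per-task estimate of that task's regression vector, and averaging the outer products of these estimates across many tasks reveals the span of the \(k\) true regression vectors via PCA. All dimensions and sample sizes below are made-up illustration parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 20, 3              # ambient dimension, number of mixture components
n_tasks, batch = 5000, 5  # abundant tasks, each with only a few samples

# Ground-truth regression vectors, one per mixture component
betas = rng.normal(size=(k, d))

def sample_task():
    # Each task draws one mixture component, then a small batch of (x, y)
    beta = betas[rng.integers(k)]
    X = rng.normal(size=(batch, d))  # isotropic Gaussian covariates
    y = X @ beta + 0.1 * rng.normal(size=batch)
    return X, y

# Second-moment spectral step: m_t = mean(y_i * x_i) is a noisy copy of
# that task's beta, so PCA on the averaged outer products m_t m_t^T
# recovers the span of {beta_1, ..., beta_k}.
M = np.zeros((d, d))
for _ in range(n_tasks):
    X, y = sample_task()
    m = (y[:, None] * X).mean(axis=0)
    M += np.outer(m, m) / n_tasks

eigvals, eigvecs = np.linalg.eigh(M)
U = eigvecs[:, -k:]  # top-k eigenvectors span the estimated subspace

# Each true beta should lie (almost) inside the estimated span
for beta in betas:
    proj = U @ (U.T @ beta)
    print(np.linalg.norm(proj - beta) / np.linalg.norm(beta))
```

Note that the per-task estimates here are far too noisy to cluster tasks individually; it is only the averaging over many small batches that makes the shared subspace identifiable, which is the sense in which abundant small-data tasks compensate for missing big-data tasks.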

This project resulted in two publications:

  1. Meta-learning for Mixed Linear Regression at ICML 2020, and
  2. Robust Meta-learning for Mixed Linear Regression with Small Batches at NeurIPS 2020.
