Asymptotic Analysis of Machine Learning Models: Comparison Theorems and Universality

Abstract: The study of Machine Learning models in asymptotic regimes has provided insight into many of their properties, yet it seemingly contradicts classical statistical wisdom. To resolve this apparent contradiction, this thesis analyzes models such as the LASSO and random features regression in the regime where the number of data points and the number of model parameters grow to infinity at a fixed ratio. It characterizes the asymptotic behavior of these problems, including their learning curves: the predicted training and generalization error as a function of the degree of overparameterization. The papers in this thesis focus in particular on Gaussian comparison theorems as a methodological tool for analyzing these problems. The Convex Gaussian Min-Max Theorem (CGMT), in particular, allows us to study complex ML optimization problems by considering alternative problems that are simpler to analyze but share the same asymptotic properties. Secondarily, this thesis considers universality, which in the asymptotic context shows that many statistics of ML models are fully determined by low-order statistical moments. This allows us to study surrogate Gaussian models that match these moments; the surrogates can subsequently be analyzed by means of the Gaussian comparison theorems.
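
For context, a standard statement of the CGMT from the literature (a sketch in our own notation, not a result specific to this thesis) compares a primary min-max optimization (PO) over a Gaussian matrix with an auxiliary optimization (AO) over two Gaussian vectors:

```latex
% Standard CGMT setup (sketch): compare the PO over a Gaussian matrix G
% with the AO over two independent Gaussian vectors g and h.
\begin{align*}
  \text{(PO)}\quad \Phi(G)    &= \min_{w \in S_w} \max_{u \in S_u} \; u^\top G w + \psi(w, u),
    && G_{ij} \stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(0,1), \\
  \text{(AO)}\quad \phi(g, h) &= \min_{w \in S_w} \max_{u \in S_u} \; \lVert w \rVert_2 \, g^\top u + \lVert u \rVert_2 \, h^\top w + \psi(w, u),
    && g \sim \mathcal{N}(0, I_n),\; h \sim \mathcal{N}(0, I_d).
\end{align*}
% For compact S_w, S_u:  P(\Phi(G) < t) \le 2\,P(\phi(g,h) \le t) for all t;
% if moreover S_w, S_u are convex and \psi is convex in w and concave in u,
% also  P(\Phi(G) > t) \le 2\,P(\phi(g,h) \ge t),
% so \Phi(G) concentrates wherever the simpler \phi(g,h) does.
```

The point of the comparison is that the AO replaces the random matrix by two random vectors, which is what makes a scalar, fixed-point characterization of quantities such as training and generalization error tractable.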
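To make the universality claim concrete, here is a minimal numerical sketch (illustrative only, not taken from the dissertation): a regression model trained on non-Gaussian features attains approximately the same test error as one trained on a Gaussian surrogate matching the first two moments of the features. Ridge regression is used here instead of the LASSO only because it admits a closed-form solution; all dimensions and constants are arbitrary choices.

```python
# Illustrative sketch of moment-matching universality (not from the thesis):
# in the proportional regime, the test error of ridge regression is nearly
# identical for non-Gaussian features and a moment-matched Gaussian surrogate.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 400, 200, 0.1               # samples, parameters, ridge penalty
w_star = rng.standard_normal(d) / np.sqrt(d)  # planted signal

def test_error(sample_features, n_test=4000):
    # Train: solve argmin_w ||y - Xw||^2 / n + lam * ||w||^2 in closed form.
    X = sample_features(n, d)
    y = X @ w_star + 0.5 * rng.standard_normal(n)
    w_hat = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    # Evaluate on fresh data from the same feature distribution.
    X_te = sample_features(n_test, d)
    y_te = X_te @ w_star + 0.5 * rng.standard_normal(n_test)
    return np.mean((X_te @ w_hat - y_te) ** 2)

# Non-Gaussian features: Rademacher entries (mean 0, variance 1).
rademacher = lambda n, d: rng.choice([-1.0, 1.0], size=(n, d))
# Gaussian surrogate matching the first two moments of the entries.
gaussian = lambda n, d: rng.standard_normal((n, d))

print("Rademacher features:", test_error(rademacher))
print("Gaussian surrogate :", test_error(gaussian))
# The two errors concentrate around the same value as n, d grow at fixed n/d.
```

Once the surrogate Gaussian model is in place, tools such as the CGMT above apply to it directly, which is the two-step strategy the abstract describes.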
