Speaker
Dr. Roberto Molinari, Assistant Professor, Department of Mathematics and Statistics, Auburn University
Title
Statistics Seminar Series
Subtitle
SWAG: A Sparse Wrapper Algorithm for Multi-Model Inference
Physical Location
Allen 411
Abstract
The task of modelling data for interpretation and/or prediction usually focuses on selecting the single best model (learner) that fits the data and accurately predicts future data. In many cases, though, this approach has led either to an increased use of black-box machine-learning mechanisms, which are hard for practitioners to interpret (and can be prone to overfitting), or to model and variable interpretations that vary between analysts even when they use the same data. This has resulted in the so-called “replication crisis” where, in different domains, findings from different studies appear to contradict each other. To address this problem and improve interpretation, a new line of research fosters the use of many models (so-called “Rashomon Sets”). We present a heuristic algorithm, developed in parallel to this direction of research, that aims to select a set of highly predictive models that can (i) deliver new ways of interpreting data, (ii) allow practitioners to pick the highly predictive model that best suits their needs, (iii) deliver new forms of inference when there is more than one true model, and (iv) provide a method to eventually improve ensemble learning methods. In particular, we discuss how this new approach can be useful in the domain of medicine and, more specifically, in genomics.