Abstract
We propose a residual randomization procedure for robust inference using Lasso estimates in the high-dimensional setting. Whereas earlier work focuses on sub-Gaussian errors, the proposed procedure is designed to work robustly in settings that also include heavy-tailed covariates and errors. Moreover, our procedure can be valid under clustered errors, which are important in practice but have been largely overlooked by earlier work. Through extensive simulations, we illustrate our method’s wider range of applicability, as suggested by theory. In particular, we show that our method outperforms state-of-the-art methods in challenging, yet more realistic, settings where the distribution of covariates is heavy-tailed or the sample size is small, while it remains competitive in the standard, ‘well-behaved’ settings previously studied in the literature.
Publication
Proceedings of the 38th International Conference on Machine Learning (ICML) 2021
Post-doc (2018-2021)
Sam Wang is currently an assistant professor at Cornell University. He was a post-doc at the University of Chicago Booth School of Business, advised by Mladen Kolar. His theoretical and methodological research interests focus on causal discovery, graphical models, and high-dimensional statistical methods; his applied research interests lie in the social sciences. He completed his PhD in 2018 at the University of Washington, advised by Mathias Drton, and received his undergraduate degree from Rice University. Before his PhD studies, he worked in management consulting.
RP (2019-2021)
Sky is a PhD student at Yale.
Panagiotis (Panos) Toulis studies causal inference in complex settings (e.g., networks) using methods of structured inference, such as permutation tests. He is also interested in the interface of statistics and optimization, particularly in inference problems on large data sets through stochastic gradient descent.
Associate Professor of Econometrics and Statistics
Mladen Kolar is an Associate Professor of Econometrics and Statistics at the University of Chicago Booth School of Business. His research focuses on high-dimensional statistical methods, graphical models, varying-coefficient models, and data mining, driven by the need to uncover interesting and scientifically meaningful structures from observational data.