Local AdaGrad-type algorithm for stochastic convex-concave optimization


Large scale convex-concave minimax problems arise in numerous applications, including game theory, robust training, and training of generative adversarial networks. Despite their wide applicability, solving such problems efciently and efectively is challenging in the presence of large amounts of data using existing stochastic minimax methods. We study a class of stochastic minimax methods and develop a communication-efcient distributed stochastic extragradient algorithm, LocalAdaSEG, with an adaptive learning rate suitable for solving convex-concave minimax problems in the Parameter-Server model. LocalAdaSEG has three main features: (1) a periodic communication strategy that reduces the communication cost between workers and the server; (2) an adaptive learning rate that is computed locally and allows for tuning-free implementation; and (3) theoretically, a nearly linear speed-up with respect to the dominant variance term, arising from the estimation of the stochastic gradient, is proven in both the smooth and nonsmooth convex-concave settings. LocalAdaSEG is used to solve a stochastic bilinear game, and train a generative adversarial network. We compare LocalAdaSEG against several existing optimizers for minimax problems and demonstrate its efcacy through several experiments in both homogeneous and heterogeneous settings.

Machine Learning
Luofeng Liao
Luofeng Liao
MS Student (2020-2021)

Prior to graduate school, he received B.S. degree in Computer Science at Fudan University in June 2019. His research interests include high-dimensional statistis and distributed optimization. Luofeng continued his education as a PhD student at Columbia University. His personal website can be found here.

Mladen Kolar
Mladen Kolar
Associate Professor of Econometrics and Statistics

Mladen Kolar is an Associate Professor of Econometrics and Statistics at the University of Chicago Booth School of Business. His research is focused on high-dimensional statistical methods, graphical models, varying-coefficient models and data mining, driven by the need to uncover interesting and scientifically meaningful structures from observational data.