Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator

Yuwei Luo, Zhuoran Yang, Zhaoran Wang, Mladen Kolar

December 2019

Abstract

Multi-agent reinforcement learning has been successfully applied to a number of challenging problems. Despite these empirical successes, theoretical understanding of different algorithms is lacking, primarily due to the curse of dimensionality caused by the exponential growth of the state-action space with the number of agents. We study the fundamental problem of the multi-agent linear quadratic regulator in a setting where the agents are partially exchangeable. In this setting, we develop a hierarchical actor-critic algorithm, whose computational complexity is independent of the total number of agents, and prove its global linear convergence to the optimal policy. As linear quadratic regulators are often used to approximate general dynamic systems, this paper provides an important step towards a better understanding of general hierarchical mean-field multi-agent reinforcement learning.

Type

Preprint

Publication

Technical report

Yuwei Luo

MS Student (2019-2020)

Mladen Kolar

Associate Professor of Econometrics and Statistics
