Speaker
Dr. Chuangchuang Sun, Assistant Professor, Department of Aerospace Engineering, Mississippi State University
Title
Statistics Seminar Series (Hybrid)
Subtitle
Multiagent Reinforcement Learning under Nonstationarity
Physical Location
Allen 15
Digital Location
https://msstate.webex.com/msstate/j.php?MTID=m8ce377b82dde20f05e835b18379eff44
Abstract: A fundamental challenge in multiagent reinforcement learning is to learn beneficial behaviors in a shared environment with other simultaneously learning agents. In particular, each agent perceives the environment as effectively nonstationary due to the changing policies of other agents. Moreover, each agent is itself constantly learning, which introduces nonstationarity in the distribution of experiences it encounters. Previous approaches suffer from myopic evaluation, considering only a finite number of policy updates. As a result, they can influence only transient future policies and fall short of the promise of scalable equilibrium selection approaches that influence behavior at convergence. In addition, nonstationarity can affect the transition and reward functions, which raises robustness concerns.
To address these issues, we propose a novel meta-multiagent policy gradient theorem that directly accounts for the nonstationary policy dynamics inherent to multiagent learning settings.
This is achieved by modeling our gradient updates to consider both an agent’s own nonstationary policy dynamics and those of other agents in the environment. Moreover, we propose a principled framework that considers the limiting policies of other agents as time approaches infinity, in order to influence long-term behavior. Lastly, we propose a minimax multiagent reinforcement learning (MARL) approach to infer the worst-case policy update of other agents, with the resulting problem solved via convex relaxation. We test our method on a diverse suite of multiagent benchmarks.