Statistics Seminar - 11/01/22

Nov 1 3:30 pm

Speaker

Dr. Chuangchuang Sun, Assistant Professor, Department of Aerospace Engineering, Mississippi State University

Title

Statistics Seminar Series (Hybrid)

Subtitle

Multiagent Reinforcement Learning under Nonstationarity

Physical Location

Allen 15

Digital Location

https://msstate.webex.com/msstate/j.php?MTID=m8ce377b82dde20f05e835b18379eff44

Begins

Nov 01, 2022 - 3:30 pm

Ends

Nov 01, 2022 - 4:30 pm

Abstract: A fundamental challenge in multiagent reinforcement learning is to learn beneficial behaviors in an environment shared with other simultaneously learning agents. In particular, each agent perceives the environment as effectively nonstationary because the other agents’ policies keep changing. Moreover, each agent is itself constantly learning, which makes the distribution of experiences it encounters naturally nonstationary. Previous approaches suffer from myopic evaluation: they consider only a finite number of policy updates. As a result, they can influence only transient future policies and fall short of the promise of scalable equilibrium selection approaches that influence behavior at convergence. In addition, this nonstationarity can affect the transition and reward functions, which in turn raises robustness issues.

To address these issues, we propose a novel meta-multiagent policy gradient theorem that directly accounts for the nonstationary policy dynamics inherent to multiagent learning settings. This is achieved by modeling our gradient updates to consider both an agent’s own nonstationary policy dynamics and those of the other agents in the environment. Moreover, we propose a principled framework that considers the limiting policies of other agents as time approaches infinity, so that an agent can influence their long-term behavior. Lastly, we propose a minimax multiagent reinforcement learning (MARL) approach that infers the worst-case policy update of other agents; the resulting problem is solved via convex relaxation. We test our method on a diverse suite of multiagent benchmarks.