Tags » Reinforcement Learning

An Empirical Study of AI Population Dynamics with Million-agent Reinforcement Learning

Yaodong Yang, Lantao Yu, Yiwei Bai, Jun Wang, Weinan Zhang, Ying Wen, Yong Yu

In this paper, we conduct an empirical study on discovering the ordered collective dynamics obtained by a population of artificial intelligence (AI) agents. 159 more words

Machine Learning Frontier

Autonomous Extracting a Hierarchical Structure of Tasks in Reinforcement Learning and Multi-task Reinforcement Learning

Behzad Ghazanfari, Matthew E. Taylor

Reinforcement learning (RL), while often powerful, can suffer from slow learning speeds, particularly in high-dimensional spaces. The autonomous decomposition of tasks and use of hierarchical methods hold the potential to significantly speed up learning in such domains. 80 more words

Machine Learning Frontier

Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D Worlds

Zhiyu Lin, Brent Harrison, Aaron Keech, Mark O. Riedl

We describe a method to use discrete human feedback to enhance the performance of deep learning agents in virtual three-dimensional environments by extending deep-reinforcement learning to model the confidence and consistency of human feedback. 93 more words

Machine Learning Frontier

Learning to Optimize with Reinforcement Learning

Ke Li

Since we posted our paper on “Learning to Optimize” last year, the area of optimizer learning has received growing attention. In this article, we provide an introduction to this line of work and share our perspective on the opportunities and challenges in this area. 348 more words

Machine Learning Frontier

Prosocial learning agents solve generalized Stag Hunts better than selfish ones

Alexander Peysakhovich, Adam Lerer

There is much interest in applying reinforcement learning methods to multi-agent systems. A popular way to do so is the method of reactive training – i.e. 161 more words

Machine Learning Frontier

Reinforcement Learning Toolbox

All algorithms are run in the PyCharm IDE; otherwise, package import errors may occur.

Implemented algorithms: TRPO, A3C

  • A3C: for continuous action spaces; uses multiple processes, but model saving has not been implemented.
  • 41 more words
Machine Learning Frontier