Keywords

Deep Reinforcement Learning, Deep Q-Learning

Select the category the research project fits.

Mathematical/Computational Sciences

Is this submission part of ICaP/PW (Introductory Composition at Purdue/Professional Writing)?

Yes

Abstract

Over the past decade, learning algorithms that play video games better than humans have become increasingly common. Google’s DeepMind Technologies developed learning algorithms that could play Atari video games and also demonstrated its well-known AlphaGo algorithm, which outperformed professional Go players. However, little research has been done on learning algorithms for particularly difficult single-player games. In particular, much further research could be done on learning algorithms for mechanically challenging games such as “bullet hell” games. We believe that agents can learn to evade obstacles efficiently using deep reinforcement learning. The purpose of this study is to understand how to create such an efficient evasion algorithm. The deep learning model is a convolutional neural network trained with a variant of the Q-learning algorithm. The model takes positional coordinate data and bullet location data as its input and outputs a value function used to select the best next action. The agent controls the directional inputs of the game’s user avatar, and these inputs, which are the agent’s actions, are modeled as a two-dimensional Markov decision process. The agent uses the game’s internal score and the length of time its user avatar avoids being hit by obstacles as its reward, and it experiments with its actions in each episode to maximize that reward. Each training episode ends and is reset when the user avatar is hit by a bullet.
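As a rough illustration of the training signal the abstract describes, the sketch below shows epsilon-greedy action selection, a reward built from score and survival, and the standard one-step Q-learning target. It is not the authors' implementation: the action list, the reward weights (survival_bonus, hit_penalty), the discount factor gamma, and all function names are assumptions made for the example, and plain lists of action values stand in for the output of the convolutional network.

```python
import random

# Hypothetical action set: the avatar's directional inputs plus "stay".
ACTIONS = ["up", "down", "left", "right", "stay"]


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over a list of action values.

    With probability epsilon the agent explores (random action index);
    otherwise it exploits the action with the highest estimated value.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def step_reward(score_gain, hit, survival_bonus=0.1, hit_penalty=1.0):
    """Assumed reward shaping, not taken from the study.

    Surviving a step earns the score gained plus a small bonus;
    being hit by a bullet earns a fixed penalty instead.
    """
    if hit:
        return -hit_penalty
    return score_gain + survival_bonus


def q_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max over next action values.

    When the episode has ended (the avatar was hit), there is no next
    state to bootstrap from, so the target is just the reward.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full training loop, the network's prediction for the chosen action would be regressed toward q_target, and the episode would restart whenever hit is true. For example, select_action([0.2, 0.5, -0.1, 0.0, 0.3], epsilon=0.0) picks index 1, which is "down" in the assumed ACTIONS list.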


Collision Avoidance with Deep Reinforcement Learning
