Learning to play Go well enough to beat a professional champion, and preparing to take on the world champion, has to be hard work for an AI system, so Google's DeepMind team decided to let it try something a little less highbrow.

DeepMind's agent had already learned to play 49 different Atari 2600 games, in each case figuring out the gameplay on its own, but had reportedly given up on Montezuma's Revenge. The issue, apparently, was that it got bored. To persuade it to keep trying, the team had to program it with artificial curiosity …


The video depicts a DQN agent playing Montezuma’s Revenge via the Arcade Learning Environment. The agent’s reward function is augmented with an intrinsic reward based on a pseudo-count, itself computed from a sequential density model. This intrinsic reward allows the agent to explore a full two-thirds of the first level of the game and achieve significantly higher scores than anything previously reported.

In other words, the software was rewarded for trying new things. It then took 100 million training frames to explore two of the game's rooms, simply trying a new approach each time the character was killed.
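The idea behind the paper's pseudo-count bonus can be sketched in a few lines. The snippet below is an illustrative toy, not DeepMind's implementation: it uses a plain empirical count model over hashable states as a stand-in for the sequential density model the paper fits to Atari frames, and the class name, `beta` scale, and `0.01` smoothing constant are assumptions for the sketch. The pseudo-count is derived from the density the model assigns to a state before and after observing it, and the intrinsic reward shrinks as a state is revisited, so the agent keeps getting paid for novelty.

```python
from collections import defaultdict
from math import sqrt

class PseudoCountBonus:
    """Toy exploration bonus from pseudo-counts.

    Stand-in density model: empirical visit frequencies over hashable
    states (the paper uses a sequential density model over frames).
    """

    def __init__(self, beta=0.05):
        self.beta = beta                # bonus scale (assumed value)
        self.counts = defaultdict(int)  # visits per state
        self.total = 0                  # total observations

    def bonus(self, state):
        # Density assigned to `state` before observing it.
        rho = self.counts[state] / self.total if self.total else 0.0
        # "Recoding" density: what the model would assign after
        # one hypothetical extra observation of `state`.
        rho_next = (self.counts[state] + 1) / (self.total + 1)
        # Pseudo-count N solves rho = N / n and rho' = (N+1) / (n+1):
        # N = rho * (1 - rho') / (rho' - rho).
        denom = rho_next - rho
        pseudo_n = rho * (1 - rho_next) / denom if denom > 0 else float("inf")
        # Record the real observation.
        self.counts[state] += 1
        self.total += 1
        # Intrinsic reward decays with (pseudo-)visits.
        return self.beta / sqrt(pseudo_n + 0.01)
```

With this empirical stand-in, the pseudo-count formula recovers the true visit count exactly, so a never-seen room yields the maximum bonus and repeated rooms pay less and less — which is the behaviour that keeps the agent pushing into new territory instead of giving up.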

You can read the full paper here, but the video below gives you the basic idea. Come the day when AI systems want to take over from humanity, our best bet may be to try to distract them with '80s computer games …
