AlphaGo recovers from early error to beat Lee Sedol in fifth Go game, series ends 4-1

Cam Bunton | Mar 15 2016 - 3:22 am PT

Back when I was in high school, I remember our computer studies teacher telling us that a computer only does what it’s told to do, and so mistakes are not the machine’s, but rather the user’s. With neural networks and machine learning, that is no longer true. AlphaGo, DeepMind’s specialist Go-playing machine, has proved as much. AlphaGo has been programmed to learn from its mistakes, and can err all on its own.

The AI-powered system failed to recover from an error against Lee Sedol in their fourth game, and eventually lost. In the fifth game, however, it recovered from an early mistake and went on to win, closing out the series in dramatic fashion.

DeepMind’s co-founder Demis Hassabis announced the fifth game results on Twitter in the early hours of this morning, stating that AlphaGo spent much of the game trying to recover from an early mistake. It succeeded, and in doing so finished the series 4-1 up against the 18-time Go world champion.

#AlphaGo wins game 5! One of the most incredible games ever. To comeback from the initial big mistake against Lee Sedol was mind-blowing!!!

— Demis Hassabis (@demishassabis) March 15, 2016

It’s hard to overstate how important this series win is for the future of computing technology. Playing Go isn’t just a case of programming a machine to follow a set of algorithms; it’s a game that needs a ‘human touch’. For that reason, it had been impossible until recently to program a computer to play the game to any high degree of competency. Using deep neural networks (a machine learning technique often described as artificial intelligence), AlphaGo was instead taught how to learn the game.

Speaking to The Guardian, Hassabis said: “We call it deep reinforcement learning. It’s the combination of deep learning, neural network stuff, with reinforcement learning: so learning by trial and error, and incrementally improving and learning from your mistakes and your errors, so that you improve your decisions.”

With AlphaGo, DeepMind’s engineers essentially got the machine to play against itself millions of times, giving it more experience in playing the game than any human could achieve in a lifetime.
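
For readers curious what “learning by trial and error” through self-play can look like in code, here is a deliberately tiny sketch. It is not AlphaGo’s method, which pairs deep neural networks with far heavier machinery. It swaps Go for a much simpler game (players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins) and swaps the neural network for a plain lookup table. Every name and parameter in it (train_by_self_play, best_move, alpha, epsilon and so on) is made up for illustration.

```python
import random

MAX_TAKE = 3  # assumed rule: a player removes 1, 2 or 3 stones per turn


def legal_moves(stones):
    """Moves available when `stones` remain in the pile."""
    return range(1, min(MAX_TAKE, stones) + 1)


def best_move(q, stones):
    """Pick the move with the highest learned score (unseen moves score 0)."""
    return max(legal_moves(stones), key=lambda a: q.get((stones, a), 0.0))


def train_by_self_play(games=20000, max_pile=12, alpha=0.5, epsilon=0.3, seed=0):
    """Learn move scores by letting one table play both sides of many games."""
    rng = random.Random(seed)
    q = {}  # (stones, move) -> score from the mover's point of view
    for _ in range(games):
        stones = rng.randint(1, max_pile)
        while stones > 0:
            # Trial and error: sometimes try a random move, otherwise
            # play what currently looks best.
            if rng.random() < epsilon:
                move = rng.choice(list(legal_moves(stones)))
            else:
                move = best_move(q, stones)
            remaining = stones - move
            if remaining == 0:
                target = 1.0  # taking the last stone wins the game
            else:
                # The opponent moves next, so a position that is good
                # for them is equally bad for us.
                target = -max(q.get((remaining, a), 0.0)
                              for a in legal_moves(remaining))
            # Incremental improvement: nudge the old score toward the target.
            old = q.get((stones, move), 0.0)
            q[(stones, move)] = old + alpha * (target - old)
            stones = remaining
    return q


if __name__ == "__main__":
    table = train_by_self_play()
    for pile in (5, 6, 7, 9, 10, 11):
        print(pile, "stones -> take", best_move(table, pile))
```

Nobody tells the program the known winning strategy for this game (always leave your opponent a multiple of four stones); it is only told who won. After enough games against itself, its preferred moves should nonetheless line up with that strategy.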

This is a milestone event to remember in the world of computing. Now the only question remaining is ‘where do we go next?’