Artificial intelligence company DeepMind announced the latest version of its Go engine, AlphaGo Zero, and the program quickly became the most powerful Go player in the world.
The first Go program to beat one of the world’s top professionals was AlphaGo Lee, an earlier version of the same project. It won against Lee Sedol in 2016 in a well-publicized breakthrough in the world of artificial intelligence.
The latest iteration of AlphaGo quickly surpassed its predecessors because its programmers took a different approach to the problem of Go strategy. According to Jared Miller, a John Brown University senior engineering major with an electrical/computer concentration, “With AlphaGo Zero, they started at a really low point, not even teaching it the rules but just setting a goal for it. If you wanted it to do something, the better it did, the more reward it would get. That was the most basic way it learned: it continually optimized itself to reach its goal.”
AlphaGo Zero’s proficiency at the game comes from the machine-learning techniques employed by its programmers. Most strategy-game artificial intelligences are given a set of rules and strategies and then fed hundreds of games to teach them where to use those strategies. By contrast, AlphaGo Zero was given only the rules of the game and a means to determine the winner of a completed game, then set to play against itself over and over at supercomputer speed. After a mere 40 days of training, the AI had become the strongest Go player in the world.
Miller commented on AlphaGo Zero’s evolutionary nature: “With each generation an AI goes through, if, in the last generation, there was something that worked really, really well, it is way more likely that that same decision will be chosen again and be built upon than the other decisions.”
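The idea of learning from nothing but the rules and the final result can be illustrated with a toy program. The sketch below is not DeepMind's method: AlphaGo Zero paired a deep neural network with a tree search, while this example uses a plain lookup table and a far simpler game chosen purely for illustration, in which two players take turns removing one to three stones from a pile and whoever takes the last stone wins. The function names, the pile sizes and the training settings are all assumptions made for the example.

```python
import random

MAX_TAKE = 3  # rule of the toy game: remove 1, 2 or 3 stones per turn


def legal_moves(stones):
    """Moves allowed by the rules for a pile of this size."""
    return range(1, min(MAX_TAKE, stones) + 1)


def choose(q, stones, rng, explore):
    """Usually pick the move that has paid off best so far; sometimes try another."""
    moves = list(legal_moves(stones))
    if rng.random() < explore:
        return rng.choice(moves)
    return max(moves, key=lambda m: q.get((stones, m), 0.0))


def train(episodes=30000, explore=0.2, step=0.1, seed=0):
    """Self-play: one table plays both sides and learns only from who won."""
    rng = random.Random(seed)
    q = {}  # (stones, move) -> running estimate of the reward for that move
    for _ in range(episodes):
        stones = rng.randint(1, 10)
        history = []  # moves in order; the two players alternate
        while stones > 0:
            move = choose(q, stones, rng, explore)
            history.append((stones, move))
            stones -= move
        # Whoever moved last took the final stone and wins (+1).
        # Walking backwards, the sign flips because the players alternate.
        reward = 1.0
        for key in reversed(history):
            old = q.get(key, 0.0)
            q[key] = old + step * (reward - old)
            reward = -reward
    return q


def best_move(q, stones):
    """The move the trained table currently rates highest."""
    return max(legal_moves(stones), key=lambda m: q.get((stones, m), 0.0))


q = train()
print({pile: best_move(q, pile) for pile in (5, 6, 7)})
```

Nothing in the program states a strategy. Moves that led to wins have their scores nudged upward and so get chosen more often in later games, which is the effect Miller describes. For this game the known winning approach is to leave the opponent a multiple of four stones, and after enough games of self-play the table should come to prefer exactly those moves.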
Go, an ancient Asian abstract strategy game played with black and white markers on a 19×19 board, has traditionally been known as a game geared more towards human players than computers. The vast number of possible outcomes to the game, coupled with tricky decisions on balancing positions and pieces, means that the human mind can often evaluate good moves and eliminate bad ones from a strategy more efficiently than a computer’s relentless rounds of repetition. Nonetheless, the game is still inherently deterministic: every game has one goal and one outcome.
Issues with AI aren’t always as black and white as the game of Go. Tim Gilmour, associate professor of engineering, commented on some of his concerns.
“I think AI is a very powerful technique for solving sort of limited problems: problems that are limited in scope, that have a clear right and wrong answer,” he said.
However, according to Gilmour, “if it’s some very general problem that’s like a robot that says, ‘Okay, I’m here to help, what do you want me to do?’ it’s supposed to have some programming inside which says I’m not going to do any harmful things. But humans can trick it and say I want you to do this, this, and this, and the robot doesn’t know any better and does something harmful. We try to make AI so smart for safety, but, if it’s a very general problem, where the computer is supposed to figure out what it’s supposed to do in a very general case, it’s probably not the safest situation.”
Miller has a more optimistic view of morality in AI. Regarding the dangers, he said, “There’s actually a system that programmers are trying to implement so that robots can not only be smarter but be ethical.” This system uses the same ideas behind other varieties of artificial intelligence, learning to grow something almost akin to digital morality. “It’s a lot less about placing moral boundaries on a robot, and instead about placing a seed in the AI itself to help it grow its own morals,” Miller said.
AlphaGo Zero’s strengths lie in just this sort of learning. Because it builds itself, it can grow in much the same way people learn new skills. Miller agrees: “One of the best ways to improve our technology is to teach machines how to teach themselves.”