same tactics used by AlphaZero in chess.
“It really struck a chord,” says Sadler.
“You really do start thinking there’s an AI
style shared across these different challenges.”
The result is a new kind of software that
displays what looks very much like creativity
and – whisper it – intuition. David Silver at
DeepMind is also struck by these thoughts.
“The professional Go players who competed
with AlphaGo repeatedly remarked on the
creativity of the system,” he says. “They
expected it to play in a way that was perhaps
dull but efficient and instead there was real
beauty and inventiveness to the games.”
So why do these AIs surprise us more than
earlier software? The most likely reason is their
lack of human bias. As good as previous chess
computers are, they have human strategies
built in. DeepMind’s AIs learn by playing against
themselves. Their algorithms may be different,
but their general approach is the same.
All use a machine-learning technique
called deep reinforcement learning. This boils
down to building a neural network – software
loosely modelled on the brain and capable of
performing a particular task – by training it on large amounts of data. In a process of trial and error, successes, such as winning a game of Go, are rewarded, reinforcing a particular behaviour.

Alien thinking

AlphaGo and AlphaStar learned partly by following human examples. But AlphaZero uses no human examples at all – the “zero” stands for zero human input. It is given only the rules of the game and a goal, then left to its own devices. Starting randomly, it plays itself over and over again until it figures the game out, picking up its own methods along the way. In just a few hours, AlphaZero played itself tens of millions of times, to become the best Go player
and then the best chess player ever. “AlphaZero
discovers thousands of concepts that lead to
winning more games,” says Silver. “To begin
with, these steps are quite elementary, but
eventually this same process can discover
knowledge that is surprising even to top
human players.”
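For a feel of how that learning loop works, here is a minimal sketch in Python. It shrinks the idea down to tic-tac-toe and swaps the deep neural network for a simple lookup table; the game, the reward scheme and the parameters are illustrative assumptions, not DeepMind’s system, which pairs deep networks with a search over possible moves.

```python
import random
from collections import defaultdict

# A toy version of self-play reinforcement learning: tabular value
# learning on tic-tac-toe. Illustrative only - AlphaZero uses deep
# neural networks and tree search, not a lookup table.

WIN_LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]

values = defaultdict(float)   # learned value of a position for the player who just moved
ALPHA, EPSILON = 0.1, 0.2     # learning rate and exploration rate

def choose_move(board, player):
    # Epsilon-greedy: mostly pick the move whose resulting position
    # has the highest learned value, sometimes explore at random.
    if random.random() < EPSILON:
        return random.choice(legal_moves(board))
    def value_after(move):
        after = board[:]
        after[move] = player
        return values[("".join(after), player)]
    return max(legal_moves(board), key=value_after)

def self_play_game():
    # The same learner plays both sides, starting from an empty board.
    board, player, history = ["."] * 9, "X", []
    while True:
        move = choose_move(board, player)
        board[move] = player
        history.append(("".join(board), player))
        if winner(board) or not legal_moves(board):
            return history, winner(board)
        player = "O" if player == "X" else "X"

for episode in range(50_000):
    history, champ = self_play_game()
    for state, player in history:
        # Trial and error: nudge each visited position's value towards
        # +1 if that player went on to win, -1 if they lost, 0 for a draw.
        reward = 0.0 if champ is None else (1.0 if player == champ else -1.0)
        values[(state, player)] += ALPHA * (reward - values[(state, player)])
```

After enough games, the table steers play towards strong lines without ever seeing a human game – the same trial-and-error principle, scaled down enormously.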
Silver and his colleagues focused on games
because they are excellent test beds, offering
a wide range of challenges that are familiar to
humans. But the end goal of AI development
is far more ambitious. “In terms of what’s
next, we think our approaches could be
applicable to some fundamental problems
in science,” says Silver.
An early glimpse of what might be possible
came last year with AlphaFold, a DeepMind
AI that predicts the intricate structures of
proteins. A better understanding of how
proteins work will help us control everything
from disease to food production. But a
protein’s function is determined by its unique
structure. And that structure, which usually
looks a bit like a tangled rope, is hard to predict
from the sequence of its constituent amino
acids. Researchers rely on laborious, expensive
structure-determination methods that don’t
work for many proteins. Cracking how a protein folds from its amino acid sequence alone is a highly desirable goal, but despite 70-odd years of effort, it remains largely elusive.
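To make the shape of the problem concrete, here is a rough sketch of how the prediction task can be framed. It shows only the inputs and targets – a sequence encoded as numbers, and a map of distances between every pair of amino acids, roughly the quantity the first AlphaFold predicted – with fabricated placeholder data; the names and details are illustrative, not DeepMind’s code.

```python
import numpy as np

# Illustrative framing of the protein structure prediction task only -
# not AlphaFold's model. The first AlphaFold predicted distances
# between every pair of amino acids, then searched for a 3D structure
# consistent with that distance map.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"      # the 20 standard amino acids
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot(sequence):
    """Encode an amino acid sequence as an L x 20 matrix of 0s and 1s."""
    encoding = np.zeros((len(sequence), 20))
    for position, aa in enumerate(sequence):
        encoding[position, AA_INDEX[aa]] = 1.0
    return encoding

sequence = "MKTAYIAKQR"                   # a made-up ten-residue peptide
inputs = one_hot(sequence)                # network input: shape (10, 20)

# The training target is a pairwise distance map: entry (i, j) holds
# the distance in angstroms between amino acids i and j in the folded
# protein. Real targets come from experimentally solved structures;
# here random numbers stand in to show the shape of the problem.
rng = np.random.default_rng(0)
target = rng.uniform(0, 20, size=(len(sequence), len(sequence)))
target = (target + target.T) / 2          # distances are symmetric
np.fill_diagonal(target, 0.0)             # zero distance to itself

print(inputs.shape, target.shape)         # (10, 20) (10, 10)
```

A real system would learn the mapping from the first object to the second from the thousands of protein structures already solved in the lab.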
In July 2018, AlphaFold won the Critical
Assessment of Protein Structure Prediction
challenge, the gold standard for assessing
software that aims to predict how proteins
fold. The hope is that AlphaFold will do for efforts to predict protein structure what its related AIs have done for games. So where do we go from here? How far are we from realising bigger goals?
“Sure, we’ve made great progress but I don’t
The ability of AIs to think outside the usual boxes could provide the breakthroughs we need in tackling some of the world’s biggest problems. Yet will we be happy with what the machines come up with?

Sometimes, an AI’s solution to a puzzle is no help, but even when it makes sense, we may feel uncomfortable. For technical problems such as curbing energy use or designing chemical reactions, people will probably go along with an AI. But when it comes to social problems, it might be hard to shrug off the feeling that we know better.

For example, imagine that instead of voting for who we want in a government, we asked an AI to assess the strengths of the various candidates and pick for us. If its choices didn’t fit our expectations or preferences, would we go along with them?

It might be the same for moral issues. “If an AI that was always right about stuff started giving me moral advice, I might think twice about following it even if intellectually I know I ought to,” says Anders Sandberg at the University of Oxford. “I might just want to decide myself.” Or maybe not – it would be fascinating, if somewhat dystopian, if people absolved themselves of decision-making.

In principle, testing ideas to figure out what works best and then basing policy on those results makes lots of sense. But Sandberg thinks this approach won’t work for issues that evoke strong feelings in us – how our children are taught, for example. This is why attempts to run policy trials in schools have proved controversial. It is likely to be even harder to accept an AI’s recommendations in such circumstances, especially if they seem strange. “AI will probably be able to tell us how to educate children, but will we want it to?” asks Sandberg.