Astronomy

Galaxy Zoo, the public is asked to iden-
tify the type of galaxy shown: Is it a disk?
Is it edge-on? Is there a central bulge?
These features can be quickly identified
by eye, but natural variations can make
them exceedingly difficult for computers
to recognize and categorize.
“Humans are actually very well
designed for picking out serendipitous
discoveries in image datasets,” Fortson
says. “By virtue of evolution, humans
have developed this amazing visual cor-
tex that can differentiate the unknown
unknowns from the knowns.”
Of course, using the untrained public
doesn’t come without its challenges.
People make mistakes. Luckily, the large
number of people involved in the identi-
fication can be used to create averages
and a group consensus, which, over the
long run, can be even more accurate
than a single scientist’s identification. In
Galaxy Zoo, 40 different individuals
examine each galaxy to create a trusted
identification. By carefully processing
the results, individual people can even
be weighted differently depending on
their identification success rate. In this
way, people whose identifications generally
don’t agree with the group consensus
can be flagged for rejection, so they
don’t skew the end results.
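
The weighting idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Galaxy Zoo's actual pipeline: the volunteer names, galaxy IDs, labels, and the `min_weight` cutoff are all hypothetical, and each volunteer's weight is simply assumed to be the fraction of their votes that match the unweighted majority.

```python
from collections import Counter, defaultdict

def majority_label(votes):
    """Return the most common label among (volunteer, label) votes."""
    return Counter(label for _, label in votes).most_common(1)[0][0]

def volunteer_weights(classifications):
    """Weight each volunteer by how often they agree with the simple majority."""
    agree = defaultdict(int)
    total = defaultdict(int)
    for votes in classifications.values():
        consensus = majority_label(votes)
        for volunteer, label in votes:
            total[volunteer] += 1
            agree[volunteer] += int(label == consensus)
    return {v: agree[v] / total[v] for v in total}

def weighted_consensus(votes, weights, min_weight=0.5):
    """Tally votes by volunteer weight, ignoring volunteers below min_weight."""
    tally = defaultdict(float)
    for volunteer, label in votes:
        w = weights.get(volunteer, 0.0)
        if w >= min_weight:  # hypothetical rejection threshold
            tally[label] += w
    return max(tally, key=tally.get) if tally else None

# Hypothetical votes: galaxy ID -> list of (volunteer, label)
classifications = {
    "g1": [("ann", "spiral"), ("bob", "spiral"), ("cat", "elliptical")],
    "g2": [("ann", "elliptical"), ("bob", "elliptical"), ("cat", "spiral")],
    "g3": [("ann", "spiral"), ("bob", "spiral"), ("cat", "spiral")],
}

weights = volunteer_weights(classifications)
results = {gid: weighted_consensus(v, weights) for gid, v in classifications.items()}
```

Here the volunteer who disagrees with the group on two of three galaxies falls below the cutoff and no longer counts toward the final labels. A real project would likely use a more careful scheme, but the principle is the same.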


Rise of the machines
Once the masses have identified and
categorized thousands of images,
significant work remains to analyze
the data. This is where computers finally
come in. These machines are the heavy
lifters, allowing for complex calculations
and comparisons that the
human brain would be hard-
pressed to match on its own.
While machines historically
can only do exactly what they
are told, a subset of comput-
ers are being taught to think
on their own.
Astronomers are using a
type of artificial intelligence,
called machine learning, to
get computers to teach them-
selves how to find patterns in
the data. A specific method of machine
learning known as artificial neural
networks was designed based on how
the brain functions. These neural
networks draw connections in vast
webs of data, just as the human brain
does. To create these networks, a scientist
starts by showing the computer a
“training set,” which is a series of examples
containing what the computer is
looking for — such as spiral galaxies.
Over time, and with enough examples,
the computer will become adept at
identifying spiral galaxies, despite
their wide range of appearances. At
this point, the scientist can provide
the computer with a sample of unidenti-
fied galaxies, and the machine will
return those that fit the criteria it
has assessed.
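
As a rough illustration of the training-set idea, the sketch below trains a single artificial neuron, the simplest building block of a neural network, to separate spirals from other galaxies. Everything specific here is an assumption made for the example: the two numeric features (a made-up "arm strength" and "central concentration" score between 0 and 1), the labeled examples, and the learning settings. Real networks work on images and have many thousands of such units.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(examples, labels, epochs=500, lr=0.5):
    """Fit one logistic neuron by stochastic gradient descent."""
    w = [0.0] * len(examples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # how far the guess is from the known label
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def is_spiral(w, b, x):
    """Return True if the neuron rates x as more likely spiral than not."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5

# Hypothetical training set: (arm_strength, concentration), label 1 = spiral
training_set = [(0.9, 0.2), (0.8, 0.3), (0.7, 0.1), (0.85, 0.25),
                (0.1, 0.8), (0.2, 0.9), (0.15, 0.7), (0.3, 0.85)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = train_neuron(training_set, labels)

# Unidentified galaxies: the machine returns those that fit what it learned
unidentified = [(0.75, 0.2), (0.2, 0.8)]
found = [g for g in unidentified if is_spiral(w, b, g)]
```

After training, only the first unidentified galaxy, the one whose features resemble the labeled spirals, is returned.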
Machines can also be taught a much
more difficult task: assessing how objects
and their characteristics relate to one
another. For example, scientists have used
artificial neural networks to
investigate how galaxies form
clusters and how that group-
ing affects the numbers of
stars the galaxies produce.
Only with the assistance of
computers are the scientists
able to compare the many
physical properties at play,
such as galaxy mass, distance
between galaxies, and previ-
ous interactions between gal-
axies. And by comparing
many hundreds of thousands of galaxies,
scientists are able to make broad conclu-
sions about our universe that are unbi-
ased by small irregularities.
When encoded properly, artificial
neural networks can provide profound
insight to scientists; however, they can
also be easily misused. For example, if the
training set is not extensive enough, the
computer will draw the wrong conclu-
sions. Or, as astronomers are fond of
repeating, “Garbage in, garbage out.”
The other drawback to artificial neu-
ral networks is that they require vast
datasets to “learn” from. Luckily, in the
era of large-scale surveys, vast datasets
are common. This means that artificial
neural networks can quickly turn the
problem of too much data into an advan-
tage. The larger the training set — which
citizen scientists can help bolster — the
better the results.

The future of
unexpected discoveries
“Our ability to collect these humongous
datasets is developing in parallel with
our ability to interpret these huge data-
sets,” says Ivezić. “Both directions are
important — people who collect data and
people who develop tools to analyze and
interpret. Otherwise we’d just be stuck
with a huge pile of zeros and ones we
couldn’t make sense out of.”
With the combination of large-scale
surveys, a legion of citizen scientists, and
new machine learning techniques, it
seems many new unexpected discoveries
will soon emerge from the darkness. But
as for the nature of those discoveries?
Only time can tell.

Mara Johnson-Groh is a science writer and
photographer who writes about everything
under the Sun, and even things beyond it.

Galaxies, clusters
of galaxies, and
clusters of clusters
of galaxies join
with dark matter
to form a grand,
weblike structure
called the cosmic
web, a slice of
which is shown
here. With the
help of artificial
neural networks,
astronomers
hope to run
simulations like
this to investigate
the cosmic web in
much greater detail
than previously
possible.
NASA, ESA, and E. Hallman (University of Colorado Boulder)

LSST will
collect as
much as 30
terabytes of
data every
clear night.