May 2019, ScientificAmerican.com 61
[Graphic: four ways to train a neural network]

Training: Input is thousands of cat photographs, broken into pixels. Each layer of the network learns to identify progressively more complex features. Output is an image label (“cat”). Result: the ability to recognize a cat.

Pretraining: Input is sets of different defined groupings, then a few cat photographs. Result: the ability to recognize a cat faster.

Generative adversarial network: Input is random noise and a class (“cat”). The generator produces a fake cat image; the discriminator is randomly given either a real or a fake cat image and judges whether the image is real. If not, in what ways is it not real? Feedback is fed to the generator. Result: the ability to generate a convincing cat image.

Disentanglement: Input is primitive elements with multiple variables. The bottleneck between the compressing and expanding networks is gradually loosened. Result: the ability to isolate and reconstruct elements.
A so-called deep network has tens or hundreds of hidden layers. They might represent midlevel structures such as edges and geometric shapes, although it is not always obvious what they are doing. With thousands of neurons and millions of interconnections, there is no simple logical path through the system. And that is by design. Neural networks are masters at problems not amenable to explicit logical rules, such as pattern recognition.
Crucially, the neuronal connections are not fixed in advance but adapt in a process of trial and error. You feed the network images labeled “dog” or “cat.” For each image, it guesses a label. If it is wrong, you adjust the strength of the connections that contributed to the erroneous result, which is a straightforward exercise in calculus. Starting from complete scratch, without knowing what an image is, let alone an animal, the network does no better than a coin toss. But after perhaps 10,000 examples, it does as well as a human presented with the same images. In other training methods, the network responds to vaguer cues or even discerns the categories entirely on its own.
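The trial-and-error adjustment described above can be sketched in a few lines of Python. This is an illustrative toy, not any real vision system: each “image” is a single number and the network a single connection, with the strength adjusted by ordinary gradient descent whenever a guess is off.

```python
import math
import random

random.seed(0)

# Toy stand-ins for images: one feature per "image".
# "Cats" cluster around +1, "dogs" around -1 (a deliberately easy task).
data = [(random.gauss(1.0, 0.5), 1) for _ in range(200)] + \
       [(random.gauss(-1.0, 0.5), 0) for _ in range(200)]
random.shuffle(data)

w, b = 0.0, 0.0   # connection strengths start from complete scratch
lr = 0.1          # step size for the calculus-based adjustment

def predict(x):
    # Sigmoid output: the network's guess that x is a cat.
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def accuracy():
    return sum((predict(x) > 0.5) == (y == 1) for x, y in data) / len(data)

acc_before = accuracy()   # no better than a coin toss before training

for epoch in range(20):
    for x, y in data:
        p = predict(x)
        # If the guess is wrong, nudge the connections that contributed
        # to the error: the gradient of the logistic loss.
        w -= lr * (p - y) * x
        b -= lr * (p - y)

acc_after = accuracy()
print(f"accuracy before: {acc_before:.2f}, after: {acc_after:.2f}")
```

With the untrained weights at zero, every guess is a coin toss; after a few passes over the labeled examples, the single connection separates the two clusters almost perfectly.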
Remarkably, a network can sort images it has never seen before. Theorists are still not entirely sure how it does that, but one factor is that the humans using the network must tolerate errors or even deliberately introduce them. A network that classifies its initial batch of cats and dogs perfectly might be fudging: basing its judgment on unreliable cues and variations rather than on essential features.
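One standard way of deliberately introducing errors (the article does not name a specific technique, so this is an illustrative choice) is dropout: during training, a random fraction of a layer’s outputs is zeroed out, so the network cannot lean on any single unreliable cue.

```python
import random

random.seed(1)

def dropout(activations, p=0.5):
    # Zero out a random fraction p of the layer's outputs and rescale the
    # survivors so their expected total is unchanged.  The injected errors
    # discourage the network from relying on any one fragile feature.
    return [0.0 if random.random() < p else a / (1.0 - p)
            for a in activations]

layer = [0.8, 0.3, 0.5, 0.9, 0.1, 0.7]
print(dropout(layer))
```

Each training pass silences a different random subset, so the surviving connections must carry redundant, essential information rather than memorized quirks.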
This ability of networks to sculpt themselves means they can
solve problems that their human designers have no idea how to
solve. And that includes the problem of making the networks
even better at what they do.
GOING META
Teachers often complain that students forget everything over the summer. In lieu of making vacations shorter, they have taken to loading students up with summer homework. But psychologists such as Robert Bjork of the University of California, Los Angeles, have found that forgetting is not inimical to learning but essential to it. That principle applies to machine learning, too.
If a machine learns a task, then forgets it, then learns another task and forgets it, and so on, it can be coached to grasp the common features of those tasks, and it will pick up new variants faster. It won’t have learned anything specific, but it will have learned how to learn—what researchers call meta-learning. When you do want it to retain information, it’ll be ready. “After you’ve learned to do 1,000 tasks, the 1,001st is much easier,” says Sanjeev Arora, a machine-learning theorist at Princeton University. Forgetting is what puts the meta into meta-learning. Without it, the tasks all blur together, and the machine can’t see their overall structure.
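The learn-forget-repeat loop can be sketched concretely. The recipe below is in the spirit of the Reptile meta-learning algorithm, an assumption on my part since the article names no specific method: learn each task briefly, nudge a shared starting point toward what was learned, discard the task itself, and repeat. The tasks here, fitting lines of different slopes, are illustrative stand-ins.

```python
import random

random.seed(2)

def adapt(w, a, steps=5, lr=0.5):
    # A few gradient-descent steps on one task ("fit y = a*x"), starting
    # from weight w; returns the adapted weight.
    xs = [random.uniform(-1, 1) for _ in range(100)]
    for _ in range(steps):
        grad = sum(2 * (w - a) * x * x for x in xs) / len(xs)
        w -= lr * grad
    return w

# Meta-training: learn a task, move the starting point toward the adapted
# weights, forget the task, repeat.  Nothing task-specific is retained --
# only the starting point theta.
theta, meta_lr = 0.0, 0.25
for _ in range(300):
    a = random.uniform(2.0, 4.0)        # a fresh task each round
    phi = adapt(theta, a)
    theta += meta_lr * (phi - theta)

# A brand-new task is now picked up faster from theta than from scratch.
new_tasks = [2.2, 3.0, 3.8]
err_scratch = sum(abs(adapt(0.0, a, steps=3) - a) for a in new_tasks) / 3
err_meta = sum(abs(adapt(theta, a, steps=3) - a) for a in new_tasks) / 3
print(f"3-step error from scratch: {err_scratch:.3f}, "
      f"from meta-learned start: {err_meta:.3f}")
```

The machine never keeps any individual task, yet the shared starting point drifts toward the tasks’ common structure, so three gradient steps from it beat three steps from scratch.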
Meta-learning gives machines some of our mental agility. “It will probably be key to achieving AI that can perform with human-level intelligence,” says Jane Wang, a computational neuroscientist at Google’s DeepMind in London. Conversely, she thinks that computer meta-learning will help scientists figure out what happens inside our own heads.
In nature, the ultimate meta-learning algorithm is Darwinian
Generative Adversarial Networks
A classification network can be run in reverse to generate fresh images—cats that never existed, say, but look as if they could have. Researchers train this “generative” network by coupling it with an ordinary classifier to serve as critic and coach. Random noise is input to the system to ensure that each new cat is unique.
Disentanglement
A machine can learn to pick apart a scene into the objects that constitute it. One network compresses the input data; the other expands them again. By constricting the link between the two, the system is forced to find the most parsimonious description. That is usually the description a human would use, too, thereby making the network more transparent in its operation.
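The compress-then-expand idea can be shown at minimal scale. This sketch fixes the bottleneck at a single number rather than gradually loosening it as the caption describes, and uses two-number “scenes” generated from one underlying factor; both simplifications are mine, chosen to keep the gradients hand-derivable.

```python
import random

random.seed(4)

def sample():
    # Each "scene" is two numbers driven by one underlying factor t.
    t = random.uniform(-1.0, 1.0)
    return t, 2.0 * t

# Encoder squeezes the two inputs into a one-number code (the bottleneck);
# the decoder expands the code back into two numbers.
e1, e2 = 0.1, -0.1
d1, d2 = 0.1, 0.1
lr = 0.05

def loss_at(x1, x2):
    c = e1 * x1 + e2 * x2            # compress
    r1, r2 = d1 * c, d2 * c          # expand
    return (r1 - x1) ** 2 + (r2 - x2) ** 2

for _ in range(4000):
    x1, x2 = sample()
    c = e1 * x1 + e2 * x2
    r1, r2 = d1 * c, d2 * c
    # Hand-derived gradients of the reconstruction error.
    gd1, gd2 = 2 * (r1 - x1) * c, 2 * (r2 - x2) * c
    gc = 2 * (r1 - x1) * d1 + 2 * (r2 - x2) * d2
    d1 -= lr * gd1
    d2 -= lr * gd2
    e1 -= lr * gc * x1
    e2 -= lr * gc * x2

avg_err = sum(loss_at(*sample()) for _ in range(500)) / 500
print(f"average reconstruction error: {avg_err:.4f}")
```

Because the data really have only one degree of freedom, the one-number code suffices for near-perfect reconstruction: the constricted link forces the system onto the parsimonious description.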
© 2019 Scientific American