AI's Deep Learning Challenges

Deep learning AI has achieved impressive feats but still lacks human-level intelligence and common sense. While it can recognize complex patterns, it does not truly understand what those patterns represent. Researchers are working to address deep learning's vulnerabilities, inefficiency in learning, lack of transparency in decision making, and inability to reason about concepts in a human-like way. The limits of current deep learning highlight the remaining challenges in developing true artificial general intelligence.


NEWS FEATURE


What are the limits of deep learning?


The much-ballyhooed artificial intelligence approach boasts impressive feats but still falls
short of human brainpower. Researchers are determined to figure out what’s missing.
M. Mitchell Waldrop, Science Writer

There’s no mistaking the image: It’s a banana—a big, ripe, bright-yellow banana. Yet the artificial intelligence (AI) identifies it as a toaster, even though it was trained with the same powerful and oft-publicized deep-learning techniques that have produced a white-hot revolution in driverless cars, speech understanding, and a multitude of other AI applications. That means the AI was shown several thousand photos of bananas, slugs, snails, and similar-looking objects, like so many flash cards, and then drilled on the answers until it had the classification down cold. And yet this advanced system was quite easily confused—all it took was a little day-glow sticker, digitally pasted in one corner of the image.

This example of what deep-learning researchers call an “adversarial attack,” discovered by the Google Brain team in Mountain View, CA (1), highlights just how far AI still has to go before it remotely approaches human capabilities. “I initially thought that adversarial examples were just an annoyance,” says Geoffrey Hinton, a computer scientist at the University of Toronto and one of the pioneers of deep learning. “But I now think they’re probably quite profound. They tell us that we’re doing something wrong.”

That’s a widely shared sentiment among AI practitioners, any of whom can easily rattle off a long list of deep learning’s drawbacks. In addition to its vulnerability

Apparent shortcomings in deep-learning approaches have raised concerns among researchers and the general public as
technologies such as driverless cars, which use deep-learning techniques to navigate, get involved in well-publicized
mishaps. Image credit: Shutterstock.com/MONOPOLY919.
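The sticker attack described above exploits the same machinery that training does: the network’s sensitivity to small changes in its input. As a rough illustration (this is not the Google Brain patch attack of ref. 1), the sketch below perturbs the input of a toy two-class linear classifier along the sign of a gradient, FGSM-style, until its decision flips. All weights, inputs, and the perturbation budget are invented stand-ins.

```python
import numpy as np

# Toy linear "classifier": scores = W @ x; the class with the higher score wins.
# A minimal stand-in for a deep network, but it shows the same principle:
# a tiny, targeted change to the input can flip the model's decision.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 16))      # weights for class 0 ("banana") and class 1 ("toaster")
x = rng.normal(size=16)           # an input the model currently classifies one way

def predict(W, x):
    return int(np.argmax(W @ x))

original = predict(W, x)
other = 1 - original

# For a linear model, the gradient of (score_other - score_original) w.r.t. x
# is just the weight difference; stepping along its sign is the same move an
# FGSM-style attack makes with a deep network's backpropagated gradient.
grad = W[other] - W[original]
margin = float(W[original] @ x - W[other] @ x)   # how confidently the model is "right"

# Budget just large enough to cross the decision boundary; a real attack would
# use a small fixed epsilon against a high-dimensional image instead.
epsilon = 1.01 * margin / np.abs(grad).sum()
x_adv = x + epsilon * np.sign(grad)

adversarial = predict(W, x_adv)   # the label flips, though x_adv barely changed
```

Each component of the input moves by at most `epsilon`, yet the classification changes — the linear analog of the banana-to-toaster sticker.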

Published under the PNAS license.

1074–1077 | PNAS | January 22, 2019 | vol. 116 | no. 4 www.pnas.org/cgi/doi/10.1073/pnas.1821594116


“Neural network” models of AI process signals by sending them through a network of nodes analogous to neurons.
Signals pass from node to node along links, analogs of the synaptic junctions between neurons. “Learning” improves the
outcome by adjusting the weights that amplify or damp the signals each link carries. Nodes are typically arranged in a
series of layers that are roughly analogous to different processing centers in the cortex. Today’s computers can handle
“deep-learning” networks with dozens of layers. Image credit: Lucy Reading-Ikkanda (artist).
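The caption’s description — signals entering at an input layer, flowing along weighted links, and being combined at each node — can be made concrete with a short sketch. The layer sizes, random weights, and ReLU activation below are illustrative assumptions, not details taken from the figure.

```python
import numpy as np

# A miniature layered network as described in the caption. Layer sizes and
# weights are arbitrary illustrations; a real "deep-learning" network would
# have dozens of layers and millions of learned weights.
rng = np.random.default_rng(1)
layer_sizes = [4, 8, 8, 3]        # input layer, two hidden layers, output layer
weights = [rng.normal(scale=0.5, size=(m, n))
           for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x, weights):
    """Propagate a signal layer by layer: each link's weight amplifies or
    damps the signal it carries, and each node sums its weighted inputs
    and is activated (or not) in turn."""
    activation = np.asarray(x, dtype=float)
    for W in weights:
        activation = np.maximum(W @ activation, 0.0)   # ReLU: fire or stay silent
    return activation

# The pattern of activation at the output layer is the network's "answer."
output = forward([1.0, 0.5, -0.3, 2.0], weights)
```

Learning, in this picture, is nothing more than nudging the entries of each `W` so that the output pattern moves toward the desired answer.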

to spoofing, for example, there is its gross inefficiency. “For a child to learn to recognize a cow,” says Hinton, “it’s not like their mother needs to say ‘cow’ 10,000 times”—a number that’s often required for deep-learning systems. Humans generally learn new concepts from just one or two examples.

Then there’s the opacity problem. Once a deep-learning system has been trained, it’s not always clear how it’s making its decisions. “In many contexts that’s just not acceptable, even if it gets the right answer,” says David Cox, a computational neuroscientist who heads the MIT-IBM Watson AI Lab in Cambridge, MA. Suppose a bank uses AI to evaluate your credit-worthiness and then denies you a loan: “In many states there are laws that say you have to explain why,” he says.

And perhaps most importantly, there’s the lack of common sense. Deep-learning systems may be wizards at recognizing patterns in the pixels, but they can’t understand what the patterns mean, much less reason about them. “It’s not clear to me that current systems would be able to see that sofas and chairs are for sitting,” says Greg Wayne, an AI researcher at DeepMind, a London-based subsidiary of Google’s parent company, Alphabet.

Increasingly, such frailties are raising concerns about AI among the wider public, as well—especially as driverless cars, which use similar deep-learning techniques to navigate, get involved in well-publicized mishaps and fatalities. “People have started to say, ‘Maybe there is a problem’,” says Gary Marcus, a cognitive scientist at New York University and one of deep learning’s most vocal skeptics. Until the past year or so, he says, “there had been a feeling that deep learning was magic. Now people are realizing that it’s not magic.”

Still, there’s no denying that deep learning is an incredibly powerful tool—one that’s made it routine to deploy applications such as face and voice recognition that were all but impossible just a decade ago. “So I have a hard time imagining that deep learning will go away at this point,” Cox says. “It is much more likely that we will modify it, or augment it.”

Brain Wars

Today’s deep-learning revolution has its roots in the “brain wars” of the 1980s when advocates of two different approaches to AI were talking right past each other.

On one side was an approach—now called “good old-fashioned AI”—that had dominated the field since the 1950s. Also known as symbolic AI, it used mathematical symbols to represent objects and the relationship between objects. Coupled with extensive knowledge bases built by humans, such systems proved to be impressively good at reasoning and reaching conclusions about domains such as medicine. But by the 1980s, it was also becoming clear that symbolic AI was impressively bad at dealing with the fluidity of symbols, concepts, and reasoning in real life.

In response to these shortcomings, rebel researchers began advocating for artificial neural networks, or connectionist AI, the precursors of today’s deep-learning systems. The idea in any such system is to process signals by sending them through a network of simulated nodes: analogs of neurons in the human brain. The signals pass from node to node along connections, or links: analogs of the synaptic junctions between neurons. And learning, as in the real brain, is a matter of adjusting the “weights” that amplify or damp the signals carried by each connection.

In practice, most networks arrange the nodes as a series of layers that are roughly analogous to different



processing centers in the cortex. So a network specialized for, say, images would have a layer of input nodes that respond to individual pixels in somewhat the same way that rod and cone cells respond to light hitting the retina. Once activated, these nodes propagate their activation levels through the weighted connections to other nodes in the next level, which combine the incoming signals and are activated (or not) in turn. This continues until the signals reach an output layer of nodes, where the pattern of activation provides an answer—asserting, for example, that the input image was the number “9.” And if that answer is wrong—say that the input image was a “0”—a “backpropagation” algorithm works its way back down through the layers, adjusting the weights for a better outcome the next time.

By the end of the 1980s, such neural networks had turned out to be much better than symbolic AI at dealing with noisy or ambiguous input. Yet the standoff between the two approaches still wasn’t resolved—mainly because the AI systems that could fit into the computers of the time were so limited. It was impossible to know for sure what those systems were capable of.

Power Boost

That understanding began to advance only in the 2000s, with the advent of computers that were orders of magnitude more powerful and social media sites offering a tsunami of images, sounds, and other training data. Among the first to seize this opportunity was Hinton, coauthor of the backpropagation algorithm and a leader of the 1980s-era connectionist movement. By mid-decade, he and his students were training networks that were not just far bigger than before. They were considerably deeper, with the number of layers increasing from one or two to about half a dozen. (Commercial networks today often use more than 100.)

In 2009, Hinton and two of his graduate students showed (2) that this kind of “deep learning” could recognize speech better than any other known method. In 2012, Hinton and two other students published experiments (3) showing that deep neural networks could be much better than standard vision systems at recognizing images. “We almost halved the error rates,” he says. With that double whammy in speech and image recognition, the revolution in deep-learning applications took off—as did researchers’ efforts to improve the technique.

One early priority was to expand the ways that deep-learning systems could be trained, says Matthew Botvinick, who in 2015 took leave from his neuroscience group at Princeton to do a year’s sabbatical at DeepMind and never left. Both the speech- and image-recognition systems used what’s called supervised learning, he says: “That means for every picture, there is a right answer—say, ‘cat’—and if the network is wrong, you tell it what the right answer is.” The network then uses the backpropagation algorithm to improve its next guess.

Supervised learning works great, says Botvinick—if you just happen to have a few hundred thousand carefully labeled training examples lying around. That’s not often the case, to put it mildly. And it simply doesn’t work for tasks such as playing a video game where there are no right or wrong answers—just strategies that succeed or fail.

For those situations—and indeed, for much of life in the real world—you need reinforcement learning, Botvinick explains. For example, a reinforcement learning system playing a video game learns to seek rewards (find some treasure) and avoid punishments (lose money).

The first successful implementation of reinforcement learning on a deep neural network came in 2015 when a group at DeepMind trained a network to play classic Atari 2600 arcade games (4). “The network would take in images of the screen during a game,” says Botvinick, who joined the company just afterward, “and at the output end were layers that specified an action, like how to move the joystick.” The network’s play equaled or surpassed that of human Atari players, he says. And in 2016, DeepMind researchers used a more elaborate version of the same approach with AlphaGo (5)—a network that mastered the complex board game Go—and beat the world-champion human player.

Beyond Deep Learning

Unfortunately, neither of these milestones solved the fundamental problems of deep learning. The Atari system, for example, had to play thousands of rounds to master a game that most human players can learn in minutes. And even then, the network had no way to understand or reason about on-screen objects such as paddles. So Hinton’s question remains as valid as ever: What’s missing?

Maybe nothing. Maybe all that’s required is more connections, more layers, and more sophisticated methods of training. After all, as Botvinick points out, it’s been shown that neural networks are mathematically equivalent to a universal computer, which means there is no computation they cannot perform—at least in principle, if you can ever find the right connection weights.

But in practice, those caveats can be killers—one big reason why there is a growing feeling in the field that deep learning’s shortcomings require some fundamentally new ideas.

One solution is simply to expand the scope of the training data. In an article published in May 2018 (6), for example, Botvinick’s DeepMind group studied what happens when a network is trained on more than one task. They found that as long as the network has enough “recurrent” connections running backward from later layers to earlier ones—a feature that allows the network to remember what it’s doing from one instant to the next—it will automatically draw on the lessons it learned from earlier tasks to learn new ones faster. This is at least an embryonic form of human-style “meta-learning,” or learning to learn, which is a big part of our ability to master things quickly.

A more radical possibility is to give up trying to tackle the problem at hand by training just one big



network and instead have multiple networks work in tandem. In June 2018, the DeepMind team published an example they call the Generative Query Network architecture (7), which harnesses two different networks to learn its way around complex virtual environments with no human input. One, dubbed the representation network, essentially uses standard image-recognition learning to identify what’s visible to the AI at any given instant. The generation network, meanwhile, learns to take the first network’s output and produce a kind of 3D model of the entire environment—in effect, making predictions about the objects and features the AI doesn’t see. For example, if a table only has three legs visible, the model will include a fourth leg with the same size, shape, and color.

These predictions, in turn, allow the system to learn quite a bit faster than with standard deep-learning methods, says Botvinick. “An agent that is trying to predict things gets feedback automatically on every time-step, since it gets to see how its predictions turned out.” So it can constantly update its models to make them better. Better still, the learning is self-supervised: the researchers don’t have to label anything in the environment for it to work or even provide rewards and punishments.

An even more radical approach is to quit asking the networks to learn everything from scratch for every problem. The blank-slate approach does leave the networks free to discover ways of representing objects and actions that researchers might never have thought of, as well as some totally unexpected game-playing strategies. But humans never start with a blank slate: for almost any task, they can bank on at least some prior knowledge that they’ve learned through experience or that was hardwired into their brains by evolution.

Infants, for example, seem to be born with many hardwired “inductive biases” that prime them to absorb certain core concepts at a prodigious rate. By the age of 2 months, they are already beginning to master the principles of intuitive physics (8), which includes the notion that objects exist, that they tend to move along continuous paths, and that when they touch they don’t just pass through each other. Those same infants are also beginning to learn the basics of intuitive psychology, which includes an ability to recognize faces and a realization that the world contains agents that move and act on their own.

Having this kind of built-in inductive biasing would presumably help deep neural networks learn just as rapidly, which is why many researchers in the field are now making it a priority. Within just the past 1 or 2 years, in fact, the field has seen a lot of excitement over a potentially powerful approach known as the graph network (9). “These are deep-learning systems that have an innate bias toward representing things as objects and relations,” says Botvinick.

For example, certain objects such as paws, tail, and whiskers might all belong to a larger object (cat) with the relationship is-a-part-of. Likewise, Ball A and Block B might have the mutual relationship is-next-to, the Earth would have the relationship is-in-orbit-around the Sun, and so on through a huge range of other examples—any of which could be represented as an abstract graph in which the nodes correspond to objects and the links to relationships.

A graph network, then, is a neural network that takes such a graph as input—as opposed to raw pixels or sound waves—then learns to reason about and predict how objects and their relationships evolve over time. (In some applications, a separate, standard image-recognition network might be used to analyze a scene and pick out the objects in the first place.)

The graph-network approach has already demonstrated rapid learning and human-level mastery of a variety of applications, including complex video games (10). If it continues to develop as researchers hope, it could ease deep learning’s 10,000-cow problem by making training much faster and more efficient. And it could make the networks far less vulnerable to adversarial attacks simply because a system that represents things as objects, as opposed to patterns of pixels, isn’t going to be so easily thrown off by a little noise or an extraneous sticker.

Fundamental progress isn’t going to be easy or fast in any of these areas, Botvinick acknowledges. But even so, he believes that the sky’s the limit. “These challenges are real,” he says, “but they’re not a dead end.”

1 Brown TB, Mané D, Roy A, Abadi M, Gilmer J (2017) Adversarial patch. ArXiv:1712.09665 [cs.CV].
2 Mohamed A, Dahl G, Hinton G (2009) Deep belief networks for phone recognition. Available at www.cs.toronto.edu/∼asamir/papers/
NIPS09.pdf. Accessed December 19, 2018.
3 Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Advances in Neural
Information Processing Systems, eds Pereira F, Burges CJC, Bottou L, Weinberger KQ (Curran Associates, Inc., Red Hook, NY), Vol 25,
pp 1097–1105.
4 Mnih V, et al. (2015) Human-level control through deep reinforcement learning. Nature 518:529–533.
5 Silver D, et al. (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529:484–489.
6 Wang JX, et al. (2018) Prefrontal cortex as a meta-reinforcement learning system. Nat Neurosci 21:860–868.
7 Eslami SMA, et al. (2018) Neural scene representation and rendering. Science 360:1204–1210.
8 Lake BM, Ullman TD, Tenenbaum JB, Gershman SJ (2017) Building machines that learn and think like people. Behav Brain Sci
40:e253.
9 Battaglia PW, et al. (2018) Relational inductive biases, deep learning, and graph networks. ArXiv:1806.01261 [cs.LG].
10 Zambaldi V, et al. (2018) Relational deep reinforcement learning. ArXiv:1806.01830 [cs.LG].

