Attacking Artificial Intelligence: AI's Security Vulnerability and What Policymakers Can Do About It
Marcus Comiter
PAPER
AUGUST 2019
Belfer Center for Science and International Affairs
Harvard Kennedy School
79 JFK Street
Cambridge, MA 02138
www.belfercenter.org
Statements and views expressed in this report are solely those of the authors and do not imply endorsement by Harvard University, the Harvard Kennedy School, or the Belfer Center for Science and International Affairs.
About the Author
Marcus Comiter is a Ph.D. Candidate in Computer Science at Harvard
University and a Non-Resident Fellow at the Belfer Center for Science
and International Affairs. Comiter’s computer science research focuses
on machine learning, computer and wireless networking (including 5G
wireless networks), and security. Comiter’s policy research leverages
knowledge of the latest computer science research to study the public
policy and cybersecurity implications of new technologies, such as
machine learning/artificial intelligence and big data. His research has been
published in both top computer science and public policy venues, and has received a best paper award.
Table of Contents
Executive Summary
Introduction
Conclusion
Executive Summary
Artificial intelligence systems can be attacked.
There are five areas most immediately affected by artificial intelligence attacks:
content filters, the military, law enforcement, traditionally human-based tasks
being replaced by AI, and civil society. These areas are attractive targets for
attack, and are growing more vulnerable due to their increasing adoption of
artificial intelligence for critical tasks.
This report proposes “AI Security Compliance” programs to protect against
AI attacks.
Public policy creating “AI Security Compliance” programs will reduce the
risk of attacks on AI systems and lower the impact of successful attacks.
Compliance programs would accomplish this by encouraging stakeholders to
adopt a set of best practices in securing systems against AI attacks, including
considering attack risks and surfaces when deploying AI systems, adopting
IT-reforms to make attacks difficult to execute, and creating attack response
plans. This program is modeled on existing compliance programs in other
industries, such as PCI compliance for securing payment transactions, and
would be implemented by appropriate regulatory bodies for their relevant
constituents.
Introduction
“Artificial intelligence algorithms can be attacked and controlled by an adversary.”
The terrorist of the 21st century will not necessarily need bombs, uranium,
or biological weapons. He will need only electrical tape and a good pair
of walking shoes. Placing a few small pieces of tape inconspicuously on a
stop sign at an intersection, he can magically transform the stop sign into a
green light in the eyes of a self-driving car. Done at one sleepy intersection,
this would cause an accident. Done at the largest intersections in leading
metropolitan areas, it would bring the transportation system to its knees.
It’s hard to argue with that type of return on a $1.50 investment in tape.
1 Eykholt, Kevin, et al. “Robust physical-world attacks on deep learning visual classification.”
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
2 Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. “Explaining and harnessing adversarial
examples.” arXiv preprint arXiv:1412.6572 (2014)
a drone searching for enemy activity on a reconnaissance mission, or
subverting content filters to post terrorist recruiting propaganda on social
networks, the danger is serious, widespread, and already here.
However, just as not all applications of AI are “good,” not all AI attacks are
necessarily “bad.” As autocratic regimes turn to AI as a tool to monitor and
control their populations, AI “attacks” may be used as a protective measure
against government oppression, much like technologies such as Tor and
VPNs are.
The report is split into four sections. First, it begins by giving an accessible
yet comprehensive description of how current AI systems can be attacked,
the forms of these attacks, and a taxonomy for categorizing them.
Second, the report identifies the most critical areas affected by this new
class of vulnerabilities. While the number of systems affected by this new
threat will only grow as AI increases its penetration into the modern world,
this report focuses on five high priority areas that require immediate atten-
tion: content filters, military, law enforcement, human tasks being replaced
with AI, and civil society.
Fourth, the report proposes the idea of “AI Security Compliance” programs
to protect against AI attacks. These compliance programs will reduce the
risk of attacks on AI systems and lower the impact of successful attacks.
They will accomplish this by encouraging stakeholders to adopt a set of
best practices in securing systems against AI attacks, including considering
attack risks and surfaces when deploying AI systems, adopting IT-reforms
that will make attacks more difficult to execute, and creating attack
response plans to mitigate attack damage.
This policy will improve the security of the community, military, and econ-
omy in the face of AI attacks. But for policymakers and stakeholders alike,
the first step towards realizing this security begins with understanding the
problem, which we turn our attention to now.
Part I: Technical Problem
General George Patton may have won the D-Day campaign for the Allies
without ever firing a shot. In support of the future D-Day landings, Patton
was given charge of the First United States Army Group (FUSAG). Rather
than fighting in arms, the FUSAG fought in deception. To convince the
German command that the invasion point would be Pas de Calais rather
than Normandy, the FUSAG orchestrated a major force deployment—
including hundreds of tanks and other vehicles—directly across the
English Channel from it.
These tanks, however, were not what they seemed. Unable to spare the
vehicles needed for this show of force from the actual war effort, the Allies
instead used inflatable balloons painted to look like tanks. Although more
characteristic of a technique employed by Bugs Bunny against Elmer Fudd
than George Patton against Nazis, it did the trick. German reconnaissance
was fooled. The images captured by the Luftwaffe planes were interpreted
as a major buildup of forces in anticipation of an invasion of Pas de Calais,
leaving the beaches of Normandy under-fortified.3
Given access to the site, we would not expect a human to mistake what was
essentially a painted balloon for a multi-ton metal machine. But German
reconnaissance worked by recognizing patterns: the shapes and markings
representing tanks and other military assets in images. Relegated to pattern
matching, German reconnaissance was easy to fool with a few strategic
markings placed on the inflatable balloons. Although surprising, this is the
same flaw that dooms AI algorithms, allowing them to be fooled in similar
and even more pernicious manners.
3 Knighton, Andrew, “FUSAG: The Ghost Army—Patton’s D-Day Force That Was Only Threat In The
Enemy’s Imagination”, 14 May 2017, https://www.warhistoryonline.com/world-war-ii/fusag-the-
ghost-army-pattons-d-day-force-that-was-only-a-threat-xb.html.
To understand why AI systems are vulnerable to the same weakness, we
must briefly examine how AI algorithms, or more specifically the machine
learning techniques they employ, “learn.” Just like the reconnaissance offi-
cers, the machine learning algorithms powering AI systems “learn” by
extracting patterns from data. These patterns are tied to higher-level con-
cepts relevant to the task at hand, such as which objects are present in an
image. As an example, consider the task of an AI algorithm on a self-driv-
ing car learning to recognize a stop sign. For this task, the algorithm
“learns” by being shown a dataset containing hundreds or thousands of
examples of stop signs and extracting patterns of colors and shapes repre-
sentative of it. When later tasked to identify if a particular sign is a stop
sign, the algorithm scans the image looking for the patterns it has learned
to associate with a stop sign. If the patterns match, the algorithm can
instruct the car to stop. If the patterns instead match those of a different sign, such as a higher speed limit, the algorithm can similarly instruct the car to speed up.
Just as the FUSAG could expertly devise what patterns needed to be painted on the inflatable balloons to fool the Germans, with a type of AI attack called an “input attack,” adversaries can craft patterns of changes to a target that will fool the AI system into
making a mistake. This attack is possible because
when patterns in the target are inconsistent with the variations seen in the
dataset, as is the case when an attacker adds these inconsistent patterns
purposely, the system may produce an arbitrary result. However, unlike
the tank example, these patterns or markings need not be as blatant. This
is because AI algorithms process information differently than humans do.
As a result, while it may have been necessary to make the balloons actually
look like tanks to fool a human, to fool an AI system, only a few stray
marks or subtle changes to a handful of pixels in an image are needed to
destroy an AI system.
hypnotizing the German analysts to close their eyes anytime they were
about to see any valuable information that could be used to hurt the Allies.
But what exactly are AI attacks? Why do they exist? And what do they
look like? We now turn our attention to understanding the technical basis
of these attacks in order to answer these questions.
Overview of Artificial Intelligence Attacks
a social network to malfunction, therefore letting the material
propagate unencumbered.
Why Do Artificial Intelligence Attacks Exist?
To see why these attacks exist, we need to understand how the algorithms
underpinning AI work. Many current AI systems are powered by machine
learning,4 a set of techniques that extract information from data in order
to “learn” how to do a given task. A machine learning algorithm “learns”
analogously to how humans learn. Humans learn by seeing many examples
of an object or concept in the real world, and store what is learned in the
brain for later use. Machine learning algorithms “learn” by seeing many
examples of an object or concept in a dataset, and store what is learned in a
model for later use. In many if not most AI applications based on machine
learning, there is no outside knowledge or other magic used in this process:
it is entirely dependent on the dataset and nothing else.5
4 As a note on terminology, artificial intelligence and machine learning are popularly used
interchangeably. In a more exact sense, the two are distinct. Artificial intelligence is a broader term
that generally refers to the ability of computer systems to execute complex tasks performed by
humans. Machine learning is one particular method used to power artificial intelligence, and is a set
of techniques and algorithms that “learn” by extracting patterns from data. Due to the overwhelming
success of machine learning algorithms compared to other methods, many artificial intelligence
systems today are based entirely on machine learning. As a result, the attacks and vulnerabilities
described in this report affect both artificial intelligence and machine learning systems.
5 Production machine learning systems may feature a good amount of human and guard rail
engineering, while others may be fully data dependent. As a result, some production systems may
fall along a spectrum between “learned” systems that are fully data dependent and “designed”
systems that are heavily based on hand-designed features. However, systems that are closer to
the “designed” side of the spectrum may still be vulnerable to attacks, such as input attacks.
Further, given the success of learning, which often captures patterns and relations that could not
be designed manually by human model designers, many if not most systems will rely heavily on
learned features, and be vulnerable to attacks.
The key to understanding AI attacks is understanding what the “learning”
in machine learning actually is, and more importantly, what it is not. Recall
that machine learning “learns” by looking at many examples of a concept
or object in a dataset. More specifically, it uses algorithms that extract and
generalize common patterns in these examples. These patterns are stored
within the model. Taking the example of recognizing a stop sign, the learn-
ing algorithm will identify patterns in the pixels that make up the example
images, such as large areas of red, the shapes of the letters “S” “T” “O”
and “P”, and other defining characteristics. When the model is later called
upon to detect a stop sign in a new image, it will search that image for the
same patterns of pixels. If it finds patterns that match those it has learned
to associate with a stop sign, it will output that it has found a stop sign. If
it instead finds patterns that match those it has learned to associate with a
different object, such as a green light, it will output that it has found a green
light. These patterns are “general” in the sense that they should work in
new settings, not just on the examples from which it learned. For example,
the patterns in the example above should be able to recognize all stop signs,
not just the particular ones included in the dataset.
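To make this concrete, the following is a minimal sketch, in Python using the scikit-learn library, of this kind of pattern learning. The function and variable names are illustrative, and a simple classifier stands in for the deep neural networks used in real systems; the key property is the same: the model knows only the pixel patterns present in its training dataset.

# Minimal sketch of "learning" as pattern extraction from a labeled dataset.
# The resulting model stores statistical pixel patterns, not concepts.
from sklearn.linear_model import LogisticRegression

def train_sign_classifier(train_images, train_labels):
    # train_images: one row of flattened pixel values per example image
    # train_labels: the corresponding sign names, e.g., "stop sign"
    model = LogisticRegression(max_iter=1000)
    model.fit(train_images, train_labels)  # extract patterns from the dataset
    return model

# Later, the model searches a new image for the patterns it has learned:
#   prediction = model.predict([new_image_pixels])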
Given enough data, the patterns learned in this manner are of such high
quality that they can even outperform humans on many tasks. This is
because if the algorithm sees enough examples in all of the different ways
the target naturally appears, it will learn to recognize all the patterns
needed to perform its job well. Continuing the stop sign example, if the
dataset contains images of stop signs in the sun and shade, from straight
ahead and from different angles, during the day and at night, it will learn
all the possible ways a stop sign can appear in nature.
6 Bagdasaryan, Eugene, et al. “How to backdoor federated learning.” arXiv preprint arXiv:1807.00459
(2018).
But the problems do not end there. Even assuming a non-corrupted dataset
and highly accurate model, this success comes with a very important
caveat: the patterns “learned” by current state-of-the-art machine learning
models are relatively brittle. As a result, the model only works on data that
is similar in nature to the data used during the learning process. If used on
data that is even a little different in nature from the types of variations it
saw in the original dataset, the model may utterly fail. This is a major lim-
itation attackers can exploit: by introducing artificial variations—such as a
piece of tape or other aberrant patterns—the attacker can disrupt the
model and control its behavior based on what artificial pattern is intro-
duced. Because the amount of data used to build the model is finite but the number of artificial variations an attacker can create is infinite, the attacker has an inherent advantage.
This explains how the stop sign tape attack can cause a self-driving car to crash. While the dataset used to train the stop sign detector contains plenty of variations of stop signs in different natural conditions, it doesn’t contain examples of the endless ways it can be artificially altered by an attacker.
Many may be surprised to learn that machine learning has such a glaring
shortcoming. This is because popular culture has shaped a widespread but
erroneous belief that machine learning actually “learns” in a human sense of
the word. Humans are good at truly learning concepts and associations. If a
stop sign is distorted or defaced with graffiti or dirt, even a human who has
never seen graffiti or a dirty stop sign would still reliably and consistently
identify it as a stop sign, and certainly would not mistake it for an entirely
different object altogether, such as a green light. But we now know current AI
systems do not work in the same way. Even a model that can almost perfectly
recognize a stop sign still has no knowledge of the concept of a stop sign,
or even a sign for that matter, as a human does. It only knows that certain
learned patterns correspond to a label named “stop sign.”
While it may seem that this distinction between human learning and
machine “learning” is arbitrary—especially because if the model works,
it seems we should be happy—we now understand why it has such severe
ramifications: under contested conditions, AI systems can be made to fail
even if they are extremely successful under “normal” conditions.
A logical step to combat this would be to understand why the patterns the
model learns are so brittle. However, this is not currently possible for the most widely used models, such as deep neural networks, as exactly how
and even what these models learn is still not fully understood. As a result,
the most popular machine learning algorithms powering AI, like neural
networks, are referred to as “black boxes”: we know what goes in, we know
what comes out, but we do not know exactly what happens in between. We
cannot reliably fix what we do not understand. And for this same reason, it
is difficult if not impossible to even tell if a model is being attacked or just
doing a bad job. While other data science methods, such as decision trees
and regression models, allow for much more explainability and under-
standing, these methods do not generally deliver the performance that the
widely used neural networks are capable of providing.
• Characteristic 2: Dependence solely on data provides a main
channel to corrupt a machine learning model. Machine learning
“learns” solely by extracting patterns from a set of examples known
as a dataset. Unlike humans, machine learning models have no
baseline knowledge that they can leverage—their entire knowledge
depends wholly on the data they see. Poisoning the data poisons
the AI system. Attacks in this vein essentially turn an AI system
into a Manchurian candidate that attackers can activate at a time of
their choosing.
Taken together, these weaknesses explain why there are no perfect tech-
nical fixes for AI attacks. These vulnerabilities are not “bugs” that can be
patched or corrected as is done with traditional cybersecurity vulnerabili-
ties. They are deep-seated issues at the heart of current state-of-the-art AI
itself.
Input Attacks
Input attacks do not require the attacker to have corrupted the AI system in
order to attack it. Completely state-of-the-art AI systems that are highly accu-
rate and have never had their integrity, dataset, or algorithms compromised are
still vulnerable to input attacks. And in stark contrast to other cyberattacks,
the attack itself does not always use a computer!
Figure 1: In regular use (top), the AI system takes a valid input, processes it
with the model (brain), and returns an output. In an input attack (bottom), the
input to the AI system is altered with an attack pattern, causing the AI system
to return an incorrect output.
These attacks are particularly dangerous because the attack patterns do
not have to be noticeable, and can even be completely undetectable.
Adversaries can be surgical, changing just a small aspect of the input in
a precise and exact way to break the patterns learned previously by the
model. For attacks on physical objects that must be captured by a sensor
or camera before being fed into an AI system, attackers can craft small
changes that are just big enough to be captured by the sensor. This is the
canonical “tape attack”: attackers figure out that placing a two-inch piece
of white tape on the upper corner of a stop sign will exploit a particular
brittleness in the patterns learned by the model, turning it into a green
light.7 For attacks on digital objects that are fed directly into the AI system,
such as an image uploaded to a social network, the attack patterns can be
imperceivable to the human eye. This is because in this all-digital setting,
the alterations can occur on an individual pixel level, creating alterations
that are so small they are literally invisible to the human eye.
The most interesting aspect of input attacks is how varied they are. Input attacks on AI systems are like snowflakes: no two are exactly alike. The first step in securing systems from these attacks is to create a taxonomy to bring order to the endless attack possibilities. “Form fits function” is an appropriate lens with which to do so: adversaries will choose a form for their attack that fits their particular scenario and mission. Therefore, a taxonomy should follow this same tendency.
Input attack forms can be characterized along two axes: perceivability and
format. Perceivability characterizes if the attack is perceivable to humans
(e.g., for AI attacks on physical entities, is the attack visible or invisible
to the human eye). Format characterizes if the attack vector is a physical
real-world object (e.g., a stop sign), or a digital asset (e.g., an image file on
a computer). The figure below shows this taxonomy.
7 Eykholt, Kevin, et al. “Robust physical-world attacks on deep learning visual classification.”
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
Figure 2: Taxonomy for categorizing input attacks. The horizontal axis
characterizes the format of the attack, either in the physical world or
digital. The vertical axis characterizes the perceivability of the attack, either
perceivable to humans or imperceivable to humans.
(See footnote8 for thumbnail images citations.)
8 Graphic by Marcus Comiter except for stop sign attack thumbnail from Eykholt, Kevin, et al.
“Robust physical-world attacks on deep learning visual classification.” Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition. 2018, panda attack thumbnail from
Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. “Explaining and harnessing adversarial
examples.” arXiv preprint arXiv:1412.6572 (2014), turtle attack thumbnail from Athalye, Anish, et al.
“Synthesizing robust adversarial examples.” arXiv preprint arXiv:1707.07397 (2017), and celebrity
attack thumbnail from Sharif, Mahmood, et al. “Adversarial generative nets: Neural network attacks
on state-of-the-art face recognition.” arXiv preprint arXiv:1801.00349 (2017).
Perceivability Axis
We first discuss the perceivability axis. On one end of the axis are “per-
ceivable” attacks in which the input attack pattern is able to be noticed
by humans. The attack patterns can be alterations to the target itself, such
as deforming, removing a portion of, or altering the color of the target.
Alternatively, the attack pattern may be an addition to the target, such as
affixing tape or other decals to the physical target, or adding digital marks
to a digital target. Examples of perceivable attacks include defacing a stop
sign with patterns formed from tape,9 or using software to superimpose
objects such as glasses10 on a digital image of a subject (as many popular
apps like Snapchat do).
The figure below shows how a perceivable attack is formed for a physical object.
A regular object is altered with a visible attack pattern (a few pieces of tape) to
form the attack object. While the regular object would be classified correctly by
the AI system, the attack object is incorrectly classified as a “green light”.
9 Eykholt, Kevin, et al. “Robust physical-world attacks on deep learning visual classification.”
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
10 Sharif, Mahmood, et al. “Accessorize to a crime: Real and stealthy attacks on state-of-the-art face
recognition.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications
Security. ACM, 2016.
11 Graphic by Marcus Comiter except for stop sign noise thumbnail and stop sign attack thumbnail
from Eykholt, Kevin, et al. “Robust physical-world attacks on deep learning visual classification.”
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
Although perceivable attacks are noticeable by humans, they can still be
highly effective for a number of reasons. First, perceivable attacks need not
be ostentatious. A visible attack in the form of a few carefully chosen pieces
of tape placed on a stop sign is able to be perceived, but will not necessarily
be noticed. Humans are naturally conditioned to ignore small changes in
their environment, such as graffiti, vandalism, and natural wear and tear.
As such, perceivable attacks may go completely unnoticed. Second, per-
ceivable attacks can be crafted to hide in plain sight. A visible attack in the
form of specially designed glasses or a specially crafted logo added to a
person’s t-shirt would be noticed, but would not be suspected of being an
attack, effectively hiding in plain sight. In this case, rather than crafting an
attack to be as small as possible, it may actually be more effective for it to
be large but blend into its surroundings.
On the other end of the perceivability axis are “imperceivable” attacks that are invisible to human senses. Imperceivable attacks can take many forms. For digital content like images, these attacks can be executed by sprinkling “digital dust” on top of the target.12 Technically, this dust is in the form of small, unperceivable perturbations made to the entire target. Each small portion of the target is changed so slightly that the human eye cannot perceive the change, but in aggregate, these changes are enough to alter the behavior of the algorithm by breaking the brittle patterns learned by the
model. The figure below shows how an imperceivable attack is formed in this
manner. A normal digital image is altered with tiny, imperceivable pixel-level
perturbations scattered throughout the image, forming the attack image.
While the regular image would be classified correctly by the AI system as a
“panda”, the attack object is incorrectly classified as a “monkey”. However,
because the attack pattern makes such small changes, to the human eye, the
attack image looks identical to the original regular image.
12 Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. “Explaining and harnessing adversarial
examples.” arXiv preprint arXiv:1412.6572 (2014)
Figure 4: Crafting an invisible input attack. A small amount of noise that is
invisible to the human eye is added to the entire image, making the AI system
misclassify the image without changing its appearance. (Image concept from
footnote13, see footnote14 for thumbnail images citations.)
Imperceivable attacks are not limited to just digital objects. For example,
attack patterns can be added in imperceivable ways to a physical object itself.
Researchers have shown that a 3D-printed turtle with an imperceivable input
attack pattern could fool AI-based object detectors.15 While turtle detection
may not have life and death consequences (yet...), the same strategy applied
to a 3D-printed gun may. In the audio domain, high pitch sounds that are
imperceivable to human ears but able to be picked up by microphones can be
used to attack audio-based AI systems, such as digital assistants.
13 Image concept showing how attack is formed from Goodfellow, Ian J., Jonathon Shlens,
and Christian Szegedy. “Explaining and harnessing adversarial examples.” arXiv preprint
arXiv:1412.6572 (2014)
14 Graphic by Marcus Comiter except for panda image thumbnail, noise image thumbnail, and panda
attack thumbnail from Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. “Explaining and
harnessing adversarial examples.” arXiv preprint arXiv:1412.6572 (2014).
15 Athalye, Anish, et al. “Synthesizing robust adversarial examples.” arXiv preprint
arXiv:1707.07397 (2017).
Imperceivable attacks are highly applicable to targets that the adversary
has full control over, such as digital images or manufactured objects. For
example, a user posting an illicit image, such as one containing child por-
nography, can alter the image such that it evades detection by the AI-based
content filters, but also remains visually unchanged from the human per-
spective. This allows the attacker unfettered and, for all practical purposes,
unaltered distribution of the content without detection.
Format
We next discuss the format axis. On one end of the axis are “physical”
attacks. These are attacks in which the target being attacked exists in the
physical world. While physical attacks are easiest to think of in terms
of objects, including stop signs, fire trucks, glasses, and even humans,
they are also applicable to other physical phenomena, such as sound. For
example, attacks have been shown on voice controlled digital assistants,
where a sound has been used to trigger action from the digital assistant.16
Alterations are made directly to or placed on top of these targets in order
to craft an attack. Examples of physical attacks on real-world objects are
shown in the figure below.
Figure 5: Examples of physical attacks on real world objects. Physical attacks
can be perceivable, as with the stop sign or yellow glasses, or imperceivable,
as with the 3D-printed turtle and baseball shown here.
(See footnote18 for thumbnail images citations.)
On the other end of the format axis are “digital” attacks. These are attacks
in which the target being attacked is a digital asset. Examples include
images, videos, social media posts, music, files, and documents. Unlike
physical targets that must first be sensed and digitized, digital targets are
fed directly in their original state into the AI system. This gives adversaries
an expanded selection of attacks and lowers the difficulty of crafting a
successful attack, as they do not need to account for possible distortion of
the attack pattern during this sensing process. As such, digital attacks are
particularly well suited to invisibility. Examples of digital attacks on digital
images are shown in the figure below. (While the digital attacks shown in
this figure are all digital images, this choice is for presentation purposes,
and attacks can also target other digital assets such as videos and files.)
18 Graphic by Marcus Comiter except for stop sign attack thumbnail from Eykholt, Kevin, et al.
“Robust physical-world attacks on deep learning visual classification.” Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition. 2018., turtle attack thumbnail
and baseball attack thumbnail from Athalye, Anish, et al. “Synthesizing robust adversarial
examples.” arXiv preprint arXiv:1707.07397 (2017), and girl with glasses attack thumbnail from
Sharif, Mahmood, et al. “Accessorize to a crime: Real and stealthy attacks on state-of-the-art face
recognition.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications
Security. ACM, 2016.
Figure 6: Examples of digital attacks on digital images. Digital attacks can
be perceivable, as with the silly glasses superimposed on the picture of a
celebrity (middle), or imperceivable, as with the panda and duck images
shown here (left, right). (See footnote19 for thumbnail images citations.)
Once attackers have chosen an attack form that suits their needs, they must
craft the input attack. The difficulty of crafting an attack is related to the
types of information available to the attacker. However, it is important to
note that attacks are still practical (although potentially more challenging
to craft) even under very difficult and restrictive conditions.
An input attack is relatively easy to craft if the attacker has access to the AI
model being attacked. Armed with this, the attacker can automatically craft
attacks using simple textbook optimization methods. Publicly available
software implementing these methods is already available.20 Attackers can
also use Generative Adversarial Networks (GANs), a technique in which two models are trained against one another, to help craft these attacks.21
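To illustrate how simple these textbook methods can be, the following is a minimal sketch, in Python using the PyTorch library, of one such technique, the fast gradient sign method. It assumes the attacker has a copy of the model and one correctly classified input; the function name, the epsilon value, and the assumption of pixel values between zero and one are illustrative.

# Minimal sketch of a gradient-based input attack (fast gradient sign method),
# assuming the attacker has the model. The perturbation is bounded by epsilon,
# so it can be kept small enough to be imperceivable to humans.
import torch
import torch.nn.functional as F

def craft_input_attack(model, image, true_label, epsilon=0.01):
    # image: a batch of one input tensor; true_label: tensor with its class index
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()  # gradient of the loss with respect to every pixel
    # Nudge each pixel slightly in the direction that increases the loss.
    attacked = image + epsilon * image.grad.sign()
    return attacked.clamp(0, 1).detach()  # keep pixels in the valid range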
19 Graphic by Marcus Comiter except for panda attack thumbnail from Goodfellow, Ian J., Jonathon
Shlens, and Christian Szegedy. “Explaining and harnessing adversarial examples.” arXiv preprint
arXiv:1412.6572 (2014), celebrity attack thumbnail from Sharif, Mahmood, et al. “Adversarial
generative nets: Neural network attacks on state-of-the-art face recognition.” arXiv preprint
arXiv:1801.00349 (2017), and goose attack thumbnail from Gong, Yuan, and Christian Poellabauer.
“Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic
Cues.” 2018 27th International Conference on Computer Communication and Networks (ICCCN).
IEEE, 2018.
20 See, e.g., https://github.com/tensorflow/cleverhans
21 Goodfellow, Ian, et al. “Generative adversarial nets.” Advances in neural information processing
systems. 2014.
While it may seem shocking that attackers would have access to the model,
there are a number of common scenarios in which this would occur rou-
tinely. On the more innocent side of the spectrum, models are often made
public because they have been optimized by researchers or companies for
an important general task, such as object recognition, and then made
public for anyone to use as part of the “open source” movement.22 On the
more sinister side of the spectrum, attackers can hack the system storing
the model in order to steal it. The model itself is just a digital file living on a
computer, no different from an image or document, and therefore can be
stolen like any other file on a computer. Because models are not always
seen as highly sensitive assets, the systems holding these models may not
have high levels of cybersecurity protection. History has shown that when
software capabilities are commoditized, as they are becoming with AI sys-
tems, they are often not handled or invoked carefully in a security sense, as
demonstrated by the prevalence of default root passwords. If this history is
any indication, the systems holding these models will suffer from similar
weaknesses that can lead to the model being easily stolen.
Even in cases where the attacker does not have the model, it is still possible to mount an input attack. If attackers have access to the dataset used to train the model, they can use it to build their own copy of the model, and use this “copy model” to craft their
attack. Researchers have shown that attacks crafted
using these “copy models” are easily transferable to the originally targeted
models.23 As was the case with models, there are a number of common sce-
narios in which the attacker would have access to the dataset. Like models
themselves, datasets are made widely available as part of the open source
movement, or could similarly be obtained by hacking the system storing
this dataset. In an even more restrictive setting where the dataset is not
available, attackers could compile their own similar dataset, and use this
similar dataset to build a “copy model” instead.
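A minimal sketch of this “copy model” strategy is below. The helpers train_model and craft_input_attack are hypothetical stand-ins for ordinary training code and an attack-crafting routine such as the gradient method sketched above; the victim model itself is never touched.

# Minimal sketch of the "copy model" (surrogate) strategy: train a model on
# the public or stolen dataset, craft the attack against that copy, and rely
# on the attack transferring to the hidden victim model.
def craft_transfer_attack(public_dataset, target_input, true_label,
                          train_model, craft_input_attack):
    copy_model = train_model(public_dataset)  # surrogate under attacker control
    # Attacks crafted against the copy frequently fool the original model too.
    return craft_input_attack(copy_model, target_input, true_label)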
Even if attackers have access to neither the model nor its dataset, they can still craft an attack as long as they can observe the model’s outputs. This situation occurs often in practice, with
businesses offering Artificial Intelligence as a Service via a public API.24
This service gives users the output of an AI model trained for a particular
task, such as object recognition. While these models and their associated
datasets are kept private, attackers can use the output information from
their APIs to craft an attack. This is because this output information
replaces the need for having the model or the dataset.
In the hardest case where nothing about the model, its dataset, or its output
is available to the attacker, the attacker can still try to craft attacks by brute
force trial-and-error. For example, an attacker trying to beat an online con-
tent filter can keep generating random attack patterns and uploading the
content to see if it is removed. Once a successful attack pattern is found, it
can be used in future attacks.
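A minimal sketch of this trial-and-error strategy is below. Both helper functions are hypothetical: one applies a small random pattern to a piece of content, and the other stands in for uploading the content and observing whether the filter removes it.

# Minimal sketch of brute-force attack crafting against a black-box content
# filter. It needs no knowledge of the model, its dataset, or its raw outputs,
# only the ability to observe whether uploads are removed.
def find_evading_pattern(content, add_random_pattern, upload_and_is_removed,
                         max_tries=10000):
    for _ in range(max_tries):
        candidate = add_random_pattern(content)  # small random perturbation
        if not upload_and_is_removed(candidate):
            return candidate  # a pattern that slips past the filter
    return None  # no successful pattern found within the budget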
24 See, e.g., “Machine Learning on AWS: Putting Machine Learning in the Hands of Every Developer”,
https://aws.amazon.com/machine-learning/
Poisoning Attacks
Figure 7: In normal machine learning (left), the learning algorithm extracts
patterns from a dataset, and the “learned” knowledge is stored in the machine
learning model—the brain of the system. In a poisoning attack (right), the
attacker changes the training data to poison the learned model.
The ability to attack the dataset collection process represents the beginning
of a new era of attitudes towards data. Today, data is generally viewed as
a truthful representation of the world, and has been successfully used to
teach AI systems to perform tasks within this world. As a result, data col-
lection practices today resemble a dragnet: everything that can be collected
is collected. The reason for this is clear: AI is powered almost entirely by
data, and having more data is generally correlated with better AI system
performance.
However, now that the dataset collection process itself may be attacked, AI
users can no longer blindly trust that the data they collect is valid. Data
represents the state of something in the world, and this state can be altered
by an adversary. This represents a new challenge: even if data is collected
with uncompromised equipment and stored securely, what is represented
in the data itself may have been manipulated by an adversary in order to
poison downstream AI systems. This is the classic misinformation cam-
paign updated for the AI age.
Dataset Poisoning
The most direct way to poison a model is via the dataset. As previously
discussed, the model is wholly dependent on the dataset for all of its
knowledge: poison the dataset, poison the model. An attacker can do this
by introducing incorrect or mislabeled data into the dataset. Because the
machine learning algorithms learn a model by recognizing patterns in
this dataset, poisoned data will disrupt this learning process, leading to a
poisoned model that may, for example, have learned to associate patterns
with mislabeled outcomes that serve the attacker’s purpose. Alternatively,
the adversary can change its behavior so that the data collected in the first
place will be wrong.
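A minimal sketch of the most basic form of this attack, label flipping, is below. It assumes the attacker can modify a small fraction of a training set held as (example, label) pairs; the names and the five percent fraction are illustrative.

# Minimal sketch of dataset poisoning by label flipping: a small fraction of
# one class is deliberately mislabeled so the learned model associates its
# patterns with the attacker's chosen output.
import random

def flip_labels(dataset, target_label, poisoned_label, fraction=0.05):
    poisoned = list(dataset)
    candidates = [i for i, (_, label) in enumerate(poisoned) if label == target_label]
    for i in random.sample(candidates, int(len(candidates) * fraction)):
        example, _ = poisoned[i]
        poisoned[i] = (example, poisoned_label)  # mislabel on purpose
    return poisoned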
Discovering poisoned data in order to stop poisoning attacks can be very
difficult due to the scale of the datasets. Datasets routinely contain millions
of samples. These samples often come from public sources rather
than private collection efforts. Even in the case when the dataset is col-
lected privately and verified, an attacker may hack into the system where
the data is being stored and introduce poisoned samples, or seek to corrupt
otherwise valid samples.
Algorithm Poisoning
that install a particular backdoor into models,26 as well as those that gener-
ally degrade the model,27 have already been demonstrated.
Model Poisoning
26 Bagdasaryan, Eugene, et al. “How to backdoor federated learning.” arXiv preprint arXiv:1807.00459
(2018).
27 Bhagoji, Arjun Nitin, et al. “Analyzing Federated Learning through an Adversarial Lens.” arXiv
preprint arXiv:1811.12470 (2018).
Part II: Impacted Systems
We now turn our attention to which systems and segments of society are
most likely to be impacted by AI attacks. AI systems are already integrated
into many facets of society, and increasingly so every day. For industry and
policymakers, the five most pressing vulnerable areas are content filters,
military systems, law enforcement systems, traditionally human-based
tasks being replaced with AI, and civil society.
Content Filters
Content filters are also uniquely qualified to police content at the scale the
Internet requires. The amount of content uploaded to the Internet each minute is staggering and growing rapidly. Over three billion images
are shared every day on the Internet.28 AI-based content filters have
emerged as the primary, if not only, tool able to operate at this scale, and
have been widely adopted by industry. For example, Facebook removed
21 million pieces of lewd content in the first quarter of 2018 alone, 96% of
which was flagged by these algorithms.29
28 List, Mary, “33 Mind-Boggling Instagram Stats & Facts for 2018”, 19 February 2018, https://www.
wordstream.com/blog/ws/2017/04/20/instagram-statistics
29 Meeker, Mary, “Internet Trends 2018”, 30 May 2018, https://www.slideshare.net/kleinerperkins/
internet-trends-report-2018-99574140
30 Alfifi, Majid, et al. “Measuring the Impact of ISIS Social Media Strategy.” (2018): 1-4.
31 Mozur, Paul, “A Genocide Incited on Facebook, With Posts from Myanmar’s Military”, NY Times, 15
October 2018, https://www.nytimes.com/2018/10/15/technology/myanmar-facebook-genocide.
html.
Belfer Center for Science and International Affairs | Harvard Kennedy School 33
democratic elections in the U.S. and Europe.32 As this content successfully
weaponizes US-based platforms, the efficacy of AI-based content filters has
broad-ranging implications, including the defense of both national security
and oppressed populations.
Even in more banal uses, content filters are tied to many business models.
As advertisers begin to be held responsible in the court of public opinion
for the content appearing next to their advertisements, there is a growing
need to detect an increasing number of objectionable content types. This
extends to detection of nudity, violence, hate crimes, weapons, adult
pornography, profanity, and inappropriate comments. YouTube faced a boycott by advertisers including AT&T, Disney, Hasbro, and Nestle for fail-
ing to effectively filter sexual comments left by viewers on videos in which
children appeared.34
As content filters are drafted into these battles, there will be strong
incentives both to attack them and to generate tools making these attacks
easier to execute. Adversaries have already seen the power of using digital
platforms in pursuit of their mission. ISIS organically grew an international
following and successfully executed a large-scale recruitment program
using social media. These are successes that, morals aside, may have evoked
jealousy from the marketing departments of Fortune 500 companies.
Future organizations of malice are likely to follow the same playbook. If
confronted with better content filters, they are likely to be the first adopters
of AI attacks against these filters.
In this respect, entities such as social networks may not even know they are
under attack until it is too late, a situation echoing the 2016 U.S. presiden-
tial election misinformation campaigns. As a result, as is discussed in the
policy response section, content-centric site operators must take proactive
steps to protect against, audit for, and respond to these attacks.
Military
35 “Establishment of the Joint Artificial Intelligence Center”, Deputy Secretary of Defense, 27 June
2018, https://admin.govexec.com/media/establishment_of_the_joint_artificial_intelligence_
center_osd008412-18_r....pdf
36 Pellerin, Cheryl, “Project Maven Industry Day Pursues Artificial Intelligence for DoD Challenges”,
U.S. Department of Defense, 27 October 2017, https://dod.defense.gov/News/Article/
Article/1356172/project-maven-industry-day-pursues-artificial-intelligence-for-dod-challenges/
37 MSTAR Public Targets, https://www.sdms.afrl.af.mil/index.php?collection=mstar&page=targets.
made the development of “edge computing” a priority, as the bandwidth
needed to support a cloud-based AI paradigm is unlikely to be available
in battlefield environments.38 This reality will require these systems to be
treated with care. Just as the military recognizes the threat created when a
plane, drone, or weapon system is captured by an enemy, these AI systems
must be recognized and treated as members of this same protected class so that they are not compromised if captured.
38 “Interview with Lieutenant General Jack Shanahan: Part 2”, Over the Horizon Multi-Domain
Operations and Strategy, 4 April 2018, https://othjournal.com/2018/04/04/interview-with-
lieutenant-general-jack-shanahan-part-2/
39 Statement by Dana Deasy, Department of Defense Chief Information Office, Before the House
Armed Services Committee Subcommittee on Emerging Threats and Capabilities on “Department
of Defense’s Artificial Intelligence Structure, Investments, and Applications”, 26 February 2019,
https://armedservices.house.gov/_cache/files/5/7/579723e2-4461-4a8c-95da-ec3e84c4985e/
E41B38FCB69AD83331F31CDC06570D33.hhrg-116-as26-wstate-deasyd-20190226.pdf.
40 “Interview with Lieutenant General Jack Shanahan: Part 1”, Over the Horizon Multi-Domain
Operations and Strategy, 2 April 2018, https://othjournal.com/2018/04/02/interview-with-
lieutenant-general-jack-shanahan-part-1/
applications depended on this same shared dataset, this could lead to
widespread vulnerabilities throughout the military. In the case of input
attacks, an adversary would then be easily able to find attack patterns to
engineer an attack on any systems trained using the dataset. In the case of
poisoning attacks, an adversary would only need to compromise one
dataset in order to poison any downstream models that are later trained
using this poisoned dataset.
Further, the process associated with creating these unique datasets can lead to vulnerabilities that can be exploited. When building AI-enabled weapons and defense systems, the individual data samples used to train the models themselves become a secret that must be protected. However,
because this preparation work is exceedingly time consum-
ing, it may rely on a large number of non-expert labelers
or even outsourced data labeling and preparation services. This trend has
already manifested itself in the private sector, where firms like Facebook
have turned to outsourced content moderators,41 as well as in initial mili-
tary AI efforts.42 Similar trends in the military context could make high confidence guarantees on data-access restrictions and oversight of proper data handling, labeling, and preparation difficult to achieve. These types of procedural oversight concerns are not new, and best practices have been established in other fields such as nuclear security. However, because of its infancy, the AI field lacks such best practices. Forming these best practices will
require new policies managing data acquisition and preparation.
Beyond the threats posed by sharing datasets, the military may also seek
to re-use and share models and the tools used to create them. Because the
military is a, if not the, prime target for cyber theft, the models and tools
themselves will also become targets for adversaries to steal through hack-
ing or counterintelligence operations. History has shown that computer
systems are an eternally vulnerable channel that can be reliably counted on
as an attack avenue by adversaries. By obtaining the models stored and run
41 Lagorio-Chafkin, Christine, “Facebook’s 7,500 Moderators Protect You From the Internet’s Most
Horrifying Content. But Who’s Protecting Them?”, Inc., 26 September 2018, https://www.inc.com/
christine-lagorio/facebook-content-moderator-lawsuit.html.
42 Fang, Lee, “Google Hired Gig Economy Workers to Improve Artificial Intelligence in Controversial
Drone-targeting Project”, The Intercept, 4 February 2019, https://theintercept.com/2019/02/04/
google-ai-project-maven-figure-eight/.
on these systems, adversaries can back-solve for the attack patterns that
could fool the systems.
Finally, the military faces the challenge that AI attacks will be difficult,
if not impossible, to detect in battle conditions. This is because a hack
of these systems to obtain information to formulate an attack would not
by itself necessarily trigger a notification, especially in the case where an
attacker is only interested in reconnaissance aimed at learning the datasets
or types of tools being used. Further, once adversaries develop an attack,
they may exercise extreme caution in their application of it in order to not
arouse suspicion and to avoid letting their opponent know that its systems
have been compromised. Accordingly, attacks may be limited only to
situations of extreme importance. In this respect, there may be no count-
er-indications to system performance until after the most serious breach
occurs. This is also a problem inherent in traditional cyberattacks.
Beyond these defensive concerns, the military may also choose to invest
in offensive AI attack capabilities. This topic of offensive weaponization is
discussed in detail in Part III.
Law Enforcement
The applications of AI for law enforcement are both already deployed and
being actively researched. Amazon has recently launched a facial recog-
nition system44 that is being piloted by police departments in the US.45
The system seeks to match target facial images against a large database of
criminal mugshots. The NIJ supports research in video and image analysis,
detecting characteristics of firearm discharges (number of guns present,
assignment of a gunshot to a particular gun, and classification of firearm
class and caliber), face detection, and other applications.46
Beyond just its use in keeping pace with expanding amounts of content, AI
can be used to provide more effective policing and crime prevention by
detecting criminal warning signs earlier and apprehending suspects faster.
Further, these attacks are not limited to visual surveillance systems. The NIJ’s
funded research into classifying firearm class and caliber from audio signals also
presents a target. New classes of hardware accessories such as “smart silencers”
may be developed that execute AI attacks to deceive these systems, for example
by making the systems think that the gunshot came from a different gun. As the
AI technology evolves, criminal strategy will do so in turn.
Although law enforcement and the military share many similar AI applications, the law enforcement community faces its own unique set of challenges in securing against AI attacks. First, law enforcement AI systems will largely be off-the-shelf purchases from different private companies. Unlike the military, most law enforcement organizations are small
and lack the resources needed to scope, let alone build, these
AI systems, and will therefore likely rely on a patchwork of different private
providers. This is reason to worry. Private companies have already shown
an ineptitude to properly address known and easily addressed security
49 Sharif, Mahmood, et al. “Accessorize to a crime: Real and stealthy attacks on state-of-the-art face
recognition.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications
Security. ACM, 2016.
50 See, e.g., https://www.clearme.com
vulnerabilities, let alone an emerging and difficult vulnerability such as AI
attacks. It would be unwise to assume that the private companies are taking,
or are even capable of taking, the necessary steps to mitigate AI security vul-
nerabilities. Further, each law enforcement organization alone will probably
not have enough market power to demand stringent security protections,
while the military does.
Together, these challenges are especially worrisome given the current
climate in which police departments are on the front lines of fighting
terrorism. A technological system that is fragmented and not properly handled
may disadvantage police forces in the face of advanced adversaries.
This situation may call for additional coordination from sources such as
DHS to unify purchasing and security standards.
51 “Cybersecurity Guide for State and Local Law Enforcement”, National Consortium for
Advanced Policing, June 2016, https://cchs.gwu.edu/sites/g/files/zaxdzs2371/f/downloads/
NCAPCybersecurityGuide-2016.pdf
Commercial Artificial Intelligence-fication of Human Tasks
The costs of failure of AI systems in this domain have already been experienced.
An Uber self-driving car struck and killed a pedestrian in Tempe,
Arizona when the on-board AI system failed to detect a human in the
road.52 While it is unclear whether the particular pattern of this pedestrian is
what caused the failure, the failure manifested itself in exactly the manner an
AI attack on the system would. This real-world example is a terrifying
harbinger of how readily adversaries who are deliberately searching for attack
patterns may find success.
52 Said, Carolyn, “Video shows Uber robot car in fatal accident did not try to avoid woman”, SFGate,
21 March 2018, https://www.sfgate.com/business/article/Uber-video-shows-robot-car-in-fatal-
accident-did-12771938.php
progress. In one scenario, individual companies will each build their own
proprietary AI systems. Because each company is building its own system,
industries cannot pool resources to invest in preventative measures and
shared expertise. However, this diversification limits the extent to which an
attack on one AI system can be applied broadly to many other systems.
Further, by not pooling dataset resources, a dataset breach will have limited
consequences.
Different industries will likely fall into one of these scenarios, if not
a hybrid of both. This dichotomy is already seen in the market today.
Autonomous vehicle companies are largely operating under the first “every
firm on its own” scenario. At the same time, Artificial Intelligence as a
Service, a key component of the second “shared monoculture” scenario,
is also becoming more common. As such, policymakers must be ready to
address both scenarios, as each will require different interventions.
Civil Society
Just as not all uses of AI are “good,” not all AI attacks are “bad.” While AI
in a Western context is largely viewed as a positive force in society, in many
other contexts it is employed to more nefarious ends. Countries like China
and other oppressive regimes use AI as a way to track, control, and intimidate
their citizens. As a result, “attacks” on these systems, from a US-based
policy view of promoting human rights and free expression, would not
be “attacks” in the negative sense of the word. Instead, these AI “attacks”
would become a source of protection capable of promoting safety and freedom
in the face of oppressive AI systems instituted by the state.
This “dual use” nature is not unique to AI attacks, but is shared with many
other cyber “attacks.” For example, the identical encryption method can be
used by dissidents living under an oppressive regime to protect their
communications as easily as it can be by terrorists planning an attack.
53 Hughes, Roland, “China Uighurs: All you need to know on Muslim ‘crackdown’”, BBC News, 8
November 2018, https://www.bbc.com/news/world-asia-china-45474279
54 Sharif, Mahmood, et al. “Adversarial generative nets: Neural network attacks on state-of-the-art
face recognition.” arXiv preprint arXiv:1801.00349 (2017).
In this respect, AI “attacks” may take on a role similar to that of Tor, VPNs,
and other technologies used to evade government oppression. Just as this
report advocates for appropriate agencies to educate their constituents
about the risks posed by AI attacks, it likewise advocates for human
rights organizations to educate their constituents about the benefits available
through AI “attacks.”
This dual use will create difficult policy decisions as potential protections
against AI attacks are developed. Specifically, if protections against AI
attacks are developed, should they be made public? If sharing this protec-
tion with U.S. institutions and companies would stop dangerous attacks on
them, the answer would be “yes.” But if oppressed people around the world
came to rely on AI “attacks” to protect themselves from their government,
and sharing this protection would again give their oppressive regimes the
upper hand, many may argue that the answer would be “no.” (Beyond the
impact on civil society, the answer may also be “no” if it was known that
the disclosure would improve an adversary’s defenses against AI attack.)
Part III: Significance
within the Cybersecurity
Landscape
Despite this fundamental difference, the two are linked in important ways.
Many AI attacks are aided by gaining access to assets such as datasets or
model details. In many scenarios, doing so will utilize traditional cyberattacks
that compromise the confidentiality and integrity of systems, a
subject well studied within the cybersecurity CIA triad. Traditional confidentiality
attacks will enable adversaries to obtain the assets needed to
engineer input attacks. Traditional integrity attacks will enable adversaries
to make the changes to a dataset or model needed to execute a poisoning
attack. As a result, traditional cybersecurity policies and defenses can be
applied to protect against some AI attacks. While AI attacks can certainly
be crafted without accompanying cyberattacks, strong traditional cyber
defenses will increase the difficulty of crafting certain attacks.
Given the current attention cybersecurity problems are receiving from the
public and the government, the climate is right for taking proactive measures
that allow for the beneficial use of AI while mitigating the associated
attack threat, before these algorithms spread further into safety- and
security-critical infrastructure and applications.
55 Greenberg, Andy, “The Untold Story of NotPetya, the Most Devastating Cyberattack in History”,
Wired, 22 August 2018, https://www.wired.com/story/notpetya-cyberattack-ukraine-russia-code-
crashed-the-world/.
Offensive Weaponization
“The focus of China’s and other countries’ investments in AI is based on an
attempt to offset traditional U.S. battlefield superiority.”
Any cyber vulnerability can be turned into a cyber weapon. The same holds
true for AI attacks, especially in the military and intelligence contexts.
The potential promise of this is based on the belief that other countries may
begin to integrate AI and machine learning into military decision making
pipelines and automated weapons.56 China and other potential adversaries
are investing heavily in AI and machine learning. Many believe that these
abilities will be integrated into their armed forces.57 Lieutenant General
John Shanahan, director of the Joint Artificial Intelligence Center, believes
that machine learning/artificial intelligence capabilities of potential foes
will be so far developed in future wars that U.S. use of the same technolo-
gies “... is not a case where we’re going to offset somebody. We will however,
be offset if we do not do it [develop these capabilities].”58
56 Harvard Kennedy School Institute of Politics John F. Kennedy Jr. Forum “Interview with
Eric Rosenbach and Jason Matheny: The Public Policy Challenges of Artificial Intelligence”,
15 February 2018, https://www.belfercenter.org/event/public-policy-challenges-artificial-
intelligence#transcript
57 Upchurch, Tom, “How China Could Beat the West in the Deadly Race for AI Weapons”, Wired, 8
August 2018, https://www.wired.co.uk/article/artificial-intelligence-weapons-warfare-project-
maven-google-china.
58 “Interview with Lieutenant General Jack Shanahan: Part 1”, Over the Horizon Multi-Domain
Operations and Strategy, 2 April 2018, https://othjournal.com/2018/04/02/interview-with-
lieutenant-general-jack-shanahan-part-1/
59 Talmadge, Caitlin, “Beijing’s Nuclear Option: Why a U.S.-Chinese War Could Spiral Out of Control”,
Foreign Affairs Vol. 97 Num. 6, November/December 2018.
to counter this new AI-based strategy. One key component of this strategy
should be offensive AI attacks that degrade the performance of enemy
automated systems. In this respect, AI attacks would be a modern-day
version of radar jamming.
60 Burgess, Matt, “Everything you need to know about EternalBlue—the NSA exploit linked to Petya”,
Wired, 28 June 2017, https://www.wired.co.uk/article/what-is-eternal-blue-exploit-vulnerability-
patch
if the host country or its allies are utilizing a similar system vulnerable to
the same attack.
Considerations of Practicality
This was due to two enabling factors, both of which can be applied to gain
insight into the practicality of AI attacks. First, even though the underlying
technology behind Deepfakes was sophisticated, it was possible to
create tools that simplified the application of the method. In the case of
Deepfakes, an app was created that abstracted away all of the technical
details, essentially distilling the application of a complicated algorithm to a
drag-and-drop operation and a single click of a button.62 This allowed
non-technical actors to harness the power of the algorithm easily. This is not
the first time this pattern has played out in the cyber domain: a similar set of
tools has also proliferated in the traditional cybersecurity domain, allowing
non-technical actors to participate in campaigns such as Distributed
Denial of Service (DDoS) attacks.63
Further, the fact that technological ecosystems have not adapted to prevent
these attacks will amplify the success of these tools. For example, because
many AI systems have web-based APIs, apps could easily
be developed to interface directly with the APIs to generate attacks on
demand. To attack an image content filter with a web-based API, attackers
would simply supply an image to the app, which would then generate a
version of the image able to trick the content filter but remain indistin-
guishable from the original to the human eye.
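To make this risk concrete, the sketch below shows how such an attack app might drive a web-based filter API in a simple black-box search. It is a minimal illustration only: the endpoint URL, the response schema, and the random-perturbation strategy are assumptions made for the example, not details of any real service, and actual attacks use far more efficient search methods.

```python
# Minimal sketch of abusing a web-based content-filter API to search for an
# evasive input. The endpoint URL and response format are hypothetical.
import numpy as np
import requests

API_URL = "https://example.com/filter/classify"  # hypothetical endpoint


def is_flagged(image: np.ndarray) -> bool:
    """Query the (hypothetical) filter and return True if the image is blocked."""
    resp = requests.post(API_URL, files={"image": image.astype(np.uint8).tobytes()})
    return resp.json()["flagged"]  # assumed response schema


def random_evasion_search(image: np.ndarray, budget: int = 200, eps: float = 8.0):
    """Try small random perturbations until the filter no longer flags the image.

    The perturbation is bounded by `eps` per pixel so the result stays visually
    close to the original. Returns the evasive image, or None if none is found.
    """
    for _ in range(budget):
        noise = np.random.uniform(-eps, eps, size=image.shape)
        candidate = np.clip(image + noise, 0, 255)
        if not is_flagged(candidate):
            return candidate
    return None
```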
Part IV: “AI Security
Compliance” as a Policy
Solution for AI Attacks
This report proposes the creation of “AI Security Compliance” programs
as a main public policy mechanism to protect against AI attacks. The goals
of these compliance programs are to 1) reduce the risk of attacks on AI
systems, and 2) mitigate the impact of successful attacks.
This section sets forth a general AI security compliance program that can
be the basis of compliance programs adopted by industry and regulators.
Industries and sectors adopting this type of compliance program can cus-
tomize the components to fit their needs. The following section describes
implementation and enforcement details.
66 See https://www.pcisecuritystandards.org/
Planning Stage Compliance
Requirements
AI Suitability Tests
Conduct “AI Suitability Tests” that assess the risks of current and future
applications of AI. These tests should result in a decision as to the acceptable
level of AI use within a given application. These tests should weigh the
application’s vulnerability to attack, the consequence of an attack, and the
availability of alternative non-AI-based methods that can be used in place of
AI systems.
• Ease of Attack: How easy will it be for an adversary to execute an
attack on the AI system?
We now discuss each component briefly. The value of the AI system should
be examined in light of the economic and societal benefit the system is
expected to deliver. This will by nature be a subjective measure, but entities
deciding to adopt AI should be able to justify the value they believe it will
deliver in the event of an audit or external review.
enforcement, academics, and think tanks in order to understand what
damage may be incurred from a successful attack against an AI system.
Once each of these questions has been sufficiently answered, the answers should
be weighed to arrive at a determination of how much risk the system poses,
and this determination should be used to make an implementation decision. Just
as they may have chosen to do in answering the questions, stakeholders may again
wish to consult with law enforcement, academics, think tanks, and other
outside entities in arriving at a decision. Entities may wish to look to the
National Highway Traffic Safety Administration’s cost analysis methodology
for inspiration in reaching an implementation decision.67
67 Soodoo, George, “A Primer on the NHTSA Rulemaking Process”, Eno Center for Transportation, 13
March 2017, https://www.enotrans.org/article/primer-nhtsa-rulemaking-process/
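As a minimal illustration of how an entity might document such a weighing exercise, the sketch below records the test factors and combines them into a single score. The factor names, scales, and weights are hypothetical examples chosen for the illustration; they are not values prescribed by this report or by any regulator.

```python
# Illustrative sketch of recording AI suitability test factors and combining
# them into one risk score. Factors, scales, and weights are hypothetical.
from dataclasses import dataclass


@dataclass
class SuitabilityAssessment:
    value_of_system: int         # 1 (low expected benefit) .. 5 (high benefit)
    ease_of_attack: int          # 1 (hard to attack) .. 5 (easy to attack)
    damage_from_attack: int      # 1 (minor damage) .. 5 (severe damage)
    alternatives_available: int  # 1 (good non-AI alternatives) .. 5 (none)

    def risk_score(self) -> float:
        """Higher scores argue for limiting AI autonomy or adding oversight."""
        return (0.4 * self.ease_of_attack
                + 0.4 * self.damage_from_attack
                # Readily available non-AI alternatives strengthen the case
                # against full AI reliance, so invert this factor's scale.
                + 0.2 * (6 - self.alternatives_available))


# Example: a content filter that is easy to attack but causes limited damage.
assessment = SuitabilityAssessment(value_of_system=4, ease_of_attack=4,
                                   damage_from_attack=2, alternatives_available=5)
print(round(assessment.risk_score(), 2))
```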
of attack damage, an attack will at worst render the content filters inef-
fective, an outcome no worse than not deploying them in the first place.
In terms of availability of other options, AI-based filtering is perhaps the
only technique that is capable of operating at a sufficient scale given the
large amount of content added to social networks daily. As a result, this
application would still be well suited for AI, given a lack of alternatives
and low collateral damage from an attack. However, even though AI may
still be appropriate in this case, this does not absolve the social network from
undertaking both preventative and mitigative efforts to counter attacks. For example,
the social network may need to determine human involvement in and
oversight of the system, such as by executing periodic manual audits of
content to identify when its systems have been attacked, and then taking
appropriate action such as increased human review of material policed by
the compromised system.
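A minimal sketch of what such a periodic audit loop could look like appears below. The sample size, the five-percent alarm threshold, and the function names are illustrative assumptions, not parameters taken from any deployed system.

```python
# Minimal sketch of a periodic manual audit: sample recent content-filter
# decisions, compare them with human reviewer labels, and flag the system
# when disagreement exceeds a baseline rate.
import random


def audit_content_filter(recent_decisions, human_review, sample_size=200,
                         alarm_threshold=0.05):
    """recent_decisions: list of (content_id, model_label) tuples.
    human_review: callable returning the human label for a content_id.
    Returns (disagreement_rate, alarm_raised)."""
    if not recent_decisions:
        return 0.0, False
    sample = random.sample(recent_decisions, min(sample_size, len(recent_decisions)))
    disagreements = sum(1 for content_id, model_label in sample
                        if human_review(content_id) != model_label)
    rate = disagreements / len(sample)
    # A sustained jump in disagreement may indicate the filter is being evaded.
    return rate, rate > alarm_threshold
```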
This example also demonstrates that the outcomes of these AI suitability tests need
not be binary. They can, for example, suggest a target level of AI reliance on
the spectrum between full autonomy and full human control. This can allow
for technological development while not leaving an application vulnerable to a
potentially compromised monoculture. The DoD has been vocal about adopting
this strategy in its development of AI-enabled systems, albeit for additional
reasons. In this middle-lane strategy, AI-enabled systems can be used to augment
human-controlled processes, but not to fully replace human operators.
Through this middle lane, a successful attack would not have its full intended
effect. Stakeholders may look to the self-driving vehicle industry for inspiration
in categorizing human involvement in AI systems; it formalizes this
classification by categorizing autonomous vehicles from Level 1 (no AI use)
to Level 5 (full AI use).
questions that make up the tests as well as in forming a final implementation
decision.
Beyond this supportive role, regulators should affirm that they will use an
entity’s effort in executing a suitability test
in deciding culpability and responsibility if attacks do occur. As is the case
with other compliance efforts, a company that demonstrates that it made
a good faith effort to reach an informed decision via a suitability test may
face more lenient consequences from regulators in the case of attacks than
those that disregarded the tests.
Review and update data collection and sharing practices to protect against
data being weaponized against AI systems. This includes formal validation of
data collection practices and restricting data sharing.
AI users must review and secure their data collection and sharing poli-
cies. These reviews should be formal, identify emerging ways data can be
weaponized against systems, and be used to shape data collection and use
practices. The outcome of these reviews should be written policies govern-
ing how any data used in building an AI system is collected and shared.
These reviews are needed because data may emerge as a potent weapon in
the age of AI attacks, and steps must be taken to have stakeholders realize
the dangers data can now pose. This is especially important because this
new danger is in stark contrast with data’s current reputation in society:
data is currently regarded pervasively as “digital gold” within the private
sector, government, and military. However, because AI is almost wholly
dependent on data, data is a direct avenue through which to conduct AI
attacks. In this respect, just as Rome’s powerful roads were turned against
them by their enemies, AI attacks and other forms of information warfare
may similarly turn data from the panacea it is hailed as today into a vulner-
ability in an AI-dominated society.
AI users must validate their data collection practices to account for risks
that manipulated, inaccurate, or incomplete datasets pose to AI systems.
Data can be weaponized in order to execute AI attacks, specifically poi-
soning attacks. For every dataset collected, AI users should ask themselves
the following questions to identify potential weaknesses in the dataset that
could be exploited for AI attacks:
If the adversary controls the entities on which data is being collected, it
can manipulate them to influence the data collected. For example, consider
a dataset of radar signatures of an adversary’s aircraft. Because the adversary
has control over its own aircraft, it can alter them in order to alter
the data collected. Adversaries need not be aware that data is being collected
in order to manipulate the process. The mere possibility that data will be
collected may be enough to prompt this type of influence campaign.
If an adversary is aware that data is being collected, they may try to
interfere in some aspect of the collection process in order to alter the data
being collected. An analogous example from the traditional cybersecurity
domain can illustrate this. When the U.S. was aware that Russia
was stealing pipeline control software, it purposely altered the software
to introduce a flaw that would trigger a pipeline explosion.68
Analogously in the data domain, if an adversary is aware that data is
being collected to be used in an AI system, it may take additional steps
to interfere in the data collection process to corrupt the data collected.
Once they have answered these questions, AI users should evaluate what
risks exist within the dataset, and take corrective actions:
68 Russell, Alec, “CIA plot led to huge blast in Siberian gas pipeline”, The Telegraph, 28 February 2004,
https://www.telegraph.co.uk/news/worldnews/northamerica/usa/1455559/CIA-plot-led-to-huge-
blast-in-Siberian-gas-pipeline.html
• If there is a risk adversaries may have been able to manipulate the
data itself, additional steps should be taken to validate the data and
remove data that is suspect (a minimal sketch of one such check follows below).
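The sketch below illustrates one simple form this validation could take for a tabular dataset: flagging records whose feature values are extreme outliers before training. The z-score test and its threshold are illustrative assumptions; a careful poisoning attack can evade such a coarse check, so this represents a floor for validation, not a complete defense.

```python
# Minimal sketch of screening a tabular dataset for records that deviate
# sharply from the rest before training. Thresholds are illustrative.
import numpy as np


def flag_suspect_rows(features: np.ndarray, z_threshold: float = 4.0) -> np.ndarray:
    """Return a boolean mask of rows containing extreme-outlier feature values."""
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-9          # avoid division by zero
    z_scores = np.abs((features - mean) / std)
    return (z_scores > z_threshold).any(axis=1)


data = np.random.normal(size=(1000, 8))
data[0] = 50.0                                  # an implausible, possibly injected record
mask = flag_suspect_rows(data)
clean_data = data[~mask]                        # quarantine flagged rows for human review
```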
Critical AI systems must restrict how and when the data used to build
them is shared in order to make AI attacks more difficult to execute.
For critical applications, as a rule, data should not be shared by default.
Exceptions should be well reasoned. The resulting data sharing policies
should be explicitly written and followed.
69 National Science and Technology Council, Networking and Information Technology Research and
Development Subcommittee, “The National Artificial Intelligence Research and Development
Strategic Plan”, October 2016, https://www.nitrd.gov/PUBS/national_ai_rd_strategic_plan.pdf
70 Statement by Dana Deasy, Department of Defense Chief Information Office, Before the House
Armed Services Committee Subcommittee on Emerging Threats and Capabilities on “Department
of Defense’s Artificial Intelligence Structure, Investments, and Applications”, 26 February 2019,
https://armedservices.house.gov/_cache/files/5/7/579723e2-4461-4a8c-95da-ec3e84c4985e/
E41B38FCB69AD83331F31CDC06570D33.hhrg-116-as26-wstate-deasyd-20190226.pdf.
71 “Interview with Lieutenant General Jack Shanahan: Part 1”, Over the Horizon Multi-Domain
Operations and Strategy, 2 April 2018, https://othjournal.com/2018/04/02/interview-with-
lieutenant-general-jack-shanahan-part-1/
shared widely, there is a larger risk that it will be stolen or accidentally copied
onto insecure systems.
As such, when writing data sharing policies, AI users must challenge these
established norms, consider the risks posed by data sharing, and shape data
sharing policies accordingly. Without this, constituent parties may not realize
the strategic importance data provides to attackers, and therefore may not
take the steps necessary to protect it in the absence of explicit policy.
Implementation Stage
Compliance Requirements
Protect the assets that can be used to craft AI attacks, such as datasets and
models, and improve the cybersecurity of the systems on which these assets
are stored.
These best practices should be formulated with joint input from security
experts and domain experts for each application, and are likely to include
changes such as only transmitting data over classified or encrypted
networks, encrypting stored data to protect it even if the system is
compromised, and keeping system details, such as tools and model hyper-
parameters, secret.
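As one concrete example of such a practice, the sketch below records cryptographic hashes of dataset and model files and re-checks them later so that a silently swapped or corrupted file can be detected. The file names are placeholders, and the manifest format is an assumption made for the illustration.

```python
# Minimal sketch of recording and verifying integrity hashes for critical AI
# assets (datasets, serialized models). File paths are placeholders.
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, streaming it in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def record_manifest(paths, manifest_file="asset_manifest.json"):
    """Write a manifest mapping each asset path to its current digest."""
    manifest = {str(p): sha256_of(Path(p)) for p in paths}
    Path(manifest_file).write_text(json.dumps(manifest, indent=2))


def verify_manifest(manifest_file="asset_manifest.json"):
    """Return the list of assets whose contents no longer match the manifest."""
    manifest = json.loads(Path(manifest_file).read_text())
    return [p for p, digest in manifest.items() if sha256_of(Path(p)) != digest]


# e.g., record_manifest(["train_set.csv", "model.pt"]) at build time; later,
# verify_manifest() returns any files that have been tampered with.
```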
pipeline. This has transformed a wide range of assets that span the AI
training and implementation pipelines into targets for would-be attackers.
Specifically, these assets include the datasets used to train the models, the
algorithms themselves, system and model details such as which tools are
used and the structure of the models, storage and compute resources hold-
ing these assets, and the deployed AI systems themselves.
This hardening must extend to the model itself. Even if the data is properly
secured and an uncompromised model is trained, the model itself must
then be protected. A trained model is just a digital file, no different from an
image or document on a computer. As such, like other digital assets, it can
be stolen or corrupted. If a model is stolen, crafting an attack is relatively
easy. If an uncompromised model is corrupted or replaced with a cor-
rupted one, all other protection efforts are completely moot. As such, the
model itself must be recognized as a critical asset and protected, and the
storage and computing systems on which the model is stored and executed
must similarly be treated with high levels of security.
However, recent trends in how models are used will complicate efforts to
protect them. Recently, models are no longer residing and operating
exclusively within data centers where security and control can be centralized,
but are instead being pushed directly to devices such as weapon systems and
consumer products. This change is necessary for applications in which it is
either impossible or impractical to send data from these “edge” devices to a data center
to be processed by AI models living in the cloud. For example, in the case
of weapon systems, this may be impossible because the enemy has jammed
the communication channels. In the case of consumer applications such
as autonomous cars, this may be impractical because the device will not
receive a response fast enough to meet application requirements.
Regardless of the reason for doing so, placing AI models on edge devices
makes protecting them more difficult. Because these edge devices have
a physical component (e.g., as is the case with vehicles, weapons, and
drones), they may fall into an adversary’s hands. Care must be taken that
if these systems are captured or controlled, they cannot be examined or
disassembled in order to aid in crafting an attack. In other contexts, such as
with consumer products, adversaries will physically own the device along
with the model (e.g., an adversary can buy a self-driving car in order to
acquire the model that is stored on the vehicle’s on-board computer to help
in crafting attacks against other self-driving cars). In this case, care must be
taken that adversaries cannot access or manipulate the models stored on
systems over which they otherwise have full control. Encryption will play
an important role in securing these assets.
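A minimal sketch of one piece of this, encrypting a serialized model at rest with a symmetric key, is shown below using the widely available `cryptography` package. The file paths are placeholders, and the genuinely hard part on an edge device, protecting the key itself (for example in secure hardware), is deliberately not shown.

```python
# Minimal sketch of encrypting a serialized model at rest using the
# `cryptography` package's Fernet recipe. Paths are placeholders; key
# management on the device is the hard problem and is not shown here.
from cryptography.fernet import Fernet

key = Fernet.generate_key()            # in practice, held in secure hardware
fernet = Fernet(key)

# Encrypt the serialized model before it is shipped to the edge device.
with open("model.bin", "rb") as f:     # placeholder path to a serialized model
    ciphertext = fernet.encrypt(f.read())
with open("model.bin.enc", "wb") as f:
    f.write(ciphertext)

# At inference time, the device decrypts the model into memory only.
with open("model.bin.enc", "rb") as f:
    model_bytes = fernet.decrypt(f.read())
```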
Improve Intrusion and Attack Formulation Detection
Improve intrusion detection systems to better detect when assets have been
compromised and to detect patterns of behavior indicative of an adversary
formulating an attack.
While hardening soft targets will raise the difficulty of executing attacks,
attacks will still occur and must be detected. Policymakers should encour-
age improved intrusion detection for the systems holding these critical
assets, and the design of methods profiling anomalous behavior to detect
when attacks are being formulated. While an ounce of prevention is worth
a pound of cure, it is imperative to know when prevention has failed so
that the system operator can take the necessary mitigation steps before the
adversary has time to execute an attack.
In the simplest scenarios where a central repository holds the datasets and
other important assets, the vanilla intrusion detection methods that are
currently a mainstay of cybersecurity can be applied. In this simple case,
if assets such as datasets or models are accessed by an unauthorized party,
this should be noted immediately and the proper steps should be taken in
response.
to allow customers to utilize the models. Attackers can use this window
into the system to craft attacks, replacing the need for more intrusive
actions such as stealing a dataset or recreating a model. In this setting,
it can be difficult to tell if an interaction with the system is a valid use of
the system or probing behavior being used to formulate an attack. For
example, is the case of a user sending the same image to a content-filter
one hundred times 1) a developer diligently running tests on a newly built
piece of software, or 2) an attacker trying different attack patterns to find
one that can be used to evade the system? System operators must invest
in capabilities able to alert them to behavior that seems to be indicative of
attack formulation rather than valid use.
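The sketch below shows one simple capability of this kind: rate-based flagging of clients whose query volume against a model endpoint looks more like probing than ordinary use. The window length and threshold are illustrative assumptions, and a production system would go further, for example by comparing the similarity of the submitted inputs.

```python
# Minimal sketch of rate-based detection of probing behavior: flag a client
# that hammers a model endpoint with an unusually high volume of queries
# inside a sliding time window. Thresholds are illustrative.
from collections import defaultdict, deque
from time import time

WINDOW_SECONDS = 3600          # one-hour sliding window (illustrative)
MAX_QUERIES_PER_WINDOW = 500   # illustrative threshold

recent_queries = defaultdict(deque)   # client_id -> deque of query timestamps


def record_and_check(client_id: str) -> bool:
    """Record one query and return True if the client's recent volume looks
    more like attack formulation than valid use."""
    now = time()
    history = recent_queries[client_id]
    history.append(now)
    # Drop timestamps that have aged out of the window.
    while history and now - history[0] > WINDOW_SECONDS:
        history.popleft()
    return len(history) > MAX_QUERIES_PER_WINDOW
```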
Mitigation Stage Compliance
Requirements
Determine how AI attacks are most likely to be used, and craft response plans
for these scenarios.
72 Reuters, “Facebook says it removed 1.5 million videos of the New Zealand mosque attack”, 17 March
2019, https://www.reuters.com/article/us-newzealand-shootout-facebook-video/facebook-says-it-
removed-15-million-videos-of-the-new-zealand-mosque-attack-idUSKCN1QY05X
Human-machine partnerships similar to those Facebook sometimes employs73
will need to become the norm in an era in which AI systems are vulnerable
to attack.
Response plans may also require real-world action to be taken. For exam-
ple, police response plans to input attacks on infrastructure, such as signs
and road markers, will require the immediate dispatch of officers. Just as
officers are dispatched to an intersection when a traffic light is broken, sim-
ilar responses will be needed. In this case however, the response will need
to be immediate—humans can still navigate a broken traffic light relatively
well, but a driverless car will run a now “invisible” stop sign without the
human passengers having a chance to intervene. This response plan may
also require expanded partnerships and information sharing agreements
with other entities, such as companies controlling the technology. Further,
the response plan will require training and coordination such that officers
will be equipped to recognize that seemingly harmless graffiti or vandal-
ism may actually be an attack, and then know to activate the appropriate
response plan.
Create maps showing how the compromise of one asset or system affects all
other AI systems.
73 Liptak, Andrew, “Facebook says that it removed 1.5 million videos of the New Zealand mass
shooting”, The Verge, 17 March 2019, https://www.theverge.com/2019/3/17/18269453/facebook-
new-zealand-attack-removed-1-5-million-videos-content-moderation
Given the reality of how data is shared and repurposed, shared dependen-
cies—and therefore vulnerabilities—among systems will be widespread
for better or worse. As a result, there is a need to rapidly understand how a
compromise of one asset or system affects other systems.
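A minimal sketch of such a map appears below: a directed graph from shared assets to the systems built on them, walked outward from a compromised asset to list everything affected. The asset names are hypothetical examples.

```python
# Minimal sketch of an asset-dependency map: a directed graph from shared
# assets (datasets, models) to the systems built on them, walked to find
# every system affected by a single compromise. Entries are hypothetical.
from collections import deque

dependencies = {
    "traffic_sign_dataset_v2": ["sign_classifier_model"],
    "sign_classifier_model": ["delivery_fleet_autopilot", "municipal_bus_autopilot"],
    "delivery_fleet_autopilot": [],
    "municipal_bus_autopilot": [],
}


def affected_by(compromised_asset: str) -> set:
    """Return every downstream asset or system reachable from the compromise."""
    affected, queue = set(), deque([compromised_asset])
    while queue:
        node = queue.popleft()
        for downstream in dependencies.get(node, []):
            if downstream not in affected:
                affected.add(downstream)
                queue.append(downstream)
    return affected


print(affected_by("traffic_sign_dataset_v2"))
# {'sign_classifier_model', 'delivery_fleet_autopilot', 'municipal_bus_autopilot'}
```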
Part V: Implementation
and Enforcement
Implementation
agencies already regulating an industry to manage compliance mandates
and details. In the context of self-driving cars, this may fall to DoT or one
of its sub-agencies, such as NHTSA. In the context of other consumer
applications, this may fall to other agencies such as the FTC.
Enforcement
Drawbacks
will require a trade-off against other important considerations, such as
ensuring that AI systems are fair, unbiased, and trustworthy. Many of the
methods to verify these properties rely on openly publishing datasets,
methods, models, and APIs to the systems. However, these exact actions
double as a list of worst practices in terms of protecting against AI attacks.
In already deployed systems that require both verified fairness and security,
such as AI-based bond determination,74 it will be difficult to balance both
simultaneously. New methods will be needed to allow for audits of systems
without compromising security, such as restricting audits to a trusted third
party rather than publishing openly.
74 See, e.g., the Correctional Offender Management Profiling for Alternative Sanctions tool (COMPAS)
and the use of it by various government institutions, e.g., https://doc.wi.gov/Pages/AboutDOC/
COMPAS.aspx and https://qz.com/1375820/california-just-replaced-cash-bail-with-algorithms/.
Additional Recommendations
Beyond creating programs and grants aimed solely at defense mechanisms
and creating new methods not vulnerable to these attacks, DARPA and
other funding bodies should mandate that every research project related to
AI must include a component discussing the vulnerabilities introduced by
the research. This will allow users who potentially adopt these technologies
to make informed decisions as to not just the benefits but also the risks of
using the technology.
Additional Recommendation 2: The FTC, DoD, and DOJ should alert their
relevant constituents regarding the existence of AI attacks and preventative
measures that can be taken.
intelligence-like capabilities beyond attack. This may lead to premature
replacement of humans with algorithms in domains where the threats of
attack or failure are severe yet unknown. This will hold particularly true
for applications of AI to safety and national security. Decisions in these
domains may be made for purposes of reducing operating expenditures,
increasing efficiency, or broad imperatives to adopt new technology and
“modernize.” Without a proper understanding of the threats that exist to
an AI-based system, proper cost-benefit analyses cannot be conducted, and
dangerous vulnerabilities may be overlooked that create systematic risk
within these critical domains.
Reevaluation of AI Applications
Policymakers and industry alike must study and reevaluate the planned role
of AI in many applications. While this may appear Luddite in outlook, it has
a historical basis. The U.S.’s Strategic Automated Command and Control
System, a component within the U.S. nuclear control system, still uses
technology systems from the 1970s rather than updated state-of-the-art
computers.76 This is because the presence of cybersecurity vulnerabilities
in new technologies poses too great a risk for this particular application.
76 Fung, Brian, “The Real Reason America Controls its Nukes with Ancient Floppy Disks”,
The Washington Post, 26 May 2016, https://www.washingtonpost.com/news/the-switch/
wp/2016/05/26/the-real-reason-america-controls-its-nukes-with-ancient-floppy-
disks/?noredirect=on&utm_term=.e4d0d5a41b7a
Similar discussions must occur in regard to the integration of AI into other
applications, but not necessarily with the end goal of reaching binary use/
don’t use outcomes. For some applications, the integration of AI may pose
such little risk that there is little worry. For others, AI may require human
supervision. While this supervision may not always protect against the
consequences of all AI attacks, it may reach a common ground between
full exposure to attack risk and the risk of not realizing the benefits AI can
deliver. The military is setting a good example for this intermediate use by
prioritizing the development of AI systems that augment but do not replace
human control. Finally, some applications of AI may prove too dangerous
to use. Autonomous weapon systems, even those that do not utilize AI,
already carry great stigma due to a fear that attack or algorithmic mistakes
will cause unacceptable collateral damage, and therefore present unaccept-
able levels of risk. This same attitude may be adopted in other applications
reliant on AI.
In some contexts, these discussions can be internally led. The DoD, for
example, has already shown attention to understanding and addressing
the security risks of employing AI. However, in other contexts, such as in
industry settings where parties have shown a disregard for and an inability to
address other cyber risks, these discussions may need to be forced by an
outside regulatory body such as the FTC.
Conclusion
“Knowledge is knowing that Frankenstein is not the monster.
Wisdom is knowing that Frankenstein is the monster.”77
77 Anonymous quote.
technological equals or even superiors must develop and protect against
this new weapon. Law enforcement, an industry that has perhaps fallen
victim to technological upheaval like no other, risks having its modernization
efforts undermined by the very technology to which it is looking to solve its
problems. Commercial applications that are using AI to replace humans,
such as self-driving cars and the Internet of Things, are putting vulnerable
artificial intelligence technology onto our streets and into our homes.
Segments of civil society are being monitored and oppressed with AI, and
therefore have a vested interest in using AI attacks to fight against the sys-
tems being used against them.
The world has learned a number of painful lessons from the unencumbered
and reckless enthusiasm with which technologies with serious
vulnerabilities have been deployed. Social networks have been named as an
aid to genocide in Myanmar and the instrument of democratic disruption
in the world’s foremost democracy. Connected infrastructure has led to
attacks with hundreds of millions of dollars of economic loss. The warning
signs of AI attacks may be written in bytes, but we can see them and what
they portend. We would be wise to not ignore them.
Belfer Center for Science and International Affairs
Harvard Kennedy School
79 John F. Kennedy Street
Cambridge, MA 02138
www.belfercenter.org