Poisoning Network Flow Classifiers
Giorgio Severi, Simona Boboila, Alina Oprea, John Holodnak, Kendra Kratkiewicz, and Jason Matterer
∗ Work done while staff member at MIT Lincoln Laboratory.

ABSTRACT
As machine learning (ML) classifiers increasingly oversee the automated monitoring of network traffic, studying their resilience against adversarial attacks becomes critical. This paper focuses on poisoning attacks, specifically backdoor attacks, against network traffic flow classifiers. We investigate the challenging scenario of clean-label poisoning where the adversary's capabilities are constrained to tampering only with the training data — without the ability to arbitrarily modify the training labels or any other component of the training process. We describe a trigger crafting strategy that leverages model interpretability techniques to generate trigger patterns that are effective even at very low poisoning rates. Finally, we design novel strategies to generate stealthy triggers, including an approach based on generative Bayesian network models, with the goal of minimizing the conspicuousness of the trigger, and thus making detection of an ongoing poisoning campaign more challenging. Our findings provide significant insights into the feasibility of poisoning attacks on network traffic classifiers used in multiple scenarios, including detecting malicious communication and application classification.

ACM Reference Format:
Giorgio Severi, Simona Boboila, Alina Oprea, John Holodnak, Kendra Kratkiewicz, and Jason Matterer. 2023. Poisoning Network Flow Classifiers. In Annual Computer Security Applications Conference (ACSAC '23), December 04–08, 2023, Austin, TX, USA. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3627106.3627123

1 INTRODUCTION
Automated monitoring of network traffic plays a critical role in the security posture of many companies and institutions. The large volumes of data involved, and the necessity for rapid decision-making, have led to solutions that increasingly rely on machine learning (ML) classifiers to provide timely warnings of potentially malicious behaviors on the network. Given the relevance of this task, undiminished despite being studied for quite a long time [54], a number of machine learning based systems have been proposed in recent years [29, 52, 60, 61, 88] to classify network traffic.
The same conditions that spurred the development of new automated network traffic analysis systems have also led researchers to develop adversarial machine learning attacks against them, targeting both deployed models [5, 9, 13, 25, 64] (evasion attacks) and, albeit to a lesser extent, their training process [4, 30, 41, 58] (poisoning attacks). We believe this second category is particularly interesting, both from an academic perspective as well as a practical one. Recent research on perceived security risks of companies deploying machine learning models repeatedly highlighted poisoning attacks as a critical threat to operational ML systems [22, 78]. Yet, much of the prior research on poisoning attacks in this domain tends to adopt threat models primarily formulated in the sphere of image classification, such as assuming that the victim would accept a pre-trained model from a third party [58], thus allowing adversarial control over the entire training phase, or granting the adversary the ability to tamper with the training labels [4]. As awareness of poisoning attacks permeates more extensively, it is reasonable to assume that companies developing these types of systems will exhibit an increased wariness to trust third parties providing pre-trained classifiers, and will likely spend resources and effort to control or vet both code and infrastructure used during training. For this reason, we believe it is particularly interesting to focus on the less studied scenario of an adversary who is restricted to tampering only with the training data (data-only attack) by disseminating a small quantity of maliciously crafted points, and without the ability to modify the labels assigned to training data (clean-label) or any other component of the training process.
Our aim is to investigate the feasibility and effects of poisoning attacks on network traffic flow classifiers, and in particular backdoor attacks — where an association is induced between a trigger pattern and an adversarially chosen output of the model. Our approach focuses on the manipulation of aggregated traffic flow features rather than packet-level content, as they are common in traffic classification applications [53, 61, 88]. We will focus on systems that compute aggregated features starting from the outputs of the network monitoring tool Zeek, because of its large user base. It is important to note that, despite the perceived relevance of poisoning attacks, it is often remarkably difficult for an adversary to successfully run a poisoning campaign against classifiers operating on constraint-heavy tabular data, such as cybersecurity data — like network flows or malware samples [73]. This is a well known issue in adversarial ML, illustrated in detail by [68] and often referred to as problem-space mapping. It stems from the complexity of crafting perturbations of the data points (in feature space) that induce the desired behavior in the victim model without damaging the structure of the underlying data object (problem space) necessary for it to be generated, parsed, or executed correctly. When dealing
with aggregated network flow data, these difficulties compound with the inherent complexity of handling multivariate tabular data consisting of heterogeneous fields. To address these challenges, we design a novel methodology based on ML explanation methods to determine important features for backdoor creation, and map them back into the problem space. Our methods handle complex dependencies in feature space, generalize to different models and feature representations, are effective at low poisoning rates (as low as 0.1%), and generate stealthy poisoning attacks.
In summary, we make the following contributions: (i) We develop a new strategy to craft clean-label, data-only, backdoor poisoning attacks against network traffic classifiers that are effective at low poisoning rates. (ii) We show that our poisoning attacks work across different model types, classification tasks, and feature representations, and we comprehensively evaluate the techniques on several network traffic datasets used for malware detection and application classification. (iii) We propose different strategies, including generative approaches based on Bayesian networks, to make the attacks inconspicuous and blend the poisoned data with the underlying training set. To ensure reproducibility, we evaluate our techniques on publicly available datasets, and release all the code used to run the experiments in the paper.¹

¹ https://github.com/ClonedOne/poisoning_network_flow_classifiers

2 BACKGROUND AND RELATED WORK
Machine Learning for Threat Detection. Machine learning methods have been successfully used to detect several cyber security threats, including: malicious domains [2, 3, 59, 62, 69], command-and-control communication between attackers and compromised hosts [55, 62], or malicious binaries used by adversaries for distributing exploit code and botnet commands [33, 79]. Several endpoint protection products [31, 32, 50, 51] are now integrating ML tools to proactively detect the rapidly increasing number of threats.

Adversarial Machine Learning. We can identify two major categories of integrity attacks against ML classifiers: (1) evasion attacks, which occur at test time and consist in applying an imperceptible perturbation to test samples in order to have them misclassified, and (2) poisoning attacks, which influence the training process (either through tampering with the training dataset or by modifying other components of the training procedure) to induce wrong predictions during inference. For details on other adversarial ML techniques, we direct the reader to the standardized taxonomy presented in [63].
In this study, we are focusing on backdoor poisoning attacks, a particularly insidious technique in which the attacker forces the learner to associate a specific pattern to a desired target objective — usually the benign class in cybersecurity applications. While backdoor poisoning does not impact the model's performance on typical test data, it leads to misclassification of test samples that present the adversarial pattern. ML poisoning has become a top concern in industry [78]. In contrast to evasion attacks which need to generate per-sample perturbations, backdoor triggers, once learned, are powerful and universal as they can be applied to any samples during inference to alter their prediction.
Backdoor poisoning attacks against modern ML models were introduced by Gu et al. [23] in BadNets, where a small patch of bright pixels (the trigger pattern) was added to a subset of images at training time together with an altered label, to induce the prediction of a target class. Subsequently, Turner et al. [82] and Shafahi et al. [74] devised clean-label backdoor attacks which require more poisoning data samples to be effective, but relax some strong assumptions of previous threat models, making them significantly more applicable in security scenarios.
In cybersecurity, the earliest poisoning attacks were designed against worm signature generation [57, 66] and spam detectors [56]. More recently, a few studies have looked at packet-level poisoning via padding [30, 58], feature-space poisoning in intrusion detection [4, 42], and label flipping attacks for IoT [65]. Severi et al. [73] proposed to use model interpretation techniques to generate clean-label poisoning attacks against malware classifiers. Their strategies are applicable to security datasets whose records are independent, such as individual files or Android applications, which present a direct mapping from feature space to problem space. In contrast, our study explores attacks trained on network traffic, where multiple sequential connections are translated into one single feature-space data point; in this setting, inverting triggers from feature to problem space becomes particularly difficult due to data dependencies.

Model Interpretation Techniques. With the proliferation and increase in complexity of ML models, the field of explainable machine learning, focused on understanding and interpreting model predictions, has seen a substantial increase in popularity over recent years. We are particularly interested in model-agnostic interpretability techniques, which can be applied to any model. Linardatos et al. [44] provide a comprehensive taxonomy of these methods, and conclude that, among the black-box techniques presented, Shapley Additive explanations (SHAP) [48, 49] is the most complete, providing explanations for any model and data type both at a global and local scale. SHAP is a game-theory inspired method, which attempts to quantify how important each feature is for a classifier's predictions. SHAP improves on other model interpretation techniques like LIME [72], DeepLIFT [77] and Layer-Wise Relevance Propagation [6], by introducing a unified measure of feature importance that is able to differentiate better among output classes.
In this study, we also experiment with Gini index [21] and information gain [38, 40] – two of the most popular splitting algorithms in decision trees. A decision tree is built recursively, by choosing at each step the feature that provides the best split. Thus, the tree offers a natural interpretability, and a straightforward way to compute the importance of each feature towards the model's predictions.

Preserving Domain Constraints. Functionality-preserving attacks on network traffic have mostly looked at evasion during test time, rather than poisoning. For instance, Wu et al. [84] proposed a packet-level evasion attack against botnet detection, using reinforcement learning to guide updates to adversarial samples in a way that maintains the original functionality. Sheatsley et al. [76] study the challenges associated with the generation of valid adversarial examples that abide by domain constraints and develop techniques to learn these constraints from data. Chernikova et al. [13] design evasion attacks against neural networks in constrained environments, using an iterative optimization method based on gradient descent to ensure valid numerical domain values. With our constraint-aware
problem-space mapping, which also takes into account dependencies in network traffic, we delve one step further into the challenging issue of designing functionality-preserving attacks.
Significant advances have been made recently with respect to generating multivariate data. Modern tabular data synthesizers of mixed data types leverage the power of generative adversarial networks [11, 18, 19, 86, 91] and diffusion models [39] to create realistic content from the same distribution as the original data. Among the different frameworks, FakeTables [11] is the only attempt at preserving functional dependencies in relational tables. However, its evaluation is limited to Census and Air Carrier Statistics datasets, and its ability to capture more complex relationships between variables is unclear.
In this work, we model conditional dependencies in the traffic using Bayesian networks – a common choice for generating synthetic relational tables [15, 27, 36, 70, 90]. Bayesian networks offer increased transparency and computational efficiency over more complex generative models like generative adversarial networks [36]. We believe this is an important advantage in our setting, which deals with large volumes of network traffic featuring multiple variables (e.g., log fields). In cybersecurity, Bayesian networks have also been used to learn traffic patterns and flag potentially malicious events in intrusion detection systems [16, 34, 83, 85].

3 THREAT MODEL
Adversary's Capabilities. Recent work analysing the training time robustness of malware classifiers [73, 89] pointed out that the use of ever larger quantities of data to train effective security classifiers inherently opens up the doors to data-only poisoning attacks, especially in their more stealthy clean-label [74, 81] variants where the adversary does not control the label of the poisoned samples. Thus, in this work, we constrain the adversary to clean-label, data-only attacks. This type of setup moves beyond the classic threat model proposed by Gu et al. [23] and adopted by other research [12, 47, 58], where the adversary was able to tamper with not only the content of the training points but also the corresponding ground-truth labels.
Network traffic labeling is often done using antivirus tools or external threat services (e.g., Intrusion Detection Systems, VirusTotal², etc.) [24, 62], making label manipulation hard for an adversary. Hence, clean-label poisoning is a more realistic threat model, where access to even a single compromised host is enough to carry out the attack by injecting the backdoor into the benign traffic, without tampering with the data collection and labeling process. By disseminating innocuous looking — but adversarially crafted — data, i.e., the backdoor, the adversary is able to tamper with a small, yet effective, percentage of the training set and induce the desired behavior in the learned model.
To design the trigger, the adversary requires access to a small amount of clean labeled data, 𝐷𝑎, from a similar distribution as the victim's training data 𝐷. 𝐷𝑎 is used for crafting the backdoor pattern and it is disjoint from the training and test datasets.
We consider an adversary who has query-only access to the machine learning classifier. This allows the attacker to use the SHAP explanation technique to compute feature importance coefficients, but it prevents any form of inspection of model weights or hidden states. This scenario is very common for deployed models, as they often undergo periodical re-training but are only accessible behind controlled APIs. Interacting with a victim system, however, always imposes a cost on the attacker, whether in terms of actual monetary expenses for API quotas, or by increasing the risk of being discovered. Motivated by this observation, we also explore the use of model interpretation methods that do not require any access to the classifier, but instead leverage proxy models on local data (i.e., information gain and Gini coefficients), and can be used even when the model is not subject to re-training cycles. Several previous studies on training time attacks [47, 58] relax the model access constraints, assuming an adversary can train a ML classifier and provide it to the victim through third-party platforms such as Machine Learning as a Service (MLaaS) [71]. However, we believe that this threat model is rapidly becoming obsolete, at least in the cybersecurity domain, due to the push for stricter cyber hygiene practices from security vendors, including the reluctance to trust third-party model providers and MLaaS platforms [1, 67].
Importantly, our threat model requires the adversary to have a small footprint within the victim network. In practice, the attack could be run by controlling even a single internal host and some external IPs. The crafted trigger models a specific traffic pattern in a time window, independent of IP values.

Adversary's Objective. The main objective of the adversary is to acquire the ability to consistently trigger desired behavior, or output, from the victim model, after the latter has been trained on the poisoned data. In this study, we focus on the binary class scenario (0/1), where the goal is defined as having points of a chosen victim class being mis-labeled as belonging to the target class, when carrying a backdoor pattern that does not violate the constraints of the data domain. For instance, in the benign/malicious case, the adversary attempts to have malicious data points mis-classified as benign, where "benign" represents the target class.

Adversary's Target. We select two representative ML classifier models as targets for our attacks: Gradient Boosting decision trees, and Feed-forward Neural Networks. Both of these models have been widely used in intrusion detection for classifying malicious network traffic, with decision trees often preferred in security contexts due to their easier interpretation [35]. We study two use cases of network traffic classifiers: (1) detection of malicious activities, and (2) application classification.

Data Format. In our threat model, network traffic consists of connection logs ("conn.log" files), which are extracted from packet-level PCAP files using the Zeek³ monitoring tool. We use a subset of the Zeek log fields previously used in the literature that are effective at detecting malicious traffic [61]. The Zeek log fields used in our study are described in Appendix A, Table 5, and include port, IP address, protocol, service, timestamp, duration, packets, payload bytes, and connection state. Thus, the input data is tabular and multivariate, consisting of multiple log fields in either numeric format (e.g., bytes, packets, etc.) or categorical format (e.g., connection state, protocol, etc.). A data point in this domain is represented by

² https://www.virustotal.com/
³ https://zeek.org/ Previously known as Bro.
Phase II. Once the subset of important features is selected, we can proceed to find a suitable assignment of values. To be consistent with real traffic constraints, we need to ensure that the values that we select represent information that can be easily added to data points of the non-target class, by injecting new connections, without having to remove existing connections. Our features are mainly count-based, hence injecting the trigger will increase feature values. Thus, for the assignment, we select values that correspond to the top t-th percentile of the corresponding features for non-target class points. Choosing a high percentile is a reasonable heuristic, as it provides a strong signal. In practice, setting this parameter to the 95th percentile performed well in our experiments.

Phase III. Armed with the desired assignment for the selected features, we can proceed to identify an existing data point that approximates these ideal trigger values. To find it, in our first attack we leverage a mimicry method to scan the non-target (e.g., malicious) class samples and isolate the one with the lowest Euclidean distance from the assignment, in the subspace of the selected features. We call this point in feature space the trigger prototype. An example of the trigger prototype in feature space is given in Appendix C, Table 8.
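A minimal sketch of Phases II and III, assuming the data is already featurized as numpy arrays and that the indices of the important features were chosen in Phase I (the function and variable names are ours, not taken from the released code):

```python
import numpy as np

def craft_prototype(X_nontarget: np.ndarray, important: list,
                    percentile: float = 95.0) -> np.ndarray:
    """Phase II: pick the t-th percentile of each important feature over
    non-target-class points (features are count-based, so the trigger only
    needs to increase them). Phase III: mimicry -- return the non-target
    sample closest (Euclidean) to that assignment in the selected subspace."""
    # Phase II: ideal assignment for the selected features.
    assignment = np.percentile(X_nontarget[:, important], percentile, axis=0)
    # Phase III: nearest existing point in the subspace of selected features.
    dists = np.linalg.norm(X_nontarget[:, important] - assignment, axis=1)
    return X_nontarget[np.argmin(dists)]  # the feature-space "trigger prototype"
```

The returned prototype is the feature-space target that the problem-space search in Phase IV tries to reproduce.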
Phase IV. Up until this point, the process was working completely in feature space. Our explicit goal, however, is to run the attack in problem space. So the next step in the attack chain is to identify, in the attacker's dataset, a contiguous subset of log connections that best approximates the prototype. Enforcing that the selected subset is contiguous ensures that temporal dependencies across log records are preserved. This subset of connections represents the actual trigger that we will use to poison the target-class training data. Appendix C, Table 9 shows an excerpt from a trigger materialized as a pattern of connections.
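Phase IV can be implemented as a sliding window over the attacker's time-ordered connection log: each contiguous block of connections is mapped through the same feature extractor and compared to the prototype, keeping the closest match. The sketch below assumes a feature-extraction callable and a fixed window length, both of which are simplifying assumptions on our part.

```python
import numpy as np
import pandas as pd

def find_trigger_connections(conns: pd.DataFrame, prototype: np.ndarray,
                             featurize, window_len: int = 50) -> pd.DataFrame:
    """Phase IV sketch: scan contiguous windows of the attacker's connection
    log and return the one whose feature vector is closest to the prototype.
    `featurize` maps a block of connections to a feature vector (assumed)."""
    conns = conns.sort_values("ts").reset_index(drop=True)
    best_start, best_dist = 0, np.inf
    for start in range(0, len(conns) - window_len + 1):
        block = conns.iloc[start:start + window_len]
        dist = np.linalg.norm(featurize(block) - prototype)
        if dist < best_dist:
            best_start, best_dist = start, dist
    return conns.iloc[best_start:best_start + window_len]  # the trigger
```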
Phase V. Finally, it is time to inject the trigger into the training data. This step is quite straightforward, as the adversary is in control of generating the poisoned data, and can execute the trigger connections in the specified order. We next describe two strategies for increasing trigger stealthiness before injection.

4.2 Increasing Attack Stealthiness
Beyond the basic objective of maximizing attack success, the adversary may have the additional goal of minimizing the chance of being detected. To achieve this secondary goal, the adversary may wish to slightly alter the trigger before injecting it in the training data. In particular, we study two strategies: (1) trigger size reduction and (2) trigger generation using Bayesian models.

Trigger size reduction. The first strategy consists of minimizing the trigger footprint, by removing all the connections that are not strictly necessary to achieve the values specified in the prototype for the subset of important features (such as connections on other ports). We then select the smallest subset of contiguous connections that would produce the desired values for the selected features.

Trigger generation using Bayesian networks. The second strategy aims at reducing the conspicuousness of the trigger by blending it with the set of connections underlying the data point where it is embedded. To this end, we generate the values of the log fields corresponding to non-selected features in the backdoor to make them appear closer to values common in the target-class natural data in 𝐷𝑎. Note that fields influencing the selected (important) features will not be modified, because they carry the backdoor pattern associated with the target class. Our generative approach leverages Bayesian networks, a widely-used probabilistic graphical model for encoding conditional dependencies among a set of variables and deriving realistic samples of data [15, 27, 70]. Bayesian networks consist of two parts: (1) structure – a directed acyclic graph (DAG) that expresses dependencies among the random variables associated with the nodes, and (2) parameters – represented by conditional probability distributions associated with each node.

Structure. Given our objective to synthesize realistic log connections (in problem space) that lead to the feature-space prototype, we construct a directed acyclic graph 𝐺 = (𝑉, 𝐸) where the nodes 𝑥𝑖 ∈ 𝑉 correspond to fields of interest in the connection log and the edges 𝑒𝑖𝑗 ∈ 𝐸 model the inter-dependencies between them. We explore field-level correlations in connection logs using two statistical methods that have been previously used to study the degree of association between variables [37]: the correlation matrix and the pairwise normalized mutual information. In our experiments, both methods discover similar relationships in 𝐷𝑎, with the mutual information approach bringing out additional inter-dependencies. Note that we are not interested in the actual coefficients, but rather in the associational relationships between variables. Thus, we extract the strongest pairwise associations, and use them in addition to domain expertise to guide the design of the DAG structure. For instance, there is a strong relationship between the number of response packets and source packets (resp_pkts ↔ orig_pkts); between the protocol and the response port (proto ↔ resp_p); between the connection state and protocol (conn_state ↔ proto), etc.
There is a large body of literature on learning the DAG structure directly from data. We point the interested reader to a recent survey by Kitson et al. [37]. However, computing the graphical structure remains a major challenge, as this is an NP-hard problem, where the solution space grows super-exponentially with the number of variables. Resorting to a hybrid approach [37] that incorporates expert knowledge is a common practice that alleviates this issue. The survey also highlights the additional complexity in modeling the DAG when continuous variables are parents of discrete ones, and when there are more than two dependency levels in the graph. Based on the above considerations, we design the directed acyclic graph presented in Figure 2. For practical reasons, we filter out some associations that incur a high complexity when modeling the conditional probability distributions. To ensure that the generated traffic still reflects the inter-dependency patterns seen in the data, we inspect the poisoned training dataset using the same statistical techniques (correlation matrix and mutual information). We include the mutual information matrix on the clean adversarial dataset (Appendix E, Figure 8a) and on the training dataset poisoned with the Generated trigger method (Appendix E, Figure 8b), to show that the associational relationships between variables are preserved after poisoning (though the actual coefficients may vary).

Parameters. Bayesian networks follow the local Markov property, where the probability distribution of each node, modeled as a random variable 𝑥𝑖, depends only on the probability distributions
of its parents. Thus, the joint probability distribution of a Bayesian network consisting of 𝑛 nodes is represented as p(x_1, x_2, . . . , x_n) = ∏_{i=1}^{n} p(x_i | x_{P_i}), where P_i is the set of parents of node i, and the conditional probability of node i is expressed as p(x_i | x_{P_i}).

Figure 2: Directed Acyclic Graph (DAG) representing the inter-dependencies between log connection fields (nodes: service, proto, conn_state, orig_p, resp_p, orig_pkts, resp_pkts, orig_bytes, resp_bytes).

Sampling. The DAG is traversed in a hierarchical manner, one step at a time, as a sequential decision problem based on probabilities derived from the data, with the goal of generating a realistic set of field-value assignments. The value assignments for nodes at the top of the hierarchy are sampled independently, from the corresponding probability distribution, while the nodes on lower levels are conditioned on parent values during sampling. We compute the conditional probabilities of categorical fields (e.g., ports, service, protocol, connection state), and model numerical fields (e.g., originator/responder packets and bytes) through Gaussian kernel density estimation (KDE). An example of the KDE learned from the data, and used to estimate the number of exchanged bytes between a source (originator) and a destination (responder), given the number of packets, is presented in Appendix D, Figure 7.
Given the complexity of sampling from hybrid Bayesian networks, we approximate the conditional sampling process with a heuristic, described in Table 1. We consider an example where the log fields corresponding to the most important features have been set to the TCP protocol and responder port 80. Our generative method synthesizes values for the rest of the fields, in an attempt to make the trigger blend in with the target class. We show in our evaluation that the synthesized poisoning traffic is a good approximation of clean network traffic, both in terms of Jensen-Shannon distance between distributions (Section 5.3) and preservation of field-level dependencies (Appendix E).

5 EXPERIMENTAL RESULTS
5.1 Experimental Setup
In this section, we describe the datasets and performance metrics used in our evaluation. We also present the baseline performance of the target classifiers (without poisoning).

Datasets. We used three public datasets commonly used in cybersecurity research for intrusion detection and application classification.

CTU-13 Neris Botnet: We started our experimentation with the Neris botnet scenario of the well-known CTU-13 dataset [20]. This dataset offers a window into the world of botnet traffic, captured within a university network and featuring a blend of both malicious and benign traffic. Despite the sizeable number of connections (≈ 9×10⁶), the classes are extremely imbalanced, with a significantly larger number of benign than malicious data points. Note that the class imbalance is a common characteristic of security applications. The Neris botnet scenario unfolds over three capture periods. We use two of these periods for training our models, and we partition the last one in two subsets, keeping 85% of the connections for the test set, and 15% for the adversarial set, 𝐷𝑎.

CIC IDS 2018 Botnet: From CTU-13, we moved to a recent dataset for intrusion detection systems, the Canadian Institute for Cybersecurity (CIC) IDS 2018 dataset [75]. We experimented with the botnet scenario, in which the adversary uses the Zeus and Ares malware packages to infect victim machines and perform exfiltration actions. This dataset includes a mixture of malicious and benign samples and is also heavily imbalanced.

CIC ISCX 2016 dataset: This dataset contains several application traffic categories, such as chat, video, and file transfer. We leverage the CIC ISCX 2016 dataset [17] to explore another scenario where an adversary may affect the outcome via poisoning: detection of banned applications. For instance, to comply with company policies, an organization monitors its internal network to identify usage of prohibited applications. An adversary may attempt to disguise traffic originating from a banned application as another type of traffic. We study two examples of classification tasks on the non-vpn traffic of this dataset: (1) File vs Video, where we induce the learner to mistake video traffic flows as file transfer, and (2) Chat vs Video, where the classifier mis-labels video traffic as chat communication.

Performance Metrics. Similar to previous work in this area [58, 73], we are interested in the following indicators of performance for the backdoored model:
• Attack Success Rate (ASR). This is the fraction of test data points which are mis-classified as belonging to the target class. We evaluate this metric on a subset of points that have been previously correctly classified by a clean model trained with the same original training data and random seed.
• Performance degradation on clean data. This metric captures the side effects of poisoning, by evaluating the ability of the backdoored model to maintain its predictive performance on clean samples. Let F1^p be the F1 score of the poisoned model on the clean test set, and F1^c the test score of a non-poisoned model trained equally; the performance degradation on clean data at runtime is ΔF1 = |F1^p − F1^c| (see the sketch below).
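Both metrics can be computed directly from model predictions. A minimal sketch, assuming scikit-learn-style classifiers and a hypothetical helper inject_trigger that applies the backdoor pattern to feature vectors:

```python
import numpy as np
from sklearn.metrics import f1_score

def attack_success_rate(poisoned_model, clean_model, X_victim, y_victim,
                        inject_trigger, target_class):
    """ASR: fraction of victim-class test points that the clean model already
    classifies correctly and that the poisoned model flips to the target class
    once the trigger is injected."""
    correct = clean_model.predict(X_victim) == y_victim
    X_bd = inject_trigger(X_victim[correct])
    return float(np.mean(poisoned_model.predict(X_bd) == target_class))

def clean_performance_degradation(poisoned_model, clean_model, X_test, y_test):
    """Delta F1 = |F1_poisoned - F1_clean| on the clean test set."""
    f1_p = f1_score(y_test, poisoned_model.predict(X_test))
    f1_c = f1_score(y_test, clean_model.predict(X_test))
    return abs(f1_p - f1_c)
```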
Unless otherwise noted, all the results shown in the following sections are averages of five experiments with different random seeds, reported with their relative standard deviations.

Parameters. We define 𝑝% as the percentage of feature-space points of the training dataset that have been compromised by an adversary. Since the amount of poisoned points is generally a critical parameter of any poisoning attack, we measure the attack performance across multiple poison percentage values 𝑝%. At runtime, we randomly select a subset of test points to inject the trigger. Specifically, we select 200 points for the CTU-13 and CIC IDS 2018 datasets, and 80 for the CIC ISCX 2016 dataset (due to its smaller size).

Baseline Model Performance. As mentioned in our threat model, we consider two representative classifiers: a Gradient Boosting Decision Tree (GB), and a Feed Forward Neural Network (FFNN). Note that we are not interested in finding the most effective possible learner for the classification task at hand; instead our focus is on
Table 1: Sampling method for each dependency described in the DAG from Figure 2. In this example, we assume that the most
important features correspond to protocol and port; their values (TCP protocol on port 80) have been determined in Phase II of
our strategy. Here, our generative method samples the rest of the log field values. 𝐷𝑎 represents the attacker’s dataset.
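Since the body of Table 1 is not reproduced here, the sketch below illustrates the kind of hierarchical sampling it describes, under our own simplifying assumptions: the fields fixed by the trigger (TCP, responder port 80) are left untouched, categorical children are drawn from conditional frequency tables estimated on 𝐷𝑎, and numeric fields are drawn from Gaussian KDEs fit on 𝐷𝑎. The exact per-field conditioning used in the attack is the one given in Table 1 of the paper.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

def sample_categorical_given(d_a: pd.DataFrame, child: str, parent: str, parent_val):
    """Draw a child field from its empirical distribution in D_a, conditioned
    on the observed parent value."""
    pool = d_a.loc[d_a[parent] == parent_val, child]
    probs = pool.value_counts(normalize=True)
    return rng.choice(probs.index.to_numpy(), p=probs.to_numpy())

def sample_numeric(d_a: pd.DataFrame, column: str) -> float:
    """Draw a numeric field from a Gaussian KDE fit on D_a (clamped at zero)."""
    kde = gaussian_kde(d_a[column].astype(float).to_numpy())
    return float(max(0.0, kde.resample(1, seed=rng)[0, 0]))

def generate_connection(d_a: pd.DataFrame) -> dict:
    """Synthesize one connection: trigger fields are fixed, the rest sampled."""
    conn = {"proto": "tcp", "id.resp_p": 80}  # fixed by the trigger (Phase II)
    conn["service"] = sample_categorical_given(d_a, "service", "id.resp_p", 80)
    conn["conn_state"] = sample_categorical_given(d_a, "conn_state", "proto", "tcp")
    conn["orig_pkts"] = round(sample_numeric(d_a, "orig_pkts"))
    conn["orig_bytes"] = round(sample_numeric(d_a, "orig_bytes"))
    return conn
```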
parameter, halving and doubling the number of selected features, but we found that eight were sufficient to achieve satisfying ASRs.

Attack Success Rate: We show the results of these experiments in Figure 3. On average, we found the Entropy strategy to be the most successful against both classifiers on this dataset. The Random strategy leads to inconsistent results: occasionally, it stumbles upon useful features, but overall attacks relying on Random selection perform worse than attacks guided by the other feature selection methods. Figure 3 also illustrates a major finding – our attacks perform well even at very small poisoning rates such as 0.1%, where they reach an attack success rate of up to 0.7 against the Gradient Boosting classifier. As expected, increasing the poisoning percentage leads to an increase in attack success rate; for instance, an ASR of 0.95 is obtained with Entropy at 1.0% poisoning. By comparison, previous works only considered larger poisoning rates (e.g., 2% to 20% in [41], 20% samples from nine (out of ten) non-target classes in [58]). We also notice that some of the variance in the ASR results can be attributed to a somewhat bimodal distribution. This can be partially explained with differences in the resulting trigger sizes, with Figure 4b highlighting the correlation between larger triggers and higher ASR. We leave a more detailed analysis of the distribution of the ASR scores for future work.
Furthermore, we observe that the SHAP strategy, while working well in some scenarios (especially for the application classification tasks in Section 5.5), does not, on average, lead to better results than estimating feature importance through proxy models (Entropy and Gini). This makes the attack quite easy to run in practice, as it circumvents the necessity to run multiple, potentially expensive, queries to the victim model.
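In practice, the proxy-model strategies reduce to ranking features with standard tools on the attacker's local data 𝐷𝑎. A hedged sketch using scikit-learn, with mutual information as an information-gain style ranking and the impurity-based importances of a tree ensemble standing in for the Gini ranking (SHAP values computed against the victim model could be substituted when query access is available):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import mutual_info_classif

def select_trigger_features(X_a: np.ndarray, y_a: np.ndarray, k: int = 8,
                            strategy: str = "entropy") -> np.ndarray:
    """Rank features on the attacker's labeled data D_a and return the indices
    of the k most important ones (eight features were found sufficient in the
    experiments reported above). This is one reasonable implementation, not
    the released code."""
    if strategy == "entropy":          # information-gain style ranking
        scores = mutual_info_classif(X_a, y_a, random_state=0)
    elif strategy == "gini":           # impurity-based importances of a proxy model
        proxy = GradientBoostingClassifier(random_state=0).fit(X_a, y_a)
        scores = proxy.feature_importances_
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return np.argsort(scores)[::-1][:k]
```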
Performance degradation on clean data: While these results show that the attack causes the poisoned model to misclassify poisoned data, we also want to make sure that the performance on clean data is maintained. The average ΔF1 across poisoning rates and feature selection strategies in our experiments was below 0.037, demonstrating that the side effects of the attack are minimal. The neural network model exhibits on average a slightly larger decrease when compared against the Gradient Boosting classifier, especially when the Entropy and Gini feature selection strategies are used.

5.3 Attack Stealthiness
Remaining undetected is an important factor in running a successful poisoning campaign. Here, we study the impact of our two approaches for increasing attack stealthiness described in Section 4.2: reducing the trigger size (Reduced trigger) and generating the trigger connections using Bayesian networks (Generated trigger). We start by analyzing the attack success with the different types of triggers, followed by a quantitative comparison of their stealthiness in feature space (via anomaly detection), and in problem space (via the Jensen-Shannon distance).

Evaluation of attack success. Figure 4a shows the attack success rate as a function of the poisoning percentage for the three different types of triggers: Full, Reduced, and Generated. We observe that all triggers are able to mount effective attacks against the Gradient Boosting classifier, with attack success rates over 0.8 when 0.5% or more of the training data is poisoned. The Feed-forward Neural Network is generally more resilient to our attacks: the Full trigger and Reduced trigger deliver an attack success rate of about 0.7 and 0.4, respectively, while the Generative trigger is able to synthesize more effective triggers, which leads to attack success rates over 0.7. Figure 4b studies the correlation between trigger size (measured in number of connections) and attack success rate for each type of trigger. Each data point represented in the figure constitutes a separate experiment, while the regression lines capture the trend (how ASR changes as the trigger size changes). These figures show that the generative method leads to consistently smaller triggers than the other two methods, without sacrificing attack success. This result is indicative of the power of generative models in knowledge discovery, and, in our case, their ability to synthesize a small set of realistic log connections that lead to the feature-space prototype. Figure 4b also shows that the size reduction strategy is able to create triggers (Reduced trigger) that are smaller than the Full trigger, but at the expense of the attack success rate.

Table 3: Area under the Precision-Recall Curve and F1 score obtained by performing anomaly detection on the poisoned data with an Isolation Forest model trained on a clean subset of the training data. CTU-13 Neris, at 1% poisoning rate.

Strategy   Model              Trigger     PR AUC   F1 score
Entropy    Any                Full        0.056    0.013
Entropy    Any                Reduced     0.045    0.012
Entropy    Any                Generated   0.078    0.018
SHAP       Gradient Boosting  Full        0.099    0.015
SHAP       Gradient Boosting  Reduced     0.070    0.013
SHAP       Gradient Boosting  Generated   0.099    0.019
SHAP       Feed-forward NN    Full        0.061    0.015
SHAP       Feed-forward NN    Reduced     0.047    0.014
SHAP       Feed-forward NN    Generated   0.052    0.012

Evaluation of attack stealthiness in feature space. Next, we evaluate the attack stealthiness in feature space, using the Isolation Forest [45] algorithm for anomaly detection. The objective of this experiment is to see whether a standard technique for anomaly detection can identify and flag the poisoned samples as anomalies. The anomaly detector is trained on a clean subset of data, which is 10% of the entire training dataset.
Table 3 presents the anomaly detection results on the poisoned data obtained with each trigger type (Full, Reduced, and Generated). For comparison, we evaluate both the entropy-based and the SHAP-based feature selection strategies used to craft the injected pattern. Since SHAP queries the model to compute feature relevance scores, we present the anomaly detection results separately for a SHAP-guided attack against a Gradient Boosting classifier and against a Feed-forward Neural Network. Across the board, we observe very low Precision-Recall area under the curve (AUC) scores (in the 0.045 – 0.099 range), as well as very low F1 scores (in the 0.012 – 0.019 range). These results demonstrate the difficulty of differentiating the poisoned data points from the clean data points, and indicate that the poisoning attacks are highly inconspicuous in feature space.
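A sketch of this evaluation with scikit-learn; the anomaly-score convention and the default contamination setting are our assumptions:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import average_precision_score, f1_score

def evaluate_feature_space_stealth(X_clean_subset, X_test_clean, X_poisoned):
    """Train an Isolation Forest on a clean subset, then try to flag the
    poisoned points among clean ones; report PR AUC and F1 as in Table 3."""
    detector = IsolationForest(random_state=0).fit(X_clean_subset)
    X_eval = np.vstack([X_test_clean, X_poisoned])
    y_eval = np.concatenate([np.zeros(len(X_test_clean)), np.ones(len(X_poisoned))])
    # Higher score = more anomalous (flip the sign of score_samples).
    scores = -detector.score_samples(X_eval)
    flagged = (detector.predict(X_eval) == -1).astype(int)
    return average_precision_score(y_eval, scores), f1_score(y_eval, flagged)
```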
Evaluation of attack stealthiness in problem space. We also evaluate attack stealthiness in problem space, in terms of how
Figure 4: Analysis of trigger selection strategy. CTU-13 Neris Botnet scenario, with the Entropy feature selection strategy. (b) Correlation between the number of connections composing the trigger and the attack success rate (ASR). Each point represents a separate experiment. Curve fitting illustrating the trend is performed using linear regression.
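The problem-space comparison referenced here and later in the discussion (Figure 5) relies on the Jensen-Shannon distance between the distributions of individual log fields in poisoned versus clean traffic. A minimal sketch, assuming numeric fields are first discretized into histograms over shared bins:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def field_js_distance(clean_values, poisoned_values, bins: int = 50) -> float:
    """Jensen-Shannon distance between the empirical distributions of one log
    field in clean vs. poisoned traffic (numeric fields binned into histograms)."""
    lo = min(np.min(clean_values), np.min(poisoned_values))
    hi = max(np.max(clean_values), np.max(poisoned_values))
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(clean_values, bins=edges)
    q, _ = np.histogram(poisoned_values, bins=edges)
    # jensenshannon normalizes the inputs to probability vectors internally.
    return float(jensenshannon(p, q, base=2))
```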
not feasible in practice, particularly for stateful protocols (TCP). There are two ways to address this potential issue. First, given the relentless pace of improvements in generative models, including those targeting tabular data [7, 87], we expect that the ability of generative models to infer the inter-feature constraints that characterize this data modality will increase significantly in the very short term. In parallel, the adversary could attempt to verify the correctness of the generated connections using a model checker and a formal model of the TCP protocol, and simply reject the non-conforming ones. Both approaches are exciting avenues for future research, and we leave their in-depth analysis to future work.

Labeling. Network traffic labeling usually relies on intrusion detection systems, antivirus tools and external threat services [24, 62]. In our threat model, the adversary has no control over the labels, and simply injects the poisoning traffic into benign connections. Hence, the question arises: will the poisoned samples still have a benign label? We assume the poisoned samples remain benign, based on the following reasons: (1) the Jensen-Shannon distance between poisoned and clean samples is very small (Figure 5); (2) anomaly detection (Table 3) cannot identify the poisoned samples (F1 score < 0.02); (3) features are extracted from connection metadata, and the actual packet contents do not need to be malicious.

Mitigation. We designed methods to hide the poisoning campaign, and showed that our poisoning points are difficult to identify both in feature space, by using anomaly detection techniques, and in problem space, by analysing the distributional distance of poisoned data. Defending ML models from backdoor attacks is an open, and extremely complex, research problem. Many of the currently proposed solutions are designed to operate in the computer vision domain [10], or on specific model architectures [46, 80]. In contrast, our attack method generalizes to different model typologies. Moreover, initial research on defending classifiers from backdoor attacks in the security domain [28] highlighted potential trade-offs between robustness and utility (e.g., defenses that rely on data sanitization may mistakenly remove a high number of benign samples in an attempt to prune out potentially poisoned samples). By releasing new attack strategies, we hope to encourage future research in the challenging direction of defending against backdoor attacks on network traffic.

7 CONCLUSIONS
With this work we investigated the possibility of carrying out data-only, clean-label, poisoning attacks against network flow classifiers. We believe this threat model holds substantial significance for the security community, due to its closer alignment with the capabilities exhibited by sophisticated adversaries observed in the wild, and the current best practices in secure ML deployments, in contrast to other prevailing models frequently employed.
The attack strategy we introduce can effectively forge consistent associations between the trigger pattern and the target class even at extremely low poisoning rates (0.1-0.5% of the training set size). This results in notable attack success rates, despite the constrained nature of the attacker. While the attack is effective, it has minimal impact on the victim model's generalization abilities when dealing with clean test data. Additionally, the detectability of the trigger can be lessened through different strategies to decrease the likelihood of a defender discovering an ongoing poisoning campaign.
Furthermore, we demonstrated that this form of poisoning has a relatively wide applicability for various objectives across different types of classification tasks. The implications of these findings extend our understanding of ML security in practical contexts, and prompt further investigation into effective defense strategies against these refined attack methodologies.

ACKNOWLEDGMENTS
This research was sponsored by the U.S. Army Combat Capabilities Development Command Army Research Laboratory (DEVCOM ARL) under Cooperative Agreement Number W911NF-13-2-0045, and the Department of Defense Multidisciplinary Research Program of the University Research Initiative (MURI) under contract W911NF-21-1-0322.
DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited. This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.

REFERENCES
[1] Andrew Marshall, Jugal Parikh, Emre Kiciman, and Ram Shankar Siva Kumar. 2022. Threat Modeling AI/ML Systems and Dependencies. https://learn.microsoft.com/en-us/security/engineering/threat-modeling-aiml.
[2] Manos Antonakakis, Roberto Perdisci, David Dagon, Wenke Lee, and Nick Feamster. 2010. Building a Dynamic Reputation System for DNS. In Proceedings of the 19th USENIX Conference on Security (Washington, DC) (USENIX Security'10). USENIX Association, USA, 18.
[3] Manos Antonakakis, Roberto Perdisci, Wenke Lee, Nikolaos Vasiloglou, and David Dagon. 2011. Detecting Malware Domains at the Upper DNS Hierarchy. In Proceedings of the 20th USENIX Conference on Security (San Francisco, CA) (SEC'11). USENIX Association, USA, 27.
[4] Giovanni Apruzzese, Michele Colajanni, Luca Ferretti, and Mirco Marchetti. 2019. Addressing Adversarial Attacks Against Security Systems Based on Machine Learning. In 2019 11th International Conference on Cyber Conflict (CyCon), Vol. 900. 1–18. https://doi.org/10.23919/CYCON.2019.8756865
[5] Md. Ahsan Ayub, William A. Johnson, Douglas A. Talbert, and Ambareen Siraj. 2020. Model Evasion Attack on Intrusion Detection Systems using Adversarial Machine Learning. In 2020 54th Annual Conference on Information Sciences and Systems (CISS). 1–6. https://doi.org/10.1109/CISS48834.2020.1570617116
[6] Alexander Binder, Grégoire Montavon, Sebastian Lapuschkin, Klaus-Robert Müller, and Wojciech Samek. 2016. Layer-wise relevance propagation for neural networks with local renormalization layers. In Artificial Neural Networks and Machine Learning – ICANN 2016: 25th International Conference on Artificial Neural Networks, Barcelona, Spain, September 6-9, 2016, Proceedings, Part II 25. Springer, 63–71.
[7] Stavroula Bourou, Andreas El Saer, Terpsichori-Helen Velivassaki, Artemis Voulkidis, and Theodore Zahariadis. 2021. A Review of Tabular Data Synthesis Using GANs on an IDS Dataset. Information 12, 9 (Sept. 2021), 375. https://doi.org/10.3390/info12090375
[8] Martin Burkhart, Mario Strasser, Dilip Many, and Xenofontas Dimitropoulos. 2010. SEPIA: Privacy-Preserving Aggregation of Multi-Domain Network Events and Statistics. In 19th USENIX Security Symposium (USENIX Security 10). USENIX Association, Washington, DC. https://www.usenix.org/conference/usenixsecurity10/sepia-privacy-preserving-aggregation-multi-domain-network-events-and
[9] Xiaoyu Cao and Neil Zhenqiang Gong. 2017. Mitigating Evasion Attacks to Deep Neural Networks via Region-Based Classification. In Proceedings of the 33rd Annual Computer Security Applications Conference (Orlando, FL, USA) (ACSAC '17). Association for Computing Machinery, New York, NY, USA, 278–287. https://doi.org/10.1145/3134600.3134606
[10] Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, and Biplav Srivastava. 2019. Detecting Backdoor
B SELECTED FEATURES
While the features selected to form the trigger will change according to selection strategy, victim model, and randomness effects, we report in Table 7 the list of 20 most frequently selected features in our experiments on CTU-13 Neris. Since our features are aggregated by internal IP (in addition to time window and port), in the table, "_s_" and "_d_" distinguish between the internal IP being the

⁵ S0: Connection attempt seen, no reply observed by Zeek.
⁶ RSTRH: The responder sent a SYN ACK and then a reset, while Zeek did not observe a SYN from the originator.
Table 9: Excerpt of 10 consecutive connection events from a trigger. Due to space constraints, only the relevant fields are shown.
id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_state orig_pkts resp_pkts
147.32.84.165 2293 67.195.168.230 25 tcp - 3.004297 0 0 S0 2 0
147.32.84.165 2297 168.95.5.51 25 tcp - 3.004231 0 0 S0 2 0
147.32.84.165 2298 67.195.168.230 25 tcp - 2.987467 0 0 S0 2 0
147.32.84.165 2303 74.125.113.27 25 tcp - 2.987476 0 0 S0 2 0
147.32.84.165 2359 174.133.57.141 80 tcp http 0.310870 358 1765 SF 6 5
147.32.84.165 2367 31.192.109.167 80 tcp http 3.170172 229 194 SF 6 5
147.32.84.165 2368 174.133.57.141 80 tcp http 124.109969 358 2920 RSTO 6 5
147.32.84.165 2354 212.117.174.7 4506 tcp - 3.009544 0 0 S0 2 0
147.32.84.165 2353 212.117.171.138 65500 tcp - 57.278569 883 1097 SF 23 24
D MODELING THE BYTES DISTRIBUTION
In Figure 7, we present the modeling of two log field values using Kernel Density Estimation (KDE): responder bytes (left side) and originator bytes (right side). The figure shows the observed distribution of bytes per packet in the adversary's dataset (top row), the KDE distributions of these fields learned from the data (middle row), and the distribution of the sampled values (bottom row). Note the similar distribution across the three rows, for each of the two fields, which indicates that the KDE method is able to capture the data distribution well.

Figure 8 (b): Mutual information on the poisoned training dataset
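A small sketch of the estimator described in Appendix D: a joint Gaussian KDE is fit over (packets, bytes) pairs from 𝐷𝑎, and bytes are sampled for a given packet count by keeping the joint draw whose packet coordinate is closest to the requested value. This slice-style conditioning is our simplification of the conditional sampling described in Section 4.2.

```python
import numpy as np
from scipy.stats import gaussian_kde

def fit_bytes_given_pkts(pkts: np.ndarray, bytes_: np.ndarray) -> gaussian_kde:
    """Fit a joint Gaussian KDE over (packets, bytes) pairs observed in D_a."""
    return gaussian_kde(np.vstack([pkts.astype(float), bytes_.astype(float)]))

def sample_bytes(kde: gaussian_kde, n_pkts: int, n_draws: int = 500, seed=0) -> float:
    """Approximate a draw from p(bytes | packets = n_pkts): resample from the
    joint KDE and keep the draw whose packet coordinate is closest to n_pkts."""
    draws = kde.resample(n_draws, seed=seed)      # shape (2, n_draws)
    idx = int(np.argmin(np.abs(draws[0] - n_pkts)))
    return float(max(0.0, draws[1, idx]))
```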