MTDroid: A Moving Target Defense-Based Android Malware Detector Against Evasion Attacks
Abstract—Machine learning (ML) has been widely adopted for Android malware detection to deal with the serious threats brought by explosive malware attacks. However, it has recently been proven that ML-based detection systems exhibit inherent vulnerabilities to evasion attacks, which inject adversarial perturbations into a malicious app to hide its malicious behaviors and evade detection. To date, researchers have not found effective solutions for this critical problem. Although there are some similar works in the image classification field, most of those ideas cannot be borrowed due to the significant differences between images and Android apps. In this paper, we exploit Moving Target Defense (MTD) to continually change the attack surface of the protected detector and create uncertainty on the attacker side. We thus propose a novel Android malware detection framework named MTDroid, which fully leverages a seamless blend of dynamicity, diversity, and heterogeneity to mitigate the impact of evasion attacks. To this end, we develop a dynamic model pool to decrease the exposure time of a single classifier by building and rebuilding multiple heterogeneous models with distinct data. We then generate diversified variant models to provide defensive measures against various attacks, and further improve robustness through ensemble learning. Specifically, we propose a two-stage selection algorithm to optimize the ensemble learning process, and design a hybrid update strategy to refresh the framework dynamically. The experimental results show that MTDroid significantly enhances robustness against a wide range of attacks and outperforms state-of-the-art methods on three popular real-world datasets.

Index Terms—Android Malware Detection, Adversarial Examples, Evasion Attacks, Moving Target Defense, Machine Learning.

Yuyang Zhou, Guang Cheng, Zongyao Chen, and Yujia Hu are with the School of Cyber Science and Engineering, Southeast University, Purple Mountain Laboratories, and Jiangsu Province Engineering Research Center of Security for Ubiquitous Network, Nanjing 211189, China. E-mail: {yyzhou, chengguang, solar1s, huyj}@seu.edu.cn. Shui Yu is with the School of Computer Science, University of Technology Sydney, Ultimo, NSW 2007, Australia. E-mail: [email protected]. Guang Cheng is the corresponding author.

I. INTRODUCTION

WITH the growing popularity of the Android system, it not only attracts developers to produce powerful and diversified mobile apps, but also becomes a potential victim of exponentially increasing malware attacks. In 2021, Zimperium reported that 2 billion new malware samples emerged in the wild [1], and Kaspersky detected 1,661,743 mobile malware or unwanted software installers in 2022 [2]. Hence, detecting and analyzing Android malware, which can lead to both privacy leakage and economic loss, remains a major challenge.

In recent years, there has been a growing body of research on Android malware detection, most of which leverages machine learning (ML) techniques to train a binary classifier (i.e., a malware detector) that identifies malware instances via features [3]–[5]. In particular, existing methods [6]–[8] have shown promising results with 99% accuracy in laboratory settings. Unfortunately, ML-based detection systems have been proven to be highly vulnerable to adversarial examples [9], which can enable evasion attacks [3], [10], poisoning attacks [11], or a combination of both. In this study, we narrow our focus to evasion attacks, which are designed to deceive ML-based detection during the testing phase.

Although adversarial machine learning has been investigated for more than a decade in the image classification field [12], there are few defense mechanisms that protect Android malware detectors against evasion attacks [10], [13]. Moreover, these countermeasures may sacrifice accuracy on clean data in some cases [14]. Static and deterministic classifiers also grant attackers the advantage of time to acquire the details of the defense mechanisms, allowing them to carefully craft next-generation adversarial examples that compromise the enhanced ML model again.

Fortunately, Moving Target Defense (MTD) [15] has emerged as a promising solution in the field of cybersecurity. It is a proactive paradigm that aims to develop and implement mechanisms that are diverse, heterogeneous, and constantly evolving [16]. By enhancing the dynamics and randomness of the protected system, MTD increases the system's unpredictability in both time and space [17]. Consequently, it significantly amplifies the complexity and cost for attackers, weakening their advantage and diminishing their effectiveness in confrontations [18]. In the context of adversarial example detection, MTD typically involves moving between multiple ML models to prevent a single point of failure. This approach renders adversarial examples constructed against a single detector ineffective across multiple detectors, making them less successful against the victim detection system over an extended period [19].

However, previous MTD-based studies have several drawbacks. First, existing work often relies on small ensembles of learning models, which may be bypassed by strong evasion attacks because of their static nature [20]. Second, while the utilization of a model pool can be beneficial, relying solely on traditional adversarial training methods may result in the detection of only a limited number of evasion attacks that closely resemble the training samples. Third, the selection of candidate models and the updating strategy of the model pool have not been well investigated in previous research. In the absence of optimized MTD deployment, adversaries can fool
the target model by taking advantage of the exposed vulnerable model within the attack window.

To deal with the aforementioned challenges, this paper proposes an MTD-based Android malware detection framework, namely MTDroid, that adequately exploits dynamicity, diversity, and heterogeneity in the full lifecycle of detection. In detail, MTDroid maintains multiple heterogeneous models and exploits diversified adversarial training, increasing the robustness of the classifiers while reducing the transferability among them. It also optimally combines several variant models via ensemble learning to strengthen the resilience to various evasion attacks and preserve performance on clean applications. To this end, the proposed framework is designed to dynamically evolve based on a hybrid update strategy, decreasing the exposure time of the detector and invalidating the previous knowledge of attackers. In particular, the main contributions of this paper are summarized as follows:

• We present a novel Android malware detection framework that leverages a seamless blend of dynamicity, diversity, and heterogeneity to enhance security and robustness. To the best of our knowledge, this is the first work that fully exploits MTD properties to proactively defend against a wide range of evasion attacks in the Android ecosystem.
• We deploy a model pool consisting of multiple heterogeneous models to greatly mitigate the impact of the transferability of adversarial malware on individual models within the framework. Additionally, we propose a diversified adversarial training method to generate diversified variant classifiers, making the detection system highly resilient to a variety of adversarial examples.
• We propose a two-stage model selection algorithm to systematically select the most suitable sub-classifiers and optimize the construction of the ensemble malware detector at test time. Furthermore, we design a hybrid update strategy to rebuild all aspects of the detection framework, ensuring the long-term effectiveness of the system and strengthening its robustness against evolving attacks.
• We compare MTDroid with state-of-the-art methods on three widely used Android application datasets. Experimental results demonstrate that MTDroid achieves high effectiveness in countering 23 evasion attacks across six attack scenarios, as well as outperforming other defenses in terms of higher comprehensive robustness and lower overhead.

The remainder of this paper is organized as follows. We discuss the related work in Section II and propose the threat model in Section III. Section IV demonstrates the framework design. Section V presents the evaluation results. We discuss the limitations of our work and open challenges in Section VI. Conclusions and future work are summarized in Section VII.

II. RELATED WORK

A. Android Malware Detection

In the last few years, many solutions have been proposed to cope with the growing number of Android malware samples; they can be mainly classified into the following three categories.

Static Analysis-based Detection. These approaches draw inspiration from static program analysis, which statically inspects programs and disassembles their code to identify potential Android malware. Numerous static analysis methods and tools have been proposed, such as FlowDroid [21] (with its extension IccTA [22]), DroidSafe [23], and Amandroid [24]. However, these methods may fall short when a malware sample employs encryption or obfuscation technology.

Dynamic Analysis-based Detection. Another research direction focuses on detecting malware via runtime analysis. For instance, DroidScope [25] reconstructs the OS-level and Java-level semantics simultaneously by extracting APIs from different platform layers. TaintDroid [26] dynamically monitors applications in a protected environment by taint analysis. However, gathering runtime behaviors requires executing applications, which often leads to additional time overhead.

ML-based Detection. To mitigate these threats efficiently, researchers have developed many malware detection systems based on ML algorithms, which have shown high accuracy and low false positive rates. For example, Drebin [27] exploits static features and applies a Support Vector Machine (SVM) for malware detection. MaMaDroid [28] extracts the API calls and utilizes K-Nearest Neighbors (KNN) to identify malware. Besides, some deep learning (DL) based methods [29]–[31] have also been proposed with outstanding capability in malware or attack detection. Unfortunately, some of them have been proven to be very vulnerable to adversarial examples.

Following previous Android malware detection studies, the proposed method seamlessly integrates syntactic features derived from static analysis with various ML algorithms, resulting in efficient and accurate classification. Additionally, it incorporates diverse adversarial training methods to overcome the limitations of existing ML-based detectors, significantly enhancing their robustness against evasion attacks.

B. Evasion Attacks & Defenses in ML-based Detectors

Evasion attacks [3] employ crafted inputs to mislead models such that malicious apps will be classified as benign. According to Ref. [32], they can be divided into the following two types.

Feature-Space Attacks. Feature-space attacks map the malware example into a vector and add perturbations to the values of the vector to achieve misclassification. For example, Grosse et al. [33] perturbed the feature vector based on the Jacobian matrix and the saliency map, and Xu et al. [34] perturbed one vector each iteration via the simulated annealing algorithm.

Problem-Space Attacks. These attacks change the actual instance directly to generate real adversarial malware. For instance, Chen et al. [3] applied perturbations to both the manifest and Dalvik bytecode to deceive detectors, and Yang et al. [35] proposed an approach for creating novel variants from semantic analysis of existing malware. In addition, Bostani and Moonsamy [36] utilized an n-gram-based similarity method to directly manipulate malware samples to evade detection.

To improve the robustness of ML-based detectors, several countermeasures have been proposed.

Defensive Distillation. This method uses the additional knowledge extracted during distillation to smooth the detection
models, reducing their sensitivity to adversarial perturbations and improving their generalizability properties [14], [37]. Unfortunately, it has been broken by Carlini and Wagner [38].

Adversarial Training. This solution effectively strengthens the robustness of detection models against certain attacks by recursively feeding crafted adversarial examples of this class into the training dataset [39], [40]. However, it may be helpless against adversarial examples that are not seen in training.

Ensemble Defense. Recently, ensemble defenses [41], [42] have also been proposed, which combine several models with particular methods to produce robust predictions. For example, Li et al. [13] jointly employed a Variational AutoEncoder (VAE) and a Multilayer Perceptron (MLP) and combined their outcomes to make the final decision. Nevertheless, adversaries can continuously query the system and eventually find a way through, as the ensemble classifier remains a fixed target.

Distributed Defense. With the proliferation of Internet of Things (IoT) devices operating on the Android platform, there has been a surge in research efforts [43], [44] focusing on distributed and collaborative approaches to strengthen security in IoT networks. For example, Yazdinejad et al. [45] employed federated learning to facilitate collaborative anomaly detection by leveraging distributed local ML models, thereby safeguarding blockchain-based industrial IoT networks against cyberattacks. This growing body of work has served as inspiration for our investigation into countering evasion attacks.

Unlike previous work in defensive distillation and adversarial training, MTDroid takes a step further by employing a combination of heterogeneous models along with multiple variants trained with adversarial examples based on diversified attack principles. This approach greatly enhances the robustness against a wide range of attacks. In addition, MTDroid tackles the limitations of existing ensemble methods through the strategic movement of malware detectors. This dynamic adaptation creates a moving target for adversaries, making successful evasion a significant challenge.

C. MTD for Mitigating Adversarial Examples

Existing MTD techniques have demonstrated their effectiveness in dealing with adversarial examples in the image domain [46]. For instance, Roy et al. [47] proposed a Stackelberg game-based switching method among multiple ML algorithms to improve robustness, and Amich et al. [48] deployed a pool of models and automatically replaced them to achieve dynamic defense. Besides, Qian et al. [49] leveraged differential knowledge distillation and a Bayesian Stackelberg game to mitigate adversarial example attacks, but the performance declines on clean data. In addition, Song et al. [19] applied multiple fork models and replaced them in idle time to collaboratively thwart adversarial images. However, recent studies [20], [50] have proved that adversarial examples can fool all classifiers if there are only a limited number of candidate models, indicating that a combination of homogeneous models may fail in these cases.

In contrast to previous MTD-based approaches from other domains that typically rely on small ensembles of models, MTDroid presents a comprehensive defensive framework that seamlessly integrates the key aspects of MTD technology, including dynamicity, diversity, and heterogeneity. MTDroid encompasses the creation of a diverse pool of heterogeneous models, the adversarial training of variant models, the systematic selection of candidate models, and the strategic utilization and updating of these models. This framework can effectively minimize the effect of transferability of adversarial examples across models and improve adversarial robustness against a wide range of potential evasion attacks.

Furthermore, unlike other defenses, MTDroid utilizes mutually exclusive training subsets to train basic models and adaptively retains a portion of clean examples during the adversarial training phase, preserving classification accuracy on clean data. In addition, MTDroid takes into account statistical information on both queries and misclassifications to provide an adaptive and tailored updating strategy, rather than relying on periodic model renewal as seen in existing approaches.

III. THREAT MODEL

A. Attacker's Goal and Capability

The attacker aims to craft adversarial examples with well-designed perturbations to evade detection; these undetected malware examples would then infect Android devices for data collection, credential theft, remote control, etc.

In the context of evasion attacks, an attacker is capable of manipulating both features (e.g., feature addition and removal [51]) and samples (e.g., string encryption [52]) at test time without compromising the malicious functionality. In addition, we assume that the attacker may know of the existence of defense mechanisms, but is unaware of the deployment details.

B. Attacker's Knowledge

Depending on the level of knowledge of the targeted system, attacks can be classified into three categories:

White-Box Attack. The white-box attacker has complete knowledge of the targeted detector, including the training data D = (X, Y), the corresponding feature set, and the classification algorithm f with its parameters θ.

Black-Box Attack. Instead, the black-box attacker has no internal information about the target and can only access the predicted results (i.e., benign or malicious labels) by querying the malware detector, i.e., y = fθ(x).

Grey-Box Attack. The grey-box attack falls between the two aforementioned boundaries: the attacker learns the victim partially (e.g., learns a surrogate classifier f̂) and exploits the feedback to improve that knowledge.

C. Attack Scenarios

The attack scenario defines how the attacker implements malicious actions. Given a sample-label pair (x, y) of a malware sample and an adversarial manipulation δ, evasion attacks can be written as

fθ(x′) = fθ(x + δ) ≠ y, x′ ∈ Ω(x), (1)

where fθ : X → Y is the classification model parameterized by θ, which maps an input to a predicted label, and Ω(x) is the set of allowable perturbations of x.
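To make the evasion objective in Eq. (1) concrete for binary Drebin-style feature vectors, the toy sketch below shows an add-only manipulation (flipping feature bits from 0 to 1, which preserves the app's functionality) defeating a stand-in linear detector. The feature dimension, weights, and bias are illustrative assumptions, not values from this paper.

```python
# Toy illustration of Eq. (1): the attacker keeps the app's functionality by
# only ADDING features (flipping 0 -> 1), so the perturbed vector x' stays
# inside the feasible set Omega(x). The 5-dimensional feature space and the
# linear detector are hypothetical, chosen only to make the mechanics concrete.

def detect(x, w, b):
    """A stand-in linear detector f_theta: returns 1 (malware) or 0 (benign)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0

# Hypothetical weights: positive weights mark malware-indicative features,
# negative weights mark features common in benign apps.
w = [2.0, 1.5, -1.0, -1.2, -0.8]
b = -0.5

x = [1, 1, 0, 0, 0]            # original malware sample: detected
assert detect(x, w, b) == 1

# Add-only manipulation: flip benign-indicative zeros to one.
x_adv = [1, 1, 1, 1, 1]        # x' in Omega(x): a superset of x's features
assert all(xa >= xi for xa, xi in zip(x_adv, x))  # no feature removed
print(detect(x_adv, w, b))     # 0 -> misclassified as benign, f(x') != y
```

The sketch also shows why the constraint set Ω(x) matters: restricting the attacker to additions still leaves enough freedom to cross a linear decision boundary.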
In this section, we consider six distinct attack scenarios guided by three selection criteria: (i) We focus on commonly observed and well-known evasion techniques in the field of malware detection. (ii) We prioritize attacks that possess the potential to significantly impact the effectiveness and robustness of the malware detector. (iii) We ensure the inclusion of a diverse range of evasion techniques, encompassing various attack perturbations, evasion mechanisms, and levels of sophistication. These criteria allow us to provide a comprehensive assessment of the proposed approach across multiple challenging attack scenarios, although it is acknowledged that other attack methods may exist.

1) Gradient-based attacks: This scenario involves white-box attacks, where attackers aim to find a perturbation δ such that the perturbed sample misleads the model while keeping δ as small as possible to maintain the adversarial sample's naturalness. The specific mechanisms and steps depend on the particular type of attack, but they all rely on gradient information for iterative optimization.

For example, the Bit Coordinate Ascent (BCA) [53] attack updates one bit in each iteration by considering the feature with the maximum corresponding partial derivative of the loss. Formally, it can be described as

x′ ∈ arg max_{δt} L(fθ(x + δt), y), x + δt ∈ Ω(x), (2)

where t is the iteration and L is a loss function that measures the mismatch between f(x + δ) and y. The objective of the iterative update is to maximize the value of L(fθ(x + δt), y), thereby strategically misleading the detector into classifying adversarial examples as benign.

Besides, the Projected Gradient Descent (PGD) attack [54] offers the attacker greater flexibility by enabling the addition and removal of features while retaining their malicious functionalities. It initializes the perturbation with a zero vector and perturbs it via an iterative process, such that

δt+1 = Π(δt + λ∇δ L(fθ(x + δt), y)), x + δt ∈ Ω(x), (3)

where λ > 0 is the step size, Π is the projection operator, and ∇δ indicates the gradient of L with respect to δ. Since the derivative values may be too small to wage attacks, researchers usually normalize ∇δL in the direction of an ℓp norm, typically p = 1, 2, or ∞. Note that we normalize the gradients in the direction of the ℓ∞ norm for the PGD attack in this study.

In addition, several related algorithms are also investigated to test the sensitivity of the model to perturbations in the gradient direction, including the Fast Gradient Sign Method (FGSM) [55], the Jacobian-based Saliency Map Attack (JSMA) [3], Grosse [33], and Gradient Descent with Kernel Density Estimation (GDKDE) [56].

2) Gradient-free attacks: Under this scenario, attackers are permitted to access a surrogate dataset and launch grey-box attacks. The Mimicry attack is an example in which adversaries add or remove features in malware to mimic the benign data [51], and we select this attack to evaluate the detector's ability to distinguish similar samples. The Salt and Pepper noise and Pointwise attacks [10], [57], which inject noises to disrupt the detector, are also investigated in this study.

3) Obfuscation-based attacks: These attacks involve malware authors exploiting obfuscation technology to camouflage malicious functionality [51], and the evaluation in this attack scenario reflects the robustness of the detector in the absence of partial features. Note that they do not require knowledge of the victim classifier and perform attacks directly on APKs. Usually, these attacks leverage certain techniques to generate malware variants, such as encryption, renaming, resource modification, and Java reflection.

4) Ensemble-based attacks: The ensemble-based attack enables attackers to compromise a specific classifier through multiple attack methods with multiple manipulations. Moreover, recent studies have shown its high attack effectiveness against an unknown classifier [10], [58]. For this reason, it is necessary to investigate whether the proposed method can simultaneously thwart an ensemble of attacks.

5) Transferability-based attacks: This attack method belongs to the grey-box attacks, in which attackers first attack a surrogate model and then compromise the target classifier based on the transferability property. This means that adversarial examples generated for a particular model can be utilized to evade detection by other models [59]. Hence, these attacks serve as a means to assess the robustness of MTDroid in scenarios where adversaries possess the ability to create adversarial examples with high transferability.

6) Query-based attacks: In this attack scenario, adversaries know of the existence of defenses and are permitted to adapt their perturbations based on query history [60]. In each iteration, random features from benign samples are perturbed into the malware feature vector, and this attack strategy also removes a proportion of features to make the query conform to the distribution of legitimate queries and the training data. Given the persistence and strength of this attack, it is imperative to assess our method's ability to respond to it effectively.

IV. FRAMEWORK OVERVIEW

In this section, we first propose the framework and its design details. Then, we describe the training process of the basic models in the pool and integrate the concept of N-variant into adversarial training to enhance the effectiveness against multiple attacks. Further, we design an algorithm to select optimal models from the model pool, resulting in an ensemble classifier with excellent generalization and robustness. Finally, we describe the approach to establishing and updating the ever-changing model pool.

A. Framework Design

To overcome the drawbacks of traditional static defense, we enable detection with comprehensive MTD properties, which can be summarized as dynamicity, diversity, and heterogeneity, to defend against evolving malware and its adversarial variants. The system architecture is shown in Fig. 1, which includes the following details.
Fig. 1. An overview of the proposed system architecture in MTDroid. In the training phase, multiple heterogeneous models are first trained based on distinct
training sets, and each model is hardened by adversarial training with different perturbations to create N-variants. An ensemble-based defense is then built
based on elaborately selected sub-classifiers, to discriminate between malware and benign applications in an adversarial environment. In the test phase, the
label of an APK is determined according to the result of majority voting by the selected models. After a series of queries or too many failures, the proposed
framework will refresh and retrain all basic models in the background.
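As a minimal sketch of the test phase summarized in Fig. 1, where the label of an APK is determined by majority voting over the selected models, the toy code below illustrates the voting rule. The three sub-classifiers are hypothetical stand-ins for the selected heterogeneous models.

```python
# Minimal sketch of the test phase in Fig. 1: the label of an APK is the
# majority vote of the selected sub-classifiers. The three toy classifiers
# below are hypothetical stand-ins for the selected heterogeneous models.

def majority_vote(models, x):
    """Return 1 (malware) if most selected models flag x, else 0 (benign)."""
    votes = [m(x) for m in models]
    return 1 if sum(votes) > len(votes) / 2 else 0

# Hypothetical sub-classifiers (e.g., SVM-, MLP-, and CNN-based variants).
svm_like = lambda x: 1 if x[0] == 1 else 0
mlp_like = lambda x: 1 if x[1] == 1 else 0
cnn_like = lambda x: 1 if x[0] == 1 or x[1] == 1 else 0

selected = [svm_like, mlp_like, cnn_like]
print(majority_vote(selected, [1, 0, 0]))  # 2 of 3 vote malware -> 1
print(majority_vote(selected, [0, 0, 1]))  # 0 of 3 vote malware -> 0
```

Voting over heterogeneous models is what blunts transferability: an adversarial example that flips one sub-classifier's vote still has to flip a majority of the others.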
Dynamic model pool. Instead of statically building a deterministic model for malware detection, our framework exploits the dynamicity of the MTD concept and maintains a 'moving' model pool that can be updated and optimized over time. In particular, the model pool contains multiple heterogeneous models, such as SVM, MLP, and Convolutional Neural Networks (CNN), to increase the attacker's uncertainty. We also assign distinct training sets to each model so that evasion attacks are less likely to be transferable across models. To decrease the exposure time of the detector and increase the difficulty of attacks, this pool is continuously 'moving' via an update strategy, which will be elaborated in Section IV-E.

N-variant detection models. The N-variant approach is a traditional MTD method that was first proposed to provide fault tolerance in software; it creates functionally equivalent variants and selects a random variant at runtime. Building upon the aforementioned research, we incorporate the concept of N-variant for robust malware detection. In our study, we generate diversified model variants by introducing various perturbations to the original model, and employ specific rules to select one or multiple variants during test time. By leveraging the diversity principle of MTD, these N-variant models force adversaries to confront different targets at different times, rendering their knowledge of previous classifiers ineffective and thus enhancing resilience against a wide range of evasion attacks. The adversarial training of N-variant models is demonstrated in Section IV-C.

Optimized ensemble classification. While the ensemble classifier has shown promise in countering adversarial examples in the image domain, an ensemble of homogeneous models may not effectively mitigate attacks with strong transferability. To address this limitation, our study proposes a novel ensemble approach that combines heterogeneous models. By leveraging the concept of heterogeneity from MTD, this approach aims to reduce the transferability of adversarial malware and increase the complexity of evasion attacks. Unlike existing research on ensemble algorithms, we aim to optimize the selection of sub-classifiers from the model pool to further enhance the robustness of malware detection. In detail, the proposed framework automatically ranks models and selects the optimal ones based on the selection algorithm, which is fully described in Section IV-D. Then, an ensemble classifier is built based on the selected heterogeneous models, and it makes predictions with the majority voting rule.

B. Basic ML Model Training

According to previous work [3], [10], [51] on Android malware detection, Drebin performs a set of static analyses on Android applications and extracts features from the Android manifest and classes.dex files. More specifically, all features can be represented as strings and organized in 8 different feature sets, including 4 subsets extracted from the manifest (e.g., hardware components) and 4 subsets extracted from the disassembled dexcode (e.g., API calls), as listed in Table I. Therefore, an APK can be mapped into the feature space as a binary feature vector, in which each dimension takes the value 0 or 1, indicating the absence or presence of the corresponding feature.

Due to its efficiency in feature extraction, we also use Drebin features in our study. Besides, we associate these features with corresponding labels to build the dataset. Here, the label 0 or 1 represents the benign or malware class, respectively. The training process begins by distinctly splitting the training dataset into several subsets. Each subset is used to train one basic model, such as SVM or MLP, in parallel. Once all the basic models have been optimized, we establish the initial model pool, which serves as the foundation for subsequent steps in the detection process.

C. Diversified Adversarial Training

Drawing inspiration from the success of N-variant approaches in the software field, we leverage the diversity principle of MTD to introduce diversified adversarial training
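As a hedged sketch of the idea behind diversified adversarial training (each model variant is hardened with adversarial examples generated under a different perturbation type, while a portion of clean examples is retained), the code below builds one augmented training set per perturbation type. The perturbation functions and retention ratio are illustrative assumptions, not the paper's exact procedure.

```python
import random

# Hedged sketch of diversified adversarial training: each variant of a basic
# model is hardened with adversarial examples generated under a DIFFERENT
# perturbation type, while a portion of clean examples is retained. The
# perturbation functions and the retention ratio are illustrative assumptions.

def add_features(x, k=2):
    """Perturbation type A: flip up to k zero entries to one (feature addition)."""
    x = list(x)
    zeros = [i for i, v in enumerate(x) if v == 0]
    for i in random.sample(zeros, min(k, len(zeros))):
        x[i] = 1
    return x

def flip_random(x, k=1):
    """Perturbation type B: flip k random entries (addition or removal)."""
    x = list(x)
    for i in random.sample(range(len(x)), k):
        x[i] = 1 - x[i]
    return x

def diversified_training_sets(malware, benign, perturbations, clean_ratio=0.5):
    """Build one augmented training set per perturbation type (one per variant)."""
    sets = []
    for perturb in perturbations:
        clean = list(malware + benign)
        adv = [(perturb(x), 1) for x, _ in malware]   # perturbed malware keeps label 1
        keep = int(len(clean) * clean_ratio)          # retain a portion of clean examples
        sets.append(clean[:keep] + adv)
    return sets

malware = [([1, 1, 0, 0], 1)]
benign = [([0, 0, 1, 1], 0)]
variants = diversified_training_sets(malware, benign, [add_features, flip_random])
print(len(variants))  # one augmented set per perturbation type -> 2
```

Training each variant on a differently perturbed set is what reduces the chance that a single adversarial example transfers to every variant at once.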
Algorithm 2: Two-Stage Optimal Model Selection Algorithm 3: Hybrid Pool Update Strategy
Input: Set of variant models V ={ν(1,1),...ν(1,n),...ν(m,n)}, Input: Number of failures pt with its threshold ρ, and
objective function εi , evaluation function ξ, number of queries qt with its threshold σ;
attack validation set V, selection number η, and Output: Binary update strategy πt at time step t;
p q
perturbation type analysis function α; 1 Initialize the algorithm with counters C0 = C0 = 0;
Output: Set of selected models Vs ; 2 for time step t = 1 to T do
1 Initialize the selection algorithm with Vs = ∅; 3 Update counters: Ctp = Ct−1
p
+pt , Ctq = Ct−1
q
+qt ;
p
2 for basic model i = 1 to m do 4 if Ct ≥ ρ then
3 for variant model j = 1 to n do 5 Trigger the update: πt ← 1;
4 Set ν(i,∗) = arg min εi (ν(i,j) ); 6 Reset both counters: Ctp = Ctq = 0;
5 Compute the evaluation function ξ(ν(i,∗) , V); 7 else if Ctp < ρ and Ctq ≥ σ then
6 Sort ν(i,∗) in descending order based on ξ; 8 Trigger the update: πt ← 1;
7 for selected model k = 1 to η do 9 Reset the counter for queries: Ctq = 0;
8 Insert ν(i,∗) into the selected model
S set Vs ; 10 else
9 Set Vs (k) = arg min(η − α(Vs ν(i,∗) )); 11 Maintain the current model pool πt ← 0;
10 Return the selected optimal model set Vs 12 Return the model pool update strategy πt
such function. This step can be considered as the first stage, which selects the optimal variant for a specific model. We repeat this process m times to confirm all candidates for the second stage. Then, the algorithm evaluates the different variant models based on function ξ and sorts them in descending order according to the evaluation results. In our case, the evaluation function is given as the F1 score of the target variant model, and the attack validation set V is formed by introducing various evasion attacks into the original set. More particularly, we analyze the types of the variant models and tend to include more distinct types in the final decision. Finally, the algorithm returns the selected models for generating the ensemble classifier under majority voting rules.

E. Model Pool Update Strategy

Due to the inherent advantages attackers possess in probing target systems, they can persistently query the target and develop attack strategies to get past its defenses over a long time window. For this reason, we adopt the dynamicity principle of MTD and implement a dynamic update mechanism for the malware detector. By regularly updating the detector, we effectively shift the attack surface and invalidate the knowledge adversaries may have acquired in previous attempts. This proactive approach enhances the resilience of the system against evolving attack techniques. Note that the update of the model pool involves a large-scale shuffle of training sets, basic models, variants, and optimization processes.

The hybrid update strategy, which integrates query and failure statistics to trigger the update engine in a coordinated manner, is described in Algorithm 3. Specifically, inputting an Android app into the detector counts as a single query, and a misclassification error by a sub-classifier is considered a failure. We first set two counters to record the number of detection queries and the number of failures among sub-classifiers of the ensemble classifier, respectively. Then, the algorithm triggers an update when the counter for failures exceeds its threshold; otherwise, it shuffles the models after a certain number of detection queries. Finally, if neither of the thresholds is reached, the current model pool and the generated ensemble classifier remain unchanged for upcoming detection queries.

Moreover, we employ two strategies to reset the counters in the process of decision-making. A high number of failures may indicate that the attacker is continuously exploiting the system and evaluating the adversarial examples they have created. In this case, the algorithm triggers the update engine while resetting both counters for future analysis. In the other case, we also rebuild the model pool to prevent the attack surface from being exposed for too long. However, the failure statistics during periodic updates cannot completely rule out the possibility of evasion attacks. Thus, we only reset the time window and continue to count failures over time.

After receiving the trigger signal, the update engine executes the update actions in the following four steps. (i) The update engine initiates the process of re-segmenting the training dataset. It combines the existing data with new incoming samples and shuffles them to create new training subsets. (ii) Subsequently, the basic models are reconstructed based on these new subsets using the corresponding ML algorithms. (iii) Additionally, variant adversarial models are created for each basic classifier following Algorithm 1. (iv) To ensure optimal performance against current threats, new candidate models are reordered and selected from all variant models using Algorithm 2. The optimal ensemble classifier for Android malware detection is finally constructed based on these selected models. After completing these actions in the background, the update engine proceeds to the next round, waiting for a new update signal generated by Algorithm 3. It is important to note that the currently running detector remains unchanged until all the aforementioned update processes are completed.

V. EXPERIMENTS AND EVALUATION

In this section, we first introduce the configurations used in the experiments. Then, we evaluate the performance of MTDroid under different attack scenarios on three popular public datasets. Finally, the effectiveness of MTDroid is compared with state-of-the-art work in terms of standard metrics.
[Fig. 2: grouped bar charts of accuracy (%) for a single model versus the model pool under PGD, Pointwise, Combined, Ensemble, Transfer, and Query attacks on each dataset.]
Fig. 2. The comparison between single classifier and model pool against typical attacks on Drebin, CICMalDroid 2020, and AndroZoo datasets.
[Fig. 3: three panels of CA-RA trade-off curves, with clean accuracy (%) on the x-axis and robust accuracy (%) on the y-axis.]
Fig. 3. CA-RA trade-offs exhibited by LN-AT, PGD-AT, and D-AT on Drebin, CICMalDroid 2020, and AndroZoo datasets. The number of variant models varies from 10 to 1 for the points from top-left to bottom-right on each curve. Robust accuracy is evaluated using ensemble-based attacks.
Pepper attack by increasing the noise intensity by 0.001 each time until misclassification, repeating this process 10 times. Besides, we mimic the one of ten benign samples that requires the smallest perturbations to launch the Mimicry attack, and the Pointwise attack utilizes Mimicry as the initial input.

For the obfuscation-based (OB) attacks, we utilize an obfuscator called AVPASS [52] and launch five attacks, including Java Reflection, String Encryption, Variable Renaming, Resource Modification, and the above four techniques combined.

For the ensemble-based attacks, PGD and Pointwise are selected as representative GB and GF attacks, while Combined Obfuscation is used to implement the OB attack. Then, we can wage the GB+GF, GB+OB, GF+OB, and GB+GF+OB attacks through different combinations of the aforementioned attacks.

For the transferability-based attacks, we perturb a malware sample based on the other models (surrogate models) and transfer it to the target model. We generate adversarial examples in this way with four attacks: GB, GF, OB, and GB+GF+OB.

For the query-based attacks, we run the adaptive query algorithm (see Ref. [60]) with 10% of features permitted to be removed per iteration and a maximum of 500 queries.

7) Metrics: The effectiveness of the detector on clean data is measured by five standard metrics: Accuracy (Acc), F1 score, False-Positive Rate (FPR), False-Negative Rate (FNR), and Mean-Time-To-Detect (MTTD). We also evaluate the overhead in terms of training time, power, and GPU usage. To evaluate the robustness of the detector against evasion attacks, we employ the robust accuracy (RA) metric, which measures the proportion of correctly identified samples among all adversarial samples. Additionally, we utilize the evasion rate as an evaluation criterion, where a lower evasion rate indicates better robustness, allowing us to assess the detector's performance under continuous attack queries.

B. Experimental Results

1) The Impact of Dynamic Model Pool: We evaluated the accuracy against typical evasion attacks in two different scenarios, one using a single static model without adversarial training and the other employing the proposed dynamic model pool. The average accuracy rates of the different scenarios are presented in Fig. 2. Note that we repeated the experiment on every basic model of the model pool and computed its mean value as the result. Moreover, we select the GB+GF+OB attack as the ensemble-based attack, and the transfer attack is also conducted with this attack based on the other models in the pool.

As we can see in Fig. 2, the dynamic model pool has a huge advantage over a single model in terms of accuracy. Especially when dealing with the query-based attack on the CICMalDroid 2020 dataset, there is a significant increase in accuracy (from 3.08% to 96.47%) after switching from the single model to the model pool. The underlying reason may be that an attacker with knowledge of the basic models can bypass static detection, whereas the dynamic model pool can leverage multiple models and continuous refreshes to strengthen the defense.

2) The Impact of N-Variant Models: We evaluate the effect of diversified adversarial training on the performance against the GB+GF+OB attack and present the trade-off between robustness and accuracy in Fig. 3. The x-axis represents the clean accuracy (CA) on the test set, while the y-axis represents the robust accuracy (RA), that is, the accuracy on adversarial malware samples. We compare the diversified adversarial training (D-AT) method with LN-AT [48] and PGD-AT [54], where LN-AT generates different models with Laplace noise under the scale λ = 1 and PGD-AT retrains models by randomly varying the weight value from 0 to 1. For each curve, we decrease the number of variant models from 10 to 1 from left to right.
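The robustness metrics described above can be computed directly from prediction outcomes. A minimal sketch follows; the function names are ours, and the label convention (1 = malware, 0 = benign) is an assumption:

```python
def robust_accuracy(preds, labels):
    """RA: percentage of adversarial samples still classified correctly."""
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return 100.0 * correct / len(labels)

def evasion_rate(preds, n_adv_malware):
    """Percentage of adversarial malware samples (true label 1) that are
    misclassified as benign (0). Lower is better; on an all-malware
    adversarial set it is the complement of RA."""
    evaded = sum(int(p == 0) for p in preds)
    return 100.0 * evaded / n_adv_malware
```

For example, if a detector labels three of four adversarial malware samples as malicious, RA is 75% and the evasion rate is 25%.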
[Fig. 4: grouped bar charts of accuracy (%) for the Random, Uniform, and Optimized selection strategies under PGD, Pointwise, Combined, Ensemble, Transfer, and Query attacks.]
Fig. 4. The comparison of different selection strategies for building the ensemble classifier against typical evasion attacks.

As the number of variant models decreases, low robustness is expected, because a few variants are not necessarily resistant to evasion attacks, as illustrated in Fig. 3. PGD-AT achieves a close CA-RA trade-off compared with LN-AT across all the cases. For example, on the Drebin dataset, the most robust model achieved by PGD-AT (top-left corner point of the yellow curve) has similar RA and CA compared with that achieved by LN-AT (top-left corner point of the red curve). Similar observations are also drawn from the other two datasets.

Another observation is that the proposed D-AT method achieves the best CA-RA trade-off, showing the necessity of retraining diversified variant models. For example, on the CICMalDroid 2020 dataset, D-AT can be smoothly adjusted from the most robust state, with 98.95% CA and 99.73% RA, to the most accurate state, with 99.45% CA and 92.67% RA, by varying the number of variant models. Because the D-AT method retains part of the clean data during retraining, its CA value does not significantly decrease as the number of variant models increases. Thus, in the other experiments in this paper, we adopt the most robust state of D-AT, which retrains 10 variant models for each basic model in the model pool.

3) The Impact of Optimized Ensemble: To evaluate the impact of optimized ensemble learning, Fig. 4 compares the average accuracy under different strategies, including the random strategy, the uniform strategy, and the optimized strategy (see Algorithm 2). In detail, the random strategy randomly selects candidates to create the ensemble classifier, whereas the uniform strategy chooses models of the same type each time.

Regarding Fig. 4, note first how the probability of detecting adversarial malware examples increases from the random strategy to the uniform strategy (except for the effectiveness against combined attacks), and from the uniform strategy to the optimized strategy. The underlying reason may be that the random strategy possibly picks variants from the same basic model, whereas both the uniform and optimized strategies can guarantee the heterogeneity of different models, and the optimized strategy further improves the performance via two-stage selection.

An interesting observation is that the performance of the different strategies under the query-based attack is similar, because the key to defeating this attack lies in timely model updates. Nevertheless, the optimized ensemble classifier still achieves the highest effectiveness. Another observation is that there is a slight decrease in the effectiveness against ensemble attacks and transfer attacks. This can be attributed to the intrinsic vulnerability of the models when a powerful attacker gains knowledge of them; however, the maintenance of good performance confirms the importance of MTD-based detection, which is more difficult for an attacker to bypass.

4) The Impact of Query/Failure Threshold: In this section, we evaluate the effect of the proposed thresholds in the MTDroid framework on the performance against query-based attacks. Fig. 5 shows the robust accuracy of MTDroid against query-based attacks with different thresholds on the different datasets. In these figures, the x-axis presents the query threshold σ, ranging from 200 to 1000, and the y-axis denotes the failure threshold ρ, which ranges from 40 to 200.

We observe that when the query and failure thresholds increase, there is a significant decrease in the effectiveness against query-based attacks on all datasets. The reason may be that higher thresholds leave a longer attack window in which to conduct enough attack queries. Another observation is that the decrease in accuracy caused by an increase in the failure threshold is smaller than that caused by the query threshold. The reason may be that the selected optimal sub-classifiers decrease the possibility of misclassification errors, thus making this method insensitive to the failure threshold.

However, a lower threshold results in frequent model retraining, which inevitably incurs significant overhead. According to the accuracy values shown in Fig. 5, we have found that the proposed method can identify ≥ 90% of the adversarial samples generated by query-based attacks with a query threshold of no more than 400. Thus, we set the query threshold σ to 400 and the failure threshold ρ to 200 in the other experiments, which achieves a good balance between effectiveness and cost.

5) Comparison with Previous Works: We compare our approach with the following state-of-the-art (SOTA) methods.
• SVM [27]. It employs an SVM model with the ℓ2 penalty and hinge loss function for malware detection, which introduces no countermeasures against evasion attacks.
• Sec-SVM [51]. It enhances the security of the SVM model with more evenly-distributed feature weights. Sec-SVM uses the same penalty and loss as SVM, and it additionally iterates 100 times with step size 0.5 for parameter optimization.
• Transcend [63]. It employs a LinearSVC as the base classifier with 100 iterations and step size 0.5. Transcend also integrates a non-conformity scoring function based on p-values with threshold 0.05 to improve predictive reliability.
• DroidEvolver [64]. It integrates multiple models with equal weights into a dynamic online learning pool, including Passive Aggressive (PA), Online Gradient Descent (OGD), Adaptive Regularization of Weight Vectors (AROW), Regularized Dual Averaging (RDA), and Adaptive Forward-Backward Splitting (Ada-FOBOS).
• AT-Adam [65]. It uses an MLP model that has two fully-connected hidden layers (each having 160 neurons) and the ReLU activation function. Then, it exploits adversarial training with the PGD attack, which iterates 50 times with step size 0.02 and is optimized by the Adam optimizer.
• dADE-MA [10]. It uses the same parameters as AT-Adam for the basic MLP model. Further, it introduces adversarial training with the max attack, including ℓ1 norm PGD with
step size 1.0 and iterations 50, ℓ2 norm PGD with step size 1.0 and iterations 100, ℓ∞ norm PGD with step size 0.01 and iterations 100, and the Adam optimizer with step size 0.02 and iterations 100. In addition, the balance factor is set to 1.0 for Drebin and CICMalDroid 2020, and 0.1 for AndroZoo.
• Morphence [48]. It deploys a base model that has the same parameters as the MLP model in AT-Adam, and builds a pool of 10 student models generated from the base model, in which 5 models are adversarially trained by the FGSM attack with step size 0.01 and iterations 100. Additionally, it automatically replaces the pool after 1000 queries.
• MalProtect [60]. It exploits a stateful solution via several threat indicators to detect perturbations. MalProtect utilizes a neural network (NN) with 4 fully-connected layers (having 128, 64, 32, and 2 neurons, respectively) as the decision model, and a veto voting-based ensemble classifier (including DT, NN, SVM, and Random Forest) as the prediction model.

Fig. 5. The effect of query and failure thresholds on performance against query-based attacks on Drebin, CICMalDroid 2020, and AndroZoo datasets.

We consider two cases: (i) evaluating the performance and overhead of the given defenses on clean samples, and (ii) analyzing the impact of various attacks on them.

Results in the Absence of Attack. Table II presents the results of the different methods under the scenario in which there are no evasion attacks. We observe that, when compared with the SVM model, most of the other defenses achieve a lower FPR but a higher FNR on all datasets. The reason may be that adversarial training introduces adversarial examples into the training set and enlarges the search space of malware samples, resulting in a decrease in FPR and an increase in FNR.

Nevertheless, our proposed defense does not significantly decrease the performance in the absence of attack, achieving F1 scores of 99.38%, 97.98%, and 99.56% on the three datasets, respectively. Note that MTDroid achieves higher accuracy (at most a 2.97% increase on Drebin, 7.69% on CICMalDroid 2020, and 1.78% on AndroZoo) when compared with AT-Adam and dADE-MA. The underlying reason may be that the diversified adversarial training reconciles both clean data and perturbed examples at the same time. Besides, our method involves a clear reduction in the MTTD when compared with other adversarial training methods. For example, MTDroid achieves an F1 value of 99.38% with an MTTD of 123.98 ms, whereas MalProtect produces an MTTD of more than 1600 ms with a lower F1 value of 94.11% on the Drebin dataset.

As shown in Table II, SVM, Sec-SVM, and Transcend exhibit less training time (< 1.5h) and lower power consumption (< 200W) with no GPU usage across all datasets. However, these methods cannot guarantee robust accuracy (see Table III). The MTD-based method exhibits training times of 1h35m, 46m56s, and 2h23m on the three datasets, respectively. It outperforms DroidEvolver, AT-Adam, dADE-MA, and MalProtect on training time, and partially demonstrates advantages in terms of power consumption and GPU usage. In particular, MTDroid achieves a 66.07% decrease in GPU memory usage when compared with AT-Adam on the CICMalDroid 2020 dataset, and exhibits a significant decrease of more than one week in training time when compared with dADE-MA on the AndroZoo dataset.

Results in the Presence of Attack. Table III reports the robust accuracy of the given methods against GB, GF, and OB attacks. Here, MTDroid achieves an accuracy of ≥ 95.44% under 13 attacks, ≥ 98.33% under 14 attacks, and ≥ 96.00% under 12 attacks, respectively. The results indicate that our method significantly improves the robustness and outperforms the state-of-the-art methods in most real-world scenarios.

Specifically, as SVM does not introduce malicious perturbations into the training process, it exhibits lower accuracy against all evasion attacks. Although Sec-SVM improves the security against GF and OB attacks, it exhibits poor performance, very close to that of the SVM model, in the case of GB attacks. It can also be seen that neither of the SVM-based models can distinguish GDKDE and PGD attacks on the CICMalDroid and AndroZoo datasets. Moreover, Transcend enhances the robustness via a scoring function, while DroidEvolver employs a dynamic online learning pool. However, they can hardly defeat GB attacks on all datasets and only mitigate OB attacks on the CICMalDroid 2020 dataset.

The incorporation of adversarial training enhances the robustness of AT-Adam and dADE-MA against certain types of attacks. Especially, AT-Adam achieves an accuracy of 99.89% against the resource modification attack on the CICMalDroid 2020 dataset, and dADE-MA outperforms all other defenses against the FGSM and Mimicry attacks on the AndroZoo dataset. Nevertheless, neither AT-Adam nor dADE-MA can defend against all kinds of attacks effectively.

An interesting observation is that Morphence achieves an accuracy of 100% against GDKDE and PGD attacks across all datasets. However, it exhibits poor performance when defeating GF attacks, such as Mimicry and Pointwise (at most 26.78% and 53.22% on the Drebin dataset). The reason for this observation may be that Morphence adopts PGD to create a set of adversarial training data, achieving good defense against
TABLE II
PERFORMANCE AND OVERHEAD OF THE METHODS WHEN THERE ARE NO EVASION ATTACKS.
Dataset  Method  |  Performance: Acc(%)  F1(%)  FNR(%)  FPR(%)  MTTD(ms)  |  Overhead: Training Time  Power(W)  GPU Memory(MB)
SVM 98.98 99.43 0.18 8.16 5.26 9m18s 130.37 N/A
Sec-SVM 98.64 99.24 0.63 7.54 1.64 19m43s 180.44 N/A
Transcend 96.87 86.78 3.02 3.14 3.93 2m52s 124.35 N/A
DroidEvolver 99.01 99.01 5.86 0.41 77.22 3h59m 139.52 N/A
Drebin AT-Adam 97.71 98.71 2.09 3.99 296.90 8h45m 138.19 1312
dADE-MA 95.91 97.66 4.25 2.75 531.60 45h12m 297.26 2848
Morphence 98.54 92.85 10.06 0.45 355.67 40m24s 321.55 13764
MalProtect 98.81 94.11 9.94 0.16 1677.46 3h39m 155.87 8958
MTDroid 98.88 99.38 0.37 7.32 123.98 1h35m 283.75 7556
SVM 99.04 97.98 2.02 0.63 0.49 1m20s 129.57 N/A
Sec-SVM 99.01 97.90 2.53 0.51 0.21 8m29s 193.02 N/A
Transcend 96.38 97.68 0.27 14.41 7.48 1m45s 114.06 N/A
DroidEvolver 99.04 99.04 0.55 2.28 273.54 1h19m 135.87 N/A
CICMalDroid 2020 AT-Adam 95.12 89.87 8.60 3.73 44.17 2h50m 225.43 2848
dADE-MA 91.26 82.09 12.77 7.84 93.51 25h41m 353.12 17184
Morphence 98.32 98.88 1.05 3.58 242.55 42m11s 318.87 14140
MalProtect 98.83 99.24 0.51 3.29 578.96 2h46m 154.33 8958
MTDroid 98.95 97.98 3.20 0.28 22.86 46m56s 322.08 5830
SVM 99.33 99.59 0.47 1.58 3.27 49m11s 130.13 N/A
Sec-SVM 98.53 99.11 0.92 4.05 0.89 1h22m 174.46 N/A
Transcend 97.50 93.30 0.61 2.90 2.40 25m51s 129.20 N/A
DroidEvolver 98.88 98.89 1.79 0.98 77.88 10h40m 142.93 N/A
AndroZoo AT-Adam 98.08 98.83 1.33 4.69 185.19 31h40m 171.19 4485
dADE-MA 98.76 99.24 0.98 2.47 346.20 191h4m 314.96 8992
Morphence 99.37 98.21 1.46 0.45 234.08 1h26m 319.08 14144
MalProtect 99.45 98.46 2.41 0.14 853.01 3h42m 155.03 8958
MTDroid 99.28 99.56 0.44 2.07 73.84 2h23m 328.11 12678
GB attacks but being unable to resist other attacks. Similarly, MalProtect achieves good robust accuracy against GB attacks, but it shows poor performance on other types of attacks due to a lack of samples collected from certain evasion attacks.

Thanks to the diversity brought by the variant models, these attacks have minimal impact on the performance of MTDroid, which maintains an accuracy rate of no less than 92% across all attacks in Table III. While our method may exhibit slightly lower effectiveness against a few attacks, we can confidently conclude that it demonstrates a superior and reliable capability compared to alternative approaches when considering the overall performance across various attack scenarios.

To provide some additional insights, we report the accuracy of the different methods against ensemble and transfer attacks. As demonstrated in Fig. 6, SVM, Sec-SVM, Transcend, and DroidEvolver show poor performance (100% misclassification in some cases) against ensemble attacks that involve the GB attack. Since these methods are not effective against PGD (which is selected to wage the GB attack in this scenario), they also fail to cope with PGD-based ensemble attacks. MTDroid outperforms the other given defenses in most scenarios, exhibiting accuracies of ≥ 88.78%, ≥ 97.78%, and ≥ 95.89% against ensemble attacks on the three datasets. The underlying reason may be that the diversified perturbations among heterogeneous models enable it to mitigate multiple evasion attacks.

Fig. 7 depicts the accuracy of the different classifiers against transferability-based attacks. As we can see, the performance in defending against the OB attack is the same as that in Table III, because this attack is performed on APK files and without transferability among different classifiers. In more detail, we observe that there is a decrease in the effectiveness of most alternative methods when defending against ensemble-based (GB+GF+OB) transfer attacks. However, our approach can detect these attacks effectively, achieving accuracies of 86.89%, 93.67%, and 91.11% on the three datasets, respectively. The results disclose the transferability of evasion attacks among different classifiers, and although the robustness is slightly reduced, the proposed method still exhibits the best or near-best performance against these attacks.

In addition, Fig. 8 shows the evasion rate of query-based attacks against the given methods versus the maximum number of queries (from 0 to 500). These results clearly demonstrate that the given methods, except for MalProtect and MTDroid, have almost no resistance against this attack, with a high evasion rate of ≥ 98% in some cases. The reason for this is that SVM, Sec-SVM, AT-Adam, and dADE-MA all belong to static defenses, and both Transcend and DroidEvolver are not robust enough against continuous and adaptive adversarial queries. In addition, the differences between the student models in Morphence are insufficient, resulting in poor effectiveness in defeating query-based attacks. Although MalProtect has a certain level of defense, it still allows more than half of the attacks to evade detection, especially when the number of queries is high. However, our method can effectively mitigate query-based attacks, with evasion rates of ≈ 10%, ≈ 4%, and ≈ 8% on the three datasets, respectively. This emphasizes the significant advantages of MTDroid in effectively mitigating strong and adaptive evasion attacks. These benefits
TABLE III
ROBUST ACCURACY OF THE GIVEN METHODS WHEN DEFENDING AGAINST GB, GF, AND OB ATTACKS.
Accuracy (%); columns: SVM, Sec-SVM, Transcend, DroidEvolver, AT-Adam, dADE-MA, Morphence, MalProtect, MTDroid.

Drebin, GB Attacks:
FGSM 21.78 23.11 32.11 8.00 93.33 98.22 87.44 88.33 99.89
JSMA 51.89 54.33 0.00 0.00 70.44 93.00 60.44 97.78 98.11
BCA 17.56 52.00 0.00 0.00 68.11 92.22 93.44 100.0 99.67
Grosse 17.78 51.89 0.00 0.00 67.67 92.11 29.00 96.89 97.44
GDKDE 0.00 0.00 23.56 0.00 78.33 88.00 100.0 100.0 100.0
PGD 1.33 14.44 25.33 0.00 76.00 93.67 100.0 100.0 100.0
Drebin, GF Attacks:
Salt & Pepper 66.89 95.00 96.22 95.22 94.56 97.22 91.56 92.78 99.33
Mimicry 47.33 92.44 19.11 20.11 86.56 91.33 26.78 33.78 95.44
Pointwise 37.33 54.44 32.44 32.78 72.00 93.44 53.22 12.56 98.22
Drebin, OB Attacks:
Reflection 26.33 89.00 90.67 91.56 92.56 96.11 84.67 30.44 98.33
Encryption 29.22 86.33 82.67 83.78 91.11 94.33 81.56 7.56 98.56
Renaming 31.11 73.33 69.00 68.22 92.22 94.22 66.22 26.78 98.00
Resource 32.22 62.78 61.33 61.67 96.44 93.89 46.67 28.67 98.33
Combined 25.44 48.44 31.78 33.44 85.33 86.33 48.00 74.22 92.00

CICMalDroid 2020, GB Attacks:
FGSM 0.22 0.22 0.22 0.22 94.56 98.67 83.67 99.67 99.89
JSMA 0.11 8.44 0.11 0.11 63.56 86.56 22.22 99.11 99.44
BCA 0.00 8.22 0.00 0.00 64.22 93.13 84.44 97.89 98.33
Grosse 0.00 8.33 0.00 0.00 64.33 83.56 29.89 96.89 98.67
GDKDE 0.00 0.00 0.00 0.00 27.78 70.67 100.0 100.0 100.0
PGD 0.00 0.00 0.00 0.00 32.56 76.44 100.0 100.0 100.0
CICMalDroid 2020, GF Attacks:
Salt & Pepper 1.78 97.78 98.56 93.56 90.56 97.78 91.89 22.53 99.33
Mimicry 0.00 87.44 0.00 0.00 79.67 98.33 17.11 0.00 99.78
Pointwise 0.00 80.67 73.67 83.56 33.67 98.78 51.33 6.22 99.33
CICMalDroid 2020, OB Attacks:
Reflection 86.11 92.22 100.0 100.0 81.44 97.67 81.67 60.78 100.0
Encryption 52.00 69.67 99.67 99.67 81.56 97.67 89.67 22.44 99.67
Renaming 60.56 75.22 99.67 99.67 81.56 97.67 69.78 15.44 99.67
Resource 47.33 66.44 99.22 99.22 99.89 99.00 45.22 34.89 99.67
Combined 92.00 94.22 99.33 99.33 95.00 97.89 46.78 80.89 99.67

AndroZoo, GB Attacks:
FGSM 1.67 1.33 29.58 70.01 95.78 99.89 85.44 98.22 98.89
JSMA 3.89 6.67 11.56 2.44 71.33 96.89 25.33 97.22 98.00
BCA 0.11 0.22 19.89 2.56 60.00 89.78 89.11 92.67 94.67
Grosse 0.67 2.67 29.44 10.78 59.33 88.78 30.89 93.11 94.67
GDKDE 0.00 0.00 24.89 70.67 52.67 92.11 100.0 100.0 96.00
PGD 0.00 0.00 0.00 0.00 80.00 97.56 100.0 100.0 100.0
AndroZoo, GF Attacks:
Salt & Pepper 11.00 66.44 35.89 23.89 82.67 92.78 87.67 96.89 99.44
Mimicry 8.33 33.56 91.67 24.67 82.67 100.0 20.33 0.22 99.00
Pointwise 0.00 1.56 56.78 25.56 68.78 93.56 47.89 62.11 98.11
AndroZoo, OB Attacks:
Reflection 66.22 97.67 45.67 14.78 76.33 94.67 82.89 2.11 99.89
Encryption 51.89 97.44 33.44 16.78 68.22 92.33 81.22 2.11 99.78
Renaming 53.56 97.44 65.44 27.78 67.56 88.78 64.00 2.11 99.78
Resource 18.00 56.22 78.33 24.67 65.67 98.00 44.89 6.44 97.78
Combined 16.44 18.33 33.44 27.22 50.33 96.89 46.44 42.78 98.67
are achieved through the seamless integration of essential aspects of MTD technology, such as dynamicity, diversity, and heterogeneity, within the proposed defensive framework.

VI. LIMITATIONS AND OPEN ISSUES

Despite the promising results achieved by MTDroid, it is clear that such an approach exhibits some intrinsic limitations. First, it may be defeated by careful manipulations that can evade all targets. However, this is not a vulnerability of the framework itself, and it can be mitigated with simple countermeasures, such as enlarging the pool size or replacing vulnerable models. For this reason, we have only introduced MTD into the framework, and there are flexible settings for the pool size and models that depend on different attack scenarios.

Another limitation of our approach is that its performance would be affected by the concept drift phenomenon. Fortunately, there exist several ways to handle it, such as model retraining and ensemble retraining, which can be easily integrated into this framework. In addition, we can also incorporate new data for training to facilitate the detection system with the help of dynamic updates.

In the end, although our approach is not bulletproof, we believe that it significantly improves the security of the Android malware detection system and strengthens its resistance against various kinds of evasion attacks with acceptable overhead.

VII. CONCLUSION AND FUTURE WORK

In this paper, we have first exploited a general framework named MTDroid for introducing the concept of MTD, leveraging dynamicity, diversity, and heterogeneity to mitigate the negative impact of evasion attacks in a proactive manner. In detail, we have developed a dynamic model pool with multiple heterogeneous models and generated diversified variant models for each basic model via injecting different perturbations.
[Fig. 6: grouped bar charts of accuracy (%) for SVM, Sec-SVM, Transcend, DroidEvolver, AT-Adam, dADE-MA, Morphence, MalProtect, and MTDroid under the GB+GF, GB+OB, GF+OB, and GB+GF+OB ensemble attacks.]
Fig. 6. The accuracy of different classifiers against ensemble attacks on Drebin, CICMalDroid 2020, and AndroZoo datasets.
[Fig. 7: grouped bar charts of accuracy (%) for the same nine methods under GB (PGD), GF (Pointwise), OB (Combined), and GB+GF+OB transfer attacks.]
Fig. 7. The accuracy of different classifiers against transfer attacks on Drebin, CICMalDroid 2020, and AndroZoo datasets.
[Figure: evasion rate (%) versus maximum number of queries (0–500); one panel per dataset.]
Fig. 8. Evasion rates of query-based attacks against the given methods on different datasets. The maximum number of queries varies from 0 to 500.
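The query-budgeted setting evaluated in Fig. 8 can be illustrated with a small, hypothetical sketch (the linear detector, toy features, and attack loop below are illustrative assumptions, not the implementations evaluated in this paper): a black-box attacker submits up to a fixed number of queries, keeps only additive, score-reducing feature changes, and the evasion rate is the fraction of flagged malware samples that flip the decision within the budget.

```python
import numpy as np

rng = np.random.default_rng(1)

def query_attack(score_fn, x, max_queries=500):
    """Query-budgeted black-box evasion sketch: each query adds one absent
    feature; the change is kept only if the detector's malware score drops.
    Stops when the score crosses the decision threshold (0) or the budget
    runs out. Returns (evaded, queries_used)."""
    x_adv = x.copy()
    best = score_fn(x_adv)
    for q in range(1, max_queries + 1):
        absent = np.flatnonzero(x_adv == 0)
        if absent.size == 0:
            return best <= 0.0, q
        j = rng.choice(absent)
        trial = x_adv.copy()
        trial[j] = 1.0
        s = score_fn(trial)          # one query against the detector
        if s < best:                 # keep additive, score-reducing changes only
            x_adv, best = trial, s
        if best <= 0.0:
            return True, q
    return False, max_queries

# Hypothetical linear detector: positive score means "malware".
w = rng.normal(size=30)
b = -1.0
score = lambda v: float(v @ w + b)

# Toy malware set, filtered to samples the detector initially flags.
mal = (rng.random((40, 30)) < 0.4).astype(float)
mal = mal[[score(m) > 0 for m in mal]]

evaded = [query_attack(score, m)[0] for m in mal]
evasion_rate = 100.0 * float(np.mean(evaded)) if evaded else 0.0
```

Sweeping `max_queries` from 0 to 500 and recording `evasion_rate` at each budget reproduces the shape of curves like those in Fig. 8 for this toy setup.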
We have then considered optimizing the ensemble learning process, and shown that the performance can be significantly improved in the presence of various evasion attacks. According to our experiments, MTDroid achieves the highest accuracy in most scenarios and outperforms the state-of-the-art methods on widely used public datasets. To the best of our knowledge, our work is the first to overcome the challenge of Android evasion attacks through the deployment of MTD. This approach improves system security against carefully crafted adversarial malware samples without significantly affecting performance in the absence of attacks.

A future development of our work, which may further improve classifier security, is to extend it to defend against poisoning attacks. Although our approach cannot directly solve that problem, MTD techniques may be exploited to adapt this framework to such attacks. Another interesting extension is to investigate more advanced techniques such as generative adversarial networks (GANs): they can be used to perturb malware and build effective adversarial examples, which the defender can in turn train on to prevent such evasion attacks. These two lines of research would significantly improve the robustness and effectiveness of applying ML techniques to Android malware detection.

ACKNOWLEDGMENTS

This work was supported in part by the National Natural Science Foundation of China under Grants 62202097 and 62072100, in part by the China Postdoctoral Science Foundation under Grant 2022M710677, in part by the Jiangsu Funding Program for Excellent under Grant 2022ZB137, and in part by the Southeast University-China Mobile Research Institute Joint Innovation Foundation under Grant CMYJY-202100163.
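As a hedged illustration of the adversarial-training direction mentioned above (a minimal stand-in, not MTDroid's method or a GAN pipeline; the toy logistic-regression detector and data are assumptions), the sketch below crafts additive feature-space perturbations against a linear detector and retrains it on the evasive variants. Perturbations are additive only, mirroring the constraint that removing features from an Android app risks breaking its functionality.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logreg(X, y, lr=0.5, epochs=300):
    """Minimal logistic-regression trainer (illustrative stand-in detector)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def additive_evasion(w, x, n_flips=10):
    """Flip absent (0) features with the most benign-leaning (negative)
    weights to 1; existing features are never removed."""
    x_adv = x.copy()
    flips = 0
    for j in np.argsort(w):          # most negative weight first
        if flips >= n_flips:
            break
        if x_adv[j] == 0 and w[j] < 0:
            x_adv[j] = 1.0
            flips += 1
    return x_adv

# Toy binary feature matrix: 1 = feature (e.g., permission/API call) present.
X = (rng.random((200, 50)) < 0.3).astype(float)
y = (X[:, :5].sum(axis=1) > 1).astype(int)   # "malware" signature: first 5 features

w, b = train_logreg(X, y)
mal = X[y == 1]
X_adv = np.array([additive_evasion(w, x) for x in mal])

# Adversarial training: retrain on clean data plus the evasive variants.
X_aug = np.vstack([X, X_adv])
y_aug = np.concatenate([y, np.ones(len(X_adv), dtype=int)])
w_r, b_r = train_logreg(X_aug, y_aug)
```

In a GAN-based variant, the greedy `additive_evasion` step would be replaced by a learned generator, with the retraining loop otherwise unchanged.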
Yuyang Zhou received the B.S. degree in Electronic Information Engineering from Nanjing University of Science and Technology in 2016 and the Ph.D. degree in Cyberspace Security from Southeast University in 2021. He is currently a postdoctoral researcher with the School of Cyber Science and Engineering, Southeast University. His major research interests include moving target defense, Android malware detection, and security modeling. He has published in top journals and conferences such as IEEE TIFS, IEEE TII, and ACM CCS, and serves as a reviewer and technical program committee member for several journals and conferences in the field. He is a Member of IEEE and CCF.

Guang Cheng received the B.S. degree in Traffic Engineering from Southeast University in 1994, the M.S. degree in Computer Application from Hefei University of Technology in 2000, and the Ph.D. degree in Computer Network from Southeast University in 2003. He is a Full Professor in the School of Cyber Science and Engineering, Southeast University. He has authored or coauthored eight monographs and produced more than 100 technical papers, including papers in top journals and conferences such as IEEE ToN, IEEE TIFS, IEEE TII, and INFOCOM. His research interests include network security, network measurement, and traffic analysis. He is the director of the Jiangsu Cyber Security Association, China, the vice director of the China Computer Federation Technical Committee of Internet (CCF TCI), and the vice director of the Jiangsu Computer Society, China. He is a Member of IEEE and a Distinguished Member of CCF.

Shui Yu obtained his Ph.D. from Deakin University, Australia, in 2004. He is currently a Professor in the School of Computer Science, University of Technology Sydney, Australia. Dr. Yu's research interests include Big Data, security and privacy, networking, and mathematical modelling. He has published two monographs, edited two books, and produced more than 500 technical papers in top journals such as IEEE TPDS, TC, TIFS, TMC, TKDE, TETC, ToN, and INFOCOM. His h-index is 68. Dr. Yu initiated the research field of networking for big data in 2013, and his research outputs have been widely adopted by industrial systems, for example, the auto-scale strategy of Amazon Cloud against distributed denial-of-service attacks. He is currently serving on a number of prestigious editorial boards, including IEEE Communications Surveys and Tutorials (Area Editor), IEEE Communications Magazine, and IEEE Internet of Things Journal, among others. He is a member of AAAS and ACM, a Distinguished Visitor of the IEEE Computer Society, a voting member of the IEEE ComSoc Educational Services board, and an elected member of the Board of Governors of the IEEE Vehicular Technology Society. He is a Fellow of IEEE.

Zongyao Chen received the B.S. degree in Information Security from Hainan University in 2022. He is currently pursuing the master's degree with the School of Cyber Science and Engineering, Southeast University. His major research interests include moving target defense, Android malware detection, and reverse engineering.

Yujia Hu received the B.S. degree in Cyber Security from Southeast University in 2021. He is currently pursuing the master's degree with the School of Cyber Science and Engineering, Southeast University. His major research interests include moving target defense, Android malware detection, and traffic engineering.