PlausMal-GAN Plausible Malware Training Based On Generative Adversarial Networks For Analogous Zero-Day Malware Detection

The article presents PlausMal-GAN, a framework utilizing Generative Adversarial Networks (GAN) to enhance detection of analogous zero-day malware by generating high-quality and diverse malware images. The framework trains a discriminator to learn various malware features from both real and generated data, demonstrating improved performance in detecting zero-day malware. The study indicates that this approach is beneficial for developing and updating malware detection systems, addressing the challenges posed by traditional antivirus methods that often fail against zero-day threats.


This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TETC.2022.3170544, IEEE Transactions on Emerging Topics in Computing
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, VOL. XX, NO. XX, XX 2022

PlausMal-GAN: Plausible Malware Training Based on Generative Adversarial Networks for Analogous Zero-day Malware Detection
Dong-Ok Won, Yong-Nam Jang, and Seong-Whan Lee, Fellow, IEEE

Abstract—Zero-day malicious software (malware) refers to a previously unknown or newly discovered software vulnerability. The
fundamental objective of this paper is to enhance detection of analogous zero-day malware by efficiently learning from plausible
generated data. To detect zero-day malware, we propose a malware training framework based on analogous malware data generated
using generative adversarial networks (PlausMal-GAN). PlausMal-GAN can suitably produce analogous zero-day malware images
with high quality and high diversity from the existing malware data. The discriminator, as a detector, learns various malware features
using both real and generated malware images. In terms of performance, the proposed framework showed higher and more stable
performance on the generated malware images, which can be assumed to be analogous zero-day malware data. We
obtained reliable accuracy with the proposed PlausMal-GAN framework using representative GAN models (i.e., deep
convolutional GAN, least-squares GAN, Wasserstein GAN with gradient penalty, and evolutionary GAN). These results indicate that the
proposed framework is beneficial for the detection and prediction of numerous analogous zero-day malware data derived from
known malware when developing and updating malware detection systems.

Index Terms—Zero-day Malware, Analogous Malware Detection, Malware Augmentation, Malware Data, Generative Adversarial
Networks
• D.-O. Won is with the Department of Artificial Intelligence Convergence, Hallym University, Republic of Korea. E-mail: [email protected]
• Y.-N. Jang is with the Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea. E-mail: yn [email protected]
D.-O. Won and Y.-N. Jang contributed equally to this work.
• S.-W. Lee is with the Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea. E-mail: [email protected]
S.-W. Lee is the corresponding author.
Manuscript received February 24, 2021; revised October 7, 2021.

1 INTRODUCTION

Malware can be defined as malicious software that is designed to cause outages, denial of activity, collection of personal data without user consent, unauthorized access to system resources, and similar inappropriate behaviors. With the rapid development of information technology, the exponential increase in malware has become one of the main threats to computer security [1]–[3]. Malicious software detection has become more difficult as the number and variety of applications increase in computer security [4]–[6]: more than 143 thousand new malicious programs targeting mobile devices were detected during 2013 [5], and Kaspersky Lab’s research shows that nearly 30% of all computers were threatened at least once during 2018 [7].

Zero-day malware is an unknown or unaddressed software vulnerability that hackers use to do malicious things, such as destroying programs, stealing data, or paralyzing networks [8]. A range of antivirus systems and other strategies are used to help protect against the introduction of malware, which helps in detection if such malware is already present. Antivirus systems typically fail to detect zero-day malware because they rely on signatures to identify malware. Computers are more vulnerable to zero-day malware than to general malware because traditional antivirus systems typically cannot detect zero-day malware. Zero-day malware is an important threat to computer security, and zero-day malware detection is a top priority for malware detection systems.

To detect zero-day malware, we propose a deep learning method of generating arbitrarily modified malware features using the malware’s raw code without running it. Malware code based on specific rules and actions generates certain patterns. Examples of the malware samples used in this study are shown in Figure 1 and Figure 11 [9], [10].

When dealing with classification tasks using neural networks, data augmentation techniques have been used to compensate for imbalance or data insufficiency problems. In the malware detection research area, several papers also used simple data augmentation techniques (e.g., sliding window, transformation, etc.) to deal with these issues [11], [12].

In this study, we investigated a different direction: a malware training technique that generates zero-day malware data, rather than one focused on imbalance or data insufficiency. We propose a plausible malware training framework capable of detecting analogous zero-day malware that can handle newly plausible malware (Plausible malware training framework based on generative adversarial networks, PlausMal-GAN). Our main contribution is the proposed malware training framework based on generative adversarial networks (GAN) with generated analogous malware samples. The proposed framework trains a generator and discriminator based on real malware data and the generated malware data in the first phase. In the second phase, the generator is fixed and the discriminator is re-trained based on real

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

malware data and the generated malware data by the fixed generator. Ideally, the proposed framework can apply any kind of GAN model, so we evaluated the performance by applying the latest representative GAN models. Moreover, we obtained stable performance for abundant analogous zero-day malware test data under conditions with relatively little training data.

Fig. 1: Examples of nine type classes of malware images (Ramnit, Lollipop, Kelihos_ver3, Vundo, Simda, Tracur, Kelihos_ver1, Obfuscator.ACY, and Gatak).

2 BACKGROUND
2.1 Malware Detection
Owing to the increasing damage caused by malware and zero-day malware, research on malware detection methods has been continuously improving. We discuss two aspects of malware detection: malware detection and zero-day malware detection.

Several reported studies have dealt with malware detection [10], [13]–[17]. Nataraj et al. presented a visualization approach that differs from traditional approaches for malware detection [10], where they transformed the malware’s binary information into grayscale malware images. Ye et al. and Ndibanje et al. used Windows Audit Log and API Call for malware detection [18], [19]. Traditional machine learning algorithms such as hidden Markov models, support vector machines (SVMs), and random forests were also used for malware detection [20]–[23]. Singh et al. proposed a big data analysis framework based on random forests for malware detection [24]. Chen et al. attempted to detect malware by analyzing mobile network traffic with machine-learning methods [25]. Recently, many methods have used deep learning and generative adversarial networks (GAN) because the available computing power has increased [11], [12], [26]–[31]. Pascanu et al. used recurrent neural networks for time-series information in malware classification [26], [32]. Ye et al. presented a heterogeneous deep-learning framework composed of an autoencoder stacked up with a layer of associative memory and multilayer restricted Boltzmann machines [27]. Kabanga et al. used data from converted malware images as input to convolutional neural networks (CNNs) [28]. Yan et al. used CNN and long short-term memory networks to learn from grayscale images and opcode sequences, respectively, and used a stacking ensemble for malware classification [11]. The aforementioned methods have the disadvantage that they detect only certain variants of malware. The developers of malware use obfuscation techniques, such as null byte injection, code exchange, and subroutine reordering, to create new variants with signatures different from existing malware. However, the aforementioned methods use malware that has been discovered so far. Thus, unlearned malware will not be detected. To detect attacks that bypass deep-learning methods [33], Wang et al. proposed a resistant method that is robust to adversarial malware samples by nullifying arbitrary features [33]. However, in this way, malware characteristics are randomly removed, which risks removing not only unnecessary features but also important ones. There are now hybrid methods that combine static and dynamic methods [22], [34]. While these methods can be effective for malware detection, they have the disadvantage of being time-consuming and highly complex.

Recently, some methods have been developed for zero-day malware detection [13], [14], [35], [36]. Venkatraman and Alazab used a similarity matrix of malware for visualization in order to detect zero-day malware [14]. This method can be used to visually observe that different malware families exhibit significantly different behavior patterns. Gupta and Rani proposed a big data framework to address the big data problem caused by the increase in malware [35]. They also attempted to detect zero-day malware using big data analysis techniques and machine-learning algorithms.

This method modeled a series of opcodes to detect zero-day malware. Due to the increasing threat of malware in cyber-physical systems, Huda et al. proposed a detection method that uses methods like SVM and K-means to detect unknown malware by extracting knowledge and essential structures from already unlabeled, cheap, available data [36]. In the aforementioned zero-day malware detection methods, certain rules are fixed, and zero-day malware that does not follow these rules cannot be detected. Recently, Kim et al. proposed the transferred deep-convolutional generative adversarial network (tDCGAN), which generates fake malware and learns to distinguish it from real malware [13]. This method not only obtained enhanced performance in malware detection but also showed potential in a zero-day attack experiment. Because the method did not consider high diversity (e.g., plausible diversity) or quality in the generated zero-day malware, and neither was measured numerically (e.g., with the Fréchet inception distance), it is difficult to regard it as focused on zero-day malware detection. In contrast, we implemented an analogous zero-day malware classifier with GAN models that create new high-diversity and high-quality malware images for plausible malware augmentation. The generated data is used to create a robust detector for zero-day malware detection.


2.2 Data Augmentation
Data augmentation encompasses a suite of techniques that enhance the size and quality of training datasets such that better deep learning models can be built using them [37], [38]. Simple data augmentations based on basic image manipulations include flipping, cropping, rotation, and translation [37], [38]. Recently, GAN-based approaches have been used to create artificial instances from a dataset such that they retain similar characteristics to the original set [39], [40]. In malware detection, several papers applied data augmentation methods to solve imbalance or data insufficiency issues [12], [41].

To the best of our knowledge, there have been no studies to date that focused on the high diversity and quality of plausible malware in terms of analogous malware augmentation, which is an important factor to be investigated for various transformations or analogous data augmentation using a zero-day malware detection system. In this study, we proposed a plausible malware training framework based on GAN that could consider high diversity in generating analogous zero-day malware data. Moreover, the proposed method showed stable performance even with relatively little training data. We applied several recent GAN models (i.e., deep convolutional GAN (DCGAN) [42], least-squares GAN (LSGAN) [43], Wasserstein GAN with gradient penalty (WGAN-GP) [44], and evolutionary GAN (E-GAN) [40]) to our design, which indicates that the framework can potentially be adapted reliably to state-of-the-art GAN models.

2.3 Generative Adversarial Networks
GAN [39] is a deep-learning model that emerged for the purpose of generating data similar to the given training data. Unlike the original GAN, which uses only one objective function (e.g., minimax), Wang et al. proposed E-GAN [40] using several objective functions (i.e., minimax, heuristic, and least-squares). Generators using each objective function are evaluated by a discriminator, and the best-performing generator is chosen to evolve to the next stage. In the process of evolution, the evolved generator is expected to gradually adapt to the discriminator, which means that the evolved generator can provide high-quality, high-diversity samples and learn the real data distribution. The evolutionary process consists of three stages (i.e., variation, evaluation, and selection):

First, the variation stage uses the variation operators to produce offspring $\{G_{\theta_1}, G_{\theta_2}, \ldots\}$, given an individual $G_\theta$ in the population. In particular, several copies of each individual or parent are created, each of which is modified by a different mutation. Then, each modified copy is regarded as one child. Second, in the evaluation stage, the performance or individual quality of each child is evaluated by a fitness function $F$ that depends on the current environment (i.e., discriminator $D$). Third, in the selection stage, all children are ranked according to their fitness values and the worst ones are removed. The rest remain alive (i.e., free to act as parents) and evolve to the next iteration.

In contrast to the generator, which uses multiple objective functions, the discriminator uses the same objective function as the original GAN. The discriminator $D$ is trained to distinguish between the real data sample $x \sim p_{data}(x)$ and the generated data sample $\hat{x} \sim p_{gen}(\hat{x})$.

$$L_D = -\mathbb{E}_{x \sim p_{data}}[\log D(x)] - \mathbb{E}_{\hat{x} \sim p_{gen}}[\log(1 - D(\hat{x}))]. \tag{1}$$

3 METHODS
In this section, we describe a plausible malware training framework based on generative adversarial networks (GAN) that generates analogous malware with a malware classifier and trains the discriminator as a malware detector. Figure 2 shows the architecture of our proposed framework.

Fig. 2: The architecture of the proposed framework for analogous zero-day malware detection. [Diagram: the generator $G_\theta$ receives a class $c$ and noise $z \sim P(c, z)$ and produces generated samples; the discriminator $D$ receives real malware samples $x \sim P_{data}(x)$ and generated samples and outputs real/fake (adversarial loss) and a class (classification loss); training repeats until the discriminator converges.]

3.1 PlausMal-GAN Framework
To generate analogous malware samples for each kind of malware, the proposed framework trains a generator and discriminator based on GAN with a malware classifier using real malware data and the generated malware data in the first step. The discriminator not only discriminates real from fake, but also learns to classify malware classes. In the second step, the generator is fixed and the discriminator is re-trained based on real malware data and the malware data generated by the fixed generator. Figure 3 shows the overview and process of the proposed framework. The auxiliary classifier GAN (AC-GAN) [45] proposed a structure that produces data that matches class labels as well as data that are close to real data. For the malware classifier, the architecture of the proposed framework follows the AC-GAN structure (Figure 2). Our malware generator generates fake malware samples $\hat{x}$ from a noise sample $z$ conditioned on malware class $c$, and the discriminator not only distinguishes between real $x \sim p_{data}(x)$ and fake $\hat{x} \sim p_{gen}(\hat{x})$ but also predicts the class $c$. The difference between our method and the existing AC-GAN is that the discriminator does not learn the class information of the generated malware sample, only the class information of the real malware sample. Our discriminator training loss is defined as follows:

$$L_D = -\mathbb{E}_{x \sim p_{data}}[\log D(x) + \log p(c \mid x)] - \mathbb{E}_{\hat{x} \sim p_{gen}}[\log(1 - D(\hat{x}))]. \tag{2}$$
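As a rough illustration of the discriminator training loss, the sketch below estimates it from a batch of discriminator outputs. It follows the sign convention of Algorithm 1 and Eq. (9), in which the class log-likelihood of real samples is maximized. The function name `discriminator_loss` and its list-based inputs are assumptions made for this example and are not taken from the authors' implementation.

```python
import math

def discriminator_loss(d_real, p_class_real, d_fake):
    """Batch estimate of the discriminator loss (illustrative helper).

    d_real:       D(x) for each real sample, each in (0, 1)
    p_class_real: p(c|x), the probability assigned to the true class
                  of each real sample
    d_fake:       D(x_hat) for each generated sample, each in (0, 1)
    """
    # Real samples contribute the adversarial and the classification term.
    real_term = -sum(
        math.log(d) + math.log(p) for d, p in zip(d_real, p_class_real)
    ) / len(d_real)
    # Generated samples contribute only the adversarial term: their class
    # labels are not used when training the discriminator in phase 1.
    fake_term = -sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term

confident = discriminator_loss([0.9, 0.8], [0.9, 0.7], [0.1, 0.2])
confused = discriminator_loss([0.5, 0.5], [0.3, 0.3], [0.5, 0.5])
```

A discriminator that separates real from generated samples and classifies real samples well (`confident`) receives a lower loss than one that outputs near-chance values (`confused`).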


Fig. 3: The proposed PlausMal-GAN framework consists of two phases. (a) Training of the generator and discriminator based on GAN with a malware classifier. (b) Training of the discriminator as a zero-day malware detector using plausible malware augmentation. For an intuitive explanation, the framework is shown using evolutionary GAN, which is one of the representative GANs.
[Diagram, panel (a): noise z and class c feed the generator G_θ, whose offspring G_θ1, G_θ2, ..., G_θk are produced by mutations M_1, M_2, ..., M_k; the generated samples and the real samples are passed to the discriminator D, which outputs real or fake and one of the nine malware classes (Ramnit, Lollipop, Kelihos_ver3, Vundo, Simda, Tracur, Kelihos_ver1, Obfuscator.ACY, Gatak); in the evaluation, the offspring are ranked by fitness F_1 > F_2 > ... > F_k. Panel (b): the fixed (not training) generator G_θ produces generated samples from noise z and class c; together with the real samples, they train the discriminator D as the malware detector over the same nine malware classes.]

We considered the standard GAN approach (minimax), the least-squares approach, the heuristic approach, and a combination of the preceding three approaches for the DCGAN, LSGAN, WGAN-GP, and E-GAN models in the proposed framework, respectively. In E-GAN, an evolutionary step consists of three sub-steps: variation, evaluation, and selection. In the variation step, we adopt three objectives that are interpretable and complementary as mutations, as proposed by Wang et al. [40]. As shown in Figure 4, the three objective functions are the minimax mutation, heuristic mutation, and least-squares mutation. In addition, we added a classification loss function to the existing mutation functions, because the generated data must not only be close to real data but also correspond to the given class. The minimax mutation is similar to the minimax objective function of the original GAN, which aims to minimize the log probability that the discriminator does well. In the original GAN, gradient vanishing can occur when the discriminator produces a result close to zero (i.e., $D(\hat{x}) \to 0$). In other words, if the discriminator is confident that the generated malware data is fake malware data, the generator may not train well. However, we have been able to solve this problem to some extent by adding a classification loss. Unlike the gentle gradients early in training, if the generated malware distribution is somewhat similar to the real malware distribution, the minimax mutation provides a steep gradient, which later allows stable learning.

$$M_G^{minimax} = \mathbb{E}_{\hat{x} \sim p_{gen}}[\log(1 - D(\hat{x})) - \log p(c \mid \hat{x})]. \tag{3}$$

The heuristic mutation minimizes the log probability that the discriminator will do well, which maximizes the log probability that the discriminator will go wrong. Using this mutation, the gradient is steep even though the discriminator is convinced that the generated malware data is fake. Thus, the heuristic mutation can avoid a vanishing gradient, unlike the minimax mutation, which suggests the possibility


of better learning in the early stages than the minimax mutation.

$$M_G^{heuristic} = -\mathbb{E}_{\hat{x} \sim p_{gen}}[\log D(\hat{x}) + \log p(c \mid \hat{x})]. \tag{4}$$

Lastly, the least-squares mutation is similar to the least-squares objective function of the LSGAN, which aims at deceiving the discriminator by penalizing the generator. Using this mutation, we get a gentle slope overall and can avoid a vanishing gradient as in the heuristic mutation. Besides, least-squares mutations, when compared to heuristic mutations, do not assign very high costs to generating fake malware samples but do not assign very low costs to mode dropping, which partially avoids mode collapse [43].

$$M_G^{least\text{-}s.} = \mathbb{E}_{\hat{x} \sim p_{gen}}[(D(\hat{x}) - 1)^2 - \log p(c \mid \hat{x})]. \tag{5}$$

Fig. 4: Mutation (or objective) functions with classification loss function.

In the evaluation step, 1) the quality and 2) the diversity of the generated malware samples are measured and evaluated. To detect zero-day malware, it was important to generate high-diversity malware samples with high quality, so we adopted the evaluation step of the E-GAN architecture. First, the quality fitness score is used as a measure of quality. This method feeds the malware image generated from the noise sample by class into discriminator $D$ and uses the output value. We use the output of $D$ multiplied by the probability of that class to measure the image quality score for each class, and we use the average output value. The closer the value is to 1, the closer the malware data is to reality, that is, the higher the quality of the malware data.

$$F_q = \mathbb{E}_{\hat{x} \sim p_{gen}}[D(\hat{x}) \times \log p(c \mid \hat{x})]. \tag{6}$$

Second, the diversity fitness score is used as a measure of malware diversity. This method uses the minus log-gradient-norm of the discriminator. When the generator generates data that greatly changes the gradient of the discriminator, the discriminator is likely to determine that the generated malware data is fake. In contrast, when the generator generates data that does not change the discriminator gradient significantly, the generated malware data is not labeled as fake and tends to achieve high diversity.

$$F_d = -\log \lVert \nabla_D \rVert. \tag{7}$$

Using the two fitness scores mentioned above, the criterion for the E-GAN evaluation is as follows:

$$F = F_q + \gamma F_d \tag{8}$$

where $\gamma > 0$ is the balance between the quality and diversity measurements.

In the selection step, the offspring with the highest fitness score is selected and proceeds to the next variation step. Throughout the evolution process, the generator gradually learns to generate data for each class as well as data similar to real data. We use the converged generator for malware detection in the next step.

Algorithm 1 Plausible malware training framework (i.e., with E-GAN case)
Require: batch size $m = 32$; discriminator’s updating steps per iteration $n_D = 1$; number of parents $\mu = 1$; number of mutations $n_m = 3$; Adam hyper-parameters $\alpha = 0.0002$, $\beta_1 = 0.5$, $\beta_2 = 0.99$; the hyper-parameter $\gamma$ of the evaluation function.
Require: initial discriminator’s parameters $w_0$; initial generator’s parameters $\{\theta_0^1, \theta_0^2, \ldots, \theta_0^\mu\}$.
for number of training iterations do
    for $k = 0, \ldots, n_D$ do
        Sample a batch of $\{x^{(i)}\}_{i=1}^{m} \sim p_{data}$ (training data), and a batch of $\{(c, z)^{(i)}\}_{i=1}^{m} \sim p_{c,z}$ (noise sample $z$ by class $c$).
        $g_w \leftarrow \nabla_w \big[ \frac{1}{m} \sum_{i=1}^{m} \log D_w(x^{(i)}) + \frac{1}{m} \sum_{j=1}^{\mu} \sum_{i=1}^{m/\mu} \log(1 - D_w(G_{\theta^j}((c, z)^{(i)}))) + \frac{1}{m} \sum_{j=1}^{\mu} \sum_{i=1}^{m/\mu} \log p(c^{(i)} \mid x^{(i)}) \big]$
        $w \leftarrow \mathrm{Adam}(g_w, w, \alpha, \beta_1, \beta_2)$
    end for
    for $j = 0, \ldots, \mu$ do
        for $h = 0, \ldots, n_m$ do
            Sample a batch of $\{(c, z)^{(i)}\}_{i=1}^{m} \sim p_{c,z}$ (noise sample $z$ by class $c$).
            $g_{\theta^{j,h}} \leftarrow \nabla_{\theta^j} M_G^h(\{(c, z)^{(i)}\}_{i=1}^{m}, \theta^j)$
            $\theta_{child}^{j,h} \leftarrow \mathrm{Adam}(g_{\theta^{j,h}}, \theta^j, \alpha, \beta_1, \beta_2)$
            $F^{j,h} \leftarrow F_q^{j,h} + \gamma F_d^{j,h}$
        end for
    end for
    $\{F^{j_1,h_1}, F^{j_2,h_2}, \ldots\} \leftarrow \mathrm{sort}(\{F^{j,h}\})$
    $\theta^1, \theta^2, \ldots, \theta^\mu \leftarrow \theta_{child}^{j_1,h_1}, \theta_{child}^{j_2,h_2}, \ldots, \theta_{child}^{j_\mu,h_\mu}$
end for
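The three mutation objectives of Eqs. (3)-(5) and the fitness-based selection built on Eq. (8) can be sketched in a few lines of plain Python. This is a minimal illustration on precomputed discriminator outputs, not the authors' code: the function names, the list-based inputs, and the numeric quality and diversity scores in the usage lines are hypothetical, and no gradients or network updates are performed here.

```python
import math

def minimax_mutation(d_fake, p_class_fake):
    # Eq. (3): E[log(1 - D(x_hat)) - log p(c|x_hat)]
    return sum(math.log(1.0 - d) - math.log(p)
               for d, p in zip(d_fake, p_class_fake)) / len(d_fake)

def heuristic_mutation(d_fake, p_class_fake):
    # Eq. (4): -E[log D(x_hat) + log p(c|x_hat)]
    return -sum(math.log(d) + math.log(p)
                for d, p in zip(d_fake, p_class_fake)) / len(d_fake)

def least_squares_mutation(d_fake, p_class_fake):
    # Eq. (5): E[(D(x_hat) - 1)^2 - log p(c|x_hat)]
    return sum((d - 1.0) ** 2 - math.log(p)
               for d, p in zip(d_fake, p_class_fake)) / len(d_fake)

def fitness(quality, diversity, gamma):
    # Eq. (8): F = F_q + gamma * F_d
    return quality + gamma * diversity

def select_survivors(children, mu):
    """Keep the mu children with the highest fitness.

    children: list of (name, fitness_value) pairs, one per mutated offspring.
    """
    return sorted(children, key=lambda child: child[1], reverse=True)[:mu]

# Hypothetical quality/diversity scores for three offspring, gamma = 0.5.
children = [
    ("minimax", fitness(0.2, -1.0, 0.5)),
    ("heuristic", fitness(0.6, -0.5, 0.5)),
    ("least_squares", fitness(0.4, -0.2, 0.5)),
]
survivors = select_survivors(children, mu=1)
```

With one parent kept per iteration (mu = 1, as in Algorithm 1), only the offspring with the highest combined fitness survives to the next variation step.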


3.2 Malware Detection


16k
For analogous zero-day malware augmentation, the mal- R= (11)
C
ware generator generates high-quality and high-diversity
The malware images were so large that they were reduced
images. We use the discriminator’s classifier as a malware
to 128 × 128 using Pillow which python image library. Then
detector. The discriminator has trained anew as a malware
we used jet colormaps to represent RGB color images.
detector without adversarial training with the generator. As
a malware detector, the discriminator is trained using both 4.1.2 Malimg dataset
generated and real malware images. The objective function
In Supplementary Materials Appendix C, we show the fre-
of the discriminator is as redefined:
quency distribution of malware families and their variants
LD = −Ex∼pdata [log p(c|x)] − Ex̂∼pgen [log p(c|x̂)] (9) in the Malimg dataset [10]. We were able to find malware
data from malware class that shared the family name (i.e.,
when training the discriminator, the generator is not trained
Worm: Allaple.A and Allaple.L, PWS: C2Lop.gen!G and
and only generates malware images. Figure 3 shows training
C2Lop.P, Trojan: Lolyda.AA1 and Lolyda.AA2, TDown-
the discriminator with data augmentation as a malware
loader: Swizzor.gen!I and Swizzor.gen!E). In Table 1 and
detector.
Figure 11, eight different malware data have four pairs with
Algorithm 2 Training discriminator based on the proposed two different and similar family names and shared similar
framework properties. For the second zero-day malware experiments,
Require: batch size m = 32; discriminator’s updating we evaluated malware data with similar properties family
steps per iteration nD = 1; Adam hyper-parameters in the Malimg dataset, which consists of 5,543 malware
α = 0.0002, β1 = 0.5, β2 = 0.99. samples from 8 different malware families.
Require: initial discriminator’s parameters w0 ; initial gen- TABLE 1: Malware data with similar family names in the
erator’s parameters θ0 . Malimg dataset for the second zero-day malware
for number of training iterations do experiment
for k=0,...,nD do n om
Sample a batch of x(i) ∼ pdata (training data), Malware family names Type No. of Variants
n oi=1
m
and a batch of (c, z)(i) ∼ pc,z (noise sample z Allaple.A Worm 2949
i=1 Allaple.L Worm 1591
by class c). C2Lop.gen!G PWS 200
m
1 X C2Lop.P PWS 146
gw ←∇w [ log p(c(i) |x(i) ) Lolyda.AA1 Trojan 213
m i=1 Lolyda.AA2 Trojan 184
m Swizzor.gen!I TDownloader 132
1 X
log p(c(i) |Gθ ((c, z)(i) ))]
+ Swizzor.gen!E TDownloader 128
m i=1
w ← Adam(gw , w, α, β1 , β2 )
end for
end for

4 EXPERIMENTS AND RESULTS
This section describes the experiments and results for evaluating the proposed framework.

4.1 Datasets
4.1.1 Microsoft malware classification challenge dataset
To verify the data generation and detection performance of the proposed framework, we used malware data from the Microsoft dataset [9]. Each malware sample is provided as a byte file, and we used the binary code written in it. The dataset contains 10,868 malware samples, divided into 9,781 training samples and 1,087 test samples (9:1 train-test ratio). Appendix B shows the malware types used and the number of samples for each type [9].
Following Nataraj et al. [10], we converted the malware binary code into an image called a malware image. If k is the length of the binary code, C is the size of the converted column, and R is the size of the converted row, the sizes of the converted columns and rows are calculated as follows:

\[ C = 2^{\frac{\log 16k}{\log 2} + 1} \tag{10} \]

4.2 Experimental Details
The experiments are divided into two parts: an existing malware classification experiment and analogous zero-day malware attack experiments. In the existing malware classification experiment, we compared the proposed framework with representative GANs (i.e., DCGAN, LSGAN, WGAN-GP, and E-GAN) and with the experimental results of previous methods [13]. In the proposed framework, we used the same network structure (Supplementary Table S2). In the first analogous zero-day malware attack experiment, we also compared our framework with the four GAN models and with the results of previous methods (i.e., random forest, decision tree, nearest neighbors, Naive Bayes, multi-layer perceptron (MLP) [46], CNN [47], GAN [39], and tDCGAN [13]). In the second zero-day malware experiment, we compared the proposed framework with only phase 1 and with phases 1&2, using the four representative GAN models.
The experiments were run on a computer with Ubuntu 16.04.2 LTS, an Intel Xeon Gold 6148 central processing unit, Samsung DDR4 16 GB × 4 random-access memory, and a TITAN XP graphics processing unit. The proposed framework was implemented with the PyTorch library. The network architectures of the generator and the discriminator are shown in Supplementary Table S2.
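As a rough illustration of the byte-to-image conversion described in Section 4.1.1, the sketch below turns a byte sequence into a fixed-size grayscale image by cropping or zero-padding it, in the spirit of the crop and pad operations mentioned in the Conclusions. The function name `bytes_to_image` and the `width` and `height` parameters are hypothetical, and the image size is passed in directly rather than derived from Eq. (10).

```python
from typing import List


def bytes_to_image(data: bytes, width: int, height: int) -> List[List[int]]:
    """Reshape raw bytes into a height x width grayscale image (values 0-255).

    Assumption: inputs longer than width * height are cropped and shorter
    inputs are zero-padded; the paper's exact sizing rule is not reproduced.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    size = width * height
    # Crop to the target size, then pad with zero bytes if the input is short.
    pixels = list(data[:size]) + [0] * max(0, size - len(data))
    # Split the flat pixel list into rows of `width` pixels each.
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


if __name__ == "__main__":
    # Three bytes placed into a 2 x 2 image; the last pixel is zero padding.
    print(bytes_to_image(bytes([1, 2, 3]), width=2, height=2))
```

Each pixel is one byte of the binary, so the resulting rows can be passed to any image library as an 8-bit grayscale array.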
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TETC.2022.3170544, IEEE
Transactions on Emerging Topics in Computing
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, VOL. XX, NO. XX, XX 2022 7

(a) (b)

Fig. 5: Examples of (a) real malware images and (b) generated malware images in the proposed framework.

TABLE 2: FIDs between generated malware images and real malware images in the proposed framework

Model   DCGAN    LSGAN    WGAN-GP   E-GAN (r = 0.1, r = 0.5)
FID     220.16   190.70   206.23    146.39, 127.96
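To make the FID values in Table 2 more concrete, the following sketch evaluates Eq. (12) under the simplifying assumption that both covariance matrices are diagonal, in which case the trace term reduces to the sum of squared differences of the per-dimension standard deviations. This is not the computation used in the paper, which relies on Inception v3 features and full covariance matrices; the function name `fid_diagonal` and the toy feature vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def _mean_and_std(features: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and population standard deviation of feature vectors."""
    n = len(features)
    dim = len(features[0])
    means = [sum(f[d] for f in features) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in features) / n)
        for d in range(dim)
    ]
    return means, stds


def fid_diagonal(real: Sequence[Sequence[float]],
                 generated: Sequence[Sequence[float]]) -> float:
    """FID of Eq. (12), assuming diagonal covariances (a simplification).

    For diagonal Sigma_x and Sigma_g, the trace term
    Tr(Sigma_x + Sigma_g - 2 (Sigma_x Sigma_g)^(1/2))
    equals sum_d (sigma_x[d] - sigma_g[d])^2.
    """
    mu_x, sd_x = _mean_and_std(real)
    mu_g, sd_g = _mean_and_std(generated)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_x, mu_g))
    trace_term = sum((a - b) ** 2 for a, b in zip(sd_x, sd_g))
    return mean_term + trace_term


if __name__ == "__main__":
    real_feats = [[0.0, 0.0], [2.0, 2.0]]
    fake_feats = [[3.0, 0.0], [5.0, 2.0]]  # same spread, mean shifted by 3
    print(fid_diagonal(real_feats, fake_feats))
```

Identical feature sets give a distance of zero, and the value grows as either the means or the spreads of the two sets drift apart, which is why a lower FID is read as a closer match between generated and real images.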




4.3 Analysis of Generated Malware Data
Figure 5 shows examples of the malware images generated using the Microsoft dataset [9]. Qualitatively, the generated malware images are similar to the real malware images, which suggests that the proposed framework can also generate modified malware or analogous zero-day malware.
We chose the Fréchet inception distance (FID) [48] as a quantitative metric for evaluating generator convergence. The FID uses a pre-trained Inception v3 network to extract features of the generated images and real images. The distribution of the extracted features is then modeled as a multivariate Gaussian distribution with mean µ and covariance Σ. The FID between the real images x and generated images g is computed as follows:

\[ \mathrm{FID}(x, g) = \lVert \mu_x - \mu_g \rVert_2^2 + \mathrm{Tr}\left( \Sigma_x + \Sigma_g - 2 (\Sigma_x \Sigma_g)^{1/2} \right), \tag{12} \]

where Tr is the sum of all the diagonal elements.
A lower FID implies that the distribution of the generated images is closer to that of the real images. It also means that the generated images have high quality and high diversity. As shown in Table 2, the proposed framework with E-GAN has the lowest FID scores (146.39 and 127.96). This indicates that the generator of the proposed framework produced high-quality and high-diversity malware samples. Although a low FID does not mean that the generated samples are actually new malware, they are likely to be variants of existing malware. We can therefore expect a data augmentation effect from the generated data.

4.4 Malware Classification
To obtain a more reliable estimate of prediction performance, we used 10-fold cross-validation for all methods in the existing malware classification experiment on the Microsoft dataset [9]. The average classification accuracy achieved by the proposed framework was 95.56%, which is higher than that of every previous method in Table 3. Table 3 shows the numerical classification results with the four different models (i.e., DCGAN, LSGAN, WGAN-GP, and E-GAN). Because the E-GAN model achieved the highest accuracy, only the proposed framework with this model was used for some further analyses (i.e., Table 4 and Figures 7 and 9).
To verify the performance of the proposed malware classifier, we present a confusion matrix in Figure 7. We calculated the precision, recall, and F1-score for each malware type and summarized them in Table 4. We also compared the classification accuracies of the proposed framework with the four different GAN models according to the training iterations in Figure 6. As a result, the E-GAN model showed higher classification performance than the other representative models.

Fig. 6: Classification accuracy according to the training iterations for the proposed framework with four representative models (axes: Iterations vs. Accuracy; curves: DCGAN, LSGAN, WGAN-GP, E-GAN).

4.5 Zero-day Malware
4.5.1 Zero-day malware experiment I using generated analogous zero-day malware
We modeled plausible zero-day malware for analogous zero-day malware attack experiments using the Microsoft
dataset (Figure 8) [9]. The previous study assumed that zero-day attacks can be modeled by introducing noise into existing malware data [13]. The noise was generated by the structural similarity (SSIM) method, which uses the structural similarity of images [49]. We likewise used the SSIM method for systematic noise generation. The SSIM value for a pair of images x and y is calculated from the pixel means µx and µy, the pixel variances σx² and σy², and the covariance σxy of the two images:

\[ \mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{13} \]

where c1 = (k1 L)^2, c2 = (k2 L)^2, k1 = 0.01, k2 = 0.03, and L = 2^(# bits per pixel) − 1.
We used altered malware images with SSIM values between 0.60 and 0.68 at intervals of 0.02 for the analogous zero-day malware evaluation [13], [49]. Then, for a more diverse zero-day malware evaluation, we regenerated the transformed malware images with two combined ratios, 7:3 and 8:2. We also compared the proposed framework with the results of previous methods (at the 8:2 combined ratio) [13] (Table 5). The plausible zero-day malware modeling with noise is calculated as follows:

\[ N_k(x, y) = (1 - \xi)x + \xi y, \tag{14} \]

where SSIM(x, y) > k, and ξ is 0.3 and 0.2 for the combined ratios 7:3 and 8:2, respectively. Figure 8 shows examples of deformed plausible zero-day malware.

TABLE 3: Comparison of malware classification accuracies in the proposed framework with four representative GAN models and previous methods

                                                           Proposed framework (PlausMal-GAN)
Model          MLP        CNN        GAN        tGAN       DCGAN   LSGAN   WGAN-GP   E-GAN
Accuracy (%)   83.06      94.63      87.81      88.10      94.99   96.02   94.86     96.35
Std. dev.      7.54e-04   2.12e-05   3.44e-05   8.05e-05   0.596   0.351   0.255     0.539

Fig. 7: Confusion matrix for malware classification results in 9:1 train-test ratio.

TABLE 4: Results of precision, recall, and F1-score for each malware type in the proposed framework

            R       L       K3      V       S       T       K1      O       G
Precision   0.954   0.971   0.996   0.854   0.666   0.864   1.000   0.982   0.960
Recall      0.961   0.975   0.993   0.979   0.500   0.933   0.975   0.894   0.950
F1-score    0.957   0.973   0.994   0.912   0.571   0.897   0.987   0.936   0.955

TABLE 5: Comparison of analogous zero-day malware attack performances in two different combined ratios (CR) for the proposed framework and previous methods (%)

Model \ SSIM         0.60    0.62    0.64    0.66    0.68    CR
Random Forest        91.28   95.19   92.88   95.58   91.40   8:2
Decision Tree        95.64   96.46   96.71   96.41   96.18   8:2
Nearest Neighbors    97.71   97.72   98.36   98.34   98.09   8:2
Naive Bayes          90.60   90.89   91.51   91.16   90.45   8:2
MLP                  96.78   96.46   97.26   97.23   96.82   8:2
CNN                  98.16   98.23   98.63   98.61   98.41   8:2
GAN                  96.32   96.96   96.99   96.95   96.50   8:2
tGAN                 97.24   96.96   97.81   97.78   97.45   8:2
tDCGAN               98.39   98.73   98.63   98.61   98.41   8:2
Proposed framework
  (with DCGAN)       99.59   99.66   99.60   98.58   99.54   8:2
  (with LSGAN)       99.43   99.66   99.64   98.41   99.54   8:2
  (with WGAN-GP)     99.59   99.66   99.60   98.58   98.63   8:2
  (with E-GAN)       99.94   100.0   99.74   99.58   99.84   8:2
Proposed framework
  (with DCGAN)       99.02   99.25   99.28   96.52   99.05   7:3
  (with LSGAN)       97.28   99.10   99.42   97.00   99.05   7:3
  (with WGAN-GP)     99.02   99.70   99.38   96.52   97.94   7:3
  (with E-GAN)       99.86   100.0   99.51   98.42   99.68   7:3

Table 5 reports the results of the analogous zero-day malware attack experiment separately for the 8:2 and 7:3 combined ratios. We used 10-fold cross-validation (i.e., a 9:1 train-test ratio). In the 8:2 combined ratio experiments, the proposed framework with E-GAN was more accurate than the previous methods [13] in all SSIM conditions, and the other GAN models were more accurate except at an SSIM value of 0.66, where CNN and tDCGAN (98.61%) were marginally higher. The accuracy of the proposed framework was stable for all tested GAN models in all SSIM conditions. Moreover, in the 7:3 combined ratio experiments, we also obtained reliably high average accuracies of 98.62%, 98.37%, 98.51%, and 99.49% for the proposed framework with the DCGAN, LSGAN, WGAN-GP, and E-GAN models, respectively. In particular, a lower SSIM value or a higher combined noise ratio corresponds to an analogous zero-day attack that differs more from existing malware, but the proposed framework showed stable performance for all SSIM values and combined ratios. As a result, the proposed framework obtained high and stable performance even under large variations of existing malware (e.g., combined ratio 7:3 or SSIM value 0.6) in an analogous zero-day malware attack. Moreover, to cover a condition with few training data, we conducted a changed train-test ratio
experiment (9:1→5:5) for thorough performance verification with 2-fold cross-validation (10-fold → 2-fold cross-validation). This experiment evaluated a wider variety of zero-day malware data because the number of test samples increased (the average number of zero-day malware samples: 506 (122∼1,122) → 14,850 (4,262∼33,710)). As shown in Table 6 and Figure 9, the proposed framework maintained stable test performance (above 97.5% for all GAN models and above 99% with E-GAN) even though the training data were relatively few (reduced to half) and the analogous zero-day malware test data increased.

Fig. 8: Examples of plausible zero-day malware with SSIM values of (a) 0.60 and (b) 0.68. Each panel shows a pair of malware images x and y and their blends (1 − ξ)x + ξy and (1 − ξ)y + ξx.
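The plausible zero-day modeling of Eqs. (13) and (14) can be sketched as follows. The code computes a single global SSIM value from the pixel statistics of two equally sized grayscale images (given here as flat lists of pixel values) and blends them with weight ξ only when their SSIM exceeds a threshold k. The function names `ssim` and `blend_if_similar`, the use of population statistics over the whole image, and the choice to return `None` for dissimilar pairs are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Optional, Sequence


def ssim(x: Sequence[float], y: Sequence[float], bits_per_pixel: int = 8) -> float:
    """Global SSIM of Eq. (13), computed over all pixels of two images."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    dynamic_range = 2 ** bits_per_pixel - 1   # L in Eq. (13)
    c1 = (0.01 * dynamic_range) ** 2          # (k1 * L)^2 with k1 = 0.01
    c2 = (0.03 * dynamic_range) ** 2          # (k2 * L)^2 with k2 = 0.03
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population variances and covariance (an assumption of this sketch).
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def blend_if_similar(x: Sequence[float], y: Sequence[float],
                     xi: float, k: float) -> Optional[List[float]]:
    """Return N_k(x, y) = (1 - xi) * x + xi * y if SSIM(x, y) > k, else None."""
    if ssim(x, y) <= k:
        return None
    return [(1 - xi) * a + xi * b for a, b in zip(x, y)]


if __name__ == "__main__":
    img_a = [100.0, 110.0, 120.0, 130.0]
    img_b = [102.0, 111.0, 121.0, 129.0]
    # xi = 0.2 corresponds to the 8:2 combined ratio in the text.
    print(blend_if_similar(img_a, img_b, xi=0.2, k=0.60))
```

Setting `xi` to 0.3 instead gives the 7:3 combined ratio, and sweeping `k` from 0.60 to 0.68 in steps of 0.02 mirrors the SSIM conditions of Table 5.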

Fig. 9: Confusion matrix for zero-day malware classification results in 5:5 train-test ratio with 8:2 combined ratio and 0.64 SSIM.

TABLE 6: Results of performances for zero-day malware attack experiment in the few training data conditions (%)

Model \ SSIM         0.60    0.62    0.64    0.66    0.68    CR
Proposed framework
  (with DCGAN)       97.58   98.74   98.83   97.76   97.73   8:2
  (with LSGAN)       98.79   99.17   99.29   98.79   98.74   8:2
  (with WGAN-GP)     97.92   98.73   98.96   98.43   98.23   8:2
  (with E-GAN)       99.11   99.42   99.51   99.06   99.25   8:2
Proposed framework
  (with DCGAN)       97.99   98.82   98.96   98.00   98.09   7:3
  (with LSGAN)       98.98   99.18   99.64   99.06   98.78   7:3
  (with WGAN-GP)     98.10   98.64   99.04   98.33   98.30   7:3
  (with E-GAN)       99.40   99.67   99.70   99.31   99.13   7:3

4.5.2 Zero-day malware experiment II using malware data with similar family names
We conducted zero-day malware attack experiment II with malware data from the Malimg dataset [10] that belong to different classes but share a family name and have similar properties. We found data in the Malimg dataset that are well suited for zero-day malware experiments (Table 1 and Figure 11). We trained and tested four malware types, each with two differently named families that have similar properties (four types, 5,543 samples in total; Worm: Allaple.A (2,949) and Allaple.L (1,591), PWS: C2Lop.gen!G (200) and C2Lop.P (146), Trojan: Lolyda.AA1 (213) and Lolyda.AA2 (184), TDownloader: Swizzor.gen!I (132) and Swizzor.gen!E (128)). For richer interpretation and analysis, we divided the zero-day experiment into two sessions for training and testing. In session A, the training dataset (3,494) consists of Allaple.A, C2Lop.gen!G, Lolyda.AA1, and Swizzor.gen!I, and the test dataset (2,049) consists of Allaple.L, C2Lop.P, Lolyda.AA2, and Swizzor.gen!E. Conversely, session B swaps the two sets and therefore has a training dataset of 2,049 samples and a test dataset of 3,494 samples. Session B poses the challenging problem of learning from a small amount of training data. This is a major issue not only in machine learning in general but also in the development of malware detection technology, especially zero-day malware detection. Even if a sample is derived from the same malware family, it is zero-day malware that has not been learned previously, and it can cause a large performance
degradation problem in the initial period because very limited data are available for learning. To verify that the proposed framework can handle zero-day malware problems and few-data issues, we designed a second zero-day experiment using similar malware families from the Malimg dataset. The experiment not only comprises training sessions A and B but also evaluates the proposed framework with only phase 1 and with phases 1&2. The proposed framework deals with analogous new data by combining phase 1, which trains the generator and discriminator, with phase 2, which trains the discriminator on the analogous zero-day malware data. Table 7 and Figure 10 show that the models trained up to phase 2 performed better than those trained with only phase 1 in all sessions (A and B). In particular, very interesting results were obtained in session B, where training was performed with a small amount of training data. In session B, training with only phase 1 of the proposed framework gave very poor results (accuracies below 40%) for all tested GAN models. This experiment demonstrates that existing GAN studies (i.e., phase 1 in the proposed framework) may not respond properly to new data. In contrast, the final models trained up to phase 2 of the proposed framework showed very stable and high averaged accuracy (> 98.65%) (Table 7 and Figure 10). Consequently, the proposed framework can learn very effectively when there is little data, showing excellent performance in the zero-day malware detection problem.

Fig. 10: Confusion matrix for zero-day malware classification results (session B) in the proposed framework (E-GAN) with (a) only phase 1 and (b) whole phases (1&2), using similar malware families from the Malimg dataset.

TABLE 7: Comparison of zero-day malware classification accuracies for second zero-day experiment (Malimg dataset) in the proposed framework with only phase 1 and whole phases

Proposed framework   Train session   DCGAN   LSGAN   WGAN-GP   E-GAN
with only phase 1    A               87.70   82.67   85.06     90.53
with only phase 1    B               39.15   39.78   39.32     39.78
with phase 1&2       A               100     98.19   99.95     99.21
with phase 1&2       B               98.42   97.82   96.88     98.74

Fig. 11: Examples of eight malware images from Malimg dataset for second zero-day experiment [10] (first row: Allaple.A, C2Lop.gen!G, Lolyda.AA1, Swizzor.gen!I; second row: Allaple.L, C2Lop.P, Lolyda.AA2, Swizzor.gen!E).

In practice, zero-day malware is often derived from variations of existing malware [8], [13]. To explore the performance limits of the proposed framework, we evaluated it on restricted data from two different datasets [9], [10]. The first zero-day experiment assumes a plausible zero-day malware attack by transforming existing malware instead of using actual zero-day malware attack data. Additionally, we designed another zero-day experiment using similar malware families from different malware types. Although we obtained strong results in these zero-day experiments, a richer malware database might have allowed more meaningful interpretation and discussion.
Meanwhile, GAN-based image-processing approaches in the malware detection field have a one-way limitation: malware code can be converted into an image, but not the reverse [8], [13], [29]. However, conversion back to malware code is not required to achieve the goals of this study. The proposed framework aims to detect the myriad of similar malware that can be created with slight changes. Even though the proposed framework cannot reproduce the malware code, it can detect and classify analogous malware that is highly similar to the learned sample malware
data. In addition, if a new type of zero-day malware that was not used for learning appears, the proposed method has the advantage of being able to quickly learn the new type of malware and apply it. Therefore, in terms of practicality and convenience, it is a very helpful framework for developing zero-day malware detection software.
Meanwhile, as is known from adversarial attacks, the performance of many machine-learning-based systems is greatly reduced and neutralized by small distortions (e.g., combined noise) [50], [51]. This is no different in this field, and some hackers will exploit this vulnerability. Therefore, it is necessary to build a security system that is robust and stable against these easy modifications. The proposed framework intuitively generates and learns plausible new malware from existing malware, and it can be a complementary measure for dealing with these challenging problems.

5 CONCLUSIONS
In the present study, we proposed a framework based on plausible malware training and augmentation using a generative adversarial network to solve the problems caused by malware and analogous zero-day malware. In particular, because zero-day malware is often created by deforming existing malware, the proposed framework with representative GAN models augmented the data with high-quality and high-diversity evolved malware images. For detection and classification, the discriminator was trained using malware images generated by the generator and thereby became robust to zero-day malware. Moreover, the proposed framework achieved high and stable averaged accuracy in the analogous zero-day malware attack experiment. We believe that the proposed plausible zero-day malware detection approach has important advantages for antivirus systems in computer security because it does not require inefficient malware signature analysis. In this study, the malware code was converted to malware images with fixed sizes through crop and pad operations for efficient learning. In fact, these operations could reduce the signatures of malware. In future studies, we will expand the malware types with various malware datasets (including zero-day malware) and solve the problem of varying malware lengths. Moreover, further research should be conducted to develop a GAN model optimized for our proposed framework for extensive zero-day malware detection. In future studies, it will also be interesting to use explainable AI techniques (e.g., [52]) to gain a further understanding of zero-day malware features, thus allowing the zero-day malware detection AI and its creators to learn better from their mistakes. Moreover, cases of extreme changes, such as new types of zero-day malware, deserve further investigation to extend the possible application spectrum.

ACKNOWLEDGMENTS
This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00079, Artificial Intelligence Graduate School Program (Korea University); No. 2021-0-02068, Artificial Intelligence Innovation Hub).

REFERENCES
[1] A. Mosenia and N. K. Jha, “A Comprehensive Study of Security of Internet-of-Things,” IEEE Transactions on Emerging Topics in Computing, vol. 5, no. 4, pp. 586–602, 2016.
[2] A. Li, S. Xue, X. Li, L. Zhang, and J. Qian, “AppDNA: Profiling App Behavior via Deep-Learning on Function Call Graphs,” IEEE Transactions on Emerging Topics in Computing, 2020.
[3] S. Homayoun, A. Dehghantanha, M. Ahmadzadeh, S. Hashemi, and R. Khayami, “Know Abnormal, Find Evil: Frequent Pattern Mining for Ransomware Threat Hunting and Intelligence,” IEEE Transactions on Emerging Topics in Computing, vol. 8, no. 2, pp. 341–351, 2017.
[4] T. Saha, N. Aaraj, N. Ajjarapu, and N. K. Jha, “Sharks: Smart hacking approaches for risk scanning in internet-of-things and cyber-physical systems based on machine learning,” IEEE Transactions on Emerging Topics in Computing, 2021.
[5] W. Zhang, Y. Wen, and X. Zhang, “Towards Virus Scanning as a Service in Mobile Cloud Computing: Energy-Efficient Dispatching Policy Under N-Version Protection,” IEEE Transactions on Emerging Topics in Computing, vol. 6, no. 1, pp. 122–134, 2015.
[6] S. D. SL and C. Jaidhar, “Windows malware detector using convolutional neural network based on visualization images,” IEEE Transactions on Emerging Topics in Computing, 2019.
[7] “Kaspersky Security Bulletin 2018. Statistics,” 2018. [Online]. Available: https://securelist.com/kaspersky-security-bulletin-2018-statistics/89145
[8] M. Grace, Y. Zhou, Q. Zhang, S. Zou, and X. Jiang, “Riskranker: Scalable and Accurate Zero-day Android Malware Detection,” in Proceedings of the 10th International Conference on Mobile Systems, Applications, and Services, 2012, pp. 281–294.
[9] R. Ronen, M. Radu, C. Feuerstein, E. Yom-Tov, and M. Ahmadi, “Microsoft Malware Classification Challenge,” arXiv preprint arXiv:1802.10135, 2018.
[10] L. Nataraj, S. Karthikeyan, G. Jacob, and B. Manjunath, “Malware images: Visualization and Automatic Classification,” in Proceedings of the 8th International Symposium on Visualization for Cyber Security, 2011, p. 4.
[11] J. Yan, Y. Qi, and Q. Rao, “Detecting Malware with an Ensemble Method Based on Deep Neural Network,” Security and Communication Networks, vol. 2018, p. 7247095, 2018.
[12] Z. Cui, F. Xue, X. Cai, Y. Cao, G.-g. Wang, and J. Chen, “Detection of Malicious Code Variants Based on Deep Learning,” IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3187–3196, 2018.
[13] J.-Y. Kim, S.-J. Bu, and S.-B. Cho, “Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders,” Information Sciences, vol. 460, pp. 83–102, 2018.
[14] S. Venkatraman and M. Alazab, “Use of Data Visualisation for Zero-Day Malware Detection,” Security and Communication Networks, vol. 2018, p. 1728303, 2018.
[15] F. Xiao, Z. Lin, Y. Sun, and Y. Ma, “Malware Detection Based on Deep Learning of Behavior Graphs,” Mathematical Problems in Engineering, vol. 2019, p. 8195395, 2019.
[16] J. Zhu, J. Jang-Jaccard, and P. A. Watters, “Multi-loss siamese neural network with batch normalization layer for malware detection,” IEEE Access, vol. 8, pp. 171 542–171 550, 2020.
[17] S. Sharmeen, S. Huda, J. Abawajy, and M. M. Hassan, “An adaptive framework against android privilege escalation threats using deep learning and semi-supervised approaches,” Applied Soft Computing, vol. 89, p. 106089, 2020.
[18] K. Berlin, D. Slater, and J. Saxe, “Malicious Behavior Detection using Windows Audit Logs,” in Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, 2015, pp. 35–44.
[19] B. Ndibanje, K. H. Kim, Y. J. Kang, H. H. Kim, T. Y. Kim, and H. J. Lee, “Cross-Method-Based Analysis and Classification of Malicious Behavior by API Calls Extraction,” Applied Sciences, vol. 9, no. 2, p. 239, 2019.
[20] C. Annachhatre, T. H. Austin, and M. Stamp, “Hidden Markov models for malware classification,” Journal of Computer Virology and Hacking Techniques, vol. 11, no. 2, pp. 59–73, 2015.
[21] S.-W. Lee and A. Verri, Pattern Recognition with Support Vector Machines: Proc. of First International Workshop, Niagara Falls, Canada. Springer, 2003.
[22] P. Wang and Y.-S. Wang, “Malware behavioural detection and vaccine development by using a support vector model classifier,” Journal of Computer and System Sciences, vol. 81, no. 6, pp. 1012–1026, 2015.
[23] F. C. C. Garcia, I. Muga, and P. Felix, “Random Forest for Malware Classification,” arXiv preprint arXiv:1609.07770, 2016.
[24] K. Singh, S. C. Guntuku, A. Thakur, and C. Hota, “Big Data Analytics framework for Peer-to-Peer Botnet Detection using Random Forests,” Information Sciences, vol. 278, pp. 488–497, 2014.
[25] Z. Chen, Q. Yan, H. Han, S. Wang, L. Peng, L. Wang, and B. Yang, “Machine learning based mobile malware detection using highly imbalanced network traffic,” Information Sciences, vol. 433, pp. 346–364, 2018.
[26] R. Pascanu, J. W. Stokes, H. Sanossian, M. Marinescu, and A. Thomas, “Malware classification with recurrent networks,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, 2015, pp. 1916–1920.
[27] Y. Ye, L. Chen, S. Hou, W. Hardy, and X. Li, “DeepAM: a heterogeneous deep learning framework for intelligent malware detection,” Knowledge and Information Systems, vol. 54, no. 2, pp. 265–285, 2018.
[28] E. K. Kabanga and C. H. Kim, “Malware Images Classification using Convolutional Neural Network,” Journal of Computer and Communications, vol. 6, no. 1, p. 153, 2017.
[29] V. S. Bhaskara and D. Bhattacharyya, “Emulating malware authors for proactive protection using gans over a distributed image visualization of dynamic file behavior,” arXiv preprint arXiv:1807.07525, 2018.
[30] W. Hu and Y. Tan, “Generating adversarial malware examples for black-box attacks based on gan,” arXiv preprint arXiv:1702.05983, 2017.
[31] M. Kawai, K. Ota, and M. Dong, “Improved malgan: Avoiding malware detector by leaning cleanware features,” in 2019 international conference on artificial intelligence in information and communication (ICAIIC). IEEE, 2019, pp. 040–045.
[32] S.-W. Lee and H.-H. Song, “A new recurrent neural-network architecture for visual pattern recognition,” IEEE Transactions on Neural Networks, vol. 8, no. 2, pp. 331–340, 1997.
[33] Q. Wang, W. Guo, K. Zhang, A. G. Ororbia II, X. Xing, X. Liu, and C. L. Giles, “Adversary Resistant Deep Neural Networks with an Application to Malware Detection,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1145–1153.
[34] Z.-U. Rehman, S. N. Khan, K. Muhammad, J. W. Lee, Z. Lv, S. W. Baik, P. A. Shah, K. Awan, and I. Mehmood, “Machine learning-assisted signature and heuristic-based detection of malwares in Android devices,” Computers & Electrical Engineering, vol. 69, pp. 828–841, 2018.
[35] D. Gupta and R. Rani, “Big Data Framework for Zero-Day Malware Detection,” Cybernetics and Systems, vol. 49, no. 2, pp. 103–121, 2018.
[36] S. Huda, S. Miah, M. M. Hassan, R. Islam, J. Yearwood, M. Alrubaian, and A. Almogren, “Defending unknown attacks on cyber-physical systems by semi-supervised approach and available unlabeled data,” Information Sciences, vol. 379, pp. 211–228, 2017.
[37] C. Shorten and T. M. Khoshgoftaar, “A survey on Image Data Augmentation for Deep Learning,” Journal of Big Data, vol. 6, no. 1, p. 60, 2019.
[38] H.-G. Jung and S.-W. Lee, “Few-Shot Learning with Geometric Constraints,” IEEE Transactions on Neural Networks and Learning Systems, 2020.
[39] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative Adversarial Nets,” in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
[40] C. Wang, C. Xu, X. Yao, and D. Tao, “Evolutionary Generative Adversarial Networks,” IEEE Transactions on Evolutionary Computation, vol. 23, no. 6, pp. 921–934, 2019.
[41] R. Burks, K. A. Islam, Y. Lu, and J. Li, “Data Augmentation with Generative Models for Improved Malware Detection: A Comparative Study,” in Proceedings of the IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference, 2019, pp. 0660–0665.
[42] A. Radford, L. Metz, and S. Chintala, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks,” arXiv preprint arXiv:1511.06434, 2015.
[43] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. Paul Smolley, “Least Squares Generative Adversarial Networks,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2794–2802.
[44] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” arXiv preprint arXiv:1701.07875, 2017.
[45] A. Odena, C. Olah, and J. Shlens, “Conditional Image Synthesis with Auxiliary Classifier GANs,” in Proceedings of International Conference on Machine Learning. JMLR.org, 2017, pp. 2642–2651.
[46] J. L. McClelland, D. E. Rumelhart, P. R. Group et al., “Parallel Distributed Processing,” Explorations in the Microstructure of Cognition, vol. 2, pp. 216–271, 1986.
[47] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[48] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium,” in Advances in Neural Information Processing Systems, 2017, pp. 6626–6637.
[49] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[50] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards Deep Learning Models Resistant to Adversarial Attacks,” arXiv preprint arXiv:1706.06083, 2017.
[51] N. Carlini and D. Wagner, “Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods,” in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 2017, pp. 3–14.
[52] S. Lapuschkin, S. Wäldchen, A. Binder, G. Montavon, W. Samek, and K.-R. Müller, “Unmasking clever hans predictors and assessing what machines really learn,” Nature communications, vol. 10, no. 1, p. 1096, 2019.

Dong-Ok Won received his B.S. degree in Computer Engineering from Tech University of Korea, Republic of Korea, in 2012, and his Ph.D. degree in Department of Brain and Cognitive Engineering from Korea University, Republic of Korea, in 2019. He is currently working as an assistant professor in the Department of Artificial Intelligence at Hallym University, Republic of Korea. His research interests are pattern recognition, machine learning, artificial intelligence, and computer security.

Young-Nam Jang received M.S. degree in Department of Brain and Cognitive Engineering from Korea University, Republic of Korea, in 2020. His research interests are pattern recognition, machine learning, and computer security.

Seong-Whan Lee (S’84–M’89–SM’96–F’10) received the B.S. degree in computer science and statistics from Seoul National University, Seoul, Republic of Korea, in 1984, and the M.S. and Ph.D. degrees in computer science from the Korea Advanced Institute of Science and Technology, Republic of Korea, in 1986 and 1989, respectively. He is currently the Head of the Department of Artificial Intelligence, Korea University, Republic of Korea. His current research interests include artificial intelligence, pattern recognition, and brain engineering. Dr. Lee is a fellow of the International Association of Pattern Recognition (IAPR) and the Korea Academy of Science and Technology.