
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 19, 2024 7705

GraphTunnel: Robust DNS Tunnel Detection Based on DNS Recursive Resolution Graph
Guangyuan Gao, Weina Niu, Senior Member, IEEE, Jiacheng Gong, Graduate Student Member, IEEE, Dujuan Gu, Song Li, Member, IEEE, Mingxue Zhang, and Xiaosong Zhang

Abstract— DNS tunnels, due to their versatility and concealment, have become a preferred method for attackers to execute Command and Control (C&C) attacks, posing a significant security threat to terminal devices. Therefore, the efficient and accurate detection of DNS tunnels is important in reducing the economic losses and privacy risks faced by both enterprises and individuals. Despite notable advancements in the research of intelligent detection of DNS tunnels, existing model-based approaches predominantly concentrate on the surface-level features of domain names or packet payloads. This narrow focus leads to low detection accuracy when dealing with unknown DNS tunnel attacks and traffic from wildcard DNS. Furthermore, these methods struggle with accurately identifying DNS tunneling tools, complicating the task of swiftly locating and mitigating malware for analysts. This paper proposes GraphTunnel, a framework based on graph neural networks for detecting DNS tunnels and identifying tunneling tools. It delves into the correlations among DNS resolutions to construct paths that represent the recursive resolution process of DNS. By using central nodes that denote the gateways, these paths are connected and transformed into graph structures. Concurrently, it employs GraphSAGE to aggregate the features of nodes and their edges in the graph, enabling effective detection of DNS tunnels. Additionally, GraphTunnel utilizes the G2M algorithm to capture the statistical features of nodes in the graph and maps them into grayscale images, which are then processed by a CNN for multi-class identification of DNS tunneling tools. Experimental results demonstrate that in non-wildcard DNS scenarios, GraphTunnel achieves a 100% accuracy in DNS tunnel detection, encompassing unknown DNS tunnels. Even in high false-positive environments caused by wildcard DNS, GraphTunnel maintains an F1-Score of 99.78%. Moreover, GraphTunnel can identify DNS tunneling tools with an accuracy rate exceeding 98.57%, enhancing the rapid mitigation capabilities of emergency responders in dealing with malicious DNS tunnels.

Index Terms— DNS tunnel detection, unknown DNS tunnels, wildcard DNS, tunneling tool identification, graph neural networks.

Manuscript received 11 January 2024; revised 2 June 2024 and 1 August 2024; accepted 6 August 2024. Date of publication 14 August 2024; date of current version 22 August 2024. This work was supported in part by CCF-NSFOCUS “Kunpeng” Research Fund under Grant CCF-NSFOCUS 2023013, in part by the National Science Foundation of China under Grant 62372086 and Grant U2336204, in part by Sichuan Natural Science Foundation under Grant 24ZNSFSC0038, and in part by the Financial Support for Outstanding Talents Training Fund in Shenzhen. The associate editor coordinating the review of this article and approving it for publication was Dr. Daisuke Mashima. (Corresponding author: Weina Niu.)
Guangyuan Gao and Jiacheng Gong are with the School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
Weina Niu and Xiaosong Zhang are with the School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China, and also with the Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen 518110, China.
Dujuan Gu is with NSFOCUS Technologies Group Company Ltd., Beijing 100089, China.
Song Li and Mingxue Zhang are with the State Key Laboratory of Blockchain and Data Security and the School of Cyber Science and Technology, Zhejiang University, Hangzhou 310058, China.
Digital Object Identifier 10.1109/TIFS.2024.3443596
1556-6021 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

DNS, as a fundamental internet infrastructure, facilitates the translation of domain names into IP addresses for user access to resources. However, the omnipresence and inherent stealth of DNS render it susceptible to exploitation by hackers for DNS tunneling attacks. A study conducted by the National Institute of Standards and Technology (NIST) [1] revealed that in 2021, an astounding 72% of organizations globally were subjected to DNS attacks. These attacks encompassed Distributed Denial of Service (DDoS) (46%), DNS tunneling (35%), and cache poisoning (33%). Furthermore, the “Global DNS Threat Report” disseminated by EfficientIP in 2022 [2] disclosed that 88% of companies encountered DNS attacks, involving tactics such as DNS phishing, DNS tunneling, and DNS-based malware. Among these DNS attacks, those utilizing DNS tunneling techniques constituted 28%, marking a 4% surge compared to the previous year. On average, these attacks precipitated a financial loss of $942,000, with 24% of enterprises experiencing data exfiltration, imposing severe consequences on both businesses and individuals.

In response to the security threats posed by DNS tunneling, numerous researchers are currently engaged in studies to detect DNS tunnels promptly [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21].

Current DNS tunnel detection methods are primarily categorized into rule-based and model-based approaches [3]. Rule-based methods rapidly identify tunnel traffic by matching packet signatures or comparing specific feature thresholds. However, these methods rely on rule sets generated from specific tunneling software and may fail to effectively detect software deliberately modified by attackers. In contrast, model-based approaches learn features from large-scale raw traffic data, capturing distinctions between benign DNS traffic and tunnel traffic. Leveraging the characteristics of binary classification algorithms, these methods train highly accurate detection models, offering flexibility to adapt to changes in different tunneling software. Nevertheless, existing model-based methods encounter the following challenges:

C1: Suboptimal Accuracy in Detecting Unknown DNS Tunnels and Wildcard DNS. Existing model-based detection methods typically focus solely on surface-level features of domain names or packet payloads, lacking attention to the behavioral structure characteristics of the establishment and attack processes of DNS tunnels. This approach exhibits reduced detection performance when confronted with unknown DNS tunnel attacks and wildcard DNS.
• Multiple existing studies on DNS tunnel detection do not explicitly address the problem of traffic generated by unknown DNS tunneling tools. In studies that do focus on detecting unknown samples, such as [21] and [20], a “leave-one-out” method is employed to partition the dataset. However, these studies have not conducted detection experiments specifically on traffic generated by a completely unknown DNS tunneling tool.
• Wildcard DNS is a technique that permits subdomains to utilize the wildcard * for ambiguous matching. This characteristic endows subdomains with flexibility in terms of length and character arrangement, closely resembling the domain names employed by DNS tunnels for data transmission. When a DNS tunnel detection system employs surface-level domain features such as subdomain length and information entropy for detection, it can easily engender confusion between legitimate wildcard domains and potential DNS tunnels, thereby precipitating instances of false positives.

C2: Suboptimal Accuracy in Distinguishing DNS Tunneling Tools. Despite the utilization of diverse encryption techniques by recognized DNS tunneling tools, there is a notable overlap in certain features such as domain length, information entropy, and packet size. This similarity impedes the model’s ability to discern the nuanced differential features among various tools, thereby leading to diminished accuracy in the multi-classification task pertaining to DNS tunneling tools.

To address the aforementioned challenges, we propose GraphTunnel, a framework based on graph neural networks designed for real-time detection of DNS tunnels and identification of tunneling tools. Specifically, to tackle C1, GraphTunnel filters DNS traffic from network traffic and constructs paths representing the DNS recursive resolution process. The nodes in each path symbolize authoritative domains, with edges mapping correlations among these domains. To enhance the modeling of inbound and outbound traffic dynamics during DNS resolution, a central node is used to denote the gateway. This central node effectively connects individual paths, thus obtaining a comprehensive graph representation. Subsequently, GraphTunnel employs the GraphSAGE [22] algorithm to aggregate node and edge features within the graph. In this way, GraphTunnel fits the DNS recursive resolution process well and extracts unique spatio-temporal features from it, thereby maintaining high robustness even in the scenarios of unknown DNS tunnels and wildcard DNS. To address C2, GraphTunnel utilizes the G2M algorithm to statistically learn the node features within the graph and convert them into grayscale images. A convolutional neural network (CNN) is then applied to process the image, resulting in a multi-classification model for DNS tunneling tools. By this method, GraphTunnel effectively utilizes the statistical information and the capability of CNN, and gradually highlights the differences in node features within the graph, thus achieving high accuracy in distinguishing DNS tunneling tools.

In summary, the primary contributions of this paper are as follows:
• We constructed DNS recursive resolution graphs and employed graph neural networks to detect DNS tunnels by analyzing the different behavioral graph patterns between normal DNS resolution and DNS tunnel resolution.
• We developed the G2M algorithm to improve the multi-classification of DNS tunneling tools by statistically analyzing node feature vectors and organizing them into a grayscale image matrix for effective convolutional aggregation.
• Experimental results demonstrate that GraphTunnel achieves an F1 Score exceeding 99.78% in DNS tunnel detection, maintaining high robustness even against unknown DNS tunnels and wildcard DNS scenarios. Furthermore, it achieves high accuracy exceeding 98% in the multi-classification of DNS tunneling tools.

The structure of this paper is as follows: Section II furnishes an overview of current endeavors in detecting DNS tunnels. Section III delves into the architecture of the GraphTunnel system, elucidating the composition and interrelationships of each module. In Section IV, we expound on the experimental environment settings and dataset collection, showcasing the results of the experimental evaluation. The discussion of research limitations is succinctly presented in Section V, culminating in the overall conclusion in Section VI.

II. RELATED WORK

Wang et al. [3] conduct a comprehensive analysis of nearly all detection methods developed from 2006 to 2020, categorizing DNS tunnel detection methods into two types: rule-based detection and model-based detection. Building on this, we analyze some recent studies based on several indicators. For instance, methods with an accuracy rate exceeding 90% are classified as HA (High-Accuracy), those encompassing datasets from five or more tunneling tools are labeled as ETC (Extensive Tool Coverage), and those capable of detecting DNS tunnels on different platforms are categorized as CP (Cross-Platform). Additionally, we introduce specific metrics to assess the capabilities of these methods, including UDT (Unknown DNS Tunnel), WD (Wildcard DNS), TTI (Tunneling Tool Identification), and RM (Real-time Monitoring). The details are outlined in Table I.


TABLE I
COMPARISON OF EXISTING METHODS FOR DNS TUNNELING DETECTION

A. Rule-Based Detection

Rule-based detection can be further divided into two categories: signature-based methods and threshold-based methods [3].

1) Signature-Based Method: The signature-based method detects DNS tunnels through the matching of specific signatures. These signatures are typically derived from professionals analyzing and extracting static features from traffic packets. For instance, the traffic packet of dnscat2 [23] will contain the content “dnscat”.

Sheridan and Keane [4] employ snort with Iodine feature rules to detect anomalous background traffic generated by Iodine through traffic signature analysis and baseline comparison. However, this method struggles to detect more complex covert channels.

Similarly, Adiwal et al. [5] propose a DNS intrusion detection system (DID) based on snort. This approach extracts corresponding features and generates signatures for IDS rules by simulating DNS tunnel attacks, DNS amplification attacks, and DNS DoS attacks. Nevertheless, it is confined to known attack types and cannot effectively cope with new types of DNS attacks.

Salat et al. [6] propose a method for detecting DNS tunnel attacks in cloud environments based on elastic stack. This approach analyzes DNS traffic data in the cloud environment and formulates rules for detection using suricata. However, it tends to increase the false positive rate during cloud migration.

Ghosh et al. [7] propose a multi-stage DNS tunnel detection technique. In the second stage, the method detects SSH handshakes in DNS tunnel traffic, analyzes data streams to retrieve base64 encoded SSH signatures, and formulates rules for matching detection. However, the signatures in this method can only match known DNS tunnel traffic in specific scenarios.

Signature-based methodologies swiftly discern suspected tunnel traffic by matching distinct content within packets, providing effective detection of known DNS tunnels. However, they are contingent on signature rule sets generated for specific tunnel software, making them prone to false negatives when facing customized or modified tunneling tools. Furthermore, adversaries can deliberately alter or obfuscate signatures within packets, rendering the signature-based approach ineffective.

2) Threshold-Based Approach: This approach detects DNS tunnels by scrutinizing specific features, such as the quantity of distinct hostnames and the cache hit rate. It hinges upon the comparative analysis of threshold values associated with these features, facilitating the differentiation between benign and tunneling DNS traffic.

Ozery et al. [8] propose an information-based real-time detection method called ibHH. It is deployed on DNS servers, capturing timestamps and domain names transmitted to registered domains to calculate information volume. When the volume exceeds a set threshold, it flags the domain as malicious. However, the method is still susceptible to false positives due to the impact of wildcard DNS resolution.

Sani and Setiawan [9] investigate a method for detecting DNS tunnels using elasticsearch. The approach leverages watcher to assess the diversity of hostnames, and flags the traffic as a tunnel if it exceeds 300. However, the method adopts an empirically derived threshold, which necessitates adaptation to different environments.

Ellens et al. [10] detect tunnels by analyzing traffic features, such as the byte count of each flow, and applying statistical detection methods. They set corresponding thresholds for different features, but the false positive rate is high.

Paxson et al. [11] devise a principled method. This method employs a configurable threshold to constrain the information transfer through DNS. The core concept involves utilizing lossless compression for estimating the information entropy of the entire DNS query flow to obtain an upper bound on the information quantity. However, the approach is susceptible to traffic splitting across multiple domains by the attacker.

Ishikura et al. [12] propose a method based on DNS cache properties. This approach leverages the characteristics of cache hits or misses generated on cache servers, introducing features such as cache hit rate, access hit rate, and access miss count. It generates rule filters and LSTM filters for tunnel detection.

The threshold-based detection method of DNS tunnels offers the advantage of adjustable sensitivity by site-specific security policies, thereby providing configurable tunnel traffic detection. However, this method faces challenges in dealing with low-traffic tunnels, which can evade detection by maintaining traffic below the threshold. Furthermore, experiential judgment is necessary for setting the threshold. An excessively high threshold may lead to under-detection, while an overly low threshold may increase the false positive rate. The threshold method is also susceptible to evasion tactics employed by attackers, such as distributing traffic across multiple domains.

B. Model-Based Detection

Model-based detection involves learning crucial features extracted from extensive network data packets, including packet size, TTL (Time to Live), DNS query type, and DNS request domain name length. These features are then utilized in training machine learning or deep learning algorithms to construct robust detection models.

Ibraheemi et al. [13] propose a method of hybrid genetic algorithm and support vector machine. This method simulates the utilization of four protocols such as HTTPS and FTP. By capturing traffic during operation and applying a genetic algorithm for feature selection, it discerns the optimal subset of features and amalgamates it with an SVM classifier for detection.

Sakarkar et al. [14] propose a method grounded in natural language processing. They utilize wireshark to capture malicious network packets, extract features such as timestamps and message information, fit them through word embeddings, and employ LSTM and GRU algorithms for detection.

Lal et al. [15] propose a hybrid deep learning architecture named DNS-Tunnet. This approach transforms DNS queries into raw text, utilizing a CNN for automatic feature extraction. The extracted features are subsequently input into an SVM classifier for binary classification.

D’Angelo et al. [16] extract features from the DNS query payload, organize them into a 6 × 4 matrix to form a two-dimensional representation, and subsequently employ the CNN algorithm for binary classification.

Bai et al. [17] propose a method for identifying application behavior in DNS tunnels based on spatiotemporal information. They simulate different user application behaviors to capture DNS traffic, divide each DNS traffic into equal-length segments, and extract packet length and timing characteristics from them. Features are selected through the information gain rate index and then applied to three machine learning algorithms: bayes net, decision tree and random forest for classification.

Altuncu et al. [18] harness the Alexa top million websites and tools like Iodine for data collection. They developed a deep feed-forward neural network model for classification and conducted real-time testing in a network environment, achieving better real-time detection performance compared to the study by [24].

Shafieian and Zulkernine [19] launch various attacks on enterprise networks within the AWS cloud to capture traffic, and then utilize feature engineering and integrated learning methods to detect low-feature network intrusions.

Liang et al. [20] apply the “leave-one-out” method to generate multiple datasets. They design a FECC model that integrates CNN and k-means clustering, employing sliding windows to distill implicit features from the original DNS payload. Furthermore, they utilize k-means clustering to assess the homogeneity and exclusivity of the features, which are then applied in classification tasks and the detection of samples from unknown classes.

Wang et al. [21] propose KRTunnel, a pioneering method for capturing DNS tunnel traffic from the Android side for detection. This method employs a User-Agent check within traffic packets to filter out Android traffic. Subsequently, it extracts features such as subdomain average entropy and TTL from DNS requests and responses, using the isolation forest algorithm for binary classification.

In summary, model-based methods train on extracted features from datasets, showing great potential in DNS tunnel detection. They can effectively resist attackers’ attempts to evade detection by modifying signatures and maintain scalability in complex network topologies. However, current model-based research primarily focuses on surface-level features of domain names or packet payloads, lacking attention to the behavioral structure characteristics of DNS tunnel establishment and attack processes. This often results in high false positive rates when dealing with unknown DNS tunnels and wildcard DNS resolution scenarios. Additionally, there is notable overlap in certain features such as domain length, information entropy, and packet size, making it challenging to identify the specific DNS tunneling tools being used.

Therefore, we propose GraphTunnel, a model-based approach that constructs DNS recursive resolution graphs. This method accurately simulates the DNS resolution process and captures the differing behavioral patterns between normal DNS resolution and DNS tunnel resolution. This enhances the differentiation in feature space, ensuring our method remains robust even when faced with unknown DNS tunnels and wildcard DNS resolution scenarios. Moreover, GraphTunnel effectively utilizes statistical information and the capabilities of CNNs, gradually highlighting the differences in node features within the graph, achieving high accuracy in distinguishing between various DNS tunneling tools.

III. METHODOLOGY

To address the issues mentioned in Section I, we propose GraphTunnel. This framework is engineered for the real-time detection of DNS tunnels and the identification of tunneling tools. As illustrated in Figure 1, it encompasses modules for traffic input, traffic parsing, DNS tunnel detection, and tunneling tool identification. In the subsequent sections, we provide a detailed description of each module.


Fig. 1. The framework of GraphTunnel.

A. Traffic Input Module information. However, these DNS flows originate from differ-
In the traffic input module, GraphTunnel can accept two ent platforms, and due to variances in the kernel and protocol
modes of input. The first mode, real-time monitoring, employs stack implementations, certain encapsulation formats such as
the cooked format may vary across platforms. For instance,
scapy [25] to oversee the specified network card interface
traffic from Android devices may exhibit the Linux cooked
and utilizes multi-threading for traffic processing. One thread
capture v2 identifier, while Linux traffic utilizes Linux cooked
is dedicated to traffic capture and storage in the message
capture v1, and Windows traffic is presented as Ethernet.
queue, while another retrieves packets from it for parsing.
This discrepancy may arise from specifying different values
This multi-threaded design circumvents the issue of traffic
with the -i parameter during tcpdump traffic capture, directly
capture speed being constrained by parsing and detection,
impacting the format of captured data and the included infor-
thereby enhancing overall monitoring and detection effi-
mation. Therefore, We utilize scapy version 2.5.0 [25] to
ciency. The second mode inputs traffic packets through pcap
address cross-platform compatibility issues, ensuring unifor-
files. To expedite data analysis, GraphTunnel partitions pcap
mity in parsing results for traffic captured across different
packets into smaller segments. Subsequently, multi-threading
platforms, such as Windows, Linux, and Android. Concur-
is employed to concurrently process these smaller pcap
rently, when dealing with DNS layer data, it is necessary
packets.
to consider the varying impacts of different resource records
on the parsing process, which may otherwise lead to parsing
B. Traffic Parsing Module errors and the inability to retrieve information corresponding to
the intended layer. Moreover, due to disparate implementation
In the traffic parsing module, GraphTunnel receive the pack- approaches among various tunneling tools, certain response
ets transmitted from the traffic input module and filter out DNS traffic may lack DNS layer headers. In such instances, Graph-
traffic based on protocols and port numbers. Subsequently, Tunnel directly extracts and analyzes the raw data from the
we uniformly parse DNS traffic from different platforms final layer of the packet.
to obtain data at the DNS layer or RAW layer. We then
analyze DNS recursive resolution on this data, generating 3) DNS Recursive Parsing: After extracting information
corresponding recursive resolution paths. from the DNS layer, we conduct a recursive parsing analysis
1) DNS Traffic Filtering: We classify network packets as of this data to generate corresponding recursive parsing paths.
DNS traffic based on an analysis of the UDP protocol and To preserve the integrity of the DNS resolution process,
port numbers. Initially, we ascertain the presence of a UDP we consider both DNS request and response traffic, doing a
layer within the packet. Subsequently, header information is one-to-one mapping based on the request domain. However,
extracted from this layer, and an examination is conducted on mismatches may occur during DNS domain resolution, where
both the source and destination ports. If either port is identified it is not always possible to establish a one-to-one correspon-
as 53, the packet is categorized as DNS traffic. Otherwise, it is dence between requests and responses. Specifically, if only
classified as non-DNS traffic. request data is presented, and subsequent response traffic
2) DNS Layer Resolving: Upon obtaining DNS traffic, is absent, GraphTunnel parses the request traffic, treating it
further packet analysis is necessary to extract the DNS layer as an isolated node. Conversely, if only response traffic is

Authorized licensed use limited to: UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL. Downloaded on September 30,2025 at 17:14:27 UTC from IEEE Xplore. Restrictions apply.
7710 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 19, 2024

TABLE II 63 bytes. DNS tunnels, aiming to transmit concealed data,


U SER C OVERAGE S CORES FOR D IFFERENT Q UERY T YPES often employ a multi-level subdomain structure, with each
level having a considerable length, yet still confined within the
63-byte limit. In contrast, conventional DNS queries, devoid
of the need for data concealment, exhibit a relatively straight-
forward and shallow hierarchy in their subdomain structure,
with shorter lengths at each level. Consequently, for a given
DNS traffic, we can statistically analyze the depth of the
domain name layers, extract the length of each subdomain,
and determine the maximum length among them, thereby
effectively distinguishing between DNS tunnel and benign
presented, and preceding request traffic is missing, we observe traffic.
that the response traffic data will contain the content of the
n
corresponding request. GraphTunnel prioritizes extracting and F3 = |Dmax | = max |Dsubi |, 0 < |Dsubi | ≤ 63
i=1
parsing the request part before parsing the response part.
In some instances of real-world traffic, we have encountered
where Dmax represents the maximum length domain name
peculiar cases where the status code is marked as a response
in subdomains, and Dsubi represents the i-th layer of the
packet, yet the data only encompasses the request portion
subdomain.
without any response data. GraphTunnel can still handle this normally according to the aforementioned rules.

4) Feature Extraction: We extract eight features from the traffic data packets, which effectively capture the distinctions between benign DNS traffic and DNS tunnel traffic.

a) Record type corresponding scores: Herrmann et al. [26] conducted a classification and statistical analysis of massive DNS logs, generating datasets of different query types, including the query volume, domain number, and other indicators for each type. The findings reveal that the user coverage rates of A and AAAA type queries are the highest, reaching 99.948% and 82.082%. This indicates that the majority of users employ these two query types, while other types commonly seen in DNS tunnels, such as TXT and NULL, are used by very few users. Consequently, we incorporate this information as a feature by assigning a score to the user coverage for each record type. Record types that are not encompassed in the paper’s results suggest an absence of user queries, thus we assign their corresponding scores as zero. Detailed data is presented in Table II.

b) Length of subdomain: The stipulated maximum length for a fully qualified domain name (FQDN) is confined to 255 bytes, as delineated in the DNS protocol RFC 1035 [27]. The second-level domain name, a relatively immutable component of the domain name structure, serves as the principal identifier and brand of the domain name. Conventional DNS queries are primarily utilized for website access or server resolution, hence the subdomain name length is generally not extensive. However, to obscure transmission, DNS tunnels frequently construct elongated subdomain names subsequent to the second-level domain name to accommodate an increased data volume.

$$F_2 = |D_{sub}| = |D_{FQDN} - D_{second}|, \quad 0 < |D_{FQDN}| \le 255$$

where $D_{FQDN}$ represents the fully qualified domain name, $D_{second}$ represents the second-level domain, and $D_{sub}$ represents the subdomain, which is the part before the second-level domain.

c) Maximum length of subdomain: The DNS protocol stipulates that the length of each subdomain should not exceed

d) Count of prolonged consecutive consonant strings: Subdomains within DNS tunnels often comprise a plethora of random letter combinations to maximize the transmission of information, resulting in an abundance of elongated consonant strings. In contrast, conventional DNS subdomains typically employ meaningful word groups, making the generation of excessive elongated consonant strings unlikely. Experimental results indicate that in the longest subdomain, noticeable differences in features emerge when the count of consecutive consonant characters exceeds 4. For benign subdomains, the count of continuous consonant strings with lengths greater than 4 is typically limited to the range of 0-1, whereas DNS tunnel subdomains exhibit multiple instances of such strings. Hence, the tally of continuous consonant strings in the longest subdomain can be an effective feature for distinguishing DNS tunnels from benign traffic. It is important to note that the hyphen “-” is a frequently occurring character in domain names and should be considered a delimiter during enumeration rather than being completely disregarded.

$$F_4 = \sum_{i=1}^{m} I(|C_i| > 4), \quad C_i \in \{D_{max}\}$$

where $C_i$ represents the i-th consecutive consonant string in the subdomain, $m$ denotes the number of consecutive consonant strings, and $I$ stands for the indicator function, determining whether the condition within the brackets is true.

e) Entropy of the longest subdomain: Information entropy can quantify the randomness and uncertainty of a string. A higher entropy indicates stronger randomness and greater uncertainty, while a lower entropy suggests weaker randomness and more regularity. In DNS tunnels, encoding techniques are often employed to conceal transmitted data, resulting in more random letter combinations and, consequently, higher entropy values for subdomains. Conversely, conventional DNS subdomains employ semantically coherent word groups, culminating in a comparatively diminished information entropy. Therefore, the information entropy of the longest subdomain can be harnessed as a distinguishing feature
between DNS tunnel and benign traffic.

$$F_5 = -\sum_{i=1}^{n} p(x_i) \log p(x_i).$$

Here, $p(x_i)$ represents the frequency of $x_i$ appearing in the entire string.

f) Time-to-live (TTL) value: The TTL value in DNS query results indicates the lifespan of the corresponding resolution record, representing the maximum time the resolution result can be cached by DNS servers. Under regular circumstances, TTL values are typically set to longer durations to minimize the necessity for repeated queries to nameservers. However, in DNS tunnels, TTL values are deliberately set to be short to ensure the rapid expiration of cached resolution records, facilitating the timely acquisition of the latest results for real-time data transmission. In essence, the TTL value distribution in tunnel traffic tends to be concentrated within a shorter time frame, whereas the TTL values in benign traffic are relatively longer and exhibit a broader distribution. Consequently, by statistically analyzing and comparing the TTL values in the returned packets of DNS traffic, we can identify an effective characteristic for detecting DNS tunnels.

g) Packet size in bytes: DNS tunnels necessitate the concealment of supplementary data within DNS packets, resulting in considerably larger sizes for both corresponding request and response messages compared to benign DNS traffic. In contrast, conventional DNS queries merely require the inclusion of fundamental query information, resulting in a byte size distribution that falls within a relatively confined range. Therefore, the byte size of resolved DNS data packets can be statistically analyzed and utilized as a distinguishing feature between tunneling traffic and benign traffic.

h) Average response time: For each DNS request, we can document the timestamp at which the request message is dispatched, as well as the reception time of the corresponding response message. The difference between these two values represents the total response time for the query. Conventional recursive DNS resolution necessitates interaction with multiple authoritative domain name servers, with each query requiring a certain amount of time, thereby influencing the overall response time. In contrast, DNS tunneling, designed for real-time data transmission, necessitates prompt responses. Typically, attackers employ DNS tunneling tools on the server side to facilitate DNS responses, resulting in shorter total query response times. By dividing the total response time by the number of recursive layers traversed by the query, we can obtain the average response time per layer. The average response time for tunneling traffic is noticeably shorter than that for regular recursive queries. Therefore, this can serve as a characteristic feature of the edges in the query path.

$$F_8 = \frac{T_{response} - T_{request}}{L_n}$$

Here, $T_{response}$ represents the response time, $T_{request}$ represents the request time, and $L_n$ represents the number of recursive layers traversed during the query.

C. DNS Tunnel Detection Module

In this module, we map the parsing path through the central node into a graph form, and apply GraphSAGE to aggregate neighbor node features to detect DNS tunnels.

1) Graph Generation: To promptly detect DNS tunnel traffic, it is imperative to minimize the volume of traffic processed during each detection cycle, while maintaining a high degree of detection accuracy. Consequently, we have empirically set the graph size (K) to a small value of 20, determining the number of recursive query paths to be mapped and constructed. We establish a graph structure by connecting individual query paths through a central node, with each path representing a complete DNS recursive resolution, and each node denoting a unique domain. Commencing from the initially queried domain, whenever a recursive query is forwarded to a new authoritative domain server, we ascertain whether this server node preexists in the graph. If it does, we establish a direct connection. Otherwise, we instantiate a new node to symbolize the server and subsequently connect. Each node encapsulates seven attributes, including information entropy and TTL values, which are detailed in Section III-B. The edges within the path delineate the recursive relationship among domain name resolutions, with the edge attribute corresponding to the average response time of the domain name resolution process. Ultimately, we obtain a comprehensive recursive query path extending from the root node to the leaf node. We replicate the aforementioned procedure, mapping and constructing K independent recursive query paths into a relational graph enriched with features associated with nodes and edges.

2) DNS Tunnel Prediction: GraphSAGE [22], as an effective algorithm for aggregating neighbor features and compatible with various types of graphs, is particularly suitable for DNS traffic analysis applied to the construction of recursive query relationship graphs. Specifically, we first set the recursive depth n for the central node to aggregate neighbor features. Then, we aggregate the features of neighbor nodes from the 1st to the nth order for the central node in stages. After mapping through multiple graph convolution layers and activation functions, we obtain the fused expression of node features at different orders. Consequently, the central node aggregates and encodes the topological structure information of the entire DNS query relationship graph and the feature information of each node. Finally, we use the comprehensive feature vector of the central node as the embedded expression of the entire graph, and input it into the linear layer for computation, thereby achieving binary classification of traffic.

D. Tunneling Tool Identification Module

In the tunneling tool identification module, we propose a G2M algorithm as depicted in Algorithm 1.

Upon detecting DNS tunnels, we apply statistical methods to analyze each category of features in the incoming graph embedding vector, extracting seven statistical attributes: variance, mean, standard deviation, range, median, skewness, and kurtosis. These features belong to the categories of central tendency, dispersion, and distribution shape, and are commonly used in statistical analysis across various fields [28],
Algorithm 1 G2M Algorithm
Input: Graph G
Output: Matrix M
V ← GraphEmbedding(G)
M ← InitializeEmptyMatrix(7 × 7)
for each feature F_i in V do
    M[i] ← [var(F_i), mean(F_i), std(F_i), range(F_i), median(F_i), skewness(F_i), kurtosis(F_i)]
end for
return M
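As a rough illustration of Algorithm 1, the sketch below computes the seven statistics for each feature category and stacks the rows into a matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `describe` and `g2m` are hypothetical, the graph embedding is assumed to be available as one list of per-node values for each feature category, population (rather than sample) variance is used, kurtosis is reported as excess kurtosis, and a constant feature is assigned zero skewness and kurtosis. The paper does not specify any of these choices.

```python
import math
import statistics
from typing import List, Sequence


def describe(values: Sequence[float]) -> List[float]:
    """Return [variance, mean, std, range, median, skewness, kurtosis] for one feature.

    Assumptions (not stated in the paper): population moments are used and
    kurtosis is the excess kurtosis (a normal distribution gives 0).
    """
    n = len(values)
    if n == 0:
        raise ValueError("a feature category needs at least one value")
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    std = math.sqrt(var)
    value_range = max(values) - min(values)
    median = statistics.median(values)
    if var == 0.0:
        # Constant feature: higher moments are undefined, so use 0 by convention.
        skewness = 0.0
        kurtosis = 0.0
    else:
        third_moment = sum((v - mean) ** 3 for v in values) / n
        fourth_moment = sum((v - mean) ** 4 for v in values) / n
        skewness = third_moment / (var * std)
        kurtosis = fourth_moment / (var * var) - 3.0
    return [var, mean, std, value_range, median, skewness, kurtosis]


def g2m(feature_columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Map k feature categories to a k x 7 matrix, one row of statistics per category."""
    return [describe(column) for column in feature_columns]
```

With seven feature categories as input, `g2m` returns a 7 × 7 list of lists. Rescaling that matrix to pixel intensities before passing it to a CNN is a separate step that is not shown here.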

[29], [30]. They are relatively simple and efficient to compute, sufficiently describing the statistical characteristics of the data while avoiding computational complexity. This process results in a 7 × 7 two-dimensional matrix. Then we arrange this matrix to form a grayscale image and input it into a convolutional neural network. The convolutional layers in the network capture correlations between statistical features, while the pooling layers aggregate feature information, progressively abstracting and compressing statistical features. Through this methodology, the differences in node features within the graph are amplified, enabling accurate identification of various DNS tunneling tools.

IV. EVALUATION

We have conducted an evaluation of GraphTunnel’s detection performance and generalization capabilities. This section presents the results of these experiments.

A. Experiment Settings

Environmental Setup: The traffic data is collected on four distinct platforms: Windows 11 AMD64, Kali Linux x86_64, CentOS 7 x86_64, and Thunder Simulator 9. The evaluation of GraphTunnel is conducted on a 16-node GPU cluster. Each node in this cluster is equipped with an Intel(R) Core(TM) i9-10920X CPU operating at 3.50 GHz, 256 GB of RAM, and two NVIDIA RTX 3080 GPUs. The system runs on Ubuntu 20.04 LTS with Linux kernel v5.4.0. We deploy GraphTunnel in Python 3.10.

B. Datasets Description

Existing model-based detection methodologies predominantly derive their datasets from two categories: publicly accessible datasets and self-collected datasets. These datasets, however, have limits on the diversity, magnitude, and accessibility of DNS tunnel traffic.

Studies such as [8], [13], and [16] use publicly available datasets, encompassing both benign DNS traffic and DNS tunnel traffic. Despite this, these datasets incorporate a limited variety of DNS tunneling tools, typically including just three to five. Furthermore, Ozery et al. [8] used the publicly available ZIZA dataset [35], which only provides csv files containing information such as user_ip, domain, timestamp, and entropy, without including the original pcap files. This limitation prevents the application of other methods that require extracting more contextual information from raw traffic.

Research such as [14], [17], [19], and [20] independently gather benign DNS traffic and DNS tunnel traffic by emulating real-world attack-defense environments. Nevertheless, the spectrum of DNS tunneling tools incorporated in these studies is restricted to merely three to five types, and a substantial fraction of the datasets remains publicly inaccessible.

To assess the overall performance of GraphTunnel, we conduct experiments using four distinct datasets. The first, denoted as $Dataset_{our}$, comprises a substantial amount of benign traffic alongside DNS tunneling traffic generated by ten different DNS tunneling tools. The second, labeled as $Dataset_{wildcard}$, encompasses traffic data associated with wildcard domains. The third dataset, referred to as $Dataset_{CIC}$, is sourced from publicly available data [36]. The final dataset, denoted as $Dataset_{korving}$, is created by [37].

1) $Dataset_{our}$: We obtained the top 1,000,000 domain names from the Cloudflare [38] website. Utilizing a distributed approach, we invoked browser instances to simulate genuine user interactions with the assigned sublists of domain names. Concurrently, Wireshark is employed to monitor and collect the generated standard DNS traffic data. In addition, we replicated a real intranet environment and utilized ten DNS tunneling tools, including iodine [39], dnscat2 [23] and dns2tcp [40], to establish DNS tunnels between the local intranet and public servers. Subsequently, we masqueraded as attackers and operated various intranet information collection tools, such as Ladon [41], linEnum [42], and gather [43], on different operating systems to acquire a variety of sensitive information from the target system. This process allowed us to capture the DNS tunnel traffic during the interaction. Detailed tunneling traffic data is illustrated in Figure 2. It’s worth noting that due to the varying data lengths transmitted by different tunneling tools, the volume of traffic collected also varied.

Fig. 2. Traffic data collected by various tunneling tools.

To evaluate the generalization ability of GraphTunnel in detecting unknown DNS tunnel traffic, we partition the collected data. Table III comprises 2,012,494 instances of benign DNS traffic and 375,810 instances of DNS tunneling traffic generated by five DNS tunneling tools such as Iodine [39] and dnscat2 [23]. This dataset is utilized for training the detection
model. Table IV contains an additional 587,049 instances of traffic generated by five other DNS tunneling tools, employed to evaluate the model’s performance on unknown samples. It is important to note that in the real world, the balance between regular DNS resolution traffic and DNS tunnel traffic is skewed, with DNS tunnel traffic constituting only a small portion. Consequently, our dataset does not adhere to a one-to-one balanced ratio. In an effort to facilitate further research in this domain, we are making our dataset publicly available on https://github.com/ggyggy666/DNS-Tunnel-Datasets.

TABLE III
TRAFFIC DATA FOR DIFFERENT DNS RECORD TYPES BY VARIOUS TUNNELING TOOLS

TABLE IV
TRAFFIC DATA FOR DIFFERENT DNS RECORD TYPES BY UNKNOWN TUNNELING TOOLS

2) $Dataset_{wildcard}$: We collected the Top 1000 domains from Cloudflare [38] and adjusted the relevant scripts of the subdomain enumeration tool OneForAll [44] to verify whether these domains have adopted wildcard resolution. Figure 3a depicts the distribution of domains that enable wildcard resolution among the top 1,000 domains. Specifically, we successfully detected 913 accessible live domains within the Top 1000 domains. Subsequently, utilizing OneForAll, we obtained subdomains for each live domain and conducted a check for enabled wildcard resolution on these subdomains, revealing 31,305 subdomains supporting wildcard resolution. Figure 3b presents a word cloud generated from a subset of domains with enabled wildcard resolution, including well-known international corporations like google.com, amazon.com, and facebook.com, highlighting the widespread application of wildcard domains in the network ecosystem.

Fig. 3. The proportion of Accessible, Inaccessible, Wildcard and Non-Wildcard Domains on Cloudflare TOP 1000.

Finally, to simulate the pattern of wildcard resolution, we employed a character variable comprising numerals, letters, and permissible domain characters such as “-”. We iterated through the input domain name list, randomly selecting strings of lengths ranging from 1 to 64 for each domain, and appended these to the original domain name to generate subdomains. Subsequently, we simulate browser access to capture the DNS resolution and response traffic generated during the access process, totaling 639,208 instances.

3) $Dataset_{CIC}$: $Dataset_{CIC}$ is a publicly available security dataset known as CIC-Bell-DNS-EXF-2021 [36]. This dataset encompasses 270.8 MB of DNS traffic, comprising various file types such as audio, compressed files, exe, images, text, and video. To simulate real attack scenarios, researchers conduct a five-day experiment involving both mild and severe attacks. Each day comprises a mixture of benign traffic and attack traffic generated by various types of file transfer attacks. In the case of severe attacks, the benign traffic to attack traffic ratio is 6:4, while in mild attacks, this ratio reaches 9:1. The final dataset encompasses 323,698 samples from severe attacks, 53,978 samples from mild attacks, and 641,642 distinct benign samples.

4) $Dataset_{korving}$: The dataset is created by Korving and Vaarandi [37]. They develop a configuration tool named DACA that executes end-to-end automated attack scenarios and extracts security datasets from the analyzed systems. Specifically, they deploy DNS servers implemented by different tools to respond to DNS requests, including BIND9, CoreDNS, Dnsmasq, and PowerDNS. They also simulate real scenarios and execute three DNS tunneling tools for command and control (C2) and file transfer, which include iodine, dns2tcp, and dnscat. Finally, they generate a total of 12,789 C2 traffic and 3,034,833 file transfer traffic.

C. Evaluation Metrics

In this study, we employ a variety of binary classification evaluation metrics to assess the performance of our DNS tunnel detection model.

$$Acc = \frac{TP + TN}{TP + TN + FP + FN}$$

$$Recall = \frac{TP}{TP + FN}$$
$$Precision = \frac{TP}{TP + FP}$$

$$F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$$

where TP (True Positives) represents the count of correctly labeled DNS tunnel traffic, and TN (True Negatives) denotes the correct identification of benign DNS traffic. FN (False Negatives) signifies the misclassification of DNS tunnel traffic as benign, while FP (False Positives) indicates benign traffic incorrectly identified as DNS tunnel traffic.

D. Comparison Methods

To evaluate our approach, we select three categories of methods for comparison, including a set of baseline methods, multiple GNNs, and ensemble learning methods. These selected methods are either well-recognized benchmarks within the industry for multi-method comparisons or have demonstrated superior detection performance in recent years. Specifically, the methods considered are as follows:

1) Baseline Methods: D’Angelo et al. [16]: This method extracts 22 features from DNS query payloads, arranges them into 6 × 4 two-dimensional images through padding with two arbitrary constants, and utilizes the CNN algorithm for DNS traffic classification.

Mahdavifar et al. [45]: This method extracts a total of 30 stateful and stateless features, employing five machine learning algorithms such as GNB and RF for DNS tunnel detection.

Suman et al. [46]: This method categorizes DNS traffic features into lexicon-based features, DNS statistics-based features, and third-party-based features. It applies five machine learning algorithms such as SVM and KNN to train the DNS tunnel detection model.

Filippo et al. [47]: This method proposes a prototype of a protocol tunnel detector that combines machine learning and deep learning. It identifies anomalous connections deviating from the typically established network connections to detect DNS tunnels.

2) Multiple GNNs: We integrate our approach with multiple Graph Neural Networks (GNNs) for detection and classification, including GraphSAGE [22], GCN [48], GAT [49], and GIN [50]. These GNNs are implemented using PyTorch Geometric (PyG).

3) Ensemble Learning Methods: Chowdhary et al. [51]: This method employs query length and entropy as two primary features and integrates Gaussian Naive Bayes, Random Forest, Decision Tree, and K Nearest Neighbours algorithms to detect DNS tunnels.

E. Evaluation Results

To evaluate the effectiveness of GraphTunnel, we systematically address the following questions and design corresponding experiments for validation.

RQ1: How well does GraphTunnel perform in detecting DNS tunnels?

To ascertain the detection capabilities of GraphTunnel, we conduct relevant experiments using the dataset presented in Table III. We partition the dataset in a 6:4 ratio, allocating 60% for training and the remaining 40% for testing. Through cross-validation experiments, we determine that a batch_size of 64 and a learning rate of 0.005 are the optimal parameters. We select four comprehensive graph neural network algorithms for comparison, including GraphSAGE, GCN, GAT, and GIN. Ensuring equivalent computational resources, we apply each algorithm for training on the same dataset and perform predictions on the test set. The specific outcomes are illustrated in Table V.

TABLE V
COMPARATIVE RESULTS OF GNNS ON THE DATASET

The experimental results table reveals that all four GNNs exhibit robust performance on the given binary classification task. This exemplary performance underscores the effectiveness of our proposed detection method based on GNN.

GraphSAGE employs a random neighbor sampling technique, aggregating information from neighboring nodes to infer the label of each node. In the presence of imbalances within the dataset, GraphSAGE adeptly captures the characteristics of tunneling traffic samples, demonstrating superior feature extraction capabilities. GCN employs analogous neighborhood aggregation strategies, effectively utilizing information from adjacent nodes for node classification.

GAT introduces an attention mechanism, enabling the model to focus on nodes crucial for the classification task. This mechanism enhances the discriminative power of the model by directing attention to key nodes in the graph. GIN, recognized for its high flexibility as a graph neural network algorithm, exhibits stable performance robust to imbalanced data interference. It accurately discerns disparities between benign and tunneling traffic, thereby achieving outstanding overall performance.

RQ2: How does the generalization capability of GraphTunnel perform against traffic from unknown DNS tunneling tools?

In the real world, unknown DNS tunneling tools may operate on various operating systems, including Windows, Linux, and Android. Owing to the disparities in traffic collection methods across different operating systems, the traffic input into the DNS tunnel detection model may encounter parsing issues stemming from format variations, thereby impeding the model’s detection efficacy.

GraphTunnel considers this and adapts to DNS tunnel traffic collected from different operating systems. Specifically, the DNS tunnel tool traffic used in Table IV comes from different operating system terminals. The Windows terminal collects traffic from the cobaltstrike [52] and tcp-over-dns [34], the Linux terminal collects traffic from the ozymandns [53] and dns2tcp [40], and the Android terminal collects traffic from the
AndIodine [54]. The distribution of these tools is presented in Figure 4.

Fig. 4. The proportion of tool traffic on different operating systems.

To evaluate the detection performance of GraphTunnel against unknown DNS tunneling tools across platforms, we conduct the second experiment using the data from Table IV. Under the same experimental conditions as RQ1, we employ the pre-trained model to predict the traffic from these tunneling tools, and the results are presented in Table VI. The results demonstrate that GraphTunnel maintains a 100% detection accuracy even when facing unknown tunneling tools. This is attributed to GraphTunnel’s consideration of both request and response traffic, matching them one by one and incorporating authoritative domain name servers into the DNS resolution process, effectively constructing the spatial structure of the domain resolution process. Simultaneously, it captures the temporal cost of the DNS resolution process, ingeniously transforming DNS resolution into embeddings of spatiotemporal features.

TABLE VI
DETECTION OUTCOMES FOR UNKNOWN TUNNELING TOOLS

By intelligently aggregating the features of neighboring nodes through graph neural network algorithms, GraphTunnel efficiently detects and dissects the complex graph structures and patterns within traffic data. This leads to an increasing disparity in the characteristics between benign DNS traffic and DNS tunneling traffic. Therefore, even for unknown DNS tunneling tools, GraphTunnel consistently achieves robust detection performance.

RQ3: Is GraphTunnel superior to baseline methods?

To evaluate the overall performance of GraphTunnel, we conduct experiments using $Dataset_{CIC}$ under the mentioned experimental setup and compare the results with baseline methods. Additionally, we employ GraphTunnel and the CNN model trained on $Dataset_{CIC}$ to predict the C2 and FileTransfer categories of the unknown dataset $Dataset_{korving}$. The detailed comparative outcomes are presented in Table VII.

The tabulated results demonstrate that in the DNS tunnel detection task, all baseline methods achieve F1 Scores exceeding 90%, with Mahdavifar et al. [45] notably reaching an exceptional 99.97%. This outstanding performance can be attributed to the adoption of the random forest algorithm, coupled with bootstrapping and feature random selection, effectively mitigating the risk of overfitting and yielding favorable outcomes on imbalanced datasets. Suman et al. [46] apply various machine learning techniques such as SVM and KNN, achieving an F1 score of 98.9% through meticulous adjustment of hyperparameters and integration of advanced feature selection techniques. However, its robustness is comparatively lower, achieving an F1 score of only 57.82% when confronted with other machine learning algorithms like MLP.

Filippo et al. [47] combine unsupervised and supervised methods like SVM for binary classification, achieving a commendable F1 Score of 95.6%. However, this approach slightly lags compared to other methods, possibly due to data truncation during processing, leading to information loss and impacting overall performance. D’Angelo et al. [16] arrange features into two-dimensional images and apply CNN for training, achieving an F1 Score of 99.71% for known DNS tunnels. By automatically learning and extracting relevant features at different abstraction levels, CNN identifies complex patterns in DNS query payloads. However, as indicated by the results in Table VII, CNN struggles to accurately capture the behavioral patterns of entirely unfamiliar DNS tunneling traffic. This limitation leads to a higher rate of false negatives, resulting in a relatively low F1 Score of 57.14% and indicating poor robustness.

In contrast, GraphTunnel captures unique spatiotemporal features by retracing the DNS recursive resolution process. The aggregation of neighbor nodes markedly enhances the spatial structural characteristics of DNS resolution. Regardless of the intensity of the attack, GraphTunnel effectively distinguishes benign traffic from tunnel traffic, resulting in 100% detection accuracy. Even when faced with a completely unknown dataset, GraphTunnel maintains high robustness, achieving at least a 99.37% F1 Score.

RQ4: How effectively does GraphTunnel perform in detecting wildcard DNS resolution scenarios?

To evaluate the robustness of GraphTunnel in the context of wildcard DNS resolution scenarios, we incorporate $Dataset_{wildcard}$ as benign DNS traffic into the original model and conduct a retraining process while keeping other conditions unchanged. In addition, we apply Chowdhary et al. [51] to our dataset and perform the same operations for a more comprehensive comparative analysis with GraphTunnel.

Figure 5 illustrates the detection performance of three ensemble models in Chowdhary et al. [51] and GraphTunnel before incorporating wildcard domain traffic. The first two models display metrics exceeding 90%, whereas the third model exhibits a lower recall of merely 0.8951, albeit its accuracy and precision are the highest. This could be ascribed to the fewer integrated models, accentuating the detection
characteristics of the two algorithms and predisposing them to yield high precision. Conversely, all metrics for GraphTunnel stand at 100%.

TABLE VII
PERFORMANCE OF COMPARISON EXPERIMENTS WITH BASELINE METHODS

Fig. 5. Performance of comparison experiments with Chowdhary et al. [51].

Table VIII showcases the detection outcomes of the new model on the validation set subsequent to the integration of wildcard domain traffic. By observing the results in the table, we notice an enhancement in precision and accuracy for the three ensemble models following the integration of additional regular wildcard domain traffic data. The reason lies in generating 1-64 character random subdomains according to RFC specifications. When the subdomain length is brief, features such as domain length and information entropy are indistinguishable from benign DNS domains. Moreover, by emulating how real users access domains via browsers, the collected traffic aligns with normal DNS traffic in terms of features like DNS resolution record types and timestamps. Consequently, for models that concentrate on extracting superficial domain or packet features, the majority of positive samples can be accurately classified as the positive class, thereby enhancing precision and accuracy. However, when the subdomain length is excessively elongated, the significance of features such as domain length and information entropy becomes pronounced, predisposing the model to misclassify them as malicious DNS tunneling behavior. This results in some positive samples being erroneously classified as negative, leading to a decline in recall, particularly in the first two models, with recall decreasing by 12.61% and 10.61% respectively.

TABLE VIII
PERFORMANCE COMPARISON OF THE MODEL AFTER WILDCARD DNS RESOLUTION

It is noteworthy that subsequent to the integration of wildcard domain traffic data, the performance metrics of GraphTunnel exhibit a downward trajectory across various aspects. This suggests that wildcard domains exert a substantial impact on the model, which can readily induce bias in the model’s detection results and give rise to false positives. However, in contrast to these three integrated models, GraphTunnel takes into account DNS recursive resolution, aggregates domain name node features during the resolution process, and procures unique spatiotemporal structure features. As a result, the influence of features such as subdomain length and information entropy on model detection is relatively minimal, and the recall rate of GraphTunnel has only declined by 0.41%. Furthermore, its F1 Score can still attain 99.78%, thereby demonstrating considerable robustness and reliability.

RQ5: How does GraphTunnel perform in recognizing tunneling tools?

To evaluate the capability of GraphTunnel in identifying tunneling tools, we conduct a multi-classification recognition experiment under the same experimental conditions as previously mentioned, involving ten tunneling tools such as Iodine [39] and dnscat2 [23]. In this experiment, we employ the metric of accuracy to assess the identification performance of GraphTunnel. Figure 6 presents the detailed recognition results for different types of tunneling tool traffic.

The graphical representation reveals the exemplary performance of the model in accurately classifying normal DNS traffic and various tunneling tools. The model demonstrates a near-perfect classification with minimal misclassifications. It exhibits an effective recognition of each category, with

the accuracy rate for each type exceeding 98%. Notably, the traffic generated by the tools dnspot [31], Andiodine [54], and cobaltstrike [52] is identified with an accuracy rate of 100%.

Fig. 6. Identification results for various tunneling tools.

Performance is lowest for ozymandns [53], for which 98.57% of the traffic is correctly identified. Of the remaining traffic, 0.48% is identified as dns2tcp [40], 0.24% as dnspot [31], and 0.71% as Andiodine [54]. This suggests that ozymandns is slightly similar to these tools in how it tunnels data through DNS.

In summary, GraphTunnel maintains high accuracy in identifying traffic generated by various tools. When facing potential threats, this helps analysts swiftly locate malicious activities based on the identification results and take timely countermeasures.

RQ6: How well does GraphTunnel perform in terms of time and space efficiency?

To comprehensively evaluate the performance of GraphTunnel in real-time monitoring environments, we conduct detailed experiments along multiple dimensions, including time and space consumption. During the model training phase, we use Dataset_our to ensure that the model learns sufficient features and information. In the model prediction phase, we use Dataset_korving to verify the model's generalization ability on unseen data. To better reflect real-world scenarios, we deploy GraphTunnel in a real network environment to monitor the traffic on network cards in real time. Additionally, we use dnscat2 [23] to construct a real DNS tunnel and launch attacks through this tunnel to evaluate the real-time detection capability of GraphTunnel.

During the experiments, we measure several key performance indicators. Training Time per Epoch (TTE) is the time consumed by each epoch of model training. Training Memory Usage (TMU) represents the resource requirements of GraphTunnel during training. Prediction Time (PT) assesses the speed of the prediction process, while Prediction Memory Usage (PMU) captures the memory requirements during inference. Attack Detection Time (ADT) measures the time taken from the initiation of an attack to its detection. Each evaluation experiment is repeated 5 times, and the average value is taken. We repeat the same process with the CNN model [16] for comparison. The experimental results are presented in Table IX.

TABLE IX
COMPARISON OF TIME AND SPACE CONSUMPTION BETWEEN CNN AND GRAPHTUNNEL

The experimental findings reveal that GraphTunnel is more time-efficient during training than the CNN model, with a TTE of 66.69 seconds versus CNN's 110.85 seconds. However, GraphTunnel's TMU is higher: 1638.42 MB versus CNN's 1566.94 MB. This can be attributed to GraphTunnel's parsing of data into graph structures, which contain richer contextual information than CNN's vector data. This results in higher memory utilization but allows GraphTunnel to cover more data packets, reducing the number of iterations over the graph data in each epoch and thus achieving faster training.

In terms of prediction, GraphTunnel exhibits a longer PT of 48.65 seconds but a lower PMU of 126.44 MB, in contrast to CNN's 12.39 seconds of prediction time and 1810.9 MB of memory usage. This aligns with expectations, as CNN typically loads and processes the entire dataset at once during prediction, minimizing overhead from memory read/write operations. In contrast, GraphTunnel adopts a data segmentation strategy when handling large-scale datasets, first dividing data packets into smaller parts and then predicting each segment. This approach reduces memory usage but increases processing time.

GraphTunnel requires 4.46 seconds for ADT, exceeding CNN's 0.89 seconds. This difference can be attributed to the CNN model's simpler process of feature extraction from individual data packets, which enables faster real-time detection. However, the CNN model's poor robustness in classifying unknown tunneling traffic poses challenges for accurate traffic classification, potentially increasing the workload for emergency response personnel. In contrast, GraphTunnel's approach of constructing a DNS recursive resolution graph requires more information from data packets for comprehensive traffic analysis and feature extraction. While this leads to slower real-time detection, it offers higher robustness, ensuring accurate detection in complex and dynamic network environments and providing a more reliable basis for emergency response. Therefore, we consider the time acceptable for real-time monitoring.

Indeed, there is potential to optimize GraphTunnel to shorten traffic detection time and achieve faster response. For offline detection using Dataset_wildcard, our GraphTunnel model requires 48 seconds while consuming only 126 MB of memory. Consequently, we can feasibly introduce a multithreading mechanism, adopting a "space-for-time" strategy that enables


parallel traffic analysis and integration of processed results into the graph for model detection. For real-time detection, GraphTunnel must gather sufficient traffic context information to construct a DNS recursive resolution graph, which ensures high robustness but results in longer processing times. Therefore, reducing the size of the graph is a viable solution, as it implies analyzing a smaller amount of traffic each time. The optimal graph size that balances robustness and detection efficiency needs to be fine-tuned according to the specific circumstances of actual enterprise applications. Additionally, GraphTunnel's performance is influenced by the packet sending frequency of the DNS tunnel tool and the stability of the network environment. We can mitigate the negative impact of these factors on ADT performance by optimizing the network rate processing mechanism, such as adjusting the data flow control strategy or enhancing the packet reception mechanism.

V. DISCUSSION

Although GraphTunnel does not cover all unknown DNS tunnels, it still achieves 100% detection on the unknown samples, which consist of five completely unknown DNS tunneling tools. Moreover, as corroborated by the RQ4 experiment, wildcard DNS resolution poses a considerable challenge to accurate detection, and GraphTunnel does not currently achieve 100% accuracy in this scenario. Nevertheless, it maintains an F1 score of 99.78%, surpassing current methods across various metrics. In application identification research, multi-class classification remains a major challenge. Although GraphTunnel does not achieve 100% identification for every DNS tunneling tool, it currently holds the leading position among existing studies. In future research, we will continue to improve and optimize GraphTunnel to further enhance its capabilities across all dimensions.

VI. CONCLUSION

In this study, we introduce GraphTunnel, a robust DNS tunnel detection framework based on the DNS recursive resolution graph. This framework is dedicated to the real-time detection of DNS tunnels and can identify specific tunneling tools from DNS tunnel traffic. Initially, GraphTunnel filters DNS traffic from network traffic and constructs a graph structure representing the DNS recursive resolution process. Subsequently, it uses a GNN algorithm to aggregate features of nodes and edges in the graph for effective traffic classification. Moreover, GraphTunnel applies the G2M algorithm for statistical learning of node features in the graph and employs a CNN algorithm to generate intelligent identifiers for DNS tunneling tools. Experimental results demonstrate that GraphTunnel performs well in detecting DNS tunnels, with metrics surpassing those of existing baseline methods. GraphTunnel maintains high robustness when facing unknown DNS tunneling tools and wildcard domain name resolution scenarios. Additionally, it exhibits excellent performance in identifying DNS tunneling tools. In the future, we plan to propose techniques against GraphTunnel to generate adversarial traffic for bypassing, continuously enhancing GraphTunnel's ability to flexibly respond to unknown attack methods.

REFERENCES

[1] J. Coker. (2021). 72% of Organizations Experienced a DNS Attack in the Past Year. [Online]. Available: https://www.infosecurity-magazine.com/news/72-orgs-dns-attack-last-year/
[2] EfficientIP. (2022). IDC 2022 Global DNS Threat Report. [Online]. Available: https://efficientip.com/resources/idc-dns-threat-report-2022/
[3] Y. Wang, A. Zhou, S. Liao, R. Zheng, R. Hu, and L. Zhang, "A comprehensive survey on DNS tunnel detection," Comput. Netw., vol. 197, Oct. 2021, Art. no. 108322.
[4] S. Sheridan and A. Keane, "Detection of DNS based covert channels," in Proc. Eur. Conf. Cyber Warfare Secur., 2015, p. 267.
[5] S. Adiwal, B. Rajendran, and S. D. Sudarsan, "DNS intrusion detection (DID) - A SNORT-based solution to detect DNS amplification and DNS tunneling attacks," Franklin Open, vol. 2, Mar. 2023, Art. no. 100010.
[6] L. Salat, M. Davis, and N. Khan, "DNS tunnelling, exfiltration and detection over cloud environments," Sensors, vol. 23, no. 5, p. 2760, Mar. 2023.
[7] T. Ghosh, E. El-Sheikh, and W. Jammal, "A multi-stage detection technique for DNS-tunneled botnets," in Proc. EPiC Ser. Comput., 2019, pp. 137–143.
[8] Y. Ozery, A. Nadler, and A. Shabtai, "Information based heavy hitters for real-time DNS data exfiltration detection," in Proc. Netw. Distrib. Syst. Secur. Symp., 2024, pp. 1–15.
[9] A. F. Sani and M. A. Setiawan, "DNS tunneling detection using elasticsearch," IOP Conf. Ser., Mater. Sci. Eng., vol. 722, no. 1, 2020, Art. no. 012064.
[10] W. Ellens, P. Żuraniewski, A. Sperotto, H. Schotanus, M. Mandjes, and E. Meeuwissen, "Flow-based detection of DNS tunnels," in Proc. 7th IFIP WG 6.6 Int. Conf. Auto. Infrastructure, Manage., Secur., Barcelona, Spain. Heidelberg, Germany: Springer, Jun. 2013, pp. 124–135. [Online]. Available: https://link.springer.com/chapter/10.1007/978-3-642-38998-6_16#rightslink
[11] V. Paxson et al., "Practical comprehensive bounds on surreptitious communication over DNS," in Proc. 22nd USENIX Secur. Symp., 2013, pp. 17–32.
[12] N. Ishikura, D. Kondo, V. Vassiliades, I. Iordanov, and H. Tode, "DNS tunneling detection by cache-property-aware features," IEEE Trans. Netw. Service Manage., vol. 18, no. 2, pp. 1203–1217, Jun. 2021.
[13] F. A. Al-Ibraheemi, S. Al-Ibraheemi, and H. Amintoosi, "A hybrid method of genetic algorithm and support vector machine for DNS tunneling detection," Int. J. Electr. Comput. Eng. (IJECE), vol. 11, no. 2, p. 1666, Apr. 2021.
[14] G. Sakarkar, M. K. H. Kolekar, K. P. G. Patil, P. Dutta, R. Chaturvedi, and S. Kumar, "Advance approach for detection of DNS tunneling attack from network packets using deep learning algorithms," Adv. Distrib. Comput. Artif. Intell. J., vol. 10, no. 3, pp. 241–266, 2021. [Online]. Available: https://revistas.usal.es/cinco/index.php/2255-2863/issue/view/1322
[15] A. Lal, A. Prasad, A. Kumar, and S. Kumar, "DNS-tunnet: A hybrid approach for DNS tunneling detection," in Proc. 4th Int. Conf. Adv. Comput. Technol., Inf. Sci. Commun. (CTISC), Apr. 2022, pp. 1–6.
[16] G. D'Angelo, A. Castiglione, and F. Palmieri, "DNS tunnels detection via DNS-images," Inf. Process. Manage., vol. 59, no. 3, May 2022, Art. no. 102930.
[17] H. Bai, W. Liu, G. Liu, Y. Dai, and S. Huang, "Application behavior identification in DNS tunnels based on spatial–temporal information," IEEE Access, vol. 9, pp. 80639–80653, 2021.
[18] M. A. Altuncu et al., "Deep learning based DNS tunneling detection and blocking system," Adv. Electr. Comput. Eng., vol. 21, no. 3, pp. 39–48, 2021.
[19] S. Shafieian and M. Zulkernine, "Multi-layer stacking ensemble learners for low footprint network intrusion detection," Complex Intell. Syst., vol. 9, no. 4, pp. 3787–3799, Aug. 2023.
[20] J. Liang, S. Wang, S. Zhao, and S. Chen, "FECC: DNS tunnel detection model based on CNN and clustering," Comput. Secur., vol. 128, May 2023, Art. no. 103132.
[21] S. Wang, L. Sun, S. Qin, W. Li, and W. Liu, "KRTunnel: DNS channel detector for mobile devices," Comput. Secur., vol. 120, Sep. 2022, Art. no. 102818.


[22] W. Hamilton, Z. Ying, and J. Leskovec, "Inductive representation learning on large graphs," in Proc. Adv. Neural Inf. Process. Syst., vol. 30, 2017, pp. 1–11.
[23] Dnscat2. Accessed: Sep. 23, 2023. [Online]. Available: https://github.com/iagox86/dnscat2
[24] J. Ahmed, H. H. Gharakheili, Q. Raza, C. Russell, and V. Sivaraman, "Real-time detection of DNS exfiltration and tunneling from enterprise networks," in Proc. IFIP/IEEE Symp. Integr. Netw. Service Manage. (IM), Apr. 2019, pp. 649–653.
[25] Scapy. Accessed: Sep. 27, 2023. [Online]. Available: https://github.com/secdev/scapy
[26] D. Herrmann, C. Banse, and H. Federrath, "Behavior-based tracking: Exploiting characteristic patterns in DNS traffic," Comput. Secur., vol. 39, pp. 17–33, Nov. 2013.
[27] P. Mockapetris, Domain Names - Implementation and Specification, document RFC 1035, 1987. [Online]. Available: https://www.zytrax.com/books/dns/apd/rfc1035.txt
[28] S. Uddin and H. Lu, "Dataset meta-level and statistical features affect machine learning performance," Sci. Rep., vol. 14, no. 1, p. 1670, Jan. 2024.
[29] C. Guo, M. Lu, and J. Chen, "An evaluation of time series summary statistics as features for clinical prediction tasks," BMC Med. Informat. Decis. Making, vol. 20, no. 1, pp. 1–20, Dec. 2020.
[30] M. Altaf, T. Akram, M. A. Khan, M. Iqbal, M. M. I. Ch, and C.-H. Hsu, "A new statistical features based approach for bearing fault diagnosis using vibration signals," Sensors, vol. 22, no. 5, p. 2012, Mar. 2022.
[31] Dnspot. Accessed: Sep. 23, 2023. [Online]. Available: https://github.com/mosajjal/dnspot
[32] DNS-Shell. Accessed: Sep. 23, 2023. [Online]. Available: https://github.com/sensepost/DNS-Shell
[33] Tuns. Accessed: Sep. 23, 2023. [Online]. Available: https://members.loria.fr/LNussbaum/tuns.html
[34] TCP-Over-DNS. Accessed: Sep. 23, 2023. [Online]. Available: https://analogbit.com/software/tcp-over-dns/
[35] K. Žiža, P. Tadić, and P. Vuletić, "DNS exfiltration detection in the presence of adversarial attacks and modified exfiltrator behaviour," Int. J. Inf. Secur., vol. 22, no. 6, pp. 1865–1880, Dec. 2023.
[36] (2021). CIC-Bell-DNS-EXF-2021 Dataset. Accessed: Oct. 20, 2023. [Online]. Available: https://www.unb.ca/cic/datasets/dns-exf-2021.html
[37] F. Korving and R. Vaarandi, "Daca: Automated attack scenarios and dataset generation," in Proc. Int. Conf. Cyber Warfare Secur., 2023, vol. 18, no. 1, pp. 550–559.
[38] (2023). Cloudflare. Accessed: Oct. 6, 2023. [Online]. Available: https://radar.cloudflare.com/domains
[39] Iodine. Accessed: Sep. 23, 2023. [Online]. Available: https://code.kryo.se/iodine/
[40] DNS2TCP. Accessed: Sep. 23, 2023. [Online]. Available: https://github.com/alex-sector/dns2tcp
[41] Ladon. Accessed: Oct. 6, 2023. [Online]. Available: https://github.com/k8gege/Ladon
[42] LinEnum. Accessed: Oct. 6, 2023. [Online]. Available: https://github.com/rebootuser/LinEnum
[43] (2023). Gather. Accessed: Oct. 6, 2023. [Online]. Available: https://github.com/wwl012345/gather
[44] OneForAll. Accessed: Oct. 13, 2023. [Online]. Available: https://github.com/shmilylty/OneForAll
[45] S. Mahdavifar et al., "Lightweight hybrid detection of data exfiltration using DNS based on machine learning," in Proc. 11th Int. Conf. Commun. Netw. Secur., Dec. 2021, pp. 80–86.
[46] O. P. Suman, "A novel approach for malicious domain classification based on DNS traffic analysis and machine learning," Oct. 2023.
[47] F. Sobrero, B. Clavarezza, D. Ucci, and F. Bisio, "Towards a near-real-time protocol tunneling detector based on machine learning techniques," J. Cybersecurity Privacy, vol. 3, no. 4, pp. 794–807, Nov. 2023.
[48] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," 2016, arXiv:1609.02907.
[49] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, "Graph attention networks," 2017, arXiv:1710.10903.
[50] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, "How powerful are graph neural networks?" 2018, arXiv:1810.00826.
[51] A. Chowdhary, M. Bhowmik, and B. Rudra, "DNS tunneling detection using machine learning and cache miss properties," in Proc. 5th Int. Conf. Intell. Comput. Control Syst. (ICICCS), May 2021, pp. 1225–1229.
[52] Cobalt Strike. Accessed: Sep. 23, 2023. [Online]. Available: https://www.cobaltstrike.com/
[53] OzymanDNS. Accessed: Sep. 23, 2023. [Online]. Available: https://github.com/splitbrain/dnstunnel
[54] AndIodine. Accessed: Sep. 23, 2023. [Online]. Available: https://github.com/yvesf/andiodine
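
As a supplementary illustration of the RQ6 measurement protocol (time and memory indicators, each experiment repeated 5 times and averaged), the sketch below times a workload and records its peak memory. It is a minimal sketch under stated assumptions, not the authors' measurement harness: the helper `measure` and the stand-in workload `dummy_epoch` are hypothetical, and Python's `tracemalloc` tracks only Python-level allocations, so it would not capture GPU or native-library memory used by a real model.

```python
import time
import tracemalloc
from statistics import mean


def measure(fn, repeats=5):
    """Run fn `repeats` times; return (mean seconds, mean peak memory in MB)."""
    times, peaks = [], []
    for _ in range(repeats):
        tracemalloc.start()
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
        tracemalloc.stop()
        times.append(elapsed)
        peaks.append(peak / (1024 * 1024))
    return mean(times), mean(peaks)


def dummy_epoch():
    # Hypothetical stand-in for one training epoch or one prediction pass.
    data = [i * i for i in range(200_000)]
    return sum(data)


avg_seconds, avg_peak_mb = measure(dummy_epoch)
print(f"mean time: {avg_seconds:.4f} s, mean peak memory: {avg_peak_mb:.2f} MB")
```

Passing a training-epoch function would give a TTE/TMU-style pair and passing a prediction function a PT/PMU-style pair; ADT would additionally need timestamps for attack start and alert, which this sketch does not model.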
