BDCC 07 00065
cognitive computing
Article
Threat Hunting Architecture Using a Machine Learning
Approach for Critical Infrastructures Protection
Mario Aragonés Lozano *, Israel Pérez Llopis and Manuel Esteve Domingo
Abstract: The number and the diversity in nature of daily cyber-attacks have increased in the last few
years, and trends show that both will grow exponentially in the near future. Critical Infrastructures
(CI) operators are not excluded from these issues; therefore, CIs’ Security Departments must have their
own group of IT specialists to prevent and respond to cyber-attacks. To make the existing cyber
security landscape even more challenging, many attacks remain unknown until they surface,
sometimes long after their initial actions, which makes their detection and remediation
increasingly difficult. To react to those cyber-attacks, usually referred to as zero-day attacks,
organizations must have Threat Hunters in their security departments who are aware of unusual
behaviors and Modus Operandi. Threat Hunters must face vast amounts of data (mainly benign and
repetitive, and following predictable patterns) in short periods to detect any anomaly, with the
associated cognitive overload. The application of Artificial Intelligence, specifically Machine Learning
(ML) techniques, can remarkably impact the real-time analysis of those data. Not only that, but
providing the specialists with useful visualizations can significantly increase the Threat Hunters’
understanding of the issues that they are facing. Both of these can help to discriminate between
harmless data and malicious data, alleviating analysts from the above-mentioned overload and
providing means to enhance their Cyber Situational Awareness (CSA). This work aims to design a
system architecture that helps Threat Hunters, using a Machine Learning approach and applying
state-of-the-art visualization techniques in order to protect Critical Infrastructures based on a
distributed, scalable and online configurable framework of interconnected modular components.
Keywords: critical infrastructures protection; cyberattacks; machine learning; threat hunting;
visualization models; architecture
Citation: Aragonés Lozano, M.; Pérez Llopis, I.; Esteve Domingo, M. Threat Hunting Architecture
Using a Machine Learning Approach for Critical Infrastructures Protection. Big Data Cogn. Comput.
2023, 7, 65. [Link] bdcc7020065
to spot complex attacks, which are very quiet and remain in the protected infrastructure for
a long time.
Nevertheless, a large share of the actionable data, from both networks and hosts, relates to
harmless actions of the employees (such as DNS requests or web browsing). Moreover, surveys
conducted with Threat Hunters [1] on the traits of those datasets concluded that each of the
studied actions follows specific and characterizable patterns, which allow it to be labeled as
harmless or potentially dangerous. Since Machine Learning is a scientific field that provides
outstanding techniques and procedures for extracting models from raw data [2], well-designed,
adequately tuned and scenario-customized ML algorithms can help classify data samples
according to how benign or malicious they are.
Furthermore, according to several studies [3–5], human cognition tends to predict words,
patterns, etc. in a way that is strongly influenced by context [6], even more so under stress
conditions [7]. Threat Hunters suffer precisely those stressful conditions when they must face
large amounts of data in highly dynamic scenarios where the smallest mistake can have a very
high impact. Moreover, Threat Hunting is a complex decision-making process that encompasses
many uncontrolled factors, typically working with limited and incomplete information and
possibly facing unknown scenarios, for instance, zero-day attacks [8]. As a consequence, given
the previously stated strong dependency of human prediction on context, an attack that behaves
much like a non-attack could be mistaken for one due to human bias, whereas a Machine
Learning system could discriminate between the two more accurately than humans do. Thus,
with all the data provided by the output of ML systems (such as likelihoods, feasibility
thresholds, etc.), Threat Hunters could better understand what is going on in the theater of
operations.
Moreover, it is well known that the human brain processes visual patterns more quickly
and accurately than any textual or speech report, gaining understanding at a glimpse, and
this, naturally, also happens in cybersecurity [9,10]; as a consequence, representing the data
(both raw and ML processed data) properly is also a decisive factor for Threat Hunters
in order to achieve Situational Awareness [11,12] and therefore an early detection of any
threat. Some studies have been trying to classify which advanced visualization fits best for
each kind of attack [13,14].
Lastly, using both Machine Learning and specifically defined data visualizations,
Threat Hunters will be able to generate hypotheses about what is going on in their systems
and networks, being able to quickly detect any threat and even have enough context
information to deal with it.
Systems capable of gathering all those huge amounts of data, processing them (includ-
ing Machine Learning techniques) and providing insightful visualization techniques must
be developed following a properly designed architecture in accordance with the challenges
that such an ambitious approach must face. The most relevant contribution of this work is
an architecture proposal and its implementation, devoted to fulfilling the stated needs. The
proposed architecture must provide means for dynamic and adaptable addition of ML
techniques at will and the selection of which to use from the existing ones at a given
moment. In addition, big data must be taken into account for vast amounts of data that must
be stored and analyzed. Moreover, due to the time-consuming nature of ML processing,
the architecture must enforce parallelization of as many processes as possible; therefore,
architecture components must be orchestrated to maximize this parallelism. Furthermore,
asymmetric scalability must be enforced in order to be efficient; thus, means should be
instantiated to guarantee that only necessary components are working at a certain time.
The architecture must be implemented in a distributed approach; therefore, communica-
tions, synchronization and decoupling of components and processing must be carefully
envisioned and designed. Last but not least, the whole system must be secured in accordance
with the type of data it will process.
more, several examples of enhancing the process by using ML techniques and useful
visualizations can be found.
Besides academia, companies are also trying to develop specific Machine Learning
techniques and algorithms for their Threat Hunting products to enrich the visualizations
currently used to understand the cyber situational awareness of the monitored systems. Products
that implement ML algorithms include Security Information and Event Management (SIEM)
systems, firewalls, antivirus software, Intrusion Detection Systems (IDS) and Intrusion
Prevention Systems (IPS). A few examples are Splunk [51], Palo Alto next-generation smart
firewalls [52], IBM’s immune-system-based approach to cyber security (IBM X-Force Exchange
[53,54]) and Anomali ThreatStream [55].
After conducting in-depth research on the current state-of-the-art in the area, it can be
concluded that, although several outstanding efforts have been made towards solving specific
areas of the problem, no effort has defined an architecture whose implementation is rich
enough to generate hypotheses about what is going on in the monitored system.
As a consequence, there is a lack (1) in the design of a particular unified architecture to
help Threat Hunters with a Machine Learning approach with capabilities to define and
generate (manually or automatically) hypotheses about what is going on and (2) in the
provision of specific and useful visualizations, particularly in the issues detected for Critical
Infrastructures (as might be the case of business continuity) and coping with all detected and
envisioned scenarios. To fill this gap, an architecture with a specific component to define
and generate hypotheses is proposed that must ensure security, scalability, modularity and
upgradeability. It must also constitute a proper framework for developing platforms for
Threat Hunting based on Machine Learning that stays flexible and adaptable over time. This
work aims to solve this problem and fill the detected gap, mainly in terms of providing a
unified framework that interrelates existing different components from data acquisition to
knowledge generation (emphasizing the hypothesis generation) and visualization, which,
despite being generic, is particularized for Critical Infrastructures Protection.
about its availability. In addition, another relevant requirement is that each component must
be completely stateless to allow decoupling and parallelization of processes. Moreover,
because the components are stateless, the order in which the actions of a simple process are
executed is not relevant; a pool of available elements can therefore dequeue pending tasks
and, properly orchestrated, carry them to completion, receiving all the required metadata
(the state) with the task itself.
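The stateless worker pool described above can be illustrated with a minimal sketch. It is not the prototype's code: the function names (normalize_text, run_pool), the task fields (task_id, payload) and the use of in-process threads instead of distributed components are all assumptions made for illustration. The point it shows is that every task carries its own state, so any free worker can process it.

```python
import queue
import threading


def normalize_text(task):
    # Stateless step: everything it needs travels inside the task itself.
    return {"task_id": task["task_id"], "result": task["payload"].strip().lower()}


def worker(tasks, results):
    # Each worker dequeues pending tasks until it receives the sentinel (None).
    while True:
        task = tasks.get()
        if task is None:
            tasks.task_done()
            break
        results.put(normalize_text(task))
        tasks.task_done()


def run_pool(payloads, n_workers=3):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for i, payload in enumerate(payloads):
        tasks.put({"task_id": i, "payload": payload})
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    collected = {}
    while not results.empty():
        item = results.get()
        collected[item["task_id"]] = item["result"]
    return collected
```

Because no worker keeps state between tasks, the number of workers can be raised or lowered without changing the result, which is the property the architecture relies on for parallelization.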
The proposed architecture is flexible and scalable in terms of the resources needed for its
deployment. If resources are scarce, for instance, in debugging or testing or for an SME setup,
every involved component can reside in a Docker container [58] or in a virtual machine [59],
and the overall architecture can reside in a single machine. At the other end of the spectrum,
where huge amounts of resources are available, the setup can be clustered using Kubernetes [60]
or deployed in the cloud using AWS [61] or Azure [61]. From the components’ perspective, the
type of deployment is transparent and seamless.
[Figure 1 diagram. Component groups shown: Data collectors (SIEMs, Logs, PCAPs, OSINT);
Database (Data Model); Data preprocessing components (Numerical Normalization, Text
Normalization); ML components (Correlation, Clustering, Similarity, NLP, Neural Networks);
Interaction components (HMI, External access gateway); Communications; Authentication
management.]
Figure 1. Proposed architecture. Groups of components from bottom to top and from left to right:
Sections 4.1–4.8.
To achieve that goal, components must be completely decoupled: each one knows of the
existence of others only on a per-need basis, within an orchestrated schema, and communicates
through standardized and predefined interfaces and mechanisms. That way, the inner features
of a component are completely isolated from the rest, and flexibility and decoupling can
be reached.
This is one outstanding feature of the architecture that can provide flexibility and
scalability for easily adapting to different and dynamically changing scenarios, depending
on needs and resources. In addition, being able to provide flexibility also makes the
architecture optimum for all kinds of Critical Infrastructures, deploying only the modules
required for each specific one.
Another essential feature that the proposed architecture must enforce is the capability
of providing High Availability (HA) [62] to guarantee service continuity (one of the main
concerns of Critical Infrastructures) even in degraded conditions. To achieve that goal,
load-balancing schemas are proposed within the component orchestrator. In addition, for the
key elements (tagged as crucial throughout the following exposition), whose service must be
guaranteed at all costs for the rest to be able to work, backup instances should be ready
in the background to replace the running ones if any issue is detected, thereby avoiding
an interruption of the overall system service.
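As a rough illustration of the load-balancing and backup-instance idea, the following sketch dispatches requests round-robin over the active instances of a crucial element and promotes a standby instance when a running one fails. The class names (Instance, CrucialService) and the failure signal (a RuntimeError) are hypothetical; a real orchestrator would rely on health checks and the clustering platform.

```python
class Instance:
    """Hypothetical stand-in for one running copy of a crucial component."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class CrucialService:
    """Round-robin over active instances; promote a standby when one fails."""

    def __init__(self, active, standby):
        self.active = list(active)
        self.standby = list(standby)
        self._next = 0

    def dispatch(self, request):
        while self.active:
            instance = self.active[self._next % len(self.active)]
            try:
                reply = instance.handle(request)
            except RuntimeError:
                # Drop the failed instance and bring a backup into rotation.
                self.active.remove(instance)
                if self.standby:
                    self.active.append(self.standby.pop(0))
                continue
            self._next += 1
            return reply
        raise RuntimeError("no instance available")
```

The caller never sees the failure of a single instance as long as a backup exists, which is the service-continuity behavior the text asks for.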
Security is a crucial concern for any cyber security tool. Therefore, the architecture will
establish security mechanisms to provide agreed service levels in terms of security
guarantees. Initially, these Security Service Levels Agreements (SSLAs) will be oriented
to the exchange of messages among components: each component will ensure the
authenticity [63] of the transmission; in short, that the source’s identity is confirmed
and the requested action is allowed.
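The two checks named above (the source's identity is confirmed, and the requested action is allowed) can be sketched as follows. This is an assumption-laden illustration, not the paper's mechanism: it swaps in a shared-key HMAC signature, whereas the prototype manages authentication through an OAuth 2.0 component. The key table SHARED_KEYS, the permission table ALLOWED_ACTIONS and the component name collector-1 are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical per-component secrets and permissions (illustration only).
SHARED_KEYS = {"collector-1": b"demo-key-collector-1"}
ALLOWED_ACTIONS = {"collector-1": {"store_event"}}


def sign(sender, action, body, key):
    # Serialize deterministically so sender and receiver hash the same bytes.
    payload = json.dumps({"sender": sender, "action": action, "body": body},
                         sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify(message):
    data = json.loads(message["payload"])
    key = SHARED_KEYS.get(data["sender"])
    if key is None:
        return False  # unknown source
    expected = hmac.new(key, message["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["tag"]):
        return False  # identity not confirmed
    # Identity confirmed; now check that the requested action is allowed.
    return data["action"] in ALLOWED_ACTIONS.get(data["sender"], set())
```

A message is accepted only when both checks pass, so a valid sender asking for an action outside its permissions is rejected just like a forged message.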
Another key part of the architecture is the interconnection among platforms implementing
it, or even with external sources. No matter how complex the developed architecture is, if
the element described in Section 4.6.2 is deployed, the implemented system will never lose
the capability of being interconnected and sharing all kinds of knowledge.
If several systems are deployed, creating a federation, the architecture will also provide
the ability to share data about the currently active attacks, their input vectors, their
IoCs, etc., in order to warn other members of the federation if the system detects similar
devices on the monitored network, or even to alert Threat Hunters about which devices might
be compromised. This feature is very important because a cyber-attack affecting a Critical
Infrastructure can propagate to another Critical Infrastructure [64].
In summary, the proposed architecture aims to be distributed, self-adaptive, resilient and
autopoietic [65]. It achieves that goal by being flexible, modular and scalable, without
ever losing sight of the main objective of solving the detected problems in a fast and
secure way.
The architecture will enforce the usage of standards at all levels to guarantee the
interoperability of the system, both in terms of data acquisition and, eventually, data
export. Moreover, the usage of standards will make the life cycle of developments
sustainable, on both the hardware and the software sides, and will provide flexibility and
modularity in the selection and insertion of new elements and the replacement of existing
ones. To do so, many different standards are proposed for implementation; they will be
specified in the corresponding sections. Among others, standard COTS (Commercial
off-the-shelf) [66] mechanisms will be enforced at several layers of the architecture.
Several data sources will be implemented and feedback from Threat Hunters will
be received in order to generate proactive security against threats. All this information,
correctly processed, can be used to measure the security levels of the analyzed Critical
Infrastructure.
4. System Architecture
The purpose of each layer is described hereunder from the bottom of Figure 1 to the top.
• PCAPs (Packet Captures, files with information about network traffic) [69].
• Threat Management Platforms (TMP), such as MISP [70].
• Incident Response Systems, such as The Hive [71] or RT-IR [72].
• Advanced Persistent Threat (APT) [73] management tools.
• OSINT (Open Source Intelligence [74]) sources, with their specific need in terms of
normalization due to the wide variety of data typologies.
In Table 1, the most interesting ECS fields can be found in order to be used with the
proposed architecture. Nevertheless, the data model is not limited to those fields, but it can
be enlarged if any component of the architecture needs it.
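To make the normalization step concrete, the sketch below maps a raw network record to a handful of ECS fields. The raw record layout (epoch, src, sport, dst, dport, proto) is hypothetical, and the ECS fields used here (@timestamp, event.kind, event.category, source.*, destination.*, network.transport) are chosen for illustration; they are not necessarily the ones listed in Table 1.

```python
from datetime import datetime, timezone


def to_ecs(raw):
    """Map a hypothetical raw flow record to a small ECS-style document."""
    timestamp = datetime.fromtimestamp(raw["epoch"], tz=timezone.utc)
    return {
        "@timestamp": timestamp.isoformat(),
        "event": {"kind": "event", "category": ["network"]},
        "source": {"ip": raw["src"], "port": raw["sport"]},
        "destination": {"ip": raw["dst"], "port": raw["dport"]},
        "network": {"transport": raw["proto"].lower()},
    }


raw_record = {"epoch": 0, "src": "192.0.2.10", "sport": 51000,
              "dst": "198.51.100.5", "dport": 53, "proto": "UDP"}
ecs_document = to_ecs(raw_record)
```

Extending the data model, as the text allows, then amounts to adding further keys to the returned document.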
Coupling Elasticsearch (ES) as a data repository with ECS is a widely recommended
approach for several reasons. First and foremost, both products come from the same
source, which guarantees long-standing alignment, as ECS is defined and continuously
developed by Elastic. In addition, Elasticsearch is big data enabled by nature [81] and
supports HA because it can be clustered.
This will be left open for customization by administrator users to set up the data to the
approach that fits best on each data source.
Secondly, this component also allows the system to provide stored data, potentially
filtered following given requests, to any authorized external requester using one of the
standards that best fits its query.
Standard approaches such as JSON data format [101] or XML [101] will be used and
are recommended due to their widespread nature. However, proprietary schemas and
methods will be used when no other approaches are left open, as happens to be with several
proprietary products and systems.
Going one step further, cyber security standards will also be used in the architecture for
data exchanges. For instance, STIX (Structured Threat Information eXpression) [102] will
be used, as it is the de facto standard for cyber threat intelligence nowadays [103].
Moreover, widely used existing standards for cyber intelligence, such as CVE (Common
Vulnerabilities and Exposures) [104] or the SCAP (Security Content Automation Protocol)
[105] suite, will be enforced, and less widespread ones will also be considered.
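As an illustration of what a STIX export might look like, the sketch below builds a single indicator and wraps it in a bundle using only the standard library. It assumes STIX 2.1 (the text does not state the version used), and the helper names make_indicator and make_bundle are hypothetical; a production exporter would more likely use a dedicated STIX library and validate its output.

```python
import json
import uuid
from datetime import datetime, timezone


def make_indicator(ipv4, name):
    """Build a STIX 2.1-style indicator for a suspicious IPv4 address."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "name": name,
        "pattern": f"[ipv4-addr:value = '{ipv4}']",
        "pattern_type": "stix",
        "valid_from": now,
    }


def make_bundle(objects):
    # A bundle is the container used to ship several STIX objects together.
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}", "objects": objects}


indicator = make_indicator("198.51.100.7", "Suspicious host")
bundle_json = json.dumps(make_bundle([indicator]), indent=2)
```

The resulting JSON document is what would travel to the external system; the address used is a documentation-range placeholder, not real intelligence.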
All the previously mentioned standard mechanisms will be implemented in the architecture
for both data gathering and delivery. One of the goals of the proposed approach is to avoid
proprietary data exchange mechanisms at all levels, if possible, and to enforce the usage of
standards, which is mandatory for the scalability and extensibility of the platform. One
example under consideration is the capability of connecting the system on demand to external
sources such as VirusTotal [106] and URLhaus [107], among others, which provide their own
APIs to request/provide data, mostly based on well-known standards such as REST APIs.
External data of this kind enriches the data processed by the platform: detected IoCs (IPs,
URLs, FQDNs, file hashes, etc.) can be complemented with relevant intelligence from those
well-known and reputable internet repositories.
Regarding the communication mechanisms, other standards are to be used to exchange data,
such as REST APIs [108] for one-shot requests or AMQP [109] for publish/subscribe
messaging.
charge of exchanging and forwarding messages between each component and guaranteeing
their proper delivery. As a consequence, the communications broker is a crucial component.
As stated before, all components of the system must send their messages through the
communications broker. In order to prevent any unauthorized agent from sending or receiving
messages, access to the communications broker network will be restricted; this restriction
can be considered the first authentication factor and helps enforce message integrity [63].
In addition, messages will be exchanged using the AMQP [109] protocol and several
communication patterns, namely one-to-one, one-to-many, broadcast, etc. Components will
send messages using either a request-response or a publish-subscribe mechanism.
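The difference between the one-to-one and one-to-many patterns can be shown with a toy in-memory broker. This is deliberately not AMQP and not RabbitMQ: the class ToyBroker and its methods are hypothetical, and the sketch only mimics the routing behavior (a direct delivery to one queue versus a copy to every subscribed queue).

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a message broker (illustration only)."""

    def __init__(self):
        self.queues = defaultdict(list)       # queue name -> delivered messages
        self.subscribers = defaultdict(list)  # topic -> subscribed queue names

    def send(self, queue_name, message):
        """One-to-one: deliver to a single named queue."""
        self.queues[queue_name].append(message)

    def subscribe(self, topic, queue_name):
        self.subscribers[topic].append(queue_name)

    def publish(self, topic, message):
        """One-to-many: copy the message to every queue subscribed to the topic."""
        for queue_name in self.subscribers[topic]:
            self.queues[queue_name].append(message)


broker = ToyBroker()
broker.subscribe("alerts", "hmi")
broker.subscribe("alerts", "hypothesis-generator")
broker.publish("alerts", {"ioc": "198.51.100.7"})  # reaches both subscribers
broker.send("hmi", {"cmd": "refresh"})             # reaches only the HMI queue
```

The publisher emits the alert once and the broker fans it out, which is why one-to-many patterns save the sender from maintaining a connection to every consumer.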
The usage of a communications broker provides many benefits to any distributed
architecture. First of all, there are several widely used platforms that have been
thoroughly tested by huge communities, which minimizes communication issues. Moreover,
adding new elements relies on the broker’s procedures and usually consists of connecting
to the broker following its mechanisms. Furthermore, networking issues are reduced because
each component only needs access to the communications broker endpoint, so network
administrators do not need to take care of broadcasting issues or other related problems.
In addition, most brokers, if not all of them, provide real-time broadcast queues and
publish-subscribe mechanisms, which allow for immediate data updates. As a side effect,
one-to-many message exchange patterns, such as those provided by communication brokers,
yield a significant reduction in bandwidth consumption.
5. System Prototype
In order to validate the proposed system architecture, a prototype has been implemented.
A brief view of the different components developed is shown in Figure 2, with each
component placed in its corresponding layer (group of components) of Figure 1.
The prototype has been evaluated using synthetic data simulating real networks and
hosts by means of a digital twin. A digital twin can be defined as a clone of physical assets
and their data in a virtualized environment simulating the cloned one. Digital twins also
allow the physical asset to be tested at all stages of its life cycle, with the associated
benefits for the detection of bugs and vulnerabilities [118].
Figure 3 details the digital twin used to simulate a real Critical Infrastructure setup,
including networks and assets (workstations, servers, network hardware, etc.). It was
implemented using a virtualization platform in order to verify the developed prototype.
Three networks have been created. The first one contains all the monitored systems, which
will be attacked by an external actor so that threats can be detected. The second one
contains all the systems that the prototype will collect data from. Lastly, the third
network contains all the deployed components of the prototype.
5.1. Components
The components developed and deployed to verify the architecture will be described in
this section. All of the developed components used Python [119–121] as the implementation
language.
Following the same order as in previous sections, data collectors were developed first,
for the following sources:
• MISP [70].
• OSSIM [67].
• QRadar [68].
• The Hive [71].
• PCAPs [69].
• Syslogs.
• Raw logs.
Regarding the database, Elasticsearch was chosen, with Elastic Common Schema as the
data model.
In addition, the data preprocessing components (Section 4.3) that were developed are
the following:
• Sigma Converters.
• Number Normalization.
• Text Normalization.
• One-Hot Encoders.
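Three of the preprocessing steps listed above can be sketched in a few lines. The concrete choices are assumptions: min-max scaling stands in for number normalization, lowercasing plus whitespace collapsing for text normalization, and an alphabetically ordered category list for one-hot encoding. The prototype's actual components may behave differently.

```python
def min_max_normalize(values):
    """Scale numbers to the [0, 1] range (assumed form of number normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_text(text):
    """Lowercase and collapse whitespace (assumed form of text normalization)."""
    return " ".join(text.lower().split())


def one_hot(values):
    """Encode each categorical value as a 0/1 vector over the sorted categories."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]
```

For example, one_hot(["tcp", "udp", "tcp"]) yields one column per protocol, which lets ML components that expect numeric input consume categorical fields.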
[Figure: network diagram of the digital twin. Legible labels: WAN; Data Collector; network
segments Servers, Inside and DMZ (each a /24 subnet); hosts Ubuntu, Windows 10, Apache
and DNS.]
Furthermore, the developed machine learning components (Section 4.4) used for
verifying the architecture were the following:
• APT Clustering components.
• Anomaly detectors.
• NLP.
• Decision trees.
• Neural networks.
A model repository component was also used where pre-trained models were stored
in order to feed the components which require them.
Big data statistics, hypothesis generator, ML sequence presets and data exchanger
components were also developed. It is worth highlighting that the data exchangers were
able to query data from MITRE ATT&CK [122–124] as well as export data using STIX.
In order to interact with the system, an HMI and an External Access Gateway were also
developed, acting as proxy to authenticate and authorize the requests before forwarding
them to the available data exchangers.
Lastly, RabbitMQ [125–127] was used as the communications broker, and a component
implementing the OAuth 2.0 protocol was developed to manage authentication.
5.2. Validation
The prototype has been validated layer by layer, following the same path that the data
does, from the collection to the visualization.
The first step was to collect data from several sources. In order to do this, data
collectors for MISP, OSSIM, QRadar and The Hive were deployed and properly configured,
and, for each one of them, it was checked that the content was correctly collected and
normalized following the proposed data model.
After that, the following step was to create Machine Learning systems using the ML
Sequence Presets component. In the prototype, several ML Components along with Data
Preprocessing Components were deployed in order to be used to generate sequences by
concatenating all of those required in the order set by the ML expert. Those ML systems
were executed either for one single shot or for recurrently generating valuable information
about what is happening.
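The idea of concatenating preprocessing and ML components in an expert-defined order can be sketched as a list of callables applied one after another. Everything here is illustrative: the step functions scale and flag_outliers, the z-score threshold of 2.0 and the sample traffic values are assumptions, not the prototype's presets.

```python
from statistics import mean, pstdev


def scale(values):
    """Preprocessing step: min-max scaling to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def flag_outliers(values, threshold=2.0):
    """Simple ML-style step: flag points whose z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False for _ in values]
    return [abs(v - mu) / sigma > threshold for v in values]


def run_sequence(steps, data):
    """Apply the steps in the order set by the ML expert."""
    for step in steps:
        data = step(data)
    return data


preset = [scale, flag_outliers]                   # hypothetical sequence preset
bytes_per_minute = [100, 110, 90, 105, 95, 100, 5000]
flags = run_sequence(preset, bytes_per_minute)    # only the last sample is flagged
```

A one-shot execution is a single call to run_sequence; a recurrent execution would call it again on each new batch of collected data.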
With raw collected data and ML-generated information available, the next step was to
test the data exchangers in both available directions: exporting data to and importing
data from third parties. On the one hand, using the External Access Gateway component,
data was exported to an external system using STIX. On the other hand, data was
successfully imported from MITRE ATT&CK.
As one key element of the proposed architecture, the Hypothesis Generator component
was configured to process all the collected data and produce hypotheses which, once
checked and tuned by a Threat Hunter using the HMI, yield valuable intelligence.
The last step was to analyze and visualize all the gathered data, information and
hypotheses to find threats in the monitored infrastructure. Some parts of the HMI regarding
raw and chart data visualizations will be explained hereafter.
A key feature of the proposed architecture is hypothesis generation, which is handled by a
specific component called the Hypothesis Generator. The output of that component is listed
in a dedicated visualization in the HMI, which also allows the generated hypotheses to be
validated.
A hypothesis is a group of “Data Context” data which has been executed in a specific
order and, optionally, can be associated to some APT. Once a hypothesis has been generated,
it is shown to Threat Hunters with details containing the action chain to conduct a manual
analysis in order to determine whether it is a threat or not. In Figure 5, there is an example
of what would be seen by a Threat Hunter.
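One possible in-memory representation of a hypothesis, following the definition above, is sketched below. The class and field names (DataContext, Hypothesis, timestamp, asset, description, apt, validated) are hypothetical; the text only states that a hypothesis groups ordered "Data Context" items, may be associated with an APT, and is reviewed manually by a Threat Hunter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataContext:
    timestamp: float      # when the observed action happened
    asset: str            # asset on which it was observed
    description: str      # what was observed


@dataclass
class Hypothesis:
    steps: List[DataContext]
    apt: Optional[str] = None          # optional association with an APT
    validated: Optional[bool] = None   # None until a Threat Hunter reviews it

    def action_chain(self):
        """Return the steps in execution order, as shown to the Threat Hunter."""
        ordered = sorted(self.steps, key=lambda step: step.timestamp)
        return [f"{step.asset}: {step.description}" for step in ordered]


hypothesis = Hypothesis(steps=[
    DataContext(2.0, "srv-db", "unusual outbound transfer"),
    DataContext(1.0, "ws-01", "phishing attachment opened"),
])
```

After the manual analysis, the Threat Hunter's verdict would be stored by setting validated to True or False.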
In the previous figure, we can find a graph showing the assets (brown color) connected
to the services (yellow color) they have and the vulnerabilities (sky blue color) associated
to them.
The same query to the data storage is shown in Figure 9 (i.e., assets per services per
vulnerabilities) but with a different visualization technique, in this case, circle packing. The
packing visualizations do lose the graph interconnection-display capability but provide
means to see which element encircles another. Therefore, we can see here inside an
asset (brown), its services (yellow circle), and inside each service its vulnerabilities (sky
blue disc).
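Containment-based views such as circle packing need the query result as a nested hierarchy rather than as flat rows. The sketch below shows one way to build it; the row layout (asset, service, vulnerability), the name/children node shape and the placeholder identifiers are assumptions, since the paper does not describe the HMI's internal data format.

```python
def build_hierarchy(rows):
    """Nest flat (asset, service, vulnerability) rows into a containment tree."""
    tree = {}
    for asset, service, vulnerability in rows:
        tree.setdefault(asset, {}).setdefault(service, []).append(vulnerability)
    return {
        "name": "root",
        "children": [
            {
                "name": asset,
                "children": [
                    {"name": service,
                     "children": [{"name": vuln} for vuln in vulns]}
                    for service, vulns in services.items()
                ],
            }
            for asset, services in tree.items()
        ],
    }


rows = [
    ("web-01", "http", "VULN-1"),   # placeholder identifiers, not real CVEs
    ("web-01", "http", "VULN-2"),
    ("web-01", "ssh", "VULN-3"),
]
hierarchy = build_hierarchy(rows)
```

Each nesting level then maps to one enclosing shape: the asset circle contains its service circles, which in turn contain the vulnerability discs.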
In the above snapshot, the same query is shown (assets per services per vulnerabilities)
with the same color schema (assets displayed with brown color, services with yellow color,
and vulnerabilities with sky blue color) but, in this case, elements are not encircled but laid
on a concentric set of discs, each one representing a layer.
It is worth noting that, in all the views, the user can interact at any time with
what is currently displayed; if the user clicks on any figure, a new window with all the
detailed information about the element is shown.
The tree map view is quite similar to the circle packing, but in this case it represents
a Hilbert space decomposition. Again, assets, their services and their associated
vulnerabilities are shown with the same color code and grouped in the displayed boxes.
As in all the other visualizations, the user can interact with the view.
The implemented visualizations are not limited to these examples; they comprise an
extended range of techniques, all of them intended to help detect patterns in complex
and multi-dimensional datasets. As relevant features, they are graph-based and provide
means to show multi-dimensional interrelated data in a graph with only a few dimensions.
5.3. Verification
After the validation process was successfully completed, a verification of the prototype
was conducted with Threat Hunters (i) to ensure that the defined architecture copes with
all the envisioned scenarios outlined in Section 2 and (ii) to validate the performance of the
prototype against other solutions in the existing state-of-the-art.
Because no two people are identical, it is difficult to ensure that a system is good
enough for everyone, but with a large enough population a subjective approximation of
whether it is fairly good or not can be obtained. The subjective verification process
was split into three stages:
(i) Firstly, the implemented prototype was deployed in the networks monitored by the
Threat Hunters in charge of evaluating it. (ii) After several months (time enough to have
sufficient data in the prototype to obtain valid results through the ML components), the
prototype was used by Threat Hunters in parallel with their own systems. (iii) Lastly,
Threat Hunters were asked to answer specific surveys (some of whose questions are shown
in Table 2) to determine how valid the system is.
Question
Does the prototype give fast access to the information considered as relevant?
Does the prototype receive updated information from external sources?
Does the prototype send information to external sources?
Does the prototype provide tools to easily create/edit/delete preprocessing components?
Does the prototype provide tools to easily create/edit/delete ML components?
Does the prototype help at the decision making process?
Is the prototype easy to use?
The survey answers showed that, generally, the prototype was useful and the proposed
architecture is strong enough to be used as a Threat Hunting tool for Critical Infrastructures.
Aside from the subjective evaluation of the prototype, some metrics of the hypothesis
generator component were also calculated; the results are presented in Table 3.
6. Conclusions
In the previous sections, the architecture and all its features have been presented,
followed by an exhaustive overall validation and verification. The results obtained can
be used to compare given features to others from the tools and systems in the existing
state-of-the-art. This comparison has drawn the following conclusions.
Firstly, it has been pointed out that the tools used by Threat Hunters in Critical
Infrastructures need to be improved to support their daily job. Among all the difficulties
that Threat Hunters must face, a critical one is the vast amount of data that they must
process, with the consequent degradation of situation understanding and decision making
and the associated cognitive overload.
This work, alongside others in the state-of-the-art, aims to solve that problem by
proposing an architecture that helps Threat Hunters by reducing the information presented
to them, using a Machine Learning approach that provides suggestions and hints about what
is going on.
The current systems and tools in the state-of-the-art are mainly focused on the
generation of IoCs, but none of them provides tools to help Threat Hunters in the
hypothesis generation process. As a consequence, there is a gap in the generation of
hypotheses using raw and/or ML processed data to know what is going on in the monitored
system, which the proposed architecture tries to fill by enforcing hypothesis generation
as a main aid to Threat Hunters. Consequently, one of the main contributions of the work
described (and not fully found in similar solutions) is the capability provided to
Threat Hunters of being helped by ML processes in generating complex and elaborate
hypotheses about the current situation and what is most likely to happen in the near future.
Furthermore, a key aspect of this kind of system, namely, visualization, is not fully exploited
through the tools surveyed in the state-of-the-art, whereas in the proposed architecture,
Author Contributions: Writing—original draft, M.A.L., I.P.L. and M.E.D. All authors have read and
agreed to the published version of the manuscript.
Funding: This work was supported by the European Commission’s Project PRAETORIAN (Pro-
tection of Critical Infrastructures from advanced combined cyber and physical threats) under the
Horizon 2020 Framework (Grant Agreement No. 101021274).
Data Availability Statement: The data analyzed in this study was synthetically generated. Data
sharing is not applicable to this article.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
IT Information Technology
ML Machine Learning
OS Operating System
OSINT Open Source Intelligence
OTP One-Time Password
SDN Software-Defined Networks
SIEM Security Information and Event Management
SME Small and Medium Enterprise
SSLA Security Service Level Agreements
TMP Threat Management Platforms
VPN Virtual Private Network
VR Virtual Reality
References
1. PRAETORIAN. D3.1 Transitioning Risk Management, 2021. PRAETORIAN H2020 Project Deliverables. Not yet published.
2. Li, J.H. Cyber security meets artificial intelligence: A survey. Front. Inf. Technol. Electron. Eng. 2018, 19, 1462–1474. [CrossRef]
3. Falandays, J.B.; Nguyen, B.; Spivey, M.J. Is prediction nothing more than multi-scale pattern completion of the future? Brain Res.
2021, 1768, 147578. [CrossRef]
4. Federmeier, K.D. Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology 2007, 44, 491–505.
[CrossRef] [PubMed]
5. Riegler, A. The role of anticipation in cognition. In Proceedings of the AIP Conference Proceedings. Am. Inst. Phys. 2001, 573,
534–541.
6. Slattery, T.J.; Yates, M. Word skipping: Effects of word length, predictability, spelling and reading skill. Q. J. Exp. Psychol. 2018,
71, 250–259. [CrossRef] [PubMed]
7. Lehner, P.; Seyed-Solorforough, M.M.; O’Connor, M.F.; Sak, S.; Mullin, T. Cognitive biases and time stress in team decision
making. IEEE Trans. Syst. Man -Cybern.-Part Syst. Humans 1997, 27, 698–703. [CrossRef]
8. Bilge, L.; Dumitraş, T. Before we knew it: An empirical study of zero-day attacks in the real world. In Proceedings of the 2012
ACM Conference on Computer and Communications Security, Raleigh North, CA, USA, 16–18 October 2012; pp. 833–844.
9. Markowsky, G.; Markowsky, L. Visualizing cybersecurity events. In Proceedings of the International Conference on Security and
Management (SAM), Las Vegas, NV, USA, 22–25 July 2013; p. 1.
10. Young, C.S. Representing Cybersecurity Risk. In Cybercomplexity; Springer: Berlin/Heidelberg, Germany, 2022; pp. 19–24.
11. Endsley, M.R. Measurement of situation awareness in dynamic systems. Hum. Factors 1995, 37, 65–84. [CrossRef]
12. Franke, U.; Brynielsson, J. Cyber situational awareness–a systematic review of the literature. Comput. Secur. 2014, 46, 18–31.
[CrossRef]
13. Chen, S.; Guo, C.; Yuan, X.; Merkle, F.; Schaefer, H.; Ertl, T. Oceans: Online collaborative explorative analysis on network security.
In Proceedings of Eleventh Workshop on Visualization for Cyber Security, Paris, France, 10 November 2014; pp. 1–8.
14. Choi, H.; Lee, H. PCAV: Internet attack visualization on parallel coordinates. In Proceedings of the International Conference on
Information and Communications Security, Beijing, China, 10–13 December 2005; Springer: Berlin/Heidelberg, Germany, 2005;
pp. 454–466.
15. Jahromi, A.N.; Hashemi, S.; Dehghantanha, A.; Parizi, R.M.; Choo, K.K.R. An enhanced stacked LSTM method with no random
initialization for malware threat hunting in safety and time-critical systems. IEEE Trans. Emerg. Top. Comput. Intell. 2020,
4, 630–640. [CrossRef]
16. Schmitt, S.; Kandah, F.I.; Brownell, D. Intelligent threat hunting in software-defined networking. In Proceedings of the 2019 IEEE
International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 11–13 January 2019; IEEE: Piscataway, NJ, USA,
2019; pp. 1–5.
17. Schmitt, S. Advanced Threat Hunting over Software-Defined Networks in Smart Cities; University of Tennessee at Chattanooga:
Chattanooga, Tennessee, USA, 2018.
18. HaddadPajouh, H.; Dehghantanha, A.; Khayami, R.; Choo, K.K.R. A deep recurrent neural network based approach for internet
of things malware threat hunting. Future Gener. Comput. Syst. 2018, 85, 88–96. [CrossRef]
19. Raju, A.D.; Abualhaol, I.Y.; Giagone, R.S.; Zhou, Y.; Huang, S. A survey on cross-architectural IoT malware threat hunting. IEEE
Access 2021, 9, 91686–91709. [CrossRef]
20. Homayoun, S.; Dehghantanha, A.; Ahmadzadeh, M.; Hashemi, S.; Khayami, R. Know abnormal, find evil: Frequent pattern
mining for ransomware threat hunting and intelligence. IEEE Trans. Emerg. Top. Comput. 2017, 8, 341–351. [CrossRef]
21. Neto, A.J.H.; dos Santos, A.F.P. Cyber threat hunting through automated hypothesis and multi-criteria decision making. In
Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; IEEE:
Piscataway, NJ, USA, 2020; pp. 1823–1830.
22. Gonzalez-Granadillo, G.; Faiella, M.; Medeiros, I.; Azevedo, R.; Gonzalez-Zarzosa, S. ETIP: An Enriched Threat Intelligence
Platform for improving OSINT correlation, analysis, visualization and sharing capabilities. J. Inf. Secur. Appl. 2021, 58, 102715.
[CrossRef]
Big Data Cogn. Comput. 2023, 7, 65 23 of 26
23. Azevedo, R.; Medeiros, I.; Bessani, A. PURE: Generating quality threat intelligence by clustering and correlating OSINT. In
Proceedings of the 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications
(TrustCom), Rotorua, New Zealand, 5–8 August 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 483–490.
24. Alves, F.; Ferreira, P.M.; Bessani, A. OSINT-based Data-driven Cybersecurity Discovery. In Proceedings of the 12th Eurosys
Doctoral Conference, Porto, Portugal, 23 April 2018; pp. 1–5.
25. Kott, A.; Wang, C.; Erbacher, R.F. Cyber Defense and Situational Awareness; Springer: Berlin/Heidelberg, Germany, 2015; Volume 62.
26. Greitzer, F.L.; Noonan, C.F.; Franklin, L. Cognitive Foundations for Visual Analytics; Technical Report; Pacific Northwest National
Lab.(PNNL): Richland, WA, USA, 2011.
27. Eslami, M.; Zheng, G.; Eramian, H.; Levchuk, G. Deriving cyber use cases from graph projections of cyber data represented as
bipartite graphs. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14
December 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 4658–4663.
28. Kotenko, I.; Novikova, E. Visualization of security metrics for cyber situation awareness. In Proceedings of the 2014 Ninth
International Conference on Availability, Reliability and Security, Fribourg, Switzerland, 8–12 September 2014; IEEE: Piscataway,
NJ, USA, 2014; pp. 506–513.
29. Beaver, J.M.; Steed, C.A.; Patton, R.M.; Cui, X.; Schultz, M. Visualization techniques for computer network defense. In
Proceedings of the Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for Homeland Security
and Homeland Defense X. SPIE, Orlando, FL, USA, 25–28 April 2011; Volume 8019, pp. 18–26.
30. Goodall, J.R.; Ragan, E.D.; Steed, C.A.; Reed, J.W.; Richardson, G.D.; Huffer, K.M.; Bridges, R.A.; Laska, J.A. Situ: Identifying and
explaining suspicious behavior in networks. IEEE Trans. Vis. Comput. Graph. 2018, 25, 204–214. [CrossRef] [PubMed]
31. Zhuo, Y.; Zhang, Q.; Gong, Z. Cyberspace situation representation based on niche theory. In Proceedings of the 2008 International
Conference on Information and Automation, Zhangjiajie, China, 20–23 June 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 1400–1405.
32. Pike, W.A.; Scherrer, C.; Zabriskie, S. Putting security in context: Visual correlation of network activity with real-world information.
In VizSEC 2007; Springer: Berlin/Heidelberg, Germany, 2008; pp. 203–220.
33. Abraham, S.; Nair, S. Comparative analysis and patch optimization using the cyber security analytics framework. J. Def. Model.
Simul. 2018, 15, 161–180. [CrossRef]
34. Graf, R.; Gordea, S.; Ryan, H.M.; Houzanme, T. An Expert System for Facilitating an Institutional Risk Profile Definition for Cyber
Situational Awareness. In Proceedings of the ICISSP, Rome, Italy, 19–21 February 2016; pp. 347–354.
35. Lohmann, S.; Heimerl, F.; Bopp, F.; Burch, M.; Ertl, T. Concentri cloud: Word cloud visualization for multiple text documents. In
Proceedings of the 2015 19th International Conference on Information Visualisation, Barcelona, Spain, 22–24 July 2015; IEEE:
Piscataway, NJ, USA, 2015; pp. 114–120.
36. Xu, J.; Tao, Y.; Lin, H. Semantic word cloud generation based on word embeddings. In Proceedings of the 2016 IEEE Pacific
Visualization Symposium (PacificVis), Taipei, Taiwan, 19–22 April 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 239–243.
37. De Ville, B. Decision trees. Wiley Interdiscip. Rev. Comput. Stat. 2013, 5, 448–455.
38. Tak, S.; Cockburn, A. Enhanced spatial stability with hilbert and moore treemaps. IEEE Trans. Vis. Comput. Graph. 2012,
19, 141–148. [CrossRef]
39. Angelini, M.; Bonomi, S.; Lenti, S.; Santucci, G.; Taggi, S. MAD: A visual analytics solution for Multi-step cyber Attacks Detection.
J. Comput. Lang. 2019, 52, 10–24.
40. Zhong, C.; Alnusair, A.; Sayger, B.; Troxell, A.; Yao, J. AOH-map: A mind mapping system for supporting collaborative cyber
security analysis. In Proceedings of the 2019 IEEE Conference on Cognitive and Computational Aspects of Situation Management
(CogSIMA), Las Vegas, NV, USA, 8–11 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 74–80.
41. Cho, S.; Han, I.; Jeong, H.; Kim, J.; Koo, S.; Oh, H.; Park, M. Cyber kill chain based threat taxonomy and its application on
cyber common operational picture. In Proceedings of the 2018 International Conference On Cyber Situational Awareness, Data
Analytics And Assessment (Cyber SA), Glasgow, Scotland, UK, 11–12 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8.
42. Kabil, A.; Duval, T.; Cuppens, N.; Comte, G.L.; Halgand, Y.; Ponchel, C. From cyber security activities to collaborative virtual
environments practices through the 3D cybercop platform. In Proceedings of the International Conference on Information
Systems Security, Funchal, Madeira, Portugal, 22–24 January 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 272–287.
43. Kopylec, J.; D’Amico, A.; Goodall, J. Visualizing cascading failures in critical cyber infrastructures. In Proceedings of the
International Conference on Critical Infrastructure Protection, Hanover, NH, USA, 18–21 March 2007; Springer: Berlin/Heidelberg,
Germany, 2007; pp. 351–364.
44. Llopis, S.; Hingant, J.; Pérez, I.; Esteve, M.; Carvajal, F.; Mees, W.; Debatty, T. A comparative analysis of visualisation techniques
to achieve cyber situational awareness in the military. In Proceedings of the 2018 International Conference on Military
Communications and Information Systems (ICMCIS), Varsoiva, Poland, 22–23 May 2018; IEEE: Piscataway, NJ, USA, 2018;
pp. 1–7.
45. Carvalho, V.S.; Polidoro, M.J.; Magalhaes, J.P. Owlsight: Platform for real-time detection and visualization of cyber threats. In
Proceedings of the 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), New York, NY,
USA, 8–10 April 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 61–66.
46. Pietrowicz, S.; Falchuk, B.; Kolarov, A.; Naidu, A. Web-Based Smart Grid Network Analytics Framework. In Proceedings of the
2015 IEEE International Conference on Information Reuse and Integration, San Francisco, CA, USA, 13–15 August 2015; IEEE:
Piscataway, NJ, USA, 2015; pp. 496–501.
Big Data Cogn. Comput. 2023, 7, 65 24 of 26
47. Matuszak, W.J.; DiPippo, L.; Sun, Y.L. Cybersave: Situational awareness visualization for cyber security of smart grid systems. In
Proceedings of the Tenth Workshop on Visualization for Cyber Security, Atlanta, GA, USA, 14 October 2013; pp. 25–32.
48. Kabil, A.; Duval, T.; Cuppens, N. Alert characterization by non-expert users in a cybersecurity virtual environment: A usability
study. In Proceedings of the International Conference on Augmented Reality, Virtual Reality and Computer Graphics, Lecce,
Italy, 7–10 September 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 82–101.
49. Kullman, K.; Cowley, J.; Ben-Asher, N. Enhancing cyber defense situational awareness using 3D visualizations. In Proceedings of
the 13th International Conference on Cyber Warfare and Security ICCWS 2018, National Defense University, Washington, DC,
USA, 8–9 March 2018; pp. 369–378.
50. Kullman, K.; Asher, N.B.; Sample, C. Operator impressions of 3D visualizations for cybersecurity analysts. In Proceedings of the
ECCWS 2019 18th European Conference on Cyber Warfare and Security, Coimbra, Portugal, 4–5 July 2019; Academic Conferences
and publishing limited: Red Hook, NY, USA, 2019; p. 257.
51. Reed, J. Threat Hunting with ML: Another Reason to SMLE. 17 February 2021. Available online: [Link]
us/blog/platform/[Link] (accessed on 28 March 2023).
52. Liang, J.; Kim, Y. Evolution of Firewalls: Toward Securer Network Using Next Generation Firewall. In Proceedings of the
2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC), Virutal, 26–29 January 2022; IEEE:
Piscataway, NJ, USA, 2022; pp. 752–759.
53. IBM X-Force Exchange. Available online: [Link] (accessed on 3 March 2023).
54. The Security Immune System: An Integrated Approach to Protecting Your Organization. Available online: [Link]
[Link]/pdf/[Link] (accessed on 3 March 2023).
55. Anomali ThreatStream: Automated Threat Intelligence Management at Scale. Available online: [Link]
products/threatstream (accessed on 3 March 2023).
56. Wang, B.; Najjar, L.; Xiong, N.N.; Chen, R.C. Stochastic optimization: Theory and applications. J. Appl. Math. 2013, 2013, 949131.
[CrossRef]
57. McCall, J. Genetic algorithms for modelling and optimisation. J. Comput. Appl. Math. 2005, 184, 205–222. [CrossRef]
58. Jangla, K. Docker Compose. In Accelerating Development Velocity Using Docker; Springer: Berlin/Heidelberg, Germany, 2018;
pp. 77–98.
59. Li, Y.; Li, W.; Jiang, C. A survey of virtual machine system: Current technology and future trends. In Proceedings of the 2010
Third International Symposium on Electronic Commerce and Security, Guangzhou, China, 29–31 July 2010; IEEE: Piscataway, NJ,
USA, 2010; pp. 332–336.
60. Medel, V.; Rana, O.; Bañares, J.Á.; Arronategui, U. Modelling performance & resource management in kubernetes. In Proceedings
of the 9th International Conference on Utility and Cloud Computing, Shanghai, Chine, 6–9 December 2016; pp. 257–262.
61. Kotas, C.; Naughton, T.; Imam, N. A comparison of Amazon Web Services and Microsoft Azure cloud platforms for high
performance computing. In Proceedings of the 2018 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas,
NV, USA, 12–14 January 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–4.
62. Gray, J.; Siewiorek, D.P. High-availability computer systems. Computer 1991, 24, 39–48. [CrossRef]
63. Wilson, K.S. Conflicts among the pillars of information assurance. IT Prof. 2012, 15, 44–49. [CrossRef]
64. Rinaldi, S.M.; Peerenboom, J.P.; Kelly, T.K. Identifying, understanding, and analyzing critical infrastructure interdependencies.
IEEE Control Syst. Mag. 2001, 21, 11–25.
65. Fleissner, S.; Baniassad, E. A commensalistic software system. In Proceedings of the Companion to the 21st ACM SIGPLAN
Symposium on Object-Oriented Programming Systems, Languages, and Applications, Portland, OR, USA, 22–26 October 2006;
pp. 560–573.
66. Torchiano, M.; Jaccheri, L.; Sørensen, C.F.; Wang, A.I. COTS products characterization. In Proceedings of the 14th International
Conference on Software Engineering and Knowledge Engineering, Ischia, Italy, 15–19 July 2002; pp. 335–338.
67. Coppolino, L.; D’Antonio, S.; Formicola, V.; Romano, L. Integration of a System for Critical Infrastructure Protection with the
OSSIM SIEM Platform: A dam case study. In Proceedings of the International Conference on Computer Safety, Reliability, and
Security, Naples, Italy, 19–22 September 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 199–212.
68. Cerullo, G.; Formicola, V.; Iamiglio, P.; Sgaglione, L. Critical Infrastructure Protection: Having SIEM technology cope with
network heterogeneity. arXiv 2014, arXiv:1404.7563.
69. Veselý, V. Extended Comparison Study on Merging PCAP Files. ElectroScope 2012, 2012, 1–6.
70. Wagner, C.; Dulaunoy, A.; Wagener, G.; Iklody, A. Misp: The design and implementation of a collaborative threat intelligence
sharing platform. In Proceedings of the 2016 ACM on Workshop on Information Sharing and Collaborative Security, Vienna,
Austria, 24 October 2016; pp. 49–56.
71. Groenewegen, A.; Janssen, J.S. TheHive Project: The Maturity of an Open-Source Security Incident Response Platform; SNE/OS3;
University of Amsterdam: Amsterdam, The Netherlands, 2021.
72. Gonashvili, M. Knowledge Management for Incident Response Teams; Masaryk University: Brno, Czech Republic, 2019.
73. Cole, E. Advanced Persistent Threat: Understanding the Danger and How to Protect Your Organization; Syngress: Oxford, UK, 2012.
74. Tabatabaei, F.; Wells, D. OSINT in the Context of Cyber-Security. Open Source Intell. Investig. 2016, 1, 213–231.
75. Verhoef, R. Sigma Rules! The Generic Signature Format for SIEM Systems. 19 June 2020. Available online: [Link]
diary/rss/26258 (accessed on 7 February 2023).
Big Data Cogn. Comput. 2023, 7, 65 25 of 26
76. Ömer. What Is Sigma? Threat Hunting in Siem Products with Sigma Rules–Example Sigma Rules. 21 March 2021. Available
online: [Link]
sigma-rules/ (accessed on 7 February 2023).
77. Naik, N.; Jenkins, P.; Savage, N.; Yang, L.; Boongoen, T.; Iam-On, N.; Naik, K.; Song, J. Embedded YARA rules: Strengthening
YARA rules utilising fuzzy hashing and fuzzy rules for malware analysis. Complex Intell. Syst. 2021, 7, 687–702. [CrossRef]
78. Naik, N.; Jenkins, P.; Savage, N.; Yang, L. Cyberthreat Hunting-Part 1: Triaging ransomware using fuzzy hashing, import hashing
and YARA rules. In Proceedings of the 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), New Orleans, LA,
USA, 23–26 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6.
79. Knuth, D.E. The Art of Computer Programming, 2nd ed.; Sorting and Searching; Addison Wesley Longman Publishing Co., Inc.:
Boston, MA, USA, 1998; Volume 3.
80. Gianvecchio, S.; Burkhalter, C.; Lan, H.; Sillers, A.; Smith, K. Closing the Gap with APTs Through Semantic Clusters and
Automated Cybergames. In Proceedings of the Security and Privacy in Communication Networks, Orlando, FL, USA, 23–25
October 2019; Chen, S., Choo, K.K.R., Fu, X., Lou, W., Mohaisen, A., Eds.; Springer International Publishing: Cham, Switzerland,
2019; pp. 235–254.
81. Divya, M.S.; Goyal, S.K. ElasticSearch: An advanced and quick search technique to handle voluminous data. Compusoft 2013,
2, 171.
82. Hancock, J.T.; Khoshgoftaar, T.M. Survey on categorical data for neural networks. J. Big Data 2020, 7, 28. [CrossRef]
83. Schetinin, V.; Schult, J. A neural-network technique to learn concepts from electroencephalograms. Theory Biosci. 2005, 124, 41–53.
[CrossRef]
84. Gallant, S.I.; Gallant, S.I. Neural Network Learning and Expert Systems; MIT Press: Cambridge, MA, USA, 1993.
85. Murthy, S.K.; Kasif, S.; Salzberg, S. A system for induction of oblique decision trees. J. Artif. Intell. Res. 1994, 2, 1–32. [CrossRef]
86. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106. [CrossRef]
87. Zhang, T.; Ramakrishnan, R.; Livny, M. BIRCH: A new data clustering algorithm and its applications. Data Min. Knowl. Discov.
1997, 1, 141–182. [CrossRef]
88. Zhang, T.; Ramakrishnan, R.; Livny, M. BIRCH: An efficient data clustering method for very large databases. ACM Sigmod Rec.
1996, 25, 103–114. [CrossRef]
89. Khan, K.; Rehman, S.U.; Aziz, K.; Fong, S.; Sarasvady, S. DBSCAN: Past, present and future. In Proceedings of the Fifth
International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), Chennai, India,
17–19 February 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 232–238.
90. Çelik, M.; Dadaşer-Çelik, F.; Dokuz, A.Ş. Anomaly detection in temperature data using DBSCAN algorithm. In Proceedings of
the 2011 International Symposium on Innovations in Intelligent Systems and Applications, Istanbul, Turkey, 15–18 June 2011;
IEEE: Piscataway, NJ, USA, 2011; pp. 91–95.
91. Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data
Mining, Pisa, Italy, 15–19 December 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 413–422.
92. Ding, Z.; Fei, M. An anomaly detection approach based on isolation forest algorithm for streaming data using sliding window.
IFAC Proc. Vol. 2013, 46, 12–17. [CrossRef]
93. Amer, M.; Goldstein, M.; Abdennadher, S. Enhancing one-class support vector machines for unsupervised anomaly detection. In
Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description, Chicago, Illinois, 11 August 2013; pp. 8–15.
94. Hejazi, M.; Singh, Y.P. One-class support vector machines approach to anomaly detection. Appl. Artif. Intell. 2013, 27, 351–366.
[CrossRef]
95. Ukwen, D.O.; Karabatak, M. Review of NLP-based Systems in Digital Forensics and Cybersecurity. In Proceedings of the 2021
9th International Symposium on Digital Forensics and Security (ISDFS), Elazig, Turkey, 28–29 June 2021; IEEE: Piscataway, NJ,
USA, 2021; pp. 1–9.
96. Georgescu, T.M. Natural language processing model for automatic analysis of cybersecurity-related documents. Symmetry 2020,
12, 354. [CrossRef]
97. Mathews, S.M. Explainable artificial intelligence applications in NLP, biomedical, and malware classification: A literature review.
In Proceedings of the Intelligent Computing-Proceedings of the Computing Conference, London, UK, 16–17 July 2019; Springer:
Berlin/Heidelberg, Germany, 2019; pp. 1269–1292.
98. Al-Omari, M.; Rawashdeh, M.; Qutaishat, F.; Alshira’H, M.; Ababneh, N. An intelligent tree-based intrusion detection model for
cyber security. J. Netw. Syst. Manag. 2021, 29, 20. [CrossRef]
99. Sarker, I.H. Deep cybersecurity: A comprehensive overview from neural network and deep learning perspective. SN Comput. Sci.
2021, 2, 154.
100. Fang, H. Managing data lakes in big data era: What’s a data lake and why has it became popular in data management ecosystem.
In Proceedings of the 2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems
(CYBER), Shenyang, China, 8–12 June 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 820–824.
101. Goyal, G.; Singh, K.; Ramkumar, K. A detailed analysis of data consistency concepts in data exchange formats (JSON & XML). In
Proceedings of the 2017 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida,
India, 5–6 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 72–77.
Big Data Cogn. Comput. 2023, 7, 65 26 of 26
102. Barnum, S. Standardizing cyber threat intelligence information with the structured threat information expression (stix). Mitre
Corp. 2012, 11, 1–22.
103. Riesco, R.; Villagrá, V.A. Leveraging cyber threat intelligence for a dynamic risk framework. Int. J. Inf. Secur. 2019, 18, 715–739.
[CrossRef]
104. Na, S.; Kim, T.; Kim, H. A study on the classification of common vulnerabilities and exposures using naïve bayes. In Proceedings
of the International Conference on Broadband and Wireless Computing, Communication and Applications, Asan, Republic of
Korea, 5–7 November 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 657–662.
105. Radack, S.; Kuhn, R. Managing security: The security content automation protocol. IT Prof. 2011, 13, 9–11. [CrossRef]
106. VirusTotal: Analyse Suspicious Files, Domains, IPs and URLs to Detect Malware and Other Breaches, Automatically Share Them
with the Security Community. Available online: [Link] (accessed on 3 March 2023).
107. URLhaus: Malware URL Exchange. Available online: [Link] (accessed on 3 March 2023).
108. Masse, M. REST API Design Rulebook: Designing Consistent RESTful Web Service Interfaces; O’Reilly Media, Inc.’: Sebastopol, CA,
USA, 2011.
109. Naik, N. Choice of effective messaging protocols for IoT systems: MQTT, CoAP, AMQP and HTTP. In Proceedings of the 2017
IEEE International Systems Engineering Symposium (ISSE), Vienna, Austria, 11–13 October 2017; IEEE: Piscataway, NJ, USA,
2017; pp. 1–7.
110. Sandhu, R.S.; Coyne, E.J.; Feinstein, H.L.; Youman, C.E. Role-based access control models. Computer 1996, 29, 38–47. [CrossRef]
111. Tomasek, M.; Cerny, T. On web services ui in user interface generation in standalone applications. In Proceedings of the 2015
Conference on Research in Adaptive and Convergent Systems, Prague, Czech Republic, 9–12 October 2015; pp. 363–368.
112. Montesi, F.; Weber, J. Circuit breakers, discovery, and API gateways in microservices. arXiv 2016, arXiv:1609.05830.
113. Xu, R.; Jin, W.; Kim, D. Microservice security agent based on API gateway in edge computing. Sensors 2019, 19, 4905. [CrossRef]
[PubMed]
114. Jeong, J.; Chung, M.Y.; Choo, H. Integrated OTP-based user authentication scheme using smart cards in home networks. In
Proceedings of the 41st Annual Hawaii International Conference on System Sciences (HICSS 2008), Big Island, HI, USA, 7–10
January 2008; IEEE: Piscataway, NJ, USA, 2008; p. 294.
115. Zhao, S.; Hu, W. Improvement on OTP authentication and a possession-based authentication framework. Int. J. Multimed. Intell.
Secur. 2018, 3, 187–203. [CrossRef]
116. Bihis, C. Mastering OAuth 2.0; Packt Publishing Ltd.: Birmingham, UK, 2015.
117. Hardt, D. The OAuth 2.0 Authorization Framework. RFC 6749, RFC Editor, 2012. Available online: [Link]
rfc/[Link] (accessed on 28 March 2023).
118. Haag, S.; Anderl, R. Digital twin–Proof of concept. Manuf. Lett. 2018, 15, 64–66. [CrossRef]
119. Srinath, K. Python–the fastest growing programming language. Int. Res. J. Eng. Technol. 2017, 4, 354–357.
120. Nelli, F. Python Data Analytics: Data Analysis and Science Using PANDAs, Matplotlib and the Python Programming Language; Apress:
Sebastopol, CA, USA, 2015.
121. Hao, J.; Ho, T.K. Machine learning made easy: A review of scikit-learn package in python programming language. J. Educ. Behav.
Stat. 2019, 44, 348–361. [CrossRef]
122. Al-Shaer, R.; Spring, J.M.; Christou, E. Learning the associations of mitre att & ck adversarial techniques. In Proceedings of the
2020 IEEE Conference on Communications and Network Security (CNS), Virtual, 28–30 June 2020; IEEE: Piscataway, NJ, USA,
2020; pp. 1–9.
123. Alexander, O.; Belisle, M.; Steele, J. MITRE ATT&CK for Industrial Control Systems: Design and Philosophy; The MITRE Corporation:
Bedford, MA, USA, 2020.
124. Ahmed, M.; Panda, S.; Xenakis, C.; Panaousis, E. MITRE ATT&CK-driven cyber risk assessment. In Proceedings of the 17th
International Conference on Availability, Reliability and Security, Vienna, Austria, 23–26 August 2022; pp. 1–10.
125. Roy, G.M. RabbitMQ in Depth; Simon and Schuster: New York, NY, USA, 2017.
126. Ionescu, V.M. The analysis of the performance of RabbitMQ and ActiveMQ. In Proceedings of the 2015 14th RoEduNet
International Conference-Networking in Education and Research (RoEduNet NER), Craiova, Romania, 24–26 September 2015;
IEEE: Piscataway, NJ, USA, 2015; pp. 132–137.
127. Rostanski, M.; Grochla, K.; Seman, A. Evaluation of highly available and fault-tolerant middleware clustered architectures using
RabbitMQ. In Proceedings of the 2014 Federated Conference on Computer Science and Information Systems, Warsaw, Poland,
7–10 September 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 879–884.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.