AI Threats Landscape and Cybersecurity
AI Threats Landscape and Cybersecurity
L A ND S C A P E
RE PO RT
NAVIGATING THE
RISE OF AI RISKS
AI THREAT LANDSCAPE 2025
2024
Table of Contents
Foreword 03
What’s New in AI 10
Resources 49
About HiddenLayer 51
1
AI THREAT LANDSCAPE 2025
2024
Foreword
Artificial intelligence is no longer an emerging force – it is an embedded reality shaping economies,
industries, and societies at an unparalleled scale. Every mission, organization, and individual has felt its
impact, with AI driving efficiency, automation, and problem-solving breakthroughs. Yet, as its influence
expands, so too do the risks. The past year has emphasized a critical truth: the greatest threat to AI is not
the technology itself but the people who exploit it.
The AI landscape is evolving rapidly, with open-source models and smaller, more accessible architectures
accelerating innovation and risk. These advancements lower the barrier to entry, allowing more
organizations to leverage AI but they also widen the attack surface, making AI systems more susceptible
to manipulation, data poisoning, and adversarial exploitation. Meanwhile, hyped new model trends like
DeepSeek are introducing unprecedented risks and impacting geopolitical power dynamics.
Artificial intelligence remains the most vulnerable technology ever deployed at scale. Its security
challenges extend far beyond code, impacting every phase of its lifecycle from training and development
to deployment and real-world operations. Adversarial AI threats are evolving, blending traditional
cybersecurity tactics with new, AI-specific attack methods.
In this report, we explore the vulnerabilities introduced by these developments and their real-world
consequences for commercial and federal sectors. We provide insights from IT security and data science
leaders actively defending against these threats, along with predictions informed by HiddenLayer’s
hands-on experience in AI security. Most importantly, we highlight the advancements in security controls
essential for protecting AI in all its forms.
As AI continues to drive progress, securing its future is a responsibility shared by developers, data scien-
tists, and security professionals alike. This report is a crucial resource for understanding and mitigating AI
risks in a rapidly shifting landscape.
We are proud to present the second annual HiddenLayer AI Threat Landscape Report, expanding on last
year’s insights and charting the path forward for securing AI.
3
Security for AI Survey
Insights at a Glance
AI has become indispensable to modern business, powering critical functions and driving innovation. However, as
organizations increasingly rely on AI, traditional security measures have struggled to keep up with the growing
sophistication of threats.
The 2025 survey results highlight this tension: while many AI’s Critical Role in
IT leaders recognize AI’s central role in their company’s
success, there’s more work to implement comprehensive Business Success
security measures. Issues like shadow AI, ownership
debates, and limited security tool adoption contribute to
the challenges. However, the survey results show an of IT leaders reported
89%
optimistic shift toward prioritizing AI security, with that most or all AI
organizations investing more in defenses, governance
models in production are
critical to their
frameworks, transparency, and resources to address
business’s success.
emerging threats.
4
AI THREAT LANDSCAPE 2025
2024
Financial Gain
of IT leaders reported
Business Disruption
74%
to definitely know if
they had an AI breach
in 2024 (up from 67%
reporting last year). Disclosure &
Transparency of AI
75% Breaches
say AI attacks have increased or remained the of IT leaders strongly
same from the previous year. agree that companies
21% Third-Party
Applications
Top 3 Third-Party Gen AI Applications
Currently In Use at Organizations:
Freelance Hackers
5
AI THREAT LANDSCAPE 2025
2024
34%
32%
51%
17%
21%
51% North America 21% South America 34% Europe 17% Africa 32% Asia 14% Unknown
46%
Only 32% of IT leaders are
6
AI THREAT LANDSCAPE 2025
2024
AI Governance
Frameworks & Policies Transparency & Ethical
Oversight
96%
of IT leaders have a
67%
of companies have a formal framework for dedicated ethics
securing AI and ML models. committee or person
overseeing AI ethics.
81%
of organizations have implemented an AI
governance committee. 98%
of organizations plan to make AI security
Top 3 Frameworks Used to Secure AI Include: practices partially transparent.
99%
consider securing
AI a high priority in
have internal debate about 2025.
7
2024 AI Threat
Landscape Timeline
AI tech milestones Risks related to the use of AI
Release of new adversarial tools and
New AI security measures
techniques, disclosure of new
and legislation
vulnerabilities in ML tooling
JAN LeftoverLocals: Listening to LLM responses through leaked GPU local memory
FEB Researchers demonstrate an attack against the Hugging Face conversion bot
FEB Six critical vulnerabilities providing a full attack chain found in ClearML
FEB Path traversal and out-of-bound read vulnerabilities disclosed in ONNXserialization format
MAR First model-stealing technique that extracts precise information from LLMs
APR Arbitrary code execution and command injection vulnerabilities found in AWS Sagemaker
JUN Knowledge Return Oriented Prompting - new LLM prompt injection technique
JUN Agility Robotics' Digit humanoid robot deployed in production at large factories
8
AI THREAT LANDSCAPE 2025
2024
JUL Coalition for Secure AI established under the OASIS global standards body
JUL NIST expands its AIRMF with the Generative Artificial Intelligence Profile
JUL Critical vulnerability in Wyze camera enables researchers to bypass the embedded
AI's object detection
SEP U.S., UK, and EU sign the Council of Europe’s Framework Convention on AI
SEP Microsoft shuts down first cybercriminal service providing users with access
to jailbroken GenAI
SEP Ten arbitrary code execution vulnerabilities and one critical WebUI vulnerability
disclosed in MindsDB
SEP Wiz finds critical NVIDIA AI vulnerability in containers using NVIDIA GPUs
OCT Lawsuit filed against Character.ai states that AI companion chatbot to blame
for teenager’s suicide
NOV GEMA sues OpenAI for copyright infringement over use of song lyrics in AI training
DEC Major AI supply chain attack using dependency compromise affects Ultralytics
DEC Arbitrary code execution while scanning keras HDF5 models found in Bosch AIShield
DEC Apple Intelligence found generating fake news attributed to the BBC
DEC Shadowcast - a new technique of stealthy data poisoning attacks against vision-language
models, presented at NeurIPS
9
What’s New in AI
The past year brought significant advancements in AI across multiple domains, including multimodal models,
retrieval-augmented generation (RAG), humanoid robotics, and agentic AI.
10
AI THREAT LANDSCAPE 2025
2024
11
AI THREAT LANDSCAPE 2025
2024
12
PART 1
KEY STAT
illicit tasks, from enhancing their phishing campaigns and
TIME SPENT ADDRESSING RISK financial scams to generating malicious code and
On average, IT leaders spend 46% of their time on automating attacks to spreading political misinformation.
AI addressing risk or security
13
AI THREAT LANDSCAPE 2025
2024
All this brings an incredible boost to both the quantity of Prediction from last year: “Deepfakes will be
attacks and their success rate. In the past year, we saw increasingly used in scam and disinformation”
several sophisticated phishing campaigns against Gmail
users using AI voice.
14
AI THREAT LANDSCAPE 2025
2024
Highly personalized exploits and attack DEEP AND DARK WEB CHATTER
scenarios tailored to particular victims
The dark web has long been recognized as a space
where adversaries can automate scanning
for vulnerabilities in targeted systems where communities form outside the boundaries of
societal norms. A subset of these communities
focuses on the exploitation of emerging
technologies. In forums reviewed within these
ecosystems, we have found a large number of posts
In September 2024, HP Wolf Security identified a were dedicated to leveraging well-known legitimate
cybercriminal campaign in which AI-generated or malicious AI services to facilitate illicit operations.
code was used as the initial payload. In the first
stage of the attack, the adversary targeted their
victims with malicious scripts designed to
download and execute further info-stealing The dark web discussions around the malicious use
malware. These scripts, written in either VBScript or of AI focused on three categories:
JavaScript, exhibited all the signs of being
AI-generated: explanatory comments, specific
function names, and specific code structure. A few Cyber attack techniques: Posts that outline
months earlier, Proofpoint researchers made the the use of AI to enhance phishing
campaigns, malware development, and other
same conclusion about malicious PowerShell
offensive tactics.
scripts used in another campaign by a threat actor
known as TA547. This proves that adversaries are Deepfakes creation: Discussions focused on
already automating the generation of at least the utilizing AI to bypass verification processes
simpler components in their toolsets. AI is also or create deceptive identities.
likely helping the attackers with obfuscation and
Creation of illicit material: Discussions
mutation of malware, making it more difficult to
about bypassing GenAI guardrails to
detect and attribute. generate content that violates legal and
ethical standards.
15
AI THREAT LANDSCAPE 2025
2024
16
AI THREAT LANDSCAPE 2025
2024
17
AI THREAT LANDSCAPE 2025
2024
18
PART 2
Adversarial Machine Learning Attacks - attacks against AI algorithms aimed to alter the model’s behavior,
evade AI-based detection, or steal the underlying technology
Generative AI System Attacks - attacks against AI’s filters and restrictions intended to generate harmful or
illegal content
Supply Chain Attacks - attacks against ML platforms, libraries, models, and other ML artifacts, whose goal is
to deliver traditional malware
19
AI THREAT LANDSCAPE 2025
2024
Model Deception: Adversaries perform model evasion attacks, in which specially crafted inputs exploit model
vulnerabilities to trigger misclassifications or bypass detection systems.
Model Corruption: Adversaries manipulate the training or continual learning process through data poisoning or model
backdoor attacks to compromise the model’s behavior while maintaining outward legitimacy.
Model and Data Exfiltration: Adversaries use model theft and privacy attacks to steal the model’s functionality or
sensitive training data, endangering intellectual property and data privacy.
These objectives manifest through various attack vectors, exploiting different aspects of machine learning systems'
architecture and operation.
MODEL EVASION
In model evasion attacks, an adversary intentionally manipulates the input to a model to fool it into making an
incorrect prediction. These attacks commonly target classifiers, i.e., models that predict the class labels or categories
for the given data, and can be used, for instance, to bypass AI-based detection, authentication/authorization, or visual
recognition systems.
Input
Decision
Process
Training Training Trained
Data Process Model Prediction
TRAINING PRODUCTION
20
AI THREAT LANDSCAPE 2025
2024
Early evasion techniques focused on minimally perturbed These advancements in evasion techniques across diffusion
adversarial examples, inputs modified so slightly that models, malware detection, and automotive systems
humans wouldn't notice the difference, which caused the demonstrate a concerning trend: adversarial attacks are
model to produce an attacker-desired outcome. Recent becoming increasingly sophisticated and domain-adaptive.
approaches have evolved beyond simple disturbances, The ability of these attacks to bypass various types of
manipulating semantic features and natural variations that defenses while maintaining naturalistic appearances poses
models should be robust against. Rather than relying on a significant challenge for AI security practitioners. The
imperceptible noise, advanced attackers exploit the need for comprehensive cross-domain defense strategies
fundamental limitations of how AI systems process and becomes paramount as AI systems continue to be deployed
interpret inputs, creating adversarial examples that appear in critical infrastructure and security-sensitive applications.
completely natural while reliably triggering specific
misclassifications across different deployment
KEY STAT
environments.
CRITICALITY OF AI MODELS TO BUSINESS
SUCCESS
Several recent research advances highlight
these sophisticated techniques.
21
AI THREAT LANDSCAPE 2025
2024
Input
Decision
Process
Training Training Trained
Data Process Model
Prediction
22
AI THREAT LANDSCAPE 2025
2024
AI Algorithm Backdooring
Cond.
Trigger module
Manipulated
Image
TURTLE
PREDICTION
CAT
Benign Image
also demonstrated that the safety filters of the HiddenLayer researchers discovered a novel
model can be removed by fine-tuning the model method for creating backdoors in neural network
on a very small number of adversarially crafted
models. Using this technique, dubbed
training samples. This research underlines the fact
ShadowLogic, an adversary can implant codeless,
that the immense efforts put into building GenAI
stealthy backdoors in models of any modality by
guardrails can be easily bypassed by simply
fine-tuning the model. manipulating the graph representation of the
model’s architecture. Backdoors created using this
technique will persist through fine-tuning, meaning
foundation models can be hijacked to trigger
ShadowLogic & Graph Backdoors
attacker-defined behavior in any downstream
application when a trigger input is received, making
AI models are serialized (i.e., saved in a form that can be
this attack technique a high-impact AI supply chain
stored or transmitted) using different file formats. Many of
risk. A trigger can be defined in many ways but
these formats utilize a graph representation to store the
must be specific to the model's modality. For
model structure. In machine learning, a graph is a
example, in an image classifier, the trigger must be
mathematical representation of the various computational
part of an image, such as a subset of pixels with
operations in a neural network. It describes the topological
particular values, or with an LLM, a specific
control flow that a model will follow in its typical operation.
keyword, or a sentence.
Graph-based formats include TensorFlow, ONNX, CoreML,
and OpenVino.
Much like with code in a compiled executable, an adversary The emergence of backdoors like ShadowLogic in
can specify a set of instructions for the model to execute computational graphs introduces a whole new class of
and inject these instructions into the file containing the model vulnerabilities that do not require traditional code
model's graph structure. Malicious instructions can override execution exploits. Unlike standard software backdoors that
the outcome of the model’s typical logic employing rely on executing malicious code, these backdoors are
attacker-controlled ‘shadow logic,’ and therefore embedded within the very structure of the model, making
compromising the model's reliability. Adversaries can craft them more challenging to detect and mitigate.
such payloads that will let them control the model's outputs
by triggering a specific behavior.
23
AI THREAT LANDSCAPE 2025
2024
Model graphs are commonly used for image classification models and real-time object detection systems that identify and
locate objects within images or video frames. In the United States, the Customs and Border Patrol (CBP) depends on image
classification and real-time object detection systems to protect the country at every point of entry, every day. AI backdoors of
this nature could enable contraband to go un-detected, weapons to pass screening or allow a terrorist to pass a CBP port of
entry without ever being flagged. The implications for national security are significant.
Inputs
Decision
Process
Reconstructed
Model
TRAINING PRODUCTION
24
AI THREAT LANDSCAPE 2025
2024
25
AI THREAT LANDSCAPE 2025
2024
Google Gemini
creating a line of nonsensical tokens, the LLM can
be fooled into outputting a confirmation message,
Google Gemini is a family of multimodal LLMs trained in usually including the information in the prompt.
many forms of media, such as text, images, audio, videos,
and code. While testing these models, HiddenLayer
researchers found multiple prompt hacking vulnerabilities, With the 2024 US elections, Google took special care to
including system prompt leakage, the ability to output ensure that the Gemini models did not generate
misinformation, and the ability to inject a model indirectly misinformation, particularly around politics. However, this
with a delayed payload via Google Drive. also was bypassed. Researchers generated fake news by
telling the bot that it was allowed to create fictional content
Although Gemini had been fine-tuned to avoid and that the content would not be used anywhere.
leaking its system prompt, it has been possible to
bypass these guardrails using synonyms and KROP - Knowledge Return Oriented Prompting
obfuscation. This attack exploited the Inverse
Scaling property of LLMs. As the models get larger, it Knowledge Return Oriented Prompting (KROP) is a novel
becomes challenging to fine-tune them on every prompt injection technique designed to bypass existing
single example of attack. Models, therefore, tend to safety measures in LLMs. Traditional defenses, such as
be susceptible to synonym attacks that the original prompt filters and alignment-based guardrails, aim to
developers may not have trained them on.
prevent malicious inputs by detecting and blocking explicit
Another successful method of leaking Gemini’s prompt injections. However, KROP circumvents these
system prompt was using patterns of repeated defenses by leveraging references from an LLM's training
uncommon tokens. This attack relies on data to construct obfuscated prompt injections. This
instruction-based fine-tuning. Most LLMs are trained method assembles "KROP Gadgets," analogous to Return
to respond to queries with a clear delineation Oriented Programming (ROP) gadgets in cybersecurity,
between the user’s input and the system prompt. By enabling attackers to manipulate LLM outputs without
direct or detectable malicious inputs.
In the academic paper that introduces this technique, researchers demonstrate the efficacy of KROP through
various examples, including bypassing content restrictions in models like DALL-E 3 and executing SQL injection
attacks via LLM-generated queries. For instance, adversaries could jailbreak the model's safeguards to generate
prohibited images by guiding the model to spit out restricted content through indirect references. KROP can also
allow attackers to produce harmful SQL commands without explicitly stating them, evading standard prompt filters.
26
AI THREAT LANDSCAPE 2025
2024
INDIRECT INJECTION
Besides traditional prompt inputs, many GenAI models now also accept external content, such as files or URLs,
making it easier for the user to share data conveniently. If an adversary controls this external content, they can
embed malicious prompts inside to perform a prompt injection attack indirectly. An indirect prompt injection
will typically be inserted into documents, images, emails, or websites, depending on what the target model has
access to.
LLM
Website
Maliocious Activity
27
AI THREAT LANDSCAPE 2025
2024
Claude is a multimodal AI assistant developed by Anthropic. With the multitude of bypass techniques, the game
Its third version was introduced in March 2024, while in between those implementing the guardrails and
October 2024, Anthropic announced an improved version those trying to break them is cat-and-mouse. The fact
3.5, together with a "groundbreaking” capability called that an adversarial prompt used successfully
Computer Use. According to the official release, this new yesterday might not work the day after has spun a
capability lets “developers direct Claude to use computers rise of automated attack solutions. These include
the way people do—by looking at a screen, moving a cursor, hacking-as-a-service schemes in which experienced
clicking buttons, and typing text.” Claude can perform adversaries provide a paid platform where users can
actions such as opening files, executing shell commands, access "jailbroken" GenAI services.
and automating workflows.
PRIVACY ATTACKS
Modern LLM solutions implement different kinds of
filters to prevent such situations. However, The rise of generative AI and foundation models has
HiddenLayer researchers proved that with a bit of introduced significant privacy and intellectual
obfuscation, it was possible to bypass Claude's property risks. Trained on massive datasets from
guardrails and run dangerous commands: all it took public and proprietary sources, these models often
was to present these commands as safe within a inadvertently memorize sensitive or copyrighted
security testing context. information, such as personally identifiable
information (PII), passwords, and proprietary content,
making them vulnerable to extraction. Their
As agentic AI becomes more widely integrated and more complexity further enables attacks like model
autonomous in its actions, the potential consequences of inversion, where adversaries infer sensitive training
such attacks also scale up. Unfortunately, there is no easy data attributes and membership inference to
fix for this vulnerability; in fact, Anthropic warns Claude's determine if specific data points were in the training
users to take serious precautions with Computer Use, set. These risks are particularly concerning in
limiting the utility of this new feature. sensitive domains like healthcare, finance, and
education, where private information may
unintentionally appear in model outputs.
28
AI THREAT LANDSCAPE 2025
2024
Research has highlighted several attacks that Released in November 2023, Microsoft Copilot
exemplify and deepen these risks: Studio is a platform for building, deploying, and
managing custom AI assistants (a.k.a. copilots).
Training Data Extraction Attacks allow The platform boosts security features, including
adversaries to reconstruct sensitive or robust authentication, data loss prevention, and
copyrighted content, such as private content guardrails for the created bots. However,
communications or proprietary datasets, these safety measures are not bulletproof. At
from model outputs. BlackHat US 2024, a former Microsoft researcher
presented 15 different ways adversaries could use
Memorization Attacks show that models Copilot bots to exfiltrate sensitive data. One of
can regurgitate rare or unique data points these techniques demonstrated a phishing attack
from their training set, including PII or containing an indirect prompt injection, allowing an
intellectual property when queried with attacker to access the victim's internal emails. The
tailored prompts. These attacks expose
adversary could then craft and send out rogue
vulnerabilities in foundational AI models
communication, posing as the victim.
and raise ethical and legal questions
about using such technologies.
Adversarial Prompting Attacks similarly Governments and regulatory bodies have started
exploit the models by manipulating them addressing these emerging risks, but significant gaps
into replicating copyrighted material or remain. By combining innovation, comprehensive
revealing sensitive information while regulation, and organizational oversight, generative AI's
sidestepping built-in protections. privacy and ethical challenges can be better managed,
fostering trust in these transformative technologies.
29
AI THREAT LANDSCAPE 2025
2024
Introduced by Amazon in April 2023 and made publicly The investigation highlighted the broader implications of
available later that year, Amazon Bedrock is a service such vulnerabilities in the age of AI-generated media. While
designed to help build and scale generative AI applications. watermarking is a promising method to verify content
It offers access to foundation models from leading AI authenticity, the study revealed its susceptibility to
companies via a single API. One family of models available advanced attacks. Model Watermarking Removal Attacks
through Bedrock is Amazon’s own Titan (now replaced by its erase evidence of origin and undermine copyright
next incarnation, Nova). Amongst others, Titan includes a enforcement, as well as trust. The ability to imperceptibly
set of models that generate images from text prompts alter images and create "authentic" forgeries raises
called Titan Image Generator. These models incorporate concerns about deepfakes and manipulating public
invisible watermarks into all generated images. Although perception. With the evolution of AI technology, the risks
embedding digital watermarks is definitely a step in the associated with its misuse also evolve, emphasizing the
right direction and can vastly help in fighting deepfakes, the importance of robust safeguards.
early implementation of the Titan Image Generator's
Although AWS addressed the issue promptly, the research
watermark system was found to be trivial to break.
highlighted that digital content authentication might prove
HiddenLayer's researchers demonstrated that by leveraging problematic.
specific image manipulation techniques, an attacker can The year 2024 saw numerous developments in attack
infer Titan's watermarks, replace them, or remove them techniques targeting both predictive and generative AI
entirely, undermining the system’s ability to ensure content models, from new model evasion methods to innovative
provenance. The researchers found they could extract and backdoors to creative prompt injection techniques. These
reapply watermarks to arbitrary images, making them are very likely to continue to develop and improve over the
appear as if they were AI-generated by Titan. Adversaries coming months and years.
could use this vulnerability to spread misinformation by
making fake images seem authentic or casting doubt on
Prediction from last year: “There will be a significant
real-world events. AWS has since patched the vulnerability, increase in adversarial attacks against AI”
ensuring its customers are no longer at risk.
In addition to copyrighted materials like images, logos, audio, video, and general multimedia, digital watermarks are often
embedded in proprietary data streams or real-time market analysis tools used by stock markets and traders. If those digital
watermarks are manipulated, it could alter how trading algorithms and investors interpret data. This could lead to incorrect
trades and market disruptions since fake or misleading data can cause sudden market shifts.
30
AI THREAT LANDSCAPE 2025
2024
The number and severity of software vulnerabilities Other platforms with serious vulnerabilities include
identified within the AI ecosystem reveal widespread MindsDB, which allowed arbitrary code execution via
issues across major ML platforms and tools. The insecure eval and pickle mechanisms, and Autolabel,
most prevalent concern in 2024 was deserialization susceptible to malicious CSV exploitation. Cleanlab faced
vulnerabilities, particularly involving pickle files, deserialization risks tied to the Datalabs module, while
which affected popular platforms like AWS Guardrails and NeMo suffered from unsafe evaluation and
Sagemaker, TensorFlow Probability, MLFlow, and arbitrary file write vulnerabilities, respectively. Bosch
MindsDB. These were accompanied by unsafe code AIShield's unsafe handling of HDF5 files enabled malicious
evaluation practices using unprotected eval() or lambda layers to execute arbitrary code.
exec() functions, as well as cross-site scripting (XSS)
and cross-site request forgery (CSRF) flaws. The Serialization security and input validation remain critical
impact of these vulnerabilities typically manifests in challenges in the AI ecosystem, with particular risks
three main ways: arbitrary code execution on victim surrounding model loading and data processing functions.
machines, data exfiltration, and web-based attacks There is a pressing need for robust security practices,
through UI components. Common attack vectors including safer deserialization methods, authentication
included malicious pickle files, crafted model files measures, and sandboxing mechanisms, to safeguard AI
(especially in HDF5 format), and harmful input data tools against increasingly sophisticated attacks.
through CSV or XML files.
In February 2024, HiddenLayer researchers Honeypots are decoy systems designed to attract
uncovered six zero-day vulnerabilities in a popular attackers and provide valuable insights into their
MLOps platform, ClearML. Encompassing path tactics in a controlled environment. Our team
traversal, improper authentication, insecure configured honeypot systems to observe potential
storage of credentials, Cross-Site Request Forgery, adversarial behavior after identifying the
Cross-Site Scripting, and arbitrary execution aforementioned vulnerabilities within MLOps
through unsafe deserialization, these vulnerabilities platforms such as ClearML and MLflow.
collectively create a full attack chain for
public-facing servers. A few months later, ten
deserialization flaws were disclosed in MLFlow, a
31
AI THREAT LANDSCAPE 2025
2024
32
AI THREAT LANDSCAPE 2025
2024
Package Confusion
33
AI THREAT LANDSCAPE 2025
34
AI THREAT LANDSCAPE 2025
2024
35
AI THREAT LANDSCAPE 2025
2024
36
AI THREAT LANDSCAPE 2025
2024
37
AI THREAT LANDSCAPE 2025
2024
As it’s still an emerging attack vector, it's difficult to assess solutions, most of which, at the moment, don't even scan
the true scale of the problem. More sophisticated targeted model files, so whatever ends up there is usually shared by
attacks will leave little to no trace in public repositories. researchers or threat actors testing early / non-sensitive
Most files on VirusTotal are uploaded by anti-malware versions of their malware.
Pickle Injection
Upload Deployment
Steganography
Lateral
Movement
PAYLOAD
38
PART 3
Advancements in Security
for AI
AI Red Teaming Evolution
ADVERSARIAL TOOLING
The need to test AI systems against adversarial attacks has
The year 2024 was all about generative AI, so the
evolved throughout the past year. The White House
focus of adversarial tooling released this year was
Executive Order on Safe, Secure, and Trustworthy
understandably on GenAI pen-testing.
Development and Use of Artificial Intelligence in October of
2023 made efforts not only to define what AI red teaming is
Many open-source AI red teaming tools are available,
but also to urge organizations to go through the process of
such as PyRIT and Garak, as well as commercial
making sure their AI systems are resilient. Other best
options, such as HiddenLayer’s Automated Red
practice frameworks, such as the NIST AI Risk Management
Teaming utility. The function of such tools is to
Framework and the upcoming EU AI Act, also have similar
quickly and reliably test an AI system against known
wording around how organizations should red-team their AI
attacks by sending a list of static or mutated prompts
systems before putting them into production.
to the target model or even dynamically crafting
prompts to achieve an attacker-specified objective.
KEY STAT
39
AI THREAT LANDSCAPE 2025
2024
40
AI THREAT LANDSCAPE 2025
2024
In June 2024, MITRE's Center for Threat-Informed Defense In 2023, OWASP released the Top 10 Machine Learning risks.
launched a new collaborative initiative called the Secure AI These controls help developers and security teams identify
research project to expand the MITRE ATLAS database and attack vectors, model threats and implement prevention
help develop strategies to mitigate risks to AI systems. The measures. These risks, paired with frameworks like ATLAS,
project aims to facilitate the rapid exchange of information clarify threats to machine learning and provide actionable
about the evolving AI threat landscape by sharing guidance.
anonymized data from AI-related incidents. Its diverse
participants include industry leaders from the technology,
communications, finance, and healthcare sectors.
In late 2024, OWASP released an updated version
of the OWASP Top 10 for LLM Applications 2025.
This list covers items such as prompt injection,
WHAT’S NEW IN OWASP
output handling, and excessive agency. This new
The Open Worldwide Application Security Project version reflects the rapidly evolving landscape of
(OWASP) is a non-profit organization and online LLM and Generative AI applications by
community that provides free guidance and reorganizing some previous vulnerabilities and
resources, such as articles, documentation, and tools adding new ones. For example, the Model Denial of
in the field of application security. The OWASP Top 10 Service and the Model Theft threats were
lists comprise the most critical security risks faced by combined into the new Unbounded Consumption
various web technologies, such as access control threat, and the Vector and Embedding
and cryptographic failures. Weaknesses threat was added, showing growing
concern over the risks associated with Retrieval
Augmented Generation (RAG) systems. A mapping
showing the relationships between the 2023 and
2025 versions of the threats is shown below.
2025 OWASP Top 10 LLMs
LLM02: Sensitive Information Disclosure OWASP also released two additional documents for
practitioners. The LLM Applications Cybersecurity
LLM03: Supply Chain and Governance Checklist provides a list of items
to consider when deploying an AI application. The
LLM and Generative AI Security Solutions
LLM04: Date and Model Poisoning
Landscape is a searchable collection of traditional
and emerging security controls for managing AI
LLM05: Improper Output Handling
application risks.
LLM09: Misinformation
41
AI THREAT LANDSCAPE 2025
2024
Cryptographic signing is a cornerstone of digital The initiative of AIBOM (also called MLBOM) aims to
security, ensuring the integrity and authenticity of translate the ideas behind SBOM into the AI ecosystem,
communications, software, and documents in enabling organizations to better understand their AI
industries like finance, healthcare, and software inventory and provide traceability and auditability. AIBOM
development. However, despite the critical role of includes information about models, training procedures,
machine learning (ML), no standardized method data pipelines, and performance and helps to implement
exists to cryptographically verify the origins or and govern AI responsibly. At the forefront of the decision
integrity of ML models and artifacts, leaving them on the AIBOM standards are NIST, OWASP, CycloneDX, and
vulnerable to tampering and trust issues. SPDX.
42
AI THREAT LANDSCAPE 2025
2024
43
AI THREAT LANDSCAPE 2025
2024
44
PART 4
Predictions and
Recommendations
Predictions for 2025
It’s time to dust off the crystal ball once again! Over the past year, AI has truly been at the forefront of cyber security,
with increased scrutiny from attackers, defenders, developers, and academia. As various forms of generative AI drive
mass AI adoption, we find that the threats are not lagging far behind, with LLMs, RAGs, Agentic AI, integrations, and
plugins being a hot topic for researchers and miscreants alike.
Integrating agentic AI will blur the lines between adversarial As deepfake technologies become more accessible, audio,
AI and traditional cyberattacks, leading to a new wave of visual, and text-based digital content trust will face
targeted threats. Expect phishing and data leakage via near-total erosion. Expect to see advances in AI
agentic systems to be a hot topic. watermarking to help combat such attacks.
45
AI THREAT LANDSCAPE 2025
2024
Organizations will integrate adversarial machine learning As hardware vendors capitalize on AI with advances in
(ML) into standard red team exercises, testing for AI bespoke chipsets and tooling to power AI technology,
vulnerabilities proactively before deployment. expect to see attacks targeting AI-capable endpoints
intensify, including:
In the 2024 threat report, we made several recommendations for organizations to consider that were similar in
concept to existing security-related control practices but built specifically for AI, such as:
Identifying and cataloging AI systems and related assets. Strengthening models to withstand adversarial attacks and
verifying their integrity.
Risk Assessment and Threat Modeling
Secure Development Practices
Evaluating potential vulnerabilities and attack vectors
specific to AI. Embedding security throughout the AI development
lifecycle.
Data Security and Privacy
Continuous Monitoring and Incident Response
Ensuring robust protection for sensitive datasets.
Establishing proactive detection and response mechanisms
for AI-related threats.
46
AI THREAT LANDSCAPE 2025
2024
These practices remain foundational as organizations navigate the continuously unfolding AI threat landscape.
Building on these recommendations, 2024 marked a turning point in the AI landscape. The rapid AI 'electrification' of
industries saw nearly every IT vendor integrate or expand AI capabilities, while service providers across sectors—from HR to
law firms and accountants—widely adopted AI to enhance offerings and optimize operations. This made 2024 the year that
AI-related third—and fourth-party risk issues became acutely apparent.
During the Security for AI Council meeting at Black Hat this year, the subject of AI third-party risk arose. Everyone in the
council acknowledged it was generally a struggle, with at least one member noting that a "requirement to notify before AI is
used/embedded into a solution” clause was added in all vendor contracts. The council members who had already been
asking vendors about their use of AI said those vendors didn’t have good answers. They “don't really know,” which is not only
surprising but also a noted disappointment. The group acknowledged traditional security vendors were only slightly better
than others, but overall, most vendors cannot respond adequately to AI risk questions. The council then collaborated to
create a detailed set of AI 3rd party risk questions. We recommend you consider adding these key questions to your existing
vendor evaluation processes going forward.
? ?
Do you scan your models for malicious
Where did your model come from? code? How do you determine if the model
is poisoned?
?
What is your threat model for AI-related
attacks? Are your threat model and
?
mitigations mapped or aligned to the Do you validate the integrity of the data
MITRE Atlas? presented by your AI system and/or model?
Remember that the security landscape—and AI technology—is dynamic and rapidly changing. It's crucial to stay informed
about emerging threats and best practices. Regularly update and refine your AI-specific security program to address new
challenges and vulnerabilities.
And a note of caution. In many cases, responsible and ethical AI frameworks fall short of ensuring models are secure before
they go into production and after an AI system is in use. They focus on things such as biases, appropriate use, and privacy.
While these are also required, don’t confuse these practices for security.
47
AI THREAT LANDSCAPE 2025
2024
HiddenLayer
Resources
PRODUCTS AND SERVICES
HiddenLayer AISec Platform
is a GenAI Protection Suite that is purpose-built to ensure the integrity of
your AI models throughout the MLOps pipeline. The Platform provides
detection and response for GenAI and traditional AI models to detect prompt
injections, adversarial AI attacks, and digital supply chain vulnerabilities.
Learn More
Learn More
Learn More
Learn More
Learn More
49
AI THREAT LANDSCAPE 2025
2024
HiddenLayer
Resources
HIDDENLAYER RESEARCH
ShadowLogic
A novel method for creating backdoors in neural network models.
50
AI THREAT LANDSCAPE 2025
2024
About HiddenLayer
HiddenLayer
a Gartner-recognized Cool Vendor for AI Security, is the leading provider of
Security for AI. Its security platform helps enterprises safeguard the machine
learning models behind their most important products. HiddenLayer is the
only company to offer turnkey security for AI that does not add unnecessary
complexity to models and does not require access to raw data and
algorithms. Founded by a team with deep roots in security and ML,
HiddenLayer aims to protect enterprise’s AI solutions from inference, bypass,
extraction attacks, and model theft. The company is backed by a group of
strategic investors, including M12, Microsoft’s Venture Fund, Moore Strategic
Ventures, Booz Allen Ventures, IBM Ventures, and Capital One Ventures.
REQUEST A DEMO:
https://hiddenlayer.com/book-a-demo/
AUTHORS/CONTRIBUTORS
A special thank you to the teams that made this report come to life:
51
Validating the integrity of AI models is crucial because it prevents the deployment of compromised models that could yield unreliable and insecure outputs. This can be achieved through comprehensive security audits, regular scanning for vulnerabilities, and implementing robust model verifiability frameworks. Organizations should also invest in continuous monitoring, detection and response systems such as HiddenLayer’s suite, ensuring immediate alerts and actions against any detected compromise, thus maintaining the reliability of their AI systems .
Serialization vulnerabilities in machine learning pipelines can be exploited through crafted malicious files such as pickle, HDF5, YAML, or XML. These vulnerabilities allow adversaries to execute arbitrary code or access sensitive data by embedding harmful commands within serialized data files. Consequences include unauthorized system access, data exfiltration, and execution of malicious payloads, compromising the integrity and security of ML models and systems. This threat extends to vulnerabilities in platforms like ClearlML and MLFlow, further underscoring the need for rigorous security practices .
The primary objectives of adversarial attacks against machine learning systems are model deception, model corruption, and model and data exfiltration. Model deception involves manipulating inputs to exploit model vulnerabilities, leading to incorrect predictions. Model corruption is achieved by influencing the training process, potentially through data poisoning or backdoor attacks, compromising the model's behavior while retaining outward legitimacy. Model and data exfiltration involve theft of the model's functionality or sensitive data, posing risks to intellectual property and data privacy .
Popular MLOps platforms like ClearML and MLFlow have been found vulnerable to multiple security issues, such as path traversal, improper authentication, insecure credential storage, Cross-Site Request Forgery, Cross-Site Scripting, and unsafe deserialization. These vulnerabilities create a full attack chain enabling adversaries to execute arbitrary code through malicious files and infiltrate systems. They impact AI systems by making them susceptible to unauthorized access and manipulation of data and models, endangering both data integrity and security .
AI red teaming is crucial for testing AI systems against adversarial attacks to ensure their resilience. Over the past year, its significance has been underscored by regulatory efforts such as the White House Executive Order on AI, which alongside frameworks like the NIST AI Risk Management and the EU AI Act, encourages organizations to thoroughly evaluate AI vulnerabilities before deployment. AI red teaming's evolution has been driven by increasing adversarial threats and the need for rigorous validation processes, thus becoming a vital strategy for securing AI systems .
To address the risks associated with model backdoors, the AI community should invest in creating comprehensive defenses, detection methods, and verification techniques. This involves implementing new strategies for identifying backdoors, even within graph-based architectures, and developing robust security programs that encompass vulnerabilities across AI use cases. Furthermore, establishing industry-wide best practices and frameworks aimed at enhancing model verification and assurance of reliability is essential. Agile modifications in response to emerging threats are also necessary, ensuring AI systems remain trustworthy and secure .
Model theft, also known as model extraction, occurs when an adversary replicates a machine-learning model without authorization by querying the target model and analyzing its outputs. This allows the adversary to reverse-engineer the model's functionality and potentially steal sensitive training data or intellectual property. The repercussions for organizations include loss of competitive advantage, compromised proprietary knowledge, and potential exposure of private data, negatively impacting the organization's market position and trustworthiness .
Model backdoors pose risks to AI systems by embedding vulnerabilities within the model's architecture that activate when specific trigger inputs are received. These backdoors do not require traditional code execution exploits and can be format-agnostic, making them hard to detect and capable of affecting various model architectures and domains. They can undermine the reliability of AI systems by making it impossible to trust their outputs if a backdoor is present, potentially affecting critical infrastructure and decision-making processes .
Adversarial attack advancements challenge AI security by being increasingly sophisticated and domain-adaptive. These attacks now exploit semantic features and natural variations rather than minor perturbations, allowing them to maintain natural appearances while triggering misclassifications across diverse environments. This evolution demonstrates the attacks' capability to bypass defenses in systems like diffusion models, malware detection, and automotive systems. The domain-spanning efficacy of these attacks necessitates comprehensive cross-domain defense strategies, especially as AI is increasingly deployed in security-sensitive applications .
Supply chain security is fundamental in AI because it safeguards against the introduction of malicious components or models within the AI development and deployment pipeline. Trends in this area include increasing attacks on ML artifacts, growing interest from cybercriminals, and the development of more sophisticated supply chain attack vectors. These trends necessitate robust security strategies that address vulnerabilities in the AI supply chain, ensuring that all components are secure from development through deployment .