0% found this document useful (0 votes)

26 views7 pages

Confused Pilot Attack and Mitigation

Uploaded by

erica jayasundera

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

26 views7 pages

Confused Pilot Attack and Mitigation

Uploaded by

erica jayasundera

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 7

ConfusedPilot

A Sneak Attack on AI Systems

OCTOBER 16, 2024

ERICA JAYASUNDERA
MINDVIEW AI

Classification | Confidential-External
Introduction
The digital landscape is rapidly evolving, with Artificial Intelligence (AI) playing an increasingly
prominent role. However, this growing reliance on AI introduces new vulnerabilities, as highlighted by
the recent discovery of the "ConfusedPilot" attack. Researchers at the University of Texas at Austin's
Spark Lab, led by Professor Mohit Tiwari, identified this novel cyberattack method targeting Retrieval-
Augmented Generation (RAG) based AI systems.

What are RAG-based AI systems?

RAG systems combine two techniques: retrieval and generation. They retrieve relevant information
from a vast data pool and then utilize that information to generate human-like text, code, or other
outputs. This makes them highly versatile, with applications ranging from Microsoft 365 Copilot's code
completion to chatbots and automated content creation tools.

How does ConfusedPilot work?

The attack exploits the way RAG systems reference data. Here's the breakdown:

1. Planting the Seed: The attacker introduces seemingly innocuous documents containing
carefully crafted strings into the AI system's data pool. This can be achieved through various
means, such as uploading documents to a shared workspace or exploiting vulnerabilities in data
ingestion processes.
2. Triggering the Response: When a user interacts with the AI system by posing a query, the
system retrieves relevant data to formulate its response. If the crafted strings are present in the
retrieved documents, they can manipulate the AI's output.

Classification | Confidential-External
3. Misinformation and Flawed Decisions: The attacker's strings can trick the AI into
generating misleading or incorrect responses. This could lead to critical consequences,
such as:
o Financial Losses: If an AI system used for financial analysis is manipulated, it could
provide inaccurate recommendations leading to bad investments or fraudulent
transactions.
o Operational Disruptions: A compromised AI system assisting with logistics or supply
chain management could disrupt entire operations.
o Reputational Damage: Fabricated information generated by a compromised AI used
for customer service could damage an organization's reputation.

The Attack Flow

An adversary attempting ConfusedPilot attack would likely follow these steps:

1. Data Environment Poisoning: An attacker introduces an innocuous document that contains

specifically crafted strings into the target’s environment. This could be achieved by any identity
with access to save documents or data to an environment indexed by the AI copilot.
2. Document used in Query Response: When a user makes a relevant query, the RAG system
retrieves the document containing these strings.
3. AI Copilot interprets strings as user Instructions: The document contains strings that could act
as instructions to the AI system, including:
1. Content Suppression: The malicious instructions cause the AI to disregard other
relevant, legitimate content.
2. Misinformation Generation: The AI generates a response using only the corrupted
information.

Classification | Confidential-External
3. False Attribution: The response may be falsely attributed to legitimate sources,
increasing its perceived credibility.
4. AI Copilot retains instructions: Even if the malicious document is later removed, the corrupted
information may persist in the system’s responses for a period of time.

A few illustrative examples:

Enterprise knowledge management systems:

If an attacker introduces a malicious document into the company's knowledge base copilot (perhaps
through social engineering or deliberate sabotage), they could manipulate AI-generated responses
across the organisation to spread misinformation. This could potentially influence critical business
decisions.

AI-assisted decision support systems:

In environments where AI systems are used to analyse data and provide recommendations for strategic
decisions, an attacker could inject false information that persists even after the original malicious
content is removed. This could lead to a series of poor decisions over time due to reliance on AI, with
the source of the problem remaining elusive without thorough forensic investigation.

Customer-facing AI services:

For organisations providing AI-powered services to customers, Confused Pilot becomes even more
dangerous. An attacker could potentially inject malicious data that affects the AI's responses to multiple
customers, leading to widespread misinformation, loss of trust, and potential legal liabilities.

End Users Relying on AI-generated Content:

Whether it's employees or executives, any end user using AI assistants for daily tasks or synthesising
AI-generated insights could make critically flawed decisions and unknowingly spread misinformation
throughout the organisation.

Why is ConfusedPilot concerning?

Several factors elevate the concern regarding ConfusedPilot:

• Wide Attack Surface: RAG-based AI systems are increasingly prevalent, making a large
number of organizations potentially vulnerable.
• Low Barrier to Entry: Launching a ConfusedPilot attack requires minimal technical expertise
compared to other cyberattacks.
• Persistence: Even after removing the malicious seed document, the crafted strings might linger
in cached data, making the attack persistent.
• Evasion Tactics: The attack can potentially bypass existing AI security measures designed to
detect anomalies in data or generated responses.

Classification | Confidential-External
Defending Against ConfusedPilot:

While ConfusedPilot presents a challenge, researchers and security professionals are actively
developing mitigation strategies. Here are some potential solutions:

• Data Governance: Implementing stricter controls on data entry and access can minimize the
chances of malicious content entering the system.
• Data Provenance: Tracking the source and history of data used by the AI system can help
identify suspicious or manipulated information.
• Adversarial Training: Training the AI system with examples of manipulated data can help it
recognize and resist manipulated responses.
• Continuous Monitoring: Regularly monitoring AI outputs for inconsistencies and unexpected
trends can flag potential attacks.
• User Awareness: Educating users about the potential for AI manipulation can help them
critically evaluate AI-generated responses.

The Evolving Landscape of AI Security

ConfusedPilot serves as a wake-up call for the AI development and security communities. While AI
offers immense potential, securing these systems is crucial to ensure their reliability and prevent them
from becoming a liability. Ongoing research into attack detection, data integrity, and robust AI
architectures will be essential in building a future where AI can be trusted.

Further Considerations:

This analysis provides a foundational understanding of ConfusedPilot. Here are some additional points
for exploration:

• The ethical implications of AI manipulation: ConfusedPilot highlights the potential for

malicious actors to misuse AI for disinformation campaigns or social engineering attacks.
• Regulations and standards for AI security: There's a growing need for regulations and
standards that ensure the responsible development and deployment of AI systems, with
security being a core consideration.
• The role of user trust: As reliance on AI grows, building user trust is essential. This can be
achieved through transparency about how AI systems work and demonstrably robust security
measures.

The Importance of a Layered Security Approach for RAG Systems

Retrieval-Augmented Generation (RAG) systems, which combine information retrieval and text
generation, have become increasingly prevalent in various industries. However, their growing reliance
on external data sources makes them vulnerable to cyberattacks. A layered security approach is crucial
to protect RAG systems from these threats and ensure their integrity and reliability.

Understanding the Risks

RAG systems are susceptible to several security risks:

Classification | Confidential-External
• Data Poisoning: Attackers can introduce malicious data into the system's knowledge base,
influencing the AI's responses and potentially leading to misinformation or harmful outputs.
• Model Extraction: Adversaries can extract the underlying model parameters, compromising
the system's intellectual property and potentially creating malicious copies.
• Supply Chain Attacks: Vulnerabilities in the underlying software or hardware components
used by RAG systems can be exploited to gain unauthorized access or control.

The Benefits of a Layered Security Approach

A layered security approach involves implementing multiple security controls at different levels of the
system to create a robust defense. This approach offers several benefits:

• Enhanced Resilience: By combining various security measures, a layered approach makes it

more difficult for attackers to breach the system's defenses.
• Risk Mitigation: Each layer of security can address specific vulnerabilities, reducing the overall
risk of a successful attack.
• Compliance Adherence: Many industries have strict data privacy and security regulations. A
layered security approach can help organizations comply with these requirements.
• Proactive Defense: A layered approach allows for continuous monitoring and adaptation to
emerging threats, ensuring that the system remains protected.

Key Components of a Layered Security Approach

A comprehensive layered security approach for RAG systems should include the following
components:

1. Data Security:
o Input Validation: Implement input validation to filter out malicious or unexpected
data.
o Data Encryption: Encrypt sensitive data both at rest and in transit to protect it from
unauthorized access.
o Data Masking: Mask sensitive data to prevent unauthorized disclosure.
2. Model Security:
o Model Obfuscation: Use techniques like quantization, pruning, or knowledge
distillation to make the model more difficult to reverse engineer.
o Model Monitoring: Continuously monitor the model's behavior for anomalies that
may indicate a compromise.
3. Infrastructure Security:
o Network Security: Implement firewalls, intrusion detection systems, and other
network security measures to protect the system from external threats.
o Access Controls: Restrict access to the system to authorized users only.
o Patch Management: Keep all software and hardware components up-to-date with
the latest security patches.
4. AI Security:
o Adversarial Training: Train the model to be resilient to adversarial attacks, which
aim to manipulate the system's outputs.

Classification | Confidential-External
o Explainability: Increase the transparency of the model's decision-making process to
identify and mitigate potential biases or vulnerabilities.
5. Incident Response:
o Incident Response Plan: Develop a comprehensive incident response plan to
address security breaches effectively.
o Regular Testing: Conduct regular security testing and penetration testing to identify
vulnerabilities and improve the system's resilience.

By implementing a layered security approach, organizations can significantly enhance the protection of
their RAG systems and mitigate the risks associated with their use.

Reference :

1. https://securityboulevard.com/2024/10/confusedpilot-ut-austin-symmetry-systems-uncover-
novel-attack-on-rag-based-ai-systems/
2. https://www.infosecurity-magazine.com/news/confusedpilot-attack-targets-ai/

Classification | Confidential-External

CySA 002
No ratings yet
CySA 002
760 pages
CertMaster Network+ (N10-009) Module 9 - Presentation Slides
No ratings yet
CertMaster Network+ (N10-009) Module 9 - Presentation Slides
44 pages
Engaging With Artificial Intelligence (AI)
100% (1)
Engaging With Artificial Intelligence (AI)
15 pages
GEN AI Security Genai Security Best Practices Cheat Sheet
No ratings yet
GEN AI Security Genai Security Best Practices Cheat Sheet
6 pages
Mapping Course Content To CompTIA Security+
No ratings yet
Mapping Course Content To CompTIA Security+
33 pages
Abstract NT
No ratings yet
Abstract NT
33 pages
Mapping Course Content To CompTIA Security+ (Exam SY0-601)
No ratings yet
Mapping Course Content To CompTIA Security+ (Exam SY0-601)
33 pages
Whitepaper Sample
No ratings yet
Whitepaper Sample
13 pages
Black Hat USA 2011 - Weapons of Targeted Attack: Modern Document Exploit Techniques (Paper)
No ratings yet
Black Hat USA 2011 - Weapons of Targeted Attack: Modern Document Exploit Techniques (Paper)
18 pages
Analysis of Remote Access Trojans and Network Security: Topics
No ratings yet
Analysis of Remote Access Trojans and Network Security: Topics
15 pages
Presentation 6
No ratings yet
Presentation 6
20 pages
Report AI Cyber
No ratings yet
Report AI Cyber
33 pages
Ch02-Secure Information Systems - v1
No ratings yet
Ch02-Secure Information Systems - v1
56 pages
SANS - Draft - Critical AI Security Controls V1.1
No ratings yet
SANS - Draft - Critical AI Security Controls V1.1
15 pages
CompTIA Security + Chapter 3
No ratings yet
CompTIA Security + Chapter 3
35 pages
DATA4300 Week 02 Workshop
No ratings yet
DATA4300 Week 02 Workshop
40 pages
Barcode Security Threat White Paper FINAL - Compressed
No ratings yet
Barcode Security Threat White Paper FINAL - Compressed
12 pages
RAG Attack Execution and Prevention
No ratings yet
RAG Attack Execution and Prevention
7 pages
Networks Security: Safa Khaled Abdallah
No ratings yet
Networks Security: Safa Khaled Abdallah
13 pages
CAMS Cybersecurity AI Systems
No ratings yet
CAMS Cybersecurity AI Systems
19 pages
Threats and Attacks
No ratings yet
Threats and Attacks
30 pages
AI in Cyber Security
100% (3)
AI in Cyber Security
14 pages
Google Cloud - Perspectives On Security From The Board
No ratings yet
Google Cloud - Perspectives On Security From The Board
13 pages
Digital Security in Academic Libraries
No ratings yet
Digital Security in Academic Libraries
58 pages
Generative AI - Capabilities and Risks
No ratings yet
Generative AI - Capabilities and Risks
8 pages
Upper Hand With Ai v10 - 1003710
No ratings yet
Upper Hand With Ai v10 - 1003710
36 pages
Security Word
No ratings yet
Security Word
5 pages
2.1 Application Security
No ratings yet
2.1 Application Security
35 pages
Cybersecurity 2
No ratings yet
Cybersecurity 2
29 pages
Attack Types
No ratings yet
Attack Types
30 pages
AI
No ratings yet
AI
10 pages
CompTIA Security+ (SY0-601) Learn
100% (3)
CompTIA Security+ (SY0-601) Learn
81 pages
Can Artificial Intelligence Power Future Malware?: ESET White Paper
No ratings yet
Can Artificial Intelligence Power Future Malware?: ESET White Paper
16 pages
AI Product Security A Primer For Developers
No ratings yet
AI Product Security A Primer For Developers
10 pages
Basics Threat Crypto
No ratings yet
Basics Threat Crypto
121 pages
08 Jan Palo Alto v2 82591
No ratings yet
08 Jan Palo Alto v2 82591
29 pages
ATARC AIDA Guidebook - FINAL K
No ratings yet
ATARC AIDA Guidebook - FINAL K
3 pages
White Blue Simple Modern Enhancing Sales Strategy Presentation
No ratings yet
White Blue Simple Modern Enhancing Sales Strategy Presentation
24 pages
Overview of University of Tennessee at Chattanooga
No ratings yet
Overview of University of Tennessee at Chattanooga
41 pages
3.cyber Threat Landscape-Andy Choy
No ratings yet
3.cyber Threat Landscape-Andy Choy
19 pages
AI Security
No ratings yet
AI Security
5 pages
Bluetooth Counter Measures
No ratings yet
Bluetooth Counter Measures
6 pages
CYBER SECURITY For SHIPS - Awareness of Cyber Security For Ships
No ratings yet
CYBER SECURITY For SHIPS - Awareness of Cyber Security For Ships
42 pages
Computer Security Ch2
No ratings yet
Computer Security Ch2
11 pages
Bluetooth Counter Measures
No ratings yet
Bluetooth Counter Measures
3 pages
05-Securing Network Operations, Databases, and Applications
No ratings yet
05-Securing Network Operations, Databases, and Applications
15 pages
Safety of Data Security2
No ratings yet
Safety of Data Security2
6 pages
Detection and Prevention of Cyber Defense Attacks Using Machine Learning Algorithms
No ratings yet
Detection and Prevention of Cyber Defense Attacks Using Machine Learning Algorithms
10 pages
IBM - Cybersecurity in The Era of Generative AI
No ratings yet
IBM - Cybersecurity in The Era of Generative AI
20 pages
White and Blue Illustrated Technology Cybersecurity Presentation - 20241222 - 151948 - 0000
No ratings yet
White and Blue Illustrated Technology Cybersecurity Presentation - 20241222 - 151948 - 0000
10 pages
How Artificial Intelligence Transforms Cybersecurity
No ratings yet
How Artificial Intelligence Transforms Cybersecurity
3 pages
Data Encryption and Security: Q: Find The Current Security Challenges Faced by Computer Security? Ans
No ratings yet
Data Encryption and Security: Q: Find The Current Security Challenges Faced by Computer Security? Ans
2 pages
Cyber Crime Investigation Manual
No ratings yet
Cyber Crime Investigation Manual
195 pages
Endpoint Security
No ratings yet
Endpoint Security
22 pages
CCP Notes Module-3
No ratings yet
CCP Notes Module-3
27 pages
Assignment Networking - IT Aakriti
No ratings yet
Assignment Networking - IT Aakriti
40 pages
Erica Jayasundera - Conerll Design Thinking
No ratings yet
Erica Jayasundera - Conerll Design Thinking
18 pages
Erica Jayasundera MSC in AI
No ratings yet
Erica Jayasundera MSC in AI
7 pages
Most Common Attack Vectors
No ratings yet
Most Common Attack Vectors
15 pages
ATARC AIDA Guidebook - FINAL U
No ratings yet
ATARC AIDA Guidebook - FINAL U
2 pages
Information Security Final
No ratings yet
Information Security Final
13 pages
PWC Balancing Power Protection Ai Cybersecurity
No ratings yet
PWC Balancing Power Protection Ai Cybersecurity
7 pages
Cisa
100% (3)
Cisa
264 pages
Artificial Intelligence in Cyber Security: A Study
No ratings yet
Artificial Intelligence in Cyber Security: A Study
4 pages
BMP 4005 Adriana Selaru
No ratings yet
BMP 4005 Adriana Selaru
13 pages
Sop
No ratings yet
Sop
4 pages
L3 (Mac)
No ratings yet
L3 (Mac)
28 pages
Lecture1 Intro
No ratings yet
Lecture1 Intro
31 pages
Presentation 1
No ratings yet
Presentation 1
9 pages
Beautiful - Ai - Cyber Function Target Operating Model For Government Organizations
No ratings yet
Beautiful - Ai - Cyber Function Target Operating Model For Government Organizations
13 pages
ECI LightSoft-EMS-APT-NPT Security Targetv1.8
No ratings yet
ECI LightSoft-EMS-APT-NPT Security Targetv1.8
36 pages
Rasta
No ratings yet
Rasta
5 pages
UCS422 Cyber Security Awareness Campaign
No ratings yet
UCS422 Cyber Security Awareness Campaign
13 pages
Lecture5 Dynamics
No ratings yet
Lecture5 Dynamics
39 pages
3503 Lab Survey
No ratings yet
3503 Lab Survey
30 pages
Cryptography and Compliance Pitfalls - Laggui, Jomarie S.
No ratings yet
Cryptography and Compliance Pitfalls - Laggui, Jomarie S.
1 page
NS Model A1-1,2,3,4,5
No ratings yet
NS Model A1-1,2,3,4,5
2 pages
Iibf
No ratings yet
Iibf
1 page
01 Computers, Society and Law
No ratings yet
01 Computers, Society and Law
13 pages
Making Cloud More Secure Using Blockchain Structured Analysis
No ratings yet
Making Cloud More Secure Using Blockchain Structured Analysis
3 pages
A Natural Language Interface To Relational Databases Using An Online Analytic Processing Hypercube
No ratings yet
A Natural Language Interface To Relational Databases Using An Online Analytic Processing Hypercube
18 pages
Long Term Project Based Engagements
No ratings yet
Long Term Project Based Engagements
4 pages
Background of The Study: Cyber Threat Landscape in Education
No ratings yet
Background of The Study: Cyber Threat Landscape in Education
4 pages
Site User Manual: Revision 0
No ratings yet
Site User Manual: Revision 0
15 pages
Answers Visual Symbols and The Blind Task 02
No ratings yet
Answers Visual Symbols and The Blind Task 02
6 pages
Absenteeism in Nursing
No ratings yet
Absenteeism in Nursing
5 pages
IRDA Information and Cyber Security Guidelines and Penalties 2023-2024
No ratings yet
IRDA Information and Cyber Security Guidelines and Penalties 2023-2024
7 pages
Ankit Fadia Workshop - Colleges
No ratings yet
Ankit Fadia Workshop - Colleges
6 pages
Cyber Security
No ratings yet
Cyber Security
2 pages
Forensic UAV
No ratings yet
Forensic UAV
29 pages
2016MIS013-Joomla CMS - Carnival
No ratings yet
2016MIS013-Joomla CMS - Carnival
4 pages
Script HTML Password FB
No ratings yet
Script HTML Password FB
1 page
Exam 020314
No ratings yet
Exam 020314
3 pages
Secure Coding Practices For: White Paper
No ratings yet
Secure Coding Practices For: White Paper
15 pages
Cyber Security Engineer
No ratings yet
Cyber Security Engineer
2 pages
IELTS Academic Reading Sample 1 - Population Viability Analysis
No ratings yet
IELTS Academic Reading Sample 1 - Population Viability Analysis
4 pages
Cyber Terms Glosssary
No ratings yet
Cyber Terms Glosssary
4 pages
4-90001 LANDI APOS A7 Security Policy
No ratings yet
4-90001 LANDI APOS A7 Security Policy
19 pages
Hatton National Bank
No ratings yet
Hatton National Bank
1 page
Director of Cyber Security Engineering
No ratings yet
Director of Cyber Security Engineering
1 page
Cyber Security Cvs
No ratings yet
Cyber Security Cvs
2 pages
OTP Authentication Finacle Integration Approach PDF
No ratings yet
OTP Authentication Finacle Integration Approach PDF
2 pages
Get Certified With EC Council
No ratings yet
Get Certified With EC Council
1 page
Cyber Security Engineering Manager Resume Examples & Samples 1
No ratings yet
Cyber Security Engineering Manager Resume Examples & Samples 1
3 pages
Cyber Crimes: Types of Cyber Crime
No ratings yet
Cyber Crimes: Types of Cyber Crime
2 pages
Use Case Diagram For User Login:: Username
No ratings yet
Use Case Diagram For User Login:: Username
6 pages
FortiGate 600D
No ratings yet
FortiGate 600D
6 pages
Free Proxy List
No ratings yet
Free Proxy List
2 pages