Safelens: Multi-Modal Deepfake Detection


Department of Electronics & Communication
Visvesvaraya Technological University
Freq-Fusion Ideathon Idea Presentation
Theme: TrustAI
Team: Cryptex
Title: "SafeLens"
[Link] Kallumanti (4PA23IC036)
[Link] K M (4PA23IC047)
[Link] Mahaloof (4PA24IC402)
[Link] S M (4PA23IC048)
INTRODUCTION

• The digital landscape is being flooded with hyper-realistic synthetic media known as deepfakes: AI-generated videos, audio, and images.
• This technology poses an unprecedented threat to individual privacy, national security, financial systems, and
democratic processes.
• Current detection methods are often single-modality (analyzing only video or only audio) and struggle against rapidly evolving generative AI.
• There is a critical need for a robust, proactive, and explainable detection system.
• Our project, SafeLens, addresses this by developing a multi-modal framework that analyzes visual, auditory, and temporal inconsistencies.
• It is designed to be integrated as an API into social media platforms, video conferencing tools, and news
verification services.
• The ultimate goal is to restore and ensure digital trust by providing a shield against malicious AI-generated
content.
PROBLEM STATEMENT
• What problem are we trying to solve?
• We are solving the problem of identifying and flagging AI-generated deepfake media used to spread misinformation, commit fraud, and create non-consensual content.
• AI-generated images and videos are now very easy to make, which leads to widespread misuse.
• Why is it important? (relevance, real-world impact)
⚬ Relevance: The accessibility of generative AI models has made creating convincing deepfakes easier than ever.
⚬ Real-World Impact:
■ Political: Can be used to manipulate public opinion and elections.
■ Financial: Enables sophisticated identity theft and fraud through fake video KYC.
■ Social: Harms individuals through reputational damage and cyberbullying via synthetic imagery.
■ Trust: Erodes the public's trust in digital media and institutions.
METHODOLOGY
Our Solution
• SafeLens is a detection system that uses separate AI models to analyze a video's visual features, its audio track, and how well they sync together, combining all three results to accurately identify deepfakes.
• How it addresses the problem
• It addresses the limitations of single-source detectors by using a multi-layered, fusion-based
approach. This makes it significantly harder to fool, as a forgery would need to be perfect across
visual, audio, and temporal domains simultaneously.
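The fusion idea above can be sketched as a simple weighted late fusion of the three detector scores. This is a minimal illustrative sketch, not the project's actual model: the function name, weights, and the 0.5 decision threshold are all assumptions chosen for demonstration.

```python
# Hypothetical late-fusion sketch: combine visual, audio, and sync
# detector outputs into one verdict. Weights and threshold are
# illustrative assumptions, not tuned values from SafeLens.

def fuse_scores(visual: float, audio: float, sync: float,
                weights=(0.4, 0.3, 0.3), threshold=0.5) -> dict:
    """Each input is a probability (0..1) that the clip is fake."""
    fused = (weights[0] * visual
             + weights[1] * audio
             + weights[2] * sync)
    return {
        "score": round(fused, 3),
        "verdict": "Deepfake Detected" if fused >= threshold else "Authentic",
    }

print(fuse_scores(0.9, 0.7, 0.8))  # fused score 0.81 -> "Deepfake Detected"
```

Because all three scores feed one decision, a forgery must fool every detector at once, which is the robustness argument made above.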

• Enhanced Accuracy: Multi-modal analysis drastically reduces false positives and negatives.
• Proactive Defense: Designed to evolve with generative AI, offering future-proof protection.
• Explainable Results: Provides confidence scores and specific reasons for detection.
• Scalable Integration: Easy API integration for various digital platforms.
Working

How SafeLens Works


• Break It Down: The system takes a video and separates it into image frames and an audio track.
• Check the Video: AI analyzes the frames for visual artifacts real people don't have, such as irregular blinking or unnatural skin texture.
• Check the Audio: AI analyzes the sound for robotic or unnatural voice patterns.
• Check the Sync: AI checks whether the person's lip movements match the words being spoken.
• Combine the Evidence: The results from all three checks are fused into a single, high-accuracy decision.
• Get the Result: The system returns a final verdict: "Authentic" or "Deepfake Detected."
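The six steps above can be sketched end to end as follows. The three check functions are stand-in stubs (assumptions for illustration); a real system would run trained visual, audio, and lip-sync models in their place.

```python
# Minimal sketch of the six-step pipeline described above.
# All model calls are stubbed with fixed scores for illustration.

def check_video(frames):           # stub for a visual-artifact model
    return 0.85                    # probability the frames are synthetic

def check_audio(waveform):         # stub for a voice-pattern model
    return 0.60

def check_sync(frames, waveform):  # stub for a lip-sync alignment model
    return 0.75

def safelens(video):
    # 1. Break it down: separate frames and sound (pre-extracted here).
    frames, waveform = video["frames"], video["audio"]
    # 2-4. Run the three independent checks.
    scores = [check_video(frames), check_audio(waveform),
              check_sync(frames, waveform)]
    # 5. Combine the evidence (simple average for this sketch).
    fused = sum(scores) / len(scores)
    # 6. Return the final verdict.
    return "Deepfake Detected" if fused >= 0.5 else "Authentic"

print(safelens({"frames": [], "audio": []}))  # -> Deepfake Detected
```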
REFERENCES

• D. Güera and E. J. Delp, "Deepfake Video Detection Using Recurrent Neural Networks." Link: [Link]
• S. Agarwal, H. Farid, et al., "Protecting World Leaders Against Deep Fakes." Link: [Link]
• A. Rössler, D. Cozzolino, et al., "FaceForensics++: Learning to Detect Manipulated Facial Images." Link: [Link] (project page: [Link])
• H. Khalid, S. Tariq, et al., "FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset." Link: [Link] (provides audio-visual training data for the multi-modal approach)
• Y. M. Said, "Deepfake Detection using Deep Learning Methods: A Systematic Review." Link: [Link]
Thank You
