Robust Deepfake Detection Leveraging EfficientNet-B3 Backbone With Binary Classification Techniques
Abstract - The foundations of artificial intelligence, neural networks, and deep learning were laid in the 1940s and 1950s.
Applications built on human-like intelligence have since been developed all over the world. One of them is the ‘Deepfake’: the
use of digital tools and AI to create a realistic-looking video of a human. Deepfake is a technique for synthesizing human
video or images based on artificial intelligence. Deepfakes are created by combining existing data, which can be images, audio,
or videos, and superimposing it onto reference images, audio, or videos using deep learning techniques such as generative
adversarial networks (GANs). In this work, we present a deepfake detection approach that uses EfficientNet-B3 as the core
model, augmented with a custom classification head tailored for binary classification. The model is trained on a dataset of
both authentic and manipulated images, and Test-Time Augmentation (TTA) is applied at inference to enhance prediction
reliability. Training uses Binary Cross Entropy loss, with optimization handled by the Adam optimiser and a learning rate
scheduler. The model achieved a peak accuracy of 94.51%, demonstrating its ability to distinguish between real and fake
content effectively.
a)
A Deepfake initiative involving soccer icon David Beckham depicted him communicating in nine different languages in a public service announcement aimed at raising awareness about malaria. This innovative application of Deepfake technology enabled Beckham’s message to connect with a worldwide audience in their own languages, enhancing both engagement and effectiveness.

In the world of films, Deepfake technology has facilitated the recreation of performances without the need for extensive reshoots, conserving both time and resources. A prominent example of its application was in *Star Wars: The Rise of Skywalker*, where the late Carrie Fisher was digitally recreated to finish scenes she was unable to complete, paying tribute to her character in the narrative.

In *Furious 7*, filmmakers utilized Deepfake technology to finish Paul Walker's scenes after his tragic passing during the film's production. The crew brought in Walker's brothers as stand-ins, then used Deepfake and CGI methods to overlay Paul’s face onto theirs, achieving a convincing resemblance. This approach allowed for a respectful conclusion to his character’s arc, honouring his legacy and offering fans a heartfelt goodbye. The application of Deepfake in this context showcased its ability to pay tribute to actors posthumously while ensuring continuity in the storyline.

b) The Bad:
During the 2020 Delhi Assembly elections, political parties utilized Deepfake technology to modify speeches. The BJP crafted a video featuring party leader Manoj Tiwari, making it seem like he was addressing audiences in both Haryanvi and English (languages he did not originally speak) to attract a wider range of voters. This sparked worries about political deception and the genuineness of campaign content.

Numerous women in India, including public figures, have fallen prey to non-consensual explicit Deepfake videos. These fabricated videos, frequently circulated on social media, are used to bully and slander individuals, resulting in psychological distress and damage to their reputations.

Deepfake videos are frequently shared on Indian social media platforms to promote misinformation or communal narratives, particularly during periods of political strife. Such deceptive videos can provoke public anger, generate confusion, and incite violence or disorder.

The papers mainly examine the application of cutting-edge deep learning architectures such as CNNs (e.g., VGG19, MobileNetV2, InceptionV3, XceptionNet) and hybrid CNN frameworks (e.g., InceptionResnet v2 integrated with Xception). These models excel at evaluating high-dimensional visual features and temporal consistency in deepfake videos. They are effective in both individual frame assessments and overall evaluations of videos. While deep learning techniques show potential, they must continually adapt to keep pace with the growing complexity of deepfake content.

2.2 Detection using Machine Learning and Error-Level Analysis
Deepfake detection and classification using error-level analysis
This study utilizes conventional machine learning classifiers, including SVM and KNN, alongside image preprocessing methods such as Error-Level Analysis (ELA) to identify anomalies at the pixel level.

By applying ELA and utilizing deep feature extraction via CNNs like ResNet18, the detection of digital alterations in images is enhanced, resulting in satisfactory accuracy rates.

Nevertheless, these methods may face challenges when dealing with deepfake content that does not exhibit obvious visual irregularities.

2.3 Detection using Statistical and Computational Imaging Approaches
Deep Learning for Deepfakes Creation and Detection
This study explores statistical models like the Expectation-Maximization (EM) algorithm and second-order attention networks (SAN) to improve feature clarity and diminish noise in facial forgeries. These approaches prioritize enhancing the quality of authentic images, which assists in distinguishing between genuine and artificially generated visuals. Though beneficial for image verification purposes, these methods may struggle with extremely realistic deepfake images, requiring integration with more sophisticated deep learning models for effective detection.
2.4 Detection using Hybrid Approaches Combining Multiple Techniques
Theory:
The Gaussian Error Linear Unit (GELU) scales each input by the probability that a standard normal variable falls below it, so
activations in the dense layers vary smoothly with the input instead of being cut off at zero. This smooth transition is
valuable for tasks like deepfake detection, where distinguishing real from fake images requires sensitivity to subtle
differences in features. GELU’s smooth, differentiable curve helps the model learn complex patterns from the detailed
features extracted by EfficientNet-B3.
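To make the smoothness argument concrete, the following minimal sketch evaluates the standard exact GELU definition, GELU(x) = x * Phi(x) with Phi the standard normal CDF, next to ReLU for comparison. It uses only the Python standard library; the function names `gelu` and `relu` are illustrative and are not taken from the paper's implementation, which may well use a framework-provided or approximate GELU.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu(x: float) -> float:
    """ReLU for comparison: hard cut-off at zero."""
    return max(0.0, x)


if __name__ == "__main__":
    # Small negative inputs yield small negative GELU outputs,
    # whereas ReLU maps every negative input to exactly zero.
    for x in (-3.0, -1.0, -0.1, 0.0, 0.1, 1.0, 3.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  gelu={gelu(x):+.4f}")
```

Unlike ReLU, the GELU output for slightly negative inputs is small but non-zero, which is the smooth transition the paragraph above refers to.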
Figure 4. Binary Classification Flowchart
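The abstract states that the model is trained with Binary Cross Entropy loss and that Test-Time Augmentation (TTA) is used to stabilise predictions. The sketch below illustrates both ideas in plain Python under explicit assumptions: label 1 means "fake", images are toy nested lists, `toy_model` is a hypothetical stand-in for the real EfficientNet-B3 classifier, and the augmentation set (identity plus horizontal flip) is chosen for illustration because the paper does not list the augmentations it uses.

```python
import math
from typing import Callable, List, Sequence

Image = List[List[float]]  # toy grayscale image: rows of pixel values in [0, 1]


def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(probs: Sequence[float], labels: Sequence[float], eps: float = 1e-7) -> float:
    """Mean binary cross entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(probs)


def identity(img: Image) -> Image:
    return img


def hflip(img: Image) -> Image:
    """Horizontal flip: reverse every row."""
    return [row[::-1] for row in img]


def predict_with_tta(
    model: Callable[[Image], float],
    image: Image,
    augmentations: Sequence[Callable[[Image], Image]],
) -> float:
    """Average the model's probability over several augmented views."""
    probs = [model(aug(image)) for aug in augmentations]
    return sum(probs) / len(probs)


def toy_model(img: Image) -> float:
    """Hypothetical classifier that looks only at the top-left pixel."""
    return sigmoid(4.0 * (img[0][0] - 0.5))


if __name__ == "__main__":
    image = [[1.0, 0.0]]
    print("single view:", toy_model(image))
    print("TTA average:", predict_with_tta(toy_model, image, [identity, hflip]))
    print("BCE for p=0.9, label=1:", bce_loss([0.9], [1.0]))
```

The toy model is deliberately sensitive to a flip, so the averaged prediction is less extreme than the single-view one; with a real network the effect is usually a modest reduction in prediction variance rather than a large shift.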
Dataset: https://www.kaggle.com/datasets/peilwang/deepfake