Colloquium
COC3800
Vision Transformers:
Principles, Challenges, and
Emerging Trends
Group No.: 2
Group Details
1. Mohammad Faiz Umar (22COB107 / GM1536)
2. Eizad Hamdan (22COB154 / GL5628)
3. Aamina Siddiqui (22COB186 / GL8004)
4. Ayra Riaz Khan (22COB675 / GL4004)
Vision Transformers:
Principles, Challenges, and Emerging Trends
Abstract
Vision Transformers (ViTs) have emerged as a groundbreaking technology in computer vision, revolutionizing how complex vision tasks are addressed by leveraging the self-attention mechanism. Tasks traditionally dominated by convolutional neural networks (CNNs) have benefited significantly from ViTs' ability to divide images into patches and process them as a sequence of tokens, inheriting the scalability and adaptability of the transformers used in natural language processing. We will explore the foundational principles that underpin ViTs, including their unique architecture, the role of the self-attention mechanism, and the use of positional embeddings to capture spatial relationships in images.
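To make the patch-and-position pipeline above concrete, here is a minimal PyTorch sketch. The hyperparameters (224x224 inputs, 16x16 patches, 768-dimensional embeddings, 12 attention heads) follow a common ViT-Base-style configuration but are illustrative assumptions rather than anything prescribed by this abstract:

    import torch
    import torch.nn as nn

    class PatchEmbedding(nn.Module):
        """Split an image into fixed-size patches and linearly embed each one."""
        def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
            super().__init__()
            num_patches = (img_size // patch_size) ** 2
            # A strided convolution flattens each patch and applies a shared
            # linear projection in a single step.
            self.proj = nn.Conv2d(in_chans, embed_dim,
                                  kernel_size=patch_size, stride=patch_size)
            # Learnable positional embeddings restore the spatial order that
            # the flattened patch sequence would otherwise lose.
            self.pos_embed = nn.Parameter(torch.zeros(1, num_patches, embed_dim))

        def forward(self, x):                     # x: (B, 3, 224, 224)
            x = self.proj(x)                      # (B, 768, 14, 14)
            x = x.flatten(2).transpose(1, 2)      # (B, 196, 768): one token per patch
            return x + self.pos_embed             # inject positional information

    tokens = PatchEmbedding()(torch.randn(1, 3, 224, 224))
    # Every patch token attends to every other patch token.
    attn = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)
    out, _ = attn(tokens, tokens, tokens)
    print(out.shape)                              # torch.Size([1, 196, 768])

A full ViT would additionally prepend a learnable classification token and stack many such attention blocks with MLPs and normalization; both are omitted here for brevity.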
We will review recent advances in ViT models, focusing on innovations in their design and training methodologies, including self-supervised learning, hybrid architectures, and hierarchical approaches. Furthermore, we will examine their applications across diverse domains such as image classification, object detection, and semantic segmentation, as well as their performance on widely used benchmark datasets. Despite their success, ViTs face challenges, including large training-data requirements and the computational cost of self-attention, which grows quadratically with the number of tokens. We will discuss how these limitations are addressed through techniques such as locality-enhancing mechanisms, efficient token processing, and integration with CNN-inspired features.
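As one concrete example of efficient token processing, a family of methods prunes the least informative patch tokens partway through the network so that later attention layers operate on shorter sequences. The sketch below is a simplified, hypothetical illustration of this idea; the function name and the use of attention scores from a classification token as the importance measure are our assumptions, not a specific published algorithm:

    import torch

    def prune_tokens(tokens, cls_attn, keep_ratio=0.5):
        """Keep the patch tokens that the classification token attends to most.

        tokens:   (B, N, D) patch tokens
        cls_attn: (B, N) attention weight from the classification token to each patch
        """
        num_keep = max(1, int(tokens.shape[1] * keep_ratio))
        # Indices of the highest-scoring tokens for each image in the batch.
        idx = cls_attn.topk(num_keep, dim=1).indices               # (B, num_keep)
        idx = idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1])   # (B, num_keep, D)
        return tokens.gather(1, idx)

    x = torch.randn(2, 196, 768)           # 196 patch tokens per image
    scores = torch.rand(2, 196)            # toy importance scores
    print(prune_tokens(x, scores).shape)   # torch.Size([2, 98, 768])

Because the cost of self-attention is quadratic in sequence length, halving the token count in this way roughly quarters the cost of each subsequent attention layer.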
Lastly, we will delve into specific use cases, such as medical imaging and 3D object analysis, to
highlight ViTs’ practical impact and potential for further development. This comprehensive overview
aims to provide a deeper understanding of ViTs, their transformative capabilities, and their role in
shaping the future of computer vision research.