Liceria & Co.
MEDIA CAPTION GENERATOR
USING DEEP LEARNING TECHNIQUES

RAHUL N MANESH
SARANG C SANTHOSH
SHYAMKUMAR S
SREEHARI E S
CONTENTS
INTRODUCTION
OBJECTIVES
ARCHITECTURE
METHODOLOGY
ALGORITHMS USED
MODEL
FLOWCHART
RESULTS
TASK IDENTIFICATION & ALLOCATION
CONCLUSION
INTRODUCTION
In modern communication, images and videos are key tools for conveying messages and narratives effectively.
Descriptive captions greatly improve accessibility, aiding visually impaired individuals and diverse audiences.
The media caption generator uses deep learning techniques to generate contextually appropriate captions automatically.
Captions are presented as text overlays and also converted into audio files for a comprehensive accessibility approach.
OBJECTIVES
Detailed Image Descriptions
Spatial Awareness
Object Recognition
Extracting the most informative features from each image by selecting and combining variables into features, thereby reducing the amount of data: a CNN converts each image into a fixed-length informative feature vector.
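The reduction to a fixed-length vector can be sketched as follows; the image size, the number of filters, and the random filter values are illustrative assumptions, not the project's actual network:

```python
# Sketch of how a CNN reduces an image to a fixed-length feature vector.
# Shapes and random filters here are toy assumptions for illustration.
import numpy as np

def conv2d_valid(image, kernel):
    """Single-channel 2-D 'valid' convolution (cross-correlation)."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def extract_features(image, kernels):
    """Convolve with each filter, apply ReLU, then global-average-pool
    each feature map to one number -> a fixed-length vector."""
    maps = [np.maximum(conv2d_valid(image, k), 0.0) for k in kernels]
    return np.array([m.mean() for m in maps])

rng = np.random.default_rng(0)
image = rng.random((32, 32))                      # stand-in grayscale image
kernels = [rng.standard_normal((3, 3)) for _ in range(8)]
features = extract_features(image, kernels)
print(features.shape)  # (8,) -- same length regardless of image size
```

Because each feature map is pooled to a single number, images of any size yield a vector of the same length, which is what lets the caption model consume a uniform input.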
To train the model, we use the 6000+ training images, generating the input and output sequences in batches.
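The input/output sequences above can be sketched as a batch generator; the toy word ids and batch size are assumptions for illustration:

```python
# Minimal sketch of turning captions into (input sequence, next word)
# training pairs, yielded in batches. Word ids below are toy assumptions.
def make_pairs(caption_ids):
    """For caption [w1, w2, w3], yield ([w1], w2), ([w1, w2], w3), ..."""
    for i in range(1, len(caption_ids)):
        yield caption_ids[:i], caption_ids[i]

def batch_generator(captions, batch_size):
    """Yield lists of (input_seq, next_word) pairs of size batch_size."""
    batch = []
    for cap in captions:
        for pair in make_pairs(cap):
            batch.append(pair)
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch            # final partial batch

captions = [[1, 4, 7, 2], [1, 5, 2]]   # <start>=1, <end>=2, toy word ids
batches = list(batch_generator(captions, batch_size=2))
print(batches[0])  # [([1], 4), ([1, 4], 7)]
```

Generating pairs lazily this way avoids materializing every partial caption for 6000+ images in memory at once.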
MODULE 4: TESTING AND EVALUATING THE MODEL
The trained model is used to generate caption predictions, and the image captioning model is evaluated using metrics such as BLEU.
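As a simplified illustration of how BLEU scores a generated caption, the snippet below computes clipped unigram precision (BLEU-1); a real evaluation would typically use a library implementation (e.g. NLTK's `corpus_bleu`) with higher-order n-grams and a brevity penalty:

```python
# Simplified BLEU-1 (clipped unigram precision) against one reference;
# the example sentences are illustrative, not project outputs.
from collections import Counter

def bleu1(candidate, reference):
    """Fraction of candidate words that appear in the reference,
    clipped so a repeated word cannot be counted more often than
    it occurs in the reference."""
    cand, ref = Counter(candidate), Counter(reference)
    clipped = sum(min(n, ref[w]) for w, n in cand.items())
    return clipped / max(len(candidate), 1)

generated = "a dog runs on the grass".split()
reference = "a dog is running on the grass".split()
print(round(bleu1(generated, reference), 2))  # 0.83 (5 of 6 words match)
```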
Finally, the generated captions are converted to speech using the Python library gTTS (Google Text-to-Speech).
ALGORITHMS USED
1. CNN (Convolutional Neural Network)
A Convolutional Neural Network (ConvNet/CNN) is a deep learning algorithm that takes an input image, assigns importance (learnable weights and biases) to various aspects or objects in the image, and learns to differentiate one from another.
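The "learnable weights" idea can be illustrated with a single convolution filter; here a fixed vertical-edge kernel stands in for weights that a real CNN would learn from data, and the 5x5 image is a toy assumption:

```python
# Toy illustration of a convolution filter responding to an image aspect.
# In a trained CNN the kernel values are learned; here a hand-written
# vertical-edge kernel is used as a stand-in.
import numpy as np

image = np.zeros((5, 5))
image[:, 2:] = 1.0                          # left half dark, right half bright
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)   # vertical-edge detector

out = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)
print(out)  # strong responses (3.0) where the dark-to-bright edge lies
```

Stacking many such filters, each tuned to a different aspect, is what lets a CNN tell one object apart from another.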
Real-Time Accessibility
Continuous Improvement
Fostering Independence
THANK YOU