Image Captioning on Flickr8k Using Encoder-Decoder Models

Download Flickr8k

Download the Flickr8k dataset and unzip it into a flickr8k directory.

Set up virtual environment

python3 -m venv venv
. venv/bin/activate

Install requirements

pip install -r requirements.txt

Download spaCy vocab

python -m spacy download en_core_web_sm
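
Before starting a training run, it can help to confirm that the setup steps above took effect in the active virtual environment. The following is a minimal sketch, not part of the repository: the `missing_packages` helper and the `REQUIRED` list are hypothetical, and the authoritative dependency list remains requirements.txt. It relies on the fact that `spacy download` installs `en_core_web_sm` as an importable package.

```python
import importlib.util

# Assumed top-level import names; the authoritative list is requirements.txt.
REQUIRED = ["spacy", "en_core_web_sm"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("spaCy and en_core_web_sm are importable.")
```

If anything is reported missing, re-running the install and download commands inside the activated venv is the likely fix.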

Run training loop

python main.py

Choosing the model type

We provide both LSTM-based and GRU-based models; see model.py and model_gru.py, respectively.
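
How main.py chooses between the two files is not described here. One possible way to switch between them from your own script is sketched below. The module names come from the files named above; the `"lstm"` and `"gru"` keys and both helper functions are assumptions made for illustration, and the classes defined inside each module are not shown because this README does not name them.

```python
import importlib

# Module names come from the files named above; the "lstm"/"gru" keys are
# labels invented for this sketch.
DECODER_MODULES = {"lstm": "model", "gru": "model_gru"}


def decoder_module_name(kind):
    """Map a decoder kind ("lstm" or "gru") to the module that defines it."""
    try:
        return DECODER_MODULES[kind.lower()]
    except KeyError:
        raise ValueError(f"unknown decoder kind: {kind!r}") from None


def load_decoder_module(kind):
    """Import model.py or model_gru.py; run from the repository root."""
    return importlib.import_module(decoder_module_name(kind))
```

Calling `load_decoder_module("gru")` from the repository root would import model_gru.py, provided its own dependencies are installed.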

Evaluating Models

We have a notebook comparing the epoch losses of each model in model_comparison_graphs.ipynb.

See the results/ directory for per-epoch loss data in CSV files. We've also included a .ipynb notebook for each model to analyze various metrics and run inference:

1. `resnext_gru_eval_3_layer.ipynb`
2. `resnext_lstm_eval_single_layer.ipynb`
3. `resnext_lstm_eval_3_layer.ipynb`
4. `resnext_gru_eval_single_layer.ipynb`

eval.ipynb is provided as a reference template notebook for evaluating a model.
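
The epoch loss CSV files in results/ can also be inspected outside the notebooks. The sketch below uses only the standard library and assumes each file has a header row with columns named `epoch` and `loss`. Those column names are a guess, not taken from the repository, so pass the real ones via `epoch_col` and `loss_col` if they differ.

```python
import csv


def read_epoch_losses(path, epoch_col="epoch", loss_col="loss"):
    """Read (epoch, loss) pairs from a CSV file with a header row."""
    with open(path, newline="") as f:
        return [(int(row[epoch_col]), float(row[loss_col]))
                for row in csv.DictReader(f)]


def best_epoch(losses):
    """Return the (epoch, loss) pair with the lowest loss."""
    return min(losses, key=lambda pair: pair[1])
```

Running `best_epoch(read_epoch_losses(path))` on each model's CSV gives a quick way to compare where each model's loss bottomed out.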

NOTE: .pt model files (weights) are available upon request. We have NOT included them in this repository because they exceed the file sizes Git allows.

tkobil/image-captioning-using-encoder-decoder-models
