This repository contains the code for the bachelor's thesis *Enhancing Multimedia Retrieval with Emotion-Aware Augmented Reality Interfaces*. It includes:
- A retrieval engine that supports emotion- and sentiment-aware search
- An Android application used as the user interface
- Ensure you have Android Studio installed.
- Clone this repository locally.
- Put your Android device (Android version 8.1 / API 27 or higher) into Developer Mode.
- Enable USB Debugging and connect your device to your computer via USB.
- Open the project in Android Studio and run the app. It should automatically install and launch on your device.
- To connect to the university’s internal services, install the GlobalProtect VPN app and log in using your university credentials.
- Ensure Python 3.10 or higher is installed.
- Install the required Python packages:
pip install -r Backend/requirements.txt
- A system with a GPU is recommended, as multiple machine learning models are used.
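  A minimal sketch for checking whether a GPU is visible, assuming the models run on PyTorch (which the requirements file should install):

  ```python
  # Quick check whether PyTorch can see a CUDA-capable GPU.
  # Assumes torch is installed (e.g. via Backend/requirements.txt).
  import torch

  if torch.cuda.is_available():
      print("GPU available:", torch.cuda.get_device_name(0))
  else:
      print("No GPU detected; the models will fall back to CPU and run slower.")
  ```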
- Create a database called multimedia_db (or adjust the name in the code) and set up the schema as defined in DB_schema.sql.
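  The database system is not specified in this section; assuming a local PostgreSQL instance and the psycopg2 driver, creating the database and applying the schema could look roughly like this (user and host are placeholders to adapt to your setup):

  ```python
  # Hypothetical sketch: create multimedia_db and apply DB_schema.sql.
  # Assumes a local PostgreSQL server and the psycopg2 driver; adapt as needed.
  import psycopg2

  admin = psycopg2.connect(dbname="postgres", user="postgres", host="localhost")
  admin.autocommit = True  # CREATE DATABASE cannot run inside a transaction block
  with admin.cursor() as cur:
      cur.execute("CREATE DATABASE multimedia_db")
  admin.close()

  conn = psycopg2.connect(dbname="multimedia_db", user="postgres", host="localhost")
  with conn, conn.cursor() as cur:
      with open("DB_schema.sql") as schema:
          cur.execute(schema.read())  # run the schema shipped with the repository
  conn.close()
  ```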
- Start the backend insertion engine (default port is 8000):
python3 emotion_enhanced_engine.py
- Run the script to insert videos into the database:
python3 upload_scripts/upload_videos.py
This will insert all the videos and extract their semantic and emotional features; be aware that it can take a long time.
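  If the upload script cannot reach the insertion engine, a quick way to confirm that something is listening on port 8000 (localhost is assumed here; adjust the host if the engine runs elsewhere):

  ```python
  # Quick reachability check for the insertion engine (default port 8000).
  import socket

  with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
      s.settimeout(2)
      if s.connect_ex(("localhost", 8000)) == 0:
          print("Insertion engine is reachable on port 8000.")
      else:
          print("Nothing is listening on port 8000; start emotion_enhanced_engine.py first.")
  ```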
- After the process is complete, run the OCR insertion script:
python3 insert_OCR_data.py
- Create the index on the embeddings:
python3 create_index_embedding.py
Once all videos and OCR data have been inserted, start the remaining backend processes so that the frontend can run as intended.
- Start the search API:
python3 search_api.py
- Start the sentiment API:
python3 Sentiment_Detection/emotion_api.py
- Start the TinyLlama model on the server. This works best when run with Ollama.
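Once TinyLlama is running under Ollama, you can verify that it responds through Ollama's local HTTP API; the default port 11434 and the model tag "tinyllama" are assumed here:

```python
# Sanity check that the TinyLlama model answers through Ollama's local API.
# Assumes Ollama's default port 11434 and the model tag "tinyllama".
import json
import urllib.request

payload = json.dumps({
    "model": "tinyllama",
    "prompt": "Reply with the single word: ready",
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```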
Follow the same steps as in the User Setup above. Make sure to:
- Enable USB Debugging
- Connect the mobile device to the university's GlobalProtect VPN
Run the project from Android Studio, and the app should deploy to your connected device.
This project uses the following models and libraries:
- Text-to-Emotion: michellejieli/emotion_text_classifier
- Facial Expression Recognition: trpakov/vit-face-expression
- Face Recognition: DeepFace
- Speech-to-Text: Whisper
- Encoding Images to Embeddings: CLIP
- Language Model: TinyLlama
- HTTP Requests (Android): Volley
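
As an illustration of how these models are typically used, here is a minimal sketch that loads the text-to-emotion classifier with the Hugging Face transformers library; the wrapper code in this repository may differ:

```python
# Minimal sketch: classify the emotion of a text snippet with the
# michellejieli/emotion_text_classifier model via the transformers pipeline.
from transformers import pipeline

classifier = pipeline("text-classification", model="michellejieli/emotion_text_classifier")
print(classifier("I can't believe how beautiful this scene is!"))
# Prints a list like [{'label': ..., 'score': ...}]
```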