
🛠️ Weekend Project: Building a Simple S3 Uploader with ELK & SQLite Logging

May 14, 2025 · 4 min read


This weekend, I gave myself a challenge — build a small Python app that uploads a document to AWS S3 and tracks everything it does in the process.

But, being me, I couldn’t stop at just uploading. I also wanted to log and persist that upload — both locally in SQLite and visually in Elasticsearch using Kibana.

Let me walk you through how it came together, with code and context.


🔧 What I Wanted to Build

  • A simple Flask web app that allows file uploads.
  • Upload the file to an S3 bucket.
  • Save metadata (filename, upload time, IP, S3 URL) to:
      • 🗃️ a SQLite database
      • 📊 Elasticsearch
  • Visualize logs in Kibana (with Docker Compose).

📦 Step 1: Installing Required Packages

pip install flask sqlalchemy boto3 elasticsearch python-dotenv

These libraries do the heavy lifting:

  • flask: for the web interface
  • sqlalchemy: to interact with SQLite
  • boto3: to talk to AWS S3
  • elasticsearch: to send data to Elasticsearch
  • python-dotenv: to load AWS credentials from a .env file
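
For reference, the .env file is just key/value pairs. The variable names below match what the code in Step 3 reads; the values are placeholders (never commit real credentials):

# .env — placeholder values for illustration
AWS_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXXXX
AWS_SECRET_ACCESS_KEY=your-secret-access-key
S3_BUCKET_NAME=my-upload-bucket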

🧱 Step 2: Setting Up SQLite with SQLAlchemy

# models.py
from sqlalchemy import create_engine, Column, String, DateTime
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from datetime import datetime

engine = create_engine('sqlite:///uploads.db')
Base = declarative_base()

class Upload(Base):
    __tablename__ = 'uploads'
    filename = Column(String, primary_key=True)
    s3_url = Column(String)
    upload_time = Column(DateTime, default=datetime.utcnow)
    ip = Column(String)

What’s happening here?

  • We define a SQLite database called uploads.db.
  • We define an Upload model/table to store info about each file:
      • filename: the name of the file
      • s3_url: the full S3 URL after upload
      • upload_time: timestamp of the upload
      • ip: IP address of the uploader

Next, we initialize the database and write metadata to it:

Session = sessionmaker(bind=engine)
session = Session()

def init_db():
    Base.metadata.create_all(engine)

def save_metadata(data):
    upload = Upload(**data)
    session.add(upload)
    session.commit()

  • init_db() sets up the table.
  • save_metadata() saves the metadata after each upload.
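
A quick way to sanity-check this from a Python shell — a minimal sketch with made-up values, using the same keys the Flask app sends later:

from models import init_db, save_metadata

init_db()
save_metadata({
    "filename": "report.pdf",   # example values for illustration
    "s3_url": "https://my-upload-bucket.s3.amazonaws.com/report.pdf",
    "ip": "127.0.0.1"
})
# upload_time is filled in automatically by the model's default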

☁️ Step 3: Uploading Files to AWS S3

# aws.py
import boto3
import os
from dotenv import load_dotenv
load_dotenv()
AWS_ACCESS_KEY_ID = os.getenv("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.getenv("AWS_SECRET_ACCESS_KEY")
BUCKET = os.getenv("S3_BUCKET_NAME")

Here, we load the AWS credentials from a .env file.

Then we initialize the S3 client:

s3 = boto3.client(
    's3',
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)

And define the upload function:

def upload_to_s3(file):
    s3.upload_fileobj(file, BUCKET, file.filename)
    return f"https://{BUCKET}.s3.amazonaws.com/{file.filename}"

This function takes a file object, uploads it to S3, and returns the object's URL (which is only publicly reachable if the bucket allows public reads).
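
If your bucket blocks public access (the default for new buckets), boto3 can hand out a time-limited presigned URL instead — a minimal sketch, reusing the s3 client and BUCKET from above rather than what the app itself does:

def presigned_url(key, expires_in=3600):
    # Generate a time-limited GET URL for a private object
    return s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': BUCKET, 'Key': key},
        ExpiresIn=expires_in
    )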

📊 Step 4: Logging to Elasticsearch

# elk_logger.py
from elasticsearch import Elasticsearch
from datetime import datetime
es = Elasticsearch([{'host': 'localhost', 'port': 9200, 'scheme': 'http'}])

We connect to a local Elasticsearch instance running on port 9200.

Then define the log function:

def log_to_elk(metadata):
    doc = {
        "filename": metadata["filename"],
        "s3_url": metadata["s3_url"],
        "ip": metadata["ip"],
        "upload_time": datetime.utcnow()
    }
    es.index(index="uploads", document=doc)

It stores each upload log as a document under the uploads index.
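
To confirm documents are actually landing, a quick match-all search with the same es client defined above works as a sanity check:

# List everything indexed so far
resp = es.search(index="uploads", query={"match_all": {}})
for hit in resp["hits"]["hits"]:
    # each _source is one upload log document
    print(hit["_source"]["filename"], hit["_source"]["upload_time"])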

🐳 Step 5: Setting Up ELK Stack with Docker Compose

# docker-compose.yml
version: '3.7'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.6.2
    container_name: es
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - 9200:9200
  kibana:
    image: docker.elastic.co/kibana/kibana:8.6.2
    container_name: kibana
    ports:
      - 5601:5601
    depends_on:
      - elasticsearch

This YAML file spins up two containers:

  • elasticsearch: stores logs
  • kibana: visualizes logs on port 5601

To run it, just:

docker-compose up -d

🔁 Final Integration with Flask

In app.py, we:

  • Accept file uploads
  • Upload to S3
  • Collect metadata
  • Log to both SQLite and Elasticsearch

Here’s a very basic structure:

from flask import Flask, request
from aws import upload_to_s3
from models import init_db, save_metadata
from elk_logger import log_to_elk

app = Flask(__name__)
init_db()

@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']
    ip = request.remote_addr
    s3_url = upload_to_s3(file)
    metadata = {
        "filename": file.filename,
        "s3_url": s3_url,
        "ip": ip
    }
    save_metadata(metadata)
    log_to_elk(metadata)
    return {"message": "Uploaded successfully", "url": s3_url}

if __name__ == '__main__':
    app.run(debug=True)  # start the dev server when run directly

🧠 What I Learned

  • Setting up AWS S3 with boto3 is straightforward when credentials are managed properly.
  • SQLAlchemy makes managing local data easy and readable.
  • Elasticsearch and Kibana offer incredible observability — even for small projects.
  • Docker Compose is perfect for local ELK setups — one command, and your stack is up.

Challenges Faced

Version Mismatch Between Python Client and Elasticsearch Server:

  • I initially installed an incompatible Elasticsearch Python client version (v9.0.1), which caused issues with request headers. The solution was to install the matching version (v8.6.2) to ensure compatibility.
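
Pinning the client to match the server version in the Compose file avoids this:

pip install "elasticsearch==8.6.2"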

Incompatibility with Python 3.13:

  • The deprecated collections ABC aliases were removed from the top-level collections module in newer Python versions, so those imports fail under Python 3.13. This was resolved by changing the import to use collections.abc instead of collections.
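
The fix is mechanical — the exact name depends on which dependency raised the error (MutableMapping here is just an example):

# Before — fails on Python 3.10 and later:
# from collections import MutableMapping

# After:
from collections.abc import MutableMapping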

Port Conflicts:

  • The default Flask port (5000) was already in use by another application. I fixed this by either killing the conflicting process or running Flask on a different port.
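
Switching ports is a one-liner (5001 here is just an arbitrary free port); the same can be done by passing port=5001 to app.run():

flask run --port 5001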

S3 Bucket Policy and IAM Permissions Issues:

  • The IAM user lacked proper permissions to upload to the S3 bucket. I had to add an IAM policy for the user in addition to the S3 bucket policy to allow s3:PutObject actions.
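
For reference, a minimal IAM policy statement of the kind that was needed (the bucket name is a placeholder):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::my-upload-bucket/*"
    }
  ]
}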

Incorrect Accept Headers in Elasticsearch Client Requests:

  • Due to version differences between the Elasticsearch client and server, I needed to specify the correct Accept headers or align the client and server versions.

Missing Python Module:

  • I encountered a ModuleNotFoundError due to missing dependencies. The fix was to install the necessary Python packages within a virtual environment.

Mixing System Python with Virtual Environment:

  • I sometimes ran scripts with the global Python environment rather than the virtual environment, which led to dependency issues. Ensuring the virtual environment was active resolved this.
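
Creating and activating the virtual environment before installing anything keeps the two environments from mixing (macOS/Linux shown):

python -m venv venv
source venv/bin/activate
pip install flask sqlalchemy boto3 elasticsearch python-dotenv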

Lack of Documentation:

  • Initially, there was no README or clear documentation for setting up or running the project. I created a detailed README and flowchart to outline the steps and architecture.

🎁 Bonus: Why This Is a Great DevOps Starter Project

This mini-project gave me exposure to:

✅ Flask + REST APIs
✅ AWS S3
✅ SQLAlchemy with SQLite
✅ Elasticsearch for centralized logging
✅ Docker & Docker Compose
✅ Environment management with .env

Want the code repo or a follow-up tutorial on deploying this to the cloud? Let me know!

Written by Mukti Mishra

DevOps & Automation Engineer | #100DaysOfDevOps | Writing about CI/CD, Docker, Cloud & Infra as Code | Learning in public
