🛠️ Weekend Project: Building a Simple S3 Uploader with ELK & SQLite Logging
This weekend, I gave myself a challenge — build a small Python app that uploads a document to AWS S3 and tracks everything it does in the process.
But, being me, I couldn’t stop at just uploading. I also wanted to log and persist each upload, both locally in SQLite and in Elasticsearch, where I could visualize it with Kibana.
Let me walk you through how it came together, with code and context.
🔧 What I Wanted to Build
- A simple Flask web app that allows file uploads.
- Upload the file to an S3 bucket.
- Save metadata (filename, upload time, IP, S3 URL) to:
- 🗃️ SQLite database
- 📊 Elasticsearch
- Visualize logs in Kibana (with Docker Compose).
📦 Step 1: Installing Required Packages
pip install flask sqlalchemy boto3 elasticsearch python-dotenv
These libraries do the heavy lifting:
- flask: for the web interface
- sqlalchemy: to interact with SQLite
- boto3: to talk to AWS S3
- elasticsearch: to send data to Elasticsearch
- python-dotenv: to load AWS credentials from a .env file
🧱 Step 2: Setting Up SQLite with SQLAlchemy
# models.py
from sqlalchemy import create_engine, Column, String, DateTime
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from datetime import datetime
engine = create_engine('sqlite:///uploads.db')
Base = declarative_base()

class Upload(Base):
    __tablename__ = 'uploads'
    filename = Column(String, primary_key=True)
    s3_url = Column(String)
    upload_time = Column(DateTime, default=datetime.utcnow)
    ip = Column(String)
What’s happening here?
- We define a SQLite database called uploads.db.
- We define an Upload model/table to store info about each file:
  - filename: the name of the file
  - s3_url: the full S3 URL after upload
  - upload_time: timestamp of the upload
  - ip: IP address of the uploader
Next, we initialize the database and write the upload metadata to it:
Session = sessionmaker(bind=engine)
session = Session()

def init_db():
    Base.metadata.create_all(engine)

def save_metadata(data):
    upload = Upload(**data)
    session.add(upload)
    session.commit()
✅ init_db() sets up the table
✅ save_metadata() saves the data after each upload
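A quick sanity check, assuming models.py sits in the project root (the filename, URL, and IP below are placeholder values):
# quick local test of the helpers above
from models import init_db, save_metadata, session, Upload

init_db()
save_metadata({
    "filename": "report.pdf",
    "s3_url": "https://my-bucket.s3.amazonaws.com/report.pdf",
    "ip": "127.0.0.1",
})
print(session.query(Upload).count())  # prints 1 on a fresh uploads.db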
☁️ Step 3: Uploading Files to AWS S3
# aws.py
import boto3
import os
from dotenv import load_dotenv
load_dotenv()

AWS_ACCESS_KEY_ID = os.getenv("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.getenv("AWS_SECRET_ACCESS_KEY")
BUCKET = os.getenv("S3_BUCKET_NAME")
Here, we load the AWS credentials from a .env file.
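For reference, the .env file looks something like this (placeholder values; keep it out of version control):
# .env (placeholder values, never commit real credentials)
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
S3_BUCKET_NAME=your-bucket-name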
Then we initialize the S3 client:
s3 = boto3.client(
    's3',
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)
And define the upload function:
def upload_to_s3(file):
    s3.upload_fileobj(file, BUCKET, file.filename)
    return f"https://{BUCKET}.s3.amazonaws.com/{file.filename}"
This function takes in a file object, uploads it to S3, and returns the object's URL (publicly readable only if the bucket policy allows it).
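If the bucket stays private, a time-limited presigned URL is one alternative to the raw object URL. A minimal sketch, reusing the same s3 client and BUCKET defined in aws.py (the helper name is hypothetical):
# alternative: return a presigned URL that expires after an hour
def upload_to_s3_presigned(file):
    s3.upload_fileobj(file, BUCKET, file.filename)
    return s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': BUCKET, 'Key': file.filename},
        ExpiresIn=3600,  # link validity in seconds
    )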
📊 Step 4: Logging to Elasticsearch
# elk_logger.py
from elasticsearch import Elasticsearch
from datetime import datetime
es = Elasticsearch([{'host': 'localhost', 'port': 9200, 'scheme': 'http'}])
We connect to a local Elasticsearch instance running on port 9200.
Then define the log function:
def log_to_elk(metadata):
    doc = {
        "filename": metadata["filename"],
        "s3_url": metadata["s3_url"],
        "ip": metadata["ip"],
        "upload_time": datetime.utcnow()
    }
    es.index(index="uploads", document=doc)
It stores each upload log as a document in the uploads index.
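To confirm documents are landing, here is a quick query sketch (assuming the 8.x Python client and the same local instance):
# verify documents are landing in the uploads index
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200, 'scheme': 'http'}])
resp = es.search(index="uploads", query={"match_all": {}})
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["filename"], hit["_source"]["s3_url"])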
🐳 Step 5: Setting Up ELK Stack with Docker Compose
# docker-compose.yml
version: '3.7'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.6.2
    container_name: es
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - 9200:9200
  kibana:
    image: docker.elastic.co/kibana/kibana:8.6.2
    container_name: kibana
    ports:
      - 5601:5601
    depends_on:
      - elasticsearch
This YAML file spins up two containers:
- elasticsearch: stores the logs
- kibana: visualizes the logs on port 5601
To run it, just:
docker-compose up -d
🔁 Final Integration with Flask
In app.py, we:
- Accept file uploads
- Upload to S3
- Collect metadata
- Log to both SQLite and Elasticsearch
Here’s a very basic structure:
from flask import Flask, request
from aws import upload_to_s3
from models import init_db, save_metadata
from elk_logger import log_to_elk
app = Flask(__name__)
init_db()

@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']
    ip = request.remote_addr
    s3_url = upload_to_s3(file)
    metadata = {
        "filename": file.filename,
        "s3_url": s3_url,
        "ip": ip
    }
    save_metadata(metadata)
    log_to_elk(metadata)
    return {"message": "Uploaded successfully", "url": s3_url}

if __name__ == '__main__':
    app.run(debug=True)  # Flask's default port is 5000
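To try it end to end, here is a small smoke test from another terminal. This is a sketch that assumes the requests package is installed and the app is running on the default port 5000; the file name is a placeholder:
# send a test file to the running app
import requests

with open("report.pdf", "rb") as f:
    resp = requests.post(
        "http://localhost:5000/upload",
        files={"file": ("report.pdf", f)},
    )
print(resp.json())  # {"message": "Uploaded successfully", "url": "..."}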
🧠 What I Learned
- Setting up AWS S3 with boto3 is straightforward when credentials are managed properly.
- SQLAlchemy makes managing local data easy and readable.
- Elasticsearch and Kibana offer incredible observability — even for small projects.
- Docker Compose is perfect for local ELK setups — one command, and your stack is up.
Challenges Faced
Version Mismatch Between Python Client and Elasticsearch Server:
- I initially installed an incompatible Elasticsearch Python client version (v9.0.1), which caused issues with request headers. The solution was to install the matching version (v8.6.2) to ensure compatibility.
Incompatibility with Python 3.13:
- The collections module no longer exposes the abstract base classes that some older libraries import (they now live in collections.abc), which caused import errors under Python 3.13. This was resolved by modifying the import statement to use collections.abc instead of collections.
Port Conflicts:
- The default Flask port (5000) was already in use by another application. I fixed this by either killing the conflicting process or running Flask on a different port.
S3 Bucket Policy and IAM Permissions Issues:
- The IAM user lacked proper permissions to upload to the S3 bucket. I had to attach an IAM policy to the user, in addition to the S3 bucket policy, to allow s3:PutObject actions (a minimal policy sketch follows).
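For context, here is a minimal user policy sketch granting just the upload permission (the bucket name is a placeholder; real setups usually need more than this):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}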
Incorrect Accept Headers in Elasticsearch Client Requests:
- Due to version differences between the Elasticsearch client and server, I needed to specify the correct Accept headers or align the client and server versions.
Missing Python Module:
- I encountered a ModuleNotFoundError due to missing dependencies. The fix was to install the necessary Python packages within a virtual environment.
Mixing System Python with Virtual Environment:
- I sometimes ran scripts with the global Python environment rather than the virtual environment, which led to dependency issues. Ensuring the virtual environment was active resolved this.
Lack of Documentation:
- Initially, there was no README or clear documentation for setting up or running the project. I created a detailed README and flowchart to outline the steps and architecture.
🎁 Bonus: Why This Is a Great DevOps Starter Project
This mini-project gave me exposure to:
✅ Flask + REST APIs
✅ AWS S3
✅ SQLAlchemy with SQLite
✅ Elasticsearch for centralized logging
✅ Docker & Docker Compose
✅ Environment management with .env
Want the code repo or a follow-up tutorial on deploying this to the cloud? Let me know!