Operational Analytics & Metric Investigation
Trainity SQL Project | Data Analyst Simulation
Project Description
• This project simulates a Data Analyst role at a tech firm.
• Goal: Use advanced SQL to derive insights from operational data.
• Scope: Job Data Analysis & Investigating Metric Spikes.
• Key Tasks: Analyzing trends, spotting anomalies, and reporting insights.
Tools & Technologies Used
• - MySQL Workbench 8.0 CE
• - CSV data imports
• - SQL for analysis and metrics derivation
• - Google Drive for result storage
• - GitHub for version control
Jobs Reviewed Per Hour (Nov 2020)
• Query:
• SELECT COUNT(job_id)/(30*24) AS jobs_reviewed_hourly FROM job_data;
• Output: 0.0111
• Query:
• SELECT COUNT(DISTINCT job_id)/(30*24) AS jobs_reviewed_hourly_distinct FROM job_data;
• Output: 0.0083
Throughput: 7-Day Rolling Average
• Used window functions for rolling average:
• - COUNT(job_id) per day
• - AVG OVER 6 preceding rows
• Reasoning: 7-day average smooths out anomalies better than daily spikes.
• Sample Output:
• 25-11: 1.0 | 30-11: 1.33
Language Distribution (Last 30 Days)
• Query:
• SELECT language, COUNT(*) / (SELECT COUNT(*) FROM job_data) * 100 AS percentage
• Results (approx.):
• - Persian: 37.5%
• - English, Arabic, Hindi, French, Italian: ~12.5% each
Detecting Duplicates
• Used ROW_NUMBER() OVER (PARTITION BY job_id)
• Filtered for row_num > 1 to find duplicates.
• Sample Duplicates:
• - Job ID 23 appeared multiple times with same content.
Weekly User Engagement
• Query:
• EXTRACT(WEEK FROM occurred_at), COUNT(DISTINCT user_id)
• Sample Output:
• Week 18: 791 | Week 31: 1685
User Growth Tracking
• Query includes:
• - EXTRACT(YEAR, WEEK)
• - COUNT(DISTINCT active users)
• - Cumulative SUM
• Total Active Users: 9381 (2013–2014)
Weekly Retention (Signup Cohorts)
• Used JOIN on signup and engagement week.
• Filtered for 'complete_signup' and 'engagement'.
• Calculated retention_week = engagement_week - signup_week.
Engagement By Device (Weekly)
• Grouped by year, week, and device type.
• COUNT(DISTINCT user_id) per device.
• Used to analyze device-specific engagement trends.
Email Engagement Metrics
• Defined Categories:
• - Sent: 'sent_weekly_digest', 'sent_reengagement_email'
• - Opened: 'email_open'
• - Clicked: 'email_clickthrough'
• Metrics:
• - Opening Rate ≈ SUM(opened)/SUM(sent)
• - Click Rate ≈ SUM(clicked)/SUM(sent)
Conclusion & Learnings
• Applied SQL for operational and behavioral insights.
• Analyzed engagement trends, language shares, and anomalies.
• Improved understanding of rolling metrics, retention analysis, and duplicate checks.