INSTAGRAM INFLUENCER DATA ANALYSIS
Introduction
This project presents an in-depth analysis of leading Instagram influencers, focusing on their global
reach, audience engagement, and impact across diverse countries and categories. Using robust data
cleaning, exploratory data analysis, and advanced interactive dashboard development, the project
empowers social media professionals, brands, and analysts to uncover actionable patterns in
influencer performance.
The compiled dataset includes 200 top Instagram influencers with detailed attributes such as:
• Channel/Account information (Name, Country, Follower tier)
• Popularity metrics (Follower counts, Total and average likes, Posts)
• Engagement scores (Calculated rates, Influence per post, Activity level)
• Country segmentation and follower distribution
To extract strategic insights, a multi-phase workflow was employed:
• Python (pandas, NumPy): For rigorous data cleansing, transformation, and feature
engineering to ensure analysis-ready inputs.
• Power BI: For developing interactive visual dashboards, dynamic filters, and compelling data
storytelling, enabling real-time exploration of influencer trends.
Project Goals
• Analyze patterns of audience reach, engagement, and activity across top global Instagram
influencers.
• Identify market leaders based on robust quantitative indicators such as engagement scores,
influence scores, and country-wise impact.
• Detect trends and outlier behaviors among influencer tiers (Macro, Mega) and regional
groupings.
• Build an interactive dashboard enabling brands, agencies, and stakeholders to dynamically
explore influencer data and refine social media strategies.
By combining detailed statistical analysis with state-of-the-art dashboarding, this project provides a
clear, actionable view into the high-impact world of Instagram influencers—equipping decision
makers and marketers with critical insights to optimize campaign investments, track influencer ROI,
and benchmark against industry leaders.
Project Objectives
The primary objectives of this project are to deliver a comprehensive, data-driven understanding of
Instagram influencer performance at a global scale. Specifically, the project aims to:
1. Analyze Influencer Reach and Engagement Patterns
• Examine follower counts, post frequency, and like averages across top influencers.
• Identify trends in audience engagement, activity levels, and like-to-follower ratios in
different regions and content categories.
2. Evaluate Influencer Performance by Country, Tier, and Category
• Assess influencer rankings and performance segmentation by country, tier (Macro,
Mega), and engagement level.
• Compare influencer impact and visibility metrics worldwide to highlight regions or
users with outstanding influence.
3. Explore Demographics and Segmentation
• Track distribution of influencers by follower tier and engagement category.
• Investigate patterns based on country, post volume, and influence scores to reveal key
market insights.
4. Develop a Dynamic, User-Friendly Dashboard for Ongoing Stakeholder Insights
• Create an interactive Power BI dashboard with intuitive filters for country, activity,
and engagement.
• Enable flexible data exploration and drill-down analysis for brands, marketers, and
analysts.
5. Provide Actionable Recommendations for Social Media Strategy
• Identify top-performing accounts, content types, and regions for marketing focus.
• Support evidence-based decision-making for influencer collaboration, campaign
investment, and digital strategy optimization.
Dataset Overview
The dataset used in this project provides a comprehensive record of the top Instagram influencers,
capturing detailed account, audience, and performance information across multiple countries and
influencer tiers.
• Total Influencers Covered: 200 of the world's leading Instagram accounts, representing a
variety of regions, audience sizes, and content categories.
• Total Records: Each entry represents a unique influencer, including their reach, engagement,
and content activity at a given point in time.
• Number of Features: The core dataset comprises both original and engineered attributes to
support advanced analysis:
Original Features:
1. Channel Info – Instagram handle or account name
2. Influence Score – Calculated composite score indicating influencer impact
3. Posts – Total number of posts
4. Followers – Follower count (absolute and tiered)
5. Avg Likes – Average likes per post
6. Engagement Rate – Recent audience engagement (e.g., 60-day average)
7. Total Likes – Sum of all likes accrued
8. Country – Represented country or origin
9. Likes-to-Followers Ratio – Engagement intensity index
Engineered Features:
10. Engagement Score – Advanced engagement metric for fine-tuned rankings
11. Activity Level – Categorized as High, Medium, or Low based on posting frequency
12. Follower Tier – Macro (10M–100M) or Mega (100M+) classification
13. Influence per Post – Normalized influence impact for fair comparisons
14. Country Rank – National leaderboard placement
15. Engagement Level – Segmentation by activity and resonance
16. Posts-per-Million-Followers – Content frequency relative to audience size
Initial Observations:
• Data Completeness: Systematic cleaning was performed—removing incomplete rows,
normalizing names/metrics, and flagging unknown countries to ensure analytic reliability.
• Data Quality: Outliers and inconsistencies were identified and addressed for trustworthy
influencer ranking.
• Standardization Needs: Country names, influencer IDs, and tier definitions were
harmonized to enable robust comparison.
• Segmentation & Mapping: Advanced filters by country, engagement level, and follower tier
empower detailed exploration of influence dynamics across markets.
This dataset provides exceptional opportunities to analyze influencer trends by country, audience
segment, and engagement profile, enabling robust insights into Instagram’s digital influence landscape
and supporting data-driven marketing, branding, and campaign benchmarking.
Data Cleaning & Preprocessing
A critical step in ensuring the accuracy and reliability of insights was the systematic cleaning and
preprocessing of the Instagram influencer dataset. The following actions were undertaken to prepare
the data for advanced analysis and dashboard visualization:
1. Missing Value Treatment
• Removed records with irretrievable missing fields (e.g., influencer name, follower count,
country, or engagement stats) to maintain data integrity.
• Verified completeness of all essential inputs, especially those driving rankings and
segmentation in Power BI.
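The missing-value step could look roughly like the pandas sketch below. The column names, the sample
rows, and the choice of which fields count as irretrievable are assumptions for illustration, not the
project's actual schema; unknown countries are flagged as "Not Provided" rather than dropped.

```python
import pandas as pd

# Hypothetical sample rows; column names are illustrative only.
raw = pd.DataFrame(
    {
        "channel_info": ["account_a", "account_b", None, "account_d"],
        "followers": [475.8e6, None, 120.0e6, 50.0e6],
        "country": ["United States", "Brazil", "India", None],
    }
)

# Fields assumed to be irretrievable when missing.
ESSENTIAL = ["channel_info", "followers"]

# Remove rows that lack any essential field.
cleaned = raw.dropna(subset=ESSENTIAL).copy()

# Flag unknown countries instead of discarding the influencer.
cleaned["country"] = cleaned["country"].fillna("Not Provided")

# Completeness check: no missing values remain in the cleaned frame.
is_complete = bool(cleaned.notna().all().all())
```

Keeping the flag as an explicit category means the affected accounts still appear in dashboard
totals and can be filtered out when a country-level view is needed.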
2. Data Type & Formatting Standardization
• Standardized follower counts and like statistics to absolute numeric values for proper sorting
and filtering.
• Normalized categorical values, ensuring consistency across features such as country names,
activity levels, and engagement tiers.
• Converted all influencer/channel names to a uniform casing to guarantee precise grouping and
referencing.
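As an illustration of the standardization step, the sketch below converts abbreviated counts to
absolute numbers and normalizes account names. It assumes the raw values use suffixes such as "k",
"m", "b", or "bn" (for example "475.8m"); the helper names parse_count and normalize_name are
hypothetical.

```python
import re

# Multipliers for the suffixes assumed to appear in the raw data.
_MULTIPLIERS = {"k": 1e3, "m": 1e6, "b": 1e9, "bn": 1e9}

_COUNT_PATTERN = re.compile(r"\s*([\d.,]+)\s*(k|m|bn|b)?\s*")


def parse_count(text: str) -> float:
    """Convert a string such as '475.8m' or '57bn' to an absolute number."""
    match = _COUNT_PATTERN.fullmatch(text.lower())
    if match is None:
        raise ValueError(f"unrecognized count: {text!r}")
    number, suffix = match.groups()
    # A missing suffix means the value is already absolute.
    return float(number.replace(",", "")) * _MULTIPLIERS.get(suffix, 1.0)


def normalize_name(name: str) -> str:
    """Trim whitespace and lowercase an account name for consistent grouping."""
    return name.strip().lower()
```

For example, parse_count("3.3k") yields a value of about 3300, and normalize_name(" Cristiano ")
yields "cristiano".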
3. Feature Engineering
• Engagement Score: Calculated composite score based on recent activity, likes, and follower
ratios for nuanced comparison across countries and tiers.
• Activity Level & Engagement Level: Categorized influencers as High, Medium, or Low,
aiding in the dashboard’s filtering capabilities.
• Follower Tier: Classified accounts as Macro (10M–100M followers) or Mega (100M+)
according to standardized industry benchmarks.
• Country Rank & Influence per Post: Derived metrics enabling region and account-based
benchmarking in visualizations.
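Several of the engineered features can be sketched in pandas as shown below. The tier boundary
follows the definition in this report; the column names, sample rows, activity thresholds, ranking
key, and the formula for influence per post are assumptions. The Engagement Score is left out
because its formula is not stated here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; column names are illustrative only.
df = pd.DataFrame(
    {
        "channel_info": ["account_a", "account_b", "account_c", "account_d"],
        "country": ["United States", "United States", "Brazil", "Brazil"],
        "influence_score": [92, 85, 88, 80],
        "posts": [3000, 7000, 500, 12000],
        "followers": [475_000_000, 90_000_000, 150_000_000, 40_000_000],
    }
)

# Follower Tier: Macro (10M to 100M) or Mega (100M+), as defined in the report.
df["follower_tier"] = np.where(df["followers"] >= 100_000_000, "Mega", "Macro")

# Posts per million followers: posting volume relative to audience size.
df["posts_per_million_followers"] = df["posts"] / (df["followers"] / 1_000_000)

# Influence per Post: assumed here to be influence score divided by post count.
df["influence_per_post"] = df["influence_score"] / df["posts"]

# Country Rank: assumed here to rank accounts by followers within each country.
df["country_rank"] = (
    df.groupby("country")["followers"]
    .rank(method="first", ascending=False)
    .astype(int)
)

# Activity Level: placeholder thresholds on total posts.
df["activity_level"] = pd.cut(
    df["posts"],
    bins=[0, 1000, 5000, np.inf],
    labels=["Low", "Medium", "High"],
)
```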
4. Quality Assurance
• Removed duplicate influencer entries and records with conflicting or ambiguous account
information.
• Conducted outlier detection to flag and assess extremely high/low metrics (e.g.,
extraordinarily low engagement for high-follower accounts).
• Ensured uniform segmentation filter labels in both source data and dashboard-friendly fields.
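A minimal sketch of the duplicate removal and outlier flagging follows. The detection method is not
named in this report, so the 1.5 × IQR rule used here is an assumption, as are the column names and
sample values.

```python
import pandas as pd

# Hypothetical sample rows; account_a appears twice on purpose.
df = pd.DataFrame(
    {
        "channel_info": [
            "account_a", "account_a", "account_b",
            "account_c", "account_d", "account_e",
        ],
        "engagement_rate": [0.012, 0.012, 0.015, 0.010, 0.014, 0.260],
    }
)

# Keep the first record for each account.
deduped = df.drop_duplicates(subset="channel_info", keep="first").reset_index(drop=True)

# Flag values outside 1.5 * IQR of the engagement-rate distribution.
q1 = deduped["engagement_rate"].quantile(0.25)
q3 = deduped["engagement_rate"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
deduped["is_outlier"] = ~deduped["engagement_rate"].between(lower, upper)
```

Flagging rather than deleting keeps extreme accounts available for review, which matches the
"flag and assess" wording above.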
By implementing these rigorous data cleaning and preprocessing steps, the Instagram influencer
dataset was transformed into a reliable, standardized, and analysis-ready format. The enhanced dataset
adds critical analytical dimensions and ensures trusted insights into influencer engagement, reach, and
performance comparisons across the global social media landscape.
Exploratory Data Analysis (EDA)
The EDA phase uncovered essential influencer performance patterns and market trends through
summary statistics and visual analytics. Key areas of analysis included:
1. Follower & Like Patterns Across Influencers
• Generated bar charts ranking the top 10 influencers by total followers (e.g., cristiano:
475.8M, kyliejenner: 366.2M).
• Created line and bar charts comparing average likes and new post average likes by
country, showing significant variations in engagement rates by geography and user
base.
2. Influencer Impact by Country & Tier
• Developed pie charts illustrating influence score distribution across countries, led
by the United States (33.96%), with entries labeled "Not Provided" and Brazil also
ranking high.
• Built tree maps depicting distribution of followers by country, highlighting the global
reach of top influencers.
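The country shares behind a pie chart of this kind can be computed as in the sketch below, which
assumes the share is each country's summed influence score divided by the overall total. The sample
values and column names are hypothetical and do not reproduce the reported 33.96%.

```python
import pandas as pd

# Hypothetical sample rows; column names are illustrative only.
df = pd.DataFrame(
    {
        "country": ["United States", "United States", "Brazil", "Not Provided"],
        "influence_score": [90, 80, 85, 85],
    }
)

# Sum influence score per country and express it as a percentage of the total.
share_pct = (
    df.groupby("country")["influence_score"].sum()
    / df["influence_score"].sum()
    * 100
).sort_values(ascending=False)
```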
3. Engagement & Activity Segmentation
• Visualized engagement levels by country and post volume, revealing top countries by
high engagement and activity levels.
• Used horizontal bar and donut charts to segment influencers as Macro or Mega tier
(167 Macro, 33 Mega), supporting detailed audience segmentation analysis.
4. Influencer Performance Benchmarks
• Compiled leaderboards for key metrics such as total likes (kyliejenner: 57bn,
cristiano: 29bn) and highest engagement scores.
• Charted engagement rates for top influencers over recent 60-day windows to identify
leaders in sustained audience interaction.
5. Platform-wide Pattern Discovery
• Identified highly skewed distributions, where a handful of mega-influencers dominate
overall reach, likes, and posting activity.
• Enabled cross-sectional filtering for dynamic insights by country, activity, and
engagement levels, informing Power BI dashboard design.
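One simple way to quantify the skew described above is the share of a total held by the top k
accounts. The helper top_k_share and the sample values below are hypothetical and serve only to
illustrate the calculation.

```python
def top_k_share(values, k):
    """Return the fraction of the total contributed by the k largest values."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("total must be non-zero")
    return sum(ordered[:k]) / total


# Hypothetical follower counts in millions.
followers_m = [400, 300, 100, 50, 50, 40, 30, 20, 5, 5]

# Share of all followers held by the two largest accounts.
concentration = top_k_share(followers_m, 2)
```

A value close to 1 for a small k indicates that a handful of accounts hold most of the reach.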
These findings directly shaped the visual and interactive elements of the dashboard, driving the
selection of KPIs, segmentation filters, and deep-dive capabilities for robust social media influencer
analysis.
Power BI Dashboard
The project included the design and development of two interactive Power BI dashboards that
transformed influencer analytics into a visually compelling and user-friendly interface. The dashboard
suite delivered the following features:
1. Interactive Filtering Capabilities
• Integrated dynamic slicers for segmenting influencers by country, engagement level, activity
tier, and follower tier (Macro vs. Mega).
• Enabled stakeholders to customize their analysis by selecting specific countries, tiers, or
performance categories.
2. Comprehensive Visualizations
• Developed clear bar and line charts illustrating follower totals, likes, engagement rates, and
posting behavior by influencer and region.
• Showcased top 10 influencer leaderboards (e.g., follower count, likes, and engagement),
sector-wise pie charts, and treemap visuals illustrating relative influence.
3. Drill-Down and Hierarchical Navigation
• Implemented easy navigation between high-level KPIs and more granular performance details
for each influencer or country.
• Supported deep-dive discovery of influencer activity, engagement rates, and platform-wide
impact without losing context of overall trends.
4. Key Performance Indicators (KPIs)
• Integrated KPI cards highlighting total influencers, total followers, cumulative likes, average
likes per post, and top-performing influencer.
• Provided concise summary metrics to enable rapid benchmarking and strategic decision-
making for social media campaign planning.
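The KPI card values can be cross-checked outside Power BI with a short pandas sketch such as the one
below. The column names and sample rows are hypothetical, and "top-performing" is assumed here to
mean the account with the most followers.

```python
import pandas as pd

# Hypothetical sample rows; column names are illustrative only.
df = pd.DataFrame(
    {
        "channel_info": ["account_a", "account_b", "account_c"],
        "followers": [400_000_000, 150_000_000, 50_000_000],
        "posts": [3000, 6000, 1000],
        "total_likes": [30_000_000_000, 12_000_000_000, 3_000_000_000],
    }
)

kpis = {
    "total_influencers": int(df["channel_info"].nunique()),
    "total_followers": int(df["followers"].sum()),
    "cumulative_likes": int(df["total_likes"].sum()),
    # Overall likes divided by overall posts, not a mean of per-account averages.
    "avg_likes_per_post": float(df["total_likes"].sum() / df["posts"].sum()),
    # Assumption: the top performer is the account with the most followers.
    "top_influencer": df.loc[df["followers"].idxmax(), "channel_info"],
}
```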
By blending interactive filters, compelling visuals, and robust drill-down features, the dashboards
empower marketers, analysts, and brand strategists to understand performance dynamics, benchmark
influencer marketing ROI, and refine their influencer engagement strategies for optimal results.
Key Findings & Patterns
The integrated dashboard analysis provides a comprehensive overview of Instagram influencer
performance by visually mapping key metrics across regions, user tiers, and engagement patterns. By
combining interactive data views and statistical summaries, the report reveals actionable insights for
strategic influencer marketing decisions and campaign optimization:
1. Geographic Dominance
• The United States emerges as the clear leader in Instagram influence, accounting for 33.96%
of the influence-score distribution across countries, with 66 high-performing accounts.
• This dominance extends beyond quantity: US influencers demonstrate superior
engagement levels and massive followings that translate to significant market influence across
multiple content categories.
2. Engagement Rate Decline Pattern
• A consistent downward trend in engagement rates across different influencer tiers reveals
platform saturation effects.
• The data shows engagement dropping from 26% at peak performance levels to stabilizing
around 9-10% for most top influencers, indicating that larger follower bases don't always
correlate with proportional engagement increases.
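The relationship between audience size and engagement can be checked with a rank correlation. The
sketch below computes Spearman's coefficient as the Pearson correlation of the ranks; the sample
values and column names are hypothetical and only illustrate the calculation.

```python
import pandas as pd

# Hypothetical sample: engagement falls as follower counts rise.
df = pd.DataFrame(
    {
        "followers": [40e6, 60e6, 90e6, 150e6, 300e6, 475e6],
        "engagement_rate": [0.26, 0.18, 0.12, 0.10, 0.09, 0.095],
    }
)

# Spearman's rho equals the Pearson correlation of the ranked values.
rho = df["followers"].rank().corr(df["engagement_rate"].rank())
```

A strongly negative rho would be consistent with the declining pattern described above, while a
value near zero would suggest follower count alone does not explain engagement.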
3. Cristiano Ronaldo's Exceptional Performance
• Cristiano leads in follower count with 475.8M followers and records 29B total likes, behind
Kylie Jenner's 57B, establishing him as the benchmark for influencer reach.
• His follower count significantly outpaces other top influencers like Kylie Jenner (366.2M
followers) and Leo Messi (357.3M followers), showing how sports personalities can dominate
social media influence.
4. Macro vs. Mega Influencer Distribution
• The follower tier analysis reveals 167 macro influencers (10M–100M followers) compared to
only 33 mega influencers (100M+ followers), indicating the extreme rarity of achieving
ultra-high follower counts.
• This distribution pattern shows that the influencer landscape maintains a pyramid structure
where authentic mega-influence remains exceptionally exclusive.
Dashboard Visualizations
Figure 1. Instagram Influencer Insights Dashboard
Figure 2. Instagram Country Analytics Dashboard
Conclusion
This Instagram influencer analytics project transformed raw social media data into actionable insights
by leveraging comprehensive dashboard visualizations and quantitative analysis. The findings
highlight that influencer reach and engagement are highly concentrated among a small number of
mega-influencers, with Cristiano leading in total followers and Kylie Jenner in total likes. Engagement
rates tend to decline as follower counts climb, but standout creators maintain dominance through
consistent, high-quality content. These results demonstrate the power of data-driven decision-making
for brands and marketers, empowering more effective influencer selection, campaign planning, and
resource allocation for optimal digital impact.