AI Ad Verification on Social Media

AI case study

Case Study 6 Week 7

"Let's tackle the scammers"

Detailed Reference - Week 7 - "Let's tackle the scammers"

Case Study: Verification of Ads Using AI on Social Media Platforms


1. Introduction - Understanding the Problem
In the world of digital advertising, social media sites like Facebook and Instagram are popular places for
businesses to connect with large audiences. Unfortunately, these platforms also face a significant
problem with fake ads. These deceptive ads can cause serious harm, such as financial losses for users,
security breaches, and a loss of trust in the platform. For the social media companies, fake ads can
damage their reputation, lead to legal troubles, and increase the costs of monitoring and removing such
content.

1.1 Problem Statement


Social media platforms are under increasing pressure to verify the authenticity of ads to protect users and
maintain platform integrity. Users frequently encounter ads that are misleading or outright fraudulent,
leading to financial loss and mistrust. The challenge is to develop an AI-driven ad verification system that
can effectively identify and flag suspicious ads, ensuring a safer advertising environment without
compromising user experience or legitimate advertiser engagement.

1.2 Scope
● Scams on social media platforms only
● The solution does not tackle phishing or spam emails

1.3 Objectives and Motivations


● Enhanced Security: Implement robust measures to detect and mitigate fake ads in real time.
● E-commerce Trust: Increase user trust and engagement by creating a safe advertising space.
● Operational Efficiency: Automate ad verification to cut down on manual review resources and
costs significantly.
● Scalability: Ensure the system can handle the vast number of ads posted daily across the
platform.
● Safety: Protect users from financial fraud, data breaches, and misleading information.
● Confidence: Enable users to browse ads with confidence, knowing they are verified.
● Seamless Experience: Ensure the ad verification process does not impact the overall user
experience.
● Limited Manual Review: Avoid relying extensively on manual review processes except for ads
flagged by the AI.
● Scoped Remediation: Focus on detection and intervention rather than resolving every flagged
ad issue.

2. Research: Decoding the Existing Ecosystem

2.1 What is a fake ad? And what is a scam?


Fake ads are created by cybercriminals with the intent of impersonating a brand to steal revenue or
sensitive information. Fake advertisements mimic popular brands by using their name, logo, and other
brand assets on an ad, while redirecting the consumer to a fake domain. A scam, more broadly, is any
unintended, suspicious, or damaging activity, other than the user's original intent, that occurs while
engaging with the ad.

2.2 Fake Ads Discovery

2.2.1 Types of Fake Ads and their characteristics


Trust & Credibility - Impersonation (Friends/Family)
How it works: Scammers hack into real accounts or create fake profiles mimicking your friends or family.
They exploit your inherent trust in these connections to make their requests seem legitimate.
Example: You receive a message from someone claiming to be your cousin who is stranded abroad and
urgently needs money for a plane ticket home.

Trust & Credibility - Fake Endorsements
How it works: Scammers leverage the power of social proof to make their scams seem credible. They
might use fake celebrity endorsements (doctored images or fabricated quotes make it appear like a
popular figure supports a scam product or service) and positive reviews (shills or fake accounts post
glowing reviews to create a sense of trust and reliability).
Example: An ad for a weight loss product features a photoshopped image of a celebrity with a dramatic
"before and after" picture.

Emotional Manipulation - Urgency & Scarcity
How it works: Scammers create a sense of urgency or limited availability to pressure you into acting
quickly before you have time to think critically: "Act now, limited offer!", "Only 3 left in stock!"
Example: A social media post advertises exclusive access to a new investment opportunity, but the offer
expires in just 24 hours.

Emotional Manipulation - Fear & Intimidation
How it works: Scammers prey on your fears to gain control and manipulate you. Threats to expose
private information: they claim to have compromising photos or videos and demand money to keep them
quiet. Legal threats: they threaten lawsuits or other legal action unless you comply with their demands.
Example: You receive a message claiming to have hacked your webcam and recorded embarrassing
footage. They demand payment to prevent the video from being leaked online.

Emotional Manipulation - Greed & Opportunity
How it works: Scammers dangle the promise of easy money or exclusive deals to lure you in: "get rich
quick" schemes, "free giveaways" with expensive shipping costs.
Example: A social media post claims you can become a millionaire by following a simple online course.

Information Sharing & Visibility - Targeting based on profile
How it works: Scammers use social media's data collection to personalize their scams, making them
seem more relevant and believable. They tailor messages based on your interests or online activity. For
example, someone who frequently posts about fitness might be targeted with a fake weight loss
product ad.
Example: You see an ad for a new brand of running shoes after recently browsing reviews for athletic
footwear.

Information Sharing & Visibility - Spreading misinformation
How it works: Fake news and misleading content can go viral quickly, making it difficult to distinguish
truth from fiction. This confusion allows scams to flourish, e.g., social media posts with fabricated stories
or manipulated statistics to promote a scam product.
Example: A viral post claims a new detox juice can cure cancer, despite having no scientific backing.

Platform Vulnerabilities - Fake accounts & bots
How it works: Fake accounts and bots inflate popularity and spread misinformation, making scams
appear more legitimate. Fake accounts leave positive reviews or create a sense of buzz around a scam
product; bots are used to like and share scam content, increasing its visibility.
Example: A new cryptocurrency launches with thousands of positive reviews on social media, but most of
the accounts are actually bots.

Platform Vulnerabilities - Deceptive advertising
How it works: Deceptive advertising uses misleading visuals or text to trick you into clicking on malicious
links or providing personal information: ads with hidden fees or misleading product claims, or ads that
take you to phishing websites designed to steal your login credentials.
Example: An ad for a new phone case shows a sleek, high-quality product, but when you click through,
the actual product is a cheap, low-quality knockoff.
2.2.2 What is ad fraud?
Online advertising fraud, or ad fraud, involves faking clicks, sales, or conversions with the intention of
financial gain. Scammers use fake users and bots to make advertisers believe that their ads are
gaining traction, and make them pay for false exposure.

Click Fraud
Description: Click fraud fakes the number of clicks, viewers and traffic on an online ad platform to
deceive and gain a financial advantage. This type of ad fraud affects pay-per-click (PPC) ads.
Mechanism: Bots, automated scripts or hired individuals (called click farms) generate these fraudulent
clicks and fraudulent traffic.

Impression Fraud
Description: Impression fraud, or ad viewability fraud, generates fake ad views or video ad impressions
without actual human viewers. This type of mobile ad fraud affects campaigns billed on the number of
times an ad is displayed (CPM = cost per thousand impressions).
Mechanism: Techniques called ad stacking and pixel stuffing. Imagine that you put a poster on a display
board, but someone intentionally stacks another one on top of it: the intended audience will only see
what's on top and not the one that you actually displayed. In another instance, you might put up a huge
poster to get the attention of more people, but someone shrinks it to the size of a dot so that nobody can
actually see what you posted, then stuffs the original space with other posters.

Conversion Fraud
Description: Conversion fraud fakes the number of leads or sales to collect a commission or inflate
performance metrics.
Mechanism: Uses malicious bots or paid individuals to complete forms, sign up for free trials or make
purchases using stolen credit card information.

Affiliate Ad Fraud
Description: Affiliate ad fraud is specifically related to affiliate marketing programs, where affiliates are
paid a commission for directing traffic or sales to a business. Fraudulent affiliates can generate invalid
traffic or fake conversions to earn commissions illegitimately.
Mechanism: Techniques include cookie stuffing (putting data on someone's computer to make it appear
that they visited your site), using bots to complete actions required for earning commissions, or
misrepresenting the source of the traffic to claim that they provided leads.
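The click-fraud pattern described above (click farms and bots inflating PPC traffic) can be illustrated with a minimal heuristic sketch. This is not a production detector: the per-IP threshold is an illustrative assumption, and real systems model per-campaign baselines rather than a fixed cutoff.

```python
from collections import Counter

def flag_click_fraud(clicks, max_clicks_per_ip=20):
    """Flag IPs whose click volume exceeds a fixed threshold.

    `clicks` is a list of (ip, ad_id) tuples. The threshold of 20 is
    purely illustrative; click farms typically generate volumes far
    above any organic baseline.
    """
    counts = Counter(ip for ip, _ in clicks)
    return {ip for ip, n in counts.items() if n > max_clicks_per_ip}

# One IP generating 30 clicks on the same ad; others clicking once.
clicks = [("10.0.0.5", "ad1")] * 30 + [("10.0.0.9", "ad1"), ("10.0.0.7", "ad2")]
print(flag_click_fraud(clicks))  # {'10.0.0.5'}
```

In practice this heuristic would be one weak signal among many (device fingerprints, timing entropy, conversion follow-through), since sophisticated click farms rotate IPs.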

2.2.3 Why have Ad scams on social media become popular?


More people are using online platforms: With social media and internet use booming, there's a larger
audience for scammers to target. This makes it more profitable for them to invest in creating deceptive
ads.
Cheaper option: Compared to television broadcasting, advertising on social media platforms is still much
cheaper and easier to get into, since selling something through an ad is in itself neither unethical nor
illegal.
Sophisticated tactics: Scammers are getting better at making their ads look legitimate. They might use
deepfakes or impersonate influencers to trick people into trusting them.
It's hard to detect all scams: Ad platforms like Facebook and Google have a lot of ads to review, and
scammers are constantly coming up with new tactics. This makes it difficult to catch every single scam
before it reaches users.
Complexities of ad verification: Ad verification involves sophisticated tools and techniques to identify
fraudulent activity. It's an ongoing arms race between fraudsters developing new tactics and ad platforms
constantly improving detection methods. This complexity creates opportunities for scams to slip through
the cracks.
International scope: Ad fraud can operate across borders, making it difficult to track down and prosecute
those responsible. This creates a safe haven for scammers and allows them to continue their activities
with less risk.

2.2.4 Insights on scammer behavior


● They know their targets and their behaviors well (e.g., demotivated, looking for an upgrade in life,
just out of a relationship) and know how to attack human vulnerabilities.
● Many scammers pretend to be well-known businesses to gain trust and make their stories seem
more believable.
● They use real-world methods to contact people and to get paid.
● A scammer will target the most vulnerable segments.
● Fraudsters will always follow the money. This is their only aim.
● Scammers rush you into paying or investing money.
● A scammer will hide his/her identity by not revealing their face or even their voice.
● Customer care contacts are, in most cases, unreachable.

2.2 Available Solutions

2.2.1 Ad Feature Extraction Solutions

We observe that ad feature extraction can happen via the approaches elaborated below.

Text Analysis
● Natural Language Processing (NLP): Extract features like sentiment analysis (positive/negative
language), named entity recognition (identifying brands, products, people), keyword frequency
(excessive use of generic marketing terms), and part-of-speech tagging (unusual grammar patterns).
● Bag-of-Words (BoW): Represent the ad content as a collection of words, capturing the overall
vocabulary and potential red flags like excessive repetition.
● TF-IDF (Term Frequency-Inverse Document Frequency): Goes beyond BoW by assigning weights
to words based on their importance within the ad and rarity across the ad corpus. It helps identify
keywords specific to deceptive ads.
● Further, language style and tone analysis and grammar and spelling checks can be additional
features extracted from ads.

Visual Analysis
● Image Recognition: Extract features like object detection (identifying logos, people, products),
scene understanding (detecting unrealistic or staged settings), and image analysis (detecting poor
image quality, excessive editing).
● Optical Character Recognition (OCR): Extract text embedded within images, allowing combined
analysis of text and visuals (e.g., identifying inconsistencies between the written text and the image).

Link Analysis
● URL Analysis: Extract features like domain name structure (unusual extensions, typos, subdomain
inconsistencies), website legitimacy checks (blacklisted domains, domain age and registration
information, SSL certificate verification), and website content analysis (looking for known phishing
patterns).

Additional Features
● Engagement Metrics: Analyze metrics like likes, shares, comments, and user reviews to identify
unusual activity patterns that might suggest inauthentic promotion.
● Temporal Features: Consider the time of ad posting, frequency of ad changes (rapidly changing
content could be suspicious), and ad lifespan (short-lived ads might be riskier).
● Advertiser Profile Analysis: Details like account age and history, user reputation scores, and social
media presence verification.
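A minimal sketch of the text and URL feature extraction described in this section, using only the standard library. The suspicious-TLD list and "red flag" marketing terms below are illustrative assumptions for the example, not findings from the case study; a real pipeline would learn such vocabularies (e.g., via TF-IDF over an ad corpus) and consult live blacklists.

```python
import re
from urllib.parse import urlparse

# Hypothetical watchlist for the sketch; real systems use curated/learned lists.
SUSPICIOUS_TLDS = {"xyz", "top", "click", "loan"}

def url_features(url):
    """Extract simple red-flag features from an ad's landing URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "uses_https": parsed.scheme == "https",
        "subdomain_depth": max(host.count(".") - 1, 0),
        "suspicious_tld": host.rsplit(".", 1)[-1] in SUSPICIOUS_TLDS,
        "has_digits_in_domain": bool(re.search(r"\d", host)),  # e.g. faceb00k
    }

def text_features(ad_text, red_flag_terms=("guaranteed", "act now", "risk-free")):
    """Bag-of-words style counts of high-pressure marketing terms."""
    lowered = ad_text.lower()
    return {term: lowered.count(term) for term in red_flag_terms}

feats = url_features("http://secure-login.faceb00k-offers.xyz/claim")
print(feats["suspicious_tld"], feats["uses_https"])  # True False
print(text_features("Act now! Guaranteed returns in 24 hours"))
```

These feature dictionaries would then be vectorized and fed to the classification models surveyed in the next subsection.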

2.2.2 Marketing Tech Solutions


Data-driven Targeting and Analytics
Data-driven targeting uses data analytics to fine-tune advertising efforts, ensuring that ads reach relevant
audiences. Defining buyer personas and buyer journeys allows ads to be targeted to users based on their
interests and behaviors, so there's less wastage of impressions. Geofencing ads that specifically target
customers within a specific location also helps.
Why it deters fraud: This precision reduces opportunities for fraudsters to generate revenue through fake
impressions or fake clicks, as ads are less likely to be served in environments where fraud is rampant.

Limiting Broad Targeting on Keywords
Fraudsters usually target high-volume and high-competition keywords. By focusing on niche-specific
keywords, the ads are less appealing to scammers. With a specific audience in mind, one can set clearer
expectations of user behavior.
Why it deters fraud: Unusually high click-through rates or a sudden spike in traffic from a specific source
can be red flags for ad fraud, since deviations from these expectations are easier to spot.

Monitoring the Quality of Leads
Focus on quality leads rather than quantity. Monitor the leads generated from online advertising efforts to
identify patterns or spot the characteristics of fake or low-quality leads.
Why it deters fraud: Monitoring allows early detection of anomalies such as many leads with gibberish
information, identical details submitted multiple times, or leads from regions outside the target market.

Adding Extra Safeguards on Website
CAPTCHA challenges prevent bots from submitting fake information.
Why it deters fraud: Routine security audits and penetration testing also uncover potential weaknesses
and allow timely mitigation before any scammer can manipulate them.

Incorporating Artificial Intelligence Ad Tools and Apps
Using machine learning and pattern recognition to detect and mitigate fraudulent activities, such as
irregular click patterns, suspiciously high engagement rates from certain sources, or abnormal user
behavior.
Why it deters fraud: By analyzing vast amounts of data in real time, AI can detect anomalies that would
be impossible for humans to identify manually.
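The anomaly detection that these AI tools perform at scale can be sketched in miniature. The example below flags traffic sources whose click-through rate deviates sharply from the group, using a median-absolute-deviation score (robust to the very outliers it is hunting); the threshold and the sample CTR values are illustrative assumptions.

```python
from statistics import median

def engagement_anomalies(ctr_by_source, threshold=5.0):
    """Flag traffic sources with anomalous click-through rates.

    Uses a robust deviation score: |rate - median| / MAD. The
    threshold of 5.0 is illustrative; real systems tune it against
    labeled fraud data.
    """
    rates = list(ctr_by_source.values())
    med = median(rates)
    mad = median(abs(r - med) for r in rates)  # median absolute deviation
    if mad == 0:
        return set()  # no spread at all; nothing stands out
    return {src for src, r in ctr_by_source.items()
            if abs(r - med) / mad > threshold}

# source_e's 35% CTR is wildly above the ~2% organic baseline.
ctr = {"source_a": 0.021, "source_b": 0.019, "source_c": 0.020,
       "source_d": 0.022, "source_e": 0.350}
print(engagement_anomalies(ctr))  # {'source_e'}
```

A mean/standard-deviation z-score would work poorly here: a single large outlier inflates the standard deviation enough to hide itself, which is why the median-based score is used.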

2.2.3 Existing AI & Machine Learning Models

Here's an overview of some existing approaches.

Supervised Learning Models: These models learn from labeled data to classify ads as genuine or fake.
● Random Forests: An ensemble of decision trees that vote on the classification. They're robust to
overfitting and can handle high-dimensional data well.
● Gradient Boosting Machines: Build trees sequentially, with each tree correcting errors of the
previous ones. XGBoost and LightGBM are popular implementations known for their speed and
performance.
● Support Vector Machines: Work well for binary classification tasks, especially in high-dimensional
spaces. They aim to find the hyperplane that best separates the classes.
● Deep Neural Networks: Can learn complex patterns from large datasets. They're versatile but may
require more data and computational resources.

Unsupervised Learning: These methods detect anomalies or patterns without labeled data.
● Clustering (e.g., K-means, DBSCAN): Groups similar ads together. Outliers or small clusters might
indicate fraudulent activity.
● Autoencoders: Neural networks that compress then reconstruct data. Ads that don't reconstruct well
may be anomalous.

Semi-Supervised Learning: Useful when you have a small amount of labeled data and a large amount
of unlabeled data.
● Label propagation: Spreads labels from labeled to unlabeled data points based on similarity.
● Self-training: The model iteratively labels unlabeled data and retrains itself.

Ensemble Methods: Combine multiple models to improve overall performance.
● Stacking: Train a meta-model to combine predictions from base models.
● Voting: Each model votes on the classification, with majority or weighted voting determining the final
output.

Deep Learning Approaches
● CNNs: Excellent for image analysis, detecting visual patterns indicative of fake ads.
● RNNs/Transformers: Process sequential data like text, understanding context and language
patterns.
● Multimodal learning: Combines different types of data (e.g., text and images) for a more
comprehensive analysis.

Graph-based Models: Analyze relationships between entities in the ad ecosystem.
● Graph Neural Networks: Can capture complex relationships between advertisers, ads, and user
interactions.
● Node2Vec: Creates vector representations of nodes in a graph, useful for detecting suspicious
patterns in advertiser networks.

Reinforcement Learning: Adapts strategies over time to maximize long-term rewards.
● Multi-armed bandits: Balance exploration (trying new strategies) and exploitation (using known
effective strategies).
● Deep Q-Networks: Combine deep learning with reinforcement learning for more complex
decision-making.

Time Series Analysis: Detects unusual temporal patterns.
● ARIMA: Models the time dependencies in data.
● LSTM networks: Can capture long-term dependencies in sequential data.

NLP Models: Analyze text content of ads.
● BERT, GPT: State-of-the-art models for understanding and generating human-like text.
● Named Entity Recognition: Extracts key information like names, locations, and organizations from
text.

Computer Vision Models: Analyze visual content of ads.
● Siamese networks: Compare images to detect duplicates or near-duplicates.

Hybrid Approaches: Combine different techniques for improved performance.
● Rule-based + ML: Use expert knowledge to create rules, then ML to refine and adapt these rules.
● Feature extraction + classification: Use deep learning for feature extraction, then traditional
classifiers for final decision-making.

Online Learning Models: Continuously update with new data.
● Incremental learning: Update the model with each new piece of data.
● Adaptive boosting: Adjust the importance of data points based on previous errors.
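The ensemble-voting idea from this survey can be shown in a few lines. The three "models" below are hypothetical rule-based stand-ins for trained classifiers (text, URL, and advertiser-age signals); only the voting mechanism itself is the point of the sketch.

```python
def majority_vote(models, ad):
    """Ensemble voting: each model labels the ad; the majority wins.

    `models` is a list of callables returning 'fake' or 'genuine'.
    """
    votes = [m(ad) for m in models]
    return max(set(votes), key=votes.count)

# Illustrative stand-ins for trained base models (assumptions, not
# real detectors): each inspects one feature of the ad dict.
def text_model(ad):
    return "fake" if "guaranteed returns" in ad["text"].lower() else "genuine"

def url_model(ad):
    return "fake" if ad["url"].endswith(".xyz") else "genuine"

def age_model(ad):
    return "fake" if ad["advertiser_age_days"] < 7 else "genuine"

ad = {"text": "Guaranteed returns in 24 hours!",
      "url": "http://quick-cash.xyz",
      "advertiser_age_days": 2}
print(majority_vote([text_model, url_model, age_model], ad))  # fake
```

Stacking differs only in the final step: instead of counting votes, a meta-model is trained on the base models' outputs to learn how much to trust each one.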

2.2.4 Competitor Analysis


The following are technology-based solutions:

Ad fraud detection tools (e.g., Fraudlogix, HUMAN (formerly White Ops), DoubleVerify): These
specialized tools can detect fraudulent activities such as bot traffic, click farms, and domain spoofing.
They can monitor ad campaigns in real time and identify any suspicious activity.

Data analytics platforms (e.g., Google Analytics, Mixpanel, Adobe Analytics): These platforms help
identify anomalies in data patterns that indicate fraudulent activity. They can also identify patterns that
indicate invalid or fake user behavior.

Machine learning and AI algorithms (e.g., Anodot, DataVisor, Sift Science): By analyzing large data
sets, machine learning algorithms can detect fraudulent activity patterns. These algorithms can identify
patterns in user behavior that indicate fraudulent activity.

Data cleansing solutions (e.g., Experian Data Quality, Trifacta, Tamr): They help clean up and validate
first-party data to ensure its accuracy and authenticity. They also identify and remove suspicious data
points indicating fraudulent activity.

Blockchain technology: This creates a tamper-proof record of ad transactions. It can help prevent ad
fraud by providing transparency and traceability to the advertising supply chain.


2.3 Primary Research


We conducted interviews to identify key pain points for users who had encountered scams, emphasizing
the need for actionable insights and comprehensive analytics:

User Interview 1
● The user: 62-year-old retired female, India. Former bank employee at SBI who also worked in SBI's
fraud department.
● Pain points: She went ahead despite being suspicious because a sense of urgency was created,
which made her feel anxious and question "what if".
● Key takeaways: The scammers created a sense of urgency. They used the tool AnyDesk to ask her
to share her mobile screen. She had paid all her bills and was very aware of the various scams, but
still bought into the narrative "if you don't call and pay as per the updated govt policies, you will lose
your electricity". Her suspicions and her awareness of AnyDesk saved her from the scam, but were
not enough to stop her from taking the call in the first place.

User Interview 2 (7 years ago)
● The user: 41-year-old female, US. IT employee with a husband and two kids; understands phishing
and various scams due to work experience.
● Pain points: She couldn't verify the link in the Google search due to time constraints.
● Key takeaways: It was the first link in the Google search for "Apple Customer Care". Being the first
link and a sponsored ad was considered good enough to trust. The behavior of the person on the line
acting as customer care, and the slip into a Hindi accent, gave away enough hints that something was
suspicious and made the user drop the call.

User Interview 3
● The user: 32-year-old male, India. IT employee, married; understands different types of online
scams.
● Pain points: He could not verify the authenticity of the product, as even the product reviews were
seemingly authentic and balanced with both positives and negatives.
● Key takeaways: The ad was positioned among Instagram Reels, the most used feature. The ad
looked very genuine, with high-quality images and authentic details matching those of watches sold
elsewhere on e-commerce websites. The user realized he had been scammed when the payment
went through and the company ghosted him, without any email or order information.

2.4 Secondary Research

2.4.1 Market Analysis

Reported losses worldwide to fraud that started on social media:
● 2020: $237M
● 2021: $729M
● 2022: $1.1B
● 2023: $1.4B

● In 2023, 51% of reports about fraud starting on social media identified Facebook as the social
media platform, and 22% identified Instagram. Of the $1.8 billion reported lost to
investment-related fraud in 2023, $707 million was lost using cryptocurrency and $689 million
was lost using bank transfers.
● Scammers most often reached out by email and phone calls, but people reported that they lost
the most money on scams that started on social media.
● Gift cards were the top reported payment method in several types of scams in 2023, including
romance scams, tech support scams, government impersonation scams, and scams that
impersonate people you know, like your boss or a grandchild.

2.4.2 Social Media scams in India


As per the Indian Cybercrime Coordination Centre's (I4C) latest data, most of the cyber fraud incidents in
2024 are associated with fake trading apps, loan apps, gaming apps, dating apps and algorithm
manipulation. I4C has received reports of 20,043 trading scams amounting to Rs 14,204.83 crore and
62,687 investment scams amounting to Rs 2,225.82 crore, significant amounts that make both trading
and investment scams serious threats. The common way a scammer approaches a victim in
investment-related scams is by making the victim click on fake investment sites via social media.

As per an April 2024 Statista report, there are more than 378 million Facebook users in India alone,
making it the leading country in terms of Facebook audience size. With an audience of this scale, it is no
surprise that the vast majority of Facebook's revenue is generated through advertising. About 81.8
percent of Facebook audiences worldwide access the platform only via mobile phone.

As per a Statista December 2023 report, small-scale businesses were likely targets of cybercriminals,
given that only 24 percent of all Indian companies were adequately prepared to take on cyber attacks.
As per a Statista March 2024 report, Google India reported an increase in search interest across various
categories such as financial security, family and personal health. Consumers were also more aware of
online scams and fraud and laid emphasis on the trustworthiness of brands. Brands continue to treat
advertising and marketing as a key avenue for verticals to leverage India's growing digital economy.
Paid search was among the three most important digital advertising formats, chiefly sustained by the
banking and financial services sector in 2021. Interestingly, India also ranked among the leading five
countries in the world for ad blocking.

2.5 Pain Points for E-commerce Platforms and Social Media Platforms
E-commerce platforms and social media platforms both face significant pain points related to scams,
which can lead to various negative outcomes such as financial loss, data breaches, potential legal
consequences, and eroded user trust. Here’s a breakdown of these pain points:

Financial Loss
● E-commerce platforms: Chargebacks and Refunds: When scams occur, customers often demand
refunds or initiate chargebacks, resulting in financial losses for the platform. Operational Costs:
Investigating and resolving scam-related issues incurs additional operational costs.
● Social media platforms: Revenue Impact: Scams can drive away advertisers and users, leading to
reduced revenue from ads and premium services. Increased Security Costs: Enhancing security
measures to combat scams requires substantial investment in technology and personnel.

Data Breaches
● E-commerce platforms: Personal Information Theft: Scammers may target e-commerce platforms
to steal sensitive customer information, such as credit card details, addresses, and phone numbers.
Intellectual Property Theft: Cybercriminals can also steal proprietary information, including supplier
details and business strategies.
● Social media platforms: User Information Exposure: Scammers often use social media to collect
personal data, which can lead to large-scale data breaches. Misuse of Data: Stolen data can be used
for further fraudulent activities, compounding the damage.

Potential Legal Consequences
● E-commerce platforms: Regulatory Fines: Failure to protect customer data adequately can result in
hefty fines from regulatory bodies (e.g., GDPR, CCPA). Lawsuits: Customers affected by scams may
sue the platform for negligence, leading to legal expenses and potential settlements.
● Social media platforms: Compliance Issues: Social media platforms must comply with data
protection laws; failing to do so can result in significant legal repercussions. Litigation: Victims of
scams may pursue legal action against the platform for failing to prevent fraudulent activities.

Eroded User Trust
● E-commerce platforms: Customer Dissatisfaction: If users fall victim to scams, they may lose trust
in the platform, leading to decreased customer loyalty and reduced sales. Reputation Damage:
Negative publicity related to scams can tarnish the platform's reputation, making it difficult to attract
new customers.
● Social media platforms: Loss of User Base: Repeated scams can drive users away from the
platform, reducing engagement and active user numbers. Brand Damage: Public awareness of
scams can harm the platform's brand, leading to a loss of credibility and market share.

2.6 Insights
● Roughly 90% of data breaches are caused by phishing scams.
● 30% of respondents in a survey reported falling victim to job scams on social media.
● 12% of respondents reported clicking on phishing URLs on social media platforms.
● About 85% of teenagers and young adults fell prey to shopping scams.
● Many investment scams (up to 50%) happen via social media platforms like Instagram,
Telegram, and Facebook.
● Romance scams (25%) are a popular method scammers utilize on social media.
● According to organization reports, $1.5 billion was lost due to influencer scams.
● Precaution is better than cure: Education on fake links and fake profiles has helped many
users stay suspicious rather than click and fall prey to such scams.
● Human vulnerabilities are the main culprit: Emotions such as greed, loneliness,
hopelessness, sadness and anxiety are the main causes of falling prey to online scams.

3. Framing Hypothesis
3.1 User Personas

[User persona overview figure: "Bigger and Detailed View - Week 7"]


Persona 2 and Persona 6 are our target personas, as the problems they face (financial loss, identity
theft) have a higher impact and a more immediate effect on users' lives than those of the other personas.
Persona 7 particularly deals with eroded user trust and data breaches on the social media platform for
which they are responsible, as fake ads would result in a poor brand image and other legal consequences.

3.2 Prioritised Pain Points

One of the biggest impediments in curbing cyber crimes has been the lack of awareness on cyber
hygiene. Even when crimes were reported to authorities, the infrastructure and process to tackle such
cases were largely inefficient.

Another area that could ease cyber crime numbers is the expansion of the cyber security market in the
country. More investments in the sector could combat increased threats that are likely to continue with the
rollout of the 5G network and the establishment of smart cities.

Eroded User Trust: The most critical pain point is the loss of user trust, as it directly impacts the
platform's user base and engagement. The negative publicity associated with scams can severely
tarnish the platform's brand image, making it difficult to attract and retain users and advertisers.

Data Breaches: Scammers exploiting social media can lead to significant data breaches, exposing
sensitive user information. This not only affects the users but also places the platform at risk of legal
repercussions and further loss of trust. Once data is compromised, it can be used for additional
fraudulent activities, compounding the harm and increasing the platform's liability.

Financial Loss: The departure of users and advertisers due to scams results in reduced revenue.
Combating scams requires substantial investment in security technologies and personnel, adding to
the financial burden on the platform.

Potential Legal Consequences: Failing to protect user data and prevent scams can lead to significant
fines from regulatory bodies, especially with stringent data protection laws like GDPR and CCPA.
Affected users might take legal action against the platform, leading to costly legal battles and potential
settlements.

4. Framing Solution

4.1 Solution Overview

BakBak.ai is an advanced ML-driven system that categorizes any advertisement posted on social media
platforms into one of three categories: Fraud, Potentially Risky, or Safe.
This is a B2B SaaS product that will help enterprises like social media platforms regulate the content
on their platform and weed out any potentially risky ads that might harm their users and lead to
reputation-management issues for the organization.

This will be a three-part product solution:


Part 1 - A backend service, deployed on clients' enterprise servers, that monitors ads posted on the
platform and flags potentially risky ones.
Part 2 - A dedicated analytics dashboard where clients can view the flagged ads, the action taken on
them, and more detailed reports.
Part 3 - A mechanism to highlight flagged ads to users in a non-invasive way. Clients will not have to
undertake a separate front-end development effort: they can simply embed our SDKs in their code and set
the design schema per their organization's standards. The remaining details, such as flagging the ads,
highlighting them, and the text shown to users, will be generated by BakBak.ai as an overlay message on
top of the ads. This mechanism allows easy integration of the product on the client's end.

● Integration with Social Platforms: The system integrates seamlessly with social media
platforms’ APIs. It scans new ads, assesses risk, and provides real-time feedback to advertisers.
● Feature Extraction: The system extracts relevant features from ad content, such as text,
images, and metadata. These features serve as input to the ML model for decision-making.
● Continuous Learning & Improvement: BakBak.ai continuously learns and improves by
incorporating feedback from human reviewers (Trust & Safety team of social media platforms).
Reviewers validate the system’s predictions and provide corrective input.
● Enhanced Ad Safety: Integrates robust data collection, processing, and user engagement
mechanisms to create a safer and more trustworthy advertising environment.

Development & training machine learning model:


Supervised ML Model Training: The model will be trained on labeled datasets of known scam and legitimate
ads. This provides a foundational capability to distinguish between fraudulent and legitimate ads.

Features Set in the Model:


a. Text Extraction from Images: Utilize image recognition techniques and optical character
recognition (OCR) to detect fraudulent images and logos.
b. Colors: Analyze the use of colors in ads, categorizing them as bright, medium, dark, or pale, to
identify potentially misleading or deceptive color schemes.
c. Brand Logos: Detect the presence of brand logos, such as Apple, RBI, or stock market symbols,
to prevent unauthorized or deceptive use of well-known brands.
d. Brand Spellings: Detect spelling mistakes in the content, such as misspelled brand names.
e. People: Identify images of people, such as well-known figures like Modi, to detect misuse of
recognizable faces in fraudulent ads.
f. Currency Symbols: Recognize the use of currency symbols, such as the dollar ($) or rupee (₹), to
identify potentially fraudulent financial claims or misleading advertisements.
g. Intent of the Advertisement: Natural Language Processing (NLP) is used to detect the
communicative intent of the advertisement from multiple UI elements such as text and images.
h. URLs / Deeplinks Embedded in the Advertisement: Examine URLs and linked websites for
credibility. Check for previous scam reports or suspicious activity related to these links.
i. Custom Features (learned by the model over multiple training iterations): Implement additional
features. For example, if certain regions or user demographics are more frequently targeted by
fraudulent ads, the platform can set up specialized filters to address these region- or
demographic-specific issues.
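A minimal sketch of how a few of the simpler features above (currency symbols, embedded URLs, and brand-name misspellings) might be computed in plain Python. The brand watchlist, function names, and edit-distance threshold are illustrative assumptions, not BakBak.ai's actual implementation:

```python
import re

KNOWN_BRANDS = {"apple", "rbi", "paypal"}  # hypothetical brand watchlist

def levenshtein(a: str, b: str) -> int:
    """Edit distance, used to catch near-miss brand spellings like 'Appl3'."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def extract_features(ad_text: str) -> dict:
    words = re.findall(r"[a-z0-9]+", ad_text.lower())
    return {
        "has_currency": bool(re.search(r"[$₹€£]", ad_text)),
        "url_count": len(re.findall(r"https?://\S+", ad_text)),
        # A word within edit distance 1 of a known brand, but not the brand
        # itself, is a likely impersonation attempt.
        "brand_typo": any(
            0 < levenshtein(w, b) <= 1 for w in words for b in KNOWN_BRANDS
        ),
    }

feats = extract_features("Win ₹10,000 from Appl3! Click http://bit.ly/xyz")
```

Each boolean or count would become one entry in the feature vector fed to the model; the image- and logo-based features above would require OCR and vision models beyond this sketch.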

Feedback Loop:
1. Ad Crawling: Continuously crawl and collect data on newly posted ads across the
platform. Gather features such as ad text, images, links, and metadata.
2. Human Oversight During Initial Launch: For the first few months after launching, a trust and
safety team will manually oversee the system's performance.
○ The Trust and Safety team will verify each ad flagged as a potential risk by the model.
○ Flagged ads will land in the BakBak.ai incident queue. Any analyst can pick
incidents (flagged ads) from the queue and resolve them.
○ Genuine ads will be released to the ad servers.
○ The model will be retrained on the new data through a feedback loop, refining the
model and moderation processes based on real-world observations and insights.
3. Feedback Integration: The system incorporates feedback from social media users who report
suspicious or misleading ads.
○ Users can report fraudulent ads the model missed using the “Report this ad” option.
○ Once a user reports an ad, the BAKBAK.ai model will be retrained with the new
input (user feedback).
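The retrain-on-feedback loop above can be sketched minimally. `NaiveScorer` is a deliberately toy stand-in for the real model, and the report format and retraining trigger are assumptions for illustration only:

```python
from collections import Counter

class NaiveScorer:
    """Toy fraud scorer: fraction of an ad's words seen in fraud-labeled ads."""
    def __init__(self):
        self.fraud_words = Counter()

    def train(self, labeled_ads):
        for text, label in labeled_ads:
            if label == "fraud":
                self.fraud_words.update(text.lower().split())

    def score(self, text):
        words = text.lower().split()
        hits = sum(1 for w in words if w in self.fraud_words)
        return hits / len(words) if words else 0.0

training_set = [("limited offer win cash now", "fraud")]
model = NaiveScorer()
model.train(training_set)

# Feedback loop: a user report adds a newly labeled example, then we retrain.
user_report = ("guaranteed lottery cash prize", "fraud")
training_set.append(user_report)
model = NaiveScorer()
model.train(training_set)

s = model.score("win a cash prize today")  # 3 of 5 words now look fraudulent
```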

User Journey

We have divided the user journey into 3 phases for clearly defining the interactions between different
entities involved:

Phase 1: Fraud analysis of the advertisement before making it live on users’ social feed
1. Advertisers post the ad on the social media platform (Facebook, Instagram, etc.).
2. The platform (FB per se) passes the ad to BAKBAK.ai for evaluating the fraud risk category of the
ad.
3. BAKBAK.ai, using its pre-trained ML model on the feature set (including text, images, and people
pictured in the advertisement), classifies the ad into one of the categories:
Fraud, Potentially Risky, or Safe.

Phase 2: Users interact with the advertisement live on their social feed
As decided by BAKBAK.ai in Phase 1:
1. If the ad is classified as Fraud, it will be immediately removed from the social media platform, so
users will not see it in their social feed.
2. If the ad is Potentially Risky, a high-contrast “Potential Risk” label will be appended to the ad,
which is otherwise shown to users as usual.
a. If the user clicks on the ad in their feed, a warning UI will be shown with
“Report this ad” and “Looks Safe” CTAs, alerting them to the potential risks
associated with the advertisement.
3. If the ad is Safe, the platform will show it as-is in users’ social feeds.
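The routing logic of Phases 1 and 2 can be sketched as a small function; the return structure and field names are hypothetical, chosen only to make the three branches explicit:

```python
def route_ad(category: str) -> dict:
    """Map BAKBAK.ai's verdict to the platform-side action described above."""
    if category == "Fraud":
        return {"show": False, "label": None}          # removed before going live
    if category == "Potentially Risky":
        return {"show": True, "label": "Potential Risk",
                "ctas": ["Report this ad", "Looks Safe"]}
    return {"show": True, "label": None}               # Safe: shown as-is
```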

Phase 3: Users affected by fake ads can report them:
(This phase is currently out of scope for implementation during the initial launch so we are routing the
affected users to concerned departments in the government)
Detailed and Larger view of flow chart of activities available here -
Week 7 - "Let's tackle the scammers"

A condensed version of the above flowchart is shown below.


Post-Scam Reporting Process

1. User Reporting
● In-App Reporting:
○ User Initiates Report: “Report Ad” button.
○ Feedback Form: Describe the suspicion, attach screenshots, provide URLs.
○ Confirmation Message: User notified about receipt of report.
● Enhancing User Trust:
○ Transparency: Show ad verification process summary.
○ Acknowledgement: Thank-you message to user.

2. System Actions Post-Reporting
● Initial Automated Analysis:
○ Immediate Check: Automated check to re-assess ad’s risk score.
○ Flagging Escalation: Adjust risk score based on preliminary check.
● Manual Review:
○ Moderator Review: Thorough review by human moderators.
○ Comprehensive Investigation: Review ad content, metadata, advertiser history, and engagement metrics.
○ Audit Log: Log all actions and findings.
● Actions Based on Review:
○ Ad Removal/Suspension: Remove ad or suspend account if fraudulent.
○ User Notification: Inform reporting user about the outcome.

3. Coordination with Authorities
● Cyber Crime Cell Contact:
○ Contact Information: Provide cyber crime cell contact details.
○ Severity Assessment: Assess and escalate severe cases.
○ Reporting Guidelines: Offer templates for reporting to authorities.
● Collaboration with Law Enforcement:
○ Information Sharing: Share detailed reports with law enforcement.
○ Regular Updates: Provide updates on investigation status.

4. Continuous Improvement of ML Model
● Feedback Integration:
○ User Reports: Incorporate user feedback to refine models.
○ Moderator Insights: Use insights from manual reviews.
○ Feedback Loop: Create a continuous feedback loop.
● Model Retraining:
○ Data Augmentation: Enrich training dataset with new examples.
○ Feature Adjustment: Update features based on new patterns.
○ Scheduled Retraining: Regular monthly retraining sessions.
● Performance Monitoring:
○ Evaluation Metrics: Track precision, recall, accuracy, false positives.
○ A/B Testing: Compare different model versions.

5. User Education and Engagement
● Educational Resources:
○ Awareness Campaigns: Publish guides on identifying scams.
○ Guidelines: Clear instructions for reporting scams.
● User Engagement:
○ Community Programs: Promote safety ambassador initiatives.
○ Feedback Channels: Open channels for user suggestions.

4.2 - Data Collection and Labeling for SVM Model

Collection of Fraudulent Ads

● Sources of Data Collection:
○ User Reports: Ads reported by users as fraudulent or suspicious.
○ Manual Reviews: Human moderators identify and flag fraudulent ads.
○ Partner Databases: Collaboration with cybersecurity firms for scam-ad databases.
○ Web Scraping: Automated scripts collect ads from known scam-ad hotspots.
○ API Integrations: Real-time fetching of ads through social media APIs.
● Data Collection Pipelines:
○ Web Crawlers: Custom-built crawlers identify and collect potential scam ads.
○ Feedback Loops: Aggregation of ads flagged through user feedback.
● Feature Extraction:
○ Text Analysis: Extract textual components such as ad descriptions, titles, and keywords.
○ Image Analysis: Analyze graphical elements for scam indicators.
○ Metadata Extraction: Collect posting time, advertiser profile, engagement metrics, and linked URLs.

Labeling of Data for SVM Model

● Manual Labeling:
○ Moderator Annotation: Human moderators label ads based on guidelines.
○ Expert Reviews: Domain experts review labels to ensure accuracy.
● Automated Labeling:
○ Rule-Based Systems: Initial labeling using criteria like keyword lists and image patterns.
○ Machine Learning Pre-Tagging: Pre-tagging ads with general-purpose ML models for moderator confirmation.
● Label Categories:
○ Fraudulent Ads: Ads with deceptive content or linked to known scams.
○ Legitimate Ads: Verified genuine ads.
○ Uncategorized/Unverified Ads: Ads needing further investigation.
● Quality Assurance:
○ Inter-Rater Reliability: Consistency checks among moderators.
○ Audit Trails: Records of the labeling process for audits and improvements.

Data Augmentation

● Synthetic Data:
○ Generating Synthetic Ads: Creating synthetic examples based on known scam patterns.
○ Balancing the Dataset: Ensuring a balanced dataset to avoid model bias.
● Adversarial Examples:
○ Test Robustness: Introducing adversarial examples to test and improve the model's robustness against new scam tactics.

Data Preprocessing for SVM Model

● Data Cleaning:
○ Removing Duplicates: Ensuring no duplicate ads.
○ Handling Missing Values: Addressing any incomplete data fields.
● Normalization and Scaling:
○ Text Normalization: Lower-case conversion, stop-word removal, stemming/lemmatization.
○ Feature Scaling: Normalizing numerical features for SVM compatibility.
● Vectorization:
○ TF-IDF: Converting text data into numerical vectors.
○ Embeddings: Utilizing word embeddings like Word2Vec or GloVe.
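To make the vectorization step concrete, here is a minimal pure-Python TF-IDF sketch. In production this would more likely be a library implementation (e.g., scikit-learn's TfidfVectorizer) feeding the SVM; the toy corpus below is invented:

```python
import math

def tfidf(corpus):
    """Return one {term: weight} dict per document (tf * idf, idf = log(N/df))."""
    docs = [doc.lower().split() for doc in corpus]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = {}
    for words in docs:
        for term in set(words):
            df[term] = df.get(term, 0) + 1
    vectors = []
    for words in docs:
        tf = {t: words.count(t) / len(words) for t in set(words)}
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

corpus = ["free cash prize", "official brand store", "free shipping today"]
vecs = tfidf(corpus)
```

Terms that appear in many ads ("free") get a lower idf, so rarer, more discriminative terms ("cash") dominate the vector, which is exactly the property the SVM relies on.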

4.2 System Design

- Detailed and Enlarged View of system design available here - Link


- Video walkthrough of system design with voiceover - available here - Loom Video Link

Broad overview snippet is as follows:


4.3 Detailed Solution

4.3.1 Functional Features


1. AI Models: Employ supervised learning, NLP, image recognition, and link analysis.
○ Details: Use diverse datasets and advanced algorithms, updated regularly to adapt to
emerging scam techniques.
2. Data Collection: Continuously crawl ad data.
○ Details: Include text, images, metadata, and historical data for comprehensive
verification.
3. Scalability: Design a modular architecture for flexibility.
○ Details: Use distributed computing frameworks to handle traffic volume.
4. Real-Time Processing: Ensure low-latency ad verification.
○ Details: Utilize real-time data processing to avoid disrupting the user experience.
5. Security: Protect the AI system against exploitation by malicious actors.
○ Details: Implement robust security measures and regular audits.
4.3.2. Designing for Scalability and Robustness

Modular Architecture
● Plan: Break the system into interchangeable and independently operating modules.
● Details: Ad Crawl Module (handles the collection of ad data); Feature Extraction Module (processes ad data to extract relevant features); Verification Engine (runs the machine learning models); Alert System (manages user notifications and reporting functionalities); Feedback Loop (integrates user feedback for dynamic model updates).
● Benefit: Each module can be developed, tested, and scaled independently, improving overall system robustness.

Distributed Computing
● Plan: Use distributed systems to handle large datasets and complex computations across many machines.
● Details: Implement frameworks like Apache Hadoop for batch processing and Apache Spark for real-time processing. Example: processing features of 100,000 ads simultaneously across a cluster of nodes.
● Benefit: Divides workloads among multiple nodes, processing large volumes of data in parallel.

Real-Time Data Streaming
● Plan: Adopt real-time data streaming frameworks to manage continuous data inputs and outputs efficiently.
● Details: Use Apache Kafka for real-time data streaming. Kafka handles incoming ad data streams, processes them in real time, and delivers the results with minimal latency.
● Benefit: Ensures near-instantaneous processing and verification of ads, maintaining system speed despite high volumes.

Scalable Backend Infrastructure
● Plan: Utilize cloud infrastructure providers like AWS, Google Cloud, or Azure for elastic, scalable backend support.
● Details: AWS services: EC2 for scalable compute resources, S3 for scalable storage, and Lambda for serverless computations. Google Cloud services: Google Kubernetes Engine for container orchestration, BigQuery for large-scale data analysis.
● Benefit: Automatically scales resources based on demand, maintaining performance during traffic spikes without manual intervention.

Microservices Architecture
● Plan: Design the system using a microservices approach.
● Details: Each core functionality (e.g., ad crawling, feature extraction, verification, alerting) will be implemented as its own microservice. Use tools like Docker and Kubernetes for containerization and orchestration.
● Benefit: Isolation (failures in one service don't affect the entire system) and independent scaling of individual services based on need.

Load Balancing
● Plan: Implement load balancers to distribute incoming traffic evenly across servers.
● Details: Tools: Nginx, HAProxy, AWS Elastic Load Balancing.
● Benefit: Maintains high availability and reliability by ensuring no single server gets overwhelmed with traffic.

Asynchronous Processing
● Plan: Handle long-running tasks asynchronously to prevent system bottlenecks.
● Details: Use job queues like RabbitMQ or AWS SQS to manage background tasks. Offload tasks like data analysis and model training to background workers, freeing up resources for real-time tasks.
● Benefit: Prevents bottlenecks and keeps resources available for real-time tasks.

Continuous Monitoring and Auto-Scaling
● Plan: Set up continuous monitoring and auto-scaling mechanisms to maintain performance and reliability.
● Details: Prometheus for monitoring, Grafana for visualization. Configure auto-scaling policies based on metrics (e.g., CPU utilization, memory usage) to add/remove resources dynamically.
● Benefit: Ensures the system adapts in real time to changing workloads and traffic patterns.

View larger and detailed view of relationship among the various components ensuring scalability and
robustness here - Week 7 - "Let's tackle the scammers"
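The asynchronous-processing plan can be illustrated with Python's standard library, using an in-process queue as a stand-in for RabbitMQ/SQS and a background worker thread; `verify_ad` is a hypothetical stub for the verification engine:

```python
import queue
import threading

ad_queue = queue.Queue()          # stand-in for RabbitMQ / AWS SQS
results = {}

def verify_ad(ad_id: str) -> str:
    # Placeholder for the real (long-running) verification engine.
    return "Safe"

def worker():
    while True:
        ad_id = ad_queue.get()
        if ad_id is None:         # sentinel value: shut the worker down
            break
        results[ad_id] = verify_ad(ad_id)
        ad_queue.task_done()

t = threading.Thread(target=worker)
t.start()
for ad in ["ad-1", "ad-2", "ad-3"]:
    ad_queue.put(ad)              # producers return immediately (non-blocking)
ad_queue.put(None)
t.join()
```

The producer never waits on verification, which is the property that prevents bottlenecks in the real system; a broker like RabbitMQ additionally gives durability and cross-process distribution that an in-process queue cannot.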

4.3.3. Managing Volume and Speed


Efficient Data Collection
● Plan: Optimize the ad crawling and data collection process.
● Details: Incremental Crawling (only collect new or updated ads, reducing unnecessary processing); Concurrent Fetching (use concurrent API calls and threading to speed up data collection).
● Benefit: Reduces load and ensures timely data collection.

Optimized Algorithms
● Plan: Develop and deploy highly optimized machine learning models.
● Details: Model Optimization (use techniques like model pruning, quantization, and distillation to reduce model size and improve inference time); Distributed Training (train models using frameworks like TensorFlow or PyTorch with distributed training capabilities).
● Benefit: Improves model efficiency and speeds up inference time.

Fast Data Storage
● Plan: Use high-performance data storage solutions.
● Details: Databases optimized for read and write speeds, such as Amazon DynamoDB, Google Cloud Bigtable, or NoSQL options like Cassandra; ensure data is properly indexed to speed up access and retrieval times.
● Benefit: Enhances data access speed and system responsiveness.

Caching
● Plan: Implement caching mechanisms to reduce repetitive data processing.
● Details: Use Redis or Memcached for in-memory caching of frequently accessed data.
● Benefit: Significantly reduces data retrieval times and computational overhead.

Batch and Stream Processing
● Plan: Use a hybrid approach for processing.
● Details: Batch processing for non-time-critical tasks or bulk updates; stream processing for real-time, low-latency requirements.
● Benefit: Ensures efficiency and timeliness in data processing.
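The caching idea can be shown in-process with `functools.lru_cache` (Redis or Memcached would play this role across services); the reputation lookup is a hypothetical stand-in for an expensive database or API call:

```python
from functools import lru_cache

calls = {"count": 0}  # instrumentation to show the cache working

@lru_cache(maxsize=1024)
def advertiser_reputation(advertiser_id: str) -> float:
    """Pretend this is an expensive database / API lookup."""
    calls["count"] += 1
    return 0.9  # dummy reputation score

advertiser_reputation("acct-42")
advertiser_reputation("acct-42")   # second call served from cache; no new lookup
```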

Example Flow for Scalable and Robust AI Ad Verification System:


1. Ad Submission:
● Ad is submitted and enters the Ad Crawl Module that collects and sends it to the Feature
Extraction Module.
2. Data Streaming:
● Ad data is streamed in real-time using Kafka, passed to Spark for processing.
3. Feature Extraction:
● Features are extracted using various microservices specialized in NLP, image analysis,
and link analysis.
4. Machine Learning Models:
● Features are sent to verification models running on distributed cloud resources, providing
real-time classification.
5. Risk Scoring:
● The ad is assigned a risk category and published back into the stream.
6. User Notification:
● If flagged, alert mechanisms notify users while logs are created for continuous monitoring
and feedback.
7. Feedback Integration:
● User feedback is collected and used to retrain models, enhancing model accuracy over
time.

Monitoring and Scalability:


● Load Balancers: Ensure traffic is evenly distributed.
● Auto-Scaling Policies: Add resources during peak loads; deallocate when idle.
● Monitoring Tools: Continuously monitor processing times, system load, and user interactions.

4.4 Non-Functional Requirements

Security (Priority: High)
● Details: Data encryption (strong encryption for sensitive data like user information and ad details); access controls (restrict access to system components based on roles and permissions); regular security audits (vulnerability assessments and penetration testing); incident response plan (a well-defined plan for handling security breaches).
● Rationale: Protects user data, brand reputation, and legal compliance.

Reliability (Priority: High)
● Details: Redundancy (redundant systems and data backups); monitoring (continuously monitor system performance and identify potential issues); failover mechanisms (backup systems ready to take over in case of failures).
● Rationale: Ensures continuous service for ad verification and user alerts.

Performance (Priority: High)
● Details: Load testing (performance tests to identify bottlenecks and optimize system performance); caching (to improve response times); asynchronous processing (offload heavy tasks to background processes).
● Rationale: Impacts user experience, ad delivery, and system efficiency.

Accuracy (Priority: High)
● Details: AI model evaluation (continuously evaluate the performance of models and refine them as needed); human-in-the-loop (incorporate human review for critical cases); false positive/negative analysis (analyze error patterns to improve model accuracy).
● Rationale: Core function of the system; directly impacts trust and effectiveness.

Usability (Priority: Medium)
● Details: User testing (usability tests to gather feedback on the user interface); intuitive design (user-friendly interface with clear navigation); help and support (comprehensive help documentation and support channels).
● Rationale: Improves user satisfaction and reduces support costs.

Maintainability (Priority: Medium)
● Details: Modularity (modular components for easy maintenance and updates); testability (easy to test to identify and fix defects); documentation (clear and comprehensive documentation for system components and processes).
● Rationale: Facilitates future enhancements and bug fixes.

Scalability (Priority: Medium)
● Details: The system should be able to handle increasing ad volumes and verification complexity.
● Rationale: Prepares for future growth in ad volume and complexity.

Cost-Effectiveness (Priority: Medium)
● Details: The system should be cost-effective to operate and maintain.
● Rationale: Impacts overall project budget and ongoing operations.

4.5 Design Framing

4.5.1 Information Architecture: Week 7 - "Let's tackle the scammers"


4.5.2 Mockups & Prototype
Mockup and Prototype of following is available to view.
● Functioning of tool on Facebook visible to user
● Analytics dashboard for Facebook management
● BakBak.ai website

View the mockups and prototype here -

Mockup of Facebook Mobile app


https://www.figma.com/design/ZaGWNvmuvCMFGnRlMV5fNW/BAK-BAK-on-Facebook-Case-S
tudy-6?node-id=0-1&t=7zZTN7e8DYnDZ3Up-0
Prototype of Facebook mobile app
https://www.figma.com/proto/ZaGWNvmuvCMFGnRlMV5fNW/BAK-BAK-on-Facebook-Case-Stu
dy-6?node-id=100-13&t=uyDrXuBOF1HcXfxY-0&scaling=contain&content-scaling=fixed&page-i
d=0%3A1&starting-point-node-id=100%3A13&show-proto-sidebar=1
Analytics dashboard for Facebook management
https://www.figma.com/design/ZaGWNvmuvCMFGnRlMV5fNW/BAK-BAK-on-Facebook-Case-S
tudy-6?node-id=33-6&t=uyDrXuBOF1HcXfxY-0
BakBak.ai website
https://www.figma.com/design/ZaGWNvmuvCMFGnRlMV5fNW/BAK-BAK-on-Facebook-Case-S
tudy-6?node-id=131-136&t=uyDrXuBOF1HcXfxY-0

4.7 User Stories


Sneha Mehta (28, Mumbai, Financial Analyst) - Social Media User
● Safe and Relevant Ad Experience: As a social media user, I want to see ads that are highly personalized and aligned with my interests, so that I can discover new products and services while feeling respected and informed.
● Protection from Misinformation: As a social media user, I want to be confident that the ads I see are accurate and truthful, so that I can make informed decisions without being misled.

Anjali Singh (34, Bangalore, E-commerce Entrepreneur) - Advertiser
● Protect Brand Reputation: As an advertiser, I want to be immediately notified when my brand is associated with fake ads, malicious content, or inappropriate placements, so I can proactively protect my brand reputation and mitigate potential damage.
● Gain Insights into Ad Fraud: As an advertiser, I want to receive comprehensive and actionable insights about ad fraud affecting my campaigns, so I can optimize my ad spend and protect my brand reputation.

Pat Banerjee (43, Bay Area, USA, Social Media Manager) - Social media team managing the ads on the platform
● Proactive Ad Verification: As a social media platform manager, I want to implement robust, automated systems to proactively identify and prevent fake ads from appearing on the platform, protecting the user experience and platform reputation.
● Efficient Fake Ad Complaint Resolution: As a social media platform manager, I want a streamlined process to efficiently investigate and resolve user complaints about fake ads, ensuring timely responses, maintaining user trust, and protecting the platform's reputation.
● Effective Ad Fraud Detection: As a social media platform manager, I want to implement a robust system to proactively identify and prevent various types of ad fraud, including click fraud, impression fraud, and adware, to protect advertiser revenue and maintain platform integrity.

Detailed user Stories: Week 7 - "Let's tackle the scammers"

4.6 User Flow


We have considered the target personas, i.e., user personas no. 2, 6 & 7, additionally we have
also considered user persona no.5, a small business owner who also places Ads on social
media.

Based on their experience, we have defined a high-level flow that would be experienced by
each of them on the platform.

Each of these user flows is connected directly to the user stories defined above.
For a detailed walkthrough, refer to Col. D in the User Stories sheet.

Week 7 - "Let's tackle the scammers"

5. Go To Market Plan: GTM Plan


We will start with market research by analyzing competitors, surveying potential users, and
identifying market opportunities. Next, we will develop a brand identity with a name, logo,
tagline, and branding guidelines to convey trust and innovation. Our marketing strategy will
define objectives, key messages, and value propositions aimed at acquiring initial users. We will
plan and schedule digital, social media, and PR campaigns, and establish strategic
partnerships. For user onboarding, we will develop a process with guides, FAQs, and incentives.
Launch preparation will involve coordinating all teams, finalizing materials, and ensuring app
functionality. During the launch, we will host events, run ads, and monitor performance.
Post-launch, we will track KPIs, gather feedback, and refine strategies for ongoing growth.

6. Pricing Strategy
To ensure flexibility and cater to a variety of business needs, we propose the following tiered
pricing plan for our AI-driven ad verification system.

Basic Plan: $99/month
● Up to 10,000 ad verifications per month
● Access to basic reporting and analytics
● Standard customer support
● Basic fraud detection algorithms
● API access for integration with other platforms

Professional Plan: $299/month
● Up to 50,000 ad verifications per month
● Advanced reporting and analytics
● Priority customer support
● Advanced fraud detection algorithms
● API access for integration with other platforms
● Monthly data export
● Customizable alert settings

Enterprise Plan: $0.005 per ad fraud detected
● Unlimited ad verifications per month
● Comprehensive reporting and analytics
● Dedicated account manager and premium customer support
● Access to the latest machine learning algorithms and updates
● API access for integration with other platforms
● Weekly data export with advanced filtering options
● SLA (Service Level Agreement) with guaranteed uptime
● Personalized fraud detection model training
● Access to beta features and early releases
● On-site training and onboarding sessions

Add-On Services
● Extra Verifications: $0.01 per additional verification beyond plan limits
● Custom Feature Development: starting at $500 per feature
● Extended Data Storage: $50/month for an additional 1 TB
● Specialized Reports: custom pricing based on requirements

Payment Terms
● Billing Cycle: monthly or annual billing options available
● Discounts: 10% discount for annual upfront payments
● Free Trial: 14-day free trial available for all plans

Support Services
● Basic Support: email support with a 24-hour response time (Basic Plan)
● Priority Support: email and phone support with a 12-hour response time (Professional Plan)
● Premium Support: 24/7 dedicated support with a 4-hour response time (Enterprise Plan)

7. Understanding the Metrics

7.1 Activation metrics

● Customer Acquisition Cost (CAC): cost to acquire a new social media platform customer. Calculation: total sales and marketing expenses / number of new customers.
● Sales Cycle Length (sales efficiency): average time from initial contact to contract signing.
● Fraud Prevention ROI (solution effectiveness): (total financial loss prevented due to fraud detection - cost of the solution) / cost of the solution.
● Average Fraud Loss Prevented per Customer (solution impact): total financial loss prevented / number of customers.
● CAC ROI (return on investment): (total revenue generated from prevented fraud - total sales and marketing expenses) / total sales and marketing expenses.
● Customer Lifetime Value (CLTV): total revenue generated by a customer over their lifetime, including fraud prevention benefits.

Key Considerations

● Data Collection: Accurate and comprehensive data on fraudulent activities, financial losses, and
customer behavior is essential for calculating these metrics.
● Attribution: Clearly defining how to attribute fraud prevention to the solution can be challenging,
especially in cases where multiple fraud prevention measures are in place.
● Timeframe: Establishing appropriate timeframes for measuring these metrics is crucial for
assessing the solution's long-term impact.

By tracking these metrics, we can demonstrate the financial value of the ad fraud detection solution to
potential customers and measure the overall impact on their business.
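Two of the formulas above expressed directly as code; the input numbers are purely illustrative, not projections:

```python
def fraud_prevention_roi(loss_prevented: float, solution_cost: float) -> float:
    """(Total financial loss prevented - cost of the solution) / cost of the solution."""
    return (loss_prevented - solution_cost) / solution_cost

def cac(sales_marketing_spend: float, new_customers: int) -> float:
    """Total sales and marketing expenses / number of new customers."""
    return sales_marketing_spend / new_customers

# Hypothetical figures for illustration only:
roi = fraud_prevention_roi(loss_prevented=500_000, solution_cost=100_000)  # 4.0
acquisition_cost = cac(sales_marketing_spend=50_000, new_customers=25)     # 2000.0
```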

7.2 Virality factor

● Fraudulent Links Reported (solution effectiveness): total number of fraudulent links reported by the solution.
● Industry Fraud Reduction (solution impact): estimated reduction in industry-wide ad fraud losses.
● Regulatory Compliance (industry adherence): number of customers achieving regulatory compliance through the solution.
● Competitive Advantage (market position): number of competitive features or advantages over competitors.
● Thought Leadership (industry influence): number of industry publications, conferences, or webinars participated in.
● Time to Value (product adoption): average time for customers to realize significant ROI.
● Market Penetration (product reach): percentage of the target market using the solution.
● Media Coverage (brand visibility): number of media mentions and articles about the solution.
● Partner Ecosystem (industry collaboration): number of partnerships formed with complementary solutions.
7.3 Success metrics
By incorporating these calculations, we can gain a deeper understanding of the system's
performance and effectiveness in combating ad fraud.

● Detection Accuracy: Measures the overall correctness of the model in identifying
fraudulent ads.
● False Positive Rate: Indicates how often legitimate ads are incorrectly flagged as
fraudulent.
● User Engagement: Assesses how users interact with the system, particularly the alert
system.
● Reduction in Scam Incidents: Measures the decrease in reported scam incidents after
the solution implementation.
● Platform Trust: Evaluates user confidence in the platform's ability to combat ad fraud.
Legends:
TP: True Positives (correctly identified fraudulent ads)
FP: False Positives (legitimate ads incorrectly flagged as fraudulent)
TN: True Negatives (correctly identified legitimate ads)
FN: False Negatives (fraudulent ads missed by the system)

| Success Metric | Feature | Tracking Method | Calculation | Time Frame |
|---|---|---|---|---|
| Detection Accuracy | Model Performance | Model Evaluation Metrics | Precision = TP / (TP + FP); Recall = TP / (TP + FN); F1-score = 2 × (Precision × Recall) / (Precision + Recall) | Daily, Weekly, Monthly |
| Detection Accuracy | Model Improvement | Model Evaluation Metrics (cohort based) | Percentage change in F1-score compared to previous period | Weekly, Monthly, Quarterly |
| False Positive Rate | Model Improvement | Model Evaluation Metrics | FP / (FP + TN) | Daily, Weekly, Monthly |
| False Positive Rate | User Impact | User Feedback | Number of user complaints about false positives / total number of ads flagged | Weekly, Monthly |
| User Engagement | Alert System | System Logs | Click-through rate = number of clicks on alerts / number of alerts displayed | Daily, Weekly |
| User Engagement | User Feedback | User Surveys | Net Promoter Score (NPS), customer satisfaction score | Monthly, Quarterly |
| Reduction in Scam Incidents | System Effectiveness | User Reports, Platform Data | Percentage decrease in reported scams compared to previous period | Monthly, Quarterly, Annually |
| Reduction in Scam Incidents | User Impact | User Surveys | Percentage decrease in number of users reporting scams | Quarterly, Annually |
| Platform Trust | User Sentiment | User Surveys | Average NPS score, customer satisfaction score | Quarterly, Annually |
| Platform Trust | Brand Reputation | Social Media Listening Tools | Sentiment analysis score of social media mentions | Monthly, Quarterly |
| Time to Detection | System Performance | System Logs | Average time taken to detect a fraudulent ad from the time it is posted | Daily, Weekly |
| Cost-Benefit Analysis | System Efficiency | Financial Data | (Savings from reduced fraud − cost of system) / cost of system | Quarterly, Annually |
| Model Adaptability | Model Performance | System Logs | Number of model updates / total time period | Monthly, Quarterly |
| User Education | User Awareness | System Logs | Click-through rate on educational content / number of users exposed to content | Monthly, Quarterly |
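The operational formulas in the table (click-through rate, scam reduction, and the cost-benefit ratio) all reduce to simple ratios. A minimal Python sketch; the function names and input values are illustrative assumptions, not part of the case study:

```python
def click_through_rate(clicks_on_alerts: int, alerts_displayed: int) -> float:
    """User Engagement: clicks on alerts / alerts displayed."""
    return clicks_on_alerts / alerts_displayed

def scam_reduction(previous_reports: int, current_reports: int) -> float:
    """Reduction in Scam Incidents: fractional decrease vs. previous period."""
    return (previous_reports - current_reports) / previous_reports

def cost_benefit(savings_from_reduced_fraud: float, system_cost: float) -> float:
    """Cost-Benefit Analysis: (savings - cost of system) / cost of system."""
    return (savings_from_reduced_fraud - system_cost) / system_cost

# Illustrative placeholder inputs.
print(f"CTR:            {click_through_rate(120, 4000):.1%}")  # 3.0%
print(f"Scam reduction: {scam_reduction(500, 350):.1%}")       # 30.0%
print(f"Cost-benefit:   {cost_benefit(750_000, 250_000):.1%}") # 200.0%
```

In practice each ratio would be computed per tracking window (daily, weekly, monthly) from system logs and financial data, then compared cohort-over-cohort as the table describes.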

8. Risk and Mitigation Strategies

| Risk Category | Risk Level | Risk Description | Mitigation Strategies |
|---|---|---|---|
| Data Privacy and Security | High | Unauthorized access to user data; data breaches; non-compliance with regulations | Implement end-to-end encryption; regularly update security protocols; use secure storage solutions; strong authentication and authorization; conduct regular security audits; ensure compliance with data protection laws |
| Accuracy and Reliability | Medium | Inaccurate ad detection; misidentification of ads | Use advanced AI/ML algorithms; continuously train models; implement quality control checks; gather user feedback |
| Legal and Compliance | High | Copyright infringement; violating platform terms | Obtain proper licenses; provide clear terms of service; consult legal experts; regularly review legal agreements |
| User Experience and Satisfaction | Medium | Poor UI/UX leading to low adoption; complex processes | Invest in intuitive UI/UX; conduct user testing; provide clear instructions and support; ensure accessibility |
| Technical | Medium | App crashes and bugs; scalability challenges | Develop robust architecture; implement error monitoring; plan for scalability; regular updates and maintenance |

References

Week 7 - "Let's tackle the scammers"
