The document reviews net lift models, highlighting their importance in measuring the incremental impact of marketing campaigns rather than just gross purchase rates. It explains the different customer types in testing, and outlines statistical concepts and methods used in net lift modeling, such as weight of evidence, information value, and various regression and non-regression methods. Additionally, it discusses the effectiveness evaluation of these models and their applications in marketing and personalized medicine.
A REVIEW OF
NET LIFT MODELS
JUNE, 2013 ZIXIA WANG
summerzwang@gmail.com
2.
BACKGROUND
• The true effectiveness of a marketing campaign should be measured by its incremental
impact: the purchases that would not have taken place in the absence of the
campaign, rather than the gross number of purchases.
• Traditional propensity models (response models)
Maximize the gross purchase rate
• Net lift models (incremental models, uplift models, or true lift models)
Maximize the incremental impact (lift)
Incremental impact = Test group purchase rate (gross purchase rate) - Control group
purchase rate (self-selection purchase rate)
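As a minimal sketch of the incremental impact formula above, the snippet below subtracts the control group purchase rate from the test group purchase rate. The function name and the campaign counts are hypothetical and chosen only for illustration.

```python
def incremental_impact(test_buyers, test_size, control_buyers, control_size):
    """Return the test group purchase rate minus the control group purchase rate."""
    if test_size <= 0 or control_size <= 0:
        raise ValueError("group sizes must be positive")
    test_rate = test_buyers / test_size            # gross purchase rate
    control_rate = control_buyers / control_size   # self-selection purchase rate
    return test_rate - control_rate


# Hypothetical campaign with 10,000 customers in each group.
lift = incremental_impact(600, 10_000, 450, 10_000)
print(f"incremental impact: {lift:.4f}")
```

In this made-up case the gross purchase rate is 6%, but only 1.5 percentage points of it can be attributed to the campaign.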
3.
BACKGROUND
• There are four types of customers in the test and
control sets, based on their response to the
marketing campaign: swing clients, self-selectors,
no-purchase clients, and do-not-disturb clients (sleeping dogs).
• Traditional propensity models focus on everyone
who makes a purchase, which
includes both self-selectors and swing clients, whereas
net lift models can identify the swing clients,
who generate the most incremental sales from your
marketing campaigns.
• Net models can significantly increase the net
impact of marketing campaigns when you have a
large number of self-selectors.
4.
BASIC CONCEPT
• A few statistical concepts are needed to understand the
fundamentals of net lift models.
• Weight of evidence (WOE):
Describes the relationship (pattern) between a binary variable and a predictor
WOE > 0: positive impact
WOE = 0: no impact
WOE < 0: negative impact
Calculation methods:
Kernel density estimators: http://en.wikipedia.org/wiki/Kernel_density_estimation
Histogram estimator
5.
BASIC CONCEPT
• Information value (IV):
Measures the strength of the relationship
• Example: WOE = ln(0.05/0.06) = -0.182, IV = -0.182 * (0.05 - 0.06) = 0.002
• A cut point of 0.02 or 0.05 on the IV is commonly used to decide whether the variable has a significant
impact.
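The WOE and IV arithmetic above can be reproduced with a short sketch. It assumes a histogram estimator in which each bin is described by its share of all responders and its share of all non-responders. The function name `woe_iv` is hypothetical, and the single bin uses the 0.05 and 0.06 figures from the example.

```python
import math


def woe_iv(pct_responders, pct_non_responders):
    """Return the per-bin WOE values and the total IV.

    Each list holds one share per bin; every share must be greater than zero.
    """
    woe_values = []
    iv = 0.0
    for p, q in zip(pct_responders, pct_non_responders):
        w = math.log(p / q)      # WOE of the bin
        woe_values.append(w)
        iv += w * (p - q)        # the bin's contribution to IV
    return woe_values, iv


woe_bin, iv_bin = woe_iv([0.05], [0.06])
print(f"WOE = {woe_bin[0]:.3f}, IV = {iv_bin:.3f}")
```

With several bins, the IV of the variable is the sum of the per-bin contributions, and that total is what gets compared against the cut point.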
6.
BASIC CONCEPT
• Penalized IV (PIV):
Measures the robustness of WOE and IV
For each bin, the penalty is calculated as the difference in WOE between the training and
validation samples.
Total Penalty = Sum(Penalty in each bin * (% responders - % non-responders))
Penalized IV = IV - Total Penalty. The suggested cutoff for PIV is 0.1 for variable
selection.
If the Total Penalty is small relative to the IV, the variable can be considered robust.
Only include variables with a relatively large penalized IV in the final model.
• Net Weight of Evidence (NWOE):
NWOE = WOE(test) - WOE(control)
• Net Information Value (NIV):
Describes the net strength of the relationship.
• Penalized NIV:
Measures the robustness of a variable.
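The penalized IV and NWOE definitions above can be sketched as follows. Two choices here are assumptions rather than something the slides state: the per-bin penalty is taken as the absolute WOE difference, and it is weighted by the absolute difference between the responder and non-responder shares, so that the total penalty cannot be negative. All function names and bin shares are hypothetical.

```python
import math


def woe(p, q):
    """WOE of one bin from its responder share p and non-responder share q."""
    return math.log(p / q)


def penalized_iv(train_bins, valid_bins):
    """Return (IV, total penalty, penalized IV).

    Each argument is a list of (pct_responders, pct_non_responders) per bin.
    """
    iv = 0.0
    total_penalty = 0.0
    for (pt, qt), (pv, qv) in zip(train_bins, valid_bins):
        w_train = woe(pt, qt)
        iv += w_train * (pt - qt)
        penalty = abs(w_train - woe(pv, qv))     # assumption: absolute difference
        total_penalty += penalty * abs(pt - qt)  # assumption: absolute weight
    return iv, total_penalty, iv - total_penalty


def net_woe(test_bins, control_bins):
    """Per-bin NWOE = WOE(test) - WOE(control)."""
    return [woe(pt, qt) - woe(pc, qc)
            for (pt, qt), (pc, qc) in zip(test_bins, control_bins)]


# Hypothetical three-bin variable on training and validation samples.
train = [(0.2, 0.4), (0.3, 0.3), (0.5, 0.3)]
valid = [(0.25, 0.35), (0.3, 0.3), (0.45, 0.35)]
iv, penalty, piv = penalized_iv(train, valid)
print(f"IV = {iv:.3f}, penalty = {penalty:.3f}, PIV = {piv:.3f}")

# Hypothetical two-bin variable in the test and control groups.
nwoe = net_woe([(0.6, 0.4), (0.4, 0.6)], [(0.5, 0.5), (0.5, 0.5)])
print([round(v, 3) for v in nwoe])
```

A variable whose WOE pattern is identical in both samples receives a zero penalty, so its PIV equals its IV.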
7.
MODELING METHOD
• Regression-based methods:
1) DSM (Difference score models)
Method 1: Build two separate logistic regression models, one on the test group and one on the control group
Incremental lift score = P(purchase | treatment) - P(purchase | control)
Method 2: A single logistic regression model (the bifurcated logistic model)
Logit(P(response | X)) = a + b*X + g*treatment + l*treatment*X
score = P(response | X, treatment = 1) - P(response | X, treatment = 0)
2) PDM (Probability decomposition models)
When the test and control groups are equally sized:
P(purchase due to treatment) = P(purchase | treatment) * (2 - 1/P(treatment | purchase))
Otherwise, with Nt test customers and Nc control customers:
P(purchase due to treatment) = P(purchase | treatment) * (1 + Nt/Nc - (Nt/Nc) * 1/P(treatment |
purchase))
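The two regression-based scores can be illustrated with a small sketch. The coefficients passed to `bifurcated_score` are hypothetical placeholders, not fitted values. The general PDM form used here, P(purchase | treatment) * (1 + Nt/Nc - (Nt/Nc)/P(treatment | purchase)), is derived from Bayes' rule and reduces to the equal-size formula when Nt = Nc. The counts are invented to check that it agrees with the direct difference in purchase rates.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bifurcated_score(x, a, b, g, l):
    """Difference score from one logistic model with a treatment interaction."""
    p_treated = sigmoid(a + b * x + g + l * x)   # treatment = 1
    p_control = sigmoid(a + b * x)               # treatment = 0
    return p_treated - p_control


def pdm_score(p_purchase_given_t, p_t_given_purchase, n_t, n_c):
    """Probability decomposition score for test and control groups of any size."""
    ratio = n_t / n_c
    return p_purchase_given_t * (1 + ratio - ratio / p_t_given_purchase)


# Hypothetical coefficients: with g = l = 0 treatment has no effect.
print(bifurcated_score(x=1.0, a=-2.0, b=0.5, g=0.3, l=0.2))

# Hypothetical counts with unequal group sizes.
n_t, n_c = 8000, 2000
buy_t, buy_c = 640, 100
p_y_t = buy_t / n_t                  # P(purchase | treatment)
p_y_c = buy_c / n_c                  # P(purchase | control)
p_t_y = buy_t / (buy_t + buy_c)      # P(treatment | purchase) in the pooled data
direct = p_y_t - p_y_c
via_pdm = pdm_score(p_y_t, p_t_y, n_t, n_c)
print(direct, via_pdm)
```

In practice P(treatment | purchase) would come from a model fitted on purchasers only, which is what lets PDM produce a score per customer rather than a single aggregate figure.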
8.
MODELING METHOD
• Non-regression methods:
1) Uplift random forest:
This method estimates personalized treatment effects by binary recursive partitioning. The
estimated personalized treatment effect is obtained by averaging the predictions of the
individual trees in the ensemble.
2) KNN (K-nearest-neighbors) classifiers
This method uses the net purchase rate calculated from a neighborhood of the nearest customers
in the training set to estimate the net score for observations in the validation dataset.
3) Net naive (and semi-naive) Bayes classifier
The naive Bayes classifier assumes that all predictors are conditionally independent given the
target variable Y. The net score under the net naive Bayes method is then just the net weight
of evidence (NWOE). The generalized version of the net naive Bayes method rotates the WOE tables
to make them more orthogonal to each other.
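As a rough sketch of the KNN idea described above, the function below takes the k nearest training customers to a query point and returns the purchase rate of the treated neighbors minus the purchase rate of the control neighbors. The record layout, the Euclidean distance, and the tiny data set are assumptions made for illustration.

```python
import math


def knn_net_score(query, training, k):
    """Net purchase rate among the k nearest training customers.

    training is a list of (features, treated, purchased) with 0/1 flags.
    Returns None when the neighborhood lacks treated or control customers.
    """
    nearest = sorted(training, key=lambda row: math.dist(query, row[0]))[:k]
    treated = [y for _, t, y in nearest if t == 1]
    control = [y for _, t, y in nearest if t == 0]
    if not treated or not control:
        return None
    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical one-feature training set: (features, treated, purchased).
training = [
    ((0.0,), 1, 1), ((0.1,), 1, 1), ((0.2,), 0, 0), ((0.3,), 0, 1),
    ((5.0,), 1, 0), ((5.1,), 0, 0), ((5.2,), 1, 0), ((5.3,), 0, 0),
]
score_low = knn_net_score((0.15,), training, k=4)
score_high = knn_net_score((5.15,), training, k=4)
print(score_low, score_high)
```

The choice of k trades variance against locality: a small neighborhood may contain too few treated or control customers to give a stable rate difference.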
9.
METHOD COMPARISON
• Regression-based methods:
1) No attempt to maximize the incremental purchase rate directly
2) Subtracting two independent models can produce a black box
3) Little control over the smoothness of the final prediction functions
• Non-regression methods:
1) Fit the incremental purchase rate more directly
2) If naive Bayes (NBC) or semi-naive Bayes (SNBC) methods are used, the prediction functions can be interpreted directly
and we can control the smoothness of these functions. KNN is still a black box.
3) They fit an inherently unstable target (double variable) and can overfit.
Therefore, full validation or forward validation is needed.
10.
EVALUATING THE EFFECTIVENESS OF NET MODEL
• It’s still an area of ongoing research.
• A common approach is to use the net lift in the top two deciles or the top 10% as a measure of
success.
• Based on the case example provided by Kim Larsen at the 12th Annual data mining
conference, the net difference score with bifurcated adaptive logistic regression works
best, followed by generalized net naive Bayes, net naive Bayes, the KNN classifier, and the net
difference score with two linear logistic regressions.
• In your results, clients with a high net score are the swing clients described on slide 3,
and clients with a zero or even negative score are self-selectors, no-purchase clients, or do-not-disturb
clients (sleeping dogs).
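One way to sketch the top-decile measure mentioned above: rank the scored customers, keep the top fraction, and compare the test and control purchase rates inside that slice. The function name, the record layout, and the synthetic data are assumptions. This is only an illustration, not an established evaluation standard.

```python
def net_lift_in_top(records, fraction=0.2):
    """Test purchase rate minus control purchase rate in the top-scored fraction.

    records is a list of (score, treated, purchased) with 0/1 flags.
    """
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    top = ranked[:cutoff]
    treated = [y for _, t, y in top if t == 1]
    control = [y for _, t, y in top if t == 0]
    if not treated or not control:
        raise ValueError("top slice needs both treated and control customers")
    return sum(treated) / len(treated) - sum(control) / len(control)


# Synthetic data: 20 customers, alternating treated and control.
# In the 4 highest-scored records only treated customers buy;
# elsewhere both groups buy at the same rate.
records = []
for i in range(20):
    score = 1 - 0.05 * i
    treated = 1 if i % 2 == 0 else 0
    purchased = treated if i < 4 else (1 if i % 4 < 2 else 0)
    records.append((score, treated, purchased))

top_lift = net_lift_in_top(records, fraction=0.2)
overall_lift = net_lift_in_top(records, fraction=1.0)
print(top_lift, overall_lift)
```

A useful net model should show a clearly higher net lift in the top slice than in the population as a whole, as this synthetic example does by construction.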
11.
IMPLEMENTATIONS
• Many of the examples available online are built using a series of macros coded in
SAS.
• The uplift package for R uses causal conditional inference trees to estimate
personalized treatment effects.
• The R packages smbinning and WOE let you quickly
calculate WOE and IV values.
12.
APPLICATION
• Various marketing analyses:
Direct Mail
Email
Sweepstake
A/B testing
• There are different ways to define the profit per purchase, depending on your goal and
time window. Measures include the net profit margin per sale, net present value
(NPV), and lifetime value (LTV).
• Net modeling has also been applied to personalized medicine.
13.
REFERENCES
• Net Lift Models: Optimizing the Impact of Your Marketing Efforts, by SAS Institute
• Net Models, presentation at the 12th Annual data mining conference, by Kim Larsen
https://www.youtube.com/watch?v=JN3WE8IZNVY
• Analyzing Collection Effectiveness using Incremental Response Modeling, by Ryan
Burton et al.
http://www.mwsug.org/proceedings/2014/BI/MWSUG-2014-BI06.pdf
• What are uplift models, by Jeffrey Strickland
http://www.analyticbridge.com/profiles/blogs/what-are-uplift-models