Tutorial PracticeQuestions Class2 2023 Solution

Marketing Analytics MBusA MBS 2023 Practice Questions Solution ©Nico Neumann

Practice questions class 2 - Solution


Question A

Note first that the question does not specify a discount rate (WACC or interest rate). We therefore
have to assume one; any reasonable value is fine. Here we simply choose 5%.

We can use our Excel sheet from the class Workbook and just change the numbers and copy/
paste the columns to extend the table from 5 periods (class example) to 7 periods.

We can find out the average churn probability using any method that provides us with the sample
average for the dummy Churn (pivot table or contingency/ summary of churn in R). The average
churn rate is: 21.26%.

Question A-a

After 5 years, 30.3% of customers of the original cohort are still with us, assuming an average
churn rate.
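The 30.3% figure follows from compounding the average retention rate over five periods. A minimal Python sketch of that arithmetic (the class materials use Excel and R; the function and variable names here are illustrative only):

```python
# Share of the original cohort remaining after n periods,
# assuming the same churn rate applies in every period.
average_churn = 0.2126          # sample mean of the Churn dummy (from the solution)
retention = 1 - average_churn   # per-period retention rate


def cohort_remaining(retention_rate, periods):
    """Survival probability after `periods` periods with constant retention."""
    return retention_rate ** periods


remaining_after_5 = cohort_remaining(retention, 5)
print(f"{remaining_after_5:.1%}")  # 30.3%
```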

Question A-b

The CLV7 is 246.52 given the provided information and assuming a yearly discount rate of 5%.

Question B

We need to apply the Kaplan-Meier estimator. If we use our Excel template from the lecture,
we will need to generate a pivot table in Excel and estimate retention rates and survival
probabilities manually (as demonstrated in the class example). Alternatively, we can use the survfit
function from the R package ‘survival’, which provides the survival probabilities directly (see R
code). If we copy over the R output, we need to ensure that we use the output for the survival
probability row and not the retention rate row!

Trap (careful): The tenure counter starts at zero, so period 5 corresponds to Tenure = 4. We can
see that there are people who churned in period 0 (in a subscription business, churn can only
happen after one period, so this is just a matter of indexing).
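To make the difference between the retention rate and the survival probability concrete, here is a minimal Kaplan-Meier sketch in Python. It is not the class R code or Excel template; the ten customers below are hypothetical, and the tenure counter starts at zero as in the data set.

```python
def kaplan_meier(tenures, churned):
    """Return rows of (tenure, at_risk, churners, retention_rate, survival_prob).

    tenures: last observed tenure per customer (counter starts at 0)
    churned: 1 if the customer churned at that tenure, 0 if censored
    """
    rows = []
    survival = 1.0
    for t in sorted(set(tenures)):
        at_risk = sum(1 for d in tenures if d >= t)
        churners = sum(1 for d, c in zip(tenures, churned) if d == t and c == 1)
        retention = 1 - churners / at_risk   # period-specific retention rate
        survival *= retention                # cumulative product = survival probability
        rows.append((t, at_risk, churners, retention, survival))
    return rows


# Hypothetical toy data, for illustration only.
tenures = [0, 1, 1, 2, 2, 2, 3, 3, 4, 4]
churned = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

km_rows = kaplan_meier(tenures, churned)
for t, n, d, r, s in km_rows:
    print(f"tenure={t} (period {t + 1}): at risk={n}, churned={d}, "
          f"retention={r:.3f}, survival={s:.3f}")
```

The retention column describes a single period, while the survival column is the running product of all retention rates so far. The survival column is the one needed for questions such as "what share is still with us after 5 years".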

Question B-a

After 5 years, 88% of customers of the original cohort are still with us based on current retention
rates.

Question B-b

The CLV7 is now 426.32.


Question C

We need to perform log-rank tests to address this question.

Question C-a

The Chi-square statistic is 60.9 (DF=1) and the p-value is smaller than 0.001 for our log-rank test.
Because this p-value is below our threshold of 0.05, we conclude that we find evidence that the
survival curves (and thus retention behaviour) are different for male and female customers.

We need to quickly check whether our log-rank test has sufficient power/ can be trusted:

➢ As the curves don’t cross, we can trust the test statistic.
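For readers who want to see what the log-rank statistic computes, the following is a minimal two-group sketch in plain Python. It is an illustration under stated assumptions, not the class R code: the six observations are hypothetical, and the p-value uses the closed form of the chi-square distribution for one degree of freedom.

```python
import math


def logrank_two_groups(durations, events, groups):
    """Two-sample log-rank test; `groups` holds 0/1 labels.

    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    observed = expected = variance = 0.0
    event_times = sorted({d for d, e in zip(durations, events) if e == 1})
    for t in event_times:
        # Customers still at risk just before time t (overall and in group 1).
        n = sum(1 for d in durations if d >= t)
        n1 = sum(1 for d, g in zip(durations, groups) if d >= t and g == 1)
        # Churn events at time t (overall and in group 1).
        d_all = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        d1 = sum(1 for d, e, g in zip(durations, events, groups)
                 if d == t and e == 1 and g == 1)
        observed += d1
        expected += d_all * n1 / n
        if n > 1:  # the variance term is undefined when only one customer is left
            variance += d_all * (n1 / n) * (1 - n1 / n) * (n - d_all) / (n - 1)
    chi_square = (observed - expected) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # chi-square upper tail, DF = 1
    return chi_square, p_value


# Hypothetical toy data: group 1 churns early, group 0 churns late.
toy_durations = [1, 2, 3, 4, 5, 6]
toy_events = [1, 1, 1, 1, 1, 1]
toy_groups = [1, 1, 1, 0, 0, 0]

chi_sq, p_val = logrank_two_groups(toy_durations, toy_events, toy_groups)
print(f"chi-square = {chi_sq:.2f}, p = {p_val:.4f}")
```

The test compares, at every churn time, the observed number of churners in one group with the number expected if both groups shared the same survival curve.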

Question C-b

Careful: CreditScore is a continuous variable. As log-rank tests only work for categorical variables,
we need to create groups for any continuous variables, such as CreditScore (see R-code).

We decide to split the variable into three groups (a good idea is to check a histogram of the
distribution first). Usually 3-4 groups will suffice (e.g. a quantile split).
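A minimal Python sketch of such a grouping step is shown below. The cut-offs mirror the tier labels reported in the table under Question C-c; treating a score of exactly 700 as part of the top tier is an assumption, and the sample scores are hypothetical.

```python
from collections import Counter


def credit_score_tier(score):
    """Map a continuous credit score to one of three tiers.

    Cut-offs follow the tier labels in the solution table; putting a score
    of exactly 700 into the top tier is an assumption.
    """
    if score < 550:
        return "< 550"
    if score < 700:
        return "550-699"
    return "700+"


scores = [480, 549, 550, 630, 699, 700, 815]  # hypothetical values
tiers = [credit_score_tier(s) for s in scores]
tier_counts = Counter(tiers)
print(tier_counts)
```

Checking the group sizes after the split helps confirm that no tier is too small for the test.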

The Chi-square statistic is 6.2 (DF=2) and the p-value is 0.04 for our log-rank test. Because this
p-value is below our threshold of 0.05, we conclude that we find evidence that the survival curves
(and thus retention behaviour) are different for customers in our three different credit score tiers.
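The reported p-values can be sanity-checked from the Chi-square statistics alone. The sketch below uses the closed-form upper tail of the chi-square distribution, which exists for DF = 1 and DF = 2; it is a plausibility check in Python, not the R output, and the function name is illustrative.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail p-value of a chi-square statistic (closed form for DF 1 and 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise ValueError("closed form implemented for df = 1 and df = 2 only")


p_gender = chi_square_p_value(60.9, 1)  # gender test from Question C-a
p_credit = chi_square_p_value(6.2, 2)   # credit score tiers from Question C-b
print(f"gender: p = {p_gender:.2e}, credit score tiers: p = {p_credit:.3f}")
```

For the rounded statistic of 6.2 this gives roughly 0.045, which is in line with the reported 0.04 and below the 0.05 threshold.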

We need to quickly check whether our Log-rank test was feasible:


➢ As the curves don’t cross, we can trust the test statistic.

Question C-c

Which variables should be used for targeting?

Contextual information: Our business is a bank. We don’t know the type of customers though
(retail vs. corporate banking). However, the variables we observe seem to suggest that we may
have a retail bank whose customers are individuals (this will be our assumption).

We could also mention that "lucrative" is not well defined: we assume this means more profitable
customers. If we want to find customers who are more profitable, we need to be careful, because
we only have data about retention behaviour but not about costs or revenues for the various groups.
Hence, we can only focus on the former and assume the remaining elements are the same across groups.

Analytical considerations: We apply the log-rank test and need to check that all assumptions hold
(e.g. no crossed curves). Our 2 tests seem OK in this regard. We must check the survival
probabilities per category to see which group demonstrates greater loyalty/ retention rates (note:
because the curves don’t cross, we can take the average per group):

Variable             Level      Churn average
Gender               Female     0.264
                     Male       0.167
Credit score group   < 550      0.238
                     550-699    0.212
                     > 700      0.201

We find that female customers stay for a shorter time than male customers, and customers with high
credit scores show greater retention rates (on average).

Theoretical considerations: We know the information is supposed to be used to find lucrative
customers and for marketing communication targeting purposes (= segmentation/ targeting).

Even with our assumption of constant revenue and costs per groups, we need to be aware that the
final profit and CLV are given by the following formula (from the morning lecture):

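[Formula from the lecture slide is not legible in this copy. Surviving fragments: "Revenue", "The long-term perspective captured through total ...", "Brand", "across years".]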

Number of products and price are variables that are independent of targeting/ segmentation. We
only focus on the savings subscription product. However, whether someone is a new or existing
customer is an essential piece of information influencing targeting and marketing communications
campaigns. Some information will not be available for new customers (who are not with our bank
yet). This is something to consider for our discussion of whether to use a proposed feature for
targeting.

HD considerations:

Data availability/ business considerations - this may be a new way of thinking for many of you, but
an important one to learn and keep in mind for our class going forward:

Gender: Salespeople in our branches can identify the gender for most clients. This is also
often information that is tracked and made available by data brokers (we learnt
this in class 1) and market research companies. Hence targeting by gender seems
possible for existing and new customers.

Credit score: We have this information typically only for existing customers. Some applicants
(prospects) who are not with us yet may also provide their credit score. However,
using credit score for a large ad campaign to attract new customers would not be
feasible. Hence, we argue that only existing customers could be targeted.

Consumer behavior/ causal considerations (always good to keep in mind for marketing): Do our
findings make sense from a theoretical perspective? Are your results plausible?

Gender: Is there some theory that suggests that women hold fewer savings accounts at all
banks, or hold them for a shorter period in general? Or is this just a unique finding in our data
set? Why should women have a lower need to keep a savings account for a longer
time? There may be a confounder missing in our analysis or some selection bias (e.g.
males obtained different loyalty rewards in the past). CLV analysis is descriptive
(or predictive) and we need to be careful about conclusions regarding covariates.
You may require more research/ information to answer this question.

Credit score: A relationship between the tendency to keep savings accounts and the credit
score appears plausible, even intuitive at first sight. Credit scores summarise how
trustworthy a customer is as a borrower based on past records and the current
financial situation. Customers who have high scores are financially in a better
position and hence more likely to have funds for savings accounts.

Credit score is just another outcome metric for identifying potentially lucrative
customers, or even a possible successor of our variable of interest (here savings
account subscription). That is, credit score may only increase/ improve because
customers started using the savings account. This may not be problematic for
targeting though (unless there is a selection effect again). We again need more
information about how people with different credit scores were treated in the
past.

It’s not clear what the exact causal diagram may be as one can argue about different effect
relationships. One possible diagram is depicted below.

Overall conclusion: Both gender and credit score could be interesting targeting variables
(provided that these segments were not treated differently in the past, e.g. with churn
promotions). Gender is preferable because data is more widely available.

Question D

The parameters for the sBG distribution according to our BFGS optimisation are 7.99 for alpha and
163.27 for beta. The corresponding projected CLV10 is $569.64.
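To see what these two parameters imply, the sketch below projects retention rates and survival probabilities from the fitted values. It assumes the common shifted-beta-geometric parameterisation in which the per-period churn probability follows a Beta(alpha, beta) distribution, so that the retention rate in period t is (beta + t - 1) / (alpha + beta + t - 1); if the class template parameterises the model differently, the numbers will differ. The CLV10 itself is not recomputed here because the margin and discount inputs for this question are not restated in the solution.

```python
def sbg_retention_and_survival(alpha, beta, periods):
    """Projected (period, retention_rate, survival_prob) under the sBG model.

    Assumes churn probability ~ Beta(alpha, beta), which implies
    retention_t = (beta + t - 1) / (alpha + beta + t - 1).
    """
    rows = []
    survival = 1.0
    for t in range(1, periods + 1):
        retention = (beta + t - 1) / (alpha + beta + t - 1)
        survival *= retention
        rows.append((t, retention, survival))
    return rows


# Fitted parameters reported in the solution.
sbg_rows = sbg_retention_and_survival(7.99, 163.27, 10)
for t, r, s in sbg_rows:
    print(f"period {t}: retention={r:.4f}, survival={s:.4f}")
```

Under this parameterisation the retention rate rises slowly over time, reflecting that customers with a high churn propensity leave first.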

Question E

We don’t have any further information, so our only decision criterion is to maximise CLV.
We use the retention rates for monthly data from our pivot table from the lecture worksheet (or
use the R output for the survival probabilities). We then estimate the CLV for 60 months based on
the information provided in class:

➢ Yields $240 each year the subscription is renewed.

➢ Yearly average ongoing costs per customer are $120.

➢ Acquisition costs per customer are $150.

➢ Yearly discount rate of 6%.

The important trick here is to transform the yearly discount factor into a monthly compound rate
based on the formula shared in the lecture:

Discount rate considerations (compound interest rates): the discount rate needs to be aligned to the
frequency of revenue streams. For monthly revenue streams:

r_monthly = (1 + r_yearly)^(1/12) - 1

The yearly rate of 6% then corresponds to a monthly rate of 0.487%. We also need to adjust the
monthly ongoing costs. We are told to assume constant/ linear costs. Hence, the monthly costs
would be 120/12= 10.
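The conversion can be checked with a few lines of Python. This is a sketch of the arithmetic described above, with illustrative variable names; note that simply dividing 6% by 12 would give 0.5%, which ignores compounding.

```python
yearly_rate = 0.06

# Monthly rate that compounds to the yearly rate over 12 months.
monthly_rate = (1 + yearly_rate) ** (1 / 12) - 1

# Ongoing costs are assumed to be constant/linear over the year.
yearly_cost = 120
monthly_cost = yearly_cost / 12

print(f"monthly discount rate: {monthly_rate:.3%}")  # 0.487%
print(f"monthly ongoing cost: {monthly_cost:.2f}")   # 10.00
```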

Our CLV60m (5 years = 60 months) is then 260.38 AUD for monthly subscriptions versus 292.49 AUD
for yearly subscriptions if we charge the fee at the beginning of each period.

For charges at the end of each period, the pattern reverses: 232.07 AUD for yearly subscriptions
and 255.35 AUD for monthly subscriptions.

Hence, we should recommend introducing yearly subscriptions charged at the beginning of each
period as the CLV is the highest.
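The effect of charging at the beginning versus the end of a period can be sketched as follows. This is an illustration and not the class Excel sheet: it assumes that a payment at time k is collected from the share of the cohort still active at time k and is discounted over k periods, and the three-point survival curve is hypothetical. The margin of 120 (240 - 120), the acquisition cost of 150 and the 6% rate are the yearly figures listed above.

```python
def clv(survival, margin, rate, acquisition_cost, charge_at_start):
    """Discounted CLV from a cohort-level survival curve.

    survival[k] is the share of the cohort still active at time k
    (survival[0] == 1.0), so len(survival) == number of periods + 1.
    Assumption: a payment at time k comes from the customers active at time k.
    """
    periods = len(survival) - 1
    if charge_at_start:
        payment_times = range(0, periods)        # pay at the start of each period
    else:
        payment_times = range(1, periods + 1)    # pay at the end of each period
    value = -acquisition_cost
    for k in payment_times:
        value += survival[k] * margin / (1 + rate) ** k
    return value


example_survival = [1.0, 0.5, 0.25]  # hypothetical two-period survival curve
clv_start = clv(example_survival, 120, 0.06, 150, charge_at_start=True)
clv_end = clv(example_survival, 120, 0.06, 150, charge_at_start=False)
print(f"charged at start: {clv_start:.2f}, charged at end: {clv_end:.2f}")
```

Charging at the start collects from a larger customer base and discounts each payment by one period less, which is why it yields the higher value for a given billing frequency.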

HD considerations:

What is causing this? The survival curve is concave downward, so we secure more income
by charging as much as possible immediately. If fees are charged at the end of each period, monthly
fees yield more revenue than yearly fees because the payments in the first months of year 1 are
discounted less (and our customer base is still greater at that point).

Extra: What would happen if we used a standard random forest to predict the retention rate?

Because this method does not consider censoring, the retention rate is strongly underestimated.

In fact, the random forest would tell us that only 1.9% of customers are still with us after 1 year
(despite accounting for individual features of each customer). See Excel file for details:

Charged at the beginning of each period:

CLV60m: -117.12
CLV5: -27.66

Charged at the end of each period:

CLV60m: -127.12
CLV5: -147.66

The result is that all CLV calculations are negative! (no matter whether we use monthly or yearly
subscriptions or whether we charge our fee at the beginning or end of each period).

Curves suggested by random forest:

[Figures: "Retention rate random forest" and "Survival curve random forest", each plotted over periods 1 to 59.]

Conclusion: We can see how critical it is to account for censoring!
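A small Python sketch illustrates the direction of the bias. It does not reproduce the random forest from the Excel file: the data are hypothetical, and the "naive" estimate simply treats every customer whose record ends at or before the horizon as lost, which is one simple way of ignoring censoring.

```python
def km_survival(durations, churned, horizon):
    """Kaplan-Meier survival probability up to and including `horizon`."""
    survival = 1.0
    for t in sorted(set(durations)):
        if t > horizon:
            break
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, c in zip(durations, churned) if d == t and c == 1)
        survival *= 1 - events / at_risk
    return survival


def naive_survival(durations, horizon):
    """Share of customers observed beyond `horizon`; censored records count as lost."""
    return sum(1 for d in durations if d > horizon) / len(durations)


# Hypothetical toy data: churned == 0 means the customer was still active
# when observation stopped (right-censored).
durations = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
churned = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

km_estimate = km_survival(durations, churned, horizon=4)
naive_estimate = naive_survival(durations, horizon=4)
print(f"Kaplan-Meier: {km_estimate:.3f}, ignoring censoring: {naive_estimate:.3f}")
```

In this toy example the estimate that ignores censoring is far lower than the Kaplan-Meier estimate, because customers who were merely observed for a short time are counted as churners.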

Note: The R packages grf and randomForestSRC offer survival random forests that account for
right-censored time-to-event data (not part of the subject scope).
