Learn & Publish

Dedicated to its readers.

Uplift Modeling: A Gentle Introduction

Targeted marketing is now so widespread that the usual loyalty schemes no longer surprise customers. Whether it is a reward program at Sephora, a referral campaign at Riff Raff & Co, coupons from your local supermarket, or a special one-off discount, you were offered it for a reason.

But is there a shortcut to targeting the consumers who are more likely to buy after receiving an incentive from a brand? Well, that’s why we are here. It is called uplift modeling.

Behind the Beast

Among other applications, this technique can shed light on the undertones of customers’ reactions, but what is uplift modeling? 

Let’s start with the basics.

To make promotions more profitable, companies turn to machine learning. Customer targeting is currently approached with three kinds of models:

  • The look-alike model estimates the probability that a customer will perform a targeted action. Its training sample consists of known positive targets (e.g., users who have installed the app) and random negative targets (a small random sample of all other customers, who have not installed the app). The model looks for customers similar to those who have already taken the targeted action.
  • The response model estimates the probability that the customer will perform the targeted action given the communication. In this case, the training sample is the data collected after some customer interaction. In contrast to the first approach, we have real positive and negative observations (e.g., the customer signed up for a credit card or declined).
  • The uplift model estimates the net effect of the communication, trying to select those customers who will take the targeted action only because of our interaction. The model estimates the difference in customer behavior between exposure and no exposure.

If we break down the term, the uplift of a marketing campaign is typically defined as the difference in response rate between a treated group and a randomized control group. The treated group refers to the individuals who were exposed to a specific marketing treatment from a brand. The randomized control group, in contrast, consists of individuals who were left out of the marketing treatment.

To isolate the marketing effect and measure it reliably, a company therefore has to pay special attention to how it assigns customers to the treated and control groups.
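The definition above comes down to simple arithmetic. Here is a minimal Python sketch, assuming each customer is recorded as a (treated, responded) pair; the function name and the campaign numbers are made up for illustration only.

```python
def campaign_uplift(records):
    """Return the treated group's response rate minus the control group's.

    `records` is an iterable of (treated, responded) boolean pairs.
    """
    treated = [responded for was_treated, responded in records if was_treated]
    control = [responded for was_treated, responded in records if not was_treated]
    if not treated or not control:
        raise ValueError("both a treated and a control group are required")
    return sum(treated) / len(treated) - sum(control) / len(control)


# Made-up campaign: 100 treated customers (12 buy), 100 control customers (8 buy).
records = (
    [(True, True)] * 12 + [(True, False)] * 88
    + [(False, True)] * 8 + [(False, False)] * 92
)
uplift = campaign_uplift(records)
print(f"uplift = {uplift:.2%}")  # prints "uplift = 4.00%"
```

A positive value means the treated group responded more often than the control group; the figure is only trustworthy if the two groups were split at random.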

What Kind of Data You Need

Before you can fit any model, you need to collect accurate data covering the following:

  1. Whether a consumer has been exposed to a marketing treatment. Again, this treatment may take any form, from a two-for-one deal to a special offer.
  2. Whether a consumer has bought the item.
  3. Additional information, such as age range, occupation, or gender, that can be useful for the modeling.
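As a rough illustration, the three kinds of information listed above could be stored in one record per customer. The class name, field names, and sample values below are hypothetical; a real dataset will have its own schema.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    treated: bool     # 1. was the customer exposed to the marketing treatment?
    purchased: bool   # 2. did the customer buy the item?
    age_range: str    # 3. additional attributes that may serve as model features
    occupation: str
    gender: str


# Two made-up rows: one from the treated group, one from the control group.
rows = [
    CustomerRecord(True, True, "25-34", "engineer", "F"),
    CustomerRecord(False, False, "35-44", "teacher", "M"),
]
treated_rows = [r for r in rows if r.treated]
control_rows = [r for r in rows if not r.treated]
```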

This data is not easy for a company to obtain. But once you have it, you can use uplift modeling to whip your marketing strategy into shape.

Why Uplift Modeling?

Although predictive analytics is the gold standard for modeling customer churn or the response to an offer, uplift modeling offers a fresh perspective.

Traditionally, marketing specialists define four subgroups in a population:

  • Sleeping dogs – customers who are less likely to buy the product if they receive a marketing intervention. A typical example is a customer who has forgotten about a paid subscription: a reminder prompts them to cancel, whereas without it they keep paying. Writing Wi for the treatment flag (1 if customer i was contacted, 0 otherwise) and Yi for the outcome (1 if the customer took the targeted action, 0 otherwise), such a customer is observed as either Wi=1, Yi=0 or Wi=0, Yi=1.
  • Lost causes – customers who will not take the targeted action regardless of communications. Interaction with such customers does not bring additional income but creates additional costs. In the same notation, they are observed as Wi=1, Yi=0 or Wi=0, Yi=0.
  • Sure things – customers who will take the targeted action whether or not they receive a marketing intervention. As with lost causes, contacting them costs resources without adding income, and the cost is higher still because these loyal customers also redeem the marketing offer (discounts, coupons, etc.). They are observed as Wi=1, Yi=1 or Wi=0, Yi=1.
  • Persuadables – customers who respond positively to an offer but would not take the targeted action without a marketing intervention. These are the people companies want the model to identify. They are observed as Wi=0, Yi=0 or Wi=1, Yi=1.
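The four subgroups can be read as the four combinations of "buys if contacted" and "buys if not contacted". The sketch below makes that explicit. It assumes we could know both outcomes for the same customer, which is never the case in practice, so the second function lists the subgroups that remain possible given a single observed (W, Y) pair. The function names are hypothetical.

```python
def segment(buys_if_treated, buys_if_untreated):
    """Name the subgroup for a pair of (normally unobservable) outcomes."""
    if buys_if_treated and buys_if_untreated:
        return "sure thing"
    if buys_if_treated:
        return "persuadable"
    if buys_if_untreated:
        return "sleeping dog"
    return "lost cause"


def possible_segments(w, y):
    """Subgroups consistent with one observation: treatment flag w, outcome y."""
    result = set()
    for buys_if_treated in (0, 1):
        for buys_if_untreated in (0, 1):
            # Only the outcome matching the customer's actual treatment is observed.
            observed = buys_if_treated if w == 1 else buys_if_untreated
            if observed == y:
                result.add(segment(buys_if_treated, buys_if_untreated))
    return result


# A treated customer who bought is either a sure thing or a persuadable.
print(sorted(possible_segments(1, 1)))  # prints ['persuadable', 'sure thing']
```

Every observation is compatible with two subgroups, which is why persuadables cannot simply be labeled in the data and have to be estimated by comparing treated and control customers.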

Therefore, instead of just predicting the probability of the target action, uplift modeling focuses the advertising budget on customers who will perform the target action only with a marketing intervention.

Also, this approach may help companies identify whether other characteristics like demographics could contribute to the response. 

Among the most popular application areas of uplift modeling are retail and marketing. However, this technique has also found use in fundraising, clinical trials, healthcare treatment, HR, and even political campaigns. 

Traditional Response Vs. Uplift

Traditional response modeling typically takes a group of treated customers and builds a predictive model to separate likely responders from the non-responders. For that, it uses decision trees or regression analysis. 

By contrast, uplift modeling takes groups of both treated and control customers and builds predictive models focused on the incremental response. Uplift modeling also uses scoring techniques to segment customers into groups.

Types of Uplift Models

Differential response (two models)

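This is the simplest way to model uplift. The main idea is to train two separate models, one on the treated group and one on the control group, and predict the response with each. The uplift is then computed by subtracting the control model's prediction from the treated model's prediction (or, on the logit scale, by comparing the coefficients of the two models). Because each model is fitted on its own group, the coefficients a and b generally differ between the two equations:

Logit(Ptest(response | X, treatment = 1)) = a + b*X + g*treatment

Logit(Pcontrol(response | X, treatment = 0)) = a + b*X

Score = Ptest(response | X, treatment = 1) - Pcontrol(response | X, treatment = 0)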
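To make the two-model recipe concrete, here is a minimal, dependency-free Python sketch. It assumes a single binary feature X, and it fits each logistic model with a small hand-written gradient descent routine that stands in for whatever logistic regression implementation you would normally use. The data and all names are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=1.0, steps=3000):
    """Fit logit(P(y=1 | x)) = a + b*x by batch gradient descent; return (a, b)."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(a + b * x) - y  # gradient of the log-loss w.r.t. the logit
            grad_a += err
            grad_b += err * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def predict(model, x):
    a, b = model
    return sigmoid(a + b * x)


# Made-up treated group: x=1 customers respond 8 times out of 10, x=0 customers 2 out of 10.
treated_x = [1] * 10 + [0] * 10
treated_y = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
# Made-up control group: both kinds of customers respond 2 times out of 10.
control_x = [1] * 10 + [0] * 10
control_y = [1] * 2 + [0] * 8 + [1] * 2 + [0] * 8

# One model per group, each with its own coefficients.
model_treated = fit_logistic(treated_x, treated_y)
model_control = fit_logistic(control_x, control_y)


def uplift_score(x):
    """Score = P_test(response | x) - P_control(response | x)."""
    return predict(model_treated, x) - predict(model_control, x)


print(f"x=1: {uplift_score(1):.2f}, x=0: {uplift_score(0):.2f}")
```

In this toy data the x=1 customers are the persuadable ones, so their score comes out close to 0.6, while the score for x=0 customers is close to zero.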

Pros

  • Leverages basic logistic regression modeling approaches
  • Simple to execute and maintain

Cons

  • Does not model the actual target (i.e. the incremental response) directly
  • Incurs modeling error twice, once in each model

Differential response (one model)

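This method builds a single logistic model on the pooled treated and control data, with the treatment flag and its interaction with the features as extra terms:

Logit(P(response | X, treatment)) = a + b*X + g*treatment + l*treatment*X

Score = P(response | X, treatment = 1) - P(response | X, treatment = 0)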
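Once the single model is fitted, the score is just the same formula evaluated twice, with the treatment flag switched on and off. The sketch below skips the fitting step and plugs in hypothetical coefficients, chosen so that the probabilities come out as round numbers; a, b, g and l mirror the symbols in the formula above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def response_probability(x, treatment, a, b, g, l):
    """P(response | x, treatment) from logit = a + b*x + g*treatment + l*treatment*x."""
    return sigmoid(a + b * x + g * treatment + l * treatment * x)


def uplift_score(x, a, b, g, l):
    """Score = P(response | x, treatment=1) - P(response | x, treatment=0)."""
    return (response_probability(x, 1, a, b, g, l)
            - response_probability(x, 0, a, b, g, l))


# Hypothetical coefficients, not fitted to any real data:
# baseline response probability 0.2, and treatment only matters when x = 1.
coef = dict(a=math.log(0.25), b=0.0, g=0.0, l=math.log(16))

for x in (0, 1):
    print(f"x={x}: score = {uplift_score(x, **coef):.2f}")
```

With these values the interaction coefficient l carries the whole effect: the score is 0.60 for x=1 and 0.00 for x=0, which is the kind of treatment-dependent effect the interaction term is meant to capture.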

Pros

  • Leverages basic logistic regression modeling approaches
  • Performs better than the two-model approach
  • Captures effect modification by the treatment through the interaction term

Cons

  • Does not model the actual target (i.e. the lift) directly
  • Increases modeling complexity
  • Requires a trade-off between the statistical significance and the size of the parameter estimates

Random Forest

Random Forest is one of the most popular machine learning algorithms, developed by Leo Breiman and Adele Cutler. The algorithm combines two main ideas: Breiman’s bagging method and the random subspace method proposed by Tin Kam Ho.

Pros

  • It is one of the most accurate learning algorithms.
  • The sampling includes randomness as a key part of the fitting procedure.
  • Adds an extra layer of randomness to reduce the variance of the ensemble.

Cons

  • A large number of trees may slow down the algorithm for real-time prediction.
  • Less reliable with categorical variables, since the algorithm is biased in favor of attributes with more levels.

The Bottom Line

Uplift modeling is no silver bullet and will not produce better results in every case. Sometimes the technique adds nothing, because the uplift model ends up targeting the same customers as a conventional approach. However, in most cases, uplift models built on datasets with the required attributes (like a large control group) are more likely to yield effective results than conventional models.

Learn & Publish © 2021