Segmentation models based on customer data: RFM and eRFM

Build segmentation models in Salesforce Marketing Cloud

Recency, Frequency and Monetary value are the key ingredients of purchase history. They are of great value for marketers. An RFM and/or eRFM model helps you to get to know your customers and engage with them in a more meaningful way. In this blog, we will explain the basic principles of the RFM model and which steps to take to create an eRFM, based on email data. We will also give some use cases to show you how to use the insights from your eRFM model.

What is an RFM model?

The RFM model is based on three values: Recency of purchase, Frequency of purchase and total Monetary value. Customers are scored on a 1-5 scale on each value, 1 being the lowest and 5 being the highest. These value scores should be evaluated independently, as the information you get from all three enables prioritization between the three values. Combining the values leads to personal scores that give you an overview for all three measures in order (R – F – M). With these, you can evaluate your customers based on your business values. For example: 125, 251, 512, 111, 555; the first three (125, 251, 512) all contain the same item scores (1, 2, 5) but on different values. If Monetary value is most valued in your business, the first personal score (125) would be preferred over the other two in this case.

AND AN eRFM MODEL?

A classic RFM model is based on physical or online purchases. However, with today’s email business, loyalty can be also defined within email activity. A simple adjustment to the values of the RFM leads to the eRFM. The eRFM model focuses on email engagement; Recency of last email activity, Frequency of email activity and total Monetary value (conversion). In this case Monetary value is the same for the eRFM and RFM. The eRFM gives you information about customer interaction with your emails. Customers with a high Recency and a high Frequency, but with a low Monetary value might be very interested in your business, but have not yet found the motivation to purchase from your business. Once you know who these customers are, you can address them in targeted campaigns. This can increase their motivation to purchase and would increase their Monetary value.

Let’s take a look at how you set up an eRFM model, based on an actual case; For a retailer in household goods with a large number of physical stores as well as an online store, CloseContact devised a data model based on the RFM method. We used email data to produce an eRFM. This way, we gave them a full overview of their customers’ online behaviour. The following three steps describe the process:

Step 1:

collecting the data in Salesforce Marketing Cloud

When setting up an (e)RFM model, data is your best friend. Finding out which customers are the most loyal in terms of their in-store or online behaviour is the main goal for this model. For this, you need two data sources: your order information and Salesforce Marketing Cloud’s data views. The second source is standard within SFMC, so no extra work is required here. If your order information is also already in Marketing Cloud, great! If it’s not, you have some work to do.

Salesforce Marketing Cloud works with Data Extensions, simple tables that hold data. To create an (e)RFM, you will need the following three fields in your data extension: (1) a unique identifier to map your email subscribers with your order data, (2) purchase date, and (3) purchase amount. (You can add more fields if you need to filter on certain aspects, such as specific product types, purchases by adults only etc.) Make sure you create a Data Extension holding your order information. If you are not familiar with Data Extensions, you can find out more here.

With fresh order data, you can create fresh (e)RFM data. To achieve this, set up an automation with an import activity that refreshes your order data as often as needed. Follow the links to learn to create an automation and an import activity in Salesforce Marketing Cloud.

All email related data for your eRFM model is already available in SFMC. Marketing Cloud uses data views that automatically store all email related statistics. The data views are based on the Subscriber Key. If your order data does not hold the Subscriber Key, you will need to find a relevant way to match the data views with your order data. You can, for example, choose to match on email address or customer number.

The relevant data views for an eRFM are: _Sent, _Open, and _Click. As the names suggest, _Sent holds all sending information, _Open holds all opens, and _Click holds all clicks. These three are easily combined by using the unique identifiers per send that are available in all three views: jobid, listid, batchid, subscriberid.

TIP: We strongly recommend using a deduplication rule for _Open and _Click, as the data within these views are not aggregated. The easiest way to do this is to use the Unique field within both views, which allows you to select the first activity within a send for a customer. For more accuracy, use a query that selects the last activity for each send (through max() for example).

Step 2:

scoring according to Recency, Frequency and Monetary Value

As we explained, every (e)RFM is based on a five point scale. The definition of the scale is based on quantiles. Using quantiles should divide your data in five equal groups. The word “should” is important here as a skewed distribution of scores will not lead to equal groups. Therefore, quantiles are to be used as guidelines, not set rules. Always evaluate your data to ensure that the 5 scales are as equally distributed as possible. Hardcoding certain values instead of using quantiles might be needed.

Quantiles divide data in groups of equal proportion. For example, one quantile will create two groups based on their scores, <50% and >50%. RFM scores are usually based on four quantiles representing the specific scores; < 20%, 20-40%, 40-60%, 60-80%, >80%. These quantiles can be translated to scores, if your measure ranges from 1-10, the scores associated with the quantities could be: 1-2, 3-6, 7, 8, 9-10. Using the quantities, this means that 20% of your customers score a 7. Also, 20% of your customers have a score of 3-6.

To translate this to RFM scores, this means that records with a measure of 1-2 might get a R, F or M score of 1 or 5, depending if a high score is good or a low score is good. For (e) Recency, a low score is better, whereas a high score is better for Frequency or Monetary value. However, let’s say your business is not very successful yet. You have recently opened and your most loyal customer has ordered 10 times. but the other customers follow the frequency table below:

#Orders 1 2 3 4 5 6 7 8 9 10
#Customers 72 19 3 1 1 0 2 0 1 1

If we were to calculate the quantiles of these 100 customers, the first, second, third and fourth quantile fall within 1 order. To clarify: 72% of the customers fall within 1 order. This also means that <20% of the customers fall within a “score” of 1, 20-40% of the customers fall within a “score” of 2 etc. This is the exact reason why quantiles should be used as an indication only. Especially for email related data, real-life data is rather skewed. Quantiles rely on data having a normal distribution to be able to create realistic, equal groups. If you do not have this, you (e)RFM results will not represent your data in a way an (e)RFM could.

The first step for a workable RFM therefore, is exploring your data. You need to know how your data is distributed within the individual (e)RFM parameters. With this information, you can define the (e)RFM thresholds for your data. Try to create thresholds that can divide your data in groups of relatively equal size. For the table in example above, we recommend to divide the Frequency scores as follows: 1 order = 1, 2 orders = 2, 3 orders = 3, 4-7 orders = 4, and >7 orders = 5.

Also, we recommend to re-evaluate the score after more data is accumulated. An advantage of using quantiles is flexibility: they will change if your data naturally changes. Using static values instead of quantiles will not give you that flexibility. Take this into account and if you experience big changes within your data, re-evaluate your (e)RFM thresholds. After deciding the thresholds, calculate the (e)RFM scores for each of your customers and take a good look at your data. If you were to map this, you might get something as illustrated below. As the figure suggests, records that have a Monetary value of 1 do not have a Frequency value above 2. In most contexts, this makes sense. Buying more products would increase your monetary value, especially for a large retailer in household products.

heatmap of eRFM segmentation model

A heatmap of RFM scores. The colour of the block indicates how many records are in that segment

An overview like this gives you a lot of information about how your customers purchase your products. The amount of customers in different segments also gives an insight. In the infographic above, the largest number is in the 111 group. A little over 5% of customers are in the lowest Recency, lowest Frequency and lowest Monetary value.

Step 3:

Segmentation

To create the (e)RFM scores, you have two options:

  1. Calculate the (e)RFM scores within your own database and import them into Marketing Cloud. This is recommended if you collect all information in your external database and do not have this information within Marketing Cloud (yet). You can decide to keep all data where you have it anyway. Different programing languages can be used. Here, the focus will be on SQL.
  2. Use query activities and Automation Studio to automatically calculate the (e)RFM scores within Marketing Cloud.

Either option is fine. To calculate the (e)RFM scores, use the PERCENTILE_DISC() function. This calculates the score that represents the specified percentile.

Calculating the (e)RFM scores every day might be excessive. For existing records (e)RFM scores will not vary much on a day-to-day basis, so running daily queries will only yield new customers. New customers  are usually part of a lower (e)RFM segment and might need some time before they are placed in a segment-increasing-campagne. Therefore, a weekly update should be enough.

TIP: The calculation for RFM segments might work in one query, but for eRFM segments, the query should be split up. Queries using data views are prone to time-outs. Splitting up your query prevents this and helps you to keep a clear overview of every step in the process.

One way to do this, is to first query all the relevant records (1), next calculating the Recency of Order/Email activity (2), Frequency of Order/Email activity (3), and Monetary value (4). Lastly calculate the (e)RFM score (5). The sequence of these 5 queries can easily be put into an automation.

For this you need a Data Extension with a Primary Key, for example Subscriberkey. That way, you will be able to find the (e)RFM score for each individual subscriber you are emailing. Because your Data Extension has a Primary Key, you can also easily update the Data Extension through your queries.

use cases

(e)RFM scores will give you a lot of information about your customers and calculating these scores over time gives you the ability to see changes that occur within your customer base. If you notice that a certain group of records experience a change in their (e)RFM score after being in a specific current campaign, this tells you much about the effectiveness of your campaign. Another way you can use these insights is for creating targeted campaigns to make your less engaged customers more engaged and keep your most engaged customers engaged.

Records with high eRFM scores, but low RFM scores are interested in your products, but not buying for now. You’ll need to find a way to motivate those customers to convert. Create campaigns that encourage low Monetary, but high e-Frequency customers to buy your products. For the retailer we worked with, we had access to online purchases only, not in-store ones. After our analysis, they decided to market in-store discounts for those low RFM/high eRFM records.

When creating campaigns based on (e)RFM and RFM models, it may be a good idea to exclude records with both low RFM and eRFM from commercial mailings. These records appear to be uninterested in your products and emailing them might be ineffective. Most importantly: do not forget your high eRFM/high RFM records. These are your most loyal customers.

An (e)RFM model will help you get to know your customers and communicate with them in a way that fits their needs and their expectations of you. With this knowledge, you can create more effective marketing campagnes and personalise your conversations to a higher level.

 

Need our help or advice with (e)RFM modelling? Feel free to contact us!