Bayesian Causal Analysis – an Effective Method For Measuring the Effectiveness of Marketing Campaigns

We send up to a dozen different marketing messages to our customers each week. The same audience often takes part in many different campaigns. We use various criteria to qualify consumers for mailings, and different campaigns usually compete for the same “best” customers. We can’t always use control groups, and sometimes a control group receives the mailing by mistake. In the end, we ask ourselves: which of these actions had a real positive effect on the metrics that matter to us? Doesn’t this sound like an everyday occurrence in many companies?

Traditional approaches, such as A/B testing or simple statistical models, break down in more complex situations where campaigns overlap, confounding variables are at play, and customer behavior varies. Bayesian causal inference offers a flexible alternative that addresses these challenges. Below, we look at how the model works, its key advantages, and the limitations to consider.

Challenges faced by marketers

Simple methods are effective only under ideal laboratory conditions, and the reality of the market is far from those. Several challenges make it difficult to assess the effectiveness of campaigns accurately, among them:

  • Campaign overlap – customers often receive several marketing messages in a short period of time, making it difficult to attribute sales to a specific campaign.

  • Criteria for assigning customers to campaigns – the selection of customers for a campaign is not random; higher-value customers may receive promotions more often, which can distort analysis results.

  • Customer diversity – customer responses to campaigns vary with preferences, demographics and buying patterns. Overlooking these differences leads to results that do not reflect the true effectiveness of the campaign.

Bayesian causal inference addresses these challenges by taking into account varying customer behavior, overlapping campaigns, and the impact of customer selection on campaign outcomes.

What is Bayesian causal analysis?

Bayesian causal analysis is a statistical method for assessing the real impact of marketing campaigns on sales or on other metrics that matter to a marketer (e.g., loyalty). It does so by building a model that describes how various factors jointly influence consumer behavior. The model incorporates the marketer’s prior knowledge (e.g., gleaned from analysis of previous campaigns), updates it based on the available data, and then generates probability distributions for campaign effects and customer behavior.

Key Components of the Model

In a Bayesian model for causal analysis of marketing campaigns, three elements are key:

Baseline Purchase Propensity – Regardless of the campaign, every customer has a natural propensity to make their next purchase. This depends, for example, on their lifecycle stage, current attitude toward the brand, or financial situation. Properly modeling the baseline spending level, taking into account demographic, behavioral, and transactional characteristics, helps isolate the actual incremental impact of the campaign.

Campaign Impact – Each campaign included in the analysis has its probability distribution of impact on sales modeled separately. This allows an assessment of how much the campaign increases sales above the baseline spending level.

Hierarchical Structure – To ensure scalability and improve the interpretability of results, customers can be grouped into segments, such as by demographics or behavioral traits, with shared parameters for each group.
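
One way to write down a model with these three components is sketched below. The notation is an assumption made for illustration only, since the article does not give a formal specification: y_i is the spend of customer i in the analysis window, x_i their demographic, behavioral and transactional characteristics, T_ik indicates whether customer i received campaign k, and s(i) is the segment the customer belongs to.

```latex
\begin{align*}
y_i &\sim \mathcal{N}(\mu_i,\ \sigma^2) \\
\mu_i &= \underbrace{\alpha_{s(i)} + x_i^\top \beta}_{\text{baseline purchase propensity}}
       + \underbrace{\sum_{k=1}^{K} \gamma_{k,s(i)}\, T_{ik}}_{\text{campaign impact}} \\
\alpha_s &\sim \mathcal{N}(\bar{\alpha},\ \tau_\alpha^2), \qquad
\gamma_{k,s} \sim \mathcal{N}(\bar{\gamma}_k,\ \tau_k^2)
\end{align*}
```

The last line is the hierarchical part: segment-level parameters are drawn from shared distributions, so small segments borrow strength from the rest of the customer base. The priors on the shared means and scales are where knowledge from previous campaigns could enter.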

Advantages of Bayesian Causal Analysis in Marketing

  • Attribution Accuracy – modeling separate parameters for campaign impact and baseline customer behavior allows overlapping campaign effects to be disentangled, resulting in more precise attribution of sales to specific activities.
  • Resilience to Non-Random Campaign Selection – the model accounts for the impact of the selection criteria, giving it an edge over traditional methods. For example, wealthier customers might be targeted more frequently by certain campaigns. Without accounting for their natural tendency to spend more, an analysis would overestimate the actual campaign effect.
  • Accounting for Customer Heterogeneity – the Bayesian method captures the diversity in customer behaviors and characteristics. The model reflects how these factors influence both baseline purchase propensity and the effectiveness of the campaign itself (as the same campaign may affect different customer groups in different ways).
  • Precise Estimation of Uncertainty – the Bayesian method provides a full probability distribution of outcomes, offering better insight into the reliability of the estimates. This reduces the risk of drawing incorrect conclusions from the analysis.
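
The point about non-random selection can be illustrated with a small simulation. This is a sketch under stated assumptions, not the model described in this article: all numbers are made up, `wealth` stands in for whatever drives both targeting and spend, and a plain regression adjustment is used as a simple stand-in for the baseline component of a Bayesian model.

```python
import math
import random

random.seed(42)

TRUE_EFFECT = 10.0  # assumed uplift in spend caused by the campaign
N = 20_000          # simulated customers

wealth, treated, spend = [], [], []
for _ in range(N):
    w = random.gauss(0.0, 1.0)
    # Non-random selection: wealthier customers are targeted more often.
    p_target = 1.0 / (1.0 + math.exp(-2.0 * w))
    t = 1.0 if random.random() < p_target else 0.0
    # Spend = baseline driven by wealth + true campaign effect + noise.
    y = 100.0 + 30.0 * w + TRUE_EFFECT * t + random.gauss(0.0, 10.0)
    wealth.append(w)
    treated.append(t)
    spend.append(y)


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """OLS slope of y on x (intercept included)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Naive estimate: average spend of targeted minus non-targeted customers.
naive = (mean([y for y, t in zip(spend, treated) if t == 1.0])
         - mean([y for y, t in zip(spend, treated) if t == 0.0]))

# Adjusted estimate: strip out the part of targeting explained by wealth,
# then regress spend on what remains (Frisch-Waugh-Lovell).
b_tw = slope(wealth, treated)
a_tw = mean(treated) - b_tw * mean(wealth)
treated_resid = [t - (a_tw + b_tw * w) for t, w in zip(treated, wealth)]
adjusted = slope(treated_resid, spend)

print(f"true effect: {TRUE_EFFECT:.1f}")
print(f"naive:       {naive:.1f}")
print(f"adjusted:    {adjusted:.1f}")
```

In this toy setup the naive comparison credits the campaign with spend that wealthy customers would have generated anyway, while the adjusted estimate lands close to the true effect.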

Limitations of the Bayesian Method

Like any methodology, Bayesian causal analysis has its limitations and challenges:

  • Sensitivity to Initial Assumptions – with limited data, the method is quite sensitive to initial assumptions about the campaign’s presumed effectiveness. However, as more data becomes available, the influence of these prior assumptions diminishes.

  • Analytical Expertise – while the results are straightforward to interpret and accessible to those without specialized data science knowledge, conducting the analysis requires advanced statistical, analytical, and programming skills.

  • Technical Requirements – the method is computationally intensive, and analyzing large consumer databases with numerous campaigns demands advanced computational infrastructure. Properly adapting the model to available data and computational capacity can mitigate risks and challenges. This highlights the importance of skilled personnel conducting the analyses for the marketing department.
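
The first limitation above (sensitivity to initial assumptions) can be made concrete with the simplest Bayesian update there is, a normal prior combined with normally distributed data. This is a minimal sketch: the observed uplift, the noise level and both priors are hypothetical, and a real campaign model would have many more parameters.

```python
from statistics import NormalDist


def posterior(prior_mean, prior_sd, sample_mean, noise_sd, n):
    """Normal-normal conjugate update for a mean with known noise_sd."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return NormalDist(post_mean, post_var ** 0.5)


# Hypothetical numbers: observed average uplift of 2.0 per customer,
# customer-level noise sd of 20, and two analysts with different priors.
OBSERVED_UPLIFT, NOISE_SD = 2.0, 20.0
priors = {"optimist": (8.0, 3.0), "sceptic": (0.0, 3.0)}

for n in (20, 20_000):
    for name, (prior_mean, prior_sd) in priors.items():
        post = posterior(prior_mean, prior_sd, OBSERVED_UPLIFT, NOISE_SD, n)
        p_positive = 1.0 - post.cdf(0.0)
        print(f"n={n:>6} {name:<8} mean={post.mean:5.2f} "
              f"sd={post.stdev:4.2f} P(effect>0)={p_positive:.3f}")
```

With 20 customers the two analysts reach clearly different conclusions; with 20,000 their posteriors are almost identical, which is the sense in which the influence of the prior diminishes as data accumulates.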

Conclusion

Bayesian causal analysis provides an accurate and scalable approach to evaluating the impact of marketing campaigns on key marketing metrics (e.g., sales, loyalty). The model accounts for customer heterogeneity, selection criteria impacts, and overlapping campaign effects, offering more precise results than traditional methods. While there are limitations, such as dependence on initial assumptions and computational complexity, the method delivers valuable insights that support data-driven decision-making. This enables marketing teams to better allocate budgets, tailor campaign strategies, and optimize customer communication, ultimately enhancing marketing spend efficiency.

Profit immediately or loyalty for years? Machine learning in short- and long-term customer value management

One of the key challenges in customer relationship management is balancing short- and long-term Customer Lifetime Value (CLV) perspectives. Is it better to accept a lower profit now for the potential of a higher, though uncertain, future gain? Should companies invest in customers today with the hope of reaping greater rewards in the future, and if so, how much? These are just some of the dilemmas marketers face. Traditionally, there is a strong temptation to focus on quick profits at the expense of building long-term relationships with customers. However, in today’s highly competitive marketplace, strategically combining short-term gains with long-term customer value is becoming crucial for a company’s survival and growth. Machine learning offers a unique solution to help balance these two perspectives.

Short-term vs long-term customer value

Short-term customer value refers to the immediate revenue generated from a single transaction or a brief customer interaction. Examples of activities aimed at maximizing short-term value include short promotional campaigns, aggressive sales strategies, or rapid remarketing efforts. The primary goal here is to generate quick profit, often through discounts, price cuts, or “buy now” offers designed to encourage immediate purchases.

In contrast, long-term customer value reflects the total worth a customer can bring to a company over several years. In this case, it’s not just about how much a customer spends in a single transaction, but also their loyalty to the brand, frequency of return, recommendations, and influence on other customers. Strategies focusing on long-term value aim to build lasting relationships, engage customers, and create experiences that foster satisfaction and loyalty.

Machine learning (ML) and maximizing customer value

Machine learning (ML) is revolutionizing many aspects of how companies manage customer relationships. Advanced predictive models can forecast a customer’s potential value over a specified time period, taking into account not only past and present purchases but also a customer’s interactions with the brand, their engagement in marketing campaigns, and many other variables.

With accurate predictions, companies can tailor their strategies to target specific customer segments. For instance, they might invest more in customers with high long-term potential or focus on maximizing the short-term value of customers less likely to remain loyal. The matrix below illustrates different approaches for customers based on their estimated value to the company.

The key to successfully implementing this approach lies in the precise prediction of customer value across both timeframes. Achieving this level of accuracy is nearly impossible without leveraging machine learning, especially when dealing with large consumer bases.
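
As a rough illustration of a two-horizon value matrix, the sketch below places customers in one of four cells based on predicted short- and long-term value. The customer ids, predicted values and cut-offs are all hypothetical; in practice the predictions would come from trained models and the cut-offs from the business context.

```python
# Hypothetical model outputs: predicted value per customer for two horizons.
predictions = {
    "c1": {"short_term": 120.0, "long_term": 900.0},
    "c2": {"short_term": 150.0, "long_term": 200.0},
    "c3": {"short_term": 20.0, "long_term": 1100.0},
    "c4": {"short_term": 15.0, "long_term": 150.0},
}

# Assumed cut-offs separating "high" from "low" predicted value.
SHORT_CUTOFF, LONG_CUTOFF = 100.0, 500.0


def quadrant(pred):
    """Place a customer in the 2x2 short-term / long-term value matrix."""
    short = "high" if pred["short_term"] >= SHORT_CUTOFF else "low"
    long_ = "high" if pred["long_term"] >= LONG_CUTOFF else "low"
    return (short, long_)


matrix = {cid: quadrant(pred) for cid, pred in predictions.items()}
for cid, (short, long_) in matrix.items():
    print(f"{cid}: short-term {short}, long-term {long_}")
```

Each cell can then be given its own treatment, for example more investment in customers with high long-term potential.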

Machine learning capabilities

While this example divides customer value into two categories—short-term and long-term—machine learning allows for much more granular segmentation. Predictive models can be developed for any number of time horizons, tailored to the specific needs of a consumer base or industry. Moreover, machine learning is not limited to simplistic high/low classifications. In practice, it allows for far more nuanced predictions.

It is also possible to estimate how qualifying a particular customer for a particular campaign will affect their short- and long-term value. In the example below, campaign A411 is very likely to increase the short-term value of customer 1411642: the vast majority of the distribution lies to the right of the red vertical line marking 0 (no change). However, the model estimates that the campaign’s impact on the customer’s long-term value will be rather negative: the vast majority of the distribution lies below the red horizontal line marking 0 (no change) in the long term.
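
Reading "the vast majority of the area" off a chart amounts to computing the share of the estimated distribution that lies above zero. The sketch below shows that calculation on simulated draws; the draws are hypothetical stand-ins generated from normal distributions, whereas a real analysis would take them from the fitted model for a given customer and campaign.

```python
import random

random.seed(7)

# Hypothetical draws for the effect of one campaign on one customer.
short_term_draws = [random.gauss(5.0, 2.0) for _ in range(4000)]
long_term_draws = [random.gauss(-3.0, 2.0) for _ in range(4000)]


def prob_positive(draws):
    """Share of draws above zero, i.e. an estimate of P(effect > 0)."""
    return sum(d > 0 for d in draws) / len(draws)


def credible_interval(draws, level=0.9):
    """Approximate equal-tailed interval read off the sorted draws."""
    ordered = sorted(draws)
    lo = ordered[int((1 - level) / 2 * len(ordered))]
    hi = ordered[int((1 + level) / 2 * len(ordered)) - 1]
    return lo, hi


p_short = prob_positive(short_term_draws)
p_long = prob_positive(long_term_draws)
print(f"P(short-term effect > 0) = {p_short:.2f}")
print(f"P(long-term effect > 0)  = {p_long:.2f}")
print("90% interval, long term:", credible_interval(long_term_draws))
```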

Summary

Balancing short-term and long-term customer value remains one of the greatest challenges for marketers. Thanks to machine learning, companies no longer have to choose between quick profits and long-term customer loyalty – they can pursue both simultaneously.

In practice, this means algorithms can predict which customers will bring the most long-term value, helping companies focus on relationship-building with them, while optimizing quick sales campaigns for less loyal customers. Machine learning helps prevent “marketing myopia,” where companies focus solely on short-term gains at the expense of future loyalty.

The future of marketing will rely on even more sophisticated predictive, optimization, and automation models, enabling businesses to fully maximize the potential of each customer – both in the short and long term.

Why am I not getting the expected benefits from using AI in marketing?

Interest in artificial intelligence (AI) is not waning. Almost every day brings announcements of further breakthroughs in the field. The public is most excited about the achievements of so-called generative AI, which are the most spectacular and impressive. Talking to a computer in natural language, or a computer painting pictures or creating a movie from a script, appeals to the imagination. However, it is worth remembering that AI also includes more mundane models that operate in the background, without attracting as much attention, and play an important role in many business processes, including marketing. Sometimes, however, these models do not deliver the expected benefits and their performance is disappointing. What mistakes contribute to such situations, and what can be done to avoid them?

Problem worded incorrectly

Sometimes a fundamental problem arises at the very beginning of an AI project. It is a clear disconnect between the actual business need and the definition of the problem the AI team sets out to solve. For example, a marketing team has a goal of reducing the number of departing customers. It intends to use a special action involving coupons with attractive discounts to do so. Naturally, the budget for this activity is limited. As a first step, the team wants to identify the customers most at risk of churn.

So it commissions the AI team to develop a model that estimates the probability of leaving for each customer. The AI team does its job brilliantly and builds a model with very high prediction accuracy. The marketing department decides to use the model and qualifies the customers with the highest probability of leaving for the action until the budget is exhausted. The action goes ahead. Quite a few at-risk consumers stay. Everyone has the feeling of a job well done and a budget reasonably used. But was the budget really used optimally? Could something have been done better? It turns out that it could.

Instead of the probability of leaving, one could predict the chance of a positive reaction to the action. The difference seems minor, but it could yield dramatically better results. The variant used qualified those most at risk. Among them, however, were people who could not be persuaded to stay by any action. These are very often precisely the people at the top of the at-risk list: frustrated with customer service, disappointed with the quality of the product, already looking for an alternative supplier for some time. Because the budget was spent on these consumers, consumers with slightly lower (but still high) risk were left out, even though they were more likely to change their decision thanks to the action. Part of the budget was wasted on trying to convince those who could not be convinced, while the opportunity to convince those whose decision could still be influenced was missed. A more precise definition of the problem in the context of the expected business effect would have made it possible to benefit much more from the opportunities offered by AI algorithms.
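
The difference between the two problem definitions can be shown with a toy calculation. The six customers and their scores below are entirely hypothetical: `churn_risk` is the probability of leaving, and `uplift` is the assumed reduction in that probability if the customer receives a coupon (the "chance of a positive reaction").

```python
# Hypothetical scores: (customer, churn_risk, uplift if given a coupon).
customers = [
    ("A", 0.95, 0.01),
    ("B", 0.90, 0.02),
    ("C", 0.80, 0.05),
    ("D", 0.70, 0.30),
    ("E", 0.65, 0.25),
    ("F", 0.60, 0.35),
]
BUDGET = 3  # number of coupons we can afford


def expected_saved(selected):
    """Expected number of customers retained thanks to the coupons."""
    return sum(uplift for _, _, uplift in selected)


# Strategy 1: target the customers most likely to leave.
by_risk = sorted(customers, key=lambda c: c[1], reverse=True)[:BUDGET]
# Strategy 2: target the customers most likely to be persuaded.
by_uplift = sorted(customers, key=lambda c: c[2], reverse=True)[:BUDGET]

print("target by churn risk:", [c[0] for c in by_risk],
      f"-> {expected_saved(by_risk):.2f} customers saved")
print("target by uplift:    ", [c[0] for c in by_uplift],
      f"-> {expected_saved(by_uplift):.2f} customers saved")
```

With the same budget, ranking by risk spends all three coupons on customers who are nearly impossible to persuade, while ranking by uplift reaches those whose decision can still be changed.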

Inappropriate measures of success

In many situations, adopting the wrong indicators can lead to wrong decisions. Sometimes it also leads to giving up tools that work. One thriving Polish company had a custom-built AI predictive system that personalized the offer recommended in each mailing. The company had a very broad product portfolio and many competing offers. The system was designed to select a communication that was relevant to the consumer and at the same time maximized the possible profit. It was also meant to avoid bombarding the consumer with too many messages, the main concern being that “spammed” customers would opt out of receiving mailings. The company boasted that its “unsubscribe” rate remained very low. Under intense pressure to deliver sales, however, managers began to see the limit on the number of messages as an obstacle to achieving their goals. They assumed that sending more messages would bring more sales. Their only concern was a possible increase in the unsubscribe rate.

A quasi-experiment was conducted in which the number of messages was increased while sales and unsubscribe rates were observed. Sales increased, and no increase in the unsubscribe rate was observed. This encouraged further increases in the number of messages, until maintenance of the aforementioned AI tool was abandoned altogether. The company thus took a step backward: the model was discarded in favor of “expert” qualification of consumers for communications. The stable unsubscribe rate kept decision-makers convinced that the number of mailings, as long as it remained within what they called “the limits of common sense,” did not drive customers to unsubscribe. The voices of the data science team, which tried to convince them to take a broader view of the problem, were ignored. A schoolboy mistake was made.

Managers ignored the apparent downward trend in the open rate. The advanced model was considered an unnecessary cost and discarded. They failed to consider that consumers may become so saturated that they start ignoring messages from the sender. They stop opening them and, as a result, never bother to click the “unsubscribe me” link. The possibility of maintaining a long-term relationship, and of generating profits from the communication in the future, was sacrificed for a short-term sales effect.
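
A simple check that tracks both indicators side by side would have surfaced the problem. The sketch below uses invented monthly figures and an assumed rule of thumb for the warning; it is meant only to show the pattern of a flat unsubscribe rate hiding a falling open rate, not to reproduce the company’s data.

```python
# Hypothetical monthly mailing statistics:
# (month, messages sent, messages opened, unsubscribes).
months = [
    ("Jan", 100_000, 30_000, 200),
    ("Feb", 150_000, 39_000, 300),
    ("Mar", 200_000, 44_000, 400),
    ("Apr", 250_000, 45_000, 500),
]

report = []
for month, sent, opened, unsubscribed in months:
    open_rate = opened / sent
    unsub_rate = unsubscribed / sent
    report.append((month, open_rate, unsub_rate))
    print(f"{month}: open rate {open_rate:.1%}, "
          f"unsubscribe rate {unsub_rate:.2%}")

# Assumed rule of thumb: warn when the open rate has dropped by more than
# 20% (relative) while the unsubscribe rate has barely moved.
first, last = report[0], report[-1]
open_rate_falling = last[1] < 0.8 * first[1]
unsub_rate_stable = abs(last[2] - first[2]) < 0.0005
fatigue_warning = open_rate_falling and unsub_rate_stable
print("possible message fatigue:", fatigue_warning)
```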

Mistakes in communication between marketing and AI teams

The common denominator of the two situations described above is the lack of adequate communication between the marketing team and the AI team. In the first case, the AI team could have been given more information about the business objective and context of the project (including budget constraints). This would have allowed a more adequate definition of the problem and fuller use of the possibilities offered by advanced modeling, which in turn would have translated into better budget utilization and higher ROI. In the second case, more weight should have been given to the concerns raised by the AI team about the definition of the problem and the measure adopted. This would have avoided the wrong decision, costly in the long run, to return to old methods and reject the potential of AI.

The success of the project and the full realization of the AI opportunity require good communication and interaction between marketing experts and AI experts. Avoiding the following mistakes can help achieve this:

  • a business objective that is defined too broadly (“we want to reduce the number of departing customers” lacks several levels of detail),

  • vague definitions of fundamental concepts (sometimes it is a challenge to define what it means that a customer has left),

  • failure to define the context and actual business objective in the brief given to the AI team,

  • concealment by the marketing team from the AI team of deficiencies in understanding the specifics and capabilities of AI solutions,

  • related excessive expectations of the project’s results, or recognition in advance that AI cannot help solve a given marketing problem,

  • the AI team’s concealment from the marketing team of deficiencies in understanding of marketing issues and the project context,

  • the related limitation of the AI team to a literal interpretation of the brief provided,

  • the use of industry “newspeak”,

  • excessive focus on technicalities, causing the AI team to lose the business perspective and sight of the real purpose of the project.

Summary

The examples cited in the article are just the tip of the iceberg. Some may see both situations as simple, even schoolboy mistakes. That’s fine. It means that they are already at a higher level of understanding of the specifics of working with AI projects. However, there are pitfalls lurking there as well. For others, even these two cited examples may be eye-opening, make them reflect and look for similar problems in their own projects. That’s a good thing, too. It means they are taking another important step on the road to more fully realizing the potential that lies in marketing applications of AI. In any situation, it’s important to remember that good communication and cooperation between the data science/AI team and the marketing team is needed to apply AI successfully.

How to get out of the RFM trap

(R)ecency, (F)requency, (M)onetary value is a classic of marketing analysis. We all know it. Many of us use it. Each of us understands it. But do we really? RFM analysis undoubtedly has many advantages, which is why it has been in use for many years. However, it is also worth learning about its disadvantages and limitations, in order to use it properly and not try to solve problems with it that it cannot solve.

RFM in practice

The RFM approach is a customer segmentation method that is based on three main indicators:

  • Recency (last purchase): how long ago the customer last made a purchase.
  • Frequency (frequency of purchase): how often the customer makes purchases.
  • Monetary (value of purchases): how much the customer spends per purchase.

Assuming that we have data on customer transactions, we calculate the value of the three indicators (R, F, M) for each customer. Then we divide the values of each indicator into groups (for example, 3), where group I is the top 1/3 of customers in terms of a given indicator, group II is the middle 1/3, and group III is the weakest 1/3. For better understanding, let’s consider the example of a particular customer. This customer last made a purchase 15 days ago, which is quite recent. He buys on average 2 times a month, which is a high frequency for this consumer base. His average receipt, however, is a mere £50, which is a very low value compared to other customers.

Our example customer is therefore in the top group in terms of R and F and in the weakest group in terms of M, so he can be labeled R-1, F-1, M-3. With this division, a consumer can end up in 1 of 27 segments (3R x 3F x 3M = 27). Each indicator can also be divided into more than 3 groups; it all depends on how many RFM segments you want to get. Dividing into 3 gives 27 segments, dividing into 4 gives 64, and dividing into 5 gives 125. Dividing into 6 gives 216 segments, and dividing the range of each of the 3 variables into 10 groups brings the total number of segments to 1,000.
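
The procedure described above can be sketched in a few lines. The six customers, the reference date and the helper `groups` are hypothetical; customer `c1` is set up to mirror the example (last purchase 15 days ago, 2 purchases a month, average receipt of £50).

```python
from datetime import date

TODAY = date(2024, 6, 30)  # assumed analysis date

# Hypothetical customers: (id, last purchase date, purchases per month,
# average receipt value in pounds).
customers = [
    ("c1", date(2024, 6, 15), 2.0, 50.0),
    ("c2", date(2024, 6, 1), 0.5, 400.0),
    ("c3", date(2024, 3, 10), 1.0, 220.0),
    ("c4", date(2024, 1, 5), 0.2, 300.0),
    ("c5", date(2024, 6, 20), 1.5, 90.0),
    ("c6", date(2023, 11, 2), 0.1, 150.0),
]


def groups(values, n_groups=3, higher_is_better=True):
    """Rank values and cut the ranking into n_groups equal parts (1 = best)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank * n_groups // len(values) + 1
    return result


recency_days = [(TODAY - last).days for _, last, _, _ in customers]
r = groups(recency_days, higher_is_better=False)  # fewer days is better
f = groups([c[2] for c in customers])
m = groups([c[3] for c in customers])

segments = {c[0]: f"R-{ri}, F-{fi}, M-{mi}"
            for c, ri, fi, mi in zip(customers, r, f, m)}
print(segments["c1"])

# Number of possible segments when each indicator is cut into k groups.
segment_counts = {k: k ** 3 for k in (3, 4, 5, 6, 10)}
print(segment_counts)
```

Ties and unevenly sized groups are ignored here; a production version would need an explicit rule for both.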

What advantages does RFM have?

Among the advantages of the RFM approach we can point out:

  • Simplicity and ease of interpretation – this is due to the small number of variables (three), the simple to explain manner in which the analysis is carried out and, consequently, the intuitive interpretation of the resulting segments.
  • Relative speed of conducting the analysis.
  • No need for specialized software.
  • Relatively low requirements as to knowledge of statistical methods.

What does the simplicity of RFM analysis lead to?

Unfortunately, simplicity, listed above among the advantages, is also the source of one of the most significant drawbacks of the RFM approach. Limiting the number of dimensions to 3 (R, F, M) makes it easier to interpret the results, but at the same time narrows the resulting consumer profile. Summarizing a consumer’s transactional activity using 3 numbers is often an oversimplification. It can give a very distorted (or even false) picture of the customer. The following example illustrates this.

The diagram shows four customers with noticeably different buying patterns, but identical characteristics in terms of last purchase, frequency of purchase and seniority (first purchase).

The same problem applies to the value of spending. A customer who always spends £50 and a customer who spends £1 on one visit and £99 on the next will have the same average receipt value.
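
The two customers from this example can be compared directly. The sketch below computes the average receipt, which is what the M indicator sees, alongside the spread of receipts, which it does not; the variable names are illustrative.

```python
from statistics import mean, pstdev

steady = [50, 50]    # always spends 50
volatile = [1, 99]   # spends 1 on one visit and 99 on the next

for name, receipts in (("steady", steady), ("volatile", volatile)):
    print(f"{name:<8} average receipt = {mean(receipts):.0f}, "
          f"spread (std dev) = {pstdev(receipts):.0f}")
```

Both customers get the same M value even though their buying patterns have nothing in common.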

Another limitation of the RFM method is that it ignores non-transactional aspects of consumer behavior, such as interactions with marketing communications directed at the consumer, contact with the company through its various touchpoints (e.g., complaints, calls to the call center), or demographic factors (e.g., in some industries, frequency and spending may change with age).

Among the limitations of RFM, the focus on history is also worth mentioning. RFM summarizes consumer buying behavior from a certain point in time to a certain point in time. It helps to segment customers based on their past transactions. However, RFM says nothing about their future behavior. By itself, it has no predictive power.

There are many variations and extensions of the RFM model. New variables are added to it (e.g. RFD, RFE, RFM-I) or the way they are calculated is modified. The goal is to overcome the limitations mentioned earlier. However, they do not change the fundamental problem: the attempt to describe complex consumer behavior with a few aggregated numbers.

What is the RFM trap?

Thus, RFM can become a trap primarily when:

  • it is the main (or even the only) tool we use to analyze and plan communication strategies and activities.
  • RFM results are overinterpreted, i.e., conclusions are drawn from them for which the method provides no basis.

In the first case, we omit an important part of the consumer data that could be used, we work on the basis of a very narrow picture of the consumer, and we risk combining within a single segment consumers with completely different behavior patterns.

The second case is primarily about using RFM to predict future consumer behavior, especially at the micro level. This can manifest itself, among other things, in the assumption that the consumer who achieves the highest values of the RFM indicators will remain in the best “segment” in the future. This, of course, may be true. However, it is determined not solely by past purchases but by many other factors that RFM, by design, does not take into account.

What are the alternatives to RFM?

As we noted earlier, the main limitations of using the RFM approach boil down to:

  • An oversimplified view of the consumer and thus a high risk of segmentation error.
  • Lack of predictive capabilities for consumer behavior.

Multidimensional segmentations based on machine learning are excellent for solving the first problem. In their case, the number of factors taken into account can be almost unlimited. Segments are defined algorithmically. Consumers are also assigned to them in the same way. The cost of greater detail is slightly more difficult interpretation. However, an experienced analyst can visualize segments in such a way that they are easily understood by managers.
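
To give a feel for algorithmically defined segments, here is a minimal k-means sketch in plain Python. K-means is used only as a familiar stand-in, since the article does not name a specific algorithm; the customers and their three pre-scaled features are hypothetical, and a real project would use many more features and a tested library implementation.

```python
import math

# Hypothetical, already scaled (0-1) features per customer:
# (purchase frequency, average receipt, share of emails opened).
customers = {
    "c1": (0.90, 0.10, 0.80),
    "c2": (0.80, 0.20, 0.90),
    "c3": (0.85, 0.15, 0.70),
    "c4": (0.10, 0.90, 0.20),
    "c5": (0.20, 0.80, 0.10),
    "c6": (0.15, 0.85, 0.30),
}


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iterations=20):
    # Deterministic start: the first point, then repeatedly the point
    # farthest from the centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c)
                                                     for c in centers)))
    for _ in range(iterations):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(dim) / len(members)
                                   for dim in zip(*members))
    return assign(points, centers), centers


points = list(customers.values())
labels, centers = kmeans(points, k=2)
for cid, label in zip(customers, labels):
    print(f"{cid}: segment {label}")
```

Unlike RFM, nothing here is limited to three indicators: adding a feature means adding one more number to each tuple.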

The prediction problem is best solved with dedicated methods and algorithms for building predictive models. Such models are not limited to the analysis of historical data; they recognize specific patterns of behavior that allow predictions about the future. Such predictions can be made at the level of a single consumer (rather than a segment) and allow full personalization of actions. Among the specific methods worth mentioning here are behavioral sequence models based on deep learning.

Summary

The well-known and popular RFM model can be a very useful tool, provided that it is properly interpreted and used with an awareness of its limitations. It’s a great tool for getting a “bird’s eye view” of the consumer base. However, when you want a more detailed picture and a prediction of future consumer behavior, you should turn to tools specialized for solving such problems.