What can I do to make customers more willing to read my emails? A guide to effective email marketing

When Ray Tomlinson, an American engineer and programmer, sent the first-ever email message in late 1971, he could not have realized what applications his invention would find. And he certainly wouldn’t have thought that someone would use this type of message to convince others to buy products. Yet, more than 50 years after that first message, email marketing remains one of the most important channels for marketing communications. The challenge lies in maintaining the effectiveness of this channel. AI tools can help with this. Before using them, however, it is worth asking yourself where to start.

Effective email marketing – how to get started?

I know that what I’m about to write, especially if someone reads it out of context, will sound trite, but sometimes it’s worth going back to the fundamentals. And the fundamental principle of email marketing can be summarized like this: if customers don’t open the email, they won’t know what we wanted to communicate to them. And if they don’t know, there is no chance they will perform the action we wanted to persuade them to take. The adventure of email marketing should therefore start with getting the customer to open our mailing at all. Meanwhile, marketers know from their open rate statistics that most emails are usually ignored or even deleted without being opened. Why?

In our considerations, we will omit messages that are clearly spam. If I don’t know the sender’s address, didn’t sign up to receive such emails, or they look suspicious, the smartest thing I can do is delete them as soon as possible. So we are interested in all the other emails and the answer to the question of why recipients don’t read them. Well, according to a report by SARE, among the main reasons for deleting emails without opening them, recipients indicate:

  • I get too many messages from one sender (31.9% of responses),
  • the title is not interesting (33.4%).

Together, these two reasons account for more than 65% of all responses. Interestingly, and encouragingly for those responsible for marketing communications, both factors are under the sender’s control: the sender can better target and personalize email marketing efforts. So it might seem enough to send fewer messages to customers’ inboxes and write more interesting titles, and the return on investment will be higher. Simple, right? Unfortunately, we all know that’s not quite the case. I would even venture to say that in many cases it will be genuinely difficult.

Running mailing campaigns. How to improve their effectiveness?

For the first reason, the difficulty is determining how much is “too much.” Is once a week too much? Or is only once a day too much? Or maybe for one user three times a week is too much, but for another even four times a week is still ok? Or maybe… Well, that’s exactly it. The possible scenarios are practically endless. It is impossible to list them all, let alone test them. On top of that, we shouldn’t assume that a customer’s interest and patience stay constant over time. Perhaps the only way to solve this confusing problem is to use AI models based on machine learning. Based on historical data and a constant stream of new data, they are able to make highly accurate predictions and optimize the sending frequency for each individual consumer. In doing so, they are constantly improving and adapting to changes in consumer expectations. They are able to catch even very subtle signals of “overheating” in target groups and recommend reducing the frequency. Someone may say: but all this looks complicated and probably expensive to implement and maintain, we’d better be careful and just send emails less often. It is hard to disagree with such a position. However, this is not the optimal strategy. Among customers who are inclined to open your messages more often and respond with a purchase, you lose a large portion of potential revenue this way. In other words, you are not using the full potential of your contact base.

Let’s move on to the second problem, which is the uninteresting title. I can already hear the voices that appear in the heads of some of the readers. – He is about to write us something about testing and AI, and after all, we do testing without AI too, and a lot of it.

SARE’s report, cited earlier, shows that about 84% of senders conduct tests before sending campaigns. However, fewer than 17% run A/B/X tests. One of the companies I have been in contact with, for example, tests the message title. Three variants of the title are prepared. Then a group of about 15% of the contact base is randomly selected. This group is divided into three equal parts, each of which receives one title variant. The variant that achieves the highest open rate in the pilot mailing is then sent to the remaining 85% of the recipients. So we have testing, we have segmentation, we have optimization, it’s ok. However, the question that comes to mind is: what did you actually test and what question did you get the answer to? Did you definitely choose the best variant of the title, or only the best among the three proposed? How do you know that there are not 20 other variants, each of which is better than the winner among the three tested? We don’t know, and even if we wanted to test more title variants, each test group would become too small to give reliable results. So what can we do to make the title in email marketing effective and translate into higher conversions?
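The pilot procedure described above can be sketched in a few lines. This is only an illustration under my own assumptions: the function names, the base of 10,000 contacts and the open counts are made up, not taken from the company in question.

```python
import random

def split_pilot(contacts, n_variants=3, pilot_percent=15, seed=0):
    """Randomly pick a pilot group and split it evenly across title variants."""
    rng = random.Random(seed)
    shuffled = list(contacts)
    rng.shuffle(shuffled)
    pilot_size = len(shuffled) * pilot_percent // 100
    per_variant = pilot_size // n_variants
    groups = [shuffled[i * per_variant:(i + 1) * per_variant]
              for i in range(n_variants)]
    remainder = shuffled[pilot_size:]  # receives the winning title later
    return groups, remainder

def pick_winner(opens, sent):
    """Return the index of the variant with the highest pilot open rate."""
    rates = [o / s for o, s in zip(opens, sent)]
    return max(range(len(rates)), key=rates.__getitem__)

contacts = list(range(10_000))                    # hypothetical contact base
groups, remainder = split_pilot(contacts)         # 3 variants
groups20, _ = split_pilot(contacts, n_variants=20)

print([len(g) for g in groups], len(remainder))   # [500, 500, 500] 8500
print(len(groups20[0]))                           # 75
print(pick_winner(opens=[100, 120, 90], sent=[500, 500, 500]))  # 1
```

With three variants each pilot group has 500 recipients; with twenty variants it shrinks to 75, which is exactly the small-group problem mentioned above.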

Email marketing vs. Artificial Intelligence

Yes, some of the readers have surely already guessed: now it’s time to write about AI. By accumulating data on currently and historically sent messages, a model based on deep neural networks is able to predict the response and estimate the most likely open rate of any title. It will even work for a title you have never sent before. It really will. And if the model is a little less sure of its prediction in such a case, it will let you know. Just remember that a title alone is not enough information. After all, going back to the fundamentals (or clichés, as you prefer), it is important not only what we say but to whom we say it, and even when we say it. The same message can be understood completely differently: one recipient will be pleased and another will be offended.

So an AI model that understands the title of the message but is detached from the context is not enough. Such a model is deprived of information about the groups of recipients, the characteristics of customers, the history of the company’s relationship with them, their purchase history, and the moment of sending. What is needed is a system that integrates data about these phenomena and provides the AI model with the appropriate context. The general scheme can be seen below.

With a properly defined, trained and calibrated model, we can test different variants of titles and get information as in the examples below. The number of variants can be arbitrary, as can the number of segments. And best of all, we don’t have to send a single email to our recipients to run the test and estimate the expected open rate. We can conduct everything using a computer simulation.
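To give a feel for what such a simulation loop might look like, here is a minimal sketch. The trained model is replaced by a stand-in function (`toy_predict`) that returns made-up numbers; the function names, titles and segments are all hypothetical and serve only to show the shape of the computation.

```python
from itertools import product
from typing import Callable

def simulate_open_rates(titles, segments, predict: Callable[[str, str], float]):
    """Score every (title, segment) pair with the model; no email is sent."""
    return {(t, s): predict(t, s) for t, s in product(titles, segments)}

def best_title_per_segment(scores, titles, segments):
    """Pick, for each segment, the title with the highest predicted open rate."""
    return {s: max(titles, key=lambda t: scores[(t, s)]) for s in segments}

# Stand-in for the trained model: made-up numbers, for illustration only.
TOY_RATES = {
    ("Title A", "segment 1"): 0.21, ("Title A", "segment 2"): 0.30,
    ("Title B", "segment 1"): 0.26, ("Title B", "segment 2"): 0.24,
}

def toy_predict(title, segment):
    return TOY_RATES[(title, segment)]

titles = ["Title A", "Title B"]
segments = ["segment 1", "segment 2"]
scores = simulate_open_rates(titles, segments, toy_predict)
best = best_title_per_segment(scores, titles, segments)
print(best)  # {'segment 1': 'Title B', 'segment 2': 'Title A'}
```

Because nothing is sent, the lists of titles and segments can be as long as we like; only the prediction function is called more often.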

Conducting effective email marketing – summary

When asked many years later what the content of the first email ever was, Ray Tomlinson said he couldn’t remember. The most important thing was that the message arrived at the recipient’s address. The content was not important. It didn’t matter at all. In marketing communications, the exact opposite is true. The mere arrival of an email is not enough. The content of the email is important. An interesting title is also important. Because without it, a significant portion of recipients will not read the content.

3-fold increase in conversions due to targeting of mailings based on predictive model

Today’s consumers are constantly inundated with messages from various brands. Many brands send multiple messages, through multiple channels. This makes it difficult to attract and keep the consumer’s attention for a long time. At the same time, it is easy for the consumer to become tired of the communication and pay less and less attention to it. Thus, it becomes more important than ever to choose the right content, to send the most tailored message to the consumer, and to limit messages that are not interesting and only increase the risk that the consumer will become insensitive to the message.

Predictive modeling helps to solve this problem. Systems based on machine learning are able to predict consumer interest in a particular type of message or offer with a high degree of accuracy. This article uses a concrete and recent example (from May 2023) to show how to apply these tools in practice. For confidentiality reasons, the numbers we present are scaled or shown as indexes. However, they faithfully represent the observed differences and effects.

The problem with the traditional approach to email targeting and the need for change

The organization to which the example relates, like many others, had for many years used a method of so-called “maximizing revenue” from its communications base through broad and frequent mailings. In practice, information about an offer was sent to all consumers who had given permission to be contacted through a given channel. In a few cases, the contacted base was narrowed down somewhat using expert criteria. However, these were simple criteria such as: has bought the promoted product before, has not bought product X in the last 6 months, is a woman over 55, etc. The results were very good for a long time, and no one saw the need to change the process. At some point, however, a slow decline in the email open rate began to be observed, and the downward trend became pronounced. Combined with the declining number of newly acquired consumers, this led the organization to ask whether it was possible to work better with the existing base. What can be done to reverse the trend of declining interest in the communications being sent?

The decision was made to test the integration of machine learning and predictive analytics into the process of selecting consumers for mailing campaigns. We prepared a predictive modeling system that generates “tailor-made” scoring models for each campaign. The general architecture of the system is shown in the diagram below.

[Figure: architecture of the predictive modeling system]

Use of predictive modeling in mail targeting

For the purpose of training the model, more than 100 variables from the areas listed in the diagram were used as input data. The model is built on the basis of advanced algorithms, able to cope with such a multitude of attributes and extract from them as much information as possible about the actual profile of the consumer. The final result is an estimate of the probability of interest in a given communication by each consumer. This is then used for the final selection of consumers for the campaign.

The results of the changes in the communication targeting process met (and in some aspects even exceeded) expectations. To prove the usefulness of the model, we conducted experiments. Half of the base was subjected to selection in the old way, while the other half was selected using the model’s prediction. It should be noted that in both groups we used exactly the same emails: the same subject and exactly the same creative. The timing of the mailing was also the same. Therefore, none of these factors could have affected the results of the experiment. The only difference between the groups was the way consumers were selected.
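The design of the experiment can be sketched as follows. This is a simplified illustration, not the production code: the profile attribute, the scores and the threshold are hypothetical, and in the real system the scores come from the predictive model.

```python
import random

def assign_groups(consumer_ids, seed=0):
    """Randomly split the base into two halves: control and treatment."""
    rng = random.Random(seed)
    ids = list(consumer_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

def select_by_rule(ids, profiles, rule):
    """Old process: keep consumers who satisfy a simple expert criterion."""
    return [i for i in ids if rule(profiles[i])]

def select_by_score(ids, scores, threshold):
    """New process: keep consumers whose predicted interest is high enough."""
    return [i for i in ids if scores[i] >= threshold]

# Toy data, for illustration only.
profiles = {i: {"bought_before": i % 2 == 0} for i in range(10)}
scores = {i: i / 10 for i in range(10)}  # would come from the model

control, treatment = assign_groups(range(10))
old_way = select_by_rule(control, profiles, lambda p: p["bought_before"])
new_way = select_by_score(treatment, scores, threshold=0.8)
```

Both selected groups then receive the identical email at the identical time, so any difference in results can be attributed to the selection method.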

Effects of using predictive modeling

In the group targeted with the model, it was possible to reduce the size of the communicated group about 14-fold: for every 100 consumers communicated with under the traditional criteria, only 7 are communicated with under the process based on the predictive model.

At the same time, such a small group generated similar (only about 2% lower) sales.

This was achieved thanks to a significantly higher (3 times) conversion rate in the group assigned to the campaign in the new way, as well as a much higher (4 times) average receipt value in that group.

Narrowing the communicated group made it possible to limit it to those really interested in the offer. This is evidenced by a much higher open rate (3.2x higher) and click-to-open rate (almost 2x higher). The click-to-open rate in this case is calculated as CTOR = LC/LO, where LC is the number of consumers who clicked on the link from the email, and LO is the number of consumers who opened the email. While the open rate is highly dependent on the subject line of the email, a higher CTOR indicates actual interest in the content and offer included in the email.
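For readers who prefer code to formulas, the two rates can be computed as below. The counts are purely illustrative and are not the campaign’s figures; the choice of delivered emails as the open-rate denominator is my assumption, since the definition is not spelled out above.

```python
def open_rate(opened, delivered):
    """Share of delivered emails that were opened (denominator is an assumption)."""
    return opened / delivered

def ctor(clicked, opened):
    """Click-to-open rate: CTOR = LC / LO."""
    if opened == 0:
        return 0.0  # no opens, so no meaningful click-to-open rate
    return clicked / opened

# Illustrative counts only.
delivered, opened, clicked = 1000, 200, 50
print(open_rate(opened, delivered))  # 0.2
print(ctor(clicked, opened))         # 0.25
```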

Targeting mailings based on predictive model – summary

By using an advanced data science tool in the form of a predictive model, it was possible to achieve:

  • better matching of communications to consumer interests and needs
  • a significant reduction in the number of communications in a given campaign with minimal damage to the sales result (just over 2%)
  • reduction of communication “overload” – the consumer will receive communication less frequently but it will be better tailored in the new process

The exact impact of the model and the new targeting process on the trend of open and click-through rates of mailings can only be studied over a longer period and requires at least several months of observation. However, the first recorded results look promising and give reason to expect a reversal of the clear negative trend seen in the months before the introduction of the scoring model.

Finally, it is worth noting that an advantage of the system is the openness of its architecture to new data sources. If new variables become available, they will be automatically incorporated into the model training process and used for prediction. Another important feature of the described solution is the model’s ability to update itself as new data arrives, including data on executed campaigns and their effectiveness. As a result, the model will automatically adapt to the changing needs and behavior of consumers and their reactions to the communications sent. This guarantees the usability of the system over the long term as well.

Attribution modeling – the key to understanding the effectiveness of marketing activities

In an era of a growing number of communication channels and brand touch points, proper identification of the importance and impact of each channel is becoming increasingly important. Correctly answering the question “to what extent did the use of a given message and channel affect the achievement of a goal?” is crucial for optimizing activities and maximizing the return on the invested marketing budget. The problem is as important as it is difficult. However, attribution modeling and data science methods come to the rescue.

What is attribution modeling?

Attribution modeling is the process of building a model to assign value to each of the touchpoints along a customer’s conversion path. It aims to understand which marketing channels and activities contribute to achieving business goals, such as making a sale, acquiring a new customer, activating dormant customers, recruiting new loyalty program participants or increasing brand awareness. The term attribution model covers many different constructs, from very (too) simple to very complex. In general, models can be divided into: single-point, rule-based multi-point and algorithmic multi-point models.

Single-point modeling

Single-point models allocate the entire value of a conversion (or, more broadly, goal achievement) to only one point of contact. Typical approaches are first-click or last-click. These are simplistic models. They do not take into account the entire customer conversion path, the interactions between different points of contact, or the context in which those contacts take place. Their advantage is simplicity and ease of application. However, in the complex world of today’s marketing, they are too simple to reliably reflect reality.
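How little information these models use is easy to see in code. The sketch below uses a made-up path and conversion value; the function names are my own.

```python
def first_click(path, value):
    """Give the whole conversion value to the first touchpoint on the path."""
    return {path[0]: value}

def last_click(path, value):
    """Give the whole conversion value to the last touchpoint on the path."""
    return {path[-1]: value}

path = ["banner", "search", "email"]  # illustrative conversion path
print(first_click(path, 100.0))  # {'banner': 100.0}
print(last_click(path, 100.0))   # {'email': 100.0}
```

Everything between the first and last contact is simply ignored.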

Rule-based multi-point models

Multi-point models distribute value among different touch points along the customer path. At the same time, they are divided into rule-based and algorithmic models. The former assign value to individual contacts based on predefined rules. For example:

  • linear model – assigns equal value to each contact point encountered by the consumer on his path to conversion;
  • U-shaped model – assigns the greatest value to the first and last points of contact, intermediate points are of lesser (though non-zero) importance in this model;
  • model based on conversion time – assigns greater value the closer the point was to the moment of conversion. In this model, the greatest weight is assigned to the last contact immediately preceding the conversion.

The advantage of rule-based models is their clarity and relative simplicity, and the fact that they do not omit any touch points on the path to conversion. However, their weights are assigned based on arbitrary rules. A justification can be found for each of them, but it is impossible to say which one is the best. As with single-point models, their disadvantage is also that they take into account neither the interactions between different points of contact nor the context.

Algorithmic multi-point models
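The three rules listed above can be written down as simple weighting functions. Note that the concrete parameters are my assumptions for the sake of illustration: a 40/20/40 split for the U-shaped model and a 7-day half-life for the time-based model are common conventions, but nothing in the rules themselves dictates them.

```python
def linear_weights(n):
    """Equal weight for each of the n touchpoints."""
    return [1 / n] * n

def u_shaped_weights(n, end_share=0.4):
    """end_share for the first and last touchpoint, the rest split evenly."""
    if n == 1:
        return [1.0]
    if n == 2:
        return [0.5, 0.5]
    middle = (1 - 2 * end_share) / (n - 2)
    return [end_share] + [middle] * (n - 2) + [end_share]

def time_decay_weights(days_before_conversion, half_life=7.0):
    """Weight halves for every half_life days of distance from the conversion."""
    raw = [0.5 ** (d / half_life) for d in days_before_conversion]
    total = sum(raw)
    return [r / total for r in raw]

def attribute(path, weights, value):
    """Distribute the conversion value over channels according to the weights."""
    credit = {}
    for channel, w in zip(path, weights):
        credit[channel] = credit.get(channel, 0.0) + w * value
    return credit

path = ["banner", "search", "email"]            # illustrative path
print(attribute(path, linear_weights(3), 90.0))
print(attribute(path, u_shaped_weights(3), 90.0))
print(attribute(path, time_decay_weights([14, 7, 0]), 90.0))
```

The same path yields three different splits of the same conversion value, which illustrates why it is hard to say which rule is “the right one”.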

Algorithmic models, like rule-based models, assign a weight to each touchpoint along the customer path. However, instead of arbitrary rules, they use sophisticated statistical methods to determine these weights. So instead of adopting predefined rules, these models “learn rules” from real data (using machine learning methods). Such models take into account the order of contact points and interactions between them. For example, the impact of an email on conversion may be greater when it was preceded by a banner display. They also take into account context, e.g. time of year, weather, media activity of competitors, pricing. They can operate at a very detailed level, e.g. distinguish the impact of individual creative variants or where and when they are displayed.

It is hard not to agree with the statement that today these types of models are the “gold standard”. Only they make it possible to take into account the entire complexity of consumer-brand contact paths. However, the accuracy and benefits of algorithmic models come with challenges, in particular regarding the quantity, quality and scope of the data and the analytical competence required to create them. They also have the disadvantage of limited transparency, because the rules the model identifies are as complex as the reality they describe. Algorithmic models do, however, allow for advanced “what if?” simulation of various scenarios, e.g. what if we dropped channel A altogether? What if we reduced the budget for channel B? What if we switched the order of messages in the sequence? This in turn allows you to optimize your budget and activities. The investment in this type of model can therefore more than pay for itself.
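To make the “what if we dropped channel A?” question concrete, here is a deliberately crude sketch. It is not an algorithmic attribution model: instead of a model learned from data, it works directly on a handful of made-up observed paths and pessimistically assumes that every path touching the removed channel would have failed. A real model would estimate this effect from a fitted model of the paths.

```python
def conversion_rate(paths):
    """Share of observed paths that ended in a conversion."""
    return sum(converted for _, converted in paths) / len(paths)

def rate_without(paths, channel):
    """Crude what-if: every path that touched `channel` is assumed to fail."""
    kept = sum(converted for touches, converted in paths
               if channel not in touches)
    return kept / len(paths)

def removal_effects(paths):
    """Relative drop in conversion rate after removing each channel."""
    base = conversion_rate(paths)
    channels = sorted({c for touches, _ in paths for c in touches})
    if base == 0:
        return {c: 0.0 for c in channels}
    return {c: (base - rate_without(paths, c)) / base for c in channels}

# Made-up paths: (touchpoints in order, did the path convert?)
paths = [
    (("banner", "email"), True),
    (("email",), True),
    (("banner",), False),
    (("search", "email"), True),
    (("search",), True),
    (("banner", "search"), False),
]
effects = removal_effects(paths)
print(effects)  # roughly banner 0.25, email 0.75, search 0.5
```

In this toy data, dropping email would cost about three quarters of the conversions, while dropping the banner would cost about a quarter.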

Summary

Marketing attribution models have undergone a long evolution from simple single-point models to multi-point models based on complex machine learning algorithms and statistical methods (including those based on deep artificial neural networks). At the same time, this is still an area of intensive research and experimentation both in the scientific community and among practitioners. Despite the complexities and challenges of their creation and application, these models are increasingly accessible thanks to the falling costs of data collection and processing. Thus, we are entering an era where we should not ask “whether” they are worth using, but “how” to build and use them effectively.

Predicting sales in unpredictable times. Why is it important to forecast not only sales but also demand?

Unfortunately, the unstable economic situation is making it increasingly difficult to maintain a profitable retail business. Retailers must be able to predict the future with some accuracy in order to run a profitable business. Forecasting sales and demand are therefore becoming two key aspects of business planning.

Sales prediction or demand prediction – which to choose?

The terms sales prediction and demand prediction are sometimes used interchangeably. However, there is a fundamental difference between them. What does this difference refer to and which prediction should we particularly focus on? That’s what we’ll discuss in today’s article.

To begin with, it is worth taking a moment to recall the relationship between the key terms demand, sales and supply. Demand refers to the amount of products or services that customers would like to purchase in a given period. Sales, on the other hand, is the amount of products or services that were actually sold during that period. For sales to occur, there must be a supply of products or services capable of meeting demand, where supply is the amount of products and services available during a given period. Therefore, there are no sales when there is no demand. However, there are also no sales when there is demand but not enough supply. Generally, therefore, we can deal with three situations:

  1. Demand = supply
    Ideal situation: customers are satisfied with the ability to meet their needs, and the company is satisfied because it sells all available inventory.
  2. Demand > supply
    Not all customers are able to satisfy their needs, while the company bears the cost of lost potential sales. Such a situation arises, for example, when there is a shortage of a particular commodity in the warehouse or on the store shelf at the time when the consumer would like to purchase it. In a competitive market, the customer can then buy a substitute product/service from a competitor.
  3. Demand < supply
    An unfavorable situation for a company that has frozen money in merchandise lingering on the shelves, loses the ability to use store space and logistical resources to supply products in demand, and runs the risk of losing the value of the product altogether (e.g., as a result of exceeding the expiration date).
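The relationship between the three quantities comes down to one line: sales can never exceed either demand or supply. A small sketch with illustrative numbers (the function name and the figures are mine):

```python
def outcome(demand, supply):
    """Sales are capped by both demand and supply."""
    sales = min(demand, supply)
    return {
        "sales": sales,
        "unmet_demand": demand - sales,   # lost potential sales (situation 2)
        "unsold_stock": supply - sales,   # money frozen in goods (situation 3)
    }

print(outcome(100, 100))  # {'sales': 100, 'unmet_demand': 0, 'unsold_stock': 0}
print(outcome(100, 60))   # {'sales': 60, 'unmet_demand': 40, 'unsold_stock': 0}
print(outcome(100, 150))  # {'sales': 100, 'unmet_demand': 0, 'unsold_stock': 50}
```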

Accurate demand prediction avoids situations 2 and 3, or at least minimizes their scale and associated costs. At the same time, we can identify 5 areas where demand prediction brings benefits.

Benefits of demand forecasting

  • Optimization of production and inventory
    With accurate demand prediction, a company can better predict how much product it will need for a given period. This allows it to optimize production processes and control inventory levels.
  • Increase sales
    Ensuring the right amount of products in stock allows the company to increase its sales and customer satisfaction.
  • Better planning of marketing campaigns
    With an accurate prediction of demand, a company can better plan which products (or product categories) are worth promoting and in what period.
  • Optimization of prices
    With demand prediction and knowledge of inventory, a company can optimize the price of a product to balance demand with supply and maximize profit.
  • Cost reduction
    With accurate demand prediction, a company can avoid the costs of excess inventory and unnecessary logistics costs.

Sales prediction vs. demand prediction – differences

However, what if we prepare a sales prediction instead of a demand prediction? In such a situation, we risk underestimating demand. As we have already noted, sales occur when demand meets supply. When supply is insufficient (lack of goods), demand will not be met and sales will be lower than they could be. In the extreme case of a total lack of goods on the shelf, sales will be 0. A predictive sales model can correctly predict the lack of sales in such a case. However, using such a model to decide on the right product inventory will result in underestimation and loss of potential sales. To make matters worse, the measured accuracy of such a model can be very high. This is because we may be dealing with a self-fulfilling prophecy:

No goods → zero sales → model predicts no sales in the next period →
decision to not supply the product (since no sales are assumed) → no goods.

And the circle closes.
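The loop is easy to reproduce in a toy simulation. The sketch below makes strong simplifying assumptions of my own: true demand is constant, the “sales model” naively predicts that next period’s sales equal the last observed sales, and each period’s stock is exactly the forecast quantity (no leftovers carried over).

```python
def simulate(true_demand, initial_stock, periods):
    """Replenish stock using a naive sales forecast and record each period."""
    stock = initial_stock
    history = []
    for _ in range(periods):
        sales = min(true_demand, stock)  # sales are capped by available stock
        forecast = sales                 # naive model: next sales = last sales
        history.append((stock, sales, forecast))
        stock = forecast                 # order exactly what the model predicts
    return history

print(simulate(true_demand=100, initial_stock=0, periods=3))
# [(0, 0, 0), (0, 0, 0), (0, 0, 0)]
print(simulate(true_demand=100, initial_stock=60, periods=3))
# [(60, 60, 60), (60, 60, 60), (60, 60, 60)]
```

In both runs the forecast matches the realized sales perfectly, so the model looks flawless, while in reality 100 or 40 units of demand go unmet in every period.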

This is a potentially costly mistake at the model conception stage and a trap into which companies sometimes fall. Meanwhile, machine learning methods make it possible to build and train predictive models capable of predicting demand (and not just sales). Such models take into account a number of different factors influencing demand (including seasonality, price, weather, promotions) and can operate at any level of aggregation (product group/single product, region/store group/single store, etc.).

Summary

Accurate demand prediction is the key to success. It allows you to reduce costs, increase sales and improve customer satisfaction. However, these benefits can only be provided by the right selection of data science methods suitable for solving this kind of problem.