Quick Profits with RFM Analysis
by Arthur Middleton Hughes

 

It often comes as a shock to people new to direct marketing that response rates are so low. Successful, profitable promotions often result from sales to 2% or less of the mailed universe. Today, however, database marketers are finding that they can greatly increase these response rates when marketing to their existing customers by using Recency, Frequency, Monetary (RFM) analysis. The results are nothing short of amazing. Let me give you one example.

An educational products company in the South had a two-million-name customer database, built up from sales over a five-year period. Every spring they mailed their entire list with an attractive video offer which regularly got a response rate of about 1.3%. It did not produce much profit, but moved a lot of product. Last year, one of their marketing officers went to a seminar where he learned about RFM. On his return he directed his programmers to code the customer database for RFM, creating 125 RFM "cells". He did a promotion to a representative test sample of 30,000 which produced a net loss. From that test, however, he learned the response rates of each of the 125 "cells". For his rollout, he mailed only the 554,000 customers, out of the two million, who were in the 34 cells that did better than break even on the test. The result: a response rate of 2.76% and a net profit of $307,000.

His experience is not unique. All across America, database marketers are waking up to the gold mine in their customer databases that can be opened up using RFM. In this article, we will explain the principles behind RFM, and detail some of the research that is currently being conducted in this field.

How RFM Works

RFM has been used in direct marketing – particularly by non-profits – for more than thirty years. It is based on both sound reasoning and empirical evidence of customer behavior. People who have bought from you recently are much more likely to respond to a new offer than people whose last purchase was in the distant past. This can easily be illustrated by anyone with a customer database that includes purchase history. The database has to keep one piece of information in every customer record: the most recent discretionary purchase date. The database is sorted by that date, and the top 20% (in terms of recency) is given a code of "5". The next 20% in terms of recent purchases is coded as "4", etc. Everyone in the database now is either a 5, 4, 3, 2, or 1 in terms of recency. If you now make a test promotion to a representative sample, you will get a response that looks like this:

This graph, like the others in this article, is taken from actual results obtained on mailings done in 1995. It shows that the response rate from the top quintile (20% group) was 3.49%, while the next quintile responded at a rate of 1.29%. This clear trend of response rates by RFM quintile is almost universal across all products and services, all industries and all types of customers. It is one of the few "constants" in the marketing world. Someone who has just purchased insurance from you is much more likely to buy another policy than someone whose last insurance purchase was many months or years in the past.
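The recency coding described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular product: the function name quintile_codes and the sample dates are made up for the example, and ties on the dividing line are broken arbitrarily.

```python
from datetime import date

def quintile_codes(values):
    """Return a code from 5 to 1 for each item: top 20% by value get 5, next 20% get 4, and so on."""
    n = len(values)
    # Positions of the items, ordered from the highest value (most recent date) to the lowest.
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    codes = [0] * n
    for rank, i in enumerate(order):
        codes[i] = 5 - (rank * 5) // n
    return codes

# Hypothetical most-recent-purchase dates for ten customers (January through October 1995).
last_purchase = [date(1995, month, 1) for month in range(1, 11)]
recency = quintile_codes(last_purchase)
print(recency)
```

The same function would code frequency or monetary quintiles if it were given transaction counts or dollar totals instead of dates.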

Frequency Is Less Powerful

If your database keeps track of the number of transactions with your customers, you can also code your customers by frequency. Sort the database by this number, from the most to the least frequent, coding the top 20% as "5" and the less frequent quintiles as 4, 3, 2, and 1. A promotion to your customer base will produce the following results:

You will notice right away that frequent buyers respond better than less frequent buyers, but the differences are much less pronounced than those for Recency. That is why RFM is RFM instead of FRM or some other combination. Notice in particular that the lowest quintile in frequency did better than quintile number 2. Why should that be? For a simple reason. Brand new customers have a recency code of "5" but a frequency code of "1". So the lowest frequency quintile always contains the new customers – who are your best responders.

Monetary Is Almost Flat

When you code customers by the total dollar sales (average by month, year, or since the beginning of time) giving the big spenders a "5", and the others, 4, 3, 2, and 1, you will get a response rate that resembles this:

Monetary, you see, does show differences between quintiles, but they are far from as dramatic as those for Recency. Is this true for all products and services? Not necessarily. If you are selling mutual funds, you might get a much better response rate from your big spenders than the small spenders, simply because they may be in a position to buy more. But that is not necessarily a firm rule. Response does not measure ability to respond as much as willingness to open the envelope and read the contents. That willingness is not necessarily a function of the size of one’s bankroll. Why should a customer with a million dollars respond better than a customer with $10,000? It is unlikely that she will. The average millionaire is deluged with more offers than the average person, so getting through to her is really much tougher.

Putting The Three Codes Together

RFM analysis depends on recency, frequency and monetary measures, but the real power of the technique comes from combining them into a three-digit RFM "cell code". Using the quintile system explained above, all customers end up with three digits in their database records. They are either 555, 554, 553, 552, 551, 545,...down to 111. There are 125 different "cells" in all. If the coding is done correctly (see the sidebar on sorting methods), all cells will have virtually identical numbers of customers. If your database has one million customers, each cell will contain 8,000 customers. Using these three-digit codes, you can turn any test into a highly profitable rollout. Here’s how it is done.

Using An Nth As A Test

From your RFM coded database, pick out a test group. Let’s say that you select 30,000. If you have 125 cells, each cell will contain 240 customers. Mail an offer to these 30,000 customers, and keep track of the response rate of each cell. Here are the results of a mailing in 1995 to 30,000. The response rates varied from 8.33% down to 0.00%. The top ten cells looked like this:

555 8.33%
554 6.66%
553 5.42%
552 4.17%
551 4.58%
545 3.75%
544 5.00%
543 2.50%
542 4.17%
541 2.92%
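An Nth selection simply takes records at even intervals through the file. A minimal sketch follows, assuming the file is held as an indexable sequence; the name nth_select is invented for the example, and the two-million-record file is borrowed from the opening case rather than from this mailing.

```python
def nth_select(records, sample_size):
    """Pick sample_size records spaced evenly from the first record to near the last."""
    n = len(records)
    if sample_size > n:
        raise ValueError("sample_size exceeds the number of records")
    # Integer arithmetic keeps the spacing even when n is not a multiple of sample_size.
    return [records[(i * n) // sample_size] for i in range(sample_size)]

# Record numbers stand in for customer records in this illustration.
customers = range(2_000_000)
test_group = nth_select(customers, 30_000)
print(len(test_group), test_group[:3])
```

If the file is in RFM cell order and the cells are of equal size, a selection like this draws nearly the same number of customers from every cell.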

When all the responses from all the cells were graphed and indexed for break-even, the responses looked like this:

Only 34 out of the 125 cells did better than break even. Break even means that the net revenue (profit) from sales to members of the cell exactly paid for the cost of mailing to the cell members. Once you know how each cell on the test responded to your offer, you have some very powerful information: you know how each cell in your unmailed database will respond to the same offer. You make your profits by not mailing to the losing cells. Depending on the circumstances, you can double, triple or quadruple your response rates.
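Picking the rollout cells from the test results is a simple comparison against the break-even rate. The sketch below is illustrative only. It assumes a cost of $0.55 per piece and a profit of $40 per sale (figures used in a later example in this article, not known to apply to this mailing), and the response counts for cells 321 and 111 are hypothetical.

```python
def profitable_cells(results, cost_per_piece, profit_per_sale):
    """Return the cell codes whose test response rate beat break even.

    results maps an RFM cell code to (pieces mailed, responses received).
    """
    break_even = cost_per_piece / profit_per_sale
    winners = [cell for cell, (mailed, responses) in results.items()
               if responses / mailed > break_even]
    return sorted(winners, reverse=True)

# 240 pieces per cell; 20, 16 and 6 responses match the 8.33%, 6.66% and 2.50% rows above.
test_results = {
    "555": (240, 20),
    "554": (240, 16),
    "543": (240, 6),
    "321": (240, 3),   # hypothetical cell, 1.25% response
    "111": (240, 0),   # hypothetical cell, no response
}
winners = profitable_cells(test_results, 0.55, 40.0)
print(winners)
```

Only customers whose cell code appears in the returned list would be mailed in the rollout.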

Contrast RFM To Demographic Modeling

Probably the greatest single advantage to RFM analysis is that anyone can do it. You don’t need to be a statistician, or to hire a modeler to do the analysis. It can all be done on a spreadsheet. The results are amazingly accurate. Take a look at the following graph which compares the predicted response rates of the 34 cells that were mailed in the rollout with the actual response rates achieved:

The bars show the response rate from each of the 34 profitable cells on the test. The lines show the response rates of these cells on the rollout. The accuracy is uncanny, and seldom, if ever, replicated on a model based on demography. Why should that be? It is because demographics provide information on what people are: their income, age, presence of children, home ownership, etc. RFM measures what people do: when they buy, how often they buy, how much they buy – of your products and services. What you are trying to predict is what people will do. Clearly any system based on customer behavior is much more likely to be accurate in predicting future customer behavior than any possible combination of demographic information.

RFM for Windows

We have been teaching the principles of RFM in our seminars at The Database Marketing Institute in Washington, Toronto and elsewhere for the past three years. Inevitably the question always comes up: "How do we do the coding of the customer database?" Our answer was always, "You have to get your programmers to do it for you."

That is easy to say, but harder to do. As we all know, asking MIS programmers to write new marketing software is like pulling teeth. They are busy with something else, and marketing always has a low priority in the MIS pecking order. Last year at the Institute, we decided to do a little R & D. We commissioned a professional software house to write a Windows-based RFM program that would do everything necessary to code customer databases for RFM, select test groups using an Nth, select records by RFM cell, and provide reports so that marketers could do the entire job themselves without the assistance of either a statistician or a programmer. The resulting program, called RFM for Windows, is now in Beta Test in twenty-five marketing companies in the US, as well as in Canada, Holland, the UK and Brazil. The product will be available for interested marketers in the late spring of this year.

Using the Internet

It is interesting how we found the firm to write the software. We really didn’t know how to go about our search, so we turned to the Internet. Looking from place to place, we found something called "Software University". We figured that perhaps there was a professor there who would like to moonlight writing code, so we sent an e-mail message to Software University asking if anyone would like to write PC code for money. We got three answers the first day. The first was from the President of Software University, whose message was "Don’t post such messages here!". He deleted our e-mail message. Before he was able to do this, however, we received two other responses, one of which came from a representative of the firm that we subsequently engaged for the job. Where is this firm located? In Kharkov in the Ukraine – in part of the old Soviet Union. What do they know about PC programming in the Ukraine? Plenty. They did the job rapidly, and came up with scores of ideas – and beautiful graphics – that we would never have thought of ourselves. I believe that the final product is better than anything we could have had created in the US in the time available. Incidentally, we found the firm through their agent, David Margolius of M&M Data Systems in Bellevue, Washington, (206) 869-9236.

Hard Coding of RFM Categories

News of the development of RFM for Windows has produced comment and controversy around the globe. One correspondent from London raised the question, "Why select exact quintiles? Why not make the RFM categories exact dates and numbers such as 0-3 months, 3-6 months etc.?" He has a valid point. During the past thirty years, RFM has traditionally been "hard coded" into exact quantities. Why change the system?

The answer goes to the heart of RFM. Using exact quantities assumes several things: 1) You have to know your file well in order to select the categories. 2) You have to use a programmer to write the code to do the selection. 3) You will keep this programmer busy changing categories since, over time, the number of customers in each category will shift. If you are doing your job well, the number in 0-3 months will grow so big that it becomes unwieldy. You will have to shift to 0-2 months, and readjust all your other categories at the same time.

In short, hard coding is expensive in terms of programmer time. What is the purpose of RFM? To make profits by reducing costs. If the method you have selected for coding RFM is an expensive one, you have defeated the purpose of RFM from the outset.

If you use exact quintiles, the job can be done without a programmer. You save thousands of dollars a year by that step. You let the program decide who is in each of your quintiles. Of course, using exact quintiles, there will be a certain arbitrary quality to the coding. If your dividing line between the top quintile and the next lower quintile is February 12th, for example, some people with that purchase date will lie just above and some just below the dividing line, and so will be arbitrarily a 5 or a 4. Don’t worry about it! RFM coding should be done regularly – certainly before every customer promotion – so the arbitrariness one month will be corrected in the next month when the dividing line falls on some other date.

Why Quintiles?

Another good question concerns the number of cells needed. Quintiles result in 125 cells (5 x 5 x 5). For some small business-to-business databases, that number may be too big. For some very large databases, it may be too small. There are really two conflicting goals of RFM: 1) you want to have as many cells as possible so that you can more accurately predict exactly who is going to respond to your offer; 2) you want to have as few cells as possible, because each test cell has to be large enough to be statistically valid, and many cells make for expensive tests. For RFM for Windows we developed a panel that reconciles these two goals:

This panel combines many RFM principles into one screen. In the first place, the break even response rate is that rate where the profit from promoting a cell exactly equals the cost of promoting the cell. These two amounts are shown in the first two boxes on the left. The break even response rate is in the next box in the middle. The minimum test cell size is shown in the center of the panel. It is produced by the formula: Min = 4 / (break even response rate). In the example shown here, the cost per piece mailed is $0.55, and the profit (after all costs for the product and fulfillment have been taken out) from a successful sale is $40. The break even response rate is therefore $0.55 / $40, or 1.38%. As a result, you must mail to a minimum of 291 customers in each RFM cell to be confident that your results are accurate. After some experience in your particular product situation, you can adjust that up or down with the "Experience Adjustment" to get what you consider to be an accurate reading with the minimum number of test pieces mailed.
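The arithmetic behind the panel can be checked with a short Python sketch. The function names are invented for the example. Rounding up to a whole customer and treating the "Experience Adjustment" as a simple multiplier are assumptions, since the article does not say how the program applies either.

```python
import math

def break_even_rate(cost_per_piece, profit_per_sale):
    """Response rate at which the profit from sales exactly pays for the mailing."""
    return cost_per_piece / profit_per_sale

def min_test_cell_size(rate, experience_adjustment=1.0):
    """Min = 4 / (break even response rate), rounded up to a whole customer.

    experience_adjustment is an assumed multiplier: above 1.0 enlarges the cell, below 1.0 shrinks it.
    """
    return math.ceil(4 / rate * experience_adjustment)

rate = break_even_rate(0.55, 40.0)    # $0.55 per piece, $40 profit per sale
cell_size = min_test_cell_size(rate)
print(f"break even: {rate:.2%}, minimum test cell: {cell_size}")
```

A higher profit per sale lowers the break-even rate, which in turn raises the minimum number of customers needed in each test cell.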

If you were to have 125 cells of 291 each, you would have to mail 36,375 pieces for a valid test. At $0.55 each, your test would cost you $20,006. If your test budget is lower, say about $16,000, then something has to give. You might well reduce your RFM cells to 100 instead of 125, which would bring your costs down to $16,005. If you decide to make this change in your RFM cells, you will then have to change your divisions of R, F, and M. Here is a possible way of accomplishing this:

Changing the division for Monetary from quintiles to quartiles (as shown by the number 4 on the right of the Monetary box), reduces the RFM cells to 100 (5 x 5 x 4) and brings the test mailing within your budget. Why reduce Monetary and not Recency or Frequency? Because, as we have seen, Monetary is normally much less powerful as a discriminator, and can safely be reduced without jeopardizing your accuracy of prediction.
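The trade-off between the number of cells and the cost of the test is easy to tabulate. The following sketch uses the figures from the example above; the function name mailing_cost is made up for the illustration.

```python
def mailing_cost(divisions, cell_size, cost_per_piece):
    """Return (cells, pieces, cost) for a test that mails cell_size pieces to every cell.

    divisions is a tuple giving the number of Recency, Frequency and Monetary groups.
    """
    r, f, m = divisions
    cells = r * f * m
    pieces = cells * cell_size
    return cells, pieces, round(pieces * cost_per_piece, 2)

quintiles = mailing_cost((5, 5, 5), 291, 0.55)   # 125 cells
reduced = mailing_cost((5, 5, 4), 291, 0.55)     # Monetary cut to quartiles, 100 cells
print(quintiles)
print(reduced)
```

Trying other divisions, such as (5, 4, 4), shows quickly how far the test cost falls as cells are removed.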

Who Are Your Best Customers?

You have often heard it said that 80% of your business comes from 20% of your customers. That may be true, but how do you actually measure this number? RFM gives you one way which is quite useful. Look, for instance, at a report generated by RFM results:

This report shows that for this small customer database, 85% of the total sales came from the top 20% of customers with regard to frequency.
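A figure like this can be computed directly from the customer file. The sketch below is one possible way to do it, not the report logic of any particular program. The ten customers and their sales are invented so that the result comes out at 85%.

```python
def top_quintile_sales_share(customers):
    """Share of total sales that comes from the top 20% of customers by transaction count.

    customers is a list of (transactions, total_sales) pairs.
    """
    ranked = sorted(customers, key=lambda c: c[0], reverse=True)
    top_n = len(ranked) // 5
    total_sales = sum(sales for _, sales in ranked)
    top_sales = sum(sales for _, sales in ranked[:top_n])
    return top_sales / total_sales

# Hypothetical file: (number of transactions, total dollar sales) for ten customers.
customers = [(40, 500), (25, 350), (5, 30), (4, 30), (3, 20),
             (3, 20), (2, 20), (2, 10), (1, 10), (1, 10)]
share = top_quintile_sales_share(customers)
print(f"{share:.0%} of sales come from the top 20% by frequency")
```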

The Future of RFM

As we have said, RFM principles are more than thirty years old. Most modern database marketers, however, have never heard of them, and are not using them in their work. They are paying high salaries to modelers to help them figure out which customers will respond best. At the Institute, we feel that RFM has an extremely bright future. As soon as marketers begin to realize that they can do this type of analysis on their PCs, with no modeling expense and no programming expense, while doubling or tripling their response rates, there will be an explosion in the use of RFM. RFM for Windows is just completing its Beta phase. Readers who want to experiment with it in their marketing programs should call the Institute at (703) 644-4830.

When NOT to use RFM

RFM is like taking drugs. It gives you such a high that you want to use it all the time. For database marketers, this would be a tragic mistake. The result of using RFM is that you will mail to your most responsive customers, and will neglect all of the others. That means that they will get absolutely no attention, and you will eventually lose them. That is OK for some of them, who are worthless anyway. But you want to try to hang on to the rest of your customers who, with a little attention can be persuaded to move up to a more profitable RFM cell. Your marketing should be designed to encourage customers in some cells to do just that.

The second reason for being careful in your use of RFM is that your best customers may end up suffering from "file fatigue" if you mail to them too often. You don’t want to wear out your welcome with a barrage of promotions to these good people.

To put RFM into perspective, it is a valuable tool in your marketing arsenal. You should use it when you want a quick success – it will always produce quick and profitable results. But it should be combined with other programs – newsletters, reactivation mailings, etc. – in a well-rounded relationship-building program designed to boost retention, referrals and sales. Just because it is powerful and profitable, don’t abuse it.

Sidebar

How RFM is calculated

RFM, when done correctly, results in cells of almost exactly equal size. The sorting process that produces this result is shown here:

The recency sort is done first. Each of the five recency groups is then sorted by the number of transactions to create twenty-five frequency groups. Each of the 25 frequency groups is sorted by monetary amount to create the 125 final RFM cells. These cells differ in size from one another by at most one customer. There are 31 different sorts in all (1 + 5 + 25) needed to create 125 RFM cells. For a large database, this may take some time. It should be repeated, however, on a monthly basis. The previous month’s cell code should be stored in the customer record to keep track of which customers are moving up and which are moving down as a result of your marketing efforts.
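The nested sorting can be sketched in Python as follows. This is a minimal illustration of the procedure described in this sidebar, not the code used in RFM for Windows. The record keys (id, last_purchase, transactions, sales) and the sample file are hypothetical, and purchase dates are represented as plain day numbers to keep the example short.

```python
from collections import Counter

def split_into_groups(records, key, groups=5):
    """Sort records best-first by key and cut them into groups whose sizes differ by at most one."""
    ranked = sorted(records, key=key, reverse=True)
    n = len(ranked)
    return [ranked[g * n // groups:(g + 1) * n // groups] for g in range(groups)]

def rfm_cells(customers):
    """Return {customer id: three-digit cell code} using 1 + 5 + 25 = 31 sorts."""
    codes = {}
    by_recency = split_into_groups(customers, lambda c: c["last_purchase"])
    for r, r_group in enumerate(by_recency):
        by_frequency = split_into_groups(r_group, lambda c: c["transactions"])
        for f, f_group in enumerate(by_frequency):
            by_monetary = split_into_groups(f_group, lambda c: c["sales"])
            for m, m_group in enumerate(by_monetary):
                for customer in m_group:
                    codes[customer["id"]] = f"{5 - r}{5 - f}{5 - m}"
    return codes

# Hypothetical file of 250 customers with made-up values.
customers = [{"id": i, "last_purchase": i, "transactions": (i * 7) % 250,
              "sales": (i * 13) % 250} for i in range(250)]
codes = rfm_cells(customers)
cell_sizes = Counter(codes.values())
print(len(cell_sizes), "cells, sizes from", min(cell_sizes.values()), "to", max(cell_sizes.values()))
```

Saving last month's codes alongside the new ones makes it possible to compare the two and see which customers have moved up or down.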

 


Arthur Middleton Hughes is Vice President of The Database Marketing Institute, Ltd. (Arthur.hughes@dbmarketing.com), which provides strategic advice on relationship marketing. Arthur is also Senior Strategist at e-Dialog.com (ahughes@e-Dialog.com), which provides precision e-mail marketing services for major corporations worldwide. Arthur is the author of Strategic Database Marketing 3rd ed. (McGraw Hill 2006). You may reach Arthur at (954) 767-4558.


The articles on this web site are available to the general public to read, enjoy and for limited business use. If you want to reprint more than one or two of them for resale or use in a business or educational environment, send an email to Arthur Hughes at arthur.hughes@dbmarketing.com. He will give you permission by return email. The cost, which depends on the number of copies you want to reprint, is very low.