RFM Analysis at York University

Introduction

In the for-profit world, RFM analysis is one of the most commonly used techniques for customer segmentation. Basically, the RFM analysis group customers based on their transactional history.

R stands for Recency: How recently did the customer purchase?
F stands for Frequency: How often do they make purchases?
M stands for Monetary Value: How much do they spend in total?

The RFM analysis focuses on purchase behaviors and doesn’t take affinities or non-purchasing behaviors into consideration. However, this makes the RFM analysis straightforward to identify the best customers. The idea behind RFM is quite simple and relatively easy to be accepted by stakeholders. In addition, the principles are well applicable in the non-profit world. Donors who recently gave are more likely to give again. The frequency of giving reflects donor loyalty, and is one of the most important predictors for planned giving. Monetary value is important since it reflects a donor’s capacity to give. Therefore, I recently conducted an RFM analysis as a pilot data analytics project.

Data Preparation

I collaborated with Information Services Team to get the raw data. For this pilot project, I wanted to analyze the giving data for the last 5 years, from the fiscal year 2016 to the fiscal year 2020. For the first round of analysis, my colleague helped me pull out two CSV files from our database, focusing on revenue and the funds raised respectively. Basically, there are three types of gift transactions in our database: gift payment, pledge and pledge payment. I decided to use the CSV file of revenue data instead of the funds raised, because the latter one included unpaid pledges as well. For the first round of analysis, I used Microsoft Excel to clean up the data and calculate the RFM scores. In this process, I found that the giving data was highly skewed right because it also included gifts made by organizations and through estates. I then requested another pull of giving data to exclude organizations and estates. At the same time, my supervisor pointed out that it would be great to be able to see donations by constituent type and by faculties and schools. Therefore, this time, only donations made by individuals were pulled out from our database. The raw data also included entity types as well as the beneficiary faculties and schools.

Analysis Methods

As mentioned, I have conducted two rounds of analysis: as for the first round, I used Excel as my analysis tool. For the second round, I used Python as my primary tool. Therefore, I’d like to compare both tools to better understand their advantages and disadvantages. In general, I want to put our donors’ giving data into 5 bins for each score (Recency, Frequency and Monetary Value). In this way, the top bin (5) represents frequent donors whose giving is among the most recent 20% and whose total giving is among the top 20% of all donors. After all, we want to focus on the top 20% of donors based on the well-known 80/20 rule.

Excel Analysis

For RFM analysis, I only need three fields of data: Donor ID, Gift Date and Gift Amount. I first copied and pasted the selected fields into a new spreadsheet, and then sorted it by Donor ID and Gift Date. The next step was to create a pivot table to show the most recent gift date, the count of IDs, and the sum of gift amount. Then, I used Excel's PERCENTILE function to find the bin thresholds. After that, I assigned each donor to the right bin based on the bin thresholds. Finally, I adjoined each value in three bins to create a RFM score (e.g., 4-5-5) and added the score to the original table. Next, we can segment donors based on their RFM scores. We want to make sure top donors with scores of 5-5-5 are appropriately assigned and actively cultivated. Also, we’ll pay more attention to big donors who have lower frequency and recency scores (e.g., 3-4-5). They may have made large pledges several years ago. As for planned giving, we need to focus on frequent givers with scores like 4-5-2 or 3-5-1. As for annual giving, we may want to upgrade donors with scores like 3-3-3 or 2-4-3.

Python Analysis

Compared with Excel, Python proves to be more effective with powerful data processing packages. For the RFM analysis, I used the follow packages: Pandas, Datetime, Matplotlib and Seaborn. I followed the same logic mentioned above. With Datetime, it becomes easy to find the most recent gift date. Pandas DataFrames makes data manipulation much more efficient, from selecting or replacing columns and indices to reshaping data. Matplotlib and Seaborn are used for data visualization and exploratory data analysis. Data visualization reveals repeating trends, insights and outliers that would otherwise be hard to find. It only takes about 30 lines of codes to accomplish the analysis and the result can be easily exported to CSV file.

Conclusion

The RFM analysis helps generate many useful insights about donors. Donors who give large amounts of gifts definitely display certain patterns of behaviors and characteristics. Learning about top donors will help identify similar major gift prospects in the future. In addition, it is important to tailor marketing strategies based on the donor segmentation, such as loyal donors, new donors and lapsed donors. The focus of annual giving for existing, loyal donors can be on upgrading them to leadership levels. For further improvement, I will remove top 1% records as they might skew the analysis. These top donors need to be studied separately, as they are outliers. I will also try to use K-means clustering algorithm to decide RFM clusters in the future.

About the Author

Alice Wen is joining the Division of University Advancement at the University of Toronto as a Prospect Management Analyst. Most recently, she worked as an Advancement Research Analyst at York University. Alice has over 9 years of experience in prospect research and prospect management. She has earned a Master's Degree in Library and Information Science from Western University. Before graduation, she gained the opportunity to work as a Prospect Research Intern at the Ivey Business School. Ever since then, she has worked at various types of non-profit organizations, including Carleton University, the Nature Conservancy of Canada (NCC), the Toronto Symphony Orchestra (TSO), and Diabetes Canada. She is really passionate about what she does and serves on the board of Apra Canada as the Director of Conference. She has become increasingly interested in the intersection of data analytics and fundraising and is now pursuing a Certificate in Big Data Analytics at York University.