Welcome to the Customer RFM Analysis project! 🚀
This project was built to dive deep into customer purchasing behavior using RFM (Recency, Frequency, Monetary) analysis. By analyzing over 23,000 transactions from 1,000 customers, we transform raw data into actionable marketing intelligence. 💡
Why treat all customers the same when they shop differently? 🤔
This project tackles that question by applying RFM analysis to an e-commerce dataset. The goal is simple: identify and segment customers based on their buying habits so marketing teams can target them with precision.
Developed as part of a Data Analytics course, this repository showcases practical, real-world skills in data cleaning, feature engineering, and data storytelling! 📊✨
The analysis runs on two primary datasets:
- 🧑‍🤝‍🧑 Customer_Master_Data.csv: The "Who" – Contains demographics like age, city, gender, marital status, and join date.
- 💳 Customer_Transactions.csv: The "What & When" – Contains the gritty details of transaction dates and amounts.
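To make the two files a bit more concrete, here is a minimal sketch of how they might be loaded, cleaned, and joined. Only `CustomerID` comes from this README; the other column names (`Age`, `JoinDate`, `TransactionDate`, `Amount`), the inline sample rows, and the choice of the master table as the left side of the join are assumptions for illustration, so the actual notebook may differ.

```python
import io

import pandas as pd

# Tiny inline stand-ins for the two CSV files. Column names other than
# CustomerID are assumed, and every row here is made up.
master_csv = io.StringIO(
    "CustomerID,Age,JoinDate\n"
    "1,34,2021-03-15\n"
    "2,45,2020-11-02\n"
    "3,29,2022-01-20\n"
)
transactions_csv = io.StringIO(
    "CustomerID,TransactionDate,Amount\n"
    "1,2023-01-05,120.50\n"
    "1,2023-02-10,80.00\n"
    "2,2023-01-20,\n"
    "2,2023-03-01,200.00\n"
)

master = pd.read_csv(master_csv, parse_dates=["JoinDate"])
transactions = pd.read_csv(transactions_csv)

# Standardize the date column; unparseable values become NaT instead of raising.
transactions["TransactionDate"] = pd.to_datetime(
    transactions["TransactionDate"], errors="coerce"
)

# Drop rows that are missing the fields the RFM metrics depend on.
transactions = transactions.dropna(subset=["TransactionDate", "Amount"])

# Left join (master on the left, an assumption): every customer is kept,
# including those with no remaining transactions.
merged = master.merge(transactions, on="CustomerID", how="left")
print(merged)
```

With the master table on the left, customers without any valid transaction still show up in `merged`, just with empty transaction columns, which makes it easy to spot them before computing RFM metrics.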
The entire workflow is housed in customer_rfm_analysis.ipynb using Python's data powerhouse libraries (Pandas, NumPy, Matplotlib, Seaborn). 🐍 Here's the step-by-step breakdown:
- 🧹 Data Loading & Cleaning: Ingested the data, standardized datetime formats, and scrubbed out those pesky nulls.
- 🔗 Data Merging: Brought the two worlds together with a clean left join on CustomerID.
- 🧮 RFM Calculation:
- 🕒 Recency: How many days since their last purchase?
- 🛒 Frequency: How many times have they purchased?
- 💰 Monetary: How much have they spent in total?
- 🎯 RFM Scoring: Ranked customers from 1 to 5 across each metric using smart quantile binning (pd.qcut).
- 🧩 Customer Segmentation: Grouped customers into 7 distinct, business-ready buckets:
- 🏆 Champions
- ❤️ Loyal Customers
- 👀 Potential Loyalist
- 💸 Big Spenders
- ⚠️ At Risk
- 💔 Lost
- 🤷‍♂️ Others
- 🎨 Data Visualization: Brought the numbers to life with Countplots, Barplots, Scatterplots, and a brilliant Pareto Chart showing cumulative revenue impact.
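The RFM calculation, scoring, and segmentation steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's exact code: the sample transactions are made up, the column names besides `CustomerID` are assumed, the snapshot date is taken as one day after the latest transaction, and the segment thresholds in the hypothetical `segment` helper are invented, since this README does not list the real rules.

```python
import pandas as pd

# Made-up transactions for five customers: (CustomerID, date, amount).
rows = [
    (1, "2023-02-01", 100), (1, "2023-03-10", 100), (1, "2023-04-15", 100),
    (1, "2023-05-20", 100), (1, "2023-06-29", 100),
    (2, "2023-01-10", 75), (2, "2023-03-05", 75), (2, "2023-04-22", 75),
    (2, "2023-06-15", 75),
    (3, "2023-02-14", 50), (3, "2023-03-30", 50), (3, "2023-05-10", 50),
    (4, "2023-01-20", 1000), (4, "2023-03-01", 1000),
    (5, "2023-01-05", 40),
]
tx = pd.DataFrame(rows, columns=["CustomerID", "TransactionDate", "Amount"])
tx["TransactionDate"] = pd.to_datetime(tx["TransactionDate"])

# Assumed reference date: one day after the most recent transaction.
snapshot = tx["TransactionDate"].max() + pd.Timedelta(days=1)

rfm = tx.groupby("CustomerID").agg(
    LastPurchase=("TransactionDate", "max"),
    Frequency=("TransactionDate", "count"),
    Monetary=("Amount", "sum"),
)
rfm["Recency"] = (snapshot - rfm["LastPurchase"]).dt.days

# Quantile scores from 1 to 5. Ranking first (an addition here, not from the
# README) breaks ties so pd.qcut always finds five distinct bin edges.
# Recency labels are reversed because fewer days since purchase is better.
rfm["R"] = pd.qcut(rfm["Recency"].rank(method="first"), 5,
                   labels=[5, 4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["Frequency"].rank(method="first"), 5,
                   labels=[1, 2, 3, 4, 5]).astype(int)
rfm["M"] = pd.qcut(rfm["Monetary"].rank(method="first"), 5,
                   labels=[1, 2, 3, 4, 5]).astype(int)


def segment(r: int, f: int, m: int) -> str:
    """Hypothetical rules mapping scores to the seven segment names."""
    if r >= 4 and f >= 4 and m >= 4:
        return "Champions"
    if f >= 4:
        return "Loyal Customers"
    if m >= 4:
        return "Big Spenders"
    if r >= 4:
        return "Potential Loyalist"
    if r == 1:
        return "Lost"
    if r == 2:
        return "At Risk"
    return "Others"


rfm["Segment"] = [segment(r, f, m) for r, f, m in zip(rfm["R"], rfm["F"], rfm["M"])]
print(rfm[["Recency", "Frequency", "Monetary", "R", "F", "M", "Segment"]])
```

The order of the checks in `segment` matters: the first matching rule wins, so a customer who qualifies as a Champion is never relabeled as merely Loyal.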
- Targeted Marketing Unlocked: The segmentation strategy clearly identifies who needs a VIP reward (🏆 Champions) and who needs a win-back email (⚠️ At Risk).
- The 80/20 Rule is Real: Our Pareto Analysis vividly highlights that a small percentage of top-tier customers are driving a disproportionate chunk of the revenue! 📈💲
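For the curious, the numbers behind a Pareto chart like this one can be computed in a few lines. The sketch below uses made-up spend totals and assumed names (`monetary`, `pareto`), so it shows the technique rather than the project's actual figures.

```python
import numpy as np
import pandas as pd

# Made-up total spend per customer (index = hypothetical CustomerID).
monetary = pd.Series([300, 2000, 50, 500, 150], index=[1, 2, 3, 4, 5])

# Sort customers from highest to lowest spend, then accumulate revenue.
sorted_spend = monetary.sort_values(ascending=False)
cumulative_revenue_share = sorted_spend.cumsum() / sorted_spend.sum()

# Fraction of the customer base covered at each position in the ranking.
customer_share = np.arange(1, len(sorted_spend) + 1) / len(sorted_spend)

pareto = pd.DataFrame(
    {
        "CustomerShare": customer_share,
        "CumulativeRevenueShare": cumulative_revenue_share.to_numpy(),
    },
    index=sorted_spend.index,
)

# First row where cumulative revenue reaches 80%: the share of customers
# needed to account for that much revenue in this toy data.
cutoff = pareto[pareto["CumulativeRevenueShare"] >= 0.8].iloc[0]
print(pareto)
print(f"{cutoff['CustomerShare']:.0%} of customers cover "
      f"{cutoff['CumulativeRevenueShare']:.0%} of revenue")
```

Plotting `CumulativeRevenueShare` against `CustomerShare` (for example with Matplotlib) gives the familiar Pareto curve; the steeper it rises at the start, the more revenue is concentrated in a few customers.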
- 📓 customer_rfm_analysis.ipynb: The star of the show. The full Jupyter Notebook.
- 🌐 customer_rfm_analysis.html: An easy-to-read, exported version of the notebook.
- 📊 .csv files: The raw data fueling the analysis.
- 📖 README.md: You're reading it!
Want to play with the data? Awesome! Let's get you set up:
- Clone this repository to your local machine.
- Install the required Python libraries (if you don't have them already):
pip install pandas numpy matplotlib seaborn
- Launch Jupyter Notebook or Jupyter Lab:
jupyter notebook
- Open customer_rfm_analysis.ipynb and run the cells! ▶️