Design and implementation of a predictive analytics system for customer behavior prediction
Arumaperuma Arachchilage, Ravindu (2026)
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-2026052717805
Abstract
This thesis designs and implements a predictive analytics system for customer behavior prediction. Its purpose is to develop a Python-based web application that analyzes customer data and predicts customer behavior, and to support business decision-making through predictive analytics and data visualization. The main research questions are how to process customer data and understand behavior patterns, how to predict customer behavior using predictive analytics, and what types of information support businesses in decision-making.
The thesis follows a practice-oriented research approach that combines theoretical knowledge with practical implementation. It first reviews the theoretical background of predictive analytics, descriptive analytics, customer behavior analysis, machine learning, and data visualization, and then describes the development process of the system. The implementation uses public datasets from Kaggle. Several preprocessing operations, including missing value handling, data transformation, and one-hot encoding, are applied before model training. Random Forest classifiers are used to build predictive models for purchase frequency, product category, satisfaction level, and customer churn. The backend web application is built with Flask, and the frontend and data visualizations use HTML, CSS, and Chart.js.
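The preprocessing steps named above (missing value handling and one-hot encoding) can be illustrated with a minimal plain-Python sketch. It is not the thesis's implementation, which may well rely on a data library. The column names, the sample records, the median fill for numeric gaps, and the "Unknown" label for categorical gaps are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical customer records; None marks a missing value.
records = [
    {"age": 34, "gender": "Female", "income": 52000},
    {"age": None, "gender": "Male", "income": 61000},
    {"age": 45, "gender": None, "income": None},
    {"age": 29, "gender": "Female", "income": 48000},
]


def fill_missing(rows, numeric_cols, categorical_cols):
    """Fill numeric gaps with the column median and categorical gaps with 'Unknown'."""
    medians = {
        col: median(r[col] for r in rows if r[col] is not None)
        for col in numeric_cols
    }
    filled = []
    for r in rows:
        new = dict(r)  # copy so the input rows stay unchanged
        for col in numeric_cols:
            if new[col] is None:
                new[col] = medians[col]
        for col in categorical_cols:
            if new[col] is None:
                new[col] = "Unknown"
        filled.append(new)
    return filled


def one_hot(rows, col):
    """Replace one categorical column with a 0/1 indicator column per category."""
    categories = sorted({r[col] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != col}
        for cat in categories:
            new[f"{col}_{cat}"] = 1 if r[col] == cat else 0
        encoded.append(new)
    return encoded


clean = fill_missing(records, ["age", "income"], ["gender"])
features = one_hot(clean, "gender")
```

After these two steps every row contains only numeric values, which is the form a tree-based classifier such as a Random Forest expects as input.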
The developed system consists of three parts: the customer behavior prediction system, the churn prediction system, and the insights dashboard. The insights dashboard presents descriptive analytics and visual representations of customer data, helping users understand customer behavior patterns, trends, and distributions through charts and graphs. Its visualizations include age distribution, gender distribution, education levels, purchase frequency, product categories, customer loyalty, income levels, and satisfaction levels.
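As a rough illustration of how a dashboard chart such as the age distribution could be fed, the sketch below groups ages into bins and builds a payload in the `labels`/`datasets` shape that Chart.js uses for chart data. The bin boundaries, the sample ages, and the function names are hypothetical; the thesis does not state how its charts are populated.

```python
import json
from collections import Counter

# Hypothetical bin labels, in the order they should appear on the chart.
AGE_GROUPS = ["Under 25", "25-39", "40-59", "60+"]


def age_group(age):
    """Map an age in years to one of the hypothetical bins above."""
    if age < 25:
        return "Under 25"
    if age < 40:
        return "25-39"
    if age < 60:
        return "40-59"
    return "60+"


def age_distribution_payload(ages):
    """Count customers per age group and return Chart.js-style chart data."""
    counts = Counter(age_group(a) for a in ages)
    return {
        "labels": AGE_GROUPS,
        "datasets": [
            {"label": "Customers", "data": [counts[g] for g in AGE_GROUPS]}
        ],
    }


sample_ages = [23, 37, 41, 29, 58, 33, 64, 19]  # illustrative values only
payload = age_distribution_payload(sample_ages)
payload_json = json.dumps(payload)  # string a backend route could return
```

A backend route could return `payload_json`, and the frontend could pass the parsed object as the `data` option of a bar chart.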
Among the behavior prediction models, the purchase frequency model achieves 77% accuracy, the product category model 72%, and the satisfaction level model 54%. The churn prediction model achieves 96% accuracy with high precision and recall, which indicates that customer churn risk can be identified effectively using machine learning methods. These predictions can support customer retention strategies and targeted marketing. Overall, the results demonstrate that predictive analytics and machine learning can improve customer behavior analysis and business decision-making. The thesis also helps in understanding customer trends, churn risks, and purchasing patterns, and shows how interactive dashboards and data visualizations make customer data easier to understand. Planned future developments include using real-world business datasets, cloud deployment, advanced AI models, and real-time data integration.
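The three metrics reported for the churn model can be computed from a confusion matrix, as in the sketch below. The labels and predictions are made-up values and do not reproduce the thesis's results. Precision and recall are reported alongside accuracy because accuracy alone can look high when one class dominates the data; the class balance of the dataset used in the thesis is not stated here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Share of predicted churners who really churned.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of real churners the model caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels: 1 = churned, 0 = stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
# → {'accuracy': 0.8, 'precision': 0.75, 'recall': 0.75}
```

In a retention setting, recall measures how many real churners are flagged, while precision measures how many flagged customers would actually have churned.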
