Real-Time Data Streaming at Bank utilizing Confluent Kafka and Microsoft Fabric
Parmar, Jiteshkumar (2025)
All rights reserved. This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025060219242
Abstract
In the rapidly evolving financial landscape, traditional batch-based data processing systems are increasingly inadequate for addressing real-time business needs such as fraud detection, customer personalization, and regulatory compliance. This thesis explores the implementation of a modern, real-time data streaming architecture within a banking context, leveraging Confluent Cloud for event-driven data ingestion and Microsoft Fabric for analytics, storage, and reporting.
A dual research methodology, combining Case Study and Action Research, was employed, focused on the author’s real-world engagement with a Finnish bank. Quantitative and qualitative surveys were conducted among engineers, analysts, IT operations staff, and decision makers to assess the limitations of the existing platform and expectations for future systems. The findings revealed strong support for managed, scalable platforms with enhanced monitoring, low latency, and self-service capabilities.
A prototype was developed integrating IBM MQ source connectors into Kafka topics, with real-time enrichment, storage via Azure Data Lake Gen2, layered transformation using Fabric’s medallion architecture, and reporting through Power BI. The implementation reduced latency by over 90% in some workflows compared to legacy batch systems.
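The layered (medallion) transformation mentioned above can be illustrated with a minimal sketch in plain Python. This is not the thesis prototype: the record fields (`account_id`, `amount`), the function names, and the sample messages are hypothetical, and in-memory lists stand in for the Kafka topics and Fabric storage layers. It only shows the general idea of moving from raw (bronze) to validated (silver) to aggregated (gold) data.

```python
import json
from collections import defaultdict


def to_silver(bronze_records):
    """Parse raw JSON strings and keep only well-formed records.

    Hypothetical schema: each record needs 'account_id' and 'amount'.
    """
    silver = []
    for raw in bronze_records:
        try:
            event = json.loads(raw)
            record = {
                "account_id": str(event["account_id"]),
                "amount": float(event["amount"]),
            }
        except (ValueError, TypeError, KeyError):
            # Malformed JSON, wrong shape, or missing field: skip the record.
            continue
        silver.append(record)
    return silver


def to_gold(silver_records):
    """Aggregate validated records into a total amount per account."""
    totals = defaultdict(float)
    for record in silver_records:
        totals[record["account_id"]] += record["amount"]
    return dict(totals)


# Bronze layer: raw messages as they might arrive from a topic (made-up data).
bronze = [
    '{"account_id": "A1", "amount": 100.0}',
    '{"account_id": "A2", "amount": 50.5}',
    "not-json",
    '{"account_id": "A1", "amount": 25.0}',
]

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping the raw messages untouched in the bronze layer means the silver and gold layers can be rebuilt if validation rules or aggregations change later.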
This thesis contributes a practical blueprint for banks seeking to modernize their data infrastructure. It also identifies the technical, organizational, and compliance considerations required to sustainably adopt real-time streaming architectures in regulated environments.