Designing and Implementing Secure Data Flows in Hybrid Software Systems: Cloud-to-Ground Integration
Pudas, Lauri (2025)
All rights reserved. This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025060219165
Abstract
On-premises environments have not disappeared in the age of cloud computing. In particular, projects that require full control over their hardware, or that must meet a defined security level, still actively use and develop on self-managed servers. A hybrid cloud infrastructure makes it possible to extend the less security-critical parts of such projects into public clouds, either to ease development or to serve as a data warehousing and preprocessing platform. If public clouds are used, the project must address the security issues that arise from this extension.
This study compared technologies for connecting an on-premises environment to a public cloud environment. The public cloud acted as the less secure environment and the on-premises environment as the more secure one. The study was based on a research scenario in which several types of data were transferred from the cloud to on-premises. The comparison focused on building a secure data or message streaming solution, with emphasis on the constraint that requests are initiated only from the on-premises side. The candidates were AWS SQS, Apache Kafka, and PostgreSQL. Each was introduced individually, together with a description of how it solved the problem. The candidates were then compared with one another to highlight their differences in each category.
As a result, Apache Kafka was chosen from the pool of candidates. Apache Kafka was best suited to the scenario, although a few key points might reduce its desirability. Kafka’s multiuse potential and configurability were seen as highly positive attributes, and Kafka also provided high throughput with history playback. However, Kafka requires separate clusters in the sending and receiving environments, and if the project does not use Kafka in any other way, this additional resource requirement can be seen as troublesome. Finally, a Kafka setup intended for hybrid cloud use was implemented for local development on a single workstation. The implementation used common tooling and showed the most important steps and configurations needed to reproduce it.
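The two properties emphasized in the abstract, requests initiated only from the on-premises side and history playback, can be illustrated with a minimal in-memory sketch. It is not the implementation described in the thesis and uses no Kafka client; the names CloudLog, OnPremConsumer, poll, and replay_from are hypothetical and chosen only for illustration. The cloud side holds an append-only log and never opens a connection, while the on-premises side pulls from its own stored offset.

```python
from dataclasses import dataclass, field


@dataclass
class CloudLog:
    """Hypothetical stand-in for a cloud-side, append-only message log."""
    records: list[str] = field(default_factory=list)

    def append(self, message: str) -> int:
        # Producers in the cloud append; the log never pushes data anywhere.
        self.records.append(message)
        return len(self.records) - 1

    def fetch(self, offset: int, max_records: int = 10) -> list[tuple[int, str]]:
        # Answers a pull request with (offset, message) pairs.
        end = min(offset + max_records, len(self.records))
        return [(i, self.records[i]) for i in range(offset, end)]


@dataclass
class OnPremConsumer:
    """Hypothetical on-premises reader; every request originates here."""
    log: CloudLog
    offset: int = 0

    def poll(self, max_records: int = 10) -> list[str]:
        # The on-premises side initiates the request and tracks its own position.
        batch = self.log.fetch(self.offset, max_records)
        if batch:
            self.offset = batch[-1][0] + 1
        return [message for _, message in batch]

    def replay_from(self, offset: int) -> None:
        # History playback: move the read position back and poll again.
        self.offset = max(0, offset)
```

Because the read position lives on the consumer side, replaying history only requires resetting that position; the log itself is left unchanged.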