Big Data and Google Cloud Platform : Usage to Create Interactive Reports
Zinchenko, Yevgen (2017)
Zinchenko, Yevgen
Metropolia Ammattikorkeakoulu
2017
Creative Commons Attribution-NonCommercial-ShareAlike 1.0 Finland
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-201705086856
Abstract
This study aims to show how Big Data products on Google Cloud Platform can be used to process large amounts of data. The thesis focuses on Big Data processing and the modern tools offered in Google’s technology stack. The goal of the project is to create a backend that can handle increasing amounts of incoming data.
The Big Data products considered in this work are Cloud Pub/Sub, DataFlow and BigQuery. Cloud Pub/Sub offers guaranteed message delivery at rates of up to 100 million messages per second. DataFlow lets developers concentrate on the logic flow of the application while the underlying infrastructure handles parallel data processing. BigQuery provides storage with quick access to terabytes of data.
The outcome of this work is a new backend that utilizes Big Data products from Google. The implementation allows the backend to process high traffic without performance penalties. This work also provides a clear guideline for designing and implementing systems with a similar architecture that process large amounts of data.
The new backend offers a clear market advantage: because it can process more data, a company can expand to new markets without the risk of overload. In future projects, it would be interesting to analyse the data with Google’s machine learning products.