Turning bytes into business value at scale: creation of a data product ecosystem
Pessi, Lauri (2025)
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025053018214
Abstract
This development-oriented thesis studies the application of the Data Mesh based concept of Data Products as a framework for enhancing the creation of value from data in the target organization. While data and information products have been discussed in the academic and professional literature since the 1990s, the idea of Data Mesh, along with its specific interpretation of Data Products, was introduced in 2019 by Zhamak Dehghani as a socio-technical paradigm change in how analytical data and the processes around it should be organized.
In recent years, the proposition of Data Products and a decentralized Data Mesh has created a lot of hype, as it was introduced with a promise to fix well-known scaling and bottlenecking challenges, among other negative traits attributed to existing data architectures and operating models. The lineage of challenges within analytical data processing can be traced from the Data Lakes of the 21st century back to the emergence of the Data Warehouse in the early 1980s and beyond. The historical backstory is covered in chronological order in the theoretical section to establish proper grounding for the topic.
The fundamental idea of Data Mesh borrows from the software development patterns of domain-driven design and microservices architecture. It suggests managing data as a product and distributing its ownership from the central data organization closer to the business domains that inherently know the data. Shifting ownership closer to the original source domain should relieve the bottlenecks created by data interpretation and preparation in the central data organization, whereas Data Products should work as interoperable units of exchange across domain boundaries, also within a more distributed ecosystem.
While the research aimed first to identify the demand and then to craft a suitable implementation of Data Products to meet it in the target organization, the results and theoretical background suggest that implementing Data Products alone, without an appropriate realignment of ownership, would likely produce only diminished returns. Another tangential key finding is the possibility of an emergent trend driving a more universal decentralization of data utilization and ownership, which was pointed out in the reviewed prior research and further validated by weak signals observed during the research.
The tangible outcomes of this research are its contributions to the conceptual model for data products in the target organization and to the high-level technical architecture, including aspects of a scalable self-serve data platform. As the timeline of this research covers only the early exploration and design phase activities, the most valuable takeaways to date can be found not only in the material results but also in the areas suggested for further investigation.