Consideration of Simpson’s Paradox as a relevant concept for travel platforms
Debyolyy, Dmitriy (2022)
All rights reserved. This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2022081519467
Abstract
The main purpose of the thesis was to examine whether considering Simpson's paradox is useful in data analysis for travel platforms, and what benefits can be derived from doing so.
The study reviewed the theoretical aspects of Simpson's paradox and its nature, presented examples of the concept applied in different areas, and carried out a research study analyzing data from Booking.com.
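As an illustration of the concept, the sketch below shows the kind of reversal that Simpson's paradox describes. All numbers are invented for the example and are not taken from the thesis or from Booking.com; the hotel names, the traveler types, and the helper functions `overall_mean` and `shows_reversal` are hypothetical.

```python
# Hypothetical data: (number of reviews, mean rating) per hotel and traveler type.
# These figures are invented purely to illustrate the reversal.
data = {
    "Hotel A": {"business": (80, 7.0), "leisure": (20, 9.0)},
    "Hotel B": {"business": (20, 6.5), "leisure": (80, 8.5)},
}


def overall_mean(groups):
    """Review-weighted mean rating across all traveler types of one hotel."""
    total_n = sum(n for n, _ in groups.values())
    return sum(n * mean for n, mean in groups.values()) / total_n


def shows_reversal(data, first, second):
    """True if `first` has the higher mean in every subgroup,
    yet `second` has the higher mean once subgroups are pooled."""
    groups = data[first].keys()
    first_wins_all = all(data[first][g][1] > data[second][g][1] for g in groups)
    second_wins_overall = overall_mean(data[second]) > overall_mean(data[first])
    return first_wins_all and second_wins_overall


overall = {hotel: overall_mean(groups) for hotel, groups in data.items()}
reversal = shows_reversal(data, "Hotel A", "Hotel B")
print(overall, reversal)
```

In this toy data, Hotel A is rated higher by both business and leisure travelers, yet Hotel B has the higher pooled mean because most of its reviews come from the group that rates more generously. Traveler type plays the role of the confounding variable.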
The research part involved analyzing secondary data in order to detect Simpson's paradox. It examined five hotels and calculated their mean values for three variables that could potentially serve as the underlying cause (confounding variable) of the paradox. All calculations were performed in the Python programming language, and the bootstrapping method was used to add statistical reliability to the study.
The study concludes that Simpson's paradox is absent from the examined data. However, it revealed a series of patterns linking the studied variables to the reviewer's rating. Such patterns provide a basis for improving and optimizing the recommender systems of Booking.com, and will therefore be useful both for travel platforms and for the users of their services: hotels and travelers.
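The bootstrapping step can be sketched as follows. This is a minimal percentile-bootstrap example using only the Python standard library, not the thesis's actual code; the function name `bootstrap_mean_ci`, its parameters, and the sample ratings are assumptions made for illustration.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Returns (sample_mean, lower_bound, upper_bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement, record the mean of each resample, then sort.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(values), lower, upper


# Hypothetical reviewer ratings for one hotel (invented values).
ratings = [7.5, 8.0, 9.2, 6.3, 8.8, 7.9, 9.6, 5.4, 8.3, 7.1]
mean, lower, upper = bootstrap_mean_ci(ratings)
print(f"mean={mean:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

Comparing such intervals between hotels or subgroups gives a rough sense of whether a difference in mean values is larger than the resampling noise.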