Examination of air pollutant concentrations in Smart City Helsinki using data exploration and deep learning methods
Bhuiyan, Rabbil (2021)
All rights reserved. This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2021060113276
Abstract
Air quality has become a major concern for most cities in Europe due to rapid urbanization and industrialization. Smart City is an initiative that addresses such problems by integrating information and communication technology with citizens. Through smart computing technologies, a Smart City can capture large volumes of data that give a realistic picture of the problem domain. Given the large amount of sensor data available, air quality monitoring can be considered an essential component of the Smart City concept. This thesis used data from the Horizon 2020 mySMARTLife project, in which pollution detection sensors were deployed on public transport vehicles (trams) to continuously monitor the concentrations of pollutants such as NO, NO2, CO, and O3 throughout the day. The study applied several widely used deep learning methods, namely Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), to predict hourly pollutant concentrations based on spatial and meteorological information. The study also evaluated feature selection by testing different combinations of features against model performance, and showed that accuracy increases when meteorological variables are fused with temporally engineered features. To identify the best-performing model, four evaluation measures were applied, namely the coefficient of determination (R2), Mean Square Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), along with model parameter optimization. All the models performed comparatively well in prediction at 24-hour window horizons. In particular, the LSTM architecture outperformed all the other models in prediction quality, with lower MAE values of 0.09, 0.056, 0.096, and 0.114 for NO, NO2, CO, and O3, respectively.
Nevertheless, given the computational efficiency of the CNN algorithm, it can substitute for deep recurrent (feedback) networks such as RNN, LSTM, and GRU models to predict pollutants rapidly and accurately when working with big data.
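
The four evaluation measures named in the abstract (R2, MSE, RMSE, and MAE) can be illustrated with a minimal Python sketch. This is not the thesis code: the function name `regression_metrics` and the sample values are assumptions made purely for illustration, and the formulas are the standard textbook definitions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE and MAE for two equal-length sequences.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean of squared errors and mean of absolute errors.
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    # R2 = 1 - SS_res / SS_tot; undefined when the observations are constant.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = mse * n
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"r2": r2, "mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Made-up hourly values, purely for illustration (not thesis data).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
scores = regression_metrics(observed, predicted)
```

Lower MSE, RMSE, and MAE indicate a better fit, while R2 closer to 1 indicates that the predictions explain more of the variance in the observations.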