Big data handling through automated dataflow : network performance analytics suite
Huovinen, Mike (2025)
All rights reserved. This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-202503134236
Abstract
The subject of the thesis is big data handling and data visualization in the NPAS environment. The study discusses monitoring individual user equipment (UE) in a network based on IMSI, which has not previously been done in over-the-air (OTA) testing of 5G. The purpose of this work was to create a new kind of dataflow with IMSI mapping for test analytics at Nokia. OTA testing lacks the ability to monitor individual real UEs during testing, so Nokia’s OTA teams needed a new tool for separating real UE data from simulator-generated traffic in order to analyze KPIs at the single-UE level.
The study consists of a theory section and a practical section. Background information is provided first, followed by a section on mobile networks and then by big data itself and the tools used in the thesis. Because the thesis was project-based, the project description is preceded by a section on the test environment, which covers the test line, TTI trace and NPAS.
The author created a fully automated dataflow from collecting TTI trace logs to visualizing the data in Apache Superset. First, TTI traces are collected, transferred via NiFi and decoded with Nokia’s internal TtiTraceHelper tool. The decoded CSV files are then parsed and modified by a Python script, and the resulting CSVs are sent to Apache Druid through Apache Kafka. Druid manages the data and makes it available to Apache Superset for visualization. To separate the real UEs from other testing devices, the UEs’ IMSIs are needed for identification. An IMSI provider server is used to collect those IMSIs, and they are mapped to the trace data based on RNTIs, which appear in both the TTI trace and the IMSI provider server’s logs. The Python parsing script performs the mapping.
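The RNTI-based mapping described above can be illustrated with a minimal sketch. It is not the thesis's actual parsing script: the column names (rnti, imsi, dl_throughput), the sample values and the function names are all hypothetical, and both inputs are assumed to be plain CSV. Because RNTIs are temporary identifiers, a real mapping would probably also have to match on time, which this sketch leaves out.

```python
import csv
import io


def load_rnti_to_imsi(provider_csv):
    """Build an RNTI -> IMSI lookup from IMSI provider records (hypothetical format)."""
    mapping = {}
    for row in csv.DictReader(provider_csv):
        mapping[row["rnti"]] = row["imsi"]
    return mapping


def attach_imsi(trace_csv, rnti_to_imsi):
    """Return decoded trace rows with an added 'imsi' field (empty when no match)."""
    rows = []
    for row in csv.DictReader(trace_csv):
        row["imsi"] = rnti_to_imsi.get(row["rnti"], "")
        rows.append(row)
    return rows


# Made-up sample data standing in for the provider log and a decoded TTI trace.
provider = io.StringIO("rnti,imsi\n17001,244000000000001\n")
trace = io.StringIO("rnti,dl_throughput\n17001,120.5\n17002,80.0\n")

mapped = attach_imsi(trace, load_rnti_to_imsi(provider))

# Rows with a known IMSI are treated as real UEs; the rest as simulator traffic.
real_ue_rows = [r for r in mapped if r["imsi"]]
```

Trace rows whose RNTI has no provider record keep an empty IMSI, which mirrors the situation reported below, where only some IMSIs could be mapped.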
The result was not as expected, since IMSIs could not be mapped in every testing situation. Some tools used in the thesis did not work as planned, and only some of the IMSIs could be mapped to the TTI trace.
Almost all phases of the work were carried out by trial and error. The tools mentioned above were adopted as the project progressed. The system can still be improved considerably, for example through optimization work and data management.