Building Topic Modelling on Theses Abstracts Data : Thesis Supervisors Finder for Students
Vu, Mai (2021)
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-03102000
Abstract
This thesis focuses on topic modeling of thesis data from students of Finnish Universities of Applied Sciences, collected from Theseus. Its main objective is to extract valuable information from the given data and to build useful applications on top of it.
The thesis starts with the project background, covering the project’s resources, including the data and the CSC supercomputer used to run the algorithms. Next comes a comprehensive theoretical background that describes NLP and its methodical approach, as well as topic modeling and preprocessing techniques. This section provides the foundation for the implementation of the LDA and DTM algorithms in the following chapter. The implementation is done in Python with the NLTK and gensim libraries, running on Anaconda and the supercomputer. The results of the tested models are then reviewed and compared to find the most suitable one; two of them are noteworthy. A discussion on improving the project follows. After that, a small test builds a thesis supervisor finder for students as a proof of concept using the model. The last section concludes the research.
In short, this thesis is written in the hope that it contributes to research on applying AI to gain insights that improve teaching quality and student experience in the higher education sector.