Developing a Chatbot for Internal Documents
Zürcher, Alexandre (2024)
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2024053119570
Abstract
This product-based thesis presents the development of a document-based chatbot for the Swiss company Innovatim. The chatbot’s main aim is to provide accurate, document-based responses to end-user queries, and to be embeddable in any existing website. Innovatim’s end goal is to sell the chatbot to other companies as a commercial product, under a B2B business model.
One distinctive feature of this project is its use of entirely open-source technologies, which enables the creation of privacy-focused software tailored to the needs of the commissioning company’s individual clients.
The report starts with the introduction chapter, which introduces the concepts of generative AI chatbots and Retrieval Augmented Generation, and states the objectives of the project and their feasibility.
The chatbot’s goals are to have an intuitive user interface, cite its sources, support PDFs, Word documents, Excel sheets and plain text files, and answer in English, French, German and Italian. The software as a whole needs to be sufficiently secure, be deployed on Swiss infrastructure, use no AI-related API, allow the LLM in use to be changed on the fly, maintain continuous operation, and offset its carbon footprint as much as possible.
Following the introduction, the methodology chapter covers the Scrum-inspired project management and presents Dr. Nielsen’s 10 Usability Heuristics for User Interface Design, which are applied during the development of both the user and administrative interfaces.
Thereafter, the architecture of the solution is explained, including the reasons for choosing OpenStack and Docker for the infrastructure. This section is supported by a diagram that shows how the various Docker containers interact with each other.
After that, each of the project’s objectives is explored separately in the form of user stories, which present its implementation, the technology choices, and the various challenges faced.
The thesis wraps up with the discussion chapter, which concludes that the project is of good quality, as most goals are met and the commissioning company is satisfied. However, more development time is still necessary for the chatbot to cite its sources, answer in multiple languages, and be deployed on OpenStack.
