Speech-operated assistive software development
Kortesmaa, Daniel (2025)
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-202502062441
Abstract
Speech-operated assistive software is software that helps users operate programs or services through a voice-controlled interface. Such software can be built for specific tasks or serve as a general-purpose speech assistant.
The goal of the thesis was to create a speech-operated assistive proof-of-concept prototype that facilitates the use of the Reddit and Telegram platforms.
The prototype was written in Python and uses VOSK for speech recognition. It communicates with Reddit and Telegram through each platform's own API endpoints.
The results showed that speech recognition caused difficulties. In particular, name recognition was inaccurate, and phonetically similar words gave varying results. Additionally, some of the platforms’ functionality was difficult to implement as entirely speech-operated because of the platforms’ own limitations.
The research could be continued by improving the user experience and expanding the functionality on both platforms. Support for additional platforms could also be added.
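The pipeline described in the abstract, in which recognized speech is routed to a platform-specific API, can be illustrated with a minimal sketch. It is not the thesis's actual implementation. It assumes the recognizer output arrives as a JSON string with a "text" field, which is the format VOSK's `KaldiRecognizer.Result()` returns. The handler functions, the `HANDLERS` table, and the convention that the first spoken word names the platform are all hypothetical.

```python
import json


# Hypothetical handlers: a real prototype would call the Reddit or
# Telegram API here instead of returning a string.
def handle_reddit(args):
    return f"reddit: {' '.join(args)}"


def handle_telegram(args):
    return f"telegram: {' '.join(args)}"


# Assumed convention: the first spoken word selects the platform.
HANDLERS = {"reddit": handle_reddit, "telegram": handle_telegram}


def dispatch(result_json):
    """Route one recognizer result (a JSON string) to a platform handler."""
    text = json.loads(result_json).get("text", "")
    words = text.lower().split()
    if not words:
        return "no speech recognized"
    handler = HANDLERS.get(words[0])
    if handler is None:
        return f"unknown platform: {words[0]}"
    # Pass the remaining words to the handler as the command arguments.
    return handler(words[1:])
```

For example, `dispatch('{"text": "telegram read messages"}')` returns `"telegram: read messages"`. Exact keyword matching like this is sensitive to the misrecognition of names and phonetically similar words that the results describe.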
