Evaluating AI models for acoustic-based symptom detection in home diagnostics
Heinonen, Kalle (2025)
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025061723213
Abstract
Urinary health conditions are often underdiagnosed because traditional diagnostic methods such as uroflowmetry are invasive and require clinical environments and specialized equipment. These constraints limit early detection and consistent monitoring, especially in remote or resource-limited settings. The widespread availability of smartphones and wearable devices presents an opportunity to shift toward more accessible health monitoring. Motivated by this gap and by the growing potential of AI in audio-based diagnostics, this thesis investigates how artificial intelligence can enable urinary symptom detection by analyzing sound captured with everyday devices. To address the shortcomings of conventional diagnostics, the study introduces a scalable, non-intrusive solution based on real-time audio analysis. The methodology covers audio signal processing, feature extraction, and machine learning model development, and is used to evaluate the effectiveness of several algorithms.
The study compares the performance of Support Vector Machines (SVM), Random Forest, Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN) in analyzing urinary flow audio data. Mel-frequency cepstral coefficients (MFCCs), chroma features, and spectral contrast were extracted from the recordings to capture the temporal and spectral characteristics of urinary flow. Models were trained and evaluated on a stratified, balanced dataset. Performance was assessed using accuracy, macro-averaged ROC-AUC, training time, and inference latency.
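To make the two classification metrics concrete, the following is a minimal sketch of accuracy and a one-vs-rest, macro-averaged ROC-AUC in plain Python. It is an illustration only, not the thesis's evaluation code: the function names, the three-class setup, and the toy probabilities are all assumptions chosen for the example, and the abstract does not state which averaging scheme or library was actually used.

```python
from typing import Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive sample
    receives a higher score than a random negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true: Sequence[int], proba: Sequence[Sequence[float]]) -> float:
    """Unweighted mean of the per-class one-vs-rest AUC values."""
    n_classes = len(proba[0])
    per_class = []
    for k in range(n_classes):
        labels = [1 if y == k else 0 for y in y_true]
        scores = [row[k] for row in proba]
        per_class.append(binary_auc(labels, scores))
    return sum(per_class) / n_classes


def accuracy(y_true: Sequence[int], proba: Sequence[Sequence[float]]) -> float:
    """Share of samples whose highest-probability class matches the label."""
    preds = [max(range(len(row)), key=row.__getitem__) for row in proba]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


# Hypothetical toy data: three flow classes, two recordings each.
y_true = [0, 0, 1, 1, 2, 2]
proba = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
]

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, proba))
    print("macro ROC-AUC:", macro_ovr_auc(y_true, proba))
```

Accuracy only looks at the top-ranked class per sample, whereas the macro-averaged AUC scores how well each class is ranked against the rest and weights every class equally, which is why the two metrics can disagree for the same model.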
The CNN achieved the highest macro-averaged ROC-AUC (0.93), indicating strong, balanced classification performance across all urinary flow classes. Random Forest achieved the highest accuracy (93.2%) and a fast inference time of 3 milliseconds per sample, but generalized less well on class-sensitive metrics, with a macro-averaged AUC of 0.79. SVM performed well in accuracy but poorly in AUC, which points to potential overfitting or bias toward dominant classes. The RNN showed moderate performance but holds potential for modeling session-based trends in longitudinal use cases.
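Per-sample inference latency of the kind reported above can be estimated with a simple timing loop. The sketch below is an assumption about how such a measurement might be set up, not the procedure used in the thesis; `per_sample_latency_ms` and `dummy_predict` are hypothetical names, and the stand-in predictor replaces a real trained model.

```python
import time
from statistics import median
from typing import Callable, Sequence


def per_sample_latency_ms(
    predict: Callable[[Sequence[float]], int],
    samples: Sequence[Sequence[float]],
    warmup: int = 3,
) -> float:
    """Median wall-clock time of one predict() call, in milliseconds."""
    # Warm-up calls keep one-off start-up costs out of the measurement.
    for sample in samples[:warmup]:
        predict(sample)
    timings = []
    for sample in samples:
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)


def dummy_predict(features: Sequence[float]) -> int:
    """Hypothetical stand-in for a trained classifier."""
    return 1 if sum(features) >= 0 else 0


if __name__ == "__main__":
    # Hypothetical input: 20 feature vectors of length 40.
    batch = [[0.1] * 40 for _ in range(20)]
    print("median latency (ms):", per_sample_latency_ms(dummy_predict, batch))
```

The median is used here rather than the mean so that a single slow call, for example one interrupted by the operating system, does not dominate the estimate.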
This research demonstrates the feasibility and potential of using AI-driven audio analysis for non-invasive urinary health diagnostics. The findings contribute to the growing field of mobile health technologies, offering a foundation for developing accessible and effective diagnostic tools that could improve early detection, patient outcomes, and healthcare accessibility globally. By bridging the gap between advanced AI methodologies and practical health applications, this study paves the way for further innovation in non-invasive diagnostic systems.
