A YOLO-based intelligent system for dietary health analysis and food recommendation
Qiu, Qizu; Hu, Jihai (2026)
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-202605059545
Abstract
This thesis presents the design and implementation of an intelligent dietary health analysis and food recommendation system built on the You Only Look Once (YOLO) v8 deep learning framework. The system combines automated food recognition, dynamic portion size estimation, and individualized nutritional recommendations to overcome the limitations of manual diet tracking.
The system architecture is composed of three layers: a React Native mobile client, a FastAPI backend, and a custom fine-tuned YOLOv8 food-detection model. Two additional detection models pretrained on Common Objects in Context (COCO) are used to detect tableware, providing the spatial reference needed to estimate portion sizes dynamically. To avoid the inconvenience of external weighing scales, a dynamic portion estimation algorithm estimates food weight from the ratio of the bounding box areas of the detected food and the user's tableware. The food-detection model is fine-tuned through transfer learning on the 58-class Chinese Food Dataset.
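The area-ratio idea described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Box` type, the function name, the linear scaling, the clamping of the ratio to 1.0, and the calibration constant `grams_at_full_coverage` (the weight assumed when the food covers the whole tableware box) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        # Degenerate or inverted boxes are treated as having zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def estimate_weight_grams(food: Box, tableware: Box,
                          grams_at_full_coverage: float) -> float:
    """Estimate food weight from the food/tableware bounding box area ratio.

    Assumption: weight scales linearly with the area ratio, and
    grams_at_full_coverage is a per-dish calibration constant.
    """
    if tableware.area <= 0:
        raise ValueError("tableware box must have a positive area")
    # Clamp the ratio so food spilling past the tableware box cannot exceed 1.0.
    ratio = min(food.area / tableware.area, 1.0)
    return ratio * grams_at_full_coverage


if __name__ == "__main__":
    plate = Box(0, 0, 100, 100)   # 10,000 px^2
    rice = Box(10, 10, 60, 50)    # 2,000 px^2
    print(estimate_weight_grams(rice, plate, grams_at_full_coverage=300.0))
```

Because the estimate depends only on a ratio of areas in the same image, it does not need the camera distance, although it still inherits any error from perspective distortion and from the assumed calibration constant.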
The evaluation results indicate that the fine-tuned food-detection model, trained for 150 epochs, achieves a mean Average Precision at an IoU threshold of 0.5 (mAP50) of 0.76, which shows comparatively good food detection performance and adequate generalization in complex real-world meal scenes. In addition, a rule-based recommendation engine is implemented to suggest a balanced intake of macronutrients. In this way, the thesis helps bridge the gap between modern machine-vision methods and their application in diet-management tools.
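For readers unfamiliar with the mAP50 metric, the sketch below shows the Intersection over Union (IoU) test that underlies it: a predicted box is counted as a correct detection when its IoU with a ground-truth box of the same class is at least 0.5. The function names and the `(x1, y1, x2, y2)` tuple layout are assumptions for illustration, and the sketch covers only the matching criterion, not the full precision-recall averaging or the evaluation code used in the thesis.

```python
from typing import Tuple

BoxT = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: BoxT, b: BoxT) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap region; zero if the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_iou50(pred: BoxT, truth: BoxT, threshold: float = 0.5) -> bool:
    """True when the prediction overlaps the ground truth enough to count at IoU=0.5."""
    return iou(pred, truth) >= threshold


if __name__ == "__main__":
    truth = (0.0, 0.0, 10.0, 10.0)
    print(matches_at_iou50((2.0, 0.0, 12.0, 10.0), truth))  # IoU = 80 / 120
    print(matches_at_iou50((5.0, 0.0, 15.0, 10.0), truth))  # IoU = 50 / 150
```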
