Counting Basketball Shot Attempts with Computer Vision (PoC)
Trinh, Thanh (2025)
All rights reserved. This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-202601091159
Abstract
This thesis presents a proof-of-concept system for detecting, classifying, and mapping basketball shot attempts from short single-view video clips. Basketball performance analysis is increasingly important at many levels of play, yet automatic tools that track both teams during real games are mostly limited to professional environments or costly commercial platforms. Smaller clubs often lack access to such solutions. The aim of this project is to examine whether a lightweight pipeline, built from pretrained models and standard computer vision methods, can provide useful shot-attempt information using only one camera.
The system combines two pretrained Roboflow models: one for recognizing basketball events (jump shots, layups or dunks, and ball-in-basket) and one for detecting court keypoints. Through temporal logic and single-view homography, the pipeline identifies shot attempts, classifies them as made or missed, and maps their locations to canonical court coordinates. The outputs include an annotated broadcast-style video, a top-down court-map visualization, and a simple JSON file listing the location, outcome, and left/right side of each attempt. The pipeline runs on a Google Colab GPU runtime and is designed to be easy to use.
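The single-view homography step can be sketched as follows. Four detected court keypoints in the image are paired with their known positions on a canonical court plan, a plane-to-plane homography is estimated, and each detected shot location is projected into court coordinates. This is a minimal illustration using a direct linear transform in plain NumPy; the keypoint coordinates, court dimensions, and helper names below are illustrative assumptions, not values or functions from the actual implementation.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: estimate the 3x3 homography H that maps
    src points to dst points from four (or more) correspondences."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def to_court(H, pt):
    """Project an image point into canonical court coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Hypothetical court keypoints: image pixels paired with canonical
# half-court coordinates in metres (here a 15 m x 14 m half court).
image_pts = [(210, 640), (1110, 640), (980, 380), (340, 380)]
court_pts = [(0.0, 0.0), (15.0, 0.0), (15.0, 14.0), (0.0, 14.0)]

H = fit_homography(image_pts, court_pts)

# Map a detected shot attempt (e.g. the shooter's feet) onto the court map.
shot_x, shot_y = to_court(H, (660, 560))
```

In the pipeline described above, the resulting court-plane coordinates would feed both the top-down visualization and the per-attempt JSON record; a production system would typically use a RANSAC-based estimator to tolerate noisy keypoint detections.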
The results show that the system works reliably on broadcast-like clips where the viewpoint and court appearance match the detector’s training distribution. In these cases, the outputs were coherent, the made/missed decisions mostly matched human judgment, and the mapped coordinates were plausible. Runtime performance was practical for short clips. However, the system struggled with amateur footage due to poor court-line visibility, camera motion, occlusion, and domain shift. These issues reduced both homography stability and event detection accuracy.
Overall, the study demonstrates that a minimal single-camera pipeline can provide useful shot-attempt information under controlled conditions. At the same time, the findings highlight the need for further adaptation—such as collecting local training data, fine-tuning the models, and improving robustness to panning—before the system can be used reliably in amateur environments.
The full implementation is submitted as a separate Jupyter notebook for reproducibility.
