Early Fusion-based Self-Supervised Learning (SSL) with Cross-Modality Prediction for Brain Tumour Segmentation
Umesh Kalyanappa, Rishika (2025)
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025060420224
Abstract
This study investigated the impact of self-supervised learning (SSL) pretraining on brain tumour segmentation performance using the BraTS2020 dataset obtained from Kaggle. A two-stage methodology was used. In the first stage, an SSL model was trained to predict the FLAIR MRI modality from other modalities (T1 and T1Gd) without relying on labels. Reconstruction quality was assessed using Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and the Structural Similarity Index (SSIM). In the second stage, a ResUNet-based segmentation model was trained to perform multi-class segmentation of tumour subregions, which were encoded into a single-channel label map.
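As an illustration of the reconstruction metrics named above, the following is a minimal sketch of MSE and PSNR using NumPy. It is not the thesis implementation: the function names, the `data_range` parameter, and the assumption that intensities are normalised to [0, 1] are all hypothetical. SSIM is omitted here because it is considerably longer to write out; a library routine such as scikit-image's `structural_similarity` could be used instead.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error between two images of the same shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))


def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB.

    data_range is the assumed span of valid intensities
    (1.0 for images normalised to [0, 1]).
    """
    err = mse(pred, target)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / err))


# Toy example: a predicted slice that is offset from the
# reference slice by a constant 0.1 everywhere.
target = np.full((4, 4), 0.5)
pred = target + 0.1
print(mse(pred, target))   # close to 0.01
print(psnr(pred, target))  # close to 20 dB
```

Lower MSE and higher PSNR both indicate a predicted image that is closer to the reference.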
The model was trained both with SSL-initialized weights and with random initialization to compare performance. Early fusion was performed by stacking the four MRI modalities (T1, T1Gd, T2, FLAIR) as input channels, allowing the model to learn combined anatomical and pathological features from the initial layers.
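Early fusion by channel stacking can be sketched as follows. This is a hedged illustration only: the helper name `early_fuse`, the channel-first layout, and the toy volume shape are assumptions, and the volumes are assumed to be co-registered and of identical shape.

```python
import numpy as np


def early_fuse(t1, t1gd, t2, flair):
    """Stack four co-registered modality volumes into one (4, D, H, W) array."""
    volumes = [np.asarray(v, dtype=np.float32) for v in (t1, t1gd, t2, flair)]
    if len({v.shape for v in volumes}) != 1:
        raise ValueError("all modalities must share the same spatial shape")
    # Channel-first layout (an assumption), so the first network layer
    # sees all four modalities at every voxel.
    return np.stack(volumes, axis=0)


# Toy volumes standing in for real MRI data.
shape = (8, 16, 16)
rng = np.random.default_rng(0)
t1, t1gd, t2, flair = (rng.random(shape) for _ in range(4))

fused = early_fuse(t1, t1gd, t2, flair)
print(fused.shape)  # (4, 8, 16, 16)
```

Because the modalities are combined before the first layer, the network's first convolution would take four input channels rather than one.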
A patch-based sampling strategy was employed, along with several loss functions: Dice combined with cross-entropy, Dice combined with focal loss, and focal loss alone. Training used 5-fold cross-validation, and the segmentation model was evaluated on a hold-out test set. Performance for each tumour subregion was measured using the Dice coefficient and Average Surface Distance (ASD). The same evaluation methodology was applied to the SSL encoder. The results showed that using SSL for weight initialization led to slightly higher Dice scores and lower ASD values across most configurations.
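The per-subregion Dice coefficient on a single-channel label map can be sketched as below. The function name, the smoothing term `eps`, and the label values 1, 2, 3 are illustrative assumptions rather than the encoding used in the thesis; ASD is not shown.

```python
import numpy as np


def dice_per_label(pred, target, labels, eps=1e-7):
    """Dice coefficient for each label value in two integer label maps."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    scores = {}
    for label in labels:
        p = pred == label      # binary mask of predicted voxels
        t = target == label    # binary mask of reference voxels
        intersection = np.logical_and(p, t).sum()
        denom = p.sum() + t.sum()
        # eps avoids division by zero when a label is absent from both maps.
        scores[label] = float((2.0 * intersection + eps) / (denom + eps))
    return scores


# Toy 2D label maps; 0 is background, 1-3 are hypothetical subregion labels.
target = np.array([[0, 1, 1],
                   [0, 2, 2],
                   [0, 0, 3]])
pred = np.array([[0, 1, 0],
                 [0, 2, 2],
                 [0, 3, 3]])

scores = dice_per_label(pred, target, labels=(1, 2, 3))
for label, score in scores.items():
    print(label, round(score, 3))
```

In this toy case label 2 is segmented perfectly, while labels 1 and 3 each score about 0.667 because one voxel is missed or over-segmented.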
The findings indicate that SSL pretraining can improve model accuracy and robustness in medical image segmentation tasks, especially when only a limited amount of labelled data is available.