SPOCKMIP: Segmentation of Vessels in MRAs with Enhanced Continuity using Maximum Intensity Projection as Loss

Chethan Radhakrishna Karthikesh Varma Chintalapati Sri Chandana Hudukula Ram Kumar Raviteja Sutrave Hendrik Mattern Oliver Speck Andreas Nürnberger Soumick Chatterjee contact@soumick.com Faculty of Computer Science, Otto von Guericke University Magdeburg, Germany Biomedical Magnetic Resonance, Otto von Guericke University Magdeburg, Germany Data and Knowledge Engineering Group, Otto von Guericke University, Magdeburg, Germany German Centre for Neurodegenerative Disease, Magdeburg, Germany Centre for Behavioural Brain Sciences, Magdeburg, Germany Genomics Research Centre, Human Technopole, Milan, Italy
Abstract

Identification of vessel structures of different sizes in biomedical images is crucial in the diagnosis of many neurodegenerative diseases. However, the sparsity of good-quality annotations of such images makes the task of vessel segmentation challenging. Deep learning offers an efficient way to segment vessels of different sizes by learning their high-level feature representations and the spatial continuity of such features across dimensions. Semi-supervised patch-based approaches have been effective in identifying small vessels of one to two voxels in diameter. This study focuses on improving the segmentation quality by considering the spatial correlation of the features using the Maximum Intensity Projection (MIP) as an additional loss criterion. Two methods are proposed that incorporate MIPs of the label segmentation along a single axis (the z-axis) and along multiple perceivable axes of the 3D volume. The proposed MIP-based methods produce segmentations with improved vessel continuity, which is evident in visual examinations of ROIs. In this study, a UNet MSS with ReLU activation replaced by LeakyReLU is trained on the StudyForrest dataset. Patch-based training is improved by introducing an additional loss term, MIP loss, to penalise the predicted discontinuity of vessels. A training set of 14 volumes is selected from the StudyForrest dataset, which comprises 18 7-Tesla 3D Time-of-Flight (ToF) Magnetic Resonance Angiography (MRA) images, and is used to perform a five-fold cross-validation. The generalisation performance of the method is evaluated using the other unseen volumes in the dataset. The proposed method with multi-axes MIP loss produces better-quality segmentations, with a median Dice of $80.245 \pm 0.129$, while the method with single-axis MIP loss produces segmentations with a median Dice of $79.749 \pm 0.109$.
Furthermore, a visual comparison of the ROIs in the predicted segmentation reveals a significant improvement in the continuity of the vessels when MIP loss is incorporated into training.

keywords:
Vessel Segmentation, Deep Learning, MR Angiograms, 7 Tesla MRA, TOF-MRA, Maximum Intensity Projection, Multi-axis MIP

1 Introduction

Segmentation of vessels on 7T magnetic resonance angiography (MRA) is one of the most important tasks in the analysis of biomedical images, as it provides essential information for the diagnosis and treatment of cerebrovascular diseases. An enhanced signal-to-noise ratio offered by 7T MRA allows for superior visualisation of cerebral vessels, revealing a higher proportion of small vessels compared to 1.5T or 3T MRAs [1]. The intricacies of small vessel structures complicate the segmentation process. Manual segmentation is labour-intensive and often leads to errors due to the challenging nature of the task. Machine learning (ML), specifically deep learning (DL), has shown promise in automating and improving the accuracy of vessel segmentation [1]. Despite its potential, deep learning in 7T MRA vessel segmentation is hindered by the need for extensive and expert-driven annotations. The manual segmentation process, necessary for training these models, is time-consuming and labour-intensive, especially given the intricacy of vessel structures in 7T MRAs. DS6 [2], a semi-supervised deep learning approach, attempts to mitigate this issue by learning from a small dataset with noisy annotations. Although this method successfully segments vessels as small as one to two voxels in diameter, it does not always produce segmentations that preserve the continuity of the vessels. Current research focuses on improving the continuity of vessels by considering the spatial correlation of pixels across dimensions.

For high-resolution 3D 7T MRA, creating ground truth annotations without imperfections is challenging. Usually, manual, semi-automatic, or classic vesselness segmentations are used to create training labels. However, these imperfections in training labels can reduce the performance of the trained deep learning network. In particular, if single voxels or small clusters are missed in the training labels, discontinuities in the deep learning-based vessel segmentation can occur. To overcome these challenges, this research attempts to take advantage of the inherent property of maximum intensity projection (MIP) to compress the 3D training annotations into 2D projections. Conventional maximum intensity projections (MIPs) are often used to quickly assess the vasculature in a single 2D image (or projection), instead of browsing through the entire 3D dataset. Although this reduces the dimensionality of the data, it does compensate for small clusters of missing labels. Using the MIP of the training label as an additional source to create a loss term, the authors expect an improvement in vessel continuity in the resulting DL segmentations.

1.1 Related Work

1.1.1 Vessel segmentation using manual and non-DL based methods

To distinguish between vessels and non-vessels, experts assign each voxel a value of 0 for non-vessels and 1 for vessels. The detection of the lenticulostriate arteries (LSA) is one of the key tasks in this study, which includes the annotation of large and small vessels. Unlike detecting large vessels, perceiving the gaps between these vessels consisting of extremely small vessels (LSA) is difficult for the human eye. Therefore, manual segmentation procedures are biased towards the perspectives and expertise of the individual performing the annotations. The need to annotate a large number of voxels precisely to render segmentations of the 3D volume makes the task time-consuming.

The Frangi [3] filter is designed to enhance blood vessels and other tubular structures with the eventual goal of vessel segmentation by improving contrast and reducing noise. The approach is based on Hessian eigenvalues, which are instrumental in the vessel contrast enhancement and suppression of non-vascular structures. However, this method requires significant parameter tuning to identify small vessels of interest. Occasionally, these parameters need to be manually fine-tuned according to each dataset and volume.

The ‘Openly available sMall vEsseL sEgmenTaTion pipelinE’ (OMELETTE) [4] method (https://gitlab.com/hmattern/omelette) focuses on segmenting images based on thresholds. Voxels above a certain threshold are considered vessels, and the rest are considered background. Hysteresis thresholding is employed because it helps maintain vessel continuity: voxels above the lower threshold are retained if they are connected to voxels above the higher threshold. Additionally, Jerman’s filter is used to apply Jerman’s vessel response function, which is based on the volume ratio of the Hessian matrix eigenvalues.
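The hysteresis thresholding idea described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the OMELETTE implementation: the function name, the face-connected neighbourhood, and the example thresholds are assumptions made for this sketch.

```python
import numpy as np

def hysteresis_threshold(image, low, high):
    """Keep voxels above `low` only if they connect to a voxel above `high`.

    Connectivity is face-adjacency (an assumption of this sketch).
    """
    weak = image > low    # candidate voxels
    mask = image > high   # seed voxels that are certainly vessel
    while True:
        grown = mask.copy()
        for axis in range(image.ndim):
            fwd = [slice(None)] * image.ndim
            bwd = [slice(None)] * image.ndim
            fwd[axis] = slice(1, None)
            bwd[axis] = slice(None, -1)
            # Spread the mask by one voxel in both directions, without wrap-around.
            grown[tuple(fwd)] |= mask[tuple(bwd)]
            grown[tuple(bwd)] |= mask[tuple(fwd)]
        grown &= weak     # growth is only allowed into candidate voxels
        if np.array_equal(grown, mask):
            return mask
        mask = grown

# Illustrative 1D intensity profile and thresholds.
profile = np.array([0.9, 0.5, 0.5, 0.1, 0.5])
vessel = hysteresis_threshold(profile, low=0.3, high=0.8)
```

In this toy profile the two values of 0.5 next to the strong voxel are retained, whereas the isolated 0.5 at the end is rejected because it is separated from the seed by a voxel below the lower threshold.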

1.1.2 Vessel segmentation using deep learning techniques

Convolutional Neural Networks (CNN) have been extensively used for computer vision and image processing tasks. The high-level feature representations learnt using such networks can be efficiently used as segmentation boundaries. However, the biomedical image segmentation task presents the challenge of learning such representations using limited weak annotations. UNet [5] architecture proposes an end-to-end trainable network with a contracting path to learn high-resolution context information, followed by a symmetric expanding path that produces more precisely localised segmentation.

UNet-based architectures have been proven to be efficient in the task of segmenting vessels. One such network is the UNet with Multi-Scale Deep Supervision (UNet MSS) [6, 7]. Zeng et al. proposed a multi-scale loss to learn discriminative features at every level and computed the overall loss as the sum of losses at each up-sampling scale of the expansion path of the UNet. Using this architecture as the backbone, Chatterjee et al. [2] proposed a semi-supervised deformation-aware learning approach for vessel segmentation with noisy labels. The limited annotated samples were augmented by subjecting them to random elastic deformations. The deformed samples were trained using a Siamese architecture based on UNet [5] and UNet-MSS [6, 7] models. The approach was based on the hypothesis that learning features at different scales helps segment vessels of different sizes and that deformation awareness improves consistency given a small set of noisy samples.

1.1.3 Maximum intensity projection

Maximum intensity projection (MIP) is used to visualise hyperintense structures in a 3D volume as a 2D projection, where, for each projection trace, only the voxel with the highest intensity is shown in the final 2D MIP. [8] hypothesises that a higher proportion of vessel structure is apparent in the MIPs as opposed to the 3D volumes and this can be exploited in cerebrovascular segmentation. MIPs have also been instrumental in the detection of pathologies. [9] and [10] demonstrated the use of MIPs of dynamic contrast-enhanced MRIs in detecting and classifying breast lesions. Furthermore, studies by [11] and [12] have shown that MIPs can be instrumental in detecting pulmonary nodules and qualitative analysis of intracranial vascularity. In the current study, the authors hypothesise that the MIP of the 3D MRA annotations can be used to improve the UNet-MSS [7, 2] network’s perception of vessel continuity.
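As a minimal illustration of the projection described above, the following NumPy sketch computes the MIP of a small synthetic binary volume along each axis. The array, its size, and all names are illustrative assumptions and are not taken from the study's code.

```python
import numpy as np

# Synthetic 3D binary "label" volume, indexed as (z, y, x); purely illustrative.
volume = np.zeros((4, 5, 6), dtype=np.uint8)
volume[0, 2, 0:3] = 1   # first part of a vessel, located in slice 0
volume[3, 2, 3:6] = 1   # second part of the same vessel, located in slice 3

def mip(vol, axis):
    """Maximum intensity projection: keep the largest value along each projection ray."""
    return vol.max(axis=axis)

mip_z = mip(volume, axis=0)   # shape (5, 6)
mip_y = mip(volume, axis=1)   # shape (4, 6)
mip_x = mip(volume, axis=2)   # shape (4, 5)
```

In the z-projection the two parts appear as one uninterrupted line (row 2 of `mip_z`), although they lie in different slices of the volume. This is the property that makes a 2D MIP a compact summary of vessel continuity.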

1.2 Contributions

This work attempts to tackle the problem of vessel continuity in deep learning-based segmentation models by incorporating maximum intensity projection (MIP) as an additional loss criterion. Two versions of the proposed loss term have been explored here; they have been employed on two different deep learning models and evaluated for overall segmentation quality, underlying vasculature, and vessel continuity. This advancement has significant potential to improve the precision and reliability of vessel segmentation in neuroimaging, thereby contributing to the better diagnosis and treatment of cerebrovascular diseases, especially small vessel disorders.

2 Methodology

Figure 1: The Modified UNet MSS network is trained on 3D patches created from the input volume. MIPs of multi-level predictions are then compared against their corresponding patches on the MIP of the ground truth to compute MIP loss. The MSS loss is computed by comparing multi-level patch predictions with the corresponding label patches. A weighted sum of the two losses is then backpropagated.

2.1 Proposed Approach: SPOCKMIP

This paper proposes SPOCKMIP (Segmentation Precision Optimised with Continuity Knowledge using Maximum Intensity Projection), a method that uses the same UNet-MSS architecture as the DS6 research [2], with the ReLU activation function replaced by LeakyReLU, and enhances the patch-based training pipeline by introducing MIP comparisons as an additional loss term, as shown in Fig. 1. The MIPs of the predictions for each patch at each level of UNet-MSS are computed. The predicted patch MIPs are then compared with their corresponding patches in the respective label MIPs to evaluate the MIP loss, $L_{MIP}(\theta)$, as shown in Eq. 4 and Fig. 2(a).

2.1.1 Maximum intensity projection loss along the slice-dimension

In addition to the Multi-Scale Supervision (MSS) Loss, the spatial continuity of the vessels along the z-axis (i.e. the slice dimension) is incorporated into the learning in the form of the MIP loss. Eq. 1 represents the total loss, which is a weighted sum of the MSS loss $L_{MSS}(\theta)$ and the MIP loss $L_{MIP}(\theta)$ with weight parameter $\mu$ and network parameter $\theta$. Eq. 2 represents the MSS loss, where $m$ refers to the total number of up-sampling scales, and $\alpha_i$ is the weight assigned to the loss at a specific up-sampling level. Eq. 3 represents the MIP loss, which is calculated by comparing the MIP of the predicted segmentation of the patch $\hat{y}$ against the subset of the MIP of the label segmentation $Y$ encompassing the patch. The Focal Tversky loss [13] is used as the loss function for calculating all losses.

$$Loss(\theta) = \mu L_{MSS}(\theta) + (1 - \mu) L_{MIP}(\theta) \tag{1}$$

$$L_{MSS}(\theta) = \frac{1}{m} \sum_{i=1}^{m} \alpha_i \, l_i(\theta) \tag{2}$$

$$L_{MIP}(\theta) = \frac{1}{m} \sum_{i=1}^{m} \alpha_i \, lmip_i(\theta) \tag{3}$$

$$lmip(\theta) = loss(MIP(\hat{y}),\ y \subseteq MIP(Y)) \tag{4}$$
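The weighted sum in Eqs. 1 to 4 can be sketched numerically as follows. This is a NumPy illustration under stated assumptions, not the training code of the study: the Focal Tversky parameters (`fn_weight`, `fp_weight`, `focal_exp`) are placeholder values, every level's prediction is assumed to have been resampled to the patch size already, and an actual training loop would use a differentiable framework instead of NumPy.

```python
import numpy as np

def focal_tversky_loss(pred, target, fn_weight=0.7, fp_weight=0.3,
                       focal_exp=0.75, eps=1e-7):
    """Focal Tversky loss on soft predictions in [0, 1] (placeholder parameters)."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    tp = np.sum(pred * target)
    fn = np.sum((1.0 - pred) * target)
    fp = np.sum(pred * (1.0 - target))
    tversky = (tp + eps) / (tp + fn_weight * fn + fp_weight * fp + eps)
    return max(0.0, 1.0 - tversky) ** focal_exp

def total_loss(level_preds, label_patch, label_mip_patch, level_weights,
               mu=0.7, axis=0):
    """Eq. 1: mu * L_MSS + (1 - mu) * L_MIP for one 3D patch."""
    m = len(level_preds)
    # Eq. 2: weighted voxel-wise loss over the m supervision levels.
    l_mss = sum(a * focal_tversky_loss(p, label_patch)
                for a, p in zip(level_weights, level_preds)) / m
    # Eqs. 3 and 4: the same weighting, applied to the MIP of each prediction.
    l_mip = sum(a * focal_tversky_loss(p.max(axis=axis), label_mip_patch)
                for a, p in zip(level_weights, level_preds)) / m
    return mu * l_mss + (1.0 - mu) * l_mip

# Toy patch with a single vessel segment.
label = np.zeros((4, 4, 4))
label[1, 2, :] = 1.0
label_mip = label.max(axis=0)
weights = [1.0, 0.66, 0.34]

perfect = total_loss([label.copy()] * 3, label, label_mip, weights)
empty = total_loss([np.zeros_like(label)] * 3, label, label_mip, weights)
```

With these toy inputs a perfect prediction gives a loss of zero, while an all-background prediction gives a loss close to $(1 + 0.66 + 0.34)/3 \approx 0.667$, because both loss terms then approach their maximum at every level.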

2.1.2 Cumulative maximum intensity projection loss across multiple axes

The authors propose an additional hypothesis that the continuity of vessel structures can be better perceived by analysing MIPs of the volume across multiple axes. This is achieved by comparing the MIP of the network’s 3D patch predictions against the corresponding patches of MIPs of 3D labels across three different views, as shown in Fig. 2(b). Therefore, the overall MIP loss is calculated as an equally weighted sum of the MIP loss along each axis, as shown in Eq. 5, where $\beta$ represents the weight coefficient of the MIP loss across the $x$, $y$ and $z$ axes.

$$L_{MIP}(\theta) = \frac{1}{m} \beta \sum_{i=1}^{m} \alpha_i \left( lmip_{x,i}(\theta) + lmip_{y,i}(\theta) + lmip_{z,i}(\theta) \right) \tag{5}$$
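A minimal numerical sketch of Eq. 5 is given below. Several choices are assumptions made for illustration only: a soft Dice loss stands in for the Focal Tversky loss, the array axes 0, 1, 2 are taken as the three projection axes, and `beta` is passed as a parameter with the equal-weight value 0.33 as its default.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-7):
    """Stand-in loss for this sketch; the paper uses the Focal Tversky loss."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def multi_axis_mip_loss(level_preds, label_mip_patches, level_weights,
                        beta=0.33, loss_fn=soft_dice_loss):
    """Eq. 5: per-level sum of the MIP losses along three axes.

    label_mip_patches[ax] is the 2D label-MIP patch for projection axis ax.
    """
    m = len(level_preds)
    total = 0.0
    for a_i, pred in zip(level_weights, level_preds):
        per_axis = sum(loss_fn(pred.max(axis=ax), label_mip_patches[ax])
                       for ax in (0, 1, 2))
        total += a_i * per_axis
    return beta * total / m

# Toy patch with a single vessel segment.
label = np.zeros((4, 4, 4))
label[1, 2, :] = 1.0
label_mips = [label.max(axis=ax) for ax in (0, 1, 2)]
weights = [1.0, 0.66, 0.34]

perfect = multi_axis_mip_loss([label.copy()] * 3, label_mips, weights)
empty = multi_axis_mip_loss([np.zeros_like(label)] * 3, label_mips, weights)
```

For an all-background prediction each of the three per-axis losses is close to 1, so the result approaches $0.33 \cdot 3 \cdot (1 + 0.66 + 0.34) / 3 = 0.66$ in this toy setting.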

2.1.3 Hypothesis

The authors hypothesise that the proposed modifications, which take into account the MIP of the volume, enhance the segmentation of small vessels and improve vessel continuity. This hypothesis is evaluated and the performance of the proposed methods is compared against the baseline approaches including UNet [5] and UNet-MSS [6] both in terms of segmentation quality using ROI comparisons and a quantitative evaluation with a set of standard vessel segmentation metrics.

2.2 Datasets and Labels

Figure 2: (a) MIP loss is calculated by comparing the MIP of the network’s predictions at each level for a patch with the corresponding patch on the MIP of the ground truth segmentation. (b) Overall MIP loss is calculated by comparing the MIPs along three perceivable axes of the network’s predictions at each level for a patch with the corresponding patches on the respective MIPs along multiple axes of the ground truth segmentation.

The proposed methodology was evaluated using the StudyForrest dataset [SF7-TeslaDataset] (source: StudyForrest.org), comprising 7T MRA volumes of the brains of 20 participants, acquired using 3D multi-slab Time-Of-Flight (TOF) Magnetic Resonance Angiography (MRA) with a resolution of 300 $\mu$m. These volumes were divided into three subsets: the training set included 12 volumes, the validation set included four volumes, and the testing set included four volumes. The StudyForrest dataset includes two volumes with phase wrap-around artefacts (Fig. 3), which are regularly observed MR artefacts that occur when the dimensions of the body part being imaged exceed the defined Field of View (FOV). The two volumes with these artefacts were excluded from training, as they hindered learning by increasing noise, and were used for additional evaluations instead.

Figure 3: Wrap-around artefact

2.2.1 Label preparation

The labels for the test volumes were manually annotated and verified by a neurologist, while the labels for the training and validation sets were created in a semi-automated fashion using Ilastik [14] and 3D Slicer [15]. After annotating the volumes using Ilastik, the MIP of the original volume along the slice dimension was used to validate the accuracy. The resulting segmentation had thicker labels along with abundant noise in the skull region. The label thickness was reduced by re-annotating the outlining pixels of the vessel labels as non-vessel. To minimise skull noise, the noisy skull region was removed while retaining the prominent vessels using 3D Slicer. Skull vessels were annotated separately in Ilastik and combined with the prominent vessels mentioned earlier. These methods resulted in accurately segmented volumes with minimal noise. The area opening and area closing morphological methods provided by the scikit-image library were then applied to reduce noise and improve vessel continuity, respectively. The area opening operation was fine-tuned with an area threshold of seven and a connectivity value of two, followed by an area closing operation with an area threshold of sixty and a connectivity value of four.

2.3 Experimental Setup

A 5-fold cross-validation is performed on 14 volumes, where each fold comprises three volumes for validation, while the rest are used for training. The generalisation performance of the models trained in this way is evaluated using the held-out test set comprising the remaining four unseen volumes (three volumes were free of wrap-around artefacts, while one volume had such an artefact). 3D MRA volumes with dimensions of 480x640x163 were converted to 3D patches of $64^3$. In each epoch, 8000 such patches from the 11 training volumes are randomly selected for training. All experiments were performed with a 32GB Nvidia Tesla V100-SXM2 GPU with 10 CPUs and 60 GB RAM.

Figure 4: Label creation using Ilastik and 3D slicer

SRDataset, a PyTorch dataset extension used to create patches from a 3D volume and lazy load them using a TQDM data loader (code referenced from https://github.com/soumickmj/FTSuperResDynMRI), is used to prepare the dataset of patches for training and validation. The dataset was adapted to identify, for each patch, the corresponding location of the patch on the MIP of the segmentation label using the patch coordinates. Thus, it returns the patch, the corresponding label patch, and the corresponding label MIP patch on each load. All experiments were performed with a learning rate of 0.0001 over 50 epochs. Focal Tversky Loss [13] was used as the loss function for the calculation of both the supervised loss (MSS loss) and the MIP loss. These were optimised during training using the Adam optimiser [16]. Additionally, Automatic Mixed Precision (AMP, using Nvidia Apex) and gradient clipping techniques were employed to reduce memory requirements and to prevent exploding gradients.
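The lookup of the label MIP patch from the patch coordinates can be illustrated with the short NumPy sketch below. It is not the SRDataset code: the function `label_mip_patch`, its arguments, and the toy volume are hypothetical, and the sketch assumes that the MIP is computed once over the full label volume and then cropped at the in-plane position of the patch.

```python
import numpy as np

def label_mip_patch(label_mip, patch_start, patch_size, axis):
    """Crop the 2D label MIP at the in-plane location of a 3D patch.

    label_mip   : MIP of the full label volume along `axis`
    patch_start : start coordinates of the 3D patch, one per volume axis
    """
    # Drop the coordinate of the projected axis; the other two index the MIP.
    r0, c0 = [s for ax, s in enumerate(patch_start) if ax != axis]
    return label_mip[r0:r0 + patch_size, c0:c0 + patch_size]

# Toy label volume with one vessel voxel outside the patch along axis 0.
label = np.zeros((8, 8, 8), dtype=np.uint8)
label[6, 1, 1] = 1
full_mip = label.max(axis=0)

start, size = (0, 0, 0), 4
patch = label[0:4, 0:4, 0:4]
target = label_mip_patch(full_mip, start, size, axis=0)
```

Under this reading, the MIP target of a patch can contain vessels that lie outside the patch along the projection axis: here `target[1, 1]` is 1 although the label patch itself is empty. Whether and how such cases are handled in the actual pipeline is not specified in the text.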

2.3.1 Hyperparameters

| Hyperparameter | Value |
| --- | --- |
| Batch Size | 15 |
| Patch Size | 64 |
| Number of Epochs | 50 |
| Learning Rate | 0.0001 |
| Stride Dimensions | 16x32x32 (Depth x Width x Length) |
| Samples Per Epoch | 8000 |
| MSS loss Coefficient ($\mu$) | 0.7 |
| MIP loss Coefficient ($1 - \mu$) | 0.3 |
| MSS level Coefficient ($l_i$) | [1, 0.66, 0.34] on levels 1, 2, and 3 respectively of UNet MSS [6] |
| MIP axis Coefficient ($\alpha_i$) | 0.33 (equal weightage) |

Table 1: The hyperparameters and their optimal values are based on initial experiments.

Table 1 shows the set of different hyperparameters and their optimal values selected based on initial experiments, memory constraints, and the previous study [2]. The large 3D MRA volumes were divided into 3D patches of equal dimensions to facilitate memory-efficient learning while also increasing the number of samples per epoch. In each training iteration, a set of patches was chosen from different training volumes at random and supplied to the network. Taking into account the underlying limitation of the GPU and the size of the dataset used for the experiments, the batch size was set to 15, and the patch dimensionality was set to $64^3$.

3D patches were created using SRDataset, where the coordinates of the patches are determined by traversing the 3D volume with an arbitrary stride. The dimensions of this stride were set to 16x32x32 (depth x width x length) to allow enough overlap. The patch coordinates returned by the tool were used to lazy load the set of patches during training, and each training volume was divided into more than 10,000 overlapping patches. To mitigate overfitting, 8,000 randomly selected patches are used for each training epoch.
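The strided traversal described above can be sketched as follows. The helper names, the border handling (one extra start position so that the end of each dimension is covered), and the small example volume are assumptions of this sketch and may differ from the behaviour of SRDataset.

```python
import numpy as np
from itertools import product

def patch_starts(dim, patch, stride):
    """Start indices along one dimension of the volume."""
    if dim <= patch:
        return [0]                      # a smaller dimension would need padding
    starts = list(range(0, dim - patch + 1, stride))
    if starts[-1] != dim - patch:
        starts.append(dim - patch)      # extra start so the border is covered
    return starts

def patch_coordinates(shape, patch=64, stride=(16, 32, 32)):
    """All patch start coordinates for a 3D volume of the given shape."""
    axes = [patch_starts(d, patch, s) for d, s in zip(shape, stride)]
    return list(product(*axes))

# Small illustrative volume; the axis order is assumed to match the stride order.
coords = patch_coordinates((96, 128, 128))

# Random subset of patches for one epoch, drawn without replacement.
rng = np.random.default_rng(0)
chosen = rng.choice(len(coords), size=10, replace=False)
epoch_patches = [coords[i] for i in chosen]
```

Only the coordinates are stored, so the patches themselves can be cropped from the volume lazily when a batch is requested.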

The proposed overall loss is calculated as a weighted sum of multi-scale supervision loss (MSS loss) and MIP loss. Optimising the network weights by minimising this multi-objective loss often results in exploding gradients. To allow the model to explore the local solution landscape of each loss term, a lower learning rate of 0.0001 was selected compared to the previous study [2]. It was observed that the convergence of the single- and multi-axes experiments occurred after 42 epochs. Therefore, the number of epochs is set to 50.

2.3.2 Loss coefficients

The complexity of learning the optimal objective function using deep learning models increases with the number of optimisation criteria. Here, the network is essentially optimising two objectives: voxel similarity loss (MSS loss) and maximum intensity projection loss (MIP loss). The proposed approach for this multi-objective optimisation is to optimise the weighted sum of the two losses. The hyperparameter $\mu$ in Eq. 1 is added as an additional network parameter with an initial value of 0.7, which is selected based on experimental results. The optimal value of the parameter learnt by the network was found to be 0.68. Further experiments showed that the learning suffered from under-fitting with reduced importance on MSS loss, i.e., with $\mu < 0.7$. On the other hand, a substantial increase in over-segmentation was observed with increasing weight on MIP loss, i.e., $\mu > 0.7$. The weight coefficient $\alpha$ associated with the MIP loss of each axis $i$ in Eq. 5 was set to 0.33, implying equal importance for each axis. Several experiments were performed by assigning different importance to MIP loss at each axis, including a random assignment. These resulted in over-segmentation, ranging from overlapping vessel boundaries to identifying everything as vessels. The loss coefficient $\beta$ in Eq. 5 was set to 0.3, similar to $1 - \mu$ in Eq. 1.

Single-axis MIP experiments

The coefficient $\mu$ was set to 0.7 after a set of initial experiments. In addition, the hyperparameter was added as a parameter to the network, and the optimal value was found to be 0.68. ReLU activation was replaced by LeakyReLU to improve the optimisation of the combined losses. A higher weight was assigned to the MSS loss to learn the underlying vessel features.

Multi-axis MIP Experiments

The SRDataset2 was adapted to compute the MIP of the labels across three dimensions of perception. The corresponding patches were identified and returned on each data load, along with the patch and its corresponding label patch. The MIP loss was computed as the sum of MIP loss along each axis, as shown in Eq. 5. Equal weights were assigned to the individual axis losses, and the coefficient $\beta$ was set to 0.33 after initial experiments. The weight coefficient $\mu$ was set to 0.7 for the overall sum of losses, as in Eq. 1.

2.4 Evaluation

The segmentation results are quantitatively evaluated against the manually segmented ground truth in terms of the overall segmentation score using Dice Coefficient (Dice), Area under ROC Curve (AUC), and Sensitivity. In addition, the underlying structure of the vessels and the parts of the vessels identified are evaluated using Volumetric Similarity Coefficient (VS), Mutual Information (MI), and Mahalanobis Distance (MHD). The Dice Coefficient and Volumetric Similarity evaluate the overlap and volumetric precision of segmentations, while the AUC and Sensitivity provide insights into the model’s discriminative power and its ability to identify true positives. Furthermore, Mutual Information and Mahalanobis Distance capture both structural similarity and geometric accuracy. Together, these metrics provide a broad assessment of segmentation quality, covering overlap, volume agreement, discriminative power, and structural similarity.

The Dice Coefficient is a measure of overlap between two sets, $A$ and $B$. For segmentation tasks, it can be defined as:

$$\text{Dice}(A, B) = \frac{2 |A \cap B|}{|A| + |B|} \tag{6}$$

where $A$ and $B$ are the sets of predicted and ground-truth segments, and $|\cdot|$ is the number of elements in the respective set. The Area Under the ROC Curve is a performance measurement for classification tasks (a segmentation task can be considered a pixel-wise classification task) at various threshold settings. The ROC is a probability curve and the AUC represents the degree or measure of separability. It is computed as:

$$\text{AUC} = \int_{0}^{1} \text{TPR}(t) \, d\,\text{FPR}(t) \tag{7}$$
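The integral in Eq. 7 can be approximated numerically with the trapezoidal rule over all distinct thresholds, as in the NumPy sketch below. This is an illustration only and not the computation performed by the evaluation tool used in this study; the function name and the toy scores are assumptions, and both classes are assumed to be present in the labels.

```python
import numpy as np

def roc_auc(scores, labels):
    """Trapezoidal estimate of the area under the ROC curve."""
    scores = np.asarray(scores, dtype=np.float64).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    order = np.argsort(-scores, kind="stable")      # descending scores
    scores, labels = scores[order], labels[order]
    # Last index of every group of tied scores: one ROC point per threshold.
    distinct = np.r_[np.nonzero(np.diff(scores))[0], labels.size - 1]
    tp = np.cumsum(labels)[distinct]
    fp = np.cumsum(~labels)[distinct]
    tpr = np.r_[0.0, tp / tp[-1]]
    fpr = np.r_[0.0, fp / fp[-1]]
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy voxel scores and their ground-truth classes.
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

For these four toy scores, three of the four positive-negative pairs are ranked correctly, which corresponds to an AUC of 0.75.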

where $\text{TPR}(t)$ and $\text{FPR}(t)$ are the true positive and false positive rates at threshold $t$. Sensitivity, or recall, is the proportion of actual positives that the model correctly identifies and is computed as:

$$\text{Sensitivity} = \frac{\text{TP}}{\text{TP} + \text{FN}} \tag{8}$$

where TP and FN are the numbers of true positives and false negatives, respectively. The Volumetric Similarity Coefficient measures the similarity in volume between the predicted segmentation and the ground truth using:

$$\text{VS} = 1 - \frac{|V_A - V_B|}{V_A + V_B} \tag{9}$$

where $V_A$ and $V_B$ are the volumes of the predicted and ground-truth segments. Mutual Information (MI) measures the amount of information obtained about one random variable through the other random variable. For segmentation, it can be expressed as:

$$\text{MI}(A, B) = \sum_{a \in A} \sum_{b \in B} p(a, b) \log \frac{p(a, b)}{p(a) \, p(b)} \tag{10}$$

where $p(a, b)$ is the joint probability distribution of $A$ and $B$, while $p(a)$ and $p(b)$ are the marginal probability distributions. The Mahalanobis Distance is a measure of the distance between a point and a distribution. For segmentation, it is used to measure the distance between the predicted segmentation and the ground truth:

$$\text{MHD}(A, B) = \sqrt{(A - B)^{T} \Sigma^{-1} (A - B)} \tag{11}$$

where $A$ and $B$ are the vectors of predicted and ground-truth segment coordinates, and $\Sigma$ is the covariance matrix of the distribution.

To facilitate the calculation of the above-mentioned metrics, the EvaluateSegmentation [17] tool, a comprehensive tool to compare two medical volumes, was employed.
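For readers who wish to check Eqs. 6, 8 and 9 on small examples, the overlap-based metrics can be written directly in terms of confusion counts. The NumPy sketch below is an illustration with hypothetical function names and toy masks; it is not the EvaluateSegmentation implementation and does not guard against empty masks.

```python
import numpy as np

def confusion_counts(pred, truth):
    """True positives, false positives, false negatives, true negatives."""
    pred = np.asarray(pred, dtype=bool).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))
    return tp, fp, fn, tn

def dice(pred, truth):
    tp, fp, fn, _ = confusion_counts(pred, truth)
    return 2.0 * tp / (2 * tp + fp + fn)            # Eq. 6

def sensitivity(pred, truth):
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn)                           # Eq. 8

def volumetric_similarity(pred, truth):
    tp, fp, fn, _ = confusion_counts(pred, truth)
    # |V_A - V_B| = |fp - fn| and V_A + V_B = 2*tp + fp + fn
    return 1.0 - abs(fn - fp) / (2 * tp + fp + fn)  # Eq. 9

# Toy masks: equal volumes, but only two of three vessel voxels overlap.
pred = np.array([1, 1, 1, 0, 0, 0, 0, 0])
truth = np.array([1, 1, 0, 1, 0, 0, 0, 0])

d = dice(pred, truth)
s = sensitivity(pred, truth)
vs = volumetric_similarity(pred, truth)
```

The toy masks show why several metrics are reported together: the volumetric similarity is 1 because both masks contain three voxels, while Dice and Sensitivity are both 2/3 because one voxel is misplaced.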

3 Results

3.1 Quantitative Evaluation

A test set, comprising four 7T MRA volumes, is used to evaluate the 5-fold cross-validated models of each approach. One of the volumes in this set contains wrap-around artefacts, which affect the objective quantitative evaluation of the segmentation. Therefore, the resulting comparisons are presented in two categories: with and without the volume containing the aforementioned artefacts.

3.1.1 On test set without wrap-around artefacts

The median and variance over 15 segmentation results obtained from a 5-fold cross-validation (three volumes of the test set without wrap-around artefacts, evaluated over five folds, resulting in 15 segmentation results in total) are used to compare the performance of the methods. These results are reported in Table 2 and presented using violin plots in Fig. 5.
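The aggregation described above can be sketched as follows. The scores below are made-up placeholders rather than results from this study, the function name is hypothetical, and the use of the sample variance is an assumption, since the text does not state which variance estimator was used.

```python
from statistics import median, variance

# Hypothetical placeholder Dice scores (NOT results from this study):
# one inner list per cross-validation fold, one value per artefact-free test volume.
scores_by_fold = [
    [0.78, 0.80, 0.82],
    [0.79, 0.80, 0.81],
    [0.77, 0.80, 0.83],
    [0.78, 0.80, 0.82],
    [0.79, 0.80, 0.81],
]


def summarise(scores_by_fold):
    """Pool the per-volume scores of all folds and return (median, variance, count)."""
    pooled = [score for fold in scores_by_fold for score in fold]
    # Assumption: sample variance; the estimator is not stated in the text.
    return median(pooled), variance(pooled), len(pooled)


med, var, n = summarise(scores_by_fold)
print(f"{med:.3f} ± {var:.5f} over {n} segmentations")
```

Pooling before taking the median means each of the 15 segmentations contributes equally, rather than first averaging within a fold.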

| Methods / Metrics | Dice Coefficient | Area under ROC Curve | Sensitivity |
|---|---|---|---|
| UNet | 79.585 ± 0.092 | 0.84869 ± 0.00160 | 0.69762 ± 0.00648 |
| UNet MSS | 79.130 ± 0.085 | 0.86042 ± 0.00140 | 0.72123 ± 0.00569 |
| UNet MIP | 79.749 ± 0.109 | 0.85595 ± 0.00070 | 0.71262 ± 0.00279 |
| UNet MSS MIP | 79.857 ± 0.080 | 0.86447 ± 0.00100 | 0.72939 ± 0.00403 |
| UNet mMIP | **80.245 ± 0.129** | **0.86672 ± 0.00127** | **0.73403 ± 0.00512** |
| UNet MSS mMIP | 79.577 ± 0.136 | 0.86134 ± 0.00118 | 0.72413 ± 0.00474 |

(a) Overall Segmentation Scores

| Methods / Metrics | Volumetric Similarity | Mutual Information | MHD |
|---|---|---|---|
| UNet | 0.87312 ± 0.00634 | 0.04758 ± 0.00002 | 0.06812 ± 0.00051 |
| UNet MSS | 0.86421 ± 0.00568 | 0.04783 ± 0.00002 | 0.07158 ± 0.00082 |
| UNet MIP | 0.87358 ± 0.00163 | 0.04790 ± 0.00001 | **0.06053 ± 0.00037** |
| UNet MSS MIP | 0.88928 ± 0.00364 | 0.04867 ± 0.00001 | 0.06625 ± 0.00059 |
| UNet mMIP | **0.89829 ± 0.00312** | 0.04900 ± 0.00002 | 0.06629 ± 0.00033 |
| UNet MSS mMIP | 0.88557 ± 0.00338 | **0.04944 ± 0.00001** | 0.06159 ± 0.00076 |

(b) Quantitative evaluation of underlying vasculature

Table 2: Metrics comparisons of 15 segmentation results of test volumes, excluding the volume with wrap-around artefacts, over 5-fold cross-validation.

The overall segmentation scores, presented in Table 2(a), show improvements with the incorporation of MIP loss compared to their baseline counterparts. The Dice comparisons show that the UNet mMIP method achieves the highest median Dice score, 80.245 ± 0.129, against 79.585 ± 0.092 for the UNet baseline and 79.130 ± 0.085 for the UNet MSS baseline. The AUC and sensitivity comparisons also show that the proposed MIP loss improves segmentation performance, and the improvements are considerably greater for the baseline UNet than for the UNet MSS. The multi-axes UNet mMIP method outperforms its baseline, with a median AUC of 0.867 ± 0.001.

(a) Comparison of overall segmentation scores of proposed methods evaluated using the Dice Coefficient metric on the test set without wrap-around artefacts.
(b) Quantitative comparison of underlying vasculature identified by the proposed methods, evaluated using the Volumetric Similarity Coefficient metric on the test set without wrap-around artefacts.
Figure 5: Violin plots comparing the performance of proposed methods and baselines over three test volumes, excluding the volume with wrap-around artefacts, across five folds.

Furthermore, the volumetric comparisons of the identified vasculature are presented in Table 2(b), and they also demonstrate improvements over the baselines with the addition of MIP loss. The UNet model with multi-axes MIP loss, UNet mMIP, outperforms the baselines and the other proposed MIP-based methods on Volumetric Similarity, with a median score of 0.898 ± 0.003. The Mutual Information comparison shows that the UNet MSS mMIP method exploits voxel correlations better than the baselines and the other proposed methods, while the lowest MHD is obtained by the UNet MIP method, with UNet MSS mMIP a close second.

The study proposes two methods for incorporating the vessel structure information available in the MIP: single-axis MIP-based methods (UNet MIP and UNet MSS MIP) and multi-axes MIP-based methods (UNet mMIP and UNet MSS mMIP). The authors hypothesised that projecting the 3D volume along all three perceivable axes allows the network to learn better spatial correlation of the voxels than single-axis projection. Although this hypothesis is validated in the case of the baseline UNet, the overall segmentation scores of the UNet MSS are lower with MIP information from multiple axes than with single-axis MIP information. The qualitative evaluation discussed in a later section explains this observation as a consequence of increased over-segmentation. The quantitative evaluation of the underlying vasculature shown in Table 2(b) reveals a considerable improvement in the UNet segmentation results with multi-axes MIP loss as opposed to single-axis MIP loss, evident in both VS and MI scores. The performance of the UNet MSS remains broadly similar with the incorporation of single- and multi-axes MIP loss.
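The intuition behind the multi-axes hypothesis can be illustrated with a small NumPy sketch. It is not the training code of this study: the soft Dice term is a stand-in for the actual loss, the equal averaging over axes is an assumption, the function names are hypothetical, and the volume is assumed to be ordered (z, y, x) so that axis 0 is the z-axis.

```python
import numpy as np


def mip(volume, axis):
    """Maximum intensity projection of a 3D volume along one axis."""
    return volume.max(axis=axis)


def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between two 2D projections (stand-in for the actual loss)."""
    intersection = (pred * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def mip_loss(pred, label, axes=(0,)):
    """Average projection loss over the given axes; axes=(0,) is the z-axis only."""
    return float(np.mean([soft_dice_loss(mip(pred, ax), mip(label, ax)) for ax in axes]))


# Toy vessel running parallel to the z-axis; the prediction misses two slices.
label = np.zeros((8, 5, 5))
label[:, 2, 2] = 1.0
pred = label.copy()
pred[3:5, 2, 2] = 0.0  # discontinuity in the predicted vessel

single_axis = mip_loss(pred, label, axes=(0,))
multi_axes = mip_loss(pred, label, axes=(0, 1, 2))
print(single_axis, multi_axes)
```

In this toy case the z-axis projection collapses the vessel to a single point, so the gap is invisible and the single-axis term is zero, whereas the projections along the other two axes expose the discontinuity and yield a positive multi-axes term.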

Overall, it can be concluded that MIP loss (single- or multi-axes) improves the segmentation performance of both models on all six metrics. However, whether multi-axes MIP improves on single-axis MIP depends on the model. The authors hypothesise that, as the UNet MSS already has additional regularisation in the form of its additional loss terms, adding multi-axes MIP might over-regularise the model, which then loses performance compared to the single-axis MIP loss.

3.1.2 On test volume with wrap-around artefacts

| Methods / Metrics | Dice Coefficient | Area under ROC Curve | Sensitivity |
|---|---|---|---|
| UNet | 63.080 ± 0.109 | 0.73868 ± 0.00122 | 0.47775 ± 0.00499 |
| UNet MSS | 64.070 ± 0.048 | 0.74810 ± 0.00070 | 0.49677 ± 0.00285 |
| UNet MIP | 64.177 ± 0.037 | 0.76815 ± 0.00022 | 0.53774 ± 0.00088 |
| UNet MSS MIP | 64.821 ± 0.040 | 0.76566 ± 0.00037 | 0.53248 ± 0.00152 |
| UNet mMIP | 64.357 ± 0.125 | 0.75854 ± 0.00138 | 0.51805 ± 0.00563 |
| UNet MSS mMIP | **65.946 ± 0.084** | **0.77267 ± 0.00067** | **0.54650 ± 0.00271** |

(a) Overall Segmentation Scores

| Methods / Metrics | Volumetric Similarity | Mutual Information | MHD |
|---|---|---|---|
| UNet | 0.70075 ± 0.01043 | 0.03268 ± 0.00001 | 0.29559 ± 0.00142 |
| UNet MSS | 0.72650 ± 0.00660 | 0.03352 ± 0.00001 | 0.25334 ± 0.00184 |
| UNet MIP | 0.76243 ± 0.00171 | 0.03429 ± 0.00000 | 0.25282 ± 0.00079 |
| UNet MSS MIP | 0.78264 ± 0.00250 | 0.03457 ± 0.00000 | 0.25173 ± 0.00018 |
| UNet mMIP | 0.75771 ± 0.00816 | 0.03397 ± 0.00002 | 0.26898 ± 0.00029 |
| UNet MSS mMIP | **0.79332 ± 0.00385** | **0.03564 ± 0.00001** | **0.24759 ± 0.00022** |

(b) Quantitative evaluation of underlying vasculature

Table 3: Metrics comparisons of 5 segmentation results of test volume with wrap-around artefact over 5-fold cross-validation.

The presence of wrap-around artefacts observed in one volume, as shown in Fig. 3, results in over-segmentation of the vessels on the skull and disrupts vessel boundaries, thus disregarding the continuity of the vessels in the vicinity of the artefacts. The authors evaluated the models separately only on this volume across five folds to assess the robustness of these models against such artefacts.

Comparison of the overall segmentation scores tabulated in Table 3(a) shows that the proposed UNet MSS mMIP method outperforms the baselines and the other proposed methods in the presence of these artefacts, with a median Dice of 65.946 ± 0.084 across five cross-validation folds. A similar trend is observed in the quantitative comparison of the vasculature, as shown in Table 3(b).

Incorporation of MIP loss is seen to improve overall segmentation scores with both the UNet and UNet MSS baselines. Table 3(b) shows a considerable improvement in the identification of the underlying vasculature when these baselines are trained with the proposed single- and multi-axes MIP loss. For the UNet MSS, the multi-axes MIP loss outperforms its single-axis counterpart on all six metrics in the presence of the artefacts, whereas for the UNet it does so only on the Dice score.

The proposed UNet MSS mMIP is found to be the method with the best performance when evaluated on the volume with wrap-around artefacts. It also outperforms its single-axis counterpart, UNet MSS MIP, both in terms of overall segmentation and the underlying vasculature.

3.2 Qualitative Evaluation

Figure 6: Visual comparison of ROIs of MIPs of segmentations resulting from training all the models with MIP loss along the z-axis and MIP loss along multiple axes. Annotations on the MIPs show the missing continuity of the vessels and the improvements after using the z-axis and multi-axes MIP loss. Each image is an overlay of network predictions on the ground truth, where blue and red represent false positives and false negatives, respectively.
Figure 7: Comparison of false positives between segmentations generated by UNet MSS MIP and UNet MSS mMIP. Each image is an overlay of network predictions on the ground truth, where blue annotations represent false positives and red annotations represent false negatives.

The qualitative evaluation was performed by selecting five regions of interest (ROIs) exhibiting a notable presence of lenticulostriate arteries, as depicted in Fig. 6. The ROIs are marked with five different colours on the MIP of one of the test volumes, and their respective segmentations resulting from the baselines and the proposed approaches are also tabulated. Vessels in red and blue denote false negatives and false positives, respectively, while white denotes correctly segmented vessels. Yellow circles mark the notable differences among the methods. The annotations on the ROIs of the segmentation results of the baselines (UNet and UNet MSS) show the discontinuity of the vessels. Visual comparison of these ROIs against their single-axis MIP counterparts reveals improved vessel continuity. It is also evident that the models with multi-axes MIP loss prevail over the baselines. The ROI comparisons between single-axis and multi-axes MIP-based models show a reduction in false positives. This is also evident in Fig. 7, especially in the skull. In some ROIs, the multi-axes MIP-based models appear to over-segment the vessels. However, a closer look at the MIP of the input volume reveals that these overextensions are genuine parts of the vessels that are absent from the ground truth.
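The colour coding used in these overlays can be reproduced with a short NumPy sketch. The function name is hypothetical, and it is an assumption that the comparison is made between the MIPs of the binary prediction and ground-truth masks, projected here along axis 0; the figures of this study may have been generated differently.

```python
import numpy as np


def error_overlay(pred, truth, axis=0):
    """RGB overlay of the MIPs of a binary prediction and its ground truth."""
    p = np.asarray(pred).astype(bool).max(axis=axis)   # MIP of the prediction
    t = np.asarray(truth).astype(bool).max(axis=axis)  # MIP of the ground truth
    rgb = np.zeros(p.shape + (3,), dtype=np.uint8)     # black background
    rgb[p & t] = (255, 255, 255)                       # white: correctly segmented
    rgb[p & ~t] = (0, 0, 255)                          # blue: false positive
    rgb[~p & t] = (255, 0, 0)                          # red: false negative
    return rgb


# Toy 2x2x2 example with one voxel of each kind.
pred = np.zeros((2, 2, 2))
truth = np.zeros((2, 2, 2))
pred[0, 0, 0] = truth[0, 0, 0] = 1  # true positive, projects to pixel (0, 0)
pred[1, 0, 1] = 1                   # false positive, projects to pixel (0, 1)
truth[0, 1, 0] = 1                  # false negative, projects to pixel (1, 0)
overlay = error_overlay(pred, truth)
print(overlay.shape)
```

A broken vessel then shows up as a red segment between two white ones, which is the pattern the yellow circles in Fig. 6 draw attention to.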

4 Conclusion

This paper proposed a MIP-based loss term to improve vessel continuity and overall segmentation performance of deep learning models, and evaluated this loss term on two different deep learning models. It was demonstrated that the proposed MIP-based methods outperform the baselines, both quantitatively and qualitatively. The generated segmentations not only show improvement in the continuity of the vessels, but also identify small-vessel structures that were missed in the segmentations from the baselines. It was observed from experiments that the voxel similarity loss (MSS loss) retains a higher importance than the MIP loss, as learning suffers if the voxel intensity comparisons inherent in MSS loss are insufficient. Between the two types of MIP losses explored here, single- and multi-axes, the single-axis is more stable and widely applicable to different models, while the multi-axes might be advantageous for some models and over-regularise others. Overall, on the test set without wrap-around artefacts, the UNet model with multi-axes MIP outperformed all other models (including the baselines), resulting in a median Dice score of 80.245 ± 0.129.

In future work, the authors propose the exploration of volumetric MIP loss replacing the patch-based MIP loss proposed here for training the model. The hypothesis is that this would allow the network to perceive complex vessel structures that are missing in the partial information available in a patch. Furthermore, the merits of incorporating MIP information in semi-supervised learning approaches, such as deformation-aware learning [2], can also be a future direction for exploration.

References