Abstract

Intelligent Imaging: Harnessing AI for Enhanced Detection and Diagnosis

Abstract Body:

Objectives:
The primary goal of our research is to develop a comprehensive end-to-end AI-driven pipeline for parametric FDG brain PET mapping. The pipeline includes a 3D Convolutional Neural Network (CNN) [1] classifier, ‘Frame-net’, to identify the frames of 4D PET scans in which the internal carotid arteries (ICA) are visible; a UNETR-based [2] ‘ICA-net’ to segment the ICA and derive the image-derived blood input function (IDIF); and a Recurrent Neural Network (RNN)-based [3] ‘MCIF-net’ to derive a model-corrected blood input function (MCIF) with partial volume (PV) corrections.

Materials and Methods:

Dynamic FDG PET of the brain was performed on 50 subjects using a time-of-flight PET/CT scanner. Previously acquired static MPRAGE MRI scans were used for co-registration. Image preprocessing consisted of motion correction of the 4D PET data and co-registration to MR space using bash scripts built on FMRIB’s Software Library (FSL) [4] [5].

Next, we used ‘Frame-net’ to localize the internal carotid arteries (ICA) in the early frames of the dynamic sequence by identifying the temporal frame in which the ICA is visible. For this network, five temporal frames were selected from each patient. Each NIfTI file was resized to a uniform shape of (128, 128, 128, 5) using linear interpolation and normalized. The ‘Frame-net’ architecture comprised an input layer of shape (128, 128, 128, 5); five convolutional layers with increasing filter counts (64, 128, 128, 256, 256) and (3, 3, 3) kernels, each followed by ReLU activation and (2, 2, 2) max pooling; dropout layers with rates between 0.2 and 0.3; two fully connected dense layers with 512 and 256 neurons, respectively; and a final softmax layer that outputs the class probabilities. The model was compiled with the Adam optimizer and categorical cross-entropy loss and had a total of 4,508,933 trainable parameters. Training used 5-fold cross-validation over 100 epochs with a batch size of 2.
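The reported parameter count can be checked by hand from the layer sizes listed above. The following is a minimal sketch, not the authors' code: it assumes unpadded ("valid") convolutions, a flatten step before the dense layers, and a five-way softmax output (one class per candidate frame). The abstract states none of these choices, and the function names are illustrative, but under these assumptions the count agrees with the reported 4,508,933. Pooling and dropout layers contribute no trainable parameters.

```python
def conv3d_params(c_in, c_out, k=3):
    """Weights plus biases of a 3D convolution with a k*k*k kernel."""
    return k ** 3 * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def frame_net_param_count(side=128, channels=5,
                          filters=(64, 128, 128, 256, 256),
                          dense=(512, 256), n_classes=5):
    """Return (trainable parameters, final spatial side length).

    Assumptions (not stated in the abstract): 'valid' padding,
    flatten before the dense layers, n_classes softmax outputs.
    """
    total = 0
    c_in = channels
    for c_out in filters:
        total += conv3d_params(c_in, c_out)
        # An unpadded 3x3x3 convolution trims 2 voxels per axis,
        # then 2x2x2 max pooling halves the side (rounding down).
        side = (side - 2) // 2
        c_in = c_out
    n_in = side ** 3 * c_in  # size of the flattened feature vector
    for n_out in dense + (n_classes,):
        total += dense_params(n_in, n_out)
        n_in = n_out
    return total, side


total, final_side = frame_net_param_count()
print(total, final_side)
```

Under these assumptions the spatial side shrinks 128 → 63 → 30 → 14 → 6 → 2, so the flattened vector has 2 × 2 × 2 × 256 = 2048 entries. With "same" padding or global pooling the total would differ from the reported figure.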

The ICA-visible frames selected by the ‘Frame-net’ classifier were then preprocessed again by resizing to 128×128×128 voxels and normalizing. After data augmentation of both images and labels, we used 10-fold cross-validation for robust model evaluation. The ‘ICA-net’ was trained with the Adam optimizer, and its performance was assessed using the Dice coefficient and the IoU index. A combined Tversky and cross-entropy loss was used to balance sensitivity and specificity under class imbalance. The Tversky loss, with alpha and beta both set to 0.5 and a softmax layer for multi-class tasks, complements the cross-entropy loss, which refines voxel-wise classification. The two losses were weighted equally (0.5 each) to form the composite loss.
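The composite loss and the two evaluation metrics described above can be illustrated on a flat list of voxels. This is a minimal pure-Python sketch rather than the training code: it assumes two classes (background and ICA), averages the Tversky index over classes, and uses illustrative function names and smoothing constants. Note that with alpha = beta = 0.5 the Tversky index reduces to the Dice coefficient.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one voxel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def tversky_loss(probs, labels, n_classes, alpha=0.5, beta=0.5, eps=1e-7):
    """1 minus the mean per-class soft Tversky index.

    alpha weights false positives, beta weights false negatives.
    """
    indices = []
    for c in range(n_classes):
        tp = sum(p[c] for p, y in zip(probs, labels) if y == c)
        fp = sum(p[c] for p, y in zip(probs, labels) if y != c)
        fn = sum(1.0 - p[c] for p, y in zip(probs, labels) if y == c)
        indices.append((tp + eps) / (tp + alpha * fp + beta * fn + eps))
    return 1.0 - sum(indices) / n_classes


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability of the true class."""
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


def combined_loss(logits, labels, n_classes=2, w_tversky=0.5, w_ce=0.5):
    """Equally weighted Tversky + cross-entropy loss (0.5 each by default)."""
    probs = [softmax(v) for v in logits]
    return (w_tversky * tversky_loss(probs, labels, n_classes)
            + w_ce * cross_entropy(probs, labels))


def dice_and_iou(pred, truth):
    """Dice coefficient and IoU for binary masks given as 0/1 lists."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp + fp + fn == 0:  # both masks empty: treat as perfect overlap
        return 1.0, 1.0
    return 2 * tp / (2 * tp + fp + fn), tp / (tp + fp + fn)


# Two voxels, confidently correct predictions: the loss is close to zero.
print(combined_loss([[10.0, -10.0], [-10.0, 10.0]], [0, 1]))
print(dice_and_iou([1, 1, 0, 0], [1, 0, 0, 0]))
```

For a single mask the two metrics are tied by IoU = Dice / (2 − Dice), which is why a Dice score in the low 80s corresponds to an IoU in the low 70s.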

Next, the MCIF was computed by optimizing the IDIF obtained from the ICA [6]. The ‘MCIF-net’, which maps IDIF to MCIF, was built as a hybrid recurrent neural network combining LSTM and bidirectional GRU layers [7]. The MCIF was then used to compute the Ki map with the Patlak model [8]. Subsequently, for an epilepsy patient with known surgical ground truth, a Z-score map was computed by normalizing Ki against the whole-brain mean and standard deviation (SD), covering 18 super-regions per hemisphere [6].
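The Patlak step and the Z-score normalization can be sketched for a single voxel or region. In Patlak analysis, once the tracer reaches steady state (t > t*), the ratio C_T(t)/C_p(t) is linear in ∫C_p dτ / C_p(t), and the slope is Ki. The code below is a hedged illustration on synthetic data, not the pipeline's implementation: the function names, the t_star parameter, trapezoidal integration starting at the first sample, and the use of the population SD are all assumptions.

```python
from statistics import mean, pstdev


def cumulative_trapezoid(t, y):
    """Running trapezoidal integral of y over t, starting at 0 at t[0]."""
    out = [0.0]
    for i in range(1, len(t)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1]))
    return out


def patlak_ki(t, cp, ct, t_star=0.0):
    """Fit C_T/C_p = Ki * (integral of C_p)/C_p + V0 for t >= t_star.

    t: frame times, cp: blood input function (here, the MCIF),
    ct: tissue activity. Returns (Ki, V0) from ordinary least squares.
    """
    integral = cumulative_trapezoid(t, cp)
    xs, ys = [], []
    for ti, cpi, cti, ii in zip(t, cp, ct, integral):
        if ti >= t_star and cpi > 0:
            xs.append(ii / cpi)
            ys.append(cti / cpi)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ki = sxy / sxx
    return ki, y_bar - ki * x_bar


def z_scores(values):
    """Normalize values against their own mean and (population) SD."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Synthetic example: build tissue data with known Ki = 0.03 and V0 = 0.4.
t = [0, 1, 2, 5, 10, 20, 30, 40, 60]
cp = [0.0, 10.0, 6.0, 4.0, 3.0, 2.0, 1.5, 1.2, 1.0]
ct = [0.03 * i + 0.4 * c for i, c in zip(cumulative_trapezoid(t, cp), cp)]
ki, v0 = patlak_ki(t, cp, ct, t_star=10)
print(ki, v0)
```

Applying `patlak_ki` voxel by voxel yields a Ki map, and `z_scores` applied to the regional Ki values gives the kind of Z-score map described above. Real data are noisy, so the fitted slope depends on the choice of t*.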

Results:

The Frame-net classifier reached an average validation accuracy of 86.11%. The ICA-net achieved an average Dice score of 83.99% and an IoU of 72.51% across all evaluated scans. The MCIF-net achieved a root mean squared error of 0.0201. When applied to actual patient data, the integrated pipeline accurately identified the regions of seizure onset, leading to a successful clinical intervention after which the patient was seizure-free.

Conclusion:
The efficacy of the Frame-net classifier alongside ICA-net and MCIF-net demonstrates a significant advancement in automating neuroimaging. The AI-driven pipeline marks a crucial step forward in neurological diagnostics and treatment planning.

Image/Figure:

Image/Figure Caption:

Deep Learning Pipeline for 3D Quantification of Tracer Uptake Rates in PET Imaging: This pipeline illustrates the end-to-end process for processing 4D DICOM data and generating 3D tracer uptake maps. It includes the following steps: (1) Data pre-processing with co-registration of motion-corrected PET and MRI images to an atlas and template; (2) 3D CNN frame selection for identifying frames with visible internal carotid arteries (ICA); (3) Use of ICA-net with 3D UNETR architecture for segmenting the ICA; (4) Application of MCIF-net for spillover and partial volume corrections using a combination of LSTM and GRU models; (5) Quantitative and visual analysis to compute the Patlak model and generate a 3D map of tracer uptake rates.

Author

Rugved Sanjay Chavan, MS computer science
University of Virginia