PARL4D: Progression-Aware Multimodal Representation Learning for Early Therapy Response Prediction from 4D SPECT–CT
by
OHSA/B17
Early prediction of therapy response is critical in advanced prostate cancer, where ineffective radioligand treatment should be discontinued promptly to avoid unnecessary toxicity and delays in starting alternative interventions. Current response assessment relies on expert interpretation of longitudinal computed tomography (CT) and single-photon emission computed tomography (SPECT) imaging, which is labor-intensive and subjective. In this work, we formulate early therapy response prediction as a 4D multimodal learning problem over variable-length sequences of longitudinal SPECT–CT volumes and clinical laboratory data acquired across therapy cycles. We propose PARL4D (Progression-Aware Representation Learning from 4D data), a deep learning framework that combines a 3D foundation model backbone, learnable cycle embeddings, and explicit progression-aware feature encoding via ordered scan pairs. A gated fusion module integrates cross-modal and progression cues to produce a binary prediction for each patient. Unlike prior multimodal or longitudinal methods, which operate on 2D imaging or electronic health records, or target dense segmentation tasks, our approach directly models how volumetric SPECT–CT and clinical data evolve over time. Extensive experiments demonstrate the effectiveness of progression-aware representation learning for early therapy response prediction.
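To make the ideas of ordered scan pairs and gated fusion concrete, the following is a minimal illustrative sketch in plain Python, not the PARL4D implementation. It assumes that each therapy cycle has already been reduced to a feature vector, that progression for a pair is encoded as the later-minus-earlier feature difference, and that the gate is an elementwise sigmoid; all function names are hypothetical, and in a trained model the gate logits would be learned rather than fixed.

```python
import math
from itertools import combinations


def ordered_scan_pairs(num_cycles):
    """Return all (earlier, later) cycle index pairs for one patient."""
    return list(combinations(range(num_cycles), 2))


def progression_features(cycle_features):
    """Assumed encoding: later-minus-earlier difference for each ordered pair."""
    feats = []
    for i, j in ordered_scan_pairs(len(cycle_features)):
        earlier, later = cycle_features[i], cycle_features[j]
        feats.append([b - a for a, b in zip(earlier, later)])
    return feats


def mean_pool(vectors, dim):
    """Average a variable-length list of vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def gated_fusion(modal, progression, gate_logits):
    """Elementwise convex combination g * modal + (1 - g) * progression."""
    fused = []
    for m, p, z in zip(modal, progression, gate_logits):
        g = 1.0 / (1.0 + math.exp(-z))  # sigmoid gate in (0, 1)
        fused.append(g * m + (1.0 - g) * p)
    return fused


# Toy patient with three therapy cycles and 2-dimensional per-cycle features.
cycle_features = [[1.0, 0.0], [2.0, 1.0], [4.0, 1.0]]
dim = 2
modal_summary = mean_pool(cycle_features, dim)
progression_summary = mean_pool(progression_features(cycle_features), dim)
# Zero logits give a gate of 0.5, i.e. an equal mix of both summaries.
fused = gated_fusion(modal_summary, progression_summary, [0.0, 0.0])
```

Because the pairs are enumerated from the number of available cycles, the same code handles patients with different numbers of scans; a patient with a single scan simply yields no pairs and a zero progression summary.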
Laboratory for Simulation and Modelling
SDSC hub at PSI