A deep-learning reconstruction framework for low-dose, dynamic x-ray CT
Decades of x-ray CT research have focused on iterative reconstruction and denoising methods to alleviate constraints on data sampling and ionizing radiation dose. In particular, multi-channel CT imaging applications (multi-energy, dynamic) remain an active area of research because the relationships across channels (e.g., similar structures across energies, sparse differences over time) enable highly effective data undersampling and reconstruction. Now, deep learning (DL) reconstruction methods are at the forefront of CT research. A key to the success of DL reconstruction methods is their ability to learn data-specific prior information to supplement mathematical optimization methods and established priors. Previously, we demonstrated how the split Bregman optimization method can be combined with supervised learning and simulated CT data to enable volumetric, projection and image domain (dual domain) reconstruction of real, single-channel mouse micro-CT data. Here, we extend that work in three ways: (1) we generalize the reconstruction framework to handle multi-channel, time-resolved CT data (3D + time); (2) we revise the popular Vision Transformer (ViT) architecture for compatibility with 4D image-to-image processing; and (3) we propose a network training cost function that ties supervised training in a phantom to self-supervised training in real data. We demonstrate these extensions by training an image domain ViT on undersampled MOBY mouse phantom data (36 projections/phase, 10 cardiac phases; supervised) and fully sampled, real mouse micro-CT data (900 projections/phase; self-supervised), and we show that the trained network robustly regularizes the reconstruction of real mouse micro-CT data when only 300 projections/phase are used for reconstruction.
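The proposed cost function combines a supervised term on phantom data with a self-supervised term on real data. The details of the two terms are not given in the abstract; the following is a minimal sketch, assuming a simple mean-squared-error form for both, where the supervised term compares the network output on undersampled phantom inputs against the fully sampled phantom reference, and the self-supervised term encourages the network to preserve fully sampled real data. The function names, the consistency formulation, and the weighting parameter `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hybrid_loss(net, phantom_in, phantom_ref, real_full, lam=1.0):
    """Illustrative hybrid training objective (assumed form).

    net         -- callable mapping an image array to an image array
    phantom_in  -- undersampled phantom reconstruction (network input)
    phantom_ref -- fully sampled phantom reference (supervised target)
    real_full   -- fully sampled real data (self-supervised consistency target)
    lam         -- assumed weight balancing the two terms
    """
    # Supervised term: match the phantom ground truth.
    sup = np.mean((net(phantom_in) - phantom_ref) ** 2)
    # Self-supervised term: fully sampled real data should pass
    # through the network largely unchanged.
    self_sup = np.mean((net(real_full) - real_full) ** 2)
    return sup + lam * self_sup
```

For example, an identity network incurs zero self-supervised loss on real data while the supervised term still measures its error on the phantom pair, so the two terms pull the network toward phantom fidelity without distorting well-sampled real images.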