image: Visual comparisons of different methods for super-resolution. The figure presents the measurements (left), reconstructed results (center), and ground truth (right). PSNR and LPIPS values are annotated below the reconstructed images.
Credit: Visual Intelligence
Image reconstruction is a cornerstone of science and engineering, enabling researchers to visualize hidden structures in medical scans, satellite data, and microscopy. Conventional analytic algorithms offer theoretical rigor but struggle with complex noise and nonlinear degradation, while deep neural networks improve reconstruction quality at the cost of interpretability and the need for retraining. The growing demand for reliable, data-efficient reconstruction motivates a new research direction that integrates signal processing principles with data-driven priors. Recently, unsupervised deep learning methods have emerged as a promising alternative, evolving rapidly from denoising priors to diffusion-based generation. A wide spectrum of in-depth studies has explored unified frameworks that combine theoretical guarantees, flexibility, and efficiency for stable image reconstruction.
Researchers from Shanghai Jiao Tong University have published a comprehensive review (DOI: 10.1007/s44267-025-00092-z) in Visual Intelligence (October 22, 2025) summarizing a decade of progress in unsupervised deep learning for image reconstruction. The study highlights how denoising priors and diffusion-based generative models transform traditional inverse problems into data-driven yet theoretically grounded solutions. The authors systematically review mathematical foundations, convergence properties, and real-world applications, revealing how unsupervised learning methods overcome the limitations of supervised networks and analytic algorithms to achieve high-quality, interpretable, and scalable image reconstruction.
The review traces the evolution of unsupervised image reconstruction from denoising priors to diffusion-based generative models. Early approaches such as Plug-and-Play (PnP) and Regularization by Denoising (RED) replaced explicit regularization steps in iterative optimization with powerful denoisers, enabling high-quality reconstruction without paired datasets. These methods bridged deep learning and classical signal processing while offering partial theoretical convergence guarantees. Building upon this foundation, diffusion priors introduced probabilistic modeling to overcome oversmoothing and achieve high perceptual quality. Diffusion models progressively perturb images with noise and learn to reverse the process, capturing the distribution of natural images and generating reconstructions that are both realistic and faithful to the measured data. The paper systematically compares key algorithms—PnP, RED, Diffusion Posterior Sampling (DPS), and Diffusion Plug-and-Play (DPnP)—across tasks such as deblurring and super-resolution. Benchmark tests on the FFHQ dataset show that denoising priors yield higher fidelity (PSNR), while diffusion priors enhance perceptual realism (LPIPS), demonstrating complementary strengths. The authors also discuss open challenges in theory, computation, and generalization.
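The PnP idea described above can be illustrated with a minimal sketch: a proximal-gradient loop that alternates a data-consistency gradient step with a denoising step. This toy 1D deblurring example is not from the paper; the blur kernel, the simple averaging "denoiser" (a stand-in for a learned network such as DnCNN), and all parameters are illustrative assumptions.

```python
import numpy as np

# Minimal Plug-and-Play (PnP) proximal-gradient sketch on a toy 1D
# deblurring problem. The "denoiser" is a mild local average standing in
# for a learned deep prior; every parameter here is illustrative.

def blur(x):
    """Forward operator A: centered 5-tap circular moving average."""
    n = len(x)
    h = np.zeros(n)
    h[[0, 1, 2, -2, -1]] = 1.0 / 5.0  # symmetric kernel, so A^T = A
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

def denoise(x, w=0.1):
    """Toy denoiser D: mild shrinkage toward a 3-tap local average."""
    avg = np.convolve(np.pad(x, 1, mode="edge"), np.ones(3) / 3, mode="valid")
    return (1 - w) * x + w * avg

def pnp_pgd(y, steps=300, eta=0.9):
    """PnP iteration: x <- D(x - eta * A^T (A x - y))."""
    x = y.copy()
    for _ in range(steps):
        x = denoise(x - eta * blur(blur(x) - y))
    return x

rng = np.random.default_rng(0)
truth = np.zeros(64)
truth[20:40] = 1.0                                # piecewise-constant signal
y = blur(truth) + 0.01 * rng.standard_normal(64)  # blurred, noisy measurement
x_hat = pnp_pgd(y)
psnr = 10 * np.log10(1.0 / np.mean((x_hat - truth) ** 2))
print(f"reconstruction PSNR: {psnr:.1f} dB")
```

Swapping the hand-crafted `denoise` for a pretrained network is exactly the "plug-and-play" step; the optimization scaffolding stays unchanged, which is what gives these methods their partial convergence guarantees.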
“Unsupervised deep learning provides the best of both worlds,” said Professor Wenrui Dai from Shanghai Jiao Tong University, the corresponding author of the study. “It inherits the stability and convergence guarantees of classical signal processing while embracing the expressive power of neural networks. By eliminating the need for costly paired data and retraining, these approaches mark a turning point for computational imaging.” “The elegance of methodology in the era of large-scale models lies in combining the theoretical rigor of signal processing with the generative power of AI. Our goal is to build unified, theoretically sound frameworks that enable reliable reconstruction across diverse imaging modalities—from medical scanners to satellites,” added Professor Hongkai Xiong, director of the MIN Lab, where the authors are based.
The implications of this survey reach far beyond theoretical perspectives. Unsupervised image reconstruction has already accelerated MRI and CT diagnostics, improved synthetic aperture radar imaging for remote sensing, and enhanced hyperspectral data recovery for environmental monitoring. By reducing data requirements and computational costs, these methods open new possibilities for real-time adaptive imaging in medicine, astronomy, and autonomous systems. Future research will focus on strengthening convergence proofs, improving robustness to domain shifts, and scaling diffusion models to 3D and multimodal imaging. The authors conclude that integrating mathematical priors with generative learning will be key to the next generation of trustworthy, intelligent imaging technologies.
Funding information
This work was supported in part by the National Natural Science Foundation of China (Nos. 62431017 and 62125109).
About Visual Intelligence
Visual Intelligence is an international, peer-reviewed, open-access journal devoted to the theory and practice of visual intelligence. This journal is the official publication of the China Society of Image and Graphics (CSIG), with Article Processing Charges fully covered by the Society. It focuses on the foundations of visual computing, the methodologies employed in the field, and the applications of visual intelligence, while particularly encouraging submissions that address rapidly advancing areas of visual intelligence research.
About the Authors
Dr. Hongkai Xiong is a Cheung Kong Professor and a Distinguished Professor at Shanghai Jiao Tong University (SJTU). His research interests include signal representation and wavelet analysis, image and video coding, multimedia communication and networking, computer vision, and machine learning. At SJTU, he directs the Institute of Media, Information, and Network (MIN Lab). He has served as a Technical Program Committee member for various IEEE conferences. He was elevated to Fellow of the IEEE for his contributions to multi-scale multimedia signal representation, coding, and communication. He is a fellow of the Chinese Institute of Electronics (CIE) and the Asia-Pacific Artificial Intelligence Association (AAIA).
Dr. Wenrui Dai is a professor in the Department of Electronic Engineering, Shanghai Jiao Tong University (SJTU). His research interests include multimedia signal processing, image and video coding, and machine learning.
Journal
Visual Intelligence
Article Title
A contemporary survey on image reconstruction with unsupervised deep learning: from denoising to generation
Article Publication Date
23-Oct-2025