Generative adversarial networks and perceptual losses for video super-resolution
Metadata
Publisher
IEEE
Subject
Artificial neural networks; video signal processing; image resolution; image generation
Date
2019
Bibliographic citation
Lucas, A., López-Tapia, S., Molina, R., Katsaggelos, A. K., "Generative Adversarial Networks and Perceptual Losses for Video Super-Resolution," IEEE Transactions on Image Processing, vol. 28, no. 7, pp. 3312-3327, 2019.
Sponsorship
This work was supported in part by the Sony 2016 Research Award Program Research Project and in part by the National Science Foundation under Grant DGE-1450006. The work of S. López-Tapia was supported in part by the Spanish Ministry of Economy and Competitiveness under Project DPI2016-77869-C2-2-R, in part by the Visiting Scholar Program at the University of Granada, and in part by the Spanish FPU Program. The work of R. Molina was supported in part by the Spanish Ministry of Economy and Competitiveness under Project DPI2016-77869-C2-2-R and in part by the Visiting Scholar Program at the University of Granada.
Abstract
Video super-resolution (VSR) has become one of the most critical problems in video processing. In the deep learning literature, recent works have shown the benefits of using adversarial-based and perceptual losses to improve the performance on various image restoration tasks; however, these have yet to be applied for video super-resolution. In this paper, we propose a generative adversarial network (GAN)-based formulation for VSR. We introduce a new generator network optimized for the VSR problem, named VSRResNet, along with a new discriminator architecture to properly guide VSRResNet during GAN training. We further enhance our VSR GAN formulation with two regularizers, a distance loss in feature-space and pixel-space, to obtain our final VSRResFeatGAN model. We show that pre-training our generator with the mean-squared-error loss alone already surpasses the current state-of-the-art VSR models quantitatively. We then employ the PercepDist metric to compare state-of-the-art VSR models, and show that this metric evaluates the perceptual quality of SR solutions obtained from neural networks more accurately than the commonly used PSNR/SSIM metrics. Finally, we show that our proposed model, VSRResFeatGAN, outperforms the current state-of-the-art SR models both quantitatively and qualitatively.
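The abstract describes a generator objective that combines an adversarial term with distance losses in feature-space and pixel-space. The following is a minimal, illustrative PyTorch sketch of such a combined objective; it is not the paper's exact formulation, and the loss weights, the choice of feature extractor, and all names here are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

class CombinedGeneratorLoss(nn.Module):
    """Illustrative GAN generator loss with pixel- and feature-space
    regularizers. Weights and feature extractor are hypothetical,
    not the values used in the paper."""

    def __init__(self, feature_extractor, w_adv=1e-3, w_feat=1.0, w_pix=1.0):
        super().__init__()
        self.feature_extractor = feature_extractor  # e.g., a frozen pretrained CNN
        self.w_adv, self.w_feat, self.w_pix = w_adv, w_feat, w_pix
        self.bce = nn.BCEWithLogitsLoss()
        self.mse = nn.MSELoss()

    def forward(self, sr_frame, hr_frame, disc_logits_on_sr):
        # Adversarial term: reward SR frames the discriminator labels as real.
        adv = self.bce(disc_logits_on_sr, torch.ones_like(disc_logits_on_sr))
        # Pixel-space distance between super-resolved and ground-truth frames.
        pix = self.mse(sr_frame, hr_frame)
        # Feature-space distance computed on the frozen feature extractor.
        feat = self.mse(self.feature_extractor(sr_frame),
                        self.feature_extractor(hr_frame))
        return self.w_adv * adv + self.w_pix * pix + self.w_feat * feat
```

In a setup like this, the generator would typically be pre-trained with the pixel-space (MSE) term alone, as the abstract notes, before the adversarial and feature-space terms are switched on; the relative weighting of the three terms is task-specific and would need tuning.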