Efficient gradient computation for optimization of hyperparameters

Jingyan Xu, Frederic Noo

Research output: Contribution to journal › Article › peer-review

Abstract

We are interested in learning the hyperparameters of a convex objective function in a supervised setting. The complex relationship between the input data to the convex problem and the desirable hyperparameters can be modeled by a neural network; the hyperparameters and the data then drive the convex minimization problem, whose solution is compared to the training labels. In our previous work (Xu and Noo 2021 Phys. Med. Biol. 66 19NT01), we evaluated a prototype of this learning strategy in an optimization-based sinogram smoothing plus FBP reconstruction framework. A question arising in this setting is how to efficiently compute (backpropagate) the gradient from the solution of the optimization problem to the hyperparameters so as to enable end-to-end training. In this work, we first develop general formulas for gradient backpropagation through a subset of convex problems, namely the proximal mapping. To illustrate the value of the general formulas and to demonstrate their use, we consider the specific instance of 1D quadratic smoothing (denoising), whose solution admits a dynamic programming (DP) algorithm. The general formulas lead to another DP algorithm for exact computation of the gradient with respect to the hyperparameters. Our numerical studies demonstrate a 55%-65% savings in computation time when a custom gradient is provided instead of relying on automatic differentiation in deep learning libraries. While our discussion focuses on 1D quadratic smoothing, our initial results (not presented) indicate that the general formulas and the computational strategy apply equally well to TV or Huber smoothing problems on simple graphs whose solutions can be computed exactly via DP.
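For illustration only, the following minimal Python/NumPy sketch shows the kind of computation the abstract describes, under the assumption that the proximal mapping is 1D quadratic smoothing with a single regularization weight lam: the smoothed signal solves (I + lam*D^T D) x = y, and implicit differentiation of this optimality condition yields the gradient of a training loss with respect to lam. The function names (quadratic_smoothing, backprop_to_hyperparameter) and the dense linear solve are illustrative stand-ins, not the paper's O(n) DP algorithm.

import numpy as np

def quadratic_smoothing(y, lam):
    # Solve min_x 0.5*||x - y||^2 + 0.5*lam*||D x||^2, with D the first-difference
    # operator. Closed form: x = (I + lam*D^T D)^{-1} y. A dense solve is used here
    # for clarity; the tridiagonal structure admits an O(n) forward-backward recursion.
    n = y.size
    D = np.diff(np.eye(n), axis=0)      # (n-1) x n first-difference matrix
    A = np.eye(n) + lam * (D.T @ D)     # symmetric tridiagonal system matrix
    x = np.linalg.solve(A, y)
    return x, A, D

def backprop_to_hyperparameter(g, x, A, D):
    # Given the upstream gradient g = dL/dx of a training loss L, return dL/dlam by
    # implicit differentiation of (I + lam*D^T D) x = y:
    #   dx/dlam = -A^{-1} (D^T D) x,   dL/dlam = g^T dx/dlam.
    u = np.linalg.solve(A, g)           # A is symmetric, so A^{-T} = A^{-1}
    return -u @ (D.T @ (D @ x))

# Toy check against central finite differences (illustrative training loss).
rng = np.random.default_rng(0)
y = rng.standard_normal(64)
target = np.sin(np.linspace(0, 3, 64))
lam = 0.7

x, A, D = quadratic_smoothing(y, lam)
g = x - target                          # gradient of L(x) = 0.5*||x - target||^2
grad_lam = backprop_to_hyperparameter(g, x, A, D)

eps = 1e-6
x_p, _, _ = quadratic_smoothing(y, lam + eps)
x_m, _, _ = quadratic_smoothing(y, lam - eps)
fd = (0.5 * np.sum((x_p - target) ** 2) - 0.5 * np.sum((x_m - target) ** 2)) / (2 * eps)
print(grad_lam, fd)                     # the two values should agree closely

In an end-to-end setting, lam would itself be the output of a neural network driven by the input data, and dL/dlam would be backpropagated further into the network weights; the sketch only covers the step from the optimization solution back to the hyperparameter.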

Original language: English (US)
Article number: 03NT01
Journal: Physics in medicine and biology
Volume: 67
Issue number: 3
DOIs
State: Published - Feb 7 2022

Keywords

  • automatic differentiation
  • dynamic programming
  • gradient backpropagation
  • hyperparameter learning
  • implicit differentiation
  • proximal mapping

ASJC Scopus subject areas

  • Radiological and Ultrasound Technology
  • Radiology, Nuclear Medicine and Imaging
