

Y. Chen, T. Pock, R. Ranftl, H. Bischof. Revisiting Loss-Specific Training of Filter-Based MRFs for Image Restoration. German Conference on Pattern Recognition, 2013.

Chen et al. revisit bi-level optimization for filter-based MRF models, i.e. Fields of Experts [1]. In particular, considering the problem of image denoising, the MRF model is given by

$E(x; x^{(n)}) = \sum_{i = 1}^{N_f} \alpha_i \sum_{p = 1}^{N_p} \rho((k_i \ast x)_p) + \frac{\lambda}{2} \|x - x^{(n)}\|_2^2$ (1)

where $k_i \ast x$ denotes the convolution of the image $x$, consisting of $N_p$ pixels, with the $i$-th of $N_f$ filters (and $(k_i \ast x)_p$ is the $p$-th pixel of the convolved image). In practice, the convolution $k_i \ast x$ is expressed as a matrix multiplication $K_i x$, where $K_i$ is the matrix corresponding to filter $k_i$; additionally, the filters are expressed using a set of basis filters, i.e. $K_i = \sum_{j = 1}^{N_B} \beta_{i,j} B_j$ for the basis $\{B_1, \ldots, B_{N_B}\}$. The penalty $\rho$ is the Lorentzian function:

$\rho(s) = \log(1 + s^2)$.
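To make Equation (1) concrete, here is a minimal NumPy sketch of the energy. It is only an illustration, not the authors' implementation: the function names, the circular boundary handling and the representation of the basis as an array of small filters (instead of the matrices $B_j$) are my own assumptions.

```python
import numpy as np

def lorentzian(s):
    """Penalty rho(s) = log(1 + s^2)."""
    return np.log1p(s ** 2)

def filters_from_basis(beta, basis):
    """k_i = sum_j beta[i, j] * basis[j]; beta is (N_f, N_B), basis is (N_B, h, w)."""
    return np.tensordot(beta, basis, axes=1)

def conv(x, k):
    """Circular convolution k * x via shifts (the boundary handling is an assumption)."""
    return sum(k[a, b] * np.roll(x, (a, b), axis=(0, 1))
               for a in range(k.shape[0]) for b in range(k.shape[1]))

def foe_energy(x, x_noisy, alpha, beta, basis, lam):
    """Energy of Equation (1): weighted Lorentzian filter responses plus L2 data term."""
    filters = filters_from_basis(beta, basis)
    prior = sum(a * lorentzian(conv(x, k)).sum() for a, k in zip(alpha, filters))
    return prior + 0.5 * lam * np.sum((x - x_noisy) ** 2)
```

For a constant image and zero-sum filters (e.g. finite differences), all filter responses vanish and only the data term contributes to the energy.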

For denoising, given a noisy image $x^{(n)}$, the denoised image $x^\ast$ is computed as

$x^\ast = \arg\min_{x} E(x; x^{(n)})$. (2)
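As a rough illustration of Equation (2), the energy can be minimized by plain gradient descent, using $\nabla_x E = \sum_i \alpha_i K_i^\top \rho'(K_i x) + \lambda (x - x^{(n)})$ with $\rho'(s) = 2s / (1 + s^2)$. Gradient descent is my choice for brevity and not necessarily the solver used in the paper; the helper names, the circular boundary handling and the step size rule are assumptions, and the weights $\alpha_i$ are assumed to be non-negative.

```python
import numpy as np

def conv(x, k):
    """Circular convolution K x via shifts (the boundary handling is an assumption)."""
    return sum(k[a, b] * np.roll(x, (a, b), axis=(0, 1))
               for a in range(k.shape[0]) for b in range(k.shape[1]))

def conv_adjoint(r, k):
    """Adjoint K^T r of the circular convolution above."""
    return sum(k[a, b] * np.roll(r, (-a, -b), axis=(0, 1))
               for a in range(k.shape[0]) for b in range(k.shape[1]))

def energy(x, x_noisy, filters, alphas, lam):
    """Energy of Equation (1) for explicitly given filters."""
    prior = sum(a * np.log1p(conv(x, k) ** 2).sum() for a, k in zip(alphas, filters))
    return prior + 0.5 * lam * np.sum((x - x_noisy) ** 2)

def energy_grad(x, x_noisy, filters, alphas, lam):
    """Gradient of the energy with respect to the image x."""
    g = lam * (x - x_noisy)
    for a, k in zip(alphas, filters):
        r = conv(x, k)
        g = g + a * conv_adjoint(2.0 * r / (1.0 + r ** 2), k)  # K^T rho'(K x)
    return g

def denoise(x_noisy, filters, alphas, lam, iters=200):
    """Gradient descent on the energy, started at the noisy image."""
    # |rho''| <= 2 and ||K_i|| <= sum |k_i|, so L bounds the Lipschitz constant of the gradient.
    L = lam + 2.0 * sum(a * np.abs(k).sum() ** 2 for a, k in zip(alphas, filters))
    x = x_noisy.astype(float)
    for _ in range(iters):
        x = x - energy_grad(x, x_noisy, filters, alphas, lam) / L
    return x
```

Since the energy is non-convex, this only finds a stationary point; the step size $1/L$ merely guarantees that the energy does not increase from one iteration to the next.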

The model parameters, i.e. $\{\alpha_i, \beta_{i, j}\}$, are learned by solving the following bi-level optimization problem on a training set $\{(x_n^{(c)}, x_n^{(n)})\}_{n = 1}^N$ of clean and noisy image pairs:

$$\begin{Bmatrix}\min_{\alpha, \beta} L(x^\ast(\alpha, \beta)) := \sum_{n = 1}^N \frac{1}{2} \|x_n^\ast(\alpha, \beta) - x_n^{(c)}\|_2^2\\\text{where } x_n^\ast(\alpha, \beta) = \arg \min_{x} E(x; x_n^{(n)})\end{Bmatrix}$$

Chen et al. replace the lower-level problem by the constraint $\nabla_{x} E = 0$, i.e. its first-order (necessary) optimality condition. The Lagrangian can then be constructed and, after some simplification, the resulting problem is solved using well-known optimization techniques; see the paper for details.
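The following is my own sketch of how this step works out for a single training pair $(x^{(c)}, x^{(n)})$; the notation ($\mu$, $H_E$) is mine, and the derivation assumes that the Hessian $H_E$ is invertible at $x^\ast$. The Lagrangian reads

$$\mathcal{L}(x, \mu; \alpha, \beta) = \frac{1}{2} \|x - x^{(c)}\|_2^2 + \langle \mu, \nabla_x E(x; x^{(n)}) \rangle, \quad \nabla_x E(x; x^{(n)}) = \sum_{i = 1}^{N_f} \alpha_i K_i^\top \rho'(K_i x) + \lambda (x - x^{(n)})$$

with $\rho'(s) = \frac{2s}{1 + s^2}$ applied element-wise. Setting the derivative with respect to $x$ to zero gives the multiplier

$$\mu = -H_E(x^\ast)^{-1} (x^\ast - x^{(c)}), \quad H_E(x) = \sum_{i = 1}^{N_f} \alpha_i K_i^\top \text{diag}(\rho''(K_i x)) K_i + \lambda I, \quad \rho''(s) = \frac{2 (1 - s^2)}{(1 + s^2)^2}$$

and the gradient with respect to, for example, the weight $\alpha_i$ follows as

$$\frac{\partial \mathcal{L}}{\partial \alpha_i} = \langle \mu, K_i^\top \rho'(K_i x^\ast) \rangle = -\left(K_i^\top \rho'(K_i x^\ast)\right)^\top H_E(x^\ast)^{-1} (x^\ast - x^{(c)}).$$

The gradient with respect to $\beta_{i,j}$ is obtained analogously by the chain rule through $K_i = \sum_{j} \beta_{i,j} B_j$. Note that each gradient evaluation requires solving the lower-level problem for $x^\ast$ and a linear system in $H_E$.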

Given learned model parameters, the denoising problem in Equation (2) can also be solved using the iPiano algorithm. This has been done by Ochs et al. in [2] to demonstrate the applicability of iPiano to computer vision problems. In particular, they replace the $L_2$ data term in Equation (1) by an $L_1$ term (the model has to be retrained using a smooth approximation of the $L_1$ norm, but denoising can use the $L_1$ norm directly because iPiano handles the non-smooth term through its proximal step).
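A minimal sketch of iPiano with constant step sizes for this setting might look as follows. I assume the data term $\lambda \|x - x^{(n)}\|_1$, whose proximal map is a soft-thresholding around $x^{(n)}$; the function names, the circular boundary handling, the momentum value and the bound used for the Lipschitz constant are my assumptions and not taken from [2]. The weights $\alpha_i$ are assumed to be non-negative.

```python
import numpy as np

def conv(x, k):
    """Circular convolution K x via shifts (the boundary handling is an assumption)."""
    return sum(k[a, b] * np.roll(x, (a, b), axis=(0, 1))
               for a in range(k.shape[0]) for b in range(k.shape[1]))

def conv_adjoint(r, k):
    """Adjoint K^T r of the circular convolution above."""
    return sum(k[a, b] * np.roll(r, (-a, -b), axis=(0, 1))
               for a in range(k.shape[0]) for b in range(k.shape[1]))

def prior(x, filters, alphas):
    """Smooth, non-convex part f: weighted Lorentzian filter responses."""
    return sum(a * np.log1p(conv(x, k) ** 2).sum() for a, k in zip(alphas, filters))

def prior_grad(x, filters, alphas):
    """Gradient of the smooth part f."""
    g = np.zeros_like(x, dtype=float)
    for a, k in zip(alphas, filters):
        r = conv(x, k)
        g += a * conv_adjoint(2.0 * r / (1.0 + r ** 2), k)
    return g

def prox_l1(y, x_noisy, t):
    """Proximal map of t * ||x - x_noisy||_1: soft-thresholding around x_noisy."""
    d = y - x_noisy
    return x_noisy + np.sign(d) * np.maximum(np.abs(d) - t, 0.0)

def ipiano_denoise(x_noisy, filters, alphas, lam, momentum=0.5, iters=200):
    """iPiano with constant step size and constant inertial (momentum) weight."""
    # Upper bound on the Lipschitz constant of prior_grad, since |rho''| <= 2.
    L = 2.0 * sum(a * np.abs(k).sum() ** 2 for a, k in zip(alphas, filters))
    step = 0.99 * 2.0 * (1.0 - momentum) / L  # constant step below 2 (1 - momentum) / L
    x_prev = x = x_noisy.astype(float)
    for _ in range(iters):
        # Forward gradient step on f plus inertial term, then proximal step on g.
        y = x - step * prior_grad(x, filters, alphas) + momentum * (x - x_prev)
        x_prev, x = x, prox_l1(y, x_noisy, step * lam)
    return x
```

Setting `momentum=0` reduces the update to a standard proximal gradient (forward-backward) step; the inertial term is what distinguishes iPiano.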

  • [1] S. Roth, M. J. Black. Fields of Experts. International Journal of Computer Vision, volume 82, number 2, 2009.
  • [2] P. Ochs, Y. Chen, T. Brox, T. Pock. iPiano: Inertial Proximal Algorithm for Non-Convex Optimization. Computing Research Repository, abs/1404.4805, 2014.
