"Fully Connected Deep Structured Networks", Schwing and Urtasun • David Stutz

MARCH 2017

READING

A. G. Schwing, R. Urtasun. Fully Connected Deep Structured Networks. CoRR, 2015.

Schwing and Urtasun build on the work of Chen et al. [1] to jointly optimize deep features and Markov Random Field (MRF) parameters in fully connected models. In [1], by contrast, the model was assumed to be connected across local regions only.

Their approach is based on two components. For inference in the fully connected model, they use the efficient mean field inference of Krähenbühl and Koltun [2], which relies on Gaussian filtering. For learning the model parameters, they build on a second work by Krähenbühl and Koltun [3]. The contribution of Schwing and Urtasun is that the unary potentials are no longer restricted to logistic regressors; instead, general (deep) networks can be used.

They consider the same model as in [2], where the score function $F(x, y;w)$ decomposes into a sum of unary and pairwise terms, the latter of the form:

$f_{ij}(\hat{y}_i, \hat{y}_j, x, w) = \sum_{m = 1}^M \mu^{(m)}(\hat{y}_i, \hat{y}_j, w) k^{(m)}(\hat{f}_i(x) - \hat{f}_j(x))$

where $\mu^{(m)}$ are pairwise label compatibility functions and $k^{(m)}$ are kernel functions.
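To make the pairwise term concrete, here is a minimal NumPy sketch. It assumes Gaussian kernels with hand-picked bandwidths and a Potts-style label compatibility (weighted identity matrix), in the spirit of [2]; the function names, the bandwidths and the weights are hypothetical and not taken from the paper.

```python
import numpy as np

def gaussian_kernel(features, bandwidth):
    """k(f_i - f_j) = exp(-||f_i - f_j||^2 / (2 * bandwidth^2)) for all pairs (i, j)."""
    diff = features[:, None, :] - features[None, :, :]   # shape (N, N, D)
    sq_dist = np.sum(diff ** 2, axis=-1)                 # shape (N, N)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def pairwise_scores(features, bandwidths, compat):
    """f_ij(a, b) = sum_m mu^(m)(a, b) * k^(m)(f_i - f_j); returns shape (N, N, L, L)."""
    kernels = np.stack([gaussian_kernel(features, s) for s in bandwidths])  # (M, N, N)
    return np.einsum("mab,mij->ijab", compat, kernels)

rng = np.random.default_rng(0)
N, D, L = 5, 2, 3                        # variables, feature dimension, labels
features = rng.normal(size=(N, D))       # stands in for the features \hat{f}_i(x)
bandwidths = [0.5, 2.0]                  # assumed kernel bandwidths (M = 2)
weights = np.array([1.0, 0.5])           # assumed per-kernel weights
# Potts-style compatibility (assumption): only equal labels get a non-zero score.
compat = weights[:, None, None] * np.eye(L)[None]        # shape (M, L, L)

f = pairwise_scores(features, bandwidths, compat)
print(f.shape)
```

Materializing all $N^2$ pairs is only feasible for toy sizes; the point of [2] is precisely to avoid this explicit construction.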

Inference is then accomplished by minimizing the Kullback-Leibler divergence between the model distribution and a fully factorized approximation $q_{(x,y)}(y) = \prod_{i = 1}^N q_{(x,y),i}(y_i)$. The mean field updates are then derived as in [2].
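As an illustration of what such a mean field update looks like, the following sketch runs naive parallel updates with a single kernel. It is written under several assumptions: the model distribution is proportional to $\exp(F)$, messages are summed directly in $O(N^2)$ instead of using the Gaussian filtering of [2], and all names (`naive_mean_field`, `unary`, `kernel`, `compat`) are hypothetical.

```python
import numpy as np

def softmax(a):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def naive_mean_field(unary, kernel, compat, num_iters=10):
    """
    unary:  (N, L) unary scores per variable and label
    kernel: (N, N) kernel values k(f_i - f_j)
    compat: (L, L) label compatibility mu(a, b)
    Returns the approximate marginals q with shape (N, L).
    """
    K = kernel.copy()
    np.fill_diagonal(K, 0.0)             # a variable sends no message to itself
    q = softmax(unary)                   # initialize from the unary scores
    for _ in range(num_iters):
        # message[i, a] = sum_{j != i} sum_b K[i, j] * mu(a, b) * q[j, b]
        message = K @ q @ compat.T
        q = softmax(unary + message)     # all variables are updated in parallel
    return q

rng = np.random.default_rng(0)
N, L = 6, 3
features = rng.normal(size=(N, 2))
sq_dist = ((features[:, None] - features[None]) ** 2).sum(-1)
kernel = np.exp(-0.5 * sq_dist)          # assumed Gaussian kernel
compat = np.eye(L)                       # assumed Potts-style compatibility
unary = rng.normal(size=(N, L))

q = naive_mean_field(unary, kernel, compat)
print(q.sum(axis=1))                     # each row is a distribution over labels
```

The product `K @ q` is the step that [2] replaces by efficient Gaussian filtering, which is what makes inference in fully connected models tractable.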

For learning, the surrogate loss

$L_{(x,y)}(q_{(x,y)}) = -\sum_{i = 1}^N \log q_{(x,y),i}(y_i)$

is minimized. Deriving the gradient with the chain rule results in a recursive definition, because each mean field update depends on the marginals of the previous iteration. Details can be found in [3]. Regarding the generalization from logistic regressors to deep networks, Schwing and Urtasun argue that the corresponding gradients can easily be derived. Unfortunately, the details (or examples) are missing. The proposed learning algorithm is thus a combination of a forward pass, marginal computation and a backward pass.
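The loss and the starting point of the backward pass can be sketched as follows. The sketch assumes that the final marginals are obtained as a row-wise softmax of some scores `a` (a hypothetical name); in that case the gradient of the loss with respect to `a` is the familiar `q - onehot(y)`. Propagating this gradient further through the earlier mean field iterations follows the recursion in [3] and is not shown here.

```python
import numpy as np

def softmax(a):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def surrogate_loss(q, labels):
    """L = -sum_i log q_i(y_i) for marginals q of shape (N, L) and labels of shape (N,)."""
    idx = np.arange(len(labels))
    return -np.sum(np.log(q[idx, labels]))

def grad_wrt_final_scores(q, labels):
    """If q = softmax(a) row-wise, then dL/da = q - onehot(labels)."""
    grad = q.copy()
    grad[np.arange(len(labels)), labels] -= 1.0
    return grad

rng = np.random.default_rng(0)
N, L = 4, 3
a = rng.normal(size=(N, L))              # scores entering the last normalization
labels = rng.integers(0, L, size=N)      # ground truth labels y_i

q = softmax(a)
loss = surrogate_loss(q, labels)
grad = grad_wrt_final_scores(q, labels)
print(loss, grad.shape)
```

In a full implementation, `grad` would be passed backwards through the mean field iterations and then into the network producing the unary scores, which corresponds to the backward pass mentioned above.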

[1] L.-C. Chen, A. G. Schwing, A. L. Yuille, R. Urtasun. Learning Deep Structured Models. CoRR, 2014.
[2] P. Krähenbühl, V. Koltun. Efficient inference in fully connected CRFs with Gaussian edge potentials. NIPS, 2011.
[3] P. Krähenbühl, V. Koltun. Parameter Learning and Convergent Inference for Dense Random Fields. ICML, 2013.
