
A. Criminisi, J. Shotton. Density Forests. In Decision Forests for Computer Vision and Medical Image Analysis, Springer London, 2013.

As part of "Decision Forests for Computer Vision and Medical Image Analysis" by Criminisi and Shotton, this chapter discusses density forests: decision forests used for density estimation, that is, decision forests for unsupervised learning or clustering. At the leaves, each tree contains a simple prediction model: a Gaussian distribution. As a result, a single density tree can be viewed as a special case of a Gaussian mixture model (see, e.g., [1, ch. 9]) using hard assignments instead of soft assignments.
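To make the leaf model concrete, the following is a minimal Python sketch of evaluating a single density tree; the node structure and names are my own and not prescribed by the chapter. A sample is routed to a leaf by axis-aligned split tests, and the leaf returns its Gaussian density weighted by the fraction of training samples that reached it, $\pi_l = |S_l|/|S|$. For brevity, the per-tree partition function $Z_t$ that the chapter uses to normalize the resulting piecewise Gaussian is omitted; the forest density is then the average over the trees, $p(x) = \frac{1}{T} \sum_t p_t(x)$.

```python
import numpy as np
from scipy.stats import multivariate_normal

class Node:
    """Hypothetical density tree node: internal nodes hold an axis-aligned
    split test, leaves hold a Gaussian and their share of training samples."""
    def __init__(self, feature=None, threshold=None, left=None, right=None,
                 weight=None, mean=None, cov=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right
        self.weight, self.mean, self.cov = weight, mean, cov

def tree_density(node, x):
    """Evaluate the density of a single tree at sample x (omitting the
    per-tree normalization constant Z_t used in the chapter)."""
    # Route the sample to its leaf using the axis-aligned split tests.
    while node.left is not None:
        node = node.left if x[node.feature] < node.threshold else node.right
    # Leaf model: a Gaussian weighted by the fraction of training samples
    # that reached this leaf.
    return node.weight * multivariate_normal.pdf(x, node.mean, node.cov)

def forest_density(trees, x):
    """Average the tree densities to obtain the forest estimate."""
    return sum(tree_density(t, x) for t in trees) / len(trees)
```

Note the hard assignment: each training sample contributes to exactly one leaf Gaussian, in contrast to the soft responsibilities computed by EM for a standard Gaussian mixture model.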

Criminisi and Shotton detail the effects of the number of trees as well as the maximum depth of each individual tree on the estimated density. Figure 1 shows results taken from their chapter.

Figure 1: The influence of $T$, the number of trees, and $D$, the maximum depth of each individual tree, on a toy dataset. Top row: $T=1$ with $D = 2, 4$ and $6$. Bottom row: $T=400$ with $D = 2, 4$ and $6$.

In contrast, figure 2 shows results using a custom implementation. Density trees are clearly capable of representing Gaussian mixtures nearly perfectly. However, they tend to overfit the data, so parameters such as the maximum depth or the minimum number of samples required for a split are critical. The implementation is still under development, but will be made publicly available in the coming months.
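For reference, splits in a density tree are chosen greedily by the unsupervised information gain proposed in the chapter, which, for Gaussian models, reduces to a comparison of covariance log-determinants. The sketch below is a simplified illustration using exhaustive axis-aligned candidate splits; interpreting min_samples as the node size required for attempting a split is an assumption on my part, as is the exact search procedure.

```python
import numpy as np

def log_det_cov(samples):
    # Up to additive constants, the differential entropy of a Gaussian
    # fit to the samples is 0.5 * log|Lambda|, so the gain only needs
    # covariance log-determinants.
    cov = np.atleast_2d(np.cov(samples, rowvar=False))
    return np.linalg.slogdet(cov)[1]

def information_gain(samples, left, right):
    """Unsupervised information gain of splitting samples into left/right."""
    n, n_l, n_r = len(samples), len(left), len(right)
    return log_det_cov(samples) \
        - (n_l / n) * log_det_cov(left) \
        - (n_r / n) * log_det_cov(right)

def best_split(samples, min_samples=250):
    """Search all axis-aligned splits; nodes with fewer than min_samples
    samples become leaves (the 'samples required for a split' above)."""
    if len(samples) < min_samples:
        return None
    n_dims = samples.shape[1]
    best = (-np.inf, None, None)  # (gain, feature, threshold)
    for d in range(n_dims):
        # Skip the smallest value so neither child is ever empty.
        for t in np.unique(samples[:, d])[1:]:
            mask = samples[:, d] < t
            left, right = samples[mask], samples[~mask]
            # Require enough samples per child for a stable covariance.
            if min(len(left), len(right)) <= n_dims:
                continue
            gain = information_gain(samples, left, right)
            if gain > best[0]:
                best = (gain, d, t)
    return best if best[1] is not None else None
```

This objective also hints at why the minimum number of samples matters so much in figure 2: a deep tree can keep increasing the gain by isolating ever tighter clusters of samples, which is exactly the overfitting behavior discussed above.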


Figure 2: A single density tree applied to the samples shown in the leftmost column. The samples are colored according to the corresponding leaf and were generated using a mixture of Gaussians, shown in the second column. The estimated density is shown in the third column. The result depends strongly on parameter settings such as the minimum number of samples required for a split. From top to bottom: $1000$ samples, $2$ components and $250$ samples required for a split; $10000$ samples, $7$ components and $250$ samples required for a split; $10000$ samples, $7$ components and $1000$ samples required for a split.
  • [1] C. Bishop. Pattern Recognition and Machine Learning. Springer Verlag, New York, 2006.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.