Jin et al. propose a proposal- and detection-free pipeline for instance segmentation. As illustrated in Figure 1, their approach consists of three steps: an initial semantic segmentation, an instance label transformation (i.e., the choice of representation used for predicting instance labels), and the integration of the instance labels with the semantic segmentation. Starting from a semantic segmentation sets this pipeline apart from many other approaches, which rely on object detectors or proposal detectors.
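To see why a dedicated instance representation is needed on top of the semantic segmentation, it helps to look at the simplest possible baseline, which also appears in Figure 3: running connected components on the semantic labels. The following is a minimal sketch of that baseline, not the authors' implementation; the function name `connected_component_instances`, the 4-connectivity and the convention that class 0 is background are all assumptions made for illustration.

```python
from collections import deque


def connected_component_instances(semantic, background=0):
    """Give every 4-connected region of equal class label its own instance id.

    `semantic` is a list of rows of integer class labels (hypothetical format).
    Pixels labeled `background` keep instance id 0.
    """
    height, width = len(semantic), len(semantic[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 1
    for y in range(height):
        for x in range(width):
            if semantic[y][x] == background or instances[y][x] != 0:
                continue
            # Flood-fill the region that contains (y, x).
            cls = semantic[y][x]
            instances[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and semantic[ny][nx] == cls
                            and instances[ny][nx] == 0):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return instances


# Toy semantic map: two separate regions of class 1, one region of class 2.
semantic = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 2, 2],
]
instances = connected_component_instances(semantic)
for row in instances:
    print(row)
```

In this toy example the two class-1 regions are separated by background and therefore receive different ids. Two touching objects of the same class would form a single connected region and end up with one shared id, which is the failure case a learned instance representation has to resolve.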
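For semantic labeling, they build on an existing semantic segmentation approach, without CRF post-processing. This semantic segmentation is later merged with an inferred instance labeling. Jin et al. propose three different representations of the instance labeling (all three illustrated in Figure 2): a pixel-based affinity representation, a superpixel-based affinity representation and a boundary-based representation.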
Figure 3: Qualitative results for all three transformations. From left to right (for each row): input image, ground truth segmentation with instances, instance prediction using connected components, pixel-based affinity representation, superpixel-based affinity representation and boundary-based representation.
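To make the pixel-based affinity and boundary-based representations named in the caption more concrete, here is a small sketch of how such targets could be derived from a ground truth instance map. The exact definitions used by Jin et al. are not reproduced in this summary, so the conventions below are assumptions: an affinity of 1 means two 4-neighboring pixels share an instance id, and a boundary pixel is one with at least one 4-neighbor of a different id. The function names `pixel_affinities` and `boundary_map` are hypothetical.

```python
def pixel_affinities(instances):
    """Return (horizontal, vertical) affinity maps between 4-neighbors.

    horizontal[y][x] is 1 if pixels (y, x) and (y, x + 1) share an instance id.
    vertical[y][x] is 1 if pixels (y, x) and (y + 1, x) share an instance id.
    """
    height, width = len(instances), len(instances[0])
    horizontal = [
        [int(instances[y][x] == instances[y][x + 1]) for x in range(width - 1)]
        for y in range(height)
    ]
    vertical = [
        [int(instances[y][x] == instances[y + 1][x]) for x in range(width)]
        for y in range(height - 1)
    ]
    return horizontal, vertical


def boundary_map(instances):
    """Mark a pixel with 1 if any 4-neighbor carries a different instance id."""
    height, width = len(instances), len(instances[0])
    boundary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < height and 0 <= nx < width
                        and instances[ny][nx] != instances[y][x]):
                    boundary[y][x] = 1
                    break
    return boundary


# Toy instance map: instance 1 on the left, instance 2 on the right.
instance_map = [
    [1, 1, 2],
    [1, 1, 2],
]
horizontal, vertical = pixel_affinities(instance_map)
boundary = boundary_map(instance_map)
print(horizontal)  # [[1, 0], [1, 0]]
print(vertical)    # [[1, 1, 1]]
print(boundary)    # [[0, 1, 1], [0, 1, 1]]
```

Both targets are dense per-pixel maps, so under these assumptions they can be predicted with the same kind of network that produces the semantic segmentation; a superpixel-based variant would apply the same idea to neighboring superpixels instead of neighboring pixels.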
Overall, Jin et al. replace the overhead of object/proposal detectors used in other works with an additional overhead on the segmentation side. Qualitative results are shown in Figure 3.