[This is a follow-up to my original post on Superpixels.]
Today’s “paper of the day”: M. Van den Bergh, X. Boix, G. Roig, B. de Capitani and L. Van Gool, SEEDS: Superpixels Extracted via Energy-Driven Sampling, ECCV 2012. Website (with code).
The paper describes a very fast algorithm, “SEEDS”, for generating superpixels. SEEDS starts by partitioning an image into a grid of square superpixels, and then refines each superpixel by shifting pixels (or blocks of pixels) at boundaries from one superpixel to another. What’s cool about this approach is that it can be made very fast: each update is local and only requires very simple comparisons to decide whether to transfer ownership of a pixel. Moreover, the algorithm can be stopped at any point in time. This is how the authors are able to run at 30 Hz; running the algorithm longer results in better segmentations. While the presentation of the paper has some issues (convoluted descriptions, overly complex notation, tiny unreadable figures), the algorithm itself is very simple, fast, and effective. Good stuff.
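To make the grid-then-refine idea concrete, here is a minimal sketch of a SEEDS-style update loop. It is not the authors' implementation: it assumes a grayscale image with values in [0, 1], only does pixel-level moves (no block-level updates), compares normalized histogram bin counts as the "very simple comparison", and skips the connectivity check a real implementation needs. The names grid_labels and refine, and the parameters n_bins and max_sweeps, are my own.

```python
import numpy as np

def grid_labels(h, w, cell):
    """Initial partition: a regular grid of square superpixels of side `cell`."""
    n_cols = (w + cell - 1) // cell
    rows = np.arange(h) // cell
    cols = np.arange(w) // cell
    return rows[:, None] * n_cols + cols[None, :]

def refine(image, labels, n_bins=8, max_sweeps=10):
    """Move boundary pixels to a neighbouring superpixel whose intensity
    histogram fits them better. Simplified sketch, not the paper's code."""
    h, w = image.shape
    labels = labels.copy()
    # Quantize intensities (assumed in [0, 1]) into histogram bins.
    bins = np.minimum((image * n_bins).astype(int), n_bins - 1)
    n_labels = int(labels.max()) + 1
    hist = np.zeros((n_labels, n_bins), dtype=np.int64)
    np.add.at(hist, (labels.ravel(), bins.ravel()), 1)
    size = hist.sum(axis=1)

    for _ in range(max_sweeps):          # can be stopped after any sweep
        moved = 0
        for y in range(h):
            for x in range(w):
                a, b = labels[y, x], bins[y, x]
                if size[a] <= 1:         # never empty a superpixel
                    continue
                for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    c = labels[ny, nx]
                    if c == a:
                        continue         # not a boundary with this neighbour
                    # Is this pixel's bin relatively more common in c than in a?
                    # Cross-multiplied to stay in integer arithmetic.
                    if hist[c, b] * size[a] > hist[a, b] * size[c]:
                        hist[a, b] -= 1
                        size[a] -= 1
                        hist[c, b] += 1
                        size[c] += 1
                        labels[y, x] = c
                        moved += 1
                        break
        if moved == 0:                   # converged early
            break
    return labels
```

Calling refine(img, grid_labels(h, w, cell)) on a small synthetic image with one sharp intensity edge is an easy way to watch the grid boundaries snap to the edge. Because this sketch has no connectivity or compactness constraint, the resulting superpixels can end up thin or oddly shaped.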
Some superpixel results (taken from author website):
The quality of the superpixel segmentations is quite high according to the standard error metrics for superpixels (see this paper for an excellent description of the error metrics). In fact, the quality of the superpixels is much higher than that of SLIC, the subject of my previous post. One issue I realized about the SLIC paper is that it presented and compared results at only a single setting of the number of superpixels — the SEEDS paper does a much better job of evaluating performance as a function of the number of superpixels.
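Boundary recall is one of those standard metrics: the fraction of ground-truth boundary pixels that have a superpixel boundary within a few pixels. Below is a rough NumPy-only sketch, not the benchmark code used in either paper. The function names, the square tolerance window, and the default tol=2 are my own assumptions.

```python
import numpy as np

def boundary_map(labels):
    """True where a pixel's label differs from its right or lower neighbour."""
    b = np.zeros(labels.shape, dtype=bool)
    b[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    b[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return b

def dilate(mask, r):
    """Grow a boolean mask by r pixels using a (2r+1) x (2r+1) square window."""
    h, w = mask.shape
    padded = np.pad(mask, r, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def boundary_recall(gt_labels, sp_labels, tol=2):
    """Fraction of ground-truth boundary pixels within `tol` pixels
    of some superpixel boundary."""
    gt = boundary_map(gt_labels)
    sp = dilate(boundary_map(sp_labels), tol)
    n_gt = gt.sum()
    return float((gt & sp).sum() / n_gt) if n_gt else 1.0
```

The tolerance matters a lot: a generous tol hides poorly localized edges, so recall numbers are only comparable when computed with the same tolerance.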
Overall, I’m a big fan of the work on superpixels. In my opinion segmentation has received a disproportionate amount of attention in the literature, but really superpixels are much more useful (and more commonly used) than complete segmentations. Superpixels are an acknowledgement that segmentation is not an end in itself but rather a pre-processing step for further computer vision.
What I’d really like to see is superpixel algorithms that can achieve extremely high boundary recall (currently boundary recall is at around 90% for 600 superpixels or so, although the localization of the edges is a bit rough). While of course this is THE goal of superpixel algorithms, it’s surprising to me that achieving higher recall is as challenging as it is. Segmentation is notoriously difficult, but with superpixels one can cheat by being very liberal at reporting edges. I’d be curious to see an analysis of what edges are still being missed. While there are some edges that have no gradient (illusory contours), this does not seem to be the dominant issue. So, what would it take to get higher boundary recall?