What is a good evaluation measure for semantic segmentation?

Gabriela Csurka, Diane Larlus, Florent Perronnin
In this work, we consider the evaluation of the semantic segmentation task. We discuss the strengths and limitations of the few existing measures, and propose new ways to evaluate semantic segmentation. First, we argue that a per-image score, rather than one computed over the entire dataset, brings considerably more insight. Second, we propose to take contours into account more carefully. Based on our experiments, we suggest best practices for the evaluation. Finally, we present a user study conducted to better understand how humans perceive the quality of image segmentations.
24th British Machine Vision Conference (BMVC), University of Bristol, 9-13 September 2013.