Classification and retrieval do not tell the whole story. For many tasks, we want a more refined description of the image. For instance, we might want to segment the image to retrieve a specific portion, or detect whether an object is in the image and where (e.g. is there a face? a car? where?). We may even want to detect a particular instance (e.g. is this particular flower in the image? has someone stamped this document as confidential?). Many techniques exist for these tasks, but user expectations keep rising and many of the techniques do not scale, so this remains a significant challenge.
Semantic segmentation
Semantic segmentation is the task of labelling each pixel of an image with its semantic category, so that objects are located precisely, at the pixel level. For some applications it may be necessary to divide the entire scene into regions, where each region corresponds to a particular class.
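As a minimal sketch of what "labelling the pixels" means in practice, suppose some classifier has already produced one score per class for every pixel. The class names, the toy scores, and the helper `label_pixels` below are all assumptions made for illustration; they do not correspond to any particular model or method.

```python
# Hypothetical class list; the text does not name specific categories.
CLASSES = ["sky", "grass", "cow"]


def label_pixels(scores):
    """Return a label map: for each pixel, the index of its highest-scoring class.

    `scores[y][x]` is a list holding one score per entry of CLASSES.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Toy 2x3 "image" with made-up per-pixel scores.
scores = [
    [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.7, 0.2, 0.1]],
    [[0.1, 0.7, 0.2], [0.1, 0.2, 0.7], [0.2, 0.6, 0.2]],
]

label_map = label_pixels(scores)
named = [[CLASSES[k] for k in row] for row in label_map]
```

Every pixel receives exactly one label, so the label map partitions the whole scene into regions, one set of pixels per class.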
For other tasks, only objects belonging to a predefined list of categories are of interest, and only these need to be retrieved and located in the image.
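When only a few categories matter, a per-pixel label map can be reduced to a mask and a location for each of them. The sketch below assumes such a label map is already available; the class indices and the helper `locate_classes` are hypothetical names introduced for illustration.

```python
def locate_classes(label_map, wanted):
    """For each wanted class index, return its pixel mask and bounding box.

    The box is (top, left, bottom, right), inclusive. Classes that do not
    appear in the label map are omitted from the result.
    """
    found = {}
    for k in wanted:
        mask = [[label == k for label in row] for row in label_map]
        coords = [
            (y, x)
            for y, row in enumerate(mask)
            for x, hit in enumerate(row)
            if hit
        ]
        if not coords:
            continue
        ys = [y for y, _ in coords]
        xs = [x for _, x in coords]
        found[k] = {"mask": mask, "box": (min(ys), min(xs), max(ys), max(xs))}
    return found


# Hypothetical class indices for a toy 3x4 label map.
BACKGROUND, CAR, FACE = 0, 1, 2
label_map = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 2],
]

result = locate_classes(label_map, wanted=[CAR, FACE])
```

This treats all pixels of a class as a single region. Two separate cars would share one mask and one box, so telling individual objects apart would need an extra step, such as connected-component grouping.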