Assessing performance on data not seen during training is critical for validating machine learning models. In computer vision, however, experimentally measuring the actual robustness and generalization performance of high-level recognition methods is difficult in practice, especially in video analysis, due to high data acquisition and labeling costs.
Furthermore, it is sometimes nearly impossible to acquire data for some test scenarios of interest (e.g., storms or accidents). In this work, we show how to leverage recent progress in computer graphics (especially off-the-shelf tools like game engines) to generate photo-realistic virtual worlds useful for assessing the performance of video analysis algorithms.
The main benefits of our approach are (i) the low cost of data generation, including with high-quality detailed annotations, (ii) the flexibility to automatically generate rich and varied scenes and their annotations, including under rare conditions to perform "what-if" and "ceteris paribus" analysis, and (iii) techniques to quantify the "transferability of conclusions" from synthetic to real-world data.
The main novel idea behind our approach consists in initializing the virtual worlds from 3D synthetic clones of real-world video sequences.
Citation: CVPR 2016, Las Vegas, Nevada, USA; June 26th - July 1st, 2016.
Also: MIT Technology Review, 16th March 2016.