Semi-automatic training of an object recognition system in scene camera data using gaze tracking and accelerometers
Cognolato Matteo, Graziani Mara, Giordaniello Francesca, Saetta Gianluca, Bassetto, Brugger Peter, Caputo Barbara, Müller Henning, Atzori Manfredo
Computer Vision Systems : 11th International Conference, ICVS 2017, Shenzhen, China, July 10-13, 2017 (pp. 175-184). 2017, Cham : Springer
Object detection and recognition algorithms usually require large, annotated training sets, and creating such datasets requires expensive manual annotation. Eye tracking can help in the annotation procedure, because humans use vision constantly to explore the environment and plan motor actions, such as grasping an object. In this paper we investigate the possibility of semi-automatically training an object recognition system in scene camera data using eye tracking and accelerometers, learning from the natural hand-eye coordination of humans. Our approach involves three steps. First, sensor data are recorded using eye tracking glasses in combination with accelerometers and surface electromyography, which are commonly used to control prosthetic hands. Second, a set of patches is extracted automatically from the scene camera data while the subject grasps an object. Third, a convolutional neural network is trained and tested using the extracted patches. Results show that the parameters of eye-hand coordination can be used to train an object recognition system semi-automatically: with proper sensors, they can be exploited to fine-tune a convolutional neural network for object detection and recognition. This approach opens interesting options for training computer vision and multi-modal data integration systems and lays the foundations for future applications in robotics. In particular, this work targets the improvement of prosthetic hands by recognizing the objects that a person may wish to use. However, the approach can easily be generalized.
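The second step described in the abstract, extracting patches from the scene camera while an object is grasped, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one accelerometer sample and one gaze point per video frame, a fixed acceleration threshold as a stand-in for grasp detection, and grayscale frames stored as nested lists. All function names and parameters (`grasp_frames`, `crop_patch`, `extract_patches`, `threshold`, `half_size`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]  # grayscale image as rows of pixel values (assumed format)


def grasp_frames(accel_magnitude: Sequence[float], threshold: float) -> List[int]:
    """Return indices of frames whose accelerometer magnitude exceeds a threshold.

    Assumption: a simple threshold stands in for whatever grasp detection
    the paper actually uses.
    """
    return [i for i, a in enumerate(accel_magnitude) if a > threshold]


def crop_patch(frame: Frame, gaze_xy: Tuple[int, int], half_size: int) -> Frame:
    """Crop a square patch centered on the gaze point, shifted to stay in the image."""
    height, width = len(frame), len(frame[0])
    side = 2 * half_size + 1
    x, y = gaze_xy
    # Clamp the top-left corner so the window lies fully inside the frame.
    left = min(max(x - half_size, 0), max(width - side, 0))
    top = min(max(y - half_size, 0), max(height - side, 0))
    return [row[left:left + side] for row in frame[top:top + side]]


def extract_patches(
    frames: Sequence[Frame],
    gaze_points: Sequence[Tuple[int, int]],
    accel_magnitude: Sequence[float],
    threshold: float,
    half_size: int,
) -> List[Frame]:
    """Collect one gaze-centered patch per frame flagged as part of a grasp."""
    return [
        crop_patch(frames[i], gaze_points[i], half_size)
        for i in grasp_frames(accel_magnitude, threshold)
    ]
```

In such a setup the resulting patches, labeled with the object being grasped, would form the training set for the convolutional neural network in the third step. The threshold and patch size here are illustrative and would need to be chosen for the actual sensors and camera resolution.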