IPSJ SIG Technical Report (Computer Vision and Image Media) 2014(CVIM192-32) 1-16 May 2014
In this study, we propose a framework for describing the relationship between video content and human gaze dynamics, which we call spatiotemporal correlation. This correlation consists of (1) the event-level spatiotemporal gaps between visual events in videos and gaze reactions and (2) the scene-level correlations between video scene structures and the corresponding gaze dynamics. Our framework describes this twofold relationship simply and efficiently by discovering and combining primitive spatiotemporal patterns of visually salient regions in videos and those of gaze. The effectiveness of the framework is confirmed on several practical gaze behavior analysis tasks in real environments, including attentional target identification, attentive state estimation, and gaze point prediction.
P. Benner, R. Findeisen, D. Flockerzi, U. Reichl, K. Sundmacher (Role: contributing author; Scope: Chap. 3, Magnus Egerstedt, Jean-Pierre de la Croix, Hiroaki Kawashima, and Peter Kingston, "Interacting with Networks of Mobile Agents")