Research Achievements

川嶋 宏彰 (Hiroaki Kawashima)

Basic Information

Affiliation
Professor, Graduate School of Information Science, University of Hyogo
Degree
Ph.D. in Informatics (Kyoto University)

J-GLOBAL ID
200901098553710896
researchmap Member ID
5000031823

External Links

Papers

 75
  • Hiroaki Kawashima, Yu Horii, Takashi Matsuyama
    11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Vols. 1-2, 442-445, 2010  Peer-reviewed
    A variety of methods for audio-visual integration, which integrate audio and visual information at the level of features, states, or classifier outputs, have been proposed for robust speech recognition. However, these methods do not always fully utilize auditory information when the signal-to-noise ratio becomes low. In this paper, we propose a novel approach to estimating speech signals in noisy environments. The key idea behind this approach is to exploit clean speech candidates generated using the timing structure between mouth movements and sound signals. We first extract a pair of feature sequences from the media signals and segment each sequence into temporal intervals. We then construct a cross-media timing-structure model of human speech by learning the temporal relations of overlapping intervals. Based on the learned model, we generate clean speech candidates from the observed mouth movements.
  • Ryo Yonetani, Hiroaki Kawashima, Takatsugu Hirayama, Takashi Matsuyama
    Proceedings - International Conference on Pattern Recognition, 101-104, 2010  Peer-reviewed
    We propose a novel method to estimate the object a user is focusing on, using the synchronization between the movements of objects and the user's eyes as a cue. We first design an event as a characteristic motion pattern and then embed it within the movement of each object. Since the user's ocular reactions to these events are easily detected with a passive camera-based eye tracker, we can estimate the object the user is focusing on as the one whose movement is most synchronized with the user's eye reaction. Experimental results from applying this system to dynamic content (consisting of scrolling images) demonstrate the effectiveness of the proposed method over existing methods.
  • Jean-Baptiste Dodane, Takatsugu Hirayama, Hiroaki Kawashima, Takashi Matsuyama
    Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops (ACII 2009), 201-208, 2009  Peer-reviewed
    Human-machine interaction still lacks smoothness and naturalness despite the widespread use of intelligent systems and emotive agents. To improve the interaction, this work proposes an approach to estimating a user's interest based on the relationships between the dynamics of the user's eye movements, more precisely the endogenous control mode of saccades, and the machine's proactive presentation of visual content. In a presentation phase specially designed to elicit endogenous saccades from the user, we analyzed the delays between the saccades and the presentation events. As a result, we confirmed that the delay during which the user's gaze remains on the previously presented content regardless of the next event, called resistance, is a good indicator for interest estimation (70% success over 20 experiments). It showed higher accuracy than conventional interest estimation based on gaze duration.
  • Akihiro Kobayashi, Jyunji Satake, Takatsugu Hirayama, Hiroaki Kawashima, Takashi Matsuyama
    IEEE International Conference on Automatic Face and Gesture Recognition (FG), September 2008  Peer-reviewed
  • 川嶋宏彰, 三井健, 松山隆司
    11th Meeting on Image Recognition and Understanding (MIRU), 339-346, July 2008  Peer-reviewed
  • 堀井悠, 川嶋宏彰, 松山隆司
    11th Meeting on Image Recognition and Understanding (MIRU), 193-200, July 2008  Peer-reviewed
  • Yu Horii, Hiroaki Kawashima, Takashi Matsuyama
    IEEE CVPR Workshop on Interaction Dynamics on Human Communicative Behavior Analysis, June 2008  Peer-reviewed
  • Hiroaki Kawashima, Takeshi Nishikawa, Takashi Matsuyama
    Conference on Human Factors in Computing Systems - Proceedings, 3585-3590, 2008  Peer-reviewed
    Turn-taking in a smooth conversation is supported by participants' anticipation of floor-handover timing. However, it becomes difficult to maintain natural turn-taking in video conferencing with transmission delays, because the utterances and movements of each participant are presented to the others with a time lag, which often leads to collisions of utterances. To facilitate smooth communication over a video-conferencing system, we propose a novel method, "Visual Filler," that fills the temporal gaps in turn-taking caused by delays. Visual Filler overlays on a screen with participant images an artificial visual stimulus that has a function similar to that of filler sounds. We evaluated the effectiveness of Visual Filler in reducing the unnaturalness of turn-taking in a simulated dyadic dialog with a delay.
  • 川嶋宏彰, 松山隆司
    IPSJ Journal, 48(12), 3680-3691, December 2007  Peer-reviewed
  • 川嶋宏彰, 西川猛司, 松山隆司
    IPSJ Journal, 48(12), 3715-3728, December 2007  Peer-reviewed
  • Hiroaki Kawashima, Takashi Matsuyama
    International Conference on Image Analysis and Processing (ICIAP), 789-794, September 2007  Peer-reviewed
  • 西川猛司, 川嶋宏彰, 松山隆司
    Information Technology Letters, 311-314, September 2007  Peer-reviewed
  • 川嶋宏彰, スコギンズ・リーバイ, 松山隆司
    Transactions of Human Interface Society, 9(3), 379-390, August 2007  Peer-reviewed
  • 平山高嗣, 川嶋宏彰, 西山正紘, 松山隆司
    Transactions of Human Interface Society, 9(2), 201-211, May 2007  Peer-reviewed
  • Hiroaki Kawashima, Kimitaka Tsutsumi, Takashi Matsuyama
    Articulated Motion and Deformable Objects, Proceedings (LNCS 4069), 453-463, 2006  Peer-reviewed
    Modeling and describing the temporal structure of multimedia signals captured simultaneously by multiple sensors is important for realizing human-machine interaction and motion generation. This paper proposes a method for modeling the temporal structure of multimedia signals based on temporal intervals of primitive signal patterns. Using the temporal differences between the beginning points and between the ending points of the intervals, we can explicitly express timing structure, that is, the synchronization and mutual dependency among media signals. We applied the model to video signal generation from an audio signal to verify its effectiveness.
  • Hiroaki Kawashima, Takashi Matsuyama
    IEICE Transactions on Fundamentals, E88-A(11), 3022-3035, November 2005  Peer-reviewed
    This paper addresses the parameter estimation problem of an interval-based hybrid dynamical system (interval system). The interval system has a two-layer architecture that comprises a finite state automaton and multiple linear dynamical systems. The automaton controls the activation timing of the dynamical systems based on a stochastic transition model between intervals. Thus, the interval system can generate and analyze complex multivariate sequences that consist of temporal regimes of dynamic primitives. Although the interval system is a powerful model for representing human behaviors such as gestures and facial expressions, the learning process has a paradoxical nature: temporal segmentation of primitives and identification of the constituent dynamical systems need to be solved simultaneously. To overcome this problem, we propose a multiphase parameter estimation method that consists of a bottom-up clustering phase for the linear dynamical systems and a refinement phase for all the system parameters. Experimental results show that the method can identify the hidden dynamical systems behind the training data and refine the system parameters successfully.
  • 川嶋宏彰, 西山正紘, 松山隆司
    Information Technology Letters, 153-156, September 2005  Peer-reviewed
  • Hiroaki Kawashima, Takashi Matsuyama
    3rd International Conference on Advances in Pattern Recognition (ICAPR 2005, S. Singh et al., Eds., Springer LNCS 3686), 229-238, August 2005  Peer-reviewed
  • Levi Scoggins, 川嶋宏彰, 松山隆司
    Interaction 2005 (D-404), 1-2, March 2005  Peer-reviewed
  • M Nishiyama, H Kawashima, T Hirayama, T Matsuyama
    Analysis and Modelling of Faces and Gestures, Proceedings (LNCS 3723), 140-154, 2005  Peer-reviewed
    This paper presents a method for interpreting facial expressions based on temporal structures among partial movements in facial image sequences. To extract the structures, we propose a novel facial expression representation, which we call a facial score, similar to a musical score. The facial score enables us to describe facial expressions as spatio-temporal combinations of temporal intervals; each interval represents a simple motion pattern with the beginning and ending times of the motion. Thus, we can classify fine-grained expressions from multivariate distributions of temporal differences between the intervals in the score. In this paper, we provide a method to obtain the score automatically from input images using bottom-up clustering of dynamics. We evaluate the efficiency of facial scores by comparing the temporal structure of intentional smiles with that of spontaneous smiles.
  • 川嶋宏彰, 堤公孝, 松山隆司
    7th Workshop on Information-Based Induction Sciences (IBIS), 86-93, November 2004  Peer-reviewed
  • 川嶋宏彰, 堤公孝, 松山隆司
    Information Technology Letters, 175-178, September 2004  Peer-reviewed
  • Hiroaki Kawashima, Takashi Matsuyama
    Systems and Computers in Japan, 34(14), 1-12, December 2003  Peer-reviewed
    This paper proposes a system architecture for event recognition that dynamically integrates information from multiple sources (e.g., multimodal data from visual and auditory sensors). The proposed system consists of multiple event classifiers called Continuous State Machines (CSMs). Each CSM has a state transition rule in a continuous state space and classifies time-varying patterns from a different single source. Since the rule is defined as an extension of the Kalman filter (i.e., the next state is deduced from a trade-off between the input data and the model's prediction), CSMs support dynamic time warping and robustness against noise. We then introduce an interaction method among CSMs to classify events from multiple sources. A continuous state space (i.e., a vector space) allows us to design the interaction as the minimization of an energy function. This interaction enables the system to dynamically suppress unreliable classifiers and improves system reliability and the accuracy of event classification in dynamically changing situations (e.g., when the object is temporarily occluded from one of multiple cameras in a gesture recognition task). Experimental results on gesture recognition with two cameras show the effectiveness of the proposed system.
  • 川嶋宏彰, 松山隆司
    IEICE Transactions (Japanese Edition), J85-D-II(12), 1801-1812, December 2002  Peer-reviewed
  • H Kawashima, T Matsuyama
    16th International Conference on Pattern Recognition (ICPR), Vol. 2, Proceedings, 785-789, 2002  Peer-reviewed
    This paper proposes a system architecture for event recognition that integrates information from multiple sources (e.g., gesture and speech recognition from distributed sensors in the real world). The proposed system consists of multiple recognizers named Continuous State Machines (CSMs). Each CSM has a state transition rule in a continuous state space and classifies time-varying patterns from a single source. Since the rule is defined as a simplification of the Kalman filter (i.e., the next state is deduced from a trade-off between the input data and the model's prediction), CSMs support dynamic time warping and robustness against noise. We then introduce an interaction method among CSMs to classify events from multiple sources. A continuous state space (i.e., a vector space) allows us to design the interaction as the recursive minimization of an energy function. This interaction enables the system to dynamically shift its focus across the multiple sources, and improves the reliability and accuracy of event classification in dynamically changing situations (e.g., when the object is temporarily occluded from one of multiple cameras in a gesture recognition task). Experimental results on gesture recognition with two cameras show the effectiveness of the proposed system.

MISC

 70

Books and Other Publications

 6
  • 浮田 浩行, 濱上 知樹, 藤吉 弘亘, 大町 真一郎, 戸田 智基, 岩崎 敦, 小林 泰介, 鈴木 亮太, 木村 雄喜, 橋本 大樹, 玉垣 勇樹, 水谷 麻紀子, 永田 毅, 木村 光成, 李 晃伸, 川嶋 宏彰 (Role: co-author; Contribution: Chapter 11, 11 pages)
    Corona Publishing, January 2023 (ISBN: 9784339033854)
  • Katsushi Ikeuchi (Editor) (Role: contributing author; Contribution: "Active Appearance Models")
    Springer, October 14, 2021 (ISBN: 3030634159)
  • 笹島 宗彦 (Ed.) (Role: contributing author; Contribution: pp. 36-50 (Section 3.2, Correlation))
    Asakura Publishing, April 5, 2021 (ISBN: 4254129114)
  • P. Benner, R. Findeisen, D. Flockerzi, U. Reichl, K. Sundmacher (Role: contributing author; Contribution: Chap. 3, Magnus Egerstedt, Jean-Pierre de la Croix, Hiroaki Kawashima, and Peter Kingston, "Interacting with Networks of Mobile Agents")
    Birkhäuser/Springer, 2014
  • 乾敏郎, 川口潤, 吉川左紀子 (Role: contributing author; Contribution: Part I, Chapter 10, "Timing")
    Minerva Shobo, 2010

Presentations

 31

Major Courses Taught

 19

Research Projects (Joint Research and Competitive Funding)

 17

Industrial Property Rights

 2

Academic Contribution Activities

 7

Social Contribution Activities

 12