Researcher Achievements

中野 有紀子

Yukiko Nakano

Basic Information

Affiliation
Professor, Department of Science and Technology, Faculty of Science and Technology, Seikei University
Degree
Ph.D. in Information Science and Technology (The University of Tokyo)

J-GLOBAL ID
201101020839458565
researchmap Member ID
B000004842


She completed the master's program at the Graduate School of Education, The University of Tokyo, in 1990 and joined Nippon Telegraph and Telephone Corporation (NTT) in the same year. In 2002 she completed the master's program in Media Arts & Sciences at MIT. She then served as an expert researcher at the JST Research Institute of Science and Technology for Society (RISTEX), a specially appointed associate professor at the Graduate School of Engineering, Tokyo University of Agriculture and Technology, and an associate professor in the Department of Computer and Information Science, Faculty of Science and Technology, Seikei University, and is currently a professor in that department. Her research focuses on conversational agents capable of verbal and nonverbal communication with people, aiming at intelligent and natural user interfaces. She holds a Ph.D. in Information Science and Technology and is a member of ACM, the Japanese Society for Artificial Intelligence (JSAI), IEICE, and IPSJ.

Career

 2

Papers

 51
  • Candy Olivia Mawalim, Shogo Okada, Yukiko I. Nakano, Masashi Unoki
    Journal on Multimodal User Interfaces 17(2) 47-63 2023
  • Atsushi Ito, Yukiko I. Nakano, Fumio Nihei, Tatsuya Sakato, Ryo Ishii, Atsushi Fukayama, Takao Nakamura
    J. Inf. Process. 31 34-44 2023
  • Atsushi Ito, Yukiko I. Nakano, Fumio Nihei, Tatsuya Sakato, Ryo Ishii, Atsushi Fukayama, Takao Nakamura
    IUI 2022: 27th International Conference on Intelligent User Interfaces 85-88 2022
  • Fumio Nihei, Ryo Ishii, Yukiko I. Nakano, Kyosuke Nishida, Ryo Masumura, Atsushi Fukayama, Takao Nakamura
    INTERSPEECH 1086-1090 2022
  • 久芳 和己, 中野 有紀子, 岡田 将吾
    Proceedings of the Annual Conference of JSAI JSAI2022 3C4GS604-3C4GS604 2022
    Various types of nonverbal information, such as gestures and facial expressions, play an important role in dyadic dialogue. The primary goal of this study is to analyze differences in nonverbal information in dyadic interactions between speakers of different languages. Using the NoXi corpus, a multimodal conversation dataset collected in three countries with different cultural backgrounds, we analyze and report, by means of ANOVA, the nonverbal cues that differ across the three countries' dialogue-pair groups (an illustrative one-way ANOVA sketch appears after this publication list).
  • Yukiko I. Nakano, Eri Hirose, Tatsuya Sakato, Shogo Okada, Jean-Claude Martin
    ICMI 5-14 2022
  • 鈴木 凱, 岡田 将吾, 黄 宏軒, 中野 有紀子
    Proceedings of the Annual Conference of JSAI JSAI2022 3H3OS12a01-3H3OS12a01 2022
    This paper proposes a method for improving the accuracy of models that estimate discussion quality from multimodal features. We use the MATRICS group-meeting corpus, which contains prosodic, facial-expression, linguistic, and speaking-turn features of participants observed in a total of 56 group meetings. To address a problem noted in previous work, namely that not every frame and not every modality in the time-series data is necessarily useful for estimating a given label, we propose the N-teaching model, which extends Co-teaching, a weakly supervised learning method effective for noisy labels, to be more robust to noise (a sketch of the underlying Co-teaching idea is given after this publication list). We also analyze the samples excluded from training as noise and compare the results with previous work. The proposed method achieved the best reported accuracy, an MAE of 0.309, on the Originality (novelty) index of discussion content.
  • Candy Olivia Mawalim, Shogo Okada, Yukiko I. Nakano
    ACM Transactions on Multimedia Computing, Communications, and Applications 17(4) 1-27 Nov 30, 2021
    Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants' engagement with each other in a specific realistic context. In this article, multimodal analysis was performed to estimate CS indices using a three-task-type group discussion dataset, the MATRICS corpus. The current research investigated the effectiveness of employing both static and time-series modeling, especially in task-independent settings. This investigation aimed to understand three main points: first, the effectiveness of time-series modeling compared to nonsequential modeling; second, multimodal analysis in a task-independent setting; and third, important differences to consider when dealing with task-dependent and task-independent settings, specifically in terms of modalities and prediction models. Several modalities were extracted (e.g., acoustics, speaking turns, linguistic-related movement, dialog tags, head motions, and face feature sets) for inferring the CS indices as a regression task. Three predictive models, including support vector regression (SVR), long short-term memory (LSTM), and an enhanced time-series model (an LSTM model with a combination of static and time-series features), were taken into account in this study. Our evaluation was conducted by using the R2 score in a cross-validation scheme. The experimental results suggested that time-series modeling can improve the performance of multimodal analysis significantly in the task-dependent setting (with the best R2 = 0.797 for the total CS index), with word2vec being the most prominent feature. Unfortunately, highly context-related features did not fit well with the task-independent setting. Thus, we propose an enhanced LSTM model for dealing with task-independent settings, and we successfully obtained better performance with the enhanced model than with the conventional SVR and LSTM models (the best R2 = 0.602 for the total CS index). In other words, our study shows that a particular time-series modeling can outperform traditional nonsequential modeling for automatically estimating the CS indices of a participant in a group discussion with regard to task dependency. (An illustrative sketch of an LSTM regressor that fuses static and time-series features appears after this publication list.)
  • Kazufumi Tsukada, Yutaka Takase, Yukiko I. Nakano
    ACM/IEEE International Conference on Human-Robot Interaction 93-94 Mar 2, 2015  Peer-reviewed
  • Takashi Yoshino, Yutaka Takase, Yukiko I. Nakano
    ACM/IEEE International Conference on Human-Robot Interaction 127-128 Mar 2, 2015  Peer-reviewed
  • Naoko Saito, Shogo Okada, Katsumi Nitta, Yukiko I. Nakano, Yuki Hayashi
    AAAI Spring Symposium - Technical Report SS-15-07 100-103 2015
  • 林 佑樹, 二瓶 芙巳雄, 中野 有紀子, 黄 宏軒, 岡田 将吾
    56(4) 1217-1227 2015  Peer-reviewed
  • Reo Suzuki, Yutaka Takase, Yukiko I. Nakano
    Proceedings of the Eighth International Conference on Advances in Computer-Human Interactions (ACHI 2015) 92-95 2015  Peer-reviewed
  • Sakiko Nihonyanagi, Yuki Hayashi, Yukiko I. Nakano
    GazeIn 2014 - Proceedings of the 7th ACM Workshop on Eye Gaze in Intelligent Human Machine Interaction: Eye-Gaze and Multimodality, Co-located with ICMI 2014 33-37 Nov 16, 2014
  • Fumio Nihei, Yukiko I. Nakano, Yuki Hayashi, Hung-Hsuan Huang, Shogo Okada
    ICMI 2014 - Proceedings of the 2014 International Conference on Multimodal Interaction 136-143 Nov 12, 2014  Peer-reviewed
  • Hung-Hsuan Huang, Roman Bednarik, Kristiina Jokinen, Yukiko I. Nakano
    Proceedings of the 16th International Conference on Multimodal Interaction (ICMI) 527-528 2014
  • 中野有紀子, 馬場直哉, 黄宏軒, 林佑樹
    Transactions of the Japanese Society for Artificial Intelligence 29(1) 69-79 2014  Peer-reviewed
  • 林佑樹, 小川裕史, 中野有紀子
    IPSJ Journal 55(1) 189-198 2014  Peer-reviewed
  • Vrzakova, H., Bednarik, R., Nihei, F., Nakano, Y.
    Proceedings of the 8th Nordic Conference on Human-Computer Interaction 915-918 2014  Peer-reviewed
  • Misato Yatsushiro, Naoya Ikeda, Yuki Hayashi, Yukiko I. Nakano
    GazeIn 2013 - Proceedings of the 2013 ACM Workshop on Eye Gaze in Intelligent Human Machine Interaction: Gaze in Multimodal Interaction, co-located with ICMI 2013 13-18 Dec 13, 2013
  • 石井 亮, 小澤 史朗, 川村 春美, 小島 明, 中野 有紀子
    IEICE Transactions on Information and Systems (Japanese Edition) J96-D(1) 110-119 2013  Peer-reviewed
  • 馬場直哉, 黄 宏軒, 中野有紀子
    Transactions of the Japanese Society for Artificial Intelligence 28(2) 149-159 2013  Peer-reviewed
  • Yukiko I. Nakano, Naoya Baba, Hung-Hsuan Huang, Yuki Hayashi
    ICMI '13: Proceedings of the 2013 ACM International Conference on Multimodal Interaction 35-42 2013  Peer-reviewed
  • Hung-Hsuan Huang, Hiroki Matsushita, Kyoji Kawagoe, Yoichi Sakai, Yuuko Nonaka, Yukiko Nakano, Kiyoshi Yasuda
    Proceedings of the 11th IEEE International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2012 295-299 2012  Peer-reviewed
  • 塚本 剛生, 中野 有紀子
    Transactions of the Virtual Reality Society of Japan 17(2) 79-89 2012  Peer-reviewed
    This paper proposes a direction giving avatar system in Metaverse, which automatically generates direction giving gestures based on linguistic information obtained from the user's chat text input and spatial information in Metaverse. First, we conduct an experiment to collect direction giving conversation corpus. Then, using the collected corpus, we analyze the relationship between the proxemics of conversation participants and the position of their direction giving gestures. Next, we analyze the relationship between linguistic features in direction giver's utterances and the shape of their spatial gestures. We define five categories of gesture concepts and four gesture shape parameters, and analyze the relationship between the gesture concepts and a set of gesture parameters. Based on these results, we propose an automatic gesture decision mechanism and implement a direction giving avatar system in Metaverse.
  • Yukiko I. Nakano, Yuki Fukuhara
    ICMI '12: Proceedings of the ACM International Conference on Multimodal Interaction 77-84 2012  Peer-reviewed
  • 石井亮, 大古亮太, 中野有紀子, 西田豊明
    IPSJ Journal 52(12) 3625-3636 2011  Peer-reviewed
  • 中野有紀子
    2011 International Conference on Intelligent User Interfaces (IUI2011), Workshop on Eye Gaze in Intelligent Human Machine Interaction 2011  Peer-reviewed
  • 中野有紀子
    Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS2011) 441-448 2011  Peer-reviewed
  • 中野有紀子
    The 11th International Conference on Intelligent Virtual Agents (IVA2011) 262-268 2011  Peer-reviewed
  • 中野有紀子
    The 11th International Conference on Intelligent Virtual Agents (IVA2011) 255-261 2011  Peer-reviewed
  • 中野有紀子
    Proceedings of the 11th International Conference on Intelligent Virtual Agents (IVA 2011) 1-13 2011  Peer-reviewed
  • 中野有紀子
    The 13th ACM International Conference on Multimodal Interaction (ICMI2011) 401-408 2011  Peer-reviewed
  • 黄 宏軒, 武田 信也, 小野 正貴, 中野 有紀子
    Proceedings of the Annual Conference of JSAI JSAI2010 1C13-1C13 2010
    This study develops an information-providing agent that supports decision-making conversations between two customers, using a travel agency scenario. From the users' speech states and changes in head orientation, the system estimates whether the conversation is in one of four states: consultation, small talk, question, or understanding, and its conversation-control mechanism decides how the agent should join the conversation according to these four states. We verify the validity of the indicators and the estimation accuracy through a participant evaluation experiment and report the results.
  • Yukiko I. Nakano, Toyoaki Nishida
    Proceedings of the Symposium on Conversational Informatics for Supporting Social Intelligence and Interaction: Situational and Environmental Information Enforcing Involvement in Conversation 128-135 2005
  • Yoshiyasu Ogasawara, Masashi Okamoto, Yukiko I. Nakano, Yong Xu, Toyoaki Nishida
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3683 289-295 2005
  • Kazunori Okamoto, Yukiko I. Nakano, Masashi Okamoto, Hung-Hsuan Huang, Toyoaki Nishida
    Knowledge-Based Intelligent Information and Engineering Systems 848-854 2005
  • Yukiko I. Nakano, Toshiyasu Murayama, Toyoaki Nishida
    IEICE Transactions on Information and Systems E87-D(6) 1338-1346 2004
  • Takaaki Hasegawa, Yukiko I. Nakano, Tsuneaki Kato
    Proceedings of the International Conference on Autonomous Agents 75-82 1997
  • Yukiko I. Nakano, Masashi Okamoto, Daisuke Kawahara, Qing Li, Toyoaki Nishida
    Peer-reviewed
    This paper proposes a method for assigning gestures to text based on lexical and syntactic information. First, our empirical study identified lexical and syntactic information strongly correlated with gesture occurrence and suggested that syntactic structure is more useful for judging gesture occurrence than local syntactic cues. Based on the empirical results, we have implemented a system that converts text into an animated agent that gestures and speaks synchronously.
  • Justine Cassell, Tom Stocky, Tim Bickmore, Yang Gao, Yukiko Nakano, Kimiko Ryokai, Catherine Vaucelle, Hannes Vilhjálmsson
    Peer-reviewed
    In this paper, we describe an embodied conversational kiosk that builds on research in embodied conversational agents (ECAs) and on information displays in mixed reality and kiosk format in order to display spatial intelligence. ECAs leverage people's abilities to coordinate information displayed in multiple modalities, particularly information conveyed in speech and gesture. Mixed reality depends on users' interactions with everyday objects that are enhanced with computational overlays. We describe an implementation, MACK (Media lab Autonomous Conversational Kiosk), an ECA who can answer questions about and give directions to the MIT Media Lab's various research groups, projects and people. MACK uses a combination of speech, gesture, and indications on a normal paper map that users place on a table between themselves and MACK. Research issues involve users' differential attention to hand gestures, speech and the map, and flexible architectures for embodied conversational agents that allow these modalities to be fused in input and generation.
  • Hung-Hsuan Huang, Tsuyoshi Masuda, Aleksandra Cerekovic, Kateryna Tarasenko, Igor S. Pandzic, Yukiko Nakano, Toyoaki Nishida
    Peer-reviewed
    Embodied Conversational Agents (ECAs) are computer-generated life-like characters that interact with human users in face-to-face conversations. To achieve natural multi-modal conversations, ECA systems are very sophisticated, require many building assemblies, and are therefore difficult for individual research groups to develop. This paper proposes a generic architecture, the Universal ECA Framework, which is currently under development and includes a blackboard-based platform and a high-level protocol to integrate general-purpose ECA components and ease ECA system prototyping.
  • Justine Cassell, Yukiko I. Nakano, Timothy W. Bickmore, Candace L. Sidner, Charles Rich
    Peer-reviewed
    This paper addresses the problem of designing embodied conversational agents that exhibit appropriate posture shifts during dialogues with human users. Previous research has noted the importance of hand gestures, eye gaze and head nods in conversations between embodied agents and humans. However, this research has neglected the role of other body movements, in particular postural shifts. We present an analysis of human monologues and dialogues that suggests that postural shifts can be predicted as a function of discourse state in monologues, and discourse state and conversation state in dialogues. On the basis of these findings, we have implemented an embodied conversational agent that uses a dialogue manager called Collagen in such a way as to generate postural shifts.
  • Yukiko Nakano, Gabe Reinstein, Tom Stocky, Justine Cassell
    Peer-reviewed
    We investigate the verbal and nonverbal means for grounding, and propose a design for embodied conversational agents that relies on both kinds of signals to establish common ground in human-computer interaction. We analyzed eye gaze, head nods and attentional focus in the context of a direction-giving task. The distribution of nonverbal behaviors differed depending on the type of dialogue move being grounded, and the overall pattern reflected a monitoring of lack of negative feedback. Based on these results, we present an ECA that uses verbal and nonverbal grounding acts to update dialogue state.
  • Yukiko I. Nakano, Kenji Imamura, Kenji Hnamura, Hisashi Ohara
    Peer-reviewed
    While recent advancements in virtual reality technology have created a rich communication interface linking humans and computers, there has been little work on building dialogue systems for 3D virtual worlds. This paper proposes a method for altering the instruction dialogue to match the user's view in a virtual environment. We illustrate the method with the system MID-3D, which interactively instructs the user on dismantling some parts of a car. First, in order to change the content of the instruction dialogue to match the user's view, we extend the refinement-driven planning algorithm by using the user's view as a plan constraint. Second, to manage the dialogue smoothly, the system keeps track of the user's viewpoint as part of the dialogue state and uses this information for coping with interruptive subdialogues. These mechanisms enable MID-3D to set instruction dialogues in an incremental way; it takes account of the user's view even when it changes frequently.
  • Yukiko I. Nakano, Tsuneaki Kato
    Peer-reviewed
    The purpose of this paper is to identify effective factors for selecting discourse organization cue phrases in instruction dialogue that signal changes in discourse structure such as topic shifts and attentional state changes. By using a machine learning technique, a variety of features concerning discourse structure, task structure, and dialogue context are examined in terms of their effectiveness, and the best set of learning features is identified. Our results reveal that, in addition to the discourse structure already identified in previous studies, task structure and dialogue context play an important role. Moreover, an evaluation using a large dialogue corpus shows the utility of applying machine learning techniques to cue phrase selection.
  • Yukiko I. Nakano, Yoshiko Arimoto, Kazuyoshi Murata, Yasuhiro Asa, Mika Enomoto, Hirohiko Sagawa
    Peer-reviewed
    The aim of this paper is to develop animated agents that can control multimodal instruction dialogues by monitoring the user's behaviors. First, this paper reports on our Wizard-of-Oz experiments, and then, using the collected corpus, proposes a probabilistic model of fine-grained timing dependencies among multimodal communication behaviors: speech, gestures, and mouse manipulations. A preliminary evaluation revealed that our model can predict an instructor's grounding judgment and a listener's successful mouse manipulation quite accurately, suggesting that the model is useful in estimating the user's understanding and can be applied to determining the agent's next action.
  • Justine Cassell, Yukiko I. Nakano, Timothy W. Bickmore, Candace L. Sidner, Charles Rich
    Peer-reviewed
    This paper addresses the issue of designing embodied conversational agents that exhibit appropriate posture shifts during dialogues with human users. Previous research has noted the importance of hand gestures, eye gaze and head nods in conversations between embodied agents and humans. We present an analysis of human monologues and dialogues that suggests that postural shifts can be predicted as a function of discourse state in monologues, and discourse and conversation state in dialogues. On the basis of these findings, we have implemented an embodied conversational agent that uses Collagen in such a way as to generate postural shifts.
  • Matthias Rehm, Yukiko Nakano, Elisabeth André, Toyoaki Nishida
    Peer-reviewed
    We present our concept of integrating culture as a computational parameter for modeling multimodal interactions with virtual agents. As culture is a social rather than a psychological notion, its influence is evident in interactions where cultural patterns of behavior and interpretation mismatch. Nevertheless, when culture is taken seriously, its influence penetrates most layers of agent behavior planning and generation. In this article we concentrate on a first-meeting scenario, present our model of an interactive agent system, and identify where cultural parameters play a role. To assess the viability of our approach, we outline an evaluation study that is currently being set up.
  • Afia Akhter Lipi, Yukiko Nakano, Matthias Rehm
    Peer-reviewed
    The goal of this paper is to integrate culture as a computational term in embodied conversational agents by employing an empirical data-driven approach as well as a theoretical model-driven approach. We propose a parameter-based model that predicts nonverbal expressions appropriate for specific cultures. First, we introduce Hofstede's theory to describe the socio-cultural characteristics of each country. Then, based on previous studies of cultural differences in nonverbal behaviors, we propose expressive parameters to characterize nonverbal behaviors. Finally, by integrating socio-cultural characteristics and nonverbal expressive characteristics, we establish a Bayesian network model that predicts posture expressiveness from a country name, and vice versa (a toy illustration of this two-way inference follows this publication list).
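
A note on the JSAI 2022 NoXi analysis above: its core statistical step is a one-way ANOVA over per-dyad nonverbal feature values grouped by country. The sketch below is purely illustrative; the group labels and numbers are invented, and SciPy's f_oneway stands in for whatever statistics tooling the authors actually used.

```python
# Illustrative one-way ANOVA over a single nonverbal feature, with
# invented per-dyad values for three country groups (not NoXi data).
from scipy.stats import f_oneway

group_a = [0.21, 0.18, 0.25, 0.19, 0.23]   # e.g., head-nod rate per dyad (toy values)
group_b = [0.14, 0.12, 0.17, 0.15, 0.13]
group_c = [0.22, 0.27, 0.24, 0.20, 0.26]

f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # small p => groups differ on this feature
```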
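
The N-teaching model in the JSAI 2022 discussion-quality paper above extends Co-teaching, in which two networks each pick their small-loss (presumably clean) samples and train the other network on them. The PyTorch sketch below shows only this generic Co-teaching update, with placeholder models and a regression loss; it is not the authors' N-teaching implementation.

```python
# Minimal sketch of one Co-teaching update step (the baseline that
# N-teaching extends). Models, optimizers, and tensors are placeholders.
import torch
import torch.nn as nn

def coteaching_step(model_a, model_b, opt_a, opt_b, x, y, keep_ratio=0.8):
    """Each network keeps its small-loss samples; its peer trains only on those."""
    loss_fn = nn.MSELoss(reduction="none")           # per-sample regression loss

    with torch.no_grad():                            # ranking pass only
        loss_a = loss_fn(model_a(x).squeeze(-1), y)  # per-sample losses, network A
        loss_b = loss_fn(model_b(x).squeeze(-1), y)  # per-sample losses, network B
        k = max(1, int(keep_ratio * x.size(0)))      # number of samples to trust
        idx_a = torch.argsort(loss_a)[:k]            # A's small-loss samples
        idx_b = torch.argsort(loss_b)[:k]            # B's small-loss samples

    # Cross-update: A learns from the samples B trusts, and vice versa.
    opt_a.zero_grad()
    loss_fn(model_a(x[idx_b]).squeeze(-1), y[idx_b]).mean().backward()
    opt_a.step()

    opt_b.zero_grad()
    loss_fn(model_b(x[idx_a]).squeeze(-1), y[idx_a]).mean().backward()
    opt_b.step()
```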
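
The enhanced time-series model in the ACM TOMM 2021 article above combines static and time-series features in an LSTM regressor. The following is a minimal sketch of that kind of fusion; the layer sizes, feature dimensions, and class name are assumptions, not the published architecture.

```python
# Sketch of an LSTM regressor that fuses time-series and static features.
import torch
import torch.nn as nn

class StaticPlusSequenceRegressor(nn.Module):
    def __init__(self, seq_dim, static_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(seq_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim + static_dim, 32),
            nn.ReLU(),
            nn.Linear(32, 1),                  # one communication-skill index
        )

    def forward(self, seq_feats, static_feats):
        # seq_feats: (batch, time, seq_dim); static_feats: (batch, static_dim)
        _, (h_n, _) = self.lstm(seq_feats)
        fused = torch.cat([h_n[-1], static_feats], dim=1)
        return self.head(fused).squeeze(-1)    # predicted index per sample

# Example with random tensors: batch of 8, 100 time steps,
# 40 sequential features, 10 static features.
model = StaticPlusSequenceRegressor(seq_dim=40, static_dim=10)
pred = model(torch.randn(8, 100, 40), torch.randn(8, 10))   # shape: (8,)
```

The point being illustrated is simply that the last LSTM hidden state summarizes the sequential modalities, and the static (nonsequential) features are concatenated to it before the regression head.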
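
Finally, the culture-adaptation paper above describes a Bayesian network that predicts posture expressiveness from a country name and vice versa. The toy example below shows the two directions of inference with a single binary "expressive posture" variable and invented probabilities; it is not the published Hofstede-based model.

```python
# Toy two-way inference between country and posture expressiveness.
import numpy as np

countries = ["A", "B", "C"]                      # placeholder country labels
prior = np.array([1/3, 1/3, 1/3])                # P(country), assumed uniform
p_high = np.array([0.7, 0.4, 0.2])               # P(expressive posture | country), invented

# Forward direction: expressiveness predicted from a country.
print(dict(zip(countries, p_high)))

# Inverse direction (Bayes' rule): country inferred from observing an expressive posture.
posterior = prior * p_high
posterior /= posterior.sum()
print(dict(zip(countries, np.round(posterior, 3))))
```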

MISC

 42

Research Projects (Joint Research and Competitive Funding)

 12