Researcher Achievements

湯本 高行

ユモト タカユキ  (Takayuki Yumoto)

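Basic Information

Affiliation
Associate Professor, School of Social Information Science, University of Hyogo
Degree
Ph.D. in Informatics (Kyoto University)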

J-GLOBAL ID
200901000308952299
researchmap Member ID
5000091303

External Links

Awards

 1

Papers

 26
  • 川原 敬史, 橋口 友哉, 湯本 高行, 大島 裕明
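    IEICE Transactions on Information and Systems (Japanese Edition) J105-D(5) 322-336 May 1, 2022  Peer-reviewed
    This study proposes a method that takes a text describing the outline of an accident as input and estimates the severity of the injury or illness suffered by the person involved. The input texts are assumed to be documents of a few sentences. The proposed method estimates the severity corresponding to the input by solving a classification problem with machine learning. The data used in this study are the accident data published in the Accident Information Data Bank System (事故情報データバンクシステム). The input is the text recorded in the "accident outline" field. In the proposed method, the input text is converted into a distributed representation using the general-purpose language model BERT. As the BERT model, we use a pre-trained model trained on Japanese Wikipedia. To improve accuracy on the severity estimation task, we further introduce four techniques: (1) class weights, (2) ordinal classification, (3) multi-task learning, and (4) an additionally trained model based on token label estimation. We examined how the accuracy, Macro F1, RMSE, and confusion-matrix-based evaluation of severity estimation change with and without these techniques. The results show that introducing (1) class weights and (2) ordinal classification improved Macro F1 and RMSE, and that introducing (3) multi-task learning improved accuracy.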
  • Yuya Koyama, Takayuki Yumoto, Teijiro Isokawa, Naotake Kamiura
    Proceedings of the 13th International Conference on Ubiquitous Information Management and Communication, IMCOM 2019, Phuket, Thailand, January 4-6, 2019 996-1005 2019  Peer-reviewed
  • 飯塚翔, 湯本高行, 新居学, 上浦尚武
    IEICE Transactions on Information and Systems D (Web) (Japanese Edition) J101-D(4) 681-689 (WEB ONLY) April 1, 2018
  • International Journal of Biomedical Soft Computing and Human Sciences 22(1) 9-18 2017  Peer-reviewed
  • Sho Iizuka, Takayuki Yumoto, Manabu Nii, Naotake Kamiura
    Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, National Center of Sciences, Tokyo, Japan, June 7-10, 2016 2016  Peer-reviewed
  • Takayuki Yumoto, Takahiro Yamanaka, Manabu Nii, Naotake Kamiura
    DIGITAL LIBRARIES: KNOWLEDGE, INFORMATION, AND DATA IN AN OPEN ACCESS SOCIETY 10075 85-91 2016  Peer-reviewed
    We propose rarity-oriented retrieval methods for serendipity using two approaches. We define rare information as relevant and atypical information. We propose two approaches. In the first approach, we use social bookmark data. We introduce tag estimation to our previous work. The second approach is based on word co-occurrence in a dataset. In both approaches, we use conditional probabilities to express relevancy and atypicality. In experiments, we compared our methods with the relevance-oriented method, the diversity-oriented method, and another rarity-oriented method. Our methods using word co-occurrence obtained better nDCG scores than the other methods.
  • Yasuyuki Okamura, Takayuki Yumoto, Manabu Nii, Naotake Kamiura
    Proceedings of the 14th International Conference WWW/Internet 2015 55-62 January 2015  Peer-reviewed
    People are posting huge amounts of varied information on the Web as the popularity of social media continues to increase. The sentiment of a tweet posted on Twitter can reveal valuable information on the reputation of various targets both on the Web and in the real world. We propose a method to classify tweet sentiments by machine learning. In most cases, machine learning requires a significant amount of manually labeled data. Our method is different in that we use social bookmark data as training data for classifying tweets with URLs. In social bookmarks, comments are written using casual expressions, similar to tweets. Since tags in social bookmarks partly represent sentiment, they can be used as supervisory signals for learning. The proposed method moves beyond the basic "positive"/"negative" classification to classify impressions as "useful", "funny", "negative", and "other".
  • Takayuki Yumoto
    Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, NTCIR-11, National Center of Sciences, Tokyo, Japan, December 9-12, 2014 2014  Peer-reviewed
  • Takayuki Yumoto, Ryohei Tada, Manabu Nii, Kunihiro Sato
    2013 SECOND IIAI INTERNATIONAL CONFERENCE ON ADVANCED APPLIED INFORMATICS (IIAI-AAI 2013) 284-288 2013  Peer-reviewed
    In this paper, we propose the rarity of a Web page in a category given by a user as a way to find useful information that few people know. A rare Web page is a page that belongs to a given category and that is atypical in the category. We define the probability that the page is a rare Web page in the given category as a rarity score. The rarity score is the product of a relevancy score and an atypicality score. The relevancy is the probability that a Web page belongs to a category given by a user. The atypicality is the conditional probability that a page is atypical in the category when it belongs to the category. Both probabilities are calculated by using tags of social bookmark services and words in Web pages. We evaluated the proposed relevancy score by classifying whether Web pages belong to a certain category. We also evaluated the proposed rarity as a metric for ranking Web pages, and compared the rankings by relevancy and atypicality. We confirmed the usefulness of the rarity score for finding relevant and atypical pages.
  • Tatsuya Fujisaka, Takayuki Yumoto, Kazutoshi Sumiya
    WEB INFORMATION SYSTEMS AND MINING, PT II 6988 103-+ 2011  Peer-reviewed
    Conventional search engines are able to extract commonplace information by incorporating users' requests into their queries. Users perform niche requests when they want to obtain atypical objects or unique information. In these instances, it is difficult for users to expand their queries to match their niche requests. In this paper, we introduce a query suggestion method for finding objects that have atypical characteristics. Our method focuses on the property values of an object, and elicits atypical property values by using the relation between an object's name and a typical property value.
  • Ling Xu, Takayuki Yumoto, Shinya Aoki, Qiang Ma, Masatoshi Yoshikawa
    Proceedings of the Annual Hawaii International Conference on System Sciences 1-10 2011  Peer-reviewed
    The advantages of multimedia make video news believable and impressive to viewers, while the personal opinions and ideological perspectives hidden in the content still influence them. To reduce the risk of viewers being misled, we propose a method, based on a Material-Opinion model, for detecting inconsistent news items that report the same event while the viewer is watching one of them. In the Material-Opinion model, the main participants filmed as materials are presented to the viewer through the video stream, which is used to support the arguments put forward. Based on this model, given a series of multimedia news items reporting the same event, we explore inconsistency between any two of them by computing their dissimilarities in materials and in opinions. Material-dissimilarity is based on the appearance of the main participants in the video. Opinion-dissimilarity is calculated as the difference between two vectors consisting of the argument points extracted from the closed captions. If one of the dissimilarities is high and the other is low, we consider the two items inconsistent. We also show some experimental results to validate the proposed methods.
  • Yuko Koba, Takayuki Yumoto, Manabu Nii, Yutaka Takahashi
    2010 4th International Universal Communication Symposium, IUCS 2010 - Proceedings 350-354 2010  Peer-reviewed
    People often want to find the name of an object that they can describe but cannot name. It is, however, difficult to find such object names using conventional Web search engines. We therefore propose a new method for finding the object name from the descriptions given by a user. This method consists of two phases, the extraction phase and the validation phase. In the extraction phase, candidate words are extracted by conducting a Web search using a combination of the queries generated from the user's descriptions. In the validation phase, each candidate word is validated through a Web search using the candidate word. We rank the candidate words based on the user's description. We evaluated our algorithm by performing several tasks to find the object names from questions in Q&A sites. We also compared it with methods using queries consisting of all the words in the description and queries consisting of user-selected and user-generated words. The precision of our algorithm was higher than that of the other methods.
  • Katsumi Tanaka, Hiroaki Ohshima, Adam Jatowt, Satoshi Nakamura, Yusuke Yamamoto, Kazutoshi Sumiya, Ryong Lee, Daisuke Kitayama, Takayuki Yumoto, Yukiko Kawai, Jianwei Zhang, Shinsuke Nakajima, Yoichi Inagaki
    Proceedings of the 4th International Conference on Ubiquitous Information Management and Communication ICUIMC 10 147-156 2010  Peer-reviewed
    We describe a new concept and method for evaluating Web information credibility. The quality control of information (text, image, video, etc.) on the Web is generally insufficient due to low publishing barriers. As a result, there is a large amount of mistaken and unreliable information on the Web that can have detrimental effects on users. This calls for technology that facilitates judging the credibility (expertise and trustworthiness) of Web content and the accuracy of the information that users encounter on the Web. Such technology should be able to handle a wide range of tasks: extracting several credibility-related features from the target Web content; extracting reputation-related information for the target Web content, such as hyperlinks and social bookmarks, and evaluating its distribution; and evaluating features of the target content authors. We propose and describe methodologies for analyzing the credibility of Web information: (1) content analysis, (2) social support analysis, and (3) author analysis. We give an overview of our recent research activities on Web information credibility evaluation based on these methodologies.
  • Takeru Nakabayashi, Takayuki Yumoto, Manabu Nii, Yutaka Takahashi, Kazutoshi Sumiya
    ROLE OF DIGITAL LIBRARIES IN A TIME OF GLOBAL CHANGE 6102 112-+ 2010  Peer-reviewed
    We define the peculiarity of text as a metric of information credibility. Higher peculiarity means lower credibility. We extract the theme word and the characteristic words from text and check whether there is a subject-description relation between them. The peculiarity is defined using the ratio of the subject-description relation between a theme word and characteristic words. We evaluate the extent to which peculiarity can be used to judge credibility by classifying text from Wikipedia and Uncyclopedia in terms of peculiarity.
  • Takayuki Yumoto, Kazutoshi Sumiya
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS 6193 264-+ 2010  Peer-reviewed
    Social bookmarks are used to find Web pages that draw much attention. However, the tendency of pages to collect bookmarks differs by topic. Therefore, the number of bookmarks can indicate the attention intensity to pages, but it cannot be used as a metric of the intensity itself. We define the relative quantity of social bookmarks (RQS) for measuring the attention intensity to a Web page. The RQS is calculated using the number of social bookmarks of related pages. Related pages are found using similarity based on the specificity of social tags. We define two types of specificity: local specificity, which is the specificity for a user, and global specificity, which is the specificity common in a social bookmark service.
  • Ryouji Nonaka, Takayuki Yumoto, Manabu Nii, Yutaka Takahashi
    ACM International Conference Proceeding Series 350-354 2009  Peer-reviewed
    We propose a method for searching for comprehensible how-to information on the Web. In our how-to information search, we use lightweight analysis of Web pages to extract how-to information from Web pages obtained by conventional Web search engines and rank them according to their easily-viewable-degree. In the extraction process, we focus on expressions in Web page text blocks that describe procedures. In the ranking process, we focus on images, the effect of letter string and the length of the how-to information.
  • Shinya Aoki, Takayuki Yumoto, Manabu Nii, Yutaka Takahashi
    ACM International Conference Proceeding Series 344-349 2009  Peer-reviewed
    Recently, people often compare two objects using search engines. However, they often browse only the Web pages ranked highly by search engines, which may give biased information. We propose a method for searching for Web pages where two objects are compared using a search engine, extracting comparison points from those Web pages, and showing these points to users. Comparison points are keywords for comparing objects. The proposed method can be used to extract points for efficient comparison by using comparison expressions such as "Liquid Crystal TVs are better ..." and "... than Plasma TVs."
  • Takayuki Yumoto, Yuta Mori, Kazutoshi Sumiya
    SEVENTH INTERNATIONAL CONFERENCE ON CREATING, CONNECTING AND COLLABORATING THROUGH COMPUTING, PROCEEDINGS 121-+ 2009  Peer-reviewed
    In this paper, we propose a method of converting a given sequence of search queries about a certain topic into a sequence of search queries about a given different topic. We define the concept of a search skeleton for topic conversion. A search skeleton represents relationships between keywords in a query. A given sequence of search queries is converted into a sequence of search skeletons, which are in turn converted into a sequence of search queries about the target topic. We evaluated our method of search query conversion and found that the precision for deciding types of subtopic keywords in search queries was 84.4%, the precision for finding relational keywords was 35.7%, and the precision for converting dynamic subtopic keywords was 40.0%.
  • Toru Onoda, Takayuki Yumoto, Kazutoshi Sumiya
    PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON UNIVERSAL COMMUNICATION 162-+ 2008  Peer-reviewed
    Query-recommendation systems based on inputted queries have become widespread. These services are effective if users cannot input relevant queries. However, the conventional systems do not take into consideration the relevance between recommended queries. This paper proposes a method of obtaining related queries and clustering them by using the history of query frequencies in query logs. We define similarity between queries based on the history of query frequency and use it for clustering queries. We selected various queries, extracted related queries, and then clustered them. We found that our method was useful for clustering queries that were used around the same period.
  • Yutaka Kabutoya, Takayuki Yumoto, Satoshi Oyama, Keishi Tajima, Katsumi Tanaka
    2007 IEEE INTERNATIONAL WORKSHOP ON DATABASES FOR NEXT GENERATION RESEARCHERS 43-+ 2007  Peer-reviewed
    It is difficult to watch TV contents in an active manner such that the user can interactively select TV contents, because TV is originally a broadcast information medium. It is also difficult for users to judge whether the information in TV contents is valid, because conventional TV contents are not directly linked with related information or evidence. One way to cope with these problems is to provide complementary or comparative information for TV contents obtained from other media such as the Web. In our research, using the topic structure proposed by Ma et al., we evaluated the quality of TV contents and visualized the quality. In this paper, we defined "contents coverage," "generality," and "social acceptance" as aspects of TV contents' quality, and examined to what extent Web pages contain complementary information for TV contents. We also implemented a new system to complement TV contents with Web pages, called "TV contents spectrum analyzer," which visualizes the degrees of generality and social acceptance of TV contents using the WWW.
  • Yutaka Kabutoya, Takayuki Yumoto, Satoshi Oyama, Keishi Tajima, Katsumi Tanaka
    ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops 134 2006  Peer-reviewed
    Recently, it has become more common to search not Web contents but local contents, e.g., with Google Desktop Search. Google succeeded in Web search because of its PageRank algorithm for ranking search results. PageRank estimates the quality of Web pages based on their popularity, which in turn is estimated by the number and the quality of pages referring to them through hyperlinks. This algorithm, however, is not applicable when we search local contents without link structure, such as text data. In this research, we propose a method to estimate the quality of local contents without link structure by using the PageRank values of Web contents similar to them. Based on this estimation, we can rank the desktop search results. Furthermore, this method enables us to search contents across different resources such as Web contents and local contents. In this paper, we applied this method to Web contents, calculated the scores that estimate their quality, and compared them with their page quality scores by PageRank.
  • Takayuki Yumoto, Katsumi Tanaka
    DIGITAL LIBRARIES: ACHIEVEMENTS, CHALLENGES AND OPPORTUNITIES, PROCEEDINGS 4312 244-+ 2006  Peer-reviewed
    Conventional Web search engines rank their searched results page by page. That is, conventionally, the information unit for both searching and ranking is a single Web page. There are, however, cases where a set of searched pages shows a better similarity (relevance) to a given (keyword) query than each individually searched page. This is because the information a user wishes to have is sometimes distributed on multiple Web pages. In such cases, the information unit used for ranking should be a set of pages rather than a single page. In this paper, we propose the notion of a "page set ranking", which is to rank each pertinent set of searched Web pages. We describe our new algorithm of the page set ranking to efficiently construct and rank page sets. We present some experimental results and the effectiveness of our approach.
  • T Yumoto, K Tanaka
    DIGITAL LIBRARIES: IMPLEMENTING STRATEGIES AND SHARING EXPERIENCES, PROCEEDINGS 3815 301-310 2005  Peer-reviewed
    Conventional Web search engines evaluate each single page as a ranking unit. When the information a user wishes to have is distributed on multiple Web pages, it is difficult to find pertinent search results with these conventional engines. Furthermore, search result lists are hard to check and they do not tell us anything about the relationships between the searched Web pages. We often have to collect Web pages that reflect different viewpoints. Here, a collection of pages may be more pertinent as a search result item than a single Web page. In this paper, we propose the notion of "multiple viewpoint retrieval" in Web searches. Multiple viewpoint retrieval means searching for Web pages that have been described from different viewpoints for a specific topic, gathering multiple collections of Web pages, ranking each collection as a search result, and returning them as results. In this paper, we consider the case of page-pairs. We describe a feature-vector based approach to finding pertinent page-pairs. We also analyze the characteristics of page-pairs.
  • 湯本 高行, 田中 克己
    DBSJ Letters 3(2) 17-20 September 2004  Peer-reviewed
  • T Yumoto, Q Ma, K Sumiya, K Tanaka
    FOURTH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS ENGINEERING, PROCEEDINGS 83-92 2003  Peer-reviewed
    Dynamic content integration of multiple information sources is one way of providing richer content that will satisfy the diverse demands of users. In this paper, we propose an XML-based language to compose synchronized content from web and video content. The notable features of this language are as follows: (1) dynamic unit identification of content that is composed into synchronized content and (2) dynamic retrieval of content through pre-defined retrieval criteria. This dynamic identification and retrieval of composable units are based on the author's intentions. Content authors can specify the units of their content that are to be integrated into new content by describing the conditions concerning this content and the conditions concerning the surrounding content. Although the proposed language looks like SMIL (Synchronized Multimedia Integration Language), it differs in its dynamic identification and retrieval capabilities. Indeed, the proposed language works just like the meta-mechanism for conventional SMIL. That is, the script written by the proposed language can generate SMIL data as its output.

MISC

 87

Books and Other Publications

 1
  • 笹嶋 宗彦, 大島 裕明, 山本岳洋, 湯本 高行 (Role: Contributor)
    Asakura Publishing (朝倉書店) September 2023 (ISBN: 9784254129151)

Research Projects (Joint Research, Competitive Funding, etc.)

 6