Curriculum Vitae

Takayuki Yumoto

  (湯本 高行)

Profile Information

Affiliation
University of Hyogo
Degree
Ph.D. in Informatics (Kyoto University)

J-GLOBAL ID
200901000308952299
researchmap Member ID
5000091303

Education

 1

Awards

 1

Papers

 26
  • Takafumi Kawahara, Tomoya Hashiguchi, Takayuki Yumoto, Hiroaki Ohshima
    J105-D(5) 322-336, May 1, 2022  Peer-reviewed
    In this research, we propose a method for estimating the degree of injury from text documents that describe accidents. An input document is assumed to consist of a few sentences. The proposed method estimates the degree of injury by solving a classification problem with machine learning techniques. The data used in this research is the accident data published in the Accident Information Data Bank System, with the text in the “Summary of the accident” field used as input. The input text is represented as a distributed representation using the generic language model BERT, for which we use a model pre-trained on the Japanese Wikipedia. To improve performance on the injury-degree estimation task, we introduce four ideas: (1) class weights, (2) ordinal classification, (3) multitask learning, and (4) fine-tuning with token-label estimation. We examined the effects of using and not using these ideas on accuracy, macro F1, RMSE, and confusion matrices. The results showed that macro F1 and RMSE improve when (1) class weights and (2) ordinal classification are introduced, and that accuracy improves when (3) multitask learning is introduced.
  • Yuya Koyama, Takayuki Yumoto, Teijiro Isokawa, Naotake Kamiura
    Proceedings of the 13th International Conference on Ubiquitous Information Management and Communication, IMCOM 2019, Phuket, Thailand, January 4-6, 2019, 996-1005, 2019  Peer-reviewed
  • Sho Iizuka, Takayuki Yumoto, Manabu Nii, Naotake Kamiura
    IEICE Transactions on Information and Systems (Japanese Edition), J101-D(4) 681-689 (web only), Apr 1, 2018
  • Takayuki Yumoto, Takahiro Yamanaka, Manabu Nii, Naotake Kamiura
    International Journal of Biomedical Soft Computing and Human Sciences, 22(1) 9-18, 2017  Peer-reviewed
  • Sho Iizuka, Takayuki Yumoto, Manabu Nii, Naotake Kamiura
    Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, National Center of Sciences, Tokyo, Japan, June 7-10, 2016, 2016  Peer-reviewed
  • Takayuki Yumoto, Takahiro Yamanaka, Manabu Nii, Naotake Kamiura
    DIGITAL LIBRARIES: KNOWLEDGE, INFORMATION, AND DATA IN AN OPEN ACCESS SOCIETY, 10075 85-91, 2016  Peer-reviewed
    We propose rarity-oriented retrieval methods for serendipity, defining rare information as information that is both relevant and atypical. We take two approaches. In the first, we use social bookmark data, extending our previous work with tag estimation. The second is based on word co-occurrence in a dataset. In both approaches, we use conditional probabilities to express relevancy and atypicality. In experiments, we compared our methods with a relevance-oriented method, a diversity-oriented method, and another rarity-oriented method. Our methods using word co-occurrence obtained better nDCG scores than the other methods.
  • Yasuyuki Okamura, Takayuki Yumoto, Manabu Nii, Naotake Kamiura
    Proceedings of the 14th International Conference WWW/Internet 2015, 55-62, Jan, 2015  Peer-reviewed
    People are posting huge amounts of varied information on the Web as the popularity of social media continues to increase. The sentiment of a tweet posted on Twitter can reveal valuable information about the reputation of various targets, both on the Web and in the real world. We propose a method to classify tweet sentiments by machine learning. In most cases, machine learning requires a significant amount of manually labeled data. Our method differs in that we use social bookmark data as training data for classifying tweets with URLs. In social bookmarks, comments are written using casual expressions similar to tweets. Since tags in social bookmarks partly represent sentiment, they can be used as supervisory signals for learning. The proposed method moves beyond basic "positive"/"negative" classification to classify impressions as "useful", "funny", "negative", and "other".
  • Takayuki Yumoto
    Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, NTCIR-11, National Center of Sciences, Tokyo, Japan, December 9-12, 2014, 2014  Peer-reviewed
  • Takayuki Yumoto, Ryohei Tada, Manabu Nii, Kunihiro Sato
    2013 SECOND IIAI INTERNATIONAL CONFERENCE ON ADVANCED APPLIED INFORMATICS (IIAI-AAI 2013), 284-288, 2013  Peer-reviewed
    In this paper, we propose the rarity of a Web page in a user-given category as a way to find useful information that few people know. A rare Web page is a page that belongs to the given category and is atypical within it. We define the probability that a page is a rare Web page in the given category as its rarity score. The rarity score is the product of a relevancy score and an atypicality score. Relevancy is the probability that a Web page belongs to the category given by the user. Atypicality is the conditional probability that a page is atypical in the category given that it belongs to the category. Both probabilities are calculated using tags from social bookmark services and words in Web pages. We evaluated the proposed relevancy score by classifying whether Web pages belong to a certain category. We also evaluated the proposed rarity as a metric for ranking Web pages and compared the rankings by relevancy and atypicality. We confirmed the usefulness of the rarity score for finding relevant and atypical pages.
  • Tatsuya Fujisaka, Takayuki Yumoto, Kazutoshi Sumiya
    WEB INFORMATION SYSTEMS AND MINING, PT II, 6988 103-+, 2011  Peer-reviewed
    Conventional search engines can extract commonplace information by incorporating users' requests into queries. Users make niche requests when they want to obtain atypical objects or unique information, and in these instances it is difficult for them to expand their queries to match those requests. In this paper, we introduce a query-suggestion method for finding objects that have atypical characteristics. Our method focuses on the property values of an object and elicits atypical property values by using the relation between an object's name and a typical property value.
  • Ling Xu, Takayuki Yumoto, Shinya Aoki, Qiang Ma, Masatoshi Yoshikawa
    Proceedings of the Annual Hawaii International Conference on System Sciences, 1-10, 2011  Peer-reviewed
    The advantages of multimedia make video news believable and impressive to viewers, even while the personal opinions and ideological perspectives hidden in the content still influence them. To reduce the risk of being misled, we propose a method, based on a Material-Opinion model, for detecting inconsistent news items that report the same event while the viewer is watching one of them. In the Material-Opinion model, the main participants filmed as materials are presented to the viewer through the video stream, which is used to support the arguments put forward. Based on this model, given a series of multimedia news items reporting the same event, we detect inconsistency between any two of them by computing their dissimilarities in materials and in opinions. Material dissimilarity is based on the appearance of the main participants in the video. Opinion dissimilarity is calculated as the vector difference between two vectors consisting of the argument points extracted from the closed captions. If one of the dissimilarities is high and the other is low, we consider the two items inconsistent. We also show experimental results to validate the proposed methods.
  • Yuko Koba, Takayuki Yumoto, Manabu Nii, Yutaka Takahashi
    2010 4th International Universal Communication Symposium, IUCS 2010 - Proceedings, 350-354, 2010  Peer-reviewed
    People often want to know the names of objects that they can describe but cannot name, and it is difficult to find such object names using conventional Web search engines. We therefore propose a new method for finding an object name from descriptions given by a user. The method consists of two phases: extraction and validation. In the extraction phase, candidate words are extracted by conducting Web searches using combinations of queries generated from the user's descriptions. In the validation phase, each candidate word is validated through a Web search using that word, and the candidates are ranked based on the user's description. We evaluated our algorithm on several tasks of finding object names from questions on Q&A sites, and compared it with methods using queries consisting of all the words in the description and queries consisting of user-selected and user-generated words. The precision of our algorithm was higher than that of the other methods.
  • Katsumi Tanaka, Hiroaki Ohshima, Adam Jatowt, Satoshi Nakamura, Yusuke Yamamoto, Kazutoshi Sumiya, Ryong Lee, Daisuke Kitayama, Takayuki Yumoto, Yukiko Kawai, Jianwei Zhang, Shinsuke Nakajima, Yoichi Inagaki
    Proceedings of the 4th International Conference on Ubiquitous Information Management and Communication ICUIMC 10, 147-156, 2010  Peer-reviewed
    We describe a new concept and method for evaluating Web information credibility. The quality control of information (text, images, video, etc.) on the Web is generally insufficient due to low publishing barriers. As a result, there is a large amount of mistaken and unreliable information on the Web that can have detrimental effects on users. This calls for technology that facilitates judging the credibility (expertise and trustworthiness) of Web content and the accuracy of the information users encounter on the Web. Such technology should handle a wide range of tasks: extracting credibility-related features from the target Web content, extracting reputation-related information for it (such as hyperlinks and social bookmarks) and evaluating its distribution, and evaluating features of the content's authors. We propose and describe methodologies for analyzing the credibility of Web information: (1) content analysis, (2) social support analysis, and (3) author analysis. We give an overview of our recent research activities on Web information credibility evaluation based on these methodologies.
  • Takeru Nakabayashi, Takayuki Yumoto, Manabu Nii, Yutaka Takahashi, Kazutoshi Sumiya
    ROLE OF DIGITAL LIBRARIES IN A TIME OF GLOBAL CHANGE, 6102 112-+, 2010  Peer-reviewed
    We define the peculiarity of text as a metric of information credibility; higher peculiarity means lower credibility. We extract the theme word and the characteristic words from a text and check whether a subject-description relation holds between them. Peculiarity is defined using the ratio of subject-description relations between the theme word and the characteristic words. We evaluate the extent to which peculiarity can be used to judge credibility by classifying text from Wikipedia and Uncyclopedia in terms of peculiarity.
  • Takayuki Yumoto, Kazutoshi Sumiya
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 6193 264-+, 2010  Peer-reviewed
    Social bookmarks are used to find Web pages that draw much attention. However, the tendency of pages to collect bookmarks differs by topic. Therefore, the number of bookmarks indicates that a page attracts attention, but it cannot be used directly as a metric of attention intensity. We define the relative quantity of social bookmarks (RQS) for measuring the attention intensity of a Web page. The RQS is calculated using the number of social bookmarks of related pages, which are found using a similarity measure based on the specificity of social tags. We define two types of specificity: local specificity, which is specificity for a user, and global specificity, which is specificity common across a social bookmark service.
  • Ryouji Nonaka, Takayuki Yumoto, Manabu Nii, Yutaka Takahashi
    ACM International Conference Proceeding Series, 350-354, 2009  Peer-reviewed
    We propose a method for searching for comprehensible how-to information on the Web. In our how-to information search, we use lightweight analysis of Web pages to extract how-to information from pages obtained by conventional Web search engines and rank it according to how easily viewable it is. In the extraction process, we focus on expressions in Web page text blocks that describe procedures. In the ranking process, we focus on images, the effects of letter strings, and the length of the how-to information.
  • Shinya Aoki, Takayuki Yumoto, Manabu Nii, Yutaka Takahashi
    ACM International Conference Proceeding Series, 344-349, 2009  Peer-reviewed
    Search engines now make it easy to compare two objects. However, users tend to browse only the pages ranked highly by search engines, which may give biased information. We propose a method for searching Web pages in which two objects are compared, extracting comparison points from those pages, and showing these points to users. Comparison points are keywords for comparing objects. The proposed method extracts points for efficient comparison by using comparison expressions such as "Liquid Crystal TVs are better ..." and "... than Plasma TVs."
  • Takayuki Yumoto, Yuta Mori, Kazutoshi Sumiya
    SEVENTH INTERNATIONAL CONFERENCE ON CREATING, CONNECTING AND COLLABORATING THROUGH COMPUTING, PROCEEDINGS, 121-+, 2009  Peer-reviewed
    In this paper, we propose a method of converting a given sequence of search queries about a certain topic into a sequence of search queries about a different topic. We define the concept of a search skeleton for topic conversion. A search skeleton represents the relationships between keywords in a query. A given sequence of search queries is converted into a sequence of search skeletons, which are in turn converted into a sequence of search queries about the target topic. We evaluated our method of search query conversion and found that the precision for deciding the types of subtopic keywords in search queries was 84.4%, the precision for finding relational keywords was 35.7%, and the precision for converting dynamic subtopic keywords was 40.0%.
  • Toru Onoda, Takayuki Yumoto, Kazutoshi Sumiya
    PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON UNIVERSAL COMMUNICATION, 162-+, 2008  Peer-reviewed
    Query-recommendation systems based on input queries have become widespread. These services are effective when users cannot formulate relevant queries themselves. However, conventional systems do not take into consideration the relevance between recommended queries. This paper proposes a method of obtaining related queries and clustering them by using the history of query frequencies in query logs. We define a similarity between queries based on the history of query frequency and use it for clustering. We selected various queries, extracted related queries, and clustered them, and found that our method was useful for clustering queries that were used around the same time.
  • Yutaka Kabutoya, Takayuki Yumoto, Satoshi Oyama, Keishi Tajima, Katsumi Tanaka
    2007 IEEE INTERNATIONAL WORKSHOP ON DATABASES FOR NEXT GENERATION RESEARCHERS, 43-+, 2007  Peer-reviewed
    It is difficult to watch TV content in an active manner in which the user interactively selects content, because TV is originally a broadcast medium. It is also difficult for users to judge whether the information in TV content is valid, because conventional TV content is not directly linked with related or supporting information. One way to cope with these problems is to provide complementary or comparative information about TV content obtained from other media, such as the Web. In our research, using the topic structure proposed by Ma et al., we evaluated the quality of TV content and visualized it. In this paper, we define "content coverage," "generality," and "social acceptance" as aspects of TV content quality, and examine to what extent Web pages contain information that complements TV content. We also implemented a new system to complement TV content with Web pages, called the "TV contents spectrum analyzer," which visualizes the degrees of generality and social acceptance of TV content using the Web.
  • Yutaka Kabutoya, Takayuki Yumoto, Satoshi Oyama, Keishi Tajima, Katsumi Tanaka
    ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops, 134, 2006  Peer-reviewed
    Searching local contents rather than Web contents, e.g., with Google Desktop Search, is becoming more common. Google succeeded in Web search because of its PageRank algorithm for ranking search results. PageRank estimates the quality of Web pages based on their popularity, which in turn is estimated from the number and quality of pages referring to them through hyperlinks. This algorithm, however, is not applicable when searching local contents without link structure, such as text data. In this research, we propose a method to estimate the quality of local contents without link structure by using the PageRank values of similar Web contents. Based on this estimation, we can rank desktop search results. Furthermore, this method enables searching across different resources, such as Web contents and local contents. In this paper, we applied the method to Web contents, calculated scores that estimate their quality, and compared them with the pages' quality scores given by PageRank.
  • Takayuki Yumoto, Katsumi Tanaka
    DIGITAL LIBRARIES: ACHIEVEMENTS, CHALLENGES AND OPPORTUNITIES, PROCEEDINGS, 4312 244-+, 2006  Peer-reviewed
    Conventional Web search engines rank their results page by page; that is, the information unit for both searching and ranking is a single Web page. There are, however, cases where a set of retrieved pages shows a better similarity (relevance) to a given keyword query than each individually retrieved page, because the information a user wishes to have is sometimes distributed over multiple Web pages. In such cases, the information unit used for ranking should be a set of pages rather than a single page. In this paper, we propose the notion of "page set ranking", which ranks each pertinent set of retrieved Web pages. We describe our new page set ranking algorithm, which efficiently constructs and ranks page sets, and present experimental results showing the effectiveness of our approach.
  • Takayuki Yumoto, Katsumi Tanaka
    DIGITAL LIBRARIES: IMPLEMENTING STRATEGIES AND SHARING EXPERIENCES, PROCEEDINGS, 3815 301-310, 2005  Peer-reviewed
    Conventional Web search engines evaluate each single page as a ranking unit. When the information a user wishes to have is distributed over multiple Web pages, it is difficult to find pertinent search results with these conventional engines. Furthermore, search result lists are hard to check and tell us nothing about the relationships between the retrieved Web pages. We often have to collect Web pages that reflect different viewpoints, and a collection of pages may be more pertinent as a search result item than a single Web page. In this paper, we propose the notion of "multiple viewpoint retrieval" in Web search: searching for Web pages that describe a specific topic from different viewpoints, gathering multiple collections of such pages, ranking each collection as a search result, and returning them as results. We consider the case of page pairs, describe a feature-vector-based approach to finding pertinent page pairs, and analyze the characteristics of page pairs.
  • Takayuki Yumoto, Katsumi Tanaka
    DBSJ Letters (Nihon Database Gakkai Letters), 3(2) 17-20, Sep, 2004  Peer-reviewed
  • Takayuki Yumoto, Qiang Ma, Kazutoshi Sumiya, Katsumi Tanaka
    FOURTH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS ENGINEERING, PROCEEDINGS, 83-92, 2003  Peer-reviewed
    Dynamic content integration from multiple information sources is one way of providing richer content that satisfies the diverse demands of users. In this paper, we propose an XML-based language to compose synchronized content from Web and video content. The notable features of this language are: (1) dynamic identification of the content units to be composed into synchronized content, and (2) dynamic retrieval of content through pre-defined retrieval criteria. This dynamic identification and retrieval of composable units is based on the author's intentions: content authors can specify the units of their content to be integrated into new content by describing conditions on that content and on the surrounding content. Although the proposed language resembles SMIL (Synchronized Multimedia Integration Language), it differs in its dynamic identification and retrieval capabilities; indeed, it works as a meta-mechanism for conventional SMIL, in that a script written in the proposed language can generate SMIL data as its output.
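Several of the papers above (the IIAI-AAI 2013 and ICADL 2016 work) define a rarity score as the product of a relevancy probability and an atypicality probability. A minimal sketch of that scoring idea, with hypothetical probability values standing in for the estimates the papers derive from social bookmark tags and page words:

```python
def rarity_score(p_relevant: float, p_atypical_given_relevant: float) -> float:
    """Rarity = P(page belongs to category) * P(page atypical | belongs).

    In the papers, both probabilities are estimated from social bookmark
    tags and words in Web pages; here they are plain inputs.
    """
    return p_relevant * p_atypical_given_relevant

# Hypothetical example: rank three candidate pages by rarity.
pages = {
    "page_a": (0.9, 0.1),  # clearly on-topic, but typical
    "page_b": (0.8, 0.6),  # on-topic and fairly atypical
    "page_c": (0.2, 0.9),  # atypical, but barely on-topic
}
ranked = sorted(pages, key=lambda p: rarity_score(*pages[p]), reverse=True)
print(ranked)  # ['page_b', 'page_c', 'page_a']
```

The product form means a page must be both relevant and atypical to score highly; excelling at only one factor (like page_a or page_c) is not enough.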
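The 2022 injury-degree paper above introduces class weights to cope with label imbalance in the accident data. A framework-free sketch of one common inverse-frequency weighting scheme (the exact formula in the paper may differ; this "balanced" heuristic is an assumption for illustration):

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: rare classes get larger weights, so a
    weighted loss penalizes their misclassification more heavily."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# Hypothetical injury-degree labels (0 = none ... 3 = severe), heavily skewed.
labels = [0] * 70 + [1] * 20 + [2] * 8 + [3] * 2
weights = balanced_class_weights(labels)
print(weights)  # {0: 0.357..., 1: 1.25, 2: 3.125, 3: 12.5}
```

Multiplying each training example's loss by the weight of its class counteracts the tendency of a classifier to ignore the rare severe-injury classes, which is consistent with the paper's reported macro F1 improvement.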

Misc.

 87

Books and Other Publications

 1
  • Munehiko Sasajima, Hiroaki Ohshima, Takehiro Yamamoto, Takayuki Yumoto (Role: Contributor)
    Asakura Publishing, Sep, 2023 (ISBN: 9784254129151)

Research Projects

 6