Researcher Achievements

大島 裕明

オオシマ ヒロアキ  (Hiroaki Ohshima)

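Basic Information

Affiliation
Associate Professor, Graduate School of Information Science, University of Hyogo
Degree
Doctor of Informatics (Kyoto University)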

Researcher Number
90452317
J-GLOBAL ID
201401077923568388
researchmap Member ID
7000008756

Papers

 135
  • Hiroaki Ohshima, Adam Jatowt, Satoshi Oyama, Katsumi Tanaka
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 5802 167-180 2009  Peer-reviewed
    In this paper, we describe an approach for detection and visualization of coordinate term relationships over time and their evolution using temporal data available on the Web. Coordinate terms are terms with the same hypernym and they often represent rival or peer relationships of underlying objects. We have built a system that portrays the evolution of coordinate terms in an easy and intuitive way based on data in an online news archive collection spanning more than 100 years. With the proposed method, it is possible to see the changes in the peer relationships between objects over time together with the context of these relationships. The experimental results demonstrated the quite high precision of our method and indicated its high usefulness for particular knowledge discovery tasks. © 2009 Springer-Verlag Berlin Heidelberg.
  • Makoto P. Kato, Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka
    International Conference on Information and Knowledge Management, Proceedings 27-36 2009  Peer-reviewed
    We describe methods to search with a query by example in a known domain for information in an unknown domain by exploiting Web search engines. Relational search is an effective way to obtain information in an unknown field for users. For example, if an Apple user searches for Microsoft products, similar Apple products are important clues for the search. Even if the user does not know keywords to search for specific Microsoft products, the relational search returns a product name when simply queried with an example of an Apple product. More specifically, given a tuple containing three terms, such as (Apple, iPod, Microsoft), the term Zune can be extracted from the Web search results, where Apple is to iPod what Microsoft is to Zune. As a previously proposed relational search requires a huge text corpus to be downloaded from the Web, the results are not up-to-date and the corpus has a high construction cost. We introduce methods for relational search by using Web search indices. We consider methods based on term co-occurrence, on lexico-syntactic patterns, and on combinations of the two approaches. Our experimental results showed that the combination methods achieved the highest precision, and clarified the characteristics of the methods. Copyright 2009 ACM.
  • Hiroaki Ohshima, Satoshi Oyama, Hiroyuki Kondo, Katsumi Tanaka
    ACM International Conference Proceeding Series 193-196 2009  Peer-reviewed
    Since our daily lives strongly depend on information obtained by Web search, the credibility of Web search results has become crucial. An important aspect of the credibility of search results is the regionality of Web pages. In this paper, we propose a system for helping users assess the credibility of search results by measuring and presenting the regionality of support to Web pages. We conceive two different types of measures for evaluating "geographical social support": the uniformity of support and the proximity of support. The uniformity of geographical support (US) indicates the uniformity of the geographic distribution of Web pages linking to a Web page. It is calculated by using the Kullback-Leibler (KL) divergence. The proximity of geographical support (PS) expresses how a page is supported by pages geographically located close to the page. We describe our implemented prototype system that shows the two measures for Web search results. Copyright 2009 ACM.
  • 大島 裕明, 田中 克己
    日本データベース学会論文誌 7(3) 1-6 December 2008  Peer-reviewed
  • 金子 恭史, 中村 聡史, 大島 裕明, 田中 克己
    日本データベース学会論文誌 DBSJ Journal 7(1) 181-186 June 2008  Peer-reviewed
  • 稲川 雅之, 大島 裕明, 小山 聡, 田中 克己
    日本データベース学会論文誌 DBSJ Journal 7(1) 175-180 June 2008  Peer-reviewed
  • Yuhei Akahoshi, Hiroaki Ohshima, Yutaka Kidawara, Katsumi Tanaka
    INTERNATIONAL CONFERENCE ON INFORMATICS EDUCATION AND RESEARCH FOR KNOWLEDGE-CIRCULATING SOCIETY, PROCEEDINGS 181-+ 2008  Peer-reviewed
    We propose a way to wrap both Web information spaces and pervasive data environments by providing a uniform view of databases. Search results of search engines and processing results of several Web search services are wrapped and viewed as "virtual tables" of relational databases. The device manipulation environment and data captured by those devices are also viewed as "virtual tables". Invoking a ubiquitous device is regarded as the insertion of a tuple into a virtual table. Since both the Web and pervasive device environments are regarded as uniform virtual tables, it becomes easy to develop services over the Web and the real world.
  • Hiroaki Ohshima, Adam Jatowt, Satoshi Oyama, Katsumi Tanaka
    2008 INTERNATIONAL WORKSHOP ON INFORMATION-EXPLOSION AND NEXT GENERATION SEARCH : INGS 2008, PROCEEDINGS 61-68 2008  Peer-reviewed
    Certain data repositories provide search functionality for temporally ordered data. News archive search and blog search are examples of search interfaces that allow issuing structured queries composed of arbitrary terms and selected time constraints for performing temporal search. However, extracting aggregated knowledge, such as detecting the evolution of objects or their relationships, through these interfaces is difficult for users. In this paper, we discuss the problem of knowledge extraction and agglomeration from repositories of temporal data. In particular, we propose a method for detecting and visualizing changes in coordinate terms over time based on a news archive.
  • Shun Hattori, Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka
    PROGRESS IN WWW RESEARCH AND DEVELOPMENT, PROCEEDINGS 4976 99-110 2008  Peer-reviewed
    Concept hierarchies, such as hyponymy and meronymy relations, are very important for various natural language processing systems. Many researchers have tackled how to mine very large corpora of documents such as the Web for them not manually but automatically. However, their methods are mostly based on lexico-syntactic patterns as not necessary but sufficient conditions of concept hierarchies, so they can achieve high precision but low recall when using stricter patterns or they can achieve high recall but low precision when using looser patterns. In this paper, property inheritance from a concept to its hyponyms is assumed to be necessary and sufficient conditions of hyponymy relations to achieve high recall and not low precision, and we propose a method to acquire hyponymy relations from the Web based on property inheritance.
  • Adam Jatowt, Yukiko Kawai, Hiroaki Ohshima, Katsumi Tanaka
    HYPERTEXT'08: Proceedings of the 19th ACM Conference on Hypertext and Hypermedia, HT'08 with Creating'08 and WebScience'08 5-14 2008  Peer-reviewed
    The current Web is a dynamic collection where little effort is made to version pages or to enable users to access historical data. As a consequence, they generally do not have sufficient temporal support when browsing the Web. However, we think that there are many benefits to be obtained from integrating documents with their histories. For example, a document's history can enable us to travel back through time to establish its trustworthiness. This paper discusses the possible types of interactions that users could have with document histories and it presents several examples of systems that we have implemented for utilizing this historical data. To support our view, we present the results of an online survey conducted with the objective of investigating user needs for temporal support on the Web. Although the results indicated quite low use of Web archives by users, they simultaneously emphasized their considerable interest in page histories. Copyright 2008 ACM.
  • Makoto Kato, Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2008, PROCEEDINGS 5175 235-249 2008  Peer-reviewed
    Conventional Web image search engines can return reasonably accurate results for queries containing concrete terms, but the results are less accurate for queries containing only abstract terms, such as "spring" or "peace." To improve the recall ratio without drastically degrading the precision ratio, we developed a method that replaces an abstract query term given by a user with a set of concrete terms and that uses these terms in queries input into Web image search engines. Concrete terms are found for a given abstract term by making use of social tagging information extracted from a social photo sharing system, such as Flickr. This information is rich in user impressions about the objects in the images. The extraction and replacement are done by (1) collecting social tags that include the abstract term, (2) clustering the tags in accordance with the term co-occurrence of images, (3) selecting concrete terms from the clusters by using WordNet, and (4) identifying sets of concrete terms that are associated with the target abstract term by using a technique for association rule mining. Experimental results show that our method improves the recall ratio of Web image searches.
  • Yasufumi Kaneko, Satoshi Nakamura, Hiroaki Ohshima, Katsumi Tanaka
    Digital Libraries: Universal and Ubiquitous Access to Information, Proceedings 5362 71-81 2008  Peer-reviewed
    This paper proposes a method to allow users to search for Web pages according to their search intentions. We introduce a degree of "unconfidence" for each term in a Web search query. We first investigate the relationships among query terms by accessing a Web search engine. Next, according to the user's degree of unconfidence in each query term and the relationships among query terms, our system finds alternative terms by accessing a Web search engine. Then, our system generates a collection of new queries that are different from the original query and merges the Web search results obtained for each new query. We implemented our system and showed its usefulness based on user evaluation.
  • Masashi Yamaguchi, Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka
    Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008 249-257 2008  Peer-reviewed
    A method is described for discovering coordinate terms, such as "Honda" and "Nissan," for a given term, such as "Toyota," as well as their common topic terms, from the query logs of a Web search engine. Coordinate terms are good candidates for use in making comparisons. A HITS-based algorithm is applied to a bipartite graph between coordinate term candidates and co-occurrence patterns to identify coordinate and topic terms. Spectral analysis is used to distinguish coordinate terms corresponding to different aspects of the search term. As a result, we can discover terms related to the terms in a search engine query that reflect the needs and interests of the user. © 2008 IEEE.
  • 大島 裕明, 小山 聡, 田中 克己
    情報処理学会論文誌. データベース 48(20) 50-60 December 15, 2007  Peer-reviewed
    We propose a relational database interface to conventional Web search engines for processing "Web aggregate queries". Whereas the purpose of a Web search is usually to find specific Web pages, the purpose of processing a Web aggregate query is to obtain aggregated information from the overall Web. To process Web aggregate queries, we need several functions such as (1) Web searching, (2) natural language processing for text extraction, and (3) aggregation of data. Because a relational database system has a robust aggregation capability, we implemented a relational database interface to conventional Web search engines with a natural language processing capability. Using SQL through the relational interface, users can formulate and execute several ad hoc SQL queries. Local databases can also be joined with Web search results through the relational database interface. We describe the design and implementation of the relational database interface, and its applications.
  • 服部 峻, 大島 裕明, 小山 聡, 田中 克己
    日本データベース学会letters 6(2) 9-12 September 2007  Peer-reviewed
  • 大島 裕明, 小山 聡, 田中 克己
    日本データベース学会letters 6(2) 53-56 September 2007  Peer-reviewed
  • 大島 裕明, 中村 聡史, 田中 克己
    日本データベース学会Letters 6(1) 113-116 June 2007  Peer-reviewed
  • Satoshi Oyama, Taro Tezuka, Hiroaki Ohshima, Katsumi Tanaka
    Proceedings of DASFAA2007 International Workshop on Scalable Web Information Integration and Service (SWIIS2007) 9-12 April 2007  Peer-reviewed
  • 大島 裕明, 山口 雅史, 小山 聡, 田中 克己
    日本データベース学会letters 5(4) 37-40 March 2007  Peer-reviewed
  • 大島裕明, 小山聡, 田中克己
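    電子情報通信学会論文誌 D J90-D(2) 196-208 February 1, 2007  Peer-reviewed
    We propose a method that takes several documents the user already holds as a query and retrieves documents that stand in a categorical "sibling" relationship to them. The documents sought are those that would be classified, at a global level, into the same category as the query documents but whose content differs from them: documents that are similar yet distinct. We call this kind of retrieval "sibling-category document search". At present, users who want to investigate a topic exhaustively rely on Web search engines and the like: they think of terms likely to appear in the documents they are looking for, submit those terms as queries to retrieve a set of documents, and then examine the results one by one for documents that differ from those they already have. We believe there is a need for sibling-category document search in such situations. In this paper, we propose a method that uses several of the user's existing documents as a query representing the sibling category of the desired documents, and evaluates whether a given document is relevant to that query as a member of a sibling category.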
  • Satoshi Nakamura, Shinji Konishi, Adam Jatowt, Hiroaki Ohshima, Hiroyuki Kondo, Taro Tezuka, Satoshi Oyama, Katsumi Tanaka
    RESEARCH AND ADVANCED TECHNOLOGY FOR DIGITAL LIBRARIES, PROCEEDINGS 4675 38-+ 2007  Peer-reviewed
    Increased usage of Web search engines in our daily lives means that the trustworthiness of searched results has become crucial. User studies on the usage of search engines and analysis of the factors used to determine trust that users have in search results are described in this paper. Based on the analysis, we developed a system to help users determine the trustworthiness of Web search results by computing and showing each returned page's topic majority, topic coverage, locality of supporting pages (i.e., pages linked to each search result) and other information. The measures proposed in the paper can be applied to the search of Web-based libraries or can be useful in the usage of digital library search systems.
  • Shun Hattori, Taro Tezuka, Hiroaki Ohshima, Satoshi Oyama, Junpei Kawamoto, Keishi Tajima, Katsumi Tanaka
    MODELING AND USING CONTEXT 4635 248-+ 2007  Peer-reviewed
    This paper proposes a method of context-aware querying in mobile/ubiquitous Web searches, which provides mobile users with four capabilities: (1) context-aware keyphrase inference to help them input a keyphrase as a part of their keyword-based query, (2) context-aware subtopic tree generation to help them specify their information demand on one subtopic, (3) discovery of comparable keyphrases to their original query to help them make better decisions, and (4) meta vertical search focused on one subtopic to make the retrieval results more precise.
  • 大島 裕明, 小山 聡, 田中 克己
    情報処理学会論文誌. データベース 47(19) 98-112 December 15, 2006  Peer-reviewed
    We propose a method of using only a Web search engine index to discover coordinate terms, i.e., terms that have the same hypernym. Several research methods acquire coordinate terms, but they require huge corpora or many Web pages. Our proposed method uses only the information in a Web search engine index such as titles and snippets of Web pages. These are obtained by a few Web searches, and then they are parsed to discover coordinate terms. We focus attention on coordinate terms that are connected by the coordinating particle "ya," and use those to make queries for a Web search engine. Our method does not require any preprocessing, and can find coordinate terms for terms in any field. At the same time, we find the background context between a query term and each discovered coordinate term. Such a service for discovering coordinate terms can be used in any field for such purposes as query expansion, word remembrance support system, or finding comparable objects.
  • 山口 雅史, 大島 裕明, 小山 聡, 田中 克己
    日本データベース学会letters 5(2) 17-20 September 2006  Peer-reviewed
  • 野田 武史, 大島 裕明, 小山 聡, 田島 敬史, 田中 克己
    日本データベース学会letters 5(2) 69-72 September 2006  Peer-reviewed
  • Masashi Yamaguchi, Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka
    Proceedings of the Workshop on Web Search Technology - from Search to Semantic Search, in conjunction with the 1st Asian Semantic Web Conference (ASWC 2006) 223-234 September 2006  Peer-reviewed
  • 大島 裕明, 小山 聡, 田中 克己
    日本データベース学会論文誌 DBSJ Letters 5(1) 121-124 June 2006  Peer-reviewed
  • Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka
    WEB INFORMATION SYSTEMS - WISE 2006, PROCEEDINGS 4255 40-47 2006  Peer-reviewed
    We propose a method for searching coordinate terms using a traditional Web search engine. "Coordinate terms" are terms which have the same hypernym. There are several research methods that acquire coordinate terms, but they need parsed corpora or a lot of computation time. Our system does not need any preprocessing and can rapidly acquire coordinate terms for any query term. It uses a conventional Web search engine to do two searches where queries are generated by connecting the user's query term with a conjunction "OR". It also obtains background context shared by the query term and each returned coordinate term.
  • Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka
    DIGITAL LIBRARIES: ACHIEVEMENTS, CHALLENGES AND OPPORTUNITIES, PROCEEDINGS 4312 91-+ 2006  Peer-reviewed
    We propose methods of searching Web pages that are "semantically" regarded as "siblings" with respect to given page examples. That is, our approach aims to find pages that are similar in theme but have different content from the given sample pages. We called this "sibling page search". The proposed search methods are different from conventional content-based similarity search for Web pages. Our approach recommends Web pages whose "conceptual" classification category is the same as that of the given sample pages, but whose content is different from the sample pages. In this sense, our approach will be useful for supporting a user's opportunistic search, meaning a search in which the user's interest and intention are not fixed. The proposed methods were implemented by computing the "common" and "unique" feature vectors of the given sample pages, and by comparing those feature vectors with each retrieved page. We evaluated our method for sibling page search, in which our method was applied to test sets consisting of page collections from the Open Directory Project (ODP).
  • H Ohshima, S Oyama, K Tanaka
    FRONTIERS OF WWW RESEARCH AND DEVELOPMENT - APWEB 2006, PROCEEDINGS 3841 579-590 2006  Peer-reviewed
    A method is described for extracting semantic relationships between terms appearing in documents stored on a personal computer; these relationships can be used to personalize Web search. It is based on the assumption that the information a person stores on a personal computer and the directory structure in the PC reflect, to some extent, the person's knowledge, ideology, and concept classification. It works by identifying semantic relationships between the terms in documents on the PC; these relationships reflect the person's relative valuation of each term in a pair. The directory structure is examined to identify the deviations in the appearance of the terms within each directory. These deviations are then used to identify the relationships between the terms. Four relationships are defined: broad, narrow, co-occurrent, and exclusive. They can be used to personalize Web search through, for example, expansion of queries and re-ranking of search results.
  • T Kawashige, S Oyama, H Ohshima, K Tanaka
    FRONTIERS OF WWW RESEARCH AND DEVELOPMENT - APWEB 2006, PROCEEDINGS 3841 486-497 2006  Peer-reviewed
    When reading a Web page or editing a word processing document, we often search the Web by using a term on the page or in the document as part of a query. There is thus a correlation between the purpose for the search and the document being read or edited. Modifying the query to reflect this purpose can thus improve the relevance of the search results. There have been several attempts to extract keywords from the text surrounding the search term and add them to the initial query. However, identifying appropriate additional keywords is difficult; moreover, existing methods rely on precomputed domain knowledge. We have developed Context Matcher: a query modification method that uses the text surrounding the search term in the initial search results as well as the text surrounding the term in the document being read or edited, the "source document". It uses the text surrounding the search term in the initial results to weight candidate keywords in the source document for use in query modification. Experiments showed that our method often found documents more related to the source document than baseline methods that use context either in only the source document or search results.
  • 大島 裕明, 小山 聡, 田中 克己
    日本データベース学会Letters 4(2) 17-20 September 2005  Peer-reviewed
  • Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka
    Proceedings - International Workshop on Biomedical Data Engineering, BMDE2005 2005 32-257 2005  Peer-reviewed
    This paper proposes a method of extracting personal conceptual structures from documents on a personal computer that contain a great deal of personal information, and applying them to personalize Web searches. Everybody has differing ideologies, concepts, and knowledge and there is a lot of personal information stored on PCs. While it is easy for a person to determine what a PC user thinks and knows, computers cannot. In this work, two types of personal conceptual structures are represented, i.e., a "personal concept tree" and "relationships between terms". The "personal concept tree" indicates personal concept classification. "Relationships between terms" indicates how the user thinks of a term, and how it is created from the "personal concept tree" based on deviations in the appearance of the term. This paper also proposes the personalization of Web searches, i.e., expanding query keywords and re-ranking of search results using personal conceptual structures. © 2005 IEEE.
  • 大島 裕明
    Graduate School of Science and Technology, Kobe University March 2004  Peer-reviewed

MISC

 205

Books and Other Publications

 4

Presentations

 4

Research Projects (Joint Research, Competitive Funding, etc.)

 19

Industrial Property Rights

 3

Academic Activities

 2