Curriculum Vitae

Virach Sornlertlamvanich

  (ソンラートラムワニッチ ウィラット)

Profile Information

Professor, Department of Data Science, Musashino University
Department of Engineering, Thammasat University
Ph.D.(Sep, 1998, Tokyo Institute of Technology)


In 2003, he received the “National Distinguished Researcher Award” in Information Technology and Communication from the National Research Council of Thailand, followed by the “ASEAN Outstanding Engineering Achievement Award” from the ASEAN Federation of Engineering Organizations (AFEO) in 2011. He was also named “The Researcher of the Year 2001” by the Nation Newspaper in 2001. He started his research career in the field of Knowledge Engineering and Artificial Intelligence during his studies at Kyoto University from 1980 to 1986. He began his research in Natural Language Processing by participating in the Multi-lingual Machine Translation project from 1988 to 1995, and received his Ph.D. from Tokyo Institute of Technology in 1998. Some of his long-running research contributions include initiating the development of the Thai POS-tagged corpus (ORCHID, 1997), the first corpus-based Thai-English dictionary (LEXiTRON, 1997), and the first English-Thai online machine translation web service (ParSit, 2000), based on the interlingual approach. His recent efforts focus on the research and development of technologies for digital content creation and understanding. He proposed the Digitized Thailand project in 2009 to establish an intelligent service platform as a fundamental framework for digital content sharing and application mashup. Some of its achievements have already been publicized in the digitization of culture and local wisdom and in applications of digital content services for tourism, product design, and education. His research interests include Natural Language Processing, Human Language Technology, Information Retrieval, Data Mining, Artificial Intelligence, Machine Learning, Deep Learning, Social Media Analytics, and related fields.




  • Htet Htet Htun, Virach Sornlertlamvanich
    2017 8th International Conference on Information and Communication Technology for Embedded Systems, IC-ICTES 2017 - Proceedings, Jun 23, 2017  Peer-reviewed
    © 2017 IEEE. For biomedical ontologies, Concept Similarity Measures (CSMs) are important for finding similar treatments between diseases. Primitive concepts are only partially defined in the ontology and so lack sufficient definitions; one way to find the similarity between primitive concepts is to apply textual similarity methods to the concept names. However, existing textual similarity methods cannot give correct similarity degrees for all concept pairs. In this paper, we propose a new primitive concept name similarity measure based on natural language processing, specifically noun phrase construction analysis, to obtain better concept similarity results. We conduct experiments on the standard clinical ontology SNOMED CT and compare our proposed method and two existing approaches against human expert results, in order to show that our proposed similarity measure gives the correct and nearest similarity degree between primitive concepts.
  • Waranrach Viriyavit, Virach Sornlertlamvanich, Waree Kongprawechnon, Panita Pongpaibool, Tsuyoshi Isshiki
    2017 8th International Conference on Information and Communication Technology for Embedded Systems, IC-ICTES 2017 - Proceedings, Jun 23, 2017  Peer-reviewed
    © 2017 IEEE. This paper describes bed posture classification using a Neural Network model for elderly care. Using data collected from a sensor panel (composed of piezoelectric sensors and pressure sensors) placed under a mattress in the thoracic area, we apply a Neural Network for posture classification. A Bayesian approach is used for estimating the likelihood of consecutive postures. The sensing data are normalized into a range of 0 to 1 by the unity-based normalization (or feature scaling) method to eliminate the bias between the different types of sensors. Also, accumulating the signal data in one-second time slots (120-input set) can improve the coverage of the trained model. The results from the Neural Network and the Bayesian network estimation are combined by the weighted arithmetic mean. Our proposed technique is applied to elderly patient data with five different postures, i.e., out of bed, sitting, lying down, lying left, and lying right. This resulted in 91.50% accuracy when the coefficients for the Neural Network and the Bayesian probability are 0.3 and 0.7, respectively.
  • Akanit Kwangkaew, Virach Sornlertlamvanich, Itsuo Kumazawa, Siriya Skolthanarat
    Proceedings - 11th 2016 International Conference on Knowledge, Information and Creativity Support Systems, KICSS 2016, 1-5, Jun 16, 2017  Peer-reviewed
    © 2016 IEEE. The economic trend (eco-trend) is the most important factor in developing a country. Unfortunately, various inevitable and unpredictable factors affect the economic trend during a natural disaster period. The trend then fluctuates, making it more difficult to forecast. In this research, eco-trend prediction is represented by stock price prediction, using datasets from some industrial sectors that mainly use electricity for production. We found that stock prices can be predicted more precisely by an Artificial Neural Network when electric energy consumption is added as an input feature. However, the prediction is precise in the normal period only. Therefore, to analyse the prediction in a natural disaster period (the 2011 flood in Thailand), a crosschecking method is considered. For the performance comparison of the experimental results, least mean squares (LMS) and root mean squared error (RMSE) are used. Finally, the results of this research not only show how power consumption makes stock prediction more precise, but also provide a time delay that indicates changes in the economic trend and can explain the behaviour of the industrial segment in the natural disaster period.
  • Htet Htet Htun, Virach Sornlertlamvanich, Boontawee Suntisrivaraporn
    Proceedings - 11th 2016 International Conference on Knowledge, Information and Creativity Support Systems, KICSS 2016, 1-6, Jun 16, 2017  Peer-reviewed
    © 2016 IEEE. Recently, a non-standard reasoning service of measuring similarity between two concepts has been proposed for Description Logic (DL) ontologies, in addition to the classical reasoning services of testing subsumption and logical equivalence. One previous work suggests that similarity not only depends on objective aspects (i.e. the concept descriptions of the two concepts), but is also influenced by subjective factors (i.e. judgments of the viewing agent). In this paper, we propose to employ various text similarity measures to compare the textual annotations of primitive concepts as well as primitive roles, as a way of estimating human experts' interpretations. A collection of primitive similarity degrees obtained in this way is regarded as an automatically generated set of possible doctors' judgments (preference profile) for primitive similarity measures. We perform extensive experiments on the renowned clinical ontology SNOMED CT. After generating the primitive concept similarity measures with various similarity methods, this paper presents interesting findings from the experiments and discusses the benefits and usability of our approach.
  • Virach Sornlertlamvanich, Phat Jotikabukkana, Sarawoot Kongyoung, Yukari Shirota, Takako Hashimoto
    Jun, 2017  Peer-reviewed  Corresponding author
  • Virach Sornlertlamvanich, La-or Kovavisaruch, Taweesak Sanpechuda, Krisada Chinda, Pobsit Kamolvej
    Journal of Applied Sciences, 5(3) 615-622, Jun, 2017  Peer-reviewed  Last author
  • Htet Htet Htun, Virach Sornlertlamvanich
    Communications in Computer and Information Science, 780 76-90, 2017  Peer-reviewed
    © 2017, Springer Nature Singapore Pte Ltd. The semantic similarity measure between biomedical terms or concepts is a crucial task in biomedical information extraction and knowledge discovery. Most of the existing similarity approaches measure the similarity degree based on the path length between concept nodes as well as the depth of the ontology tree or hierarchy. These measures do not work well in the case of “primitive concepts”, which are partially defined and have only a few relations in the ontology structure. Namely, they cannot give the desired similarity results against human expert judgment on the similarity among primitive concepts. In this paper, two existing ontology-based measures are introduced and analyzed in order to determine their limitations with respect to the considered knowledge base. After that, a new similarity measure based on concept name analysis is proposed to address the weakness of the existing similarity measures for primitive concepts. Using SNOMED CT as the input ontology, the accuracy of our proposal is evaluated and compared against other approaches using human expert results based on different types of ontology concepts. Based on the correlation between the results of the evaluated measures and the human expert ratings, this paper analyzes the strengths and weaknesses of each similarity measure for all ontology concepts.
  • Tran Sy Bang, Choochart Haruechaiyasak, Virach Sornlertlamvanich
    Frontiers in Artificial Intelligence and Applications, 292 135-144, 2017  Peer-reviewed
    © 2017 The authors and IOS Press. This paper presents improved techniques for classifying user feedback on hotel service quality. The data were mainly collected from online feedback sources by a PHP program. The training set was manually tagged as NEGATIVE, POSITIVE, or NEUTRAL. In total, 2969 Vietnamese language terms were successfully collected. In the first part, common machine learning techniques such as the K-Nearest Neighbor algorithm (KNN), Decision Tree, Naive Bayes (NB), and Support Vector Machines (SVM) were applied for classification. In the second part, we enhanced the efficiency of the text categorization by applying a feature selection technique, χ2 (CHI). At the end of the paper, we conclude that the overall performance of general machine learning techniques was significantly improved by applying feature selection.
  • Wahjoe Tjatur Sesulihatien, Yasushi Kiyoki, Shiori Sasaki, Azis Safie, Subagyo Yotopranoto, Virach Sornlertlamvanich, Aran Hansuebai, Petchporn Chawakitchareon
    Frontiers in Artificial Intelligence and Applications, 292 94-105, 2017  Peer-reviewed
    © 2017 The authors and IOS Press. Dengue fever is a communicable disease that has affected more than 120 countries over a 50-year period. It therefore makes sense to say that collaboration among countries, especially neighboring countries, is one important key to combating dengue. Currently, except for serological collaboration, collaboration on dengue is sporadic and temporary. This paper addresses an initiative to build a collaborative vector-control strategy among Surabaya (Indonesia), Kuala Lumpur (Malaysia), and Bangkok (Thailand). Deriving the global policy from the World Health Organization (WHO), we build a system that (1) extracts global features from local features, (2) selects the significant features by weighting each feature to determine its rank of importance, and (3) matches the pattern of the data to a suitable strategy by measuring similarity. We built the system from real data of Surabaya, Kuala Lumpur, and Bangkok in 2012. We verified the reliability of the system by comparing the data with the actual action in January 2012. The results show that the system is feasible to implement; however, more preparation is still needed before implementation.
  • La Or Kovavisaruch, Taweesak Sanpechuda, Krisada Chinda, Virach Sornlertlamvanich, Pobsit Kamonvej
    2016 13th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, ECTI-CON 2016, Sep 6, 2016  Peer-reviewed
    © 2016 IEEE. Visitor behavior analysis has always been a topic of interest for museum operators; however, common key issues arising in such endeavors are the management and evaluation of collected data. In this paper, we propose a model for analyzing visitor behavior in terms of appropriate content presentation. This model will be used as a guide for exhibition arrangement and the determination of content provided at each point of interest. The proposed model analyzes the audio content duration versus the time visitors spend at each point of interest. We provide criteria to evaluate the suitable audio content duration for each point of interest. These methods were used to evaluate and generate recommendations for the Chao Sam Praya National Museum, located in the heart of Ayutthaya province. Evaluation results reveal a need for improvement in content length. In terms of audio content duration, only 9 out of 40 points of interest have suitable audio content duration; the others are either too short or too long and require adjustment.
  • Virach Sornlertlamvanich, Akanit Kwangkaew, Yukari Shirota, Takako Hashimoto
    Sep, 2016  Peer-reviewed  Corresponding author
  • Virach Sornlertlamvanich, Phat Jotikabukkana, Yukari Shirota, Takako Hashimoto
    Sep, 2016  Peer-reviewed  Corresponding author
  • Phat Jotikabukkana, Virach Sornlertlamvanich, Okumura Manabu, Choochart Haruechaiyasak
    Journal of ICT Research and Applications, 10(2) 177-196, 2016  Peer-reviewed
    © 2016 Published by ITB Journal Publisher. Social media are a powerful communication tool in our era of digital information. The large amount of user-generated data is a useful novel source of data, even though it is not easy to extract the treasures from this vast and noisy trove. Since classification is an important part of text mining, many techniques have been proposed to classify this kind of information. We developed an effective technique of social media text classification by semi-supervised learning utilizing an online news source consisting of well-formed text. The computer first automatically extracts news categories, well-categorized by publishers, as classes for topic classification. A bag of words taken from news articles provides the initial keywords related to their category in the form of word vectors. The principal task is to retrieve a set of new productive keywords. Term Frequency-Inverse Document Frequency weighting (TF-IDF) and Word Article Matrix (WAM) are used as main methods. A modification of WAM is recomputed until it becomes the most effective model for social media text classification. The key success factor was enhancing our model with effective keywords from social media. A promising result of 99.50% accuracy was achieved, with more than 98.5% of Precision, Recall, and F-measure after updating the model three times.
  • Virach Sornlertlamvanich, Canasai Kruengkrai
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9442 188-199, 2016  Peer-reviewed
    © Springer International Publishing Switzerland 2016. We explore named entity (NE) recognition and semantic relation extraction techniques on the Thai cultural database. Within the limited domain and well-structured database, our proposed method can achieve acceptably high accuracy in generating tuples of semantic relations that express the essence of a record in terms of an infobox and a knowledge map. In this paper, we propose a semantic relation extraction approach based on simple relation templates that determine relation types and their arguments. We attempt to reduce semantic drift of the arguments by using named entity models as semantic constraints. Experimental results indicate that our approach is very promising. We successfully apply our approach to a cultural database and discover more than 18,000 relation instances with expected high accuracy.
  • Yasushi Kiyoki, Xing Chen, Anneli Heimbürger, Petchporn Chawakitchareon, Virach Sornlertlamvanich
    Frontiers in Artificial Intelligence and Applications, 280 281-298, 2016  Peer-reviewed
    © 2016 The authors and IOS Press. All rights reserved. Humankind faces a most crucial mission; we must endeavour, on a global scale, to restore and improve our natural and social environments. In this environmental study, we will use context-dependent differential computation to analyse changes in various factors (temperatures, colours, level of CO2, habitats, sea levels, coral areas, etc.). In this paper, we will discuss a global environmental computing methodology for analysing the diversity of nature and animals, using a large amount of information on global environments.
  • Panuwat Assawinjaipetch, Kiyoaki Shirai, Virach Sornlertlamvanich, Sanparith Marukata
    Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT@COLING), 36-43, 2016  Peer-reviewed
  • Phat Jotikabukkana, Virach Sornlertlamvanich, Okumura Manabu, Choochart Haruechaiyasak
    ICAICTA 2015 - 2015 International Conference on Advanced Informatics: Concepts, Theory and Applications, Nov 20, 2015  Peer-reviewed
    © 2015 IEEE. Social media text can convey significant information about our real social situation. It can show the direction of real-time social movement. However, it has its own characteristics, such as short text, informal language, much unstructured information, and argot. This kind of text is hard to classify and difficult to analyze to extract useful information. In this paper, we propose an effective technique to classify social media text by utilizing initial keywords from well-formed sources of data, such as online news. The term frequency-inverse document frequency weighting technique (TF-IDF) and Word Article Matrix (WAM) are used as the main methods in this research. We use the keywords extracted from the well-formed source as the main factor in experiments on Twitter messages. We found that a set of social media keywords can represent the essence of social events and can be used to classify the text effectively.
  • Virach Sornlertlamvanich
    Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014, 2-4, 2014  Peer-reviewed
    Copyright 2014 by Virach Sornlertlamvanich. Text from social media is key information for understanding social movements. However, social media text is typically short and concise, with many words omitted. Our task is to identify the proper keyword representing the message content that we are accounting for. Instead of training the model for keyword extraction directly from the Twitter messages, we propose a new method to fine-tune the model trained from some known documents containing richer context information. We conducted the experiment on Twitter messages and expressed the results as a word cloud timeline. The results are promising.
  • Virach Sornlertlamvanich, Kobkrit Viriyayudhakorn
    Frontiers in Artificial Intelligence and Applications, 272 464-468, 2014  Peer-reviewed
    © 2014 The authors and IOS Press. All rights reserved. Text from social media is key information for understanding social movements. However, social media text is typically short and concise, with many words omitted. Our task is to identify the proper keyword representing the message content that we are accounting for. Instead of training the model for keyword extraction directly from the Twitter messages, we propose a new method to fine-tune the model trained from some known documents containing richer context information. We conducted the experiment on Twitter messages and expressed the results as a word cloud timeline. The results are promising.
  • Takenobu Tokunaga, Sophia Y.M. Lee, Virach Sornlertlamvanich, Kiyoaki Shirai, Shu-Kai Hsieh, Chu-Ren Huang
    LMF – Lexical Markup Framework, Apr, 2013  Peer-reviewed
  • Verayuth Lertnattee, Sinthop Chomya, Virach Sornlertlamvanich
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8271 LNAI 119-130, 2013  Peer-reviewed
    The popularity of herbal medicines has greatly increased worldwide over recent years. A herbal formula is a form of traditional medicine in which herbs are combined to heal patients faster and more efficiently. Herbal formulae can be assigned to one or more therapeutic categories. The categories of a formula are usually based on the decision of a group of experts. To support experts in classifying a formula, a normalized score centroid-based classifier is proposed for multi-label herbal formulae classification. A centroid-based classifier with a more advanced term weighting scheme is used, and normalized scores are calculated. A maximum number of categories and a cutoff point are set to adjust the decision for multi-label herbal formulae. The experiment is done using a mixed data set of herbal formulae collected from the National List of Essential Medicine and the list of common household remedies for traditional medicine. Moreover, a set of well-known commercial products is used for evaluating the effectiveness of the proposed method. The results show that the normalized score centroid-based classifier is an efficient method for classifying multi-label herbal formulae. Its performance depends on the set values of the maximum number of categories and the cutoff point. © 2013 Springer-Verlag.
  • Canasai Kruengkrai, Virach Sornlertlamvanich, Watchira Buranasing, Thatsanee Charoenporn
    Proceedings of the 3rd Workshop on South and Southeast Asian Natural Language Processing (WSSANLP@COLING), 15-24, 2012  Peer-reviewed
  • L. Kovavisaruch, V. Sornlertlamvanich, P. Kamolvej, N. Iamrahong, G. Prommoon
    Annual SRII Global Conference, SRII, 623-627, 2012  Peer-reviewed
    This paper is a conceptual presentation of a standardized museum database capable of exchanging information across various museum exhibitions in different countries. The design is a result of a study conducted on the database for cultural and artistic museum collections from multiple agencies. The study proposes a standardized database containing a collection of common, required information among museums, as well as additional related information. Implementation of the conceptual database design is expected to ensure important information is not lost during data exchange, allow remote database searches from one museum to another, and enable the exchange of cultural and artistic information internationally between countries that employ the same standard. © 2012 IEEE.
  • Canasai Kruengkrai, Thatsanee Charoenporn, Virach Sornlertlamvanich
    Proceedings of the 3rd Named Entities Workshop (NEWS@IJCNLP), 28-31, 2011  Peer-reviewed
  • Toru Ishida, Yohei Murakami, Eri Tsunokawa, Yoko Kubota, Virach Sornlertlamvanich
    Cognitive Technologies, (9783642211775) 279-298, 2011  Peer-reviewed
    © 2011, Springer-Verlag Berlin Heidelberg. The concept of collective intelligence is contributing significantly to knowledge creation on the Web. While current knowledge creation activities tend to be founded on the approach of assembling content such as texts, images and videos, we propose here the service-oriented approach. We use the term service grid to refer to a framework of collective intelligence based on Web services. This chapter provides an institutional design mainly for non-profit service grids that are open to the public. In particular, we deepen the discussion of 1) intellectual property rights, 2) application systems, and 3) federated operations from the perspective of the following stakeholders: service providers, service users and service grid operators respectively. The Language Grid has been operating, based on the proposed institutional framework, since December 2007.
  • Virach Sornlertlamvanich, Thatsanee Charoenporn
    Proceedings - 2011 2nd International Conference on Culture and Computing, Culture and Computing 2011, 92-97, 2011  Peer-reviewed
    Since 2006, the Ministry of Culture has made an effort to create a cultural portal. Each provincial cultural office was assigned to survey and report any cultural practice according to a designed template. The technological capability of each office differed greatly, and most of them had to rely on local system developers and service providers. As a result, the collected contents cannot fulfill the requirements of the documentation standard and the service level of media presentation. In contrast, cultural knowledge is redefined to an agreed structure for interoperability and computability. Moreover, the contents must reflect up-to-date daily practice and be thoroughly covered by any individuals. We reuse the existing contents by correcting typos and format distortion. The social networking system is prepared to allow individual participation to co-create the contents under authorized supervision. As a result, more than 80 percent of the original contents have been recovered. Each provincial office is able to strategically plan and co-create its own contents to establish the total cultural knowledge. © 2011 IEEE.
  • Sineenat Tienkouw, Nutvadee Wongtosrad, Paramet Tanwanont, Rattapoom Niraswan, Chaimongkon Khlayprapha, Pisan Taesuwan, Thanate Muangthong, Sapa Chanyachatchawan, Rattapoom Tuchinda, Thatsanee Charoenporn, Virach Sornlertlamvanich
    PICMET: Portland International Center for Management of Engineering and Technology, Proceedings, 2011  Peer-reviewed
    In this paper, we present a strategic marketing plan combining the Technology-Based Marketing (TBM) approach and the Competitive Strategy approach for an intelligent travel planning system named Pi-Pe, which uses artificial intelligence to help users easily create their one-day trip schedule in three simple steps: (1) select the date and time, (2) state the starting location, and (3) pick the destinations. The purpose of the paper is to develop a technology-based marketing plan for Pi-Pe in the pursuit of competitive advantage to drive up product value and create sustained commercial advantage. © 2011 IEEE.
  • Verayuth Lertnattee, Sinthop Chomya, Virach Sornlertlamvanich
    Studies in Computational Intelligence, 283 99-110, 2010  Peer-reviewed
    Creating a system for collecting herbal information on the Internet is not a trivial task. With conventional techniques, it is hard to find a way in which experts can build a self-sustainable community for exchanging their knowledge. In this work, the Knowledge Unifying Initiator for Herbal Information (KUIHerb) is used as a platform for building a web community for collecting intercultural herbal knowledge with the concept of collective intelligence. With this system, images of herbs, herbal vocabulary, and medicinal usages can be collected. Due to the diversity of herbs, their geographic distribution, and their applications, one problem is the reliability of the herbal information collected by this system. In this paper, three mechanisms are utilized for improving the reliability of the system: (1) information for an herb is divided into several topics, and contributors can select the topics in which they have expertise; (2) a voting system is applied, and the standard source members (SSMs) are able to contribute their knowledge on text information; (3) a voting system, keywords, and comments are implemented for controlling the quality and reliability of images of an herb. With these mechanisms, herbal information on KUIHerb is more accurate and reliable. © 2010 Springer-Verlag Berlin Heidelberg.
  • Verayuth Lertnattee, Sinthop Chomya, Virach Sornlertlamvanich
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6483 LNCS 151-160, 2010  Peer-reviewed
    With the fast-growing use of herbal medicine, pharmacists now need a basic knowledge of the topic for their professional practice. To serve this need, a set of courses on herbal medicine is arranged for pharmacy students. Due to limited time, it is hard for a student to become familiar with medicinal herbs. In this paper, we introduce KUIHerbRx, a Web-based supplementary learning tool on herbal medicine. KUIHerbRx is a modified version of the Knowledge Unifying Initiator for Herbal Information (KUIHerb), which is used as a platform for building a Web community for collecting intercultural herbal knowledge with the concept of collective intelligence. Due to the diversity of herbs, their geographic distribution, and their applications, social network collaboration is important for enhanced learning. Three types of information creation, i.e., initial, voting, and non-voting information, are designed. KUIHerbRx provides supplementary learning to improve knowledge and skill in herbal medicine with a scientific method. Information on medicinal herbs in several regions can be distributed and exchanged among groups of students. © 2010 Springer-Verlag Berlin Heidelberg.
  • Virach Sornlertlamvanich, Thatsanee Charoenporn, Hitoshi Isahara
    Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010, 514-517, 2010  Peer-reviewed
    This paper presents the language resource management system for the development and dissemination of Asian WordNet (AWN) and its web service application. We develop the platform to establish a network for cross-language WordNet development. Each node of the network is designed for maintaining the WordNet for a language. Via the table that maps between each language WordNet and the Princeton WordNet (PWN), the Asian WordNet is realized to visualize the cross-language WordNet between the Asian languages. We propose a language resource management system, called WordNet Management System (WNMS), as a distributed management system that allows the server to perform cross-language WordNet retrieval, including the fundamental web service applications for editing, visualizing, and language processing. The WNMS is implemented on a web service protocol; therefore, each node can be independently maintained, and the service of each language WordNet can be called directly through the web service API. In the case of cross-language implementation, the synset ID (or synset offset) defined by PWN is used to determine the linkage between the languages.
  • Virach Sornlertlamvanich, Thatsanee Charoenporn, Kergrit Robkop, Chumpol Mokarat, Hitoshi Isahara
    JAPIO 2009 Year Book, 2009 276-285 [includes Japanese abstract], Sep, 2009  Peer-reviewed
  • Takenobu Tokunaga, Dain Kaplan, Nicoletta Calzolari, Monica Monachini, Claudia Soria, Virach Sornlertlamvanich, Thatsanee Charoenporn, Xia YingJu, Chu-Ren Huang, Shu-Kai Hsieh, Kiyoaki Shirai
    pp. 145-152, Aug, 2009  Peer-reviewed
  • Verayuth Lertnattee, Kergrit Robkob, Virach Sornlertlamvanich
    Proceedings of the 2009 ACM SIGCHI International Workshop on Intercultural Collaboration, IWIC'09, 13-21, 2009  Peer-reviewed
    Traditional knowledge about herbal medicine can be contributed from several cultures. With conventional techniques, it is hard to find a way in which experts can build a self-sustainable community for exchanging their knowledge. To alleviate the problem of gathering intellectual herbal information based on different cultures, the Knowledge Unifying Initiator for Herbal Information (KUIHerb) is used as a platform for building a web community for collecting the intercultural herbal knowledge. KUIHerb provides a capability for the expression of information about images, local names, parts used, indications, methods for preparation, precautions including toxicity and additional information. In cases where multiple opinions are provided, the popular vote will select the most preferable term, used in the community. Herb identification, herbal vocabulary, a list of experts in herbal medicine and multicultural knowledge can be collected from this system. Copyright 2009 ACM.
  • Verayuth Lertnattee, Sinthop Chomya, Thanaruk Theeramunkong, Virach Sornlertlamvanich
    Proceedings - IEEE 9th International Conference on Computer and Information Technology, CIT 2009, 2 178-183, 2009  Peer-reviewed
    Knowledge about herbal medicine can be contributed by experts from several cultures. With conventional techniques, it is hard to find a way in which the experts can build a self-sustainable community for exchanging their information. In this paper, the Knowledge Unifying Initiator for Herbal Information (KUIHerb) is used as a platform for building a web community that collects intercultural herbal knowledge under the concept of collective intelligence. Herb identification, herbal vocabulary and medicinal usages can be collected from this system. KUIHerb provides herbal vocabulary which is dynamically and confidentially applied to improve searching on the Thai herbal search engine. Three strategies are utilized: (1) a set of technical terms in Thai is provided, which can be added into the dictionary; these terms are utilized by Thai word segmentation to improve the indexing process; (2) a set of synonyms of these technical terms in both Thai and English is built to help users who face many keywords for the same term; and (3) a set of keywords from herbal usages can be combined with the name keyword. The results show that information collected from KUIHerb is useful for searching. © 2009 IEEE.
  • Nuansri Denwattana, Tawatchai Iempairote, Athita Chokananrattana, Virach Sornlertlamvanich
    WMSCI 2009 - The 13th World Multi-Conference on Systemics, Cybernetics and Informatics, Jointly with the 15th International Conference on Information Systems Analysis and Synthesis, ISAS 2009 - Proc., 4 334-339, 2009  Peer-reviewed
    Social software is providing new opportunities for individual expression, community creation, collaboration and sharing. This paper proposes a social software called KuiPOLL to develop online social knowledge, focused on poll-based opinion and questionnaires. KuiPOLL is a derived work of KUI (Knowledge Unifying Initiator) and it is a collaborative tool for opinion collection and event prediction. The main features of KuiPOLL range from posting topics of interest, poll-based opinions and questionnaires for members and the public, to KuiPOLL news. Since KuiPOLL is an action research project, feedback from the community is important. We are planning to implement KuiPOLL on mobile devices.
  • Virach Sornlertlamvanich, Thatsanee Charoenporn, Suphanut Thayaboon, Chumpol Mokarat, Hitoshi Isahara
    Third International Joint Conference on Natural Language Processing, IJCNLP 2008, Hyderabad, India, January 7-12, 2008, 105-106, Jan, 2008  Peer-reviewed
  • Virach Sornlertlamvanich, Thatsanee Charoenporn, Chumpol Mokarat, Hitoshi Isahara, Hammam Riza, Purev Jaimai
    Third International Joint Conference on Natural Language Processing, IJCNLP 2008, Hyderabad, India, January 7-12, 2008, 673-678, Jan, 2008  Peer-reviewed
    We explore an automatic WordNet synset assignment to the bi-lingual dictionaries.
  • Shisanu Tongchim, Virach Sornlertlamvanich, Hitoshi Isahara
    Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference, 1 123-126, 2008  Peer-reviewed
    This paper studies the role of base-NP information in dependency parsing for Thai. The baseline performance reveals that the base-NP chunking task for Thai is much more difficult than those of some languages (like English). The results show that the parsing performance can be improved (from 60.30% to 63.74%) with the use of base-NP chunk information, although the best chunker is still far from perfect (F_{β=1} = 83.06%). © 2008. Licensed under the Creative Commons.
  • Tokunaga Takenobu, Dain Kaplan, Chu Ren Huang, Shu Kai Hsieh, Nicoletta Calzolari, Monica Monachini, Claudia Soria, Shirai Kiyoaki, Virach Sornlertlamvanich, Thatsanee Charoenporn, Xia Yingju
    Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008, 1658-1663, 2008  Peer-reviewed
    Corpus-based approaches and statistical approaches have been the main stream of natural language processing research for the past two decades. Language resources play a key role in such approaches, but there is an insufficient amount of language resources in many Asian languages. In this situation, standardisation of language resources would be of great help in developing resources in new languages. This paper presents the latest development efforts of our project which aims at creating a common standard for Asian language resources that is compatible with an international standard. In particular, the paper focuses on i) lexical specification and data categories relevant for building multilingual lexical resources for Asian languages; ii) a core upper-layer ontology needed for ensuring multilingual interoperability and iii) the evaluation platform used to test the entire architectural framework.
  • Shisanu Tongchim, Randolf Altmeyer, Virach Sornlertlamvanich, Hitoshi Isahara
    Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008, 136-139, 2008  Peer-reviewed
    This paper presents some preliminary results of our dependency parser for Thai. It is part of an ongoing project in developing a syntactically annotated Thai corpus. The parser has been trained and tested by using the complete part of the corpus. The parser achieves 83.64% as the root accuracy, 78.54% as the dependency accuracy and 53.90% as the complete sentence accuracy. The trained parser will be used as a preprocessing step in our corpus annotation workflow in order to accelerate the corpus development.
  • Virach Sornlertlamvanich, Thatsanee Charoenporn, Shisanu Tongchim, Canasai Kruengkrai, Hitoshi Isahara
    IEICE Transactions on Information and Systems, E90-D(10) 1565-1573, Oct, 2007  Peer-reviewed
    Several approaches have been studied to cope with the exceptional features of non-segmented languages. When there is no explicit information about the boundary of a word, segmenting an input text is a formidable task in language processing. Not only the contemporary word list, but also usages of the words have to be maintained to cover the use in the current texts. The accuracy and efficiency in higher processing do heavily rely on this word boundary identification task. In this paper, we introduce some statistical based approaches to tackle the problem due to the ambiguity in word segmentation. The word boundary identification problem is then defined as a part of others for performing the unified language processing in total. To exhibit the ability in conducting the unified language processing, we selectively study the tasks of language identification, word extraction, and dictionary-less search engine. Copyright © 2007 The Institute of Electronics, Information and Communication Engineers.
  • Shisanu Tongchim, Virach Sornlertlamvanich, Hitoshi Isahara
    IEICE Transactions on Information and Systems, E90-D(10) 1557-1564, Oct, 2007  Peer-reviewed
    This study initiates a systematic evaluation of web search engine performance using queries written in Thai. Statistical testing indicates that there are some significant differences in the performance of search engines. In addition to comparing the search performance, an analysis of the returned results is carried out. The analysis shows that the majority of returned results are unique to a particular search engine and each system provides quite different results. This encourages the use of metasearch techniques to combine the search results in order to improve the performance and reliability in finding relevant documents. We examine several metasearch models based on the Borda count and Condorcet voting schemes. We also propose the use of Evolutionary Programming (EP) to optimize weight vectors used by the voting algorithms. The results show that the use of metasearch approaches produces superior performance compared to any single search engine on Thai queries. Copyright © 2007 The Institute of Electronics, Information and Communication Engineers.
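The Borda count fusion named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `borda_fuse`, the engine names and the document IDs are hypothetical, and the optional per-engine weights merely stand in for the weight vector that the paper tunes with EP (no tuning is done here). Documents that an engine did not return share that engine's leftover points equally, which is one common convention for fusing partial rankings.

```python
from collections import defaultdict


def borda_fuse(rankings, weights=None):
    """Fuse ranked result lists with a weighted Borda count.

    rankings: dict mapping engine name -> list of documents, best first
              (each list is assumed to contain no duplicates).
    weights:  optional dict mapping engine name -> weight (default 1.0).
    Returns the fused list of documents, best first.
    """
    weights = weights or {}
    # Candidate pool: every document returned by at least one engine.
    pool = {doc for ranked in rankings.values() for doc in ranked}
    n = len(pool)
    scores = defaultdict(float)
    for engine, ranked in rankings.items():
        w = weights.get(engine, 1.0)
        # A document at 0-based position i earns n - i points.
        for i, doc in enumerate(ranked):
            scores[doc] += w * (n - i)
        # Documents this engine did not return split the leftover points.
        missing = pool - set(ranked)
        if missing:
            leftover = sum(range(1, n - len(ranked) + 1))
            for doc in missing:
                scores[doc] += w * leftover / len(missing)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(pool, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    results = {
        "engine_a": ["u1", "u2", "u3"],
        "engine_b": ["u2", "u4"],
        "engine_c": ["u2", "u1"],
    }
    print(borda_fuse(results))
    print(borda_fuse(results, weights={"engine_a": 3.0}))
```

Raising the weight of one engine shifts the fused order toward that engine's ranking, which is the quantity an optimizer such as EP would adjust.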
  • Thatsanee Charoenporn, Sareewan Thoongsup, Virach Sornlertlamvanich, Hitoshi Isahara
    SEALS XVII Conference, Aug, 2007  Peer-reviewed
    This is a part of the Development of Language Resource Standard for Semantic Web Applications Project.
  • Thatsanee Charoenporn, Canasai Kruengkrai, Thanaruk Theeramunkong, Virach Sornlertlamvanich
    IEICE Transactions on Information and Systems, E90-D(4) 775-782, Mar, 2007  Peer-reviewed
    Manually collecting contexts of a target word and grouping them based on their meanings yields a set of word senses, but the task is quite tedious. Towards automated lexicography, this paper proposes a word-sense discrimination method based on two modern techniques: the EM algorithm and principal component analysis (PCA). The spherical Gaussian EM algorithm enhanced with PCA for robust initialization is proposed to cluster word senses of a target word automatically. Three variants of the algorithm, namely PCA, sGEM, and PCA-sGEM, are investigated using a gold standard dataset of two polysemous words. The clustering result is evaluated using the measures of purity and entropy as well as a more recent measure called normalized mutual information (NMI). The experimental result indicates that the proposed algorithms achieve promising performance in discriminating word senses and that PCA-sGEM outperforms the other two methods to some extent. Copyright © 2007 The Institute of Electronics, Information and Communication Engineers.
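Two of the evaluation measures named in the abstract above, purity and NMI, can be computed as in the sketch below. It is an illustration only, not the paper's evaluation code: the function names are hypothetical, and the NMI normalization used here (mutual information divided by the geometric mean of the two entropies) is an assumption, since several normalizations are in use.

```python
import math
from collections import Counter


def purity(clusters, labels):
    """Fraction of items that carry the majority gold label of their cluster."""
    by_cluster = {}
    for c, g in zip(clusters, labels):
        by_cluster.setdefault(c, Counter())[g] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(labels)


def assignment_entropy(assignments):
    """Shannon entropy (natural log) of a list of cluster or label assignments."""
    n = len(assignments)
    return -sum((k / n) * math.log(k / n) for k in Counter(assignments).values())


def nmi(clusters, labels):
    """Mutual information normalized by sqrt(H(clusters) * H(labels))."""
    n = len(labels)
    joint = Counter(zip(clusters, labels))
    size_c = Counter(clusters)
    size_g = Counter(labels)
    mi = sum(
        (n_cg / n) * math.log(n_cg * n / (size_c[c] * size_g[g]))
        for (c, g), n_cg in joint.items()
    )
    denom = math.sqrt(assignment_entropy(clusters) * assignment_entropy(labels))
    # If either side has a single group, entropy is zero and NMI is defined as 0 here.
    return mi / denom if denom > 0 else 0.0


if __name__ == "__main__":
    predicted = [0, 0, 0, 1, 1, 1]
    gold = ["a", "a", "b", "b", "b", "b"]
    print(purity(predicted, gold), nmi(predicted, gold))
```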
  • Thatsanee Charoenporn, Virach Sornlertlamvanich, Chumpol Mokarat, Hitoshi Isahara, Hammam Riza, Purev Jaimai
    GWC 2008: 4th Global WordNet Conference, Proceedings, 101-110, 2007  Peer-reviewed
    This paper describes an automatic WordNet synset assignment to the existing bi-lingual dictionaries of languages having limited lexicon information. Generally, a term in a bi-lingual dictionary is provided with very limited information such as part-of-speech, a set of synonyms, and a set of English equivalents. This type of dictionary is comparatively reliable and can be found in an electronic form from various publishers. In this paper, we propose an algorithm for applying a set of criteria to assign a synset with an appropriate degree of confidence to the existing bi-lingual dictionary. We show the efficiency in nominating the synset candidate by using the most common lexical information. The algorithm is evaluated against the implementation of Thai-English, Indonesian-English, and Mongolian-English bi-lingual dictionaries. The experiment also shows the effectiveness of using the same type of dictionary from different sources. © University of Szeged, Department of Informatics, 2007. All rights are reserved.
  • Shisanu Tongchim, Virach Sornlertlamvanich, Hitoshi Isahara
    Proceedings - 21st International Conference on Advanced Information Networking and Applications Workshops/Symposia, AINAW'07, 2 283-288, 2007  Peer-reviewed
    This paper provides the results of public web search engine evaluation based on queries written in Thai. Statistical testing shows that there are some significant differences among engines. Besides comparing the effectiveness of web search engines, the returned results are compared in order to illustrate the relation and overlap among these results. The results reveal that the majority of returned results are quite unique. Since the results among engines differ greatly, this encourages the use of metasearch approaches to combine best search results from different engines. We examine metasearch models based on the Borda count voting scheme. We also propose the use of Evolutionary Programming (EP) to optimize the weight vector used by the Borda count algorithm. The results show that the use of metasearch approaches produces superior performance compared to any single search engine on Thai queries. © 2007 IEEE.
  • Renu Gupta, Virach Sornlertlamvanich
    Text Entry Systems, 227-249, 2007  Peer-reviewed
    This chapter describes text entry devices and techniques used in two Asian countries: India in South Asia and Thailand in Southeast Asia. As the script spread through India and Asia, the organizing principle remained intact but each country created its own set of symbols depending on the material used for writing. In north India, where a reed pen was used for writing, the scripts have a distinctive horizontal line, but in south India and Southeast Asia, where a stylus was used to write on palm leaves, the scripts had to be more rounded. Therefore, different languages have mapped different symbols onto this inventory. When computers were introduced into a country, the necessary hardware and software were developed to enable users to enter text. This resulted in numerous solutions that were not compatible across platforms. To ensure consistency, government organizations tried to establish a common standard. In the case of hardware, all the countries chose to adapt the QWERTY keyboard to their scripts, because selecting an entirely new keyboard presented logistical problems. For software, compatibility is now possible because of Unicode. © 2007 Elsevier Inc. All rights reserved.
  • Virach Sornlertlamvanich, Thatsanee Charoenporn, Kergrit Robkop, Hitoshi Isahara
    GWC 2008: 4th Global WordNet Conference, Proceedings, 419-427, 2007  Peer-reviewed
    This paper describes a multi-lingual WordNet construction tool, called KUI (Knowledge Unifying Initiator), which is a knowledge user interface for online collaborative knowledge construction. KUI facilitates online community in developing and discussing multi-lingual WordNet. KUI is a sort of social networking system that unifies the various discussions following the process of thinking model, i.e. initiating the topic of interest, collecting the opinions to the selected topics, localizing the opinions through the translation or customization and finally posting for public hearing to conceptualize the knowledge. The process of thinking is done under the selectional preference simulated by voting mechanism in the case that there are many alternatives. By measuring the history of participation of each member, KUI adaptively manages the reliability of each member's opinion and vote according to the estimated ExpertScore. As a result, the multi-lingual WordNet can be created online and produce a reliable result. © University of Szeged, Department of Informatics, 2007. All rights are reserved.



Books and Other Publications


Research Projects



  • Subject
    Information Retrieval
    Basic and advanced techniques for text-based information systems: efficient text indexing; Boolean and vector space retrieval models; evaluation and interface issues; Web search including crawling, link-based algorithms, and Web metadata; text/Web clustering, classification; text mining.
  • Subject
    Management Information System
    Management Information System explores the use of information systems in today's organizations. This is an exciting field because of the degree of change occurring in technology and how that translates into new opportunities for management and business process. Knowledge about information systems is essential for creating successful, competitive firms, for managing global corporations, for adding business value, and for providing useful products and services to customers. Throughout the course, case studies are provided to illustrate how organizations use IT to manage their businesses. The main topics covered in the course include
    - organizations, management, and the networked enterprise
    - information technology, infrastructure, platforms, and telecommunications
    - systems development and management, managing global systems
    - applications for the digital firm, including e-business and e-commerce.
  • Subject
    Big Data Analytics
    The recent explosion of social media, IoT technology, and the computerization of every aspect of economic activity has resulted in the creation of big data. It is estimated that 80% or more of this data is unstructured. In parallel, computers have kept getting more powerful and storage ever cheaper. Big data analytics is the process of examining large data sets to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. The analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations and other business benefits. This course brings together several key information technologies in Big Data, AI, Machine Learning and Deep Learning for manipulating, storing, and analyzing big data.
  • Subject
    Creative Thinking
    Creative thinking and designing tools. The ways to innovation and problem solving.