Faculty of Science and Technology: Faculty Profiles

世木 寛之

セギ ヒロユキ  (Hiroyuki SEGI)

Basic Information

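Affiliation
Professor, Department of Science and Technology, Faculty of Science and Technology, Seikei University
Degree
Doctor of Engineering (Keio University)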

J-GLOBAL ID
201501025877783683
researchmap Member ID
B000244685

Research Keywords

 2

Papers

 24
  • Ai Mizota, Hiroyuki Segi
    2021 IEEE International Conference on Consumer Electronics (ICCE) January 10, 2021  Peer-reviewed
  • Hiroyuki Segi, Shoei Sato, Kazuo Onoe, Akio Kobayashi, Akio Ando
    Artificial Intelligence: Concepts, Methodologies, Tools, and Applications 3 2021-2037 December 12, 2016  Peer-reviewed
    Tied-mixture HMMs have been proposed as the acoustic model for large-vocabulary continuous speech recognition and have yielded promising results. They share base distributions and provide more flexibility in choosing the degree of tying than state-clustered HMMs. However, it is unclear which of the two acoustic models is superior when trained on the same data. Moreover, the LBG algorithm and the EM algorithm, which are the usual training methods for HMMs, have not been compared. Therefore, in this paper, the recognition performance of the respective HMMs and of the respective training methods is compared under the same conditions. It was found that the number of parameters and the word error rate for both HMMs are equivalent when the number of codebooks is sufficiently large. It was also found that the training method using the LBG algorithm achieves a 90% reduction in training time compared to the training method using the EM algorithm, without degradation of recognition accuracy.
  • Segi Hiroyuki
    INTERNATIONAL JOURNAL OF MULTIMEDIA DATA ENGINEERING & MANAGEMENT 7(2) 53-67 April 2016  Peer-reviewed
  • 世木寛之 (Hiroyuki Segi)
    成蹊大学理工学研究報告 52(2) 5-10 December 2015
    The 'Kabushiki Shikyo' program broadcast on NHK Radio 2 reports the daily closing prices and net changes of about 830 stocks listed on the Tokyo Stock Exchange. Reading out the numerical values without mistakes within the allotted broadcast time can be extremely difficult for the announcers. We have therefore developed an automatic broadcast system for stock-price bulletins, which uses numerical speech synthesis and automatic speech-rate conversion. Our system has been used in experimental digital terrestrial radio broadcasts since October 2006 and on NHK Radio 2 since March 2010. This article describes the generation of texts to build the speech waveform database, the mechanism used to synthesize numerical speech via the database, and the evaluation of the naturalness of the synthesized speech samples.
  • Hiroyuki Segi, Kazuo Onoe, Shoei Sato, Akio Kobayashi, Akio Ando
    Journal of Information Technology Research 7(3) 15-31 July 1, 2014  Peer-reviewed
    Tied-mixture HMMs have been proposed as the acoustic model for large-vocabulary continuous speech recognition and have yielded promising results. They share base distributions and provide more flexibility in choosing the degree of tying than state-clustered HMMs. However, it is unclear which of the two acoustic models is superior when trained on the same data. Moreover, the LBG algorithm and the EM algorithm, which are the usual training methods for HMMs, have not been compared. Therefore, in this paper, the recognition performance of the respective HMMs and of the respective training methods is compared under the same conditions. It was found that the number of parameters and the word error rate for both HMMs are equivalent when the number of codebooks is sufficiently large. It was also found that the training method using the LBG algorithm achieves a 90% reduction in training time compared to the training method using the EM algorithm, without degradation of recognition accuracy.

MISC

 6

Books and Other Publications

 1
  • Supervised by 八木伸行; written by 世木寛之 et al. (Role: contributing author; Scope: Chapter 11, Speech Synthesis)
    Ohmsha  July 2008

Lectures and Oral Presentations

 47

Research Projects: Joint Research and Competitive Funding

 1

Industrial Property Rights

 72