Presentation | 2007/7/17 Evaluation of the Similarity between Multiple Sentences using Sampling Techniques Ichiro YAMADA, Yohei NAKADA, Atsushi MATSUI, Takashi MATSUMOTO, Kikuka MIURA, Hideki SUMIYOSHI, Nobuyuki YAGI |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Closed captions contain many typical expressions that convey specific things, for example, the first introduction of a guest in a talk show or the explanation of a place in a travel program. Such information helps us attach metadata to the corresponding scenes. This paper proposes a method for evaluating the similarity between multiple sentences in order to extract sections whose sentences resemble these typical expressions. The first step generates tree structures from an input section of sentences and extracts subtrees from these tree structures. We then use the GibbsBoost algorithm, which samples these subtrees as features and learns from them to evaluate the similarity. In an experiment on closed captions of travel-related TV programs, in which we judge whether a section of sentences is similar to a section that explains a place shown in the video, we show the effectiveness of our method. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Metadata generation / Typical expression extraction / Tree Structure analysis / GibbsBoost Algorithm / sampling |
Paper # | NLC2007-22 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2007/7/17 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Evaluation of the Similarity between Multiple Sentences using Sampling Techniques |
Sub Title (in English) | |
Keyword(1) | Metadata generation |
Keyword(2) | Typical expression extraction |
Keyword(3) | Tree Structure analysis |
Keyword(4) | GibbsBoost Algorithm |
Keyword(5) | sampling |
1st Author's Name | Ichiro YAMADA |
1st Author's Affiliation | NHK Science & Technical Research Laboratories |
2nd Author's Name | Yohei NAKADA |
2nd Author's Affiliation | Dept. of Electrical Engineering and Bioscience, Waseda University |
3rd Author's Name | Atsushi MATSUI |
3rd Author's Affiliation | NHK Science & Technical Research Laboratories / Dept. of Electrical Engineering and Bioscience, Waseda University |
4th Author's Name | Takashi MATSUMOTO |
4th Author's Affiliation | Dept. of Electrical Engineering and Bioscience, Waseda University |
5th Author's Name | Kikuka MIURA |
5th Author's Affiliation | NHK Science & Technical Research Laboratories |
6th Author's Name | Hideki SUMIYOSHI |
6th Author's Affiliation | NHK Science & Technical Research Laboratories |
7th Author's Name | Nobuyuki YAGI |
7th Author's Affiliation | NHK Science & Technical Research Laboratories |
Date | 2007/7/17 |
Paper # | NLC2007-22 |
Volume (vol) | vol.107 |
Number (no) | 158 |
Page | pp.- |
#Pages | 6 |
Date of Issue |