Presentation 2007/7/17
Evaluation of the Similarity between Multiple Sentences using Sampling Techniques
Ichiro YAMADA, Yohei NAKADA, Atsushi MATSUI, Takashi MATSUMOTO, Kikuka MIURA, Hideki SUMIYOSHI, Nobuyuki YAGI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Closed captions contain many typical expressions that convey specific things, for example, the first introduction of a guest in a talk show or the explanation of a place in a travel program. Such information helps us attach metadata to the corresponding scenes. This paper proposes a method for evaluating the similarity between multiple sentences in order to extract sections whose sentences are similar to the typical expressions that convey specific things. The first step generates tree structures from an input section of sentences and extracts subtrees from these tree structures. We then use the GibbsBoost algorithm, which samples these subtrees as features and learns from the features to evaluate the similarity. In an experiment on closed captions of travel-related TV programs, which judged whether a section of sentences is similar to sections that explain a place with video, we show the effectiveness of our method.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Metadata generation / Typical expression extraction / Tree Structure analysis / GibbsBoost Algorithm / sampling
Paper # NLC2007-22
Date of Issue

Conference Information
Committee NLC
Conference Date 2007/7/17 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Evaluation of the Similarity between Multiple Sentences using Sampling Techniques
Sub Title (in English)
Keyword(1) Metadata generation
Keyword(2) Typical expression extraction
Keyword(3) Tree Structure analysis
Keyword(4) GibbsBoost Algorithm
Keyword(5) sampling
1st Author's Name Ichiro YAMADA
1st Author's Affiliation NHK Science & Technical Research Laboratories
2nd Author's Name Yohei NAKADA
2nd Author's Affiliation Dept. of Electrical Engineering and Bioscience, Waseda University
3rd Author's Name Atsushi MATSUI
3rd Author's Affiliation NHK Science & Technical Research Laboratories / Dept. of Electrical Engineering and Bioscience, Waseda University
4th Author's Name Takashi MATSUMOTO
4th Author's Affiliation Dept. of Electrical Engineering and Bioscience, Waseda University
5th Author's Name Kikuka MIURA
5th Author's Affiliation NHK Science & Technical Research Laboratories
6th Author's Name Hideki SUMIYOSHI
6th Author's Affiliation NHK Science & Technical Research Laboratories
7th Author's Name Nobuyuki YAGI
7th Author's Affiliation NHK Science & Technical Research Laboratories
Date 2007/7/17
Paper # NLC2007-22
Volume (vol) vol.107
Number (no) 158
Page pp.-
#Pages 6
Date of Issue