Presentation 2013-09-13
Unsupervised word segmentation by enumerating maximal substrings
Yuta KAWACHI, Masato INOUE
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Unsupervised word segmentation estimates word boundaries from the given sentences alone, without relying on word delimiters, language-specific rules, or hand-built dictionaries. It is used to segment text in languages written without word delimiters, such as Japanese, and also aims to model human language acquisition. In this research, we propose a deterministic search approach that uses maximal substrings and is based on a generative model of sentences. We evaluate the algorithm on phonemic transcripts of English, treated as an unknown language; this data set is widely used for evaluating segmentation methods in this field.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) word segmentation / maximal substrings / language acquisition / generative model
Paper # NLC2013-25
Date of Issue

Conference Information
Committee NLC
Conference Date 2013/9/5 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Unsupervised word segmentation by enumerating maximal substrings
Sub Title (in English)
Keyword(1) word segmentation
Keyword(2) maximal substrings
Keyword(3) language acquisition
Keyword(4) generative model
1st Author's Name Yuta KAWACHI
1st Author's Affiliation Department of Electrical Engineering and Bioscience, Graduate School of Advanced Science and Engineering, Waseda University
2nd Author's Name Masato INOUE
2nd Author's Affiliation Department of Electrical Engineering and Bioscience, Graduate School of Advanced Science and Engineering, Waseda University
Date 2013-09-13
Paper # NLC2013-25
Volume (vol) vol.113
Number (no) 213
Page pp.-
#Pages 6
Date of Issue