Presentation 1995/7/21
A Linear-Time Algorithm for Optimal Generalization of Language Data
Hideki Tanaka
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The proper treatment of structured attributes in inductive learning is attracting considerable attention, as this learning technique is now frequently applied to knowledge extraction in natural language processing. In this context, the problem is to find a set of thesaurus nodes that maximally generalizes the words in the learning source while causing the fewest errors. The number of candidate node sets, however, explodes as the thesaurus grows, and no efficient algorithm has been discovered so far. In this paper, we propose an algorithm, T^*, that finds the optimal node set in linear time. The algorithm first converts the thesaurus into a directed acyclic graph, which turns this difficult problem into a shortest-path problem on a graph, for which an efficient algorithm is available. We then show that T^* can also be used to find the optimally pruned decision tree.
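The abstract does not give the construction used by T^*, so the following is only a minimal sketch of the general technique it relies on: a single-source shortest path in a directed acyclic graph, computed in time linear in the number of nodes and edges by relaxing edges in topological order. The function name `dag_shortest_path`, the integer node numbering, and the edge weights (standing in for generalization error costs) are assumptions made for illustration, not taken from the paper.

```python
from collections import deque


def dag_shortest_path(num_nodes, edges, source, target):
    """Shortest path in a DAG in O(V + E) time.

    edges: list of (u, v, weight), nodes numbered 0..num_nodes-1.
    Assumes the graph is acyclic (hypothetical input format).
    Returns (cost, path), or (inf, []) if target is unreachable.
    """
    adj = [[] for _ in range(num_nodes)]
    indegree = [0] * num_nodes
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1

    # Kahn's algorithm: build a topological order of the nodes.
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    # Relax every edge exactly once, in topological order.
    inf = float("inf")
    dist = [inf] * num_nodes
    prev = [None] * num_nodes
    dist[source] = 0
    for u in order:
        if dist[u] == inf:
            continue  # not reachable from source
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u

    if dist[target] == inf:
        return inf, []

    # Walk the predecessor links back from target to source.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = prev[node]
    return dist[target], path[::-1]


# Toy graph with made-up weights.
toy_edges = [(0, 1, 2), (0, 2, 5), (1, 2, 1), (1, 3, 7), (2, 3, 2)]
cost, path = dag_shortest_path(4, toy_edges, 0, 3)
print(cost, path)
```

Because each node and each edge is visited a constant number of times, the running time grows linearly with the size of the graph, which is the property the abstract claims for T^* once the thesaurus has been converted.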
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Machine Learning / Structured Attributes / Generalization / Thesaurus / Corpus / Machine Translation
Paper #
Date of Issue

Conference Information
Committee NLC
Conference Date 1995/7/21 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A Linear-Time Algorithm for Optimal Generalization of Language Data
Sub Title (in English)
Keyword(1) Machine Learning
Keyword(2) Structured Attributes
Keyword(3) Generalization
Keyword(4) Thesaurus
Keyword(5) Corpus
Keyword(6) Machine Translation
1st Author's Name Hideki Tanaka
1st Author's Affiliation NHK Science and Technical Research Laboratories
Date 1995/7/21
Paper #
Volume (vol) vol.95
Number (no) 169
Page pp.-
#Pages 6
Date of Issue