Presentation 1998/10/14
Algorithms for Mining Optimal Binary Segmentations for Categorical Attributes
Takeshi Fukuda, Yasuhiko Morimoto, Takeshi Tokuyama
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We consider the problem of finding nearly optimal binary segmentations of categorical databases. Our goal is to find tests on explanatory attributes that split a database into two subsets so as to optimize the value of an objective function. The problem is intractable for general objective functions. However, when the objective function is convex, there are effective algorithms for finding nearly optimal binary segmentations, and typical criteria such as entropy (mutual information) and the Gini index (mean squared error) are in fact convex. We propose practical algorithms that use computational geometry techniques to handle cases where the target attribute is not binary, to which conventional approaches cannot be applied directly.
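To make the problem statement in the abstract concrete, the following is a minimal sketch of a binary segmentation of one categorical attribute scored by the Gini index, with a target attribute that may have more than two classes. It is not the algorithm proposed in the paper: it enumerates every two-way partition of the categories, which takes time exponential in the number of categories and only illustrates the objective being optimized. The names `gini` and `best_binary_segmentation` and the sample data are hypothetical, introduced here for illustration.

```python
from collections import Counter
from itertools import combinations


def gini(counts):
    """Gini impurity of a mapping from class label to count."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_binary_segmentation(rows):
    """Exhaustive baseline (assumption: small number of categories).

    rows: iterable of (category, target_class) pairs.
    Returns (left_categories, score), where score is the size-weighted
    Gini impurity of the two subsets; lower is better.
    Returns (None, inf) if fewer than two categories are present.
    """
    rows = list(rows)
    per_cat = {}
    for cat, cls in rows:
        per_cat.setdefault(cat, Counter())[cls] += 1

    total = Counter()
    for counts in per_cat.values():
        total.update(counts)
    n = len(rows)

    cats = sorted(per_cat)
    best = (None, float("inf"))
    if len(cats) < 2:
        return best

    # Pin the first category to the left side so that each partition
    # is visited once rather than twice (left/right mirror images).
    first, rest = cats[0], cats[1:]
    for k in range(len(rest)):  # k < len(rest) keeps the right side non-empty
        for extra in combinations(rest, k):
            left_cats = (first,) + extra
            left = Counter()
            for cat in left_cats:
                left.update(per_cat[cat])
            right = total - left
            n_left = sum(left.values())
            n_right = n - n_left
            score = (n_left / n) * gini(left) + (n_right / n) * gini(right)
            if score < best[1]:
                best = (left_cats, score)
    return best


# Hypothetical data: categories "a" and "c" share target class "x".
rows = [("a", "x")] * 3 + [("b", "y")] * 3 + [("c", "x")] * 2
subset, score = best_binary_segmentation(rows)
```

With this sample data the split that groups "a" with "c" separates the two target classes perfectly. The exponential loop over subsets is the cost that motivates the search for faster methods when the objective is convex.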
Keyword(in Japanese) (See Japanese page)
Keyword(in English) data mining / segmentation / decision tree / computational geometry / randomized algorithm
Paper # DE98-23
Date of Issue

Conference Information
Committee DE
Conference Date 1998/10/14 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Data Engineering (DE)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Algorithms for Mining Optimal Binary Segmentations for Categorical Attributes
Sub Title (in English)
Keyword(1) data mining
Keyword(2) segmentation
Keyword(3) decision tree
Keyword(4) computational geometry
Keyword(5) randomized algorithm
1st Author's Name Takeshi Fukuda
1st Author's Affiliation IBM Tokyo Research Laboratory
2nd Author's Name Yasuhiko Morimoto
2nd Author's Affiliation IBM Tokyo Research Laboratory
3rd Author's Name Takeshi Tokuyama
3rd Author's Affiliation IBM Tokyo Research Laboratory
Date 1998/10/14
Paper # DE98-23
Volume (vol) vol.98
Number (no) 316
Page pp.-
#Pages 9
Date of Issue