Presentation 1998/11/20
Maximizing Agreement with a Classification by Bounded Number of Associated Words
Shinichi Shimozono, Hiroki Arimura, Setsuo Arikawa,
PDF Download Page PDF download Page Link
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We study the efficient discovery of word-association patterns, defined by a sequence of strings and a proximity gap between them, from a collection of texts with binary labels. We present an algorithm that finds all d strings and k proximity word-association patterns that maximizes agreement with the labels. It runs in expected time complexity O(k∧nlog∧n)and O(k∧n)space with the total length n of texts, if texts are uniformly random strings. It shows a contrast to a known fact that finding a best word-association pattern with arbitrarily many strings is intractable and hard to approximate within a factor arbitrary close to one, i.e., has no polynomial-time approximation scheme unless P=NP.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) data mining / maximum agreement problem / polynomial-time PAC learning
Paper # COMP98-52
Date of Issue

Conference Information
Committee COMP
Conference Date 1998/11/20(1days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Theoretical Foundations of Computing (COMP)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Maximizing Agreement with a Classification by Bounded Number of Associated Words
Sub Title (in English)
Keyword(1) data mining
Keyword(2) maximum agreement problem
Keyword(3) polynomial-time PAC learning
1st Author's Name Shinichi Shimozono
1st Author's Affiliation Dept.of Artificial Intelligence, Kyushu Institute of Technology()
2nd Author's Name Hiroki Arimura
2nd Author's Affiliation Dept.of Informatics, Kyushu University
3rd Author's Name Setsuo Arikawa
3rd Author's Affiliation Dept.of Informatics, Kyushu University
Date 1998/11/20
Paper # COMP98-52
Volume (vol) vol.98
Number (no) 432
Page pp.pp.-
#Pages 8
Date of Issue