Presentation 2000/5/5
Interval Operations and Retrieval Using Tag Information for Efficient Linguistic Data Management
Hideki Mima, Junichi Tsujii
Abstract(in English) To accumulate the large amounts of linguistic information needed to accelerate language processing, several linguistic tag systems have been proposed, such as GDA (Global Document Annotation), GENIA, and GATE (General Architecture for Text Engineering), along with corpus development projects that use them. Because tagged information is generally embedded in the text, as in HTML, multiple layers of tags can reduce the readability of the corpus. Furthermore, when the tagging scheme allows structural complexity such as nesting and multiple candidate combinations of syntactic/semantic structure, the difficulty of processing the tags reduces the efficiency of processing that uses the corpus. Thus, despite recent efforts to develop corpus processing and annotation tools, corpus development tools remain a focus of research interest. In this paper, we first discuss how tagging information in a corpus should be handled. We then propose interval operations and interval retrieval, including filtering, that use linguistic tag information for efficient linguistic data management. Since the proposed scheme supplies general operations for handling tag data, we believe it will enable efficient management of the large volumes of data in recently developed corpora.
Keyword(in English) XML / tagging / database / interval operation / interval retrieval / linguistic data management
Paper # TL2000-6

Conference Information
Committee TL
Conference Date 2000/5/5 (1 day)

Paper Information
Registration To Thought and Language (TL)
Language JPN
Title (in English) Interval Operations and Retrieval Using Tag Information for Efficient Linguistic Data Management
Keyword(1) XML
Keyword(2) tagging
Keyword(3) database
Keyword(4) interval operation
Keyword(5) interval retrieval
Keyword(6) linguistic data management
1st Author's Name Hideki Mima
1st Author's Affiliation University of Tokyo
2nd Author's Name Junichi Tsujii
2nd Author's Affiliation University of Tokyo
Date 2000/5/5
Paper # TL2000-6
Volume (vol) vol.100
Number (no) 47
Page pp.-
#Pages 8