Presentation 2024-03-10
Language Model for Variable Definition Extraction from Chemical Process-related Papers
Shota Kato, Kotaro Nagayama, Manabu Kano
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Extracting variable definitions from scientific papers is crucial for understanding and leveraging research findings, yet the performance of current extraction methods can vary significantly across disciplines. This study addresses the challenge of variable extraction from documents related to chemical processes. We introduce an approach incorporating special tokens to signal the target variable for variable definition extraction. Our methodology, applied to the DeBERTaV3$_\text{large}$ model, demonstrated a significant improvement over the current best-performing method, enhancing accuracy by 4.8 points to reach 86.9%. However, when implementing our approach with four BERT models, the results were only on par with the existing method. This discrepancy highlights the limitations of BERT-based models in capturing variable information effectively. Our findings underscore the importance of model selection in the domain-specific application of variable definition extraction.
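The abstract's idea of using special tokens to signal the target variable can be illustrated with a minimal sketch. The abstract does not specify the marker strings, the input format, or the preprocessing code, so the function name `mark_target_variable`, the `[VAR]` and `[/VAR]` markers, and the example sentence below are all hypothetical and are not the authors' implementation.

```python
def mark_target_variable(text: str, start: int, end: int,
                         open_token: str = "[VAR]",
                         close_token: str = "[/VAR]") -> str:
    """Wrap text[start:end] (the target variable mention) in marker tokens.

    The marker strings are illustrative assumptions, not taken from the paper.
    """
    if not (0 <= start < end <= len(text)):
        raise ValueError("span must be non-empty and lie inside the text")
    return f"{text[:start]}{open_token} {text[start:end]} {close_token}{text[end:]}"


# Hypothetical example sentence; the variable of interest is "T".
sentence = "The reactor temperature T is kept below the limit."
start = sentence.index(" T ") + 1  # character offset of the symbol "T"
marked = mark_target_variable(sentence, start, start + 1)
print(marked)
```

In a setup like this, the marked sentence would be passed to the language model in place of the raw sentence, and the marker strings would need to be registered with the model's tokenizer so that they are not split into subwords.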
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Variable definition extraction / DeBERTaV3 / BERT / Scholarly document processing / Domain-specific NLP / Chemical process
Paper # NLC2023-25
Date of Issue 2024-03-03 (NLC)

Conference Information
Committee NLC / IPSJ-NL
Conference Date 2024/3/10(2days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English) Applications of natural language processing, etc.
Chair Mitsuo Yoshida(Univ. of Tsukuba) / Katsuhito Sudoh(NAIST)
Vice Chair Hiroki Sakaji(Univ. of Hokkaido) / Takeshi Kobayakawa(NHK)
Secretary Hiroki Sakaji(rinna) / Takeshi Kobayakawa(Hiroshima Univ. of Economics) / (JAIST)
Assistant Kanjin Takahashi(Sansan) / Yasuhiro Ogawa(Nagoya City Univ.)

Paper Information
Registration To Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Natural Language
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Language Model for Variable Definition Extraction from Chemical Process-related Papers
Sub Title (in English)
Keyword(1) Variable definition extraction
Keyword(2) DeBERTaV3
Keyword(3) BERT
Keyword(4) Scholarly document processing
Keyword(5) Domain-specific NLP
Keyword(6) Chemical process
1st Author's Name Shota Kato
1st Author's Affiliation Kyoto University(Kyoto Univ.)
2nd Author's Name Kotaro Nagayama
2nd Author's Affiliation Kyoto University(Kyoto Univ.)
3rd Author's Name Manabu Kano
3rd Author's Affiliation Kyoto University(Kyoto Univ.)
Date 2024-03-10
Paper # NLC2023-25
Volume (vol) vol.123
Number (no) NLC-416
Page pp.13-18(NLC)
#Pages 6
Date of Issue 2024-03-03 (NLC)