Presentation | 2024-03-10 Language Model for Variable Definition Extraction from Chemical Process-related Papers Shota Kato, Kotaro Nagayama, Manabu Kano |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Extracting variable definitions from scientific papers is crucial for understanding and leveraging research findings, yet the performance of current extraction methods can vary significantly across disciplines. This study addresses the challenge of variable extraction from documents related to chemical processes. We introduce an approach incorporating special tokens to signal the target variable for variable definition extraction. Our methodology, applied to the DeBERTaV3$_\text{large}$ model, demonstrated a significant improvement over the current best-performing method, enhancing accuracy by 4.8 points to reach 86.9%. However, when implementing our approach with four BERT models, the results were only on par with the existing method. This discrepancy highlights the limitations of BERT-based models in capturing variable information effectively. Our findings underscore the importance of model selection in the domain-specific application of variable definition extraction. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Variable definition extraction / DeBERTaV3 / BERT / Scholarly document processing / Domain-specific NLP / Chemical process |
Paper # | NLC2023-25 |
Date of Issue | 2024-03-03 (NLC) |
Conference Information | |
Committee | NLC / IPSJ-NL |
---|---|
Conference Date | 2024/3/10(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Applications of natural language processing, etc. |
Chair | Mitsuo Yoshida(Univ. of Tsukuba) / 須藤 克仁(NAIST) |
Vice Chair | Hiroki Sakaji(Univ. of Hokkaido) / Takeshi Kobayakawa(NHK) |
Secretary | Hiroki Sakaji(rinna) / Takeshi Kobayakawa(Hiroshima Univ. of Economics) / (JAIST) |
Assistant | Kanjin Takahashi(Sansan) / Yasuhiro Ogawa(Nagoya City Univ.) |
Paper Information | |
Registration To | Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Natural Language |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Language Model for Variable Definition Extraction from Chemical Process-related Papers |
Sub Title (in English) | |
Keyword(1) | Variable definition extraction |
Keyword(2) | DeBERTaV3 |
Keyword(3) | BERT |
Keyword(4) | Scholarly document processing |
Keyword(5) | Domain-specific NLP |
Keyword(6) | Chemical process |
1st Author's Name | Shota Kato |
1st Author's Affiliation | Kyoto University(Kyoto Univ.) |
2nd Author's Name | Kotaro Nagayama |
2nd Author's Affiliation | Kyoto University(Kyoto Univ.) |
3rd Author's Name | Manabu Kano |
3rd Author's Affiliation | Kyoto University(Kyoto Univ.) |
Date | 2024-03-10 |
Paper # | NLC2023-25 |
Volume (vol) | vol.123 |
Number (no) | NLC-416 |
Page | pp.13-18(NLC) |
#Pages | 6 |
Date of Issue | 2024-03-03 (NLC) |