Presentation 2022-05-27
[Poster Presentation] A Transformer for Long Medical Documents
Cherubin Mugisha, Incheon Paik
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Natural language processing models advance technology by extracting valuable information from diverse datasets. Biomedical texts pose challenges that require domain-specific models for effective text mining. In particular, long sequences are costly because the memory required by standard self-attention grows quadratically with sequence length. In this work, we introduce a transformer for long medical documents. We trained the model on several biomedical datasets, and it can handle sequences of up to 4096 tokens. To build it, we created a byte-level tokenizer using Byte-Pair Encoding, inspired by RoBERTa, and used an attention mechanism that scales linearly with sequence length. We fine-tuned the model and demonstrated competitive results on question answering tasks.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Transformer / Medical text / Tokenization / MIMIC
Paper # SC2022-8
Date of Issue 2022-05-20 (SC)

Conference Information
Committee SC
Conference Date 2022/5/27 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English) Online
Topics (in Japanese) (See Japanese page)
Topics (in English) AI Service and Digital Transformation, and general topics
Chair Shinji Kikuchi(NIMS)
Vice Chair Yoji Yamato(NTT) / Kosaku Kimura(Fujitsu)
Secretary Yoji Yamato(Kobe Univ.) / Kosaku Kimura(Tokyo Univ. of Tech.)
Assistant Shin Tezuka(Hitachi) / Takao Nakaguchi(KCGI)

Paper Information
Registration To Technical Committee on Service Computing
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) [Poster Presentation] A Transformer for Long Medical Documents
Sub Title (in English)
Keyword(1) Transformer
Keyword(2) Medical text
Keyword(3) Tokenization
Keyword(4) MIMIC
1st Author's Name Cherubin Mugisha
1st Author's Affiliation University of Aizu(UoA)
2nd Author's Name Incheon Paik
2nd Author's Affiliation University of Aizu(UoA)
Date 2022-05-27
Paper # SC2022-8
Volume (vol) vol.122
Number (no) SC-50
Page pp.43-53(SC)
#Pages 11
Date of Issue 2022-05-20 (SC)