Conference name |
---|
2023 General Conference |
Conference code |
2023G |
Year held |
2023 |
Publication date |
2023-02-28 |
Session number |
D-20 |
Session name |
Information-Based Induction Sciences and Machine Learning |
Presentation date |
2023-03-08 |
Venue (room, etc.) |
Building 2, Room 2401 |
Presentation number |
D-20-19 |
Title |
Domain Adaptation for Japanese Sentence Embedding Models with Contrastive Learning |
Authors |
◎Zihao Chen, Hisashi Handa, Kimiaki Shirahama |
Keywords |
Domain adaptation, Contrastive learning, Data generation, Low-resource language |
Abstract |
We propose JCSE, a novel Japanese sentence representation framework that creates training data for domain adaptation by generating sentences and synthesizing them with sentences available in a target domain. Specifically, a pre-trained data generator is fine-tuned on the target domain using a corpus we collected. The generator then produces contradictory sentence pairs, which are used in contrastive learning to adapt a Japanese language model to a specific task in the target domain. Experimental results show that JCSE achieves significant performance improvements, surpassing direct transfer and other training strategies. We believe that JCSE paves a practical way for domain-specific downstream tasks in low-resource languages like Japanese. |
Full text (PDF) |
PDF download |